Shayne Longpre

Member of Technical Staff, Anthropic

PhD, MIT · Founder, Data Provenance Initiative

I study the data behind AI systems and the public systems AI is reshaping: the web, markets, science, governance, and the information commons.

Bio

Researcher, builder, and public technologist.

I am a Member of Technical Staff at Anthropic, where I work on AI systems, data, evaluation, and public impact.

I recently completed my PhD at MIT, advised by Sandy Pentland, and previously conducted AI research at Google Brain, Apple, and Stanford.

I founded the Data Provenance Initiative, a 50+ member research collective auditing AI datasets and ecosystems. My research has received five best or outstanding paper awards and broad coverage in outlets including The New York Times, The Washington Post, The Atlantic, and MIT Technology Review.

Anthropic 2026- · MIT 2021-2026 · Google 2022, 2024-25 · Apple 2018-2021 · Stanford 2012-2018

Throughline

Data, institutions, and public AI systems.

My work moves between technical AI research, empirical audits like the Data Provenance Initiative, and public arguments about how AI should be built, measured, and governed.

Data for AI systems

Building more reliable, efficient, and generalizable AI systems through data-centric methods.

AI's public footprint

Auditing how AI reshapes the web, data commons, markets, scientific practice, and accountability.

Open model ecosystems

Measuring how open models, benchmarks, licensing, and transparency shape technical and geopolitical power.

Featured work

Research with live public infrastructure.

Dashboards, audits, open letters, reports, and papers built for both technical scrutiny and public use.

Research collective · infrastructure · public data audit

Data Provenance Initiative

A 50+ member research initiative auditing the licensing, attribution, consent, and transparency of the data that powers AI systems.

Flaw disclosure · safe harbor · accountability

Third-Party AI Evaluation

Research and policy work arguing for robust independent AI evaluation, coordinated disclosure, and legal protections for public-interest auditing.

Recent

Current work.

2026 · Joined Anthropic as Member of Technical Staff.

2026 · Open model ecosystem data featured in the Stanford AI Index Report.

2025 · ATLAS released, with practical scaling laws for multilingual transfer.

2025 · Leaderboard Illusion accepted to NeurIPS and covered by TechCrunch, Ars Technica, 404 Media, and others.

Selected papers

A few anchors.

See publications