Do State-Level Gains in NAEP Scores Predict Better Long-Term Economic and Social Outcomes?

  • Writer: Greg Thorson
  • 5 days ago
  • 7 min read

This study asks whether changes in state NAEP math scores predict long-term economic and social outcomes for the students who experienced them. The authors link state-level 8th-grade NAEP math results from 1990–2019 to later-life data from the Census, American Community Survey, and FBI crime records. They find that a one-standard-deviation increase in 8th-grade math scores is associated with about an 8% rise in adult earnings, higher educational attainment, and lower rates of teen motherhood, incarceration, and arrests. These results suggest that gains in NAEP performance meaningfully forecast better life outcomes across successive birth cohorts.


The Policy Scientist’s Perspective

The relationship between educational achievement and later-life outcomes is one of the most important questions in U.S. social policy, cutting across education, labor, and economic development. This article addresses that question with unusual scope, linking three decades of state-level NAEP results to national data on income, education, and incarceration. Although the study relies on observational methods rather than causal inference techniques, the analysis is statistically rigorous, transparent, and unusually comprehensive. The underlying datasets—NAEP, the American Community Survey, and FBI arrest records—are large, high-quality, and nationally representative, lending strong external validity to the findings. The magnitude and consistency of effects across domains (for example, an 8% earnings increase per standard deviation in math gains) make this one of the most significant empirical contributions on education and long-term outcomes published in recent months. It is especially timely given the steep post-pandemic decline in NAEP scores, which raises urgent questions about the future economic consequences of lost learning.



Full Citation and Link to Article

Doty, E., Kane, T. J., Patterson, T., & Staiger, D. O. (2025). What do changes in state NAEP scores imply for birth cohorts’ later life outcomes? Journal of Policy Analysis and Management. Advance online publication. https://doi.org/10.1002/pam.70018


Extended Summary


Central Research Question

This article investigates whether changes in state-level National Assessment of Educational Progress (NAEP) scores—specifically in 8th-grade mathematics—predict the later-life economic and social outcomes of the birth cohorts who experienced those educational improvements. The authors seek to determine whether test score gains, often used as indicators of educational progress, reflect genuine increases in human capital with long-term value, or whether they simply capture transient changes in test-taking ability or curricular focus. More broadly, the study asks if the substantial investments states have made to raise NAEP scores have yielded measurable benefits in adulthood, such as higher earnings, greater educational attainment, and reduced social pathologies like teen motherhood and incarceration. This question is especially timely given the sharp post-pandemic declines in student achievement and ongoing debates about the real-world importance of standardized test performance as a policy metric.


Previous Literature

The study builds on a large empirical literature linking early cognitive achievement to later-life outcomes, both at the individual and macroeconomic levels. Seminal work by Neal and Johnson (1996) and Murnane, Willett, and Levy (1995, 2000) established that a one-standard-deviation increase in achievement test scores is associated with 10–20 percent higher earnings in adulthood. Later, Chetty et al. (2011, 2014) extended this literature using causal methods, finding that kindergarten classroom quality and teacher value-added both predict significant differences in later earnings, college attendance, and other long-term outcomes. Deming et al. (2016) further demonstrated that state accountability laws raising achievement also increase earnings, implying a roughly linear relationship between improved performance and economic returns.


However, most of these studies focused on micro-level or experimental variations—such as differences among classrooms or teachers—rather than large-scale shifts in average state performance over time. This leaves an open question: do macro-level improvements, such as sustained state-level gains on the NAEP, translate into better outcomes for entire cohorts? Prior cross-sectional analyses have suggested strong correlations between test scores and outcomes, but the causal direction remains ambiguous. Moreover, some researchers have noted that despite large gains in 4th- and 8th-grade NAEP scores since 1990, 12th-grade achievement has remained flat or declined, raising doubts about whether early gains persist. The current study addresses this gap by using variation across states and birth cohorts to assess whether changes in NAEP scores predict changes in adult outcomes, thereby testing the long-term policy relevance of the NAEP itself.


Data

The authors use multiple large-scale datasets to link educational achievement with adult outcomes across U.S. states. The central educational data come from the Main NAEP 8th-grade mathematics assessments administered between 1990 and 2019. These provide state-level averages based on samples of approximately 125,000 students annually across roughly 4,800 schools. The analysis focuses on math scores because they are available for the longest time span and have been found to be more strongly correlated with earnings than reading scores.


Each state’s average 8th-grade math score is assigned to the cohort born 13 years earlier, approximating the age at which students would have been tested. To measure adult outcomes, the study uses individual-level data from the U.S. Decennial Census (2000) and the American Community Survey (ACS) from 2001 through 2019. These data provide information on earned income, educational attainment, employment, teen motherhood, homeownership, and incarceration. Arrest data by state, age, and year are drawn from the FBI’s Uniform Crime Reports (UCR), while supplementary contextual variables—such as parental education, state-level unemployment rates, median household income, and rates of low-birthweight births—are compiled from the IPUMS, CPS, and NBER databases.
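The cohort assignment described above can be illustrated with a minimal sketch. The state names, scores, and respondent record below are hypothetical placeholders, not values from the study, and the dictionary-based merge is only one plausible way to organize the link between test years and birth cohorts.

```python
# Hypothetical illustration: state names and scores are made up,
# not values from the study.
AGE_AT_TEST = 13  # from the text: scores go to the cohort born 13 years earlier


def birth_cohort(test_year: int, age_at_test: int = AGE_AT_TEST) -> int:
    """Return the birth year assigned to a NAEP administration year."""
    return test_year - age_at_test


# (state, NAEP test year) -> mean 8th-grade math score (placeholder numbers)
naep_scores = {
    ("State A", 1990): 260.0,
    ("State A", 2019): 280.0,
    ("State B", 1990): 265.0,
}

# Re-key the scores by (state of birth, birth year) so they can be merged
# onto adult survey records, which report birthplace and age.
score_by_cohort = {
    (state, birth_cohort(year)): score
    for (state, year), score in naep_scores.items()
}

# A hypothetical adult respondent born in State A in 1977
respondent = {"birth_state": "State A", "birth_year": 1977}
respondent_score = score_by_cohort.get(
    (respondent["birth_state"], respondent["birth_year"])
)
print(respondent_score)
```

Under this rule, the 1990 assessment maps to the 1977 birth cohort and the 2019 assessment to the 2006 birth cohort.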


The dataset thus combines a rich longitudinal perspective (spanning three decades) with broad geographic coverage across nearly all U.S. states. Because approximately 80 percent of 13-year-olds in 2000 still lived in their state of birth, the authors use state of birth as a proxy for the state where each respondent attended middle school. This limits, though it does not eliminate, error from interstate migration. The overall sample exceeds 900,000 individual observations for income outcomes and several million across all dependent variables. The authors note that data quality is exceptionally high, given the NAEP’s standardized administration and the Census/ACS’s large, representative samples.


Methods

The empirical strategy is based on a series of state-by-cohort regressions linking average 8th-grade math achievement to later-life outcomes. The main specification models individual income as a function of mean NAEP math achievement in the respondent’s birth-state cohort, controlling for gender, race/ethnicity, and fixed effects for state of birth, year of birth, year of observation, and age. The model is then expanded with additional controls for parental education, state-level labor market conditions, regional trends, and demographic composition.


To reduce confounding from shifts in population composition, the authors adjust NAEP scores for parental education and race using the student-level NAEP microdata. Specifically, they regress individual scores on these background variables and use the resulting state-by-year residuals as adjusted measures of achievement. They then estimate linear models of the form:
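The score adjustment can be sketched as follows. This is a simplified illustration, not the authors' code: it assumes the background variables are collapsed into a single categorical "cell" (a hypothetical stand-in for a parental education by race category), in which case regressing scores on a full set of cell indicators is equivalent to subtracting each cell's mean. All records are invented.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical student records: (state, year, background cell, score).
# The background cell stands in for a parental education x race category.
students = [
    ("State A", 2000, "cell_1", 270.0),
    ("State A", 2000, "cell_2", 290.0),
    ("State B", 2000, "cell_1", 260.0),
    ("State B", 2000, "cell_2", 280.0),
    ("State B", 2000, "cell_2", 284.0),
]

# Step 1: a regression on a full set of background-cell dummies is
# equivalent to subtracting each cell's mean score.
by_cell = defaultdict(list)
for _, _, cell, score in students:
    by_cell[cell].append(score)
cell_mean = {cell: mean(values) for cell, values in by_cell.items()}

# Step 2: average the student-level residuals within each state-by-year
# group to obtain the adjusted achievement measure.
residuals = defaultdict(list)
for state, year, cell, score in students:
    residuals[(state, year)].append(score - cell_mean[cell])
adjusted_score = {key: mean(values) for key, values in residuals.items()}

for key, value in sorted(adjusted_score.items()):
    print(key, round(value, 2))
```

The adjusted measure compares each state-year with what its students' backgrounds would predict, so a shift in a state's demographic mix does not by itself register as a change in achievement.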


ln(Y_ijct) = β * Score_jc + X_iγ + δ_j + μ_c + τ_t + ε_ijct


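where Y_ijct is the adult outcome (such as income) for individual i, born in state j, in birth cohort c, and observed in year t; Score_jc is the mean 8th-grade NAEP math score for that birth state and cohort; X_i collects the individual-level controls; and δ_j, μ_c, and τ_t are fixed effects for state of birth, birth cohort, and year of observation. State and cohort fixed effects absorb unobserved heterogeneity, while division-by-cohort and state-by-year fixed effects control for regional and temporal confounds. Standard errors are clustered at the state-by-cohort level.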
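To show how state and cohort fixed effects isolate within-state changes, here is a toy sketch of a two-way fixed-effects estimate. It is a deliberate simplification and not the authors' specification: the data are collapsed to one noise-free observation per state-by-cohort cell, individual controls and year-of-observation effects are omitted, and every number (including the assumed coefficient of 0.08) is invented for illustration. With a balanced panel, subtracting state and cohort means and adding back the grand mean gives the same slope as a regression with both sets of dummies.

```python
from statistics import mean

TRUE_BETA = 0.08  # assumed effect, used only to generate the toy data

states = range(4)
cohorts = range(5)

# Toy scores whose movement across cohorts differs by state; without such
# variation the fixed effects would absorb everything.
score = {
    (j, c): 0.1 * j * c + 0.2 * ((j + 2 * c) % 3)
    for j in states
    for c in cohorts
}

# Toy log outcome = beta * score + state effect + cohort effect (no noise)
log_y = {
    (j, c): TRUE_BETA * score[j, c] + 0.5 * j - 0.3 * c
    for j in states
    for c in cohorts
}


def double_demean(x):
    """Within transformation for a balanced state-by-cohort panel."""
    grand = mean(x.values())
    state_mean = {j: mean(x[j, c] for c in cohorts) for j in states}
    cohort_mean = {c: mean(x[j, c] for j in states) for c in cohorts}
    return {
        (j, c): x[j, c] - state_mean[j] - cohort_mean[c] + grand
        for j in states
        for c in cohorts
    }


s_tilde = double_demean(score)
y_tilde = double_demean(log_y)

# OLS slope on the demeaned data equals the two-way fixed-effects estimate
beta_hat = sum(s_tilde[k] * y_tilde[k] for k in s_tilde) / sum(
    v * v for v in s_tilde.values()
)
print(round(beta_hat, 6))
```

Because the state and cohort effects are additive, they drop out of the demeaned data and the estimate recovers the assumed coefficient. Real data would add noise, individual-level covariates, and clustered standard errors.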


The authors conduct extensive robustness checks, adding controls for macroeconomic variables (unemployment, household income, low birthweight), interacting division and birth-year fixed effects, and allowing state-specific time trends. They also replicate the analysis for additional outcomes, including educational attainment, incarceration, and arrests, using analogous regression structures. The approach is not experimental but approximates causal inference by relying on within-state changes over time rather than cross-sectional differences, thereby minimizing bias from fixed state characteristics.


Findings/Effect Sizes

The study finds a strong, statistically significant relationship between increases in 8th-grade math achievement and improved adult outcomes. Across preferred specifications, a one-standard-deviation increase in state NAEP math scores predicts an 8 percent rise in adult earnings—roughly two-thirds as large as cross-sectional estimates from prior studies linking test scores to income. The estimated coefficients are robust to numerous controls, including state economic conditions and demographic changes, and remain consistent across alternative model specifications.


Educational outcomes also show meaningful improvements: a one-standard-deviation gain in math achievement corresponds to a 1.3 percentage point increase in high school completion and a 1.6 percentage point rise in college enrollment, though the association with bachelor’s degree attainment is smaller and not statistically significant. Labor force indicators improve modestly—unemployment declines by about 1.9 percentage points, and average weekly hours worked rise by approximately 0.7 hours.


Social outcomes show parallel gains. Female cohorts from high-achievement states experience a 1.8 percentage point decline in teen motherhood, while male cohorts see a 1.7 percentage point reduction in incarceration or institutionalization rates. Data from the FBI’s arrest records reveal that a one-standard-deviation rise in math achievement corresponds to roughly a 30 percent reduction in violent crime arrests and an 18 percent reduction in property crime arrests for young adults aged 15–24.


Importantly, the authors find no evidence that NAEP gains simply reflect changes in social composition or economic conditions. The correlations between increases in test scores and changes in household income, unemployment, or child health indicators are weak or even negative, suggesting that improvements in school quality, rather than favorable economic shifts, are driving the observed results. Overall, the effect sizes are large enough to imply that nationwide improvements in 8th-grade math achievement since 1990 have boosted young adult earnings by approximately 4–5 percent, with the largest-gain states such as North Carolina realizing increases of 7 percent or more.


Conclusion

The study concludes that gains in NAEP math achievement have substantial and enduring consequences for the life outcomes of students. Despite concerns that standardized test improvements might represent superficial or short-lived gains, the evidence indicates that these changes correspond to meaningful increases in human capital. The authors argue that the pattern of results—particularly the persistence of effects across income, education, and crime outcomes—closely mirrors the fade-out patterns observed in early childhood interventions: even when test score effects diminish at later grades, the underlying skills continue to influence long-term outcomes.


From a policy perspective, the findings validate the use of NAEP as a genuine indicator of educational progress. They also underscore the potential long-term costs of pandemic-era learning losses, which erased roughly 40 percent of prior gains in 8th-grade math between 2019 and 2022. If those losses persist, the study suggests, future cohorts could experience a 1.5–2 percent reduction in lifetime earnings—a decline large enough to have macroeconomic implications.
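The projected earnings loss can be roughly reproduced from figures quoted in this summary. The sketch below assumes that earnings effects scale linearly with score changes and that the 4–5 percent earnings boost attributed to post-1990 gains is the right base. Both are simplifying assumptions, and this is not necessarily how the authors derived their number.

```python
# Figures quoted in the summary (not independently verified here)
earnings_gain_from_naep = (0.04, 0.05)  # 4-5% boost attributed to gains since 1990
share_of_gains_erased = 0.40            # roughly 40% erased between 2019 and 2022

# Assumption: earnings effects scale linearly with the score change
implied_loss = tuple(
    round(share_of_gains_erased * gain, 4) for gain in earnings_gain_from_naep
)
print(implied_loss)
```

The product comes to roughly 1.6 to 2 percent, close to the 1.5–2 percent range cited above.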


Methodologically, while the study is observational and cannot isolate causality with the precision of randomized controlled trials, its design is rigorous, its data quality exceptionally high, and its results robust across multiple specifications and outcome measures. The large sample sizes, long time span, and multi-source linkage make it one of the most comprehensive investigations of educational achievement and life outcomes to date.


The generalizability of the findings is strong for the U.S. context and likely extends to other developed nations with comparable education systems and labor markets. By connecting statewide educational reforms to adult well-being at the population level, the study provides rare empirical evidence that rising test scores are not merely statistical artifacts but meaningful indicators of progress in human capital formation. It stands as a significant contribution to the policy literature and an important benchmark for evaluating the long-term returns to public investment in education.

The Policy Scientist

Offering Concise Summaries*
of the
Most Recent, Impactful 
Public Policy Research

*Summaries Powered by ChatGPT
