How Do Subjective Assessments of “Potential” Influence Gender Gaps in Promotion?

  • Writer: Greg Thorson
  • 4 days ago
  • 6 min read


Benson, Li, and Shue (2026) study whether subjective “potential” ratings help explain why women are promoted less often than men. They use personnel data from 29,809 management-track employees in a large North American retail firm and track performance, potential ratings, promotions, and turnover. They find that women receive higher performance ratings but lower potential ratings, and that potential ratings predict promotions more strongly than performance ratings. Differences in potential ratings explain about half of the gender promotion gap. Women are 33 percent more likely to be rated high-performance/low-potential, and a one-point increase in the potential rating raises annual promotion rates by about 9 percentage points.


Why This Article Was Selected for The Policy Scientist

The topic is important because promotion systems shape the allocation of talent, long-run earnings, and organizational productivity. It is timely given widespread corporate use of subjective “potential” assessments and continued attention to gender gaps in managerial representation. The article contributes by linking subjective potential ratings to measurable promotion outcomes using a rich firm-level dataset with detailed ratings and personnel flows. The data are high quality but restricted to one large North American retailer, which limits external validity. The empirical work is rigorous but largely observational; future work using experimental or stronger quasi-experimental designs would strengthen the causal claims.


Full Citation and Link to Article

Benson, A., Li, D., & Shue, K. (2026). “Potential” and the gender promotion gap. American Economic Review, 116(2), 1–40. Advance online publication / forthcoming. https://doi.org/10.1257/aer.20220831


Central Research Question

The central research question asks whether subjective assessments of employee “potential” contribute to gender gaps in promotion and pay, and if so, through what mechanisms these assessments translate into differential career outcomes. The authors further investigate whether potential ratings reflect accurate forecasts of future performance or whether they systematically undervalue women’s realized contributions. The question is motivated by the prevalence of potential-based evaluation systems, such as the Nine Box grid, in large firms and the possibility that these practices amplify or institutionalize existing biases. The authors aim to identify whether gender disparities in potential ratings are informationally justified or symptomatic of misperceptions and strategic managerial behavior, and whether such biases lead to talent misallocation within hierarchical organizations. 


Previous Literature

The authors situate the work within two major strands of research: organizational promotion and gender in labor markets. They note that potential-based evaluations are widely used, yet poorly studied in empirical economics relative to traditional performance reviews. Within labor economics, prior work has examined role congruity theories, stereotyping, mentorship gaps, homophily, and gender differences in networking and advocacy. Studies have shown that subjective assessments may systematically underrate women’s leadership abilities due to stereotypes associating managerial qualities with masculinity. Experimental evidence (e.g., Player et al. 2019) demonstrates that identical resumes receive different assessments of leadership potential depending on the applicant’s gender. Related research on gendered attribution of performance suggests that subjective processes may disadvantage women even absent objective performance differences.


A parallel literature explores selection into hierarchical roles and mismatch between observed performance and promoted skill sets, including concerns about the “Peter Principle” and distortion of skill allocation. The authors’ earlier empirical work (Benson, Li, and Shue 2019) quantifies how inappropriate promotion criteria can generate mismatch between skill and role. The current paper extends that line of inquiry into the domain of gender. Other cited work explores the effects of talent hoarding, managerial favoritism, and political behavior inside firms. The contribution here is to integrate subjective potential assessments into broader organizational economics, demonstrating how endogenous managerial incentives and belief distortions affect gender gaps in promotions.


Data

The empirical analysis uses detailed administrative personnel records from a large North American retail firm, covering 29,809 full-time, salaried management-track employees between February 2009 and October 2015. These workers span corporate functions such as IT, finance, supply chain, HR, and real estate, as well as retail operations across more than 4,000 establishments. The data include monthly job titles, compensation, Nine Box performance and potential ratings, promotion events, managerial relationships, geographic assignments, and limited demographic variables including gender, age, race, and tenure. Promotion events are identified using changes in job titles combined with either compensation increases or market-based title rankings. Attrition and risk-of-loss assessments are available for a three-year subsample. The scale and granularity of the dataset exceed what is typically available in studies of internal promotion dynamics, and the authors argue that it provides a rare window into succession planning and subjective evaluation processes at scale. Nonetheless, the sample is confined to a single firm, which limits external validity.


Methods

The authors employ regression analysis to estimate conditional gender gaps in promotion, performance ratings, and potential ratings. Specifications account for year fixed effects, business unit fixed effects, and demographic controls such as age, tenure, and race. To assess informational content, they regress future performance ratings on current performance and potential ratings, stratified by gender. They then estimate future potential ratings as outcomes to evaluate updating behavior. Additional analyses compare turnover rates across gender and risk-of-loss categories, and examine whether managers assign higher potential ratings strategically to retain perceived flight risks. They further explore heterogeneous effects by age, relocation requirements, and managerial characteristics using interaction terms.
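
The kind of specification described above can be written compactly. The equation below is only a sketch under stated assumptions, not the authors' exact model: the variable names, the linear probability form, and the one-period-ahead timing of the outcome are all illustrative choices made here.

```latex
\begin{equation}
\text{Promoted}_{i,t+1} = \beta\,\text{Female}_i
  + \gamma_1\,\text{Performance}_{it}
  + \gamma_2\,\text{Potential}_{it}
  + X_{it}'\delta + \alpha_t + \mu_{b(i,t)} + \varepsilon_{it}
\end{equation}
```

Here $\beta$ is the conditional gender gap in promotion, $\gamma_1$ and $\gamma_2$ are the effects of one-point increases in the performance and potential ratings, $X_{it}$ collects demographic controls (age, tenure, race), $\alpha_t$ is a year fixed effect, and $\mu_{b(i,t)}$ is a fixed effect for the employee's business unit. Estimating the equation with and without the potential term and comparing the two values of $\beta$ is one way to gauge how much of the gap potential ratings account for.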


The study also uses an instrumental variables approach to estimate performance outcomes of marginally promoted employees. The instrument leverages variation in the availability of promotion opportunities across roles and time, enabling comparisons at promotion thresholds. This design approximates marginal treatment effects and identifies whether marginally promoted women outperform marginally promoted men, conditional on underlying ratings. The empirical strategy is observational and relies heavily on multivariate regression, but the threshold analysis introduces quasi-experimental variation. The paper employs neither randomized controlled trials nor fully specified structural causal models. The authors acknowledge that causal identification rests on institutional variation and internal comparability rather than experimental assignment. From a methodological standpoint, the dataset enables unusually detailed internal comparisons within a firm, though future work with experimental variation or additional identification strategies could strengthen the causal claims.
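
To make the instrumental variables logic concrete, here is a minimal Python sketch of the simplest IV estimator, the Wald ratio with a binary instrument. It is not the authors' estimator: the binary instrument (whether a promotion slot happened to be open), the function name `wald_estimate`, and the eight toy observations are all assumptions introduced purely for illustration.

```python
from statistics import mean


def wald_estimate(outcome, treated, instrument):
    """Wald/IV estimate with a binary instrument: reduced form / first stage."""
    y1 = [y for y, z in zip(outcome, instrument) if z == 1]
    y0 = [y for y, z in zip(outcome, instrument) if z == 0]
    d1 = [d for d, z in zip(treated, instrument) if z == 1]
    d0 = [d for d, z in zip(treated, instrument) if z == 0]

    # First stage: how much the instrument shifts the promotion rate.
    first_stage = mean(d1) - mean(d0)
    if first_stage == 0:
        raise ValueError("instrument does not shift promotion rates")

    # Reduced form: how much the instrument shifts the outcome.
    reduced_form = mean(y1) - mean(y0)
    return reduced_form / first_stage


# Toy data, made up for this example (NOT from the paper).
instrument = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = a promotion slot was open
promoted = [1, 1, 1, 0, 1, 0, 0, 0]    # 1 = employee was promoted
outcome = [3.0, 4.0, 3.0, 2.0, 3.0, 2.0, 2.0, 1.0]  # later performance rating

print(wald_estimate(outcome, promoted, instrument))
```

The ratio scales the outcome difference between the two instrument groups by the difference in promotion rates, so it reflects employees whose promotion depends on whether a slot was open, which is the "marginal" group the paragraph above refers to. Running the same calculation separately for women and men would mirror the comparison of marginally promoted women and men.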


Findings/Size Effects

The authors find substantial gender gaps in promotion: women are 13 percent less likely to be promoted annually than men, even though women receive significantly higher performance ratings. A one-point increase in potential ratings increases annual promotion probability by approximately 9 percentage points, whereas a one-point increase in performance ratings increases it by roughly 3 percentage points. Differences in potential ratings explain up to 53 percent of the gender promotion gap, depending on specification. Women are 12 percent more likely to receive the lowest potential rating and 28 percent less likely to receive the highest potential rating, despite being more likely to receive top performance ratings. Women are also 33 percent more likely to be rated as high-performance/low-potential (“workhorses”). 
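
Two pieces of arithmetic sit behind these numbers, and a short Python sketch may help keep them apart. The first is the "share of the gap explained," commonly computed as one minus the ratio of the gender coefficient with the control to the coefficient without it. The second is the difference between relative changes ("13 percent less likely") and absolute ones ("9 percentage points"). The function names and every numeric input below are hypothetical; in particular, the summary does not report a baseline promotion rate, so the 15 percent used here is an assumption chosen only to show the conversion.

```python
def share_of_gap_explained(raw_gap: float, conditional_gap: float) -> float:
    """Fraction of the raw gap that disappears once a control is added."""
    if raw_gap == 0:
        raise ValueError("raw gap must be non-zero")
    return 1.0 - conditional_gap / raw_gap


def relative_to_points(baseline_rate: float, relative_change: float) -> float:
    """Convert a relative change (e.g. -0.13) into percentage points."""
    return baseline_rate * relative_change * 100.0


# Hypothetical coefficients, NOT taken from the paper.
raw_gap = -0.020          # female coefficient without potential ratings
conditional_gap = -0.010  # female coefficient once potential is controlled
explained = share_of_gap_explained(raw_gap, conditional_gap)
print(f"share of gap explained: {explained:.0%}")

# Hypothetical baseline: if men were promoted 15% of the time each year,
# a 13 percent relative shortfall would equal this many percentage points.
points = relative_to_points(0.15, -0.13)
print(f"gap in percentage points: {points:.2f}")
```

The conversion shows why the two units should not be compared directly: a relative gap only translates into percentage points once a baseline rate is fixed.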


Next, the authors show that potential ratings have predictive power for future performance, indicating that managers possess some private information not captured by backward-looking performance scores. However, controlling for both current performance and current potential, women receive higher future performance ratings than otherwise similar men. This indicates that women systematically outperform managerial forecasts of their potential. Despite this updating opportunity, women continue to receive lower potential ratings in subsequent years, indicating persistent belief distortions or incentive-driven rating behavior.


In examining mechanisms, the authors test whether women are rated lower due to anticipated attrition or childcare-related leave. They find that women are significantly less likely to leave the firm and are accurately assessed as lower flight risks by managers. Nonetheless, employees assessed as higher flight risks—disproportionately men—receive higher potential ratings, are promoted more, and receive higher pay, despite performing no better on average. The authors show that men passed over for promotion are up to 85 percent more likely to exit than similarly situated women, which rationalizes managers’ strategic use of potential ratings as a retention lever. The gender gap in potential ratings narrows significantly among the highest-risk-of-loss employees, consistent with strategic retention logic.


The authors also examine heterogeneity by age, promotion desirability, and geography. Gender gaps persist across most subsamples, though gaps are smaller for promotions not requiring relocation. Finally, the IV analysis indicates that marginally promoted women outperform marginally promoted men, implying talent misallocation generated by biased or incentive-distorted potential ratings. 


Conclusion

The authors conclude that subjective assessments of potential meaningfully contribute to gender gaps in promotion and compensation by discounting women’s future performance relative to realized outcomes. Potential ratings contain informational content but also encode stereotyping and strategic retention incentives. As a result, firms allocate too many advancement opportunities to men who may be more mobile but not more productive, while passing over higher-performing women at the margin. This produces measurable misallocation of talent in managerial hierarchies. The authors evaluate candidate remedies such as changing managerial assignments or removing potential ratings from promotion decisions, but they find limited efficiency gains absent debiasing. The study demonstrates that the design of subjective evaluation systems has first-order implications for advancement, earnings, and organizational productivity, and that reforms require attention to both belief formation and managerial incentives.


The Policy Scientist

Offering Concise Summaries* of the Most Recent, Impactful Public Policy Research

*Summaries Powered by ChatGPT
