Are Algorithms Used in Higher Education Predictive Models Reinforcing Racial Inequities in Student Success?
- Greg Thorson
- Jun 11
- 5 min read

This study investigates whether predictive algorithms used in higher education exhibit racial bias that disadvantages Black students. Using student-level administrative data from the Virginia Community College System, the authors developed two random forest models to predict course and degree completion. They found evidence of calibration and accuracy bias: among students with the same predicted risk scores, Black students had lower actual success rates than their White peers. At some targeting thresholds, calibration bias was more than five times greater than at others. Estimating race-specific models reduced bias for course completion but not for degree completion, highlighting the contextual nature of algorithmic bias.
Full Citation and Link to Article
Bird, K. A., Castleman, B. L., & Song, Y. (2025). Are algorithms biased in education? Exploring racial bias in predicting community college student success. Journal of Policy Analysis and Management, 44(2), 379–402. https://doi.org/10.1002/pam.22569
Extended Summary
Central Research Question
The central research question guiding this study is whether predictive algorithms used to identify students in need of academic support in higher education exhibit racial bias. Specifically, the authors investigate whether race-blind machine learning models used to predict course and degree completion systematically disadvantage Black students by underestimating their risk levels. The question is motivated by the growing use of predictive analytics in educational decision-making and by concerns that such tools may exacerbate racial disparities in outcomes.
Previous Literature
The study builds upon an existing body of literature that recognizes the dual potential of predictive analytics in education: while these tools can improve the targeting of student support services, they may also reinforce existing inequities. Previous research has documented how predictive models often exhibit disparities in performance across racial groups, particularly in terms of calibration and accuracy. Much of the prior work in algorithmic fairness has emphasized computational techniques to mitigate bias, yet few studies have examined these concerns in real-world educational contexts, especially where algorithms directly influence resource allocation. This paper fills that gap by focusing on policy-relevant metrics of fairness and applying them to administrative data from a large statewide community college system.
Data
The analysis uses comprehensive student-level administrative records from the Virginia Community College System (VCCS), covering 23 colleges over a 12-year period from 2007 to 2019. The dataset includes over 5 million student-course records and more than 380,000 student-level observations for the degree completion model. The data encompass demographic characteristics (e.g., race, gender, age), academic information (e.g., GPA, credits attempted and earned), course enrollments, grades, financial aid status, and degree outcomes. The use of a large, diverse, and longitudinal dataset allows the researchers to rigorously train, validate, and test machine learning models while examining subgroup disparities.
Methods
The authors estimate two predictive models using random forests: one to predict course completion (defined as earning a grade of A, B, or C) and another to predict degree completion within six years of first enrollment. Each model is developed using a training cohort of historical data and tested on a future cohort to simulate real-world application. Both models are race-blind in the sense that race is not included as a predictor.
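A minimal sketch of this setup, using synthetic data and hypothetical feature names (the paper's actual feature set, implementation, and tuning are far richer than shown here):

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 20_000

# Synthetic stand-in for VCCS student records; illustrative only.
df = pd.DataFrame({
    "cohort_year":    rng.integers(2007, 2020, n),
    "gpa":            rng.uniform(0, 4, n),
    "credits_earned": rng.integers(0, 60, n),
    "financial_aid":  rng.integers(0, 2, n),
    "race":           rng.choice(["Black", "White"], n),
})
df["completed"] = (rng.random(n) < 0.2 + 0.15 * df["gpa"]).astype(int)

features = ["gpa", "credits_earned", "financial_aid"]  # race excluded

# Temporal split: train on earlier cohorts, test on later ones,
# mimicking deployment of the model on future students.
train = df[df["cohort_year"] <= 2015].copy()
test = df[df["cohort_year"] > 2015].copy()

model = RandomForestClassifier(n_estimators=300, random_state=0)
model.fit(train[features], train["completed"])
test["p_success"] = model.predict_proba(test[features])[:, 1]
```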
To evaluate algorithmic bias, the study assesses two fairness metrics (a computational sketch follows the two definitions below):
Calibration Bias: This occurs when students of different racial groups receive the same predicted success score but have systematically different observed success rates. For example, if Black and White students are both predicted to have a 70% chance of success, but only White students actually succeed at that rate, the model is miscalibrated by race.
Accuracy Bias: This refers to differences in predictive accuracy across groups, measured using the area under the receiver operating characteristic curve (AUC or c-statistic). A higher AUC indicates better discriminatory power.
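In code, the two metrics could be audited roughly as follows; `calibration_gap` and `auc_by_group` are illustrative helpers, not the authors' implementation:

```python
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

def calibration_gap(scores, outcomes, groups, n_bins=10):
    """Average White-minus-Black difference in observed success rates
    among students whose predicted scores fall in the same bin."""
    d = pd.DataFrame({"score": scores, "y": outcomes, "g": groups})
    d["bin"] = pd.qcut(d["score"], n_bins, duplicates="drop")
    rates = d.groupby(["bin", "g"], observed=True)["y"].mean().unstack()
    return (rates["White"] - rates["Black"]).mean()

def auc_by_group(scores, outcomes, groups):
    """C-statistic (AUC) computed separately within each group."""
    d = pd.DataFrame({"score": scores, "y": outcomes, "g": groups})
    return {g: roc_auc_score(sub["y"], sub["score"])
            for g, sub in d.groupby("g")}

# Illustration on synthetic scores with injected miscalibration.
rng = np.random.default_rng(1)
n = 10_000
groups = rng.choice(["Black", "White"], n)
scores = rng.uniform(0.05, 0.95, n)
outcomes = (rng.random(n) < scores - 0.05 * (groups == "Black")).astype(int)

print(calibration_gap(scores, outcomes, groups))   # roughly 0.05
print(auc_by_group(scores, outcomes, groups))
```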
Additionally, the authors explore several model modifications to assess potential mitigations, including the addition of race as a predictor, creation of race-specific models, and stratification by student experience level (i.e., first-time vs. returning students).
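A sketch of the race-specific variant, continuing from the modeling sketch above (the `train`, `test`, and `features` objects are the hypothetical ones defined there):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# One model per racial group, each trained only on that group's records.
race_models = {
    g: RandomForestClassifier(n_estimators=300, random_state=0)
         .fit(sub[features], sub["completed"])
    for g, sub in train.groupby("race")
}

# Score each test-set student with the model for their own group.
test["p_race_specific"] = np.nan
for g, m in race_models.items():
    mask = test["race"] == g
    test.loc[mask, "p_race_specific"] = m.predict_proba(
        test.loc[mask, features])[:, 1]
```

Adding race as a predictor is the simpler alternative the authors also test: append the race column to the pooled model's feature list rather than fitting separate models.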
Findings/Size Effects
The analysis reveals significant evidence of racial bias in both predictive models, with important implications for the equitable distribution of academic resources.
1. Calibration Bias
In both the course and degree completion models, Black students with the same predicted probability of success as White students are less likely to succeed.
For the course model, calibration bias is most severe when broader thresholds are used to define at-risk students. At the 50% cutoff (i.e., targeting the bottom half of students), Black students are about 6 percentage points less likely to succeed than White students with the same predicted risk score. This difference shrinks to about 1.3 percentage points when targeting only the bottom 10%.
For the degree completion model, the pattern is reversed: bias is more pronounced at the bottom 10% than at the bottom 50%.
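A sketch of how such threshold comparisons can be audited, with a made-up score distribution (the cutoffs mirror the bottom-10% and bottom-50% targeting rules above):

```python
import numpy as np
import pandas as pd

def gap_at_cutoff(d, frac):
    """White-minus-Black difference in observed success rates among
    students flagged as the bottom `frac` of predicted scores."""
    flagged = d[d["score"] <= d["score"].quantile(frac)]
    rates = flagged.groupby("race")["y"].mean()
    return rates["White"] - rates["Black"]

rng = np.random.default_rng(2)
n = 10_000
d = pd.DataFrame({"race": rng.choice(["Black", "White"], n)})
d["score"] = rng.uniform(0, 1, n)
d["y"] = (rng.random(n) < d["score"] - 0.04 * (d["race"] == "Black")).astype(int)

for frac in (0.10, 0.50):  # bottom 10% vs. bottom half
    print(f"bottom {frac:.0%}: gap = {gap_at_cutoff(d, frac):.3f}")
```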
2. Accuracy Bias
The random forest models show lower predictive accuracy for Black students than for White students.
In the course model, the c-statistic is 0.8286 for White students and 0.8037 for Black students, a relative gap of 3.01%.
In the degree model, the c-statistic is 0.8981 for White students and 0.8878 for Black students, a relative gap of 1.15%.
Although modest, these differences in AUC are statistically significant and indicate reduced model performance for Black students.
3. Race-Aware Models
Including race as a predictor and estimating race-specific models both reduce calibration bias in the course completion context.
For example, when separate models are used for Black and White students, the average difference in success rates between equally scored students is reduced by 61% in the course model.
However, such modifications do not meaningfully reduce bias in the degree completion model and may even worsen it under some specifications.
4. Bias in Simple Heuristics
Common academic heuristics (e.g., GPA < 2.0, part-time enrollment) also exhibit significant calibration bias.
For instance, among students flagged as at-risk based on a GPA below 2.0, Black students are consistently less likely to succeed than White students with the same GPA.
These simple rules not only misclassify many students but also amplify existing disparities.
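Auditing a simple rule follows the same pattern as auditing a model score; a sketch, on made-up data, of checking whether a GPA < 2.0 flag is calibrated across groups:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
n = 10_000
d = pd.DataFrame({
    "race": rng.choice(["Black", "White"], n),
    "gpa":  rng.uniform(0, 4, n),
})
# Injected group-dependent gap, for illustration only.
d["success"] = (rng.random(n) <
                0.15 * d["gpa"] + 0.10 - 0.05 * (d["race"] == "Black")).astype(int)

# Among students the heuristic flags, do observed success rates differ
# by race?  A calibrated rule would show (near-)equal rates.
flagged = d[d["gpa"] < 2.0]
print(flagged.groupby("race")["success"].mean())
```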
5. Mechanisms Driving Bias
The study rules out several alternative explanations for bias, such as differential sorting into programs or underrepresentation of Black students in the training data.
The most important contributor appears to be that Black students are more likely to be first-time enrollees, meaning the model has less prior information to make accurate predictions.
Among first-time students, calibration bias is more than double that seen in returning students.
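Checking this mechanism amounts to re-running the calibration audit within strata; a sketch reusing the `calibration_gap` helper from the metrics section and the frame `d` from the threshold sketch, with a hypothetical first-time flag added:

```python
import numpy as np

# `d` holds score, y, race; `first_time` is a hypothetical column.
rng = np.random.default_rng(4)
d["first_time"] = rng.random(len(d)) < 0.4

for status, sub in d.groupby("first_time"):
    label = "first-time" if status else "returning"
    print(label, round(calibration_gap(sub["score"], sub["y"], sub["race"]), 3))
```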
Conclusion
This study provides rigorous, policy-relevant evidence that predictive models in higher education can perpetuate racial inequities even when race is excluded from the model. The authors find that race-blind algorithms systematically underestimate the risk of academic failure for Black students, leading to their under-identification as candidates for intervention. These biases are present in both machine learning models and simpler heuristics, suggesting that the problem is structural rather than purely technical.
The findings underscore the need for institutions to critically evaluate the fairness of their predictive systems, particularly in contexts where these tools inform decisions about resource allocation. The authors advocate for the cautious use of race-aware modeling approaches and emphasize the importance of collecting richer data for students with less academic history.
Importantly, the authors make their code publicly available to encourage transparency and replication. Their work serves as a call to action for educators, policymakers, and data scientists to ensure that advances in educational technology do not come at the cost of equity. Institutions deploying predictive analytics must balance accuracy with fairness and rigorously assess whether their tools are serving all students equitably.