Facial Recognition: The Technology That Arrests the Wrong Black Person
In 2019, the National Institute of Standards and Technology (NIST) evaluated 189 facial recognition algorithms from 99 developers. The finding: most algorithms showed higher false positive rates for African-American and Asian faces compared to Caucasian faces, by factors of 10 to 100. For African-American women, some algorithms produced false positive rates 100 times higher than for white men. A false positive in facial recognition means the algorithm declares a match between two different people. In a policing context, it means the system points at someone who did not commit the crime.
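To make the scale concrete, here is a back-of-the-envelope sketch of what a 100-fold difference in false positive rates does in a one-to-many database search. The gallery size and per-comparison rates below are illustrative assumptions, not NIST figures.

```python
# Illustrative sketch (not NIST's methodology): why per-comparison false
# positive rates matter in one-to-many police database searches.
# The gallery size and rates below are hypothetical round numbers.

def expected_false_matches(per_comparison_fpr: float, gallery_size: int) -> float:
    """Expected number of innocent people flagged in a single probe search."""
    return per_comparison_fpr * gallery_size

GALLERY = 500_000          # hypothetical mugshot database size
BASE_FPR = 1e-5            # hypothetical per-comparison rate for one group
DISPARITY_FACTOR = 100     # the order of magnitude NIST reported for some algorithms

for label, fpr in [("lower-FPR group", BASE_FPR),
                   ("higher-FPR group", BASE_FPR * DISPARITY_FACTOR)]:
    print(f"{label}: ~{expected_false_matches(fpr, GALLERY):.1f} false matches per search")

# Roughly 5 false matches versus roughly 500 false matches per search:
# the same procedure carries radically different misidentification risk
# depending on the demographic group of the probe face.
```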
This is not theoretical. Robert Williams of Detroit was arrested in January 2020 for a robbery he did not commit. He was detained for 30 hours before police acknowledged the facial recognition match was wrong. Michael Oliver, also of Detroit, faced a felony charge based on a facial recognition match that was wrong. Nijeer Parks of New Jersey spent 10 days in jail; the charges against him were eventually dropped. All three men are Black. Detroit's police department was using a system that, by the department's own account, misidentified people roughly 96% of the time.
The disparity exists because facial recognition algorithms are trained primarily on data sets of white faces — most were built with data scraped from public sources dominated by white subjects. An algorithm trained on non-representative data will perform poorly on underrepresented groups. The technology companies knew this. The police departments that licensed the technology were told this. They used it anyway.
Predictive Policing: The Feedback Loop That Manufactures Crime
Predictive policing systems such as PredPol and Chicago's Strategic Subject List use historical crime data to predict where crimes are likely to occur and who is likely to commit them. (ShotSpotter, an acoustic gunshot-detection system, is often grouped with them, though it detects gunfire rather than predicting crime.) The systems are marketed as objective, data-driven tools. The data they are trained on is not objective.
Historical crime data reflects not where crime actually occurred, but where police were sent to look for crime. If police are disproportionately deployed in Black and Latino neighborhoods — as they have been for decades — then crime data will show more arrests in those neighborhoods. A predictive policing algorithm trained on that data will direct more police to those neighborhoods, generating more arrests, which feeds more data back into the algorithm. The disparity is self-reinforcing. This is not the algorithm detecting crime — it is the algorithm reproducing the policing pattern that produced the original data.
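The loop is easy to see in a toy simulation. The sketch below assumes two neighborhoods with an identical underlying offense rate and allocates patrols in proportion to past arrest counts; every number in it is invented for illustration.

```python
# A minimal simulation of the feedback loop described above. All numbers are
# assumptions: two neighborhoods with the SAME underlying offense rate, but
# patrols are allocated in proportion to past arrest counts.

import random

random.seed(0)

TRUE_OFFENSE_RATE = 0.05          # identical in both neighborhoods
arrests = {"A": 60, "B": 40}      # historical arrest data skewed toward A
TOTAL_PATROLS = 100

for year in range(10):
    total = sum(arrests.values())
    for hood in arrests:
        # Patrols allocated by the "data" (past arrests), not by true crime.
        patrols = round(TOTAL_PATROLS * arrests[hood] / total)
        # More patrols -> more offenses observed -> more arrests recorded.
        new_arrests = sum(random.random() < TRUE_OFFENSE_RATE
                          for _ in range(patrols * 20))
        arrests[hood] += new_arrests

print(arrests)
# Neighborhood A ends up with far more recorded arrests than B even though
# the underlying offense rate never differed: the "predictions" reproduce
# the original deployment pattern and the gap keeps widening.
```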
"Algorithms are not neutral. They encode the values and assumptions of their designers, and when those designers have never had to think about race as a variable that affects their own lives, they build systems that ignore it — until someone else pays the price."
— Safiya Umoja Noble, Algorithms of Oppression, 2018

In New Orleans, the police department secretly contracted with a company called Palantir to build a predictive policing program — without informing the city council or the public. The program was exposed in 2018 by a journalist. In Chicago, the Strategic Subject List assigned "risk scores" to individuals — before any crime was committed — based on factors including prior arrests and social connections to other people with prior arrests. Studies found Black and Latino residents were disproportionately flagged. The program was discontinued after public criticism, but not before it had shaped years of policing decisions.
Digital Redlining: Mortgage Algorithms and the Automated Wealth Gap
In 2021, The Markup published a landmark investigation. Using 2019 federal mortgage data, it found that lenders were more likely to deny home loans to Black applicants than to white applicants with similar financial profiles, controlling for debt-to-income ratio, loan-to-value ratio, and other standard variables. The disparity appeared not just at traditional banks but at fintech lenders who marketed themselves as unbiased because their decisions were algorithmic.
How does this happen? Many mortgage algorithms use "alternative data" — factors beyond traditional credit scores, including employment history, purchasing patterns, and neighborhood characteristics. Some algorithms use zip code as a proxy variable. Zip codes in America are not race-neutral: they are the product of 80 years of racially structured residential segregation. An algorithm that penalizes borrowers from "high-risk" zip codes is not being color-blind — it is using geography as a stand-in for race, which is exactly what redlining did with maps and red ink. The discrimination is structurally identical; the mechanism is digital.
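A minimal sketch of how this works, with made-up applicants and a hypothetical scoring rule: the model never sees race, only a ZIP-code tier, yet two financially identical applicants get different answers.

```python
# A minimal sketch of proxy discrimination, with invented numbers. The model
# below never sees race; it only sees a ZIP-code "tier". Because residential
# segregation means ZIP code correlates with race, penalizing the ZIP code
# penalizes the group.

from dataclasses import dataclass

@dataclass
class Applicant:
    race: str          # never given to the model; used only to audit outcomes
    income: int
    zip_tier: str      # "A" = historically favored area, "C" = formerly redlined

def model_score(income: int, zip_tier: str) -> float:
    """Hypothetical underwriting score: income helps, a 'risky' ZIP hurts."""
    score = income / 1000
    if zip_tier == "C":
        score -= 40    # penalty learned from historical default data
    return score

# Two applicants with identical finances, different neighborhoods.
applicants = [
    Applicant("white", income=70_000, zip_tier="A"),
    Applicant("Black", income=70_000, zip_tier="C"),
]

for a in applicants:
    approved = model_score(a.income, a.zip_tier) >= 50
    print(a.race, a.zip_tier, "approved" if approved else "denied")
# Identical incomes, different outcomes: the ZIP tier carries the racial
# history the model claims not to know about.
```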
The Fair Housing Act of 1968 prohibits racial discrimination in housing, and courts recognize "disparate impact" claims that do not require proof of discriminatory intent. But proving that an opaque algorithmic system produced that impact is difficult in practice. The Consumer Financial Protection Bureau and Department of Justice have opened investigations into algorithmic lending bias; as of 2024, no major fintech lender had faced a successful fair housing enforcement action. The law was not designed for discrimination that lives in a model's weights and parameters.
Hiring Algorithms, Healthcare Algorithms, and the Full Architecture of Automated Bias
Amazon built a machine learning recruiting tool and tested it between 2014 and 2017. The algorithm was trained on résumés submitted to Amazon over a 10-year period, a period during which Amazon's technology workforce was predominantly male. The algorithm learned to penalize résumés that included the word "women's" (as in "women's chess club") and downgraded graduates of all-women's colleges. Amazon scrapped the tool when it became clear the bias could not be reliably corrected; the episode became public only through news reporting in 2018, not through any external accountability mechanism.
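The mechanism is mundane. A toy sketch (not Amazon's actual system, and with invented training examples) shows how a word like "women's" can acquire a negative weight simply because it correlates with how past, biased decisions went.

```python
# A toy illustration of how a model trained on a historically male-dominated
# hiring record can learn to penalize the token "women's". The "training
# data" below is invented; the mechanism is the point: a word's weight is
# just its correlation with past hiring outcomes.

from collections import defaultdict

# (resume text, was_hired) pairs from a hypothetical skewed history
history = [
    ("software engineer chess club", 1),
    ("software engineer robotics", 1),
    ("software engineer women's chess club", 0),
    ("software engineer women's coding society", 0),
]

score = defaultdict(float)
for text, hired in history:
    for word in set(text.split()):
        score[word] += 1 if hired else -1   # naive outcome-correlation weight

print(sorted(score.items(), key=lambda kv: kv[1]))
# "women's" ends up with the most negative weight, not because it predicts
# job performance, but because it predicts how past (biased) decisions went.
```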
In healthcare, a 2019 study published in Science found that a widely used commercial algorithm — deployed across hundreds of hospitals to identify patients who needed additional care — systematically underestimated the health needs of Black patients. The algorithm used healthcare costs as a proxy for health needs. Because Black patients, historically, have received less healthcare due to discrimination and access barriers, they have lower historical healthcare costs — which the algorithm interpreted as being healthier. The result: Black patients were systematically under-referred for additional care compared to white patients with equivalent actual health needs.
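A worked sketch of that proxy problem, with invented patients and thresholds: two people carry the same illness burden, but one has historically generated lower costs because of reduced access to care, and the two are ranked very differently once past cost stands in for health need.

```python
# A worked sketch of the cost-as-proxy problem the Science study describes,
# with invented numbers. An algorithm that ranks patients by predicted cost
# refers the wrong one.

patients = [
    # hypothetical name, chronic conditions, historical annual cost ($)
    {"name": "patient_1", "conditions": 4, "historical_cost": 12_000},
    {"name": "patient_2", "conditions": 4, "historical_cost": 5_000},  # same illness, less past access to care
]

REFERRAL_THRESHOLD = 10_000  # refer patients predicted to cost more than this

for p in patients:
    # Proxy label: predicted future cost, approximated here by past cost.
    referred_by_cost = p["historical_cost"] >= REFERRAL_THRESHOLD
    # Direct label: actual health need, measured here by condition count.
    referred_by_need = p["conditions"] >= 3
    print(p["name"], "cost-proxy referral:", referred_by_cost,
          "| true-need referral:", referred_by_need)

# patient_2 has the same illness burden but is screened out by the cost
# proxy: the algorithm reads a history of under-treatment as good health.
```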
"The model is not neutral. It was built on data that reflects a world in which Black patients had fewer resources, received less care, and were denied access to services. The algorithm learned those patterns and called them health status."
— Ziad Obermeyer et al., Dissecting racial bias in an algorithm used to manage the health of populations, Science, 2019

The study estimated the bias affected approximately 200 million Americans who had been scored by this system. The company that made the algorithm, Optum, updated it after the study — but only after academic researchers, not regulators or the company itself, detected and published the bias. The pattern is consistent across hiring, healthcare, housing, and criminal justice: the algorithms are built on biased data, the companies discover the bias internally and do not disclose it, and the only accountability comes from external research.
Why "The Algorithm Did It" Is Not a Defense
The standard defense when algorithmic discrimination is exposed is: the algorithm is neutral; it doesn't know about race; it is just following the data. This defense is used to claim that disparate outcomes do not constitute discrimination — because there was no discriminatory intent. This defense has two problems. The first is empirical: the algorithms frequently produce demonstrably racialized outcomes, as documented above. The second is legal and philosophical: disparate impact is illegal under the Fair Housing Act, the Equal Credit Opportunity Act, and Title VII of the Civil Rights Act — regardless of intent.
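Disparate impact is assessed from outcomes, not motives. Below is a minimal sketch of the standard calculation; the counts are invented, and the 80 percent threshold is the EEOC's four-fifths rule of thumb used in employment contexts.

```python
# A minimal sketch of how disparate impact is measured in practice: compare
# selection (approval) rates across groups. The EEOC's "four-fifths rule"
# flags a group whose rate falls below 80% of the highest group's rate.
# The applicant counts below are invented.

def selection_rate(selected: int, applicants: int) -> float:
    return selected / applicants

groups = {
    "group_a": selection_rate(selected=480, applicants=1000),  # 48%
    "group_b": selection_rate(selected=300, applicants=1000),  # 30%
}

highest = max(groups.values())
for name, rate in groups.items():
    impact_ratio = rate / highest
    flag = "potential disparate impact" if impact_ratio < 0.8 else "ok"
    print(f"{name}: rate={rate:.0%}, ratio={impact_ratio:.2f} -> {flag}")

# No intent variable appears anywhere in this calculation. Disparate impact
# is assessed from outcomes, which is why "the algorithm didn't know about
# race" is not, by itself, a defense.
```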
The deeper problem is structural. Every algorithm trained on American data is trained on data produced by American history — a history of explicit racial discrimination in housing, employment, education, and criminal justice. That history shaped which zip codes have high property values, which names appear on successful résumés, which neighborhoods dominate arrest records, which patients had expensive healthcare. A machine trained on that data will reproduce those patterns unless it is specifically designed not to — and most are not.
Virginia Eubanks, in Automating Inequality (2018), documented how automated decision-making systems used in public services — welfare benefits, child protective services, public housing eligibility — systematically disadvantaged poor people and people of color. Her conclusion: these systems do not introduce new forms of discrimination. They automate existing ones, at greater scale, with greater opacity, and with less accountability than their human predecessors. The machine did not create the inequality. But it runs it more efficiently, and it can point to its own objectivity while doing so.