[Image: a split-screen view of a suburban neighborhood, one side showing a well-maintained home with a green lawn, the other showing the same home overlaid with data visualizations and algorithmic pricing grids.]
Policy & Regulation

Six Federal Agencies Wrote a Rule for AI Home Appraisals. It Says 'Don't Discriminate' but Doesn't Say How to Check.

By Catherine Chen · May 2, 2026

In June 2024, the Consumer Financial Protection Bureau, the Federal Housing Finance Agency, the FDIC, the Federal Reserve, the NCUA, and the OCC issued a final rule governing automated valuation models used in residential mortgage lending. Six agencies. One rule. Thirteen years after the Dodd-Frank Act mandated it. And the central enforcement mechanism for preventing racial discrimination in algorithmic home pricing is five words long: "comply with applicable nondiscrimination laws."

No testing protocol, no required audit frequency, no mandated demographic error-rate reporting, no specification of which statistical method should be used to determine whether an AVM produces disparate outcomes across racial groups. Just the word "comply" and the assumption that institutions will figure out what that means on their own.

I have read a lot of federal rules over the years, and most of the vague ones fall into two categories. Some are vague because the problem is genuinely hard and regulators are deferring to evolving best practices. Others are vague because specificity would create accountability, and accountability would create litigation, and litigation would slow down a market that six agencies have collectively decided should keep moving, so the rule gestures at a standard without ever defining one. This reads like the second category.

What AVMs Actually Are

An automated valuation model is software that estimates what your home is worth. It ingests property records, recent comparable sales, tax assessments, local market trends, and sometimes satellite imagery or MLS listing data, then outputs a number. If you have ever checked your home's value on Zillow, Redfin, or Realtor.com, you have used one.

Consumer-facing tools like Zillow's Zestimate report a national median error of 1.83% for on-market homes, which sounds precise until you check the off-market figure, where most homeowners actually live most of the time: 7.01%. On a $400,000 home, that 7.01% error is a $28,040 swing in either direction; on a $600,000 home, the same error rate widens the uncertainty to $42,060 before anyone has opened a door or walked through a room. An academic comparison published in the Journal of Real Estate Finance and Economics found that even the best-performing ML model (XGBoost) achieved only a 5.17% median absolute percentage error on out-of-sample residential predictions.
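To make the arithmetic concrete, here is a minimal sketch; the error_band helper is our own illustration, not any vendor's published method, and note that a median error means half of all estimates miss by more than this.

```python
# Dollar uncertainty implied by a median absolute percentage error.
# Percentages mirror the Zestimate medians cited above; illustrative only.

def error_band(home_value: float, median_error_pct: float) -> tuple[float, float]:
    """Return the (low, high) dollar band implied by a symmetric percentage error."""
    swing = home_value * median_error_pct / 100
    return home_value - swing, home_value + swing

for value in (400_000, 600_000):
    low, high = error_band(value, 7.01)  # off-market median error
    print(f"${value:,}: ${low:,.0f} to ${high:,.0f}")
# $400,000: $371,960 to $428,040  (a $28,040 swing each way)
# $600,000: $557,940 to $642,060  (a $42,060 swing each way)
```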

These are aggregate numbers, and they obscure the most important question: whose homes get the biggest errors?

$162B
Total cost of home devaluation in majority-Black neighborhoods across 113 U.S. metro areas, per Brookings Institution analysis of FHFA appraisal data.

Where the Errors Concentrate

Brookings researchers Jonathan Rothwell and Andre Perry analyzed FHFA neighborhood-level appraisal data and found that homes in majority-Black neighborhoods are valued 21 to 23% below what identical properties would be worth in non-Black neighborhoods. Appraisals in majority-Black neighborhoods are 1.9 times more likely to come in below the contract price compared to majority-white neighborhoods. After adjusting for every measurable characteristic of the homes and their surroundings, a persistent 4.4% bias remained. And roughly 10 percent of appraisals in majority-Black neighborhoods came in below the contract price beyond what would be expected absent racial bias.

That was the human appraiser problem, and algorithms were supposed to fix it, because a machine does not notice the race of the homeowner, does not adjust its comp selection based on the neighborhood's demographics, does not carry the unconscious biases that congressional hearings have documented in appraiser after appraiser after appraiser.

They did not fix it. A HUD Cityscape study examined whether improved data and modern machine learning techniques would eliminate racial disparities in automated valuations, and it found that AVMs still produce systematically larger errors in majority-Black neighborhoods. Property condition data, which researchers hypothesized might close the gap by giving models richer inputs about the actual state of each home, did not solve the problem, and neither did more sophisticated algorithms like XGBoost, random forests, or neural networks: the architecture changed, the disparities persisted.

Because the training data is the problem, and no amount of algorithmic sophistication can overcome it. AVMs learn from historical sales, and historical sales in majority-Black neighborhoods reflect decades of redlining, disinvestment, discriminatory lending, and racially biased prior appraisals. A model trained on those prices will reproduce their biases with mathematical precision and call it objectivity.
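A toy simulation makes the mechanism visible. Assume two neighborhoods with identical housing stock, one of which carries a 20 percent historical price discount from past discrimination; every number below is invented, and no commercial AVM is this simple, but the failure mode is the same.

```python
# Minimal sketch: a model trained on historically depressed sale prices
# reproduces the discount and reports it back as an "objective" valuation.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000
sqft = rng.normal(1_800, 300, n)            # identical housing stock in both groups
group = rng.integers(0, 2, n)               # 0 = neighborhood A, 1 = neighborhood B
fair_value = 150 * sqft                     # same true value per square foot
# Historical sales in neighborhood B carry a 20% discount from past discrimination.
sale_price = fair_value * np.where(group == 1, 0.80, 1.00)
sale_price *= rng.normal(1.0, 0.05, n)      # ordinary market noise

# Real AVMs absorb location through comp selection and geographic features;
# a single neighborhood dummy is enough to absorb the historical discount here.
X = np.column_stack([sqft, group])
model = LinearRegression().fit(X, sale_price)
pred = model.predict(X)

for g, name in ((0, "A"), (1, "B")):
    ratio = pred[group == g].mean() / fair_value[group == g].mean()
    print(f"neighborhood {name}: predicted value / fair value = {ratio:.2f}")
# Prints roughly 1.00 for A and 0.80 for B.
```

Swapping LinearRegression for XGBoost or a neural network does not change the outcome, which is the HUD finding in miniature: the bias lives in the training labels, not the architecture.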

What the Rule Requires

Read the final rule and you will find five requirements for institutions using AVMs in mortgage lending: first, ensure a high level of confidence in estimates; second, protect against the manipulation of data; third, seek to avoid conflicts of interest; fourth, require random sample testing and reviews; and fifth, comply with applicable nondiscrimination laws.

Requirements one through four are procedural and auditable, the kind of provisions a compliance officer can build a checklist around. A regulator can ask to see your confidence intervals, your data-integrity controls, your conflict-of-interest policies, your testing results, and the institution either has them or does not. Requirement five is fundamentally different in character: it points at the Fair Housing Act, the Equal Credit Opportunity Act, and every other nondiscrimination statute on the books without specifying how compliance should be measured, how often it should be tested, or what statistical threshold would constitute a violation.

Consider what a lender implementing this rule must actually do: adopt "policies, practices, procedures, and control systems" to ensure nondiscrimination compliance. But which policies? Disparate impact testing, matched-pair analysis, demographic parity in error rates, equalized odds? Every fair-ML researcher knows these metrics conflict with each other, that achieving parity on one often degrades another, and that the choice between them is ultimately a values question, not a technical one. The rule picks none of them, leaving each institution to interpret "comply" however its lawyers find most defensible and its regulators find least objectionable.
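A brief sketch of the conflict, with invented data: the same model output passes one plausible reading of "comply" and fails another.

```python
# Two defensible compliance tests applied to the same synthetic AVM output.
# All numbers are invented; the point is that the rule picks neither test.
import numpy as np

rng = np.random.default_rng(1)
true_a = rng.normal(400_000, 60_000, 5_000)   # true values, neighborhood group A
true_b = rng.normal(300_000, 60_000, 5_000)   # true values, neighborhood group B
# The model is unbiased on average for both groups but noisier for group B.
est_a = true_a * rng.normal(1.00, 0.04, 5_000)
est_b = true_b * rng.normal(1.00, 0.09, 5_000)

def signed_bias(est, true):
    """Mean signed relative error: parity here looks like fairness."""
    return np.mean((est - true) / true)

def undervalue_rate(est, true, tol=0.05):
    """Share of homes undervalued by more than 5%: a very different test."""
    return np.mean((est - true) / true < -tol)

print(f"signed bias      A {signed_bias(est_a, true_a):+.3f}  B {signed_bias(est_b, true_b):+.3f}")
print(f"undervalue rate  A {undervalue_rate(est_a, true_a):.1%}  B {undervalue_rate(est_b, true_b):.1%}")
# Signed bias: roughly +0.000 for both groups, so a parity-of-means test passes.
# Undervaluation rate: ~11% for A versus ~29% for B, so an error-rate test
# fails badly. Same model, same outputs, opposite compliance verdicts.
```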

A lender could, in theory, satisfy requirement five by issuing a written policy stating "we comply with all applicable nondiscrimination laws" and conducting zero demographic testing of its AVM's outputs. Nothing in the rule text prevents this interpretation.

Meanwhile, Fannie Mae Scaled the Algorithm

While six agencies spent 13 years crafting this rule, Fannie Mae moved fast. Appraisal waivers, rebranded in September 2025 as "value acceptance," now cover approximately 46% of all GSE-backed mortgage valuations. In 2017 it was 5%. Fannie Mae estimates the shift has saved borrowers $2.5 billion since 2020.

In Q1 2025, Fannie Mae raised the eligible loan-to-value ratio for value acceptance from 80% to 90%. That means a borrower putting 10% down on a primary residence can now receive an algorithmically determined home value without any human appraiser ever setting foot on the property.
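The interaction between that leverage change and the error rates quoted earlier is worth working out. A hypothetical example, assuming an AVM that overshoots by the 7.01 percent off-market median error cited above:

```python
# Hypothetical purchase at the new 90% LTV ceiling, where the accepted
# algorithmic value overshoots the true value by the 7.01% median error.
price_avm = 400_000               # algorithm's value, accepted without appraisal
down = 0.10 * price_avm           # 10% down payment
loan = price_avm - down           # $360,000 note

true_value = price_avm / 1.0701   # what the home would actually be worth
true_ltv = loan / true_value
print(f"LTV on paper: 90.0%   LTV against true value: {true_ltv:.1%}")
# ~96.3%: the borrower holds materially less real equity than the file
# shows, and no appraiser ever saw the property.
```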

46%
Share of GSE-backed mortgage valuations now using algorithmic "value acceptance" instead of a traditional human appraisal, up from 5% in 2017.

Fannie Mae calls this "valuation modernization," and it is also a rational response to a workforce crisis: appraiser ranks have declined 30% since 2007, licensing requirements remain onerous enough to discourage younger entrants from pursuing the credential, and you cannot solve a staffing shortage by hiring people who do not exist, so you solve it by automating the function they used to perform. The economic logic is sound, and the civil rights question hangs in the air unanswered.

Three Facts Nobody Is Connecting

Set three facts side by side. First: the federal AVM rule mandates nondiscrimination compliance without specifying any testing methodology. Second: HUD's own peer-reviewed research demonstrates that AVMs produce larger errors in majority-Black neighborhoods regardless of which machine learning technique is deployed. Third: Fannie Mae expanded AVM-based valuations to 90% LTV eligibility while that second fact remains unresolved.

A federal rule now governs algorithmic home appraisals but deliberately avoids prescribing how lenders should test for racial bias; a sister federal agency's own research confirms that the bias exists in these very algorithms and is not eliminated by better models or richer training data; and a government-sponsored enterprise has simultaneously scaled the use of these algorithms to cover nearly half of all conforming mortgage valuations, at higher leverage ratios than ever before.

Nobody in a regulatory filing or agency press release has addressed the contradiction directly. The CFPB's announcement of the final rule mentioned algorithmic bias in general terms but did not cite the HUD Cityscape findings; FHFA's announcement of the expanded value acceptance program did not reference the AVM rule's nondiscrimination requirement. These are parallel conversations about the same technology affecting the same asset class, conducted by agencies that share office buildings in Washington, and they are not talking to each other.

What This Means If You Are Buying or Selling a Home

If your mortgage goes through Fannie Mae or Freddie Mac, there is roughly a coin-flip chance that an algorithm determined your home's value instead of a human appraiser, and you may not have been told which method was used. Lenders are not required to disclose the valuation method at the time of origination, and Fannie Mae replaced the term "appraisal waiver" with "value acceptance" in part to reduce borrower confusion about what the distinction even means, which has the secondary effect of making the distinction harder to notice.

If you are a homeowner in a majority-Black neighborhood, the Brookings data says your home is statistically more likely to be undervalued by either method. But at least with a human appraiser, you have the reconsideration of value (ROV) process established by CFPB guidance: you can request a second look, submit comps, argue your case. With an AVM, there is no appraiser to appeal to. You get a number generated by a model you cannot see, trained on data you cannot inspect, operating under a nondiscrimination standard that does not specify how compliance is measured.

Practical advice for homebuyers and sellers today:

Ask your lender whether a traditional appraisal or value acceptance will be used on your transaction. You have the right to request a full appraisal even when an AVM-based waiver is offered, though you will pay $400 to $600 for it. If you suspect the valuation is low, request a reconsideration of value through your lender and submit your own comparable sales data. If you believe the undervaluation is discriminatory, file a complaint with the CFPB and your state attorney general's office. Fair Housing Act protections apply regardless of whether the valuation was performed by a human or a machine.

Strongest Counterargument

Human appraisers are demonstrably worse. The Brookings analysis found homes in majority-Black neighborhoods are undervalued 21 to 23% by human appraisals, and the HUD study found that AVMs, while biased, produce errors that are more consistent and potentially more correctable than human judgment calls. An algorithm at least applies the same methodology to every property; a human appraiser might consciously or unconsciously adjust comparable selections based on neighborhood demographics, a practice documented in congressional testimony. Arguably, a biased but auditable algorithm is better than a biased and unauditable human, because you can at least measure algorithmic bias at scale and correct for it programmatically. The rule's vagueness on testing methodology may also be deliberate flexibility, allowing institutions to adopt the fair-ML approach best suited to their particular AVM rather than mandating a one-size-fits-all test that locks in a specific statistical definition of fairness that may become outdated.

This argument has real force. I am not fully persuaded by it, because "better than the broken thing it replaced" is a low bar for a system that determines the largest financial asset most Americans will ever own, and because the rule's flexibility is indistinguishable from the absence of a requirement when no enforcement mechanism compels any specific action. But the argument is legitimate, and anyone analyzing this policy space should engage with it honestly.

Limitations

Major AVM vendors, including CoreLogic, Black Knight (now ICE Mortgage Technology), and Zillow, do not publish error rates disaggregated by neighborhood racial demographics. We are therefore relying on academic and government studies that may use different model architectures than commercial systems. The HUD Cityscape study used specific ML methods on specific datasets; commercial AVMs may perform differently. Fannie Mae's $2.5 billion savings estimate is aggregate and does not reveal how savings distribute across demographic groups; it is possible that the expansion of value acceptance disproportionately benefits neighborhoods with abundant comparable sales data, which tend to be wealthier and whiter. The federal rule has not yet taken full effect, and agencies may issue supplementary guidance on nondiscrimination testing; this analysis is based on the rule text as published. Brookings' 21 to 23% devaluation figure includes factors beyond appraisal bias, including lending practices, consumer preferences, and neighborhood investment patterns; appraisal bias alone accounts for an estimated 9 to 19% of the total gap. Finally, this article does not address the separate question of property tax assessments, which use similar algorithmic methods but operate under different regulatory frameworks and create distinct equity concerns.
