Predicting Home Prices in Philadelphia
Improving Property Tax Assessments
Alejandro Duque, Ethan Harner, Wesley Nay, Umair Bin Saad
NHD&S Advisors
2026-03-17
Motivation & Research Question
Motivation
The City of Philadelphia is seeking to improve its Automated Valuation Model (AVM) for property tax assessments. An accurate AVM keeps assessed values in line with current market values, which is the basis for fair taxation.
Research Question
Which structural characteristics, spatial features, and fixed effects best predict property sale prices in Philadelphia?
Data Sources
- Philadelphia property sales, 2024–2025
- Census ACS 2024 5-Year Estimates
  - income
  - educational attainment
  - racial and ethnic composition
  - population density
- OpenDataPhilly
  - parks
  - SEPTA and Regional Rail access
  - violent crime
  - vacant buildings
  - neighborhood boundaries
Spatial Pattern of Sale Prices
Residential sale prices across the city
Neighborhood value categories
Model Building
We built four models in sequence
- Model 1: Structural housing features only
- Model 2: Structural features + census tract characteristics
- Model 3: Structural + census + spatial accessibility and neighborhood conditions
- Model 4: Full model with fixed effects and interaction terms
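The nested structure of the four models can be sketched as cumulative formula strings, written in the R-style syntax that formula interfaces such as statsmodels accept. This is a minimal illustration, not the specification used in the analysis: every column name, the log-price outcome, and the particular interaction term are hypothetical placeholders.

```python
# Hypothetical predictor blocks; the names are placeholders, not the project's columns.
BLOCKS = {
    "structural": ["livable_area", "bathrooms", "garage_spaces"],
    "census": ["median_income", "pct_bachelors"],
    "spatial": ["dist_downtown", "vacancy_rate", "violent_crime", "nearby_sale_price"],
    "fixed_effects": ["C(neighborhood)", "livable_area:C(neighborhood_category)"],
}


def build_formulas(blocks, outcome="log_sale_price"):
    """Return one formula per model, each adding the next block of terms."""
    formulas, terms = [], []
    for block in blocks.values():
        terms = terms + block  # keep everything from the previous model
        formulas.append(f"{outcome} ~ " + " + ".join(terms))
    return formulas


formulas = build_formulas(BLOCKS)
for i, formula in enumerate(formulas, start=1):
    print(f"Model {i}: {formula}")
```

Because each model contains all terms of the one before it, any gain in fit between steps can be attributed to the block that was added.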
Final Model: Actual vs. Predicted Prices
The final model captures the broad geography of Philadelphia’s housing market. It reproduces the major high-value and low-value submarkets, though some local prediction error remains.
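One way to see where local prediction error remains is to summarize error by neighborhood. The sketch below computes mean absolute percentage error (MAPE) per group. The metric choice, the function name `mape_by_group`, and the sample records are assumptions for illustration, not results from the analysis.

```python
from collections import defaultdict


def mape_by_group(records):
    """Mean absolute percentage error of predictions, grouped by neighborhood.

    `records` is an iterable of (neighborhood, actual_price, predicted_price).
    """
    errors = defaultdict(list)
    for hood, actual, predicted in records:
        errors[hood].append(abs(predicted - actual) / actual)
    return {hood: sum(errs) / len(errs) for hood, errs in errors.items()}


# Made-up sales for illustration only.
sales = [
    ("A", 200_000, 220_000),
    ("A", 400_000, 380_000),
    ("B", 100_000, 150_000),
]
result = mape_by_group(sales)
for hood, mape in sorted(result.items()):
    print(f"{hood}: {mape:.1%}")
```

Neighborhoods with a high MAPE are candidates for the closer look described under "Where the Model Struggles".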
What Mattered Most
Strong predictors in the final model
- Total livable area
- Nearby sale prices
- Neighborhood category / neighborhood fixed effects
- Distance to downtown
- Neighborhood vacancy
- Violent crime
- Bathrooms and garage spaces
Structural characteristics matter, but they are not enough on their own. The strongest model is the one that combines housing features with neighborhood and spatial context.
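A "nearby sale prices" predictor is often built as the average price of the k nearest other sales. The sketch below shows that idea under stated assumptions: coordinates are already in a projected system where Euclidean distance is meaningful, k = 5 is an arbitrary default, and the function name is hypothetical. It is not necessarily how the feature was constructed in this analysis.

```python
import math


def nearby_price(sales, k=5):
    """For each sale, average the prices of its k nearest other sales.

    `sales` is a list of (x, y, price) tuples. The sale itself is excluded
    so that its own price never leaks into its own feature.
    """
    features = []
    for i, (xi, yi, _) in enumerate(sales):
        others = [
            (math.dist((xi, yi), (xj, yj)), price)
            for j, (xj, yj, price) in enumerate(sales)
            if j != i
        ]
        others.sort(key=lambda pair: pair[0])
        nearest = others[:k]
        if not nearest:
            features.append(None)  # no other sales to average
        else:
            features.append(sum(p for _, p in nearest) / len(nearest))
    return features


# Made-up points: two sales close together on the left, two on the right.
demo = [(0, 0, 100), (1, 0, 200), (10, 0, 300), (11, 0, 400)]
lagged = nearby_price(demo, k=1)
```

In practice the neighbor set would also be limited to sales that closed before the target sale, so the feature only uses information available at assessment time.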
Where the Model Struggles
- unusual or luxury properties
- rapidly changing neighborhoods
- homes with limited quality information in administrative data
- atypical sales that may not reflect normal market behavior
These are cases where price may depend on factors not fully observed in the data, such as renovations, block-level dynamics, investor demand, or non-standard transactions.
Recommendations
For a better AVM in Philadelphia
- incorporate neighborhood fixed effects
- include spatial measures like downtown access, vacancy, and crime
- use recent nearby sales to reflect local market conditions
- flag unusual properties for manual review
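Flagging unusual properties for manual review could be as simple as a robust outlier rule. The sketch below applies a modified z-score based on the median and the median absolute deviation (MAD) to price per square foot. The choice of variable, the 3.5 cutoff, and the function name are assumptions for illustration, not part of the analysis.

```python
from statistics import median


def flag_for_review(values, threshold=3.5):
    """Flag values whose modified z-score (median/MAD based) exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so nothing can be called unusual.
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]


# Made-up price-per-square-foot values within one neighborhood.
price_per_sqft = [150, 160, 155, 148, 152, 900]
flags = flag_for_review(price_per_sqft)
```

Using the median and MAD rather than the mean and standard deviation keeps a single extreme sale from masking itself.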
Limitations & Next Steps
Limitations
- administrative data do not fully capture housing quality
- some neighborhood measures are available only at the tract level
- the model may miss very local block-by-block market variation
- results are based on recent sales only
Next steps
- test nonlinear machine learning approaches
- add richer measures of condition and renovation quality
- explore finer-grained spatial effects
- compare results directly to the current City AVM
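A comparison with the City AVM could use ratio statistics that are common in assessment practice: the median value-to-sale ratio, the coefficient of dispersion (COD), and the price-related differential (PRD). The sketch below computes all three for one set of valuations. Choosing these metrics is an assumption, since the analysis does not name them, and the numbers are made up.

```python
from statistics import mean, median


def ratio_stats(valuations, sale_prices):
    """Median ratio, COD, and PRD for valuations measured against sale prices."""
    ratios = [v / s for v, s in zip(valuations, sale_prices)]
    med = median(ratios)
    # COD: average absolute deviation from the median ratio, as a percent of it.
    cod = 100 * mean([abs(r - med) for r in ratios]) / med
    # PRD: mean ratio divided by the sale-weighted mean ratio.
    prd = mean(ratios) / (sum(valuations) / sum(sale_prices))
    return {"median_ratio": med, "cod": cod, "prd": prd}


# Made-up values for illustration only.
sale_prices = [100, 200, 400]
model_values = [110, 190, 400]
stats = ratio_stats(model_values, sale_prices)
```

Running the same function on the City AVM's values for the same sales would put both models on a common scale. A lower COD indicates more uniform valuations, and a PRD well above 1 suggests lower-priced homes are valued at a higher share of their sale price than higher-priced homes.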