Assignment 2: Spatial Analysis and Visualization

Healthcare Access and Equity in Pennsylvania

Author

Your Name Here

Published

February 23, 2026

Assignment Overview

Learning Objectives: - Apply spatial operations to answer policy-relevant research questions - Integrate census demographic data with spatial analysis - Create publication-quality visualizations and maps - Work with spatial data from multiple sources - Communicate findings effectively for policy audiences


Part 1: Healthcare Access for Vulnerable Populations

Research Question

Which Pennsylvania counties have the highest proportion of vulnerable populations (elderly + low-income) living far from hospitals?

Your analysis should identify counties that should be priorities for healthcare investment and policy intervention.

Required Analysis Steps

Complete the following analysis, documenting each step with code and brief explanations:

Step 1: Data Collection

Your Task:

# Load required packages
#| echo: false
#| message: false
#| warning: false

library(sf)
library(tidyverse)
library(tidycensus)
library(scales)
library(RColorBrewer)
library(here)
library(tigris)

# Load spatial data

hospitals <- st_read(here("labs/lab_2/labs/hospitals.geojson"))
Reading layer `hospitals' from data source 
  `/Users/ethanharner/Documents/GitHub/CPLN5920/labs/lab_2/labs/hospitals.geojson' 
  using driver `GeoJSON'
Simple feature collection with 223 features and 11 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -80.49621 ymin: 39.75163 xmax: -74.86704 ymax: 42.13403
Geodetic CRS:  WGS 84
boundaries <- st_read(here("labs/lab_2/labs/Pennsylvania_County_Boundaries/Pennsylvania_County_Boundaries.shp"))
Reading layer `Pennsylvania_County_Boundaries' from data source 
  `/Users/ethanharner/Documents/GitHub/CPLN5920/labs/lab_2/labs/Pennsylvania_County_Boundaries/Pennsylvania_County_Boundaries.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 67 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8963377 ymin: 4825316 xmax: -8314404 ymax: 5201413
Projected CRS: WGS 84 / Pseudo-Mercator
tracts <- suppressMessages(suppressWarnings(tracts(state = "PA", cb = TRUE)))

  |                                                                            
  |                                                                      |   0%
  |                                                                            
  |=                                                                     |   1%
  |                                                                            
  |=                                                                     |   2%
  |                                                                            
  |==                                                                    |   2%
  |                                                                            
  |==                                                                    |   3%
  |                                                                            
  |===                                                                   |   4%
  |                                                                            
  |====                                                                  |   5%
  |                                                                            
  |====                                                                  |   6%
  |                                                                            
  |=====                                                                 |   7%
  |                                                                            
  |======                                                                |   8%
  |                                                                            
  |======                                                                |   9%
  |                                                                            
  |=======                                                               |  10%
  |                                                                            
  |=======                                                               |  11%
  |                                                                            
  |========                                                              |  11%
  |                                                                            
  |========                                                              |  12%
  |                                                                            
  |=========                                                             |  13%
  |                                                                            
  |==========                                                            |  14%
  |                                                                            
  |==========                                                            |  15%
  |                                                                            
  |===========                                                           |  16%
  |                                                                            
  |============                                                          |  17%
  |                                                                            
  |============                                                          |  18%
  |                                                                            
  |=============                                                         |  19%
  |                                                                            
  |==============                                                        |  20%
  |                                                                            
  |===============                                                       |  21%
  |                                                                            
  |===============                                                       |  22%
  |                                                                            
  |================                                                      |  23%
  |                                                                            
  |=================                                                     |  24%
  |                                                                            
  |=================                                                     |  25%
  |                                                                            
  |==================                                                    |  26%
  |                                                                            
  |===================                                                   |  27%
  |                                                                            
  |===================                                                   |  28%
  |                                                                            
  |====================                                                  |  29%
  |                                                                            
  |=====================                                                 |  30%
  |                                                                            
  |=====================                                                 |  31%
  |                                                                            
  |======================                                                |  31%
  |                                                                            
  |=======================                                               |  32%
  |                                                                            
  |=======================                                               |  33%
  |                                                                            
  |========================                                              |  34%
  |                                                                            
  |=========================                                             |  35%
  |                                                                            
  |=========================                                             |  36%
  |                                                                            
  |==========================                                            |  37%
  |                                                                            
  |===========================                                           |  38%
  |                                                                            
  |============================                                          |  39%
  |                                                                            
  |============================                                          |  40%
  |                                                                            
  |=============================                                         |  41%
  |                                                                            
  |==============================                                        |  42%
  |                                                                            
  |==============================                                        |  43%
  |                                                                            
  |===============================                                       |  44%
  |                                                                            
  |================================                                      |  45%
  |                                                                            
  |================================                                      |  46%
  |                                                                            
  |=================================                                     |  47%
  |                                                                            
  |==================================                                    |  48%
  |                                                                            
  |==================================                                    |  49%
  |                                                                            
  |===================================                                   |  50%
  |                                                                            
  |====================================                                  |  51%
  |                                                                            
  |====================================                                  |  52%
  |                                                                            
  |=====================================                                 |  53%
  |                                                                            
  |======================================                                |  54%
  |                                                                            
  |======================================                                |  55%
  |                                                                            
  |=======================================                               |  56%
  |                                                                            
  |========================================                              |  57%
  |                                                                            
  |========================================                              |  58%
  |                                                                            
  |=========================================                             |  59%
  |                                                                            
  |==========================================                            |  60%
  |                                                                            
  |===========================================                           |  61%
  |                                                                            
  |===========================================                           |  62%
  |                                                                            
  |============================================                          |  63%
  |                                                                            
  |=============================================                         |  64%
  |                                                                            
  |=============================================                         |  65%
  |                                                                            
  |==============================================                        |  66%
  |                                                                            
  |===============================================                       |  67%
  |                                                                            
  |===============================================                       |  68%
  |                                                                            
  |================================================                      |  69%
  |                                                                            
  |=================================================                     |  70%
  |                                                                            
  |=================================================                     |  71%
  |                                                                            
  |==================================================                    |  71%
  |                                                                            
  |===================================================                   |  72%
  |                                                                            
  |===================================================                   |  73%
  |                                                                            
  |====================================================                  |  74%
  |                                                                            
  |=====================================================                 |  75%
  |                                                                            
  |=====================================================                 |  76%
  |                                                                            
  |======================================================                |  77%
  |                                                                            
  |=======================================================               |  78%
  |                                                                            
  |=======================================================               |  79%
  |                                                                            
  |========================================================              |  80%
  |                                                                            
  |=========================================================             |  81%
  |                                                                            
  |==========================================================            |  82%
  |                                                                            
  |==========================================================            |  83%
  |                                                                            
  |===========================================================           |  84%
  |                                                                            
  |============================================================          |  85%
  |                                                                            
  |============================================================          |  86%
  |                                                                            
  |=============================================================         |  87%
  |                                                                            
  |==============================================================        |  88%
  |                                                                            
  |==============================================================        |  89%
  |                                                                            
  |===============================================================       |  90%
  |                                                                            
  |================================================================      |  91%
  |                                                                            
  |================================================================      |  92%
  |                                                                            
  |=================================================================     |  93%
  |                                                                            
  |==================================================================    |  94%
  |                                                                            
  |==================================================================    |  95%
  |                                                                            
  |===================================================================   |  96%
  |                                                                            
  |====================================================================  |  97%
  |                                                                            
  |====================================================================  |  98%
  |                                                                            
  |===================================================================== |  99%
  |                                                                            
  |======================================================================| 100%

Questions to answer: - How many hospitals are in your dataset?

There are 233 hospitals in my dataset.

  • How many census tracts?

There are 3445 census tracts.

  • What coordinate reference system is each dataset in?
Coordinate Reference System:
  User input: WGS 84 
  wkt:
GEOGCRS["WGS 84",
    DATUM["World Geodetic System 1984",
        ELLIPSOID["WGS 84",6378137,298.257223563,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["geodetic latitude (Lat)",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["geodetic longitude (Lon)",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4326]]
Coordinate Reference System:
  User input: WGS 84 / Pseudo-Mercator 
  wkt:
PROJCRS["WGS 84 / Pseudo-Mercator",
    BASEGEOGCRS["WGS 84",
        ENSEMBLE["World Geodetic System 1984 ensemble",
            MEMBER["World Geodetic System 1984 (Transit)"],
            MEMBER["World Geodetic System 1984 (G730)"],
            MEMBER["World Geodetic System 1984 (G873)"],
            MEMBER["World Geodetic System 1984 (G1150)"],
            MEMBER["World Geodetic System 1984 (G1674)"],
            MEMBER["World Geodetic System 1984 (G1762)"],
            MEMBER["World Geodetic System 1984 (G2139)"],
            MEMBER["World Geodetic System 1984 (G2296)"],
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]],
            ENSEMBLEACCURACY[2.0]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["Popular Visualisation Pseudo-Mercator",
        METHOD["Popular Visualisation Pseudo Mercator",
            ID["EPSG",1024]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["False easting",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["easting (X)",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing (Y)",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    USAGE[
        SCOPE["Web mapping and visualisation."],
        AREA["World between 85.06°S and 85.06°N."],
        BBOX[-85.06,-180,85.06,180]],
    ID["EPSG",3857]]
Coordinate Reference System:
  User input: NAD83 
  wkt:
GEOGCRS["NAD83",
    DATUM["North American Datum 1983",
        ELLIPSOID["GRS 1980",6378137,298.257222101,
            LENGTHUNIT["metre",1]]],
    PRIMEM["Greenwich",0,
        ANGLEUNIT["degree",0.0174532925199433]],
    CS[ellipsoidal,2],
        AXIS["latitude",north,
            ORDER[1],
            ANGLEUNIT["degree",0.0174532925199433]],
        AXIS["longitude",east,
            ORDER[2],
            ANGLEUNIT["degree",0.0174532925199433]],
    ID["EPSG",4269]]

The hospitals and boundaries data sets are in the WGS 84 CRS, while the tracts data set is in NAD83.


Step 2: Get Demographic Data

# Get demographic data from ACS

vars_65plus <- c(
  male_65_66   = "B01001_020",
  male_67_69   = "B01001_021",
  male_70_74   = "B01001_022",
  male_75_79   = "B01001_023",
  male_80_84   = "B01001_024",
  male_85plus  = "B01001_025",
  female_65_66  = "B01001_044",
  female_67_69  = "B01001_045",
  female_70_74  = "B01001_046",
  female_75_79  = "B01001_047",
  female_80_84  = "B01001_048",
  female_85plus = "B01001_049"
)

tract_demographics2 <- get_acs(
  geography = "tract",
  variables = c(
    total_pop = "B01003_001",
    median_income = "B19013_001",
    vars_65plus
  ),
  state = "PA",
  year = 2024,
  output = "wide"
)

tract_demographics <- tract_demographics2 %>%
  mutate(
    pop_65_plus = rowSums(
      select(., matches("male_.*E|female_.*E")),
      na.rm = TRUE
    )
  )

tract_demographics <- tract_demographics %>%
  select(GEOID, NAME, total_popE, total_popM, median_incomeE, median_incomeM, pop_65_plus)

# Join to tract boundaries

tracts_proj <- st_transform(tracts,3365)

tract_demographics <- tracts_proj %>%
  left_join(tract_demographics, by = "GEOID")

Questions to answer: - What year of ACS data are you using?

I’m using 2024 ACS data for this analysis.

  • How many tracts have missing income data?
no_income <- tract_demographics %>%
  filter(is.na(median_incomeE))

I identified 69 census tracts that have missing income data.

  • What is the median income across all PA census tracts?

The median income across all of PA’s census tracts is $75,844


Step 3: Define Vulnerable Populations

Identify census tracts with vulnerable populations based on TWO criteria: 1. Low median household income (choose an appropriate threshold)

Threshold will be households under 30% state median income OR with over 30% of total population being elderly.

  1. Significant elderly population (choose an appropriate threshold)

Your Task:

# Filter for vulnerable tracts based on your criteria

tract_demographics <- tract_demographics %>%
  mutate(percent_65_plus = pop_65_plus / total_popE)

vulnerable_tracts <- tract_demographics %>%
  filter((median_incomeE <= 0.3*75844) | percent_65_plus >= .30)

Questions to answer: - What income threshold did you choose and why?

I chose making less than or equal to 30% of state-wide median income to be the income threshold for vulnerability. This is because that is the HUD definition of “very-low income” and the threshold for qualifying for social assistance such as housing choice vouchers.

  • What elderly population threshold did you choose and why?

I chose the threshold of at least 30% of population being over 65, as after plotting the proportion of residents over 65, it showed that most tracts were clustered between 15 and 25%. This made 30% feel like a reasonable cut-off for counties that skewed older than what is typical.

  • How many tracts meet your vulnerability criteria?

281 tracts meet my criteria.

  • What percentage of PA census tracts are considered vulnerable by your definition?

Approximately 8.1% of tracts are deemed vulnerable.


Step 4: Calculate Distance to Hospitals

First I want to visualize tracts and hospitals.

ggplot() +
  geom_sf(data = hospitals, color = "blue") +
  geom_sf(data = tract_demographics, color="red") +
  theme_void() 

Your Task:

# Transform to appropriate projected CRS

tract_demographics <- st_transform(tract_demographics, crs = 3365)
vulnerable_tracts <- st_transform(vulnerable_tracts, crs = 3365)
hospitals <- st_transform(hospitals, crs = 3365)

# Calculate distance from each tract centroid to nearest hospital

tract_centroid <- st_centroid(vulnerable_tracts)

dist_matrix <- st_distance(tract_centroid, hospitals)

nearest_dist <- apply(dist_matrix, 1, min)

vulnerable_tracts$nearest_hospital_dist_ft <- as.numeric(nearest_dist)

vulnerable_tracts <- vulnerable_tracts %>%
  mutate(nearest_hospital_dist_miles = as.numeric(nearest_hospital_dist_ft / 5280))

Questions to answer: - What is the average distance to the nearest hospital for vulnerable tracts?

The average distance to the nearest hospital for vulnerable tracts is 4.7 miles.

  • What is the maximum distance?

The maximum distance is 29 miles.

  • How many vulnerable tracts are more than 15 miles from the nearest hospital?

Sixteen census tracts are more than 15 miles away from the nearest hospital.


Step 5: Identify Underserved Areas

Define “underserved” as vulnerable tracts that are more than 15 miles from the nearest hospital.

Your Task:

# Create underserved variable

vulnerable_tracts <- vulnerable_tracts %>%
  mutate(underserved = ifelse(nearest_hospital_dist_miles > 15, "underserved", "well served"),
         underserved_varb = ifelse(nearest_hospital_dist_miles > 15, 1, 0))

Questions to answer: - How many tracts are under served? - What percentage of vulnerable tracts are under served? - Does this surprise you? Why or why not?

There are sixteen under-served census tracts. This is approximately 5% of all vulnerable tracts. This suprises me, as I thought that more tracts that are vulnerable would be under-served. However, this might indicate that my selection criteria for under-served tracts biases urban areas where hopsitals are likely to be more densely concentrated.


Step 6: Aggregate to County Level

Your Task:

Reading layer `Pennsylvania_County_Boundaries' from data source 
  `/Users/ethanharner/Documents/GitHub/CPLN5920/labs/lab_2/labs/Pennsylvania_County_Boundaries/Pennsylvania_County_Boundaries.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 67 features and 19 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -8963377 ymin: 4825316 xmax: -8314404 ymax: 5201413
Projected CRS: WGS 84 / Pseudo-Mercator

Questions to answer: - Which 5 counties have the highest percentage of underserved vulnerable tracts? - Which counties have the most vulnerable people living far from hospitals? - Are there any patterns in where underserved counties are located?

Bradford, Cameron, Clarion, Clearfield, and Clinton counties have the highest share of underserved vulnerable tracts, with all of them having 100% of vulnerable tracts marked as undeserved. Counties with the most vulnerable people living far from hospitals are Pike, Clearfield, Clinton, Sullivan, and Wyoming Counties. These are the most populous counties with an average vulnerable tract distance to hospital being over 15 miles. Based on what I’ve mapped, it appears that most underserved tracts are in central and northern Pennsylvania. This makes sense as they are the furthest from major cities such as Pittsburgh and Philadelphia.

ggplot(county_analysis) + 
  geom_sf(aes(fill = underserved_tracts_county)) +
  scale_fill_viridis_c()+
  theme_minimal() + 
  labs(fill = "Number of Underserved Tracts")


Step 7: Create Summary Table

Your Task:

# Create and format priority counties table

library(knitr)

table <- county_analysis %>%
  st_drop_geometry() %>%
  arrange(desc(per_vul_underserved)) %>%
  slice(1:10) %>%
  select(
    COUNTY_NAM, 
    vulnerable_tracts_county,
    underserved_tracts_county,
    per_vul_underserved,
    vulnerable_pop
  ) %>%
  kable(
    digits =2,
    col.names = c(
      "County",
      "Vulnerable Tracts",
      "Underserved Tracts",
      "Percent of Vulnerable Population Underserved by Healthcare",
      "Total Vulnerable Population"
    )
  )

table
County Vulnerable Tracts Underserved Tracts Percent of Vulnerable Population Underserved by Healthcare Total Vulnerable Population
BRADFORD 1 8 1.00 1640
CAMERON 1 5 1.00 2452
CLARION 1 3 1.00 2552
CLEARFIELD 2 11 1.00 5762
CLINTON 3 5 1.00 5301
FULTON 1 2 1.00 1797
HUNTINGDON 1 3 1.00 1797
SULLIVAN 3 7 1.00 3871
ELK 3 4 0.67 11149
TIOGA 3 5 0.67 6703

Part 2: Comprehensive Visualization

Map 1: County-Level Choropleth

Create a choropleth map showing healthcare access challenges at the county level.

Your Task:

# Create county-level access map

ggplot(county_analysis) + 
  geom_sf(aes(fill = per_vul_underserved),
          color = "black", size = 0.2) +
  geom_sf(data = hospitals, color = "red", size = 1, shape = 21, fill = "red") +
  scale_fill_gradient(
    low = "lightblue",
    high = "darkblue",
    name = "Proportion Vulnerable & Underserved"
  )+
  theme_void() +
  theme(
    plot.title = element_text(size = 18, face = "bold"),
    plot.subtitle = element_text(size = 14),
    plot.caption = element_text(size = 10)
  ) +
  labs(
    title = "Access to Healthcare for Vulnerable Populations",
    subtitle = "Aggregated to PA Counties",
    caption = "Data ACS 2024"
  )


Map 2: Detailed Vulnerability Map

Create a map highlighting underserved vulnerable tracts.

Your Task:

# Create detailed tract-level map

tract_demographics <- tract_demographics %>%
  mutate(vulnerable_underserved = ifelse(vulnerable == 1 & underserved == 1, 1, 0))

ggplot(tract_demographics) + 
  geom_sf(data = boundaries, color = "black", size = 0.5) +
  scale_fill_manual(
    values = c("0"= "white", "1" = "blue"),
    labels = c("All Else", "Vulnerable and Underserved"),
    name = "Category") + 
  geom_sf(aes(fill = factor(vulnerable_underserved)),
          color = "grey", size = 0.05) +
  geom_sf(data = hospitals, color = "red", size = 0.5, shape = 21, fill = "red") +
  theme_void() +
  theme(
    plot.title = element_text(size = 18, face = "bold"),
    plot.subtitle = element_text(size = 14),
    plot.caption = element_text(size = 10)
  ) +
  labs(
    title = "Tracts that are Vulnerable and Underserved",
    subtitle = "Census Tract Level",
    caption = "Data ACS 2024"
  )


Chart: Distribution Analysis

Create a visualization showing the distribution of distances to hospitals for vulnerable populations.

Your Task:

# Create distribution visualization

ggplot(vulnerable_tracts, aes(x = nearest_hospital_dist_miles)) +
  geom_histogram(binwidth = 2, fill = "steelblue", color = "white", alpha = 0.8) +
  scale_x_continuous(name = "Distance to Nearest Hospital (miles)") +
  scale_y_continuous(name = "Number of Vulnerable Tracts") +
  labs(
    title = "Distribution of Distances to Nearest Hospital",
    subtitle = "Vulnerable Census Tracts in Pennsylvania",
    caption = "Data: ACS 2024"
  ) +
  theme_minimal()

The above histogram suggests that the majority of vulnerable census tracts are well-served by hospitals. However, there are extreme outliers that deserve considerable attention, with one vulnerable census tract being 30 miles away from the nearest hospital.


Part 3: Bring Your Own Data Analysis

Choose your own additional spatial dataset and conduct a supplementary analysis.

Investigating Cultural Ammenities in Low-income minority neighborhoods

# Load your additional dataset

landmarks <- st_read(here("labs/lab_2/labs/Landmark_Points.geojson"))
Reading layer `Landmark_Points' from data source 
  `/Users/ethanharner/Documents/GitHub/CPLN5920/labs/lab_2/labs/Landmark_Points.geojson' 
  using driver `GeoJSON'
Simple feature collection with 1147 features and 20 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: -75.28169 ymin: 39.85929 xmax: -74.96915 ymax: 40.11991
Geodetic CRS:  WGS 84
#public_art <- st_read(here("labs/lab_2/labs/percent_for_art_public.geojson"))

landmarks <- st_transform(landmarks, crs = st_crs(tract_demographics))
#public_art <- st_transform(public_art, crs = st_crs(tract_demographics))

vars <- vars <- c(
  median_income = "B19013_001", 
  total_pop      = "B02001_001",
  white_pop      = "B02001_002"
)


phila_demos <- get_acs(
  geography = "tract",
  state = "PA",
  county = "Philadelphia",
  variables = vars,
  year = 2024,
  output = "wide",
  geometry = TRUE
) %>%
  mutate(percent_nonwhite = (1-(white_popE / total_popE))*100)

I chose the landmarks dataset provided by OpenDataPhilly. I chose these two to get an understanding of the spatial distribution of city landmarks based on socio-economic patterns. The landmarks data set was last updated in 2017 and uses WGS 84. I transformed it to PA state-plane south for better accuracy.


My research will examine whether non-white and low-income census tracts (defined as less than median income) have access to art and cultural installations within approximate walking distance. Walking distance will be defined as a quarter-mile radius from the centroid of the census tract. Furthermore, I will analyze hot-spot patterns of landmarks to determine the demographic characteristics of cultural clusters.


Your Task:

# Your spatial analysis

#I want to remove neighborhoods from landmarks. 

landmarks <- landmarks %>%
  filter(!SUBTYPE %in% "Neighborhood")

phila_demos <- phila_demos %>%
  mutate(low_inc_minority = ifelse((median_incomeE < median(median_incomeE, na.rm = TRUE)) & (percent_nonwhite > 65), 1, 0)) #creating a category that classifies each tract as low income and minority according to these thresholds. 

phila_demos <- st_transform(phila_demos, st_crs(landmarks))

phila_tracts_centroid <- st_centroid(phila_demos) #finding centroid of each tract.

landmarks_matrix <- st_distance(phila_tracts_centroid, landmarks) #finding distances from centroids to landmarks

nearest_landmark <- apply(landmarks_matrix, 1, min) #finding the closest landmark to each centroid

phila_demos$nearest_landmark <- as.numeric(nearest_landmark) #attaching the nearest landmark distance to my demographic dataset

phila_demos <- phila_demos %>%
  mutate(nearest_landmark_mi = as.numeric(nearest_landmark / 5280)) #converting to miles

phila_demos <- phila_demos %>%
  mutate(low_inc_minority_low_service = ifelse(nearest_landmark_mi < .25 & low_inc_minority == 1, 1, 0))

# Now performing hot-spot analysis of cultural institutions. 

#distance neighbors
#install.packages("spdep")
library(spdep)
library(sp)
 
coords <- st_coordinates(landmarks) #getting coordinates
knn_neighbors <- knearneigh(coords, k = 5) #finding five nearest neighbors of each land mark
neighbors <- knn2nb(knn_neighbors)
weights <- nb2listw(neighbors, style = "W")

#running Getis-Ord Gi*

gi_star <- localG(landmarks$OBJECTID, weights)
landmarks$Gi_star <- as.numeric(gi_star)

#Plotting to see where there are hotspots and cold spots for culture
ggplot() +
  geom_sf(
    data = phila_demos,
    aes(fill = low_inc_minority),
    color = "black",
    size = 0.25
  ) +
  scale_fill_gradient(low = "white", high = "darkgrey", name = "Low Income and Minority Communities") +
  geom_sf(
    data = landmarks %>% filter(!is.na(Gi_star)),
    aes(color = Gi_star),
    size = 2
  ) +
  scale_color_viridis_c(option = "plasma", name = "Gi* Statistic") +
   labs(
    title = "Cultural Clustering and Low-Income Minority Communities",
    subtitle = "Census Tract Level",
    caption = "Data ACS 2024"
  ) +
  theme_void()

# Now plotting to see where underserved low-income communties are.

ggplot() + 
  geom_sf(
    data = phila_demos,
    fill = "white",      # fill for all tracts
    color = "black", 
    size = 0.25
  ) +
  geom_sf(
    data = phila_demos %>% filter(low_inc_minority_low_service == 1),
    fill = "blue",
    color = "grey",
    size = 0.05
  ) +
  theme_void() +
  theme(
    plot.title = element_text(size = 18, face = "bold"),
    plot.subtitle = element_text(size = 14),
    plot.caption = element_text(size = 10)
  ) +
  labs(
    title = "Low Income Minority Tracts w/o Cultural Ammenities",
    subtitle = "Census Tract Level",
    caption = "Based on 1/4 mile proximity to registered landmarks"
  )

phila_demos <- phila_demos %>%
  mutate(low_cultural_ammenities = ifelse(nearest_landmark_mi > .25, 1, 0))

table2 <- phila_demos %>%
  st_drop_geometry() %>%
  summarize(
    prop_culturally_underserved = sum(low_inc_minority_low_service, na.rm = TRUE) / sum(low_inc_minority, na.rm = TRUE),
    low_service_prop = mean(low_cultural_ammenities, na.rm = TRUE)
  )

table2 %>%
  kable(
    digits = 2,
    col.names = c(
      "Culturally Underserved Low-Income Minority Population",
      "Culturally Underserved General Population"
    ),
    caption = "Proportion of Tracts within a quarter-mile of a Cultural Ammenity"
  )
Proportion of Tracts within a quarter-mile of a Cultural Ammenity
Culturally Underserved Low-Income Minority Population Culturally Underserved General Population
0.18 0.65

Findings

My findings indicate that low-income minority neighborhoods, as defined by over 65% non-white and under Philadelphia median income, are not under served by cultural amenities. In fact, low-income minority neighborhoods are on average closer to cultural amenities, with approximately 65% of census tracts being within a quarter-mile of a registered landmark. However, when looking at the spatial clustering of cultural amenities, hot spots are not clustered in low-income minority communities. This suggests that while low-income and minority communities have access to cultural amenities at a higher rate than the city on average, they do not live in neighborhoods that the land-mark commission deems culturally-rich.

Finally - A few comments about your incorporation of feedback!

As recommended in my feedback, I set my set-up code chunks as echo = FALSE so that they are not rendered. I also removed printing to the document to make the user experience cleaner. Finally I removed unnecessary assignment instructions.