Lab 1: Census Data Quality for Policy Decisions

Evaluating Data Reliability for Algorithmic Decision-Making

Author

Your Name Here

Published

February 3, 2026

Assignment Overview

Scenario

You are a data analyst for the [Your State] Department of Human Services. The department is considering implementing an algorithmic system to identify communities that should receive priority for social service funding and outreach programs. Your supervisor has asked you to evaluate the quality and reliability of available census data to inform this decision.

Drawing on our Week 2 discussion of algorithmic bias, you need to assess not just what the data shows, but how reliable it is and what communities might be affected by data quality issues.

Learning Objectives

Apply dplyr functions to real census data for policy analysis
Evaluate data quality using margins of error
Connect technical analysis to algorithmic decision-making
Identify potential equity implications of data reliability issues
Create professional documentation for policy stakeholders

Submission Instructions

Submit by posting your updated portfolio link on Canvas. Your assignment should be accessible at your-portfolio-url/labs/lab_1/

Make sure to update your _quarto.yml navigation to include this assignment under a “Labs” menu.

Part 1: Portfolio Integration

Create this assignment in your portfolio repository under a labs/lab_1/ folder structure. Update your navigation menu to include:

- text: Assignments
  menu:
    - href: labs/lab_1/your_file_name.qmd
      text: "Lab 1: Census Data Exploration"

If the text contains a special character such as a colon, wrap it in double quotes so that Quarto reads it as text.

Setup

# Load required packages (hint: you need tidycensus, tidyverse, and knitr)

library(tidycensus)
library(tidyverse)
library(knitr)

# Set your Census API key

#census_api_key("YOUR_API_KEY_HERE", install = TRUE)

# Choose your state for analysis - assign it to a variable called my_state

my_state <- c("West Virginia")

State Selection: I have chosen [Your State Name] for this analysis because: [Brief explanation of why you chose this state]

Part 2: County-Level Resource Assessment

2.1 Data Retrieval

Your Task: Use get_acs() to retrieve county-level data for your chosen state.

Requirements:
- Geography: county level
- Variables: median household income (B19013_001) and total population (B01003_001)
- Year: 2022
- Survey: acs5
- Output format: wide

Hint: Remember to give your variables descriptive names using the variables = c(name = "code") syntax.

# Write your get_acs() code here
data <- get_acs(
  geography = "county",
  variables = c(
    median_hh_income = "B19013_001", 
    total_population = "B01003_001"),
  state = "WV",
  year = 2022,
  output = "wide",
  survey = "acs5"
)
# Clean the county names to remove state name and "County" 

data_working <- data %>%
  mutate(new_name = str_remove(NAME, " County, West Virginia")) %>%
  select(!NAME)

# Hint: use mutate() with str_remove()

# Display the first few rows

head(data_working)

# A tibble: 6 × 6
  GEOID median_hh_incomeE median_hh_incomeM total_populationE total_populationM
  <chr>             <dbl>             <dbl>             <dbl>             <dbl>
1 54001             44341              2402             15527                NA
2 54003             73619              1970            123283                NA
3 54005             56182              4897             21705                NA
4 54007             42245              4022             12505                NA
5 54009             51963              7343             22349                NA
6 54011             48944              3441             93965                NA
# ℹ 1 more variable: new_name <chr>

2.2 Data Quality Assessment

Your Task: Calculate margin of error percentages and create reliability categories.

Requirements:
- Calculate MOE percentage: (margin of error / estimate) * 100
- Create reliability categories:
  - High Confidence: MOE < 5%
  - Moderate Confidence: MOE 5-10%
  - Low Confidence: MOE > 10%
- Create a flag for unreliable estimates (MOE > 10%)

Hint: Use mutate() with case_when() for the categories.

# Calculate MOE percentage and reliability categories using mutate()
# Note: total_populationM is NA in the rows printed above, so POP_MOE_Per is NA there.
# An NA comparison matches neither condition, so those rows fall through to .default.

data_working <- data_working %>%
  mutate(POP_MOE_Per = (total_populationM/total_populationE)*100,
         INC_MOE_Per = (median_hh_incomeM/median_hh_incomeE)*100,
         POP_reliability = case_when(
           POP_MOE_Per < 5 ~ "High Confidence",
           POP_MOE_Per > 10 ~ "Low Confidence",
           .default = "Moderate Confidence"),
         INC_reliability = case_when(
           INC_MOE_Per < 5 ~ "High Confidence",
           INC_MOE_Per > 10 ~ "Low Confidence",
           .default = "Moderate Confidence")
)


# Create a summary showing count of counties in each reliability category

POP_reliability_county <- data_working %>%
  group_by(POP_reliability) %>%
  count()

POP_reliability_county

# A tibble: 1 × 2
# Groups:   POP_reliability [1]
  POP_reliability         n
  <chr>               <int>
1 Moderate Confidence    55
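
The single "Moderate Confidence" row above deserves a second look: a missing margin of error yields an NA percentage, which matches neither comparison and lands in the default category. Below is a minimal sketch of one way to keep missing MOEs in their own category and to add the unreliable-estimate flag from the requirements. The toy values, the "No MOE Reported" label, and the `unreliable` column name are illustrative assumptions, not part of the assignment.

```r
library(dplyr)

# Hypothetical toy data: one county with a missing MOE, three with reported MOEs
toy <- tibble(
  county   = c("A", "B", "C", "D"),
  estimate = c(15000, 20000, 50000, 40000),
  moe      = c(NA, 600, 3500, 6000)
)

toy_flagged <- toy %>%
  mutate(
    moe_pct = moe / estimate * 100,
    reliability = case_when(
      is.na(moe_pct) ~ "No MOE Reported",      # handle missing values first
      moe_pct < 5    ~ "High Confidence",
      moe_pct <= 10  ~ "Moderate Confidence",
      TRUE           ~ "Low Confidence"
    ),
    # Flag for unreliable estimates (MOE > 10%); a missing MOE stays NA
    unreliable = moe_pct > 10
  )

toy_flagged
```

Putting the `is.na()` branch first matters because `case_when()` returns the first matching condition for each row.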

INC_reliability_county <- data_working %>%
  group_by(INC_reliability) %>%
  count()

INC_reliability_county

# A tibble: 3 × 2
# Groups:   INC_reliability [3]
  INC_reliability         n
  <chr>               <int>
1 High Confidence         6
2 Low Confidence         26
3 Moderate Confidence    23

# Hint: use count() and mutate() to add percentages
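
The hint above can be sketched as follows. To keep the example runnable without an API call, it rebuilds the income reliability labels from the counts printed above (6, 26, and 23); the `labels` and `reliability_summary` names are hypothetical.

```r
library(dplyr)

# Rebuild the income reliability labels from the printed counts
labels <- tibble(
  INC_reliability = rep(
    c("High Confidence", "Low Confidence", "Moderate Confidence"),
    times = c(6, 26, 23)
  )
)

# count() tallies each category; mutate() converts the tallies to percentages
reliability_summary <- labels %>%
  count(INC_reliability) %>%
  mutate(percent = round(n / sum(n) * 100, 1))

reliability_summary
```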

2.3 High Uncertainty Counties

Your Task: Identify the 5 counties with the highest MOE percentages.

Requirements:
- Sort by MOE percentage (highest first)
- Select the top 5 counties
- Display: county name, median income, margin of error, MOE percentage, reliability category
- Format as a professional table using kable()

Hint: Use arrange(), slice(), and select() functions.

# Create table of top 5 counties by MOE percentage
Highest_Inc_MOE_counties <- data_working %>%
  arrange(desc(INC_MOE_Per)) %>%
  slice(1:5) %>%
  select(new_name, median_hh_incomeE, median_hh_incomeM, INC_MOE_Per, INC_reliability)

# Format as table with kable() - include appropriate column names and caption
kable(Highest_Inc_MOE_counties,
      col.names = c("County", "Median Household Income", "Margin of Error", "Percentage Margin of Error", "Reliability Category"),
      caption = "Five Highest Margins of Error for Income Estimates Among West Virginia Counties"
      )

Five Highest Margins of Error for Income Estimates Among West Virginia Counties
County	Median Household Income	Margin of Error	Percentage Margin of Error	Reliability Category
Calhoun	39031	7651	19.60237	Low Confidence
Doddridge	56587	9976	17.62949	Low Confidence
Pendleton	52458	8844	16.85920	Low Confidence
Summers	42991	6897	16.04289	Low Confidence
Clay	41530	6353	15.29738	Low Confidence

Data Quality Commentary:

[Write 2-3 sentences explaining what these results mean for algorithmic decision-making. Consider: Which counties might be poorly served by algorithms that rely on this income data? What factors might contribute to higher uncertainty?]

Estimates with a high margin of error are less reliable because the true value could plausibly fall anywhere within a wide range around the estimate. This is especially problematic for algorithmic decision-making: a county that appears to have a high income may be de-prioritized for funding relative to one with a lower income, yet that ranking may not hold if the higher-income county's margin of error is large enough that its true median income could be the lower of the two. In the West Virginia data, Pendleton County's margin of error is about +/- 16.9% of its estimate. We therefore cannot confidently say that Pendleton has a lower median income than Doddridge, because Doddridge's estimate ($56,587) falls within Pendleton's range ($52,458 +/- $8,844).
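
The Pendleton and Doddridge comparison can also be checked with the standard test for a difference between two survey estimates. This is a sketch that assumes the two county estimates are independent and that the margins of error are at the ACS default 90% confidence level; the variable names are illustrative.

```r
# Estimates and margins of error taken from the table above
pendleton <- list(est = 52458, moe = 8844)
doddridge <- list(est = 56587, moe = 9976)

# ACS margins of error are published at the 90% confidence level,
# so dividing by 1.645 recovers the standard error
se_pendleton <- pendleton$moe / 1.645
se_doddridge <- doddridge$moe / 1.645

# Z statistic for the difference between two independent estimates
z <- (doddridge$est - pendleton$est) / sqrt(se_pendleton^2 + se_doddridge^2)

# The difference is significant at the 90% level only if |z| exceeds 1.645
significant <- abs(z) > 1.645

round(z, 2)
significant
```

Here the statistic comes out at roughly 0.5, well below 1.645, which agrees with the conclusion that the two medians cannot be confidently ranked.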

Part 3: Neighborhood-Level Analysis

3.1 Focus Area Selection

Your Task: Select 2-3 counties from your reliability analysis for detailed tract-level study.

Strategy: Choose counties that represent different reliability levels (e.g., 1 high confidence, 1 moderate, 1 low confidence) to compare how data quality varies.

# Use filter() to select 2-3 counties from your county_reliability data
# Store the selected counties in a variable called selected_counties

selected_counties <- data_working %>% 
  filter(new_name %in% c("Jefferson", "Kanawha", "Lewis"))

# Display the selected counties with their key characteristics
# Show: county name, median income, MOE percentage, reliability category

selected_counties <- selected_counties %>%
  select(new_name, median_hh_incomeE, INC_MOE_Per, INC_reliability)

kable(selected_counties,
      col.names = c("County", "Median Household Income", "MOE (%)", "Reliability Category"),
      caption = "Selected Counties and Associated Median Household Incomes")

Selected Counties and Associated Median Household Incomes
County	Median Household Income	MOE (%)	Reliability Category
Jefferson	93744	5.964115	Moderate Confidence
Kanawha	55226	2.987723	High Confidence
Lewis	50552	12.151844	Low Confidence

Comment on the output: [write something :)]

Based on my knowledge of the state, I expect the MOE to be driven largely by total county population. Among the three selected counties, the MOE percentage tracks population rank in reverse: the most populous county has the lowest MOE percentage, and the least populous has the highest.

3.2 Tract-Level Demographics

Your Task: Get demographic data for census tracts in your selected counties.

Requirements:
- Geography: tract level
- Variables: white alone (B03002_003), Black/African American (B03002_004), Hispanic/Latino (B03002_012), total population (B03002_001)
- Use the same state and year as before
- Output format: wide
- Challenge: You’ll need county codes, not names. Look at the GEOID patterns in your county data for hints.
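
For the county-code challenge, a county GEOID is the two-digit state FIPS code followed by the three-digit county FIPS code, so the county part can be sliced off by position. A minimal sketch, assuming the GEOIDs for the three selected counties are the state code "54" joined to the county codes used in the chunk below:

```r
# County GEOIDs: two-digit state FIPS + three-digit county FIPS
geoids <- c("54037", "54039", "54041")  # assumed GEOIDs for the selected counties

state_fips  <- substr(geoids, 1, 2)
county_fips <- substr(geoids, 3, 5)

county_fips
```

The resulting three-character strings are what the `county` argument of `get_acs()` receives in the code below.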

# Define your race/ethnicity variables with descriptive names

# Use get_acs() to retrieve tract-level data
# Hint: You may need to specify county codes in the county parameter

# Calculate percentage of each group using mutate()
# Create percentages for white, Black, and Hispanic populations

# Add readable tract and county name columns using str_extract() or similar

tract_analysis <- get_acs(
  geography = "tract",
  variables = c(
    White_Alone = "B03002_003", 
    Black_African_American = "B03002_004",
    Hispanic_Latino = "B03002_012",
    Total_Population = "B03002_001"
  ),
  state = "54",
  county = c(
    "037", 
    "039", 
    "041"
  ),
  year = 2023,  # NOTE: the county-level pull above used 2022; the task asks for the same year
  output = "wide"
)

# Split NAME ("Census Tract <number>; <county> County; West Virginia") into
# a tract number and a county name with no stray whitespace, then drop NAME
tract_analysis <- tract_analysis %>%
  mutate(
    census_tract = str_remove(str_extract(NAME, "^[^;]+"), "Census Tract "),
    county = str_extract(NAME, "(?<=; ).*(?= County;)")
  ) %>%
  select(!NAME)
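
The name parsing above can be tried out on a couple of sample strings before running it on the full tract data. This sketch assumes the semicolon-delimited NAME format that the code above expects; the tract numbers are hypothetical.

```r
library(stringr)

# Hypothetical NAME values in the semicolon-delimited format
tract_names <- c(
  "Census Tract 1; Kanawha County; West Virginia",
  "Census Tract 9701.02; Jefferson County; West Virginia"
)

# Tract number: text after "Census Tract " up to the first semicolon
census_tract <- str_extract(tract_names, "(?<=Census Tract )[^;]+")

# County: text between the first "; " and " County;"
county <- str_extract(tract_names, "(?<=; ).*(?= County;)")

census_tract
county
```

Using lookarounds keeps the delimiters and surrounding spaces out of the result, so no trimming step is needed afterwards.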

tract_analysis <- tract_analysis %>%
  mutate(Per_White = White_AloneE / Total_PopulationE,
         Per_Black = Black_African_AmericanE / Total_PopulationE,
         Per_Hispanic = Hispanic_LatinoE / Total_PopulationE)
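
Each share computed above is derived from two estimates, so it carries its own margin of error. Below is a sketch of the Census Bureau's approximation for the MOE of a derived proportion. The function name and the tract values are hypothetical, and the result is an approximation rather than an exact sampling error.

```r
# Approximate MOE for a proportion p = num / den
moe_for_proportion <- function(num, den, moe_num, moe_den) {
  p <- num / den
  under_root <- moe_num^2 - (p^2 * moe_den^2)
  # When the term under the root is negative, fall back to the ratio formula
  under_root <- ifelse(under_root < 0, moe_num^2 + (p^2 * moe_den^2), under_root)
  sqrt(under_root) / den
}

# Hypothetical tract: 120 of 3,000 residents belong to the group
p_hat <- 120 / 3000
p_moe <- moe_for_proportion(num = 120, den = 3000, moe_num = 90, moe_den = 400)

c(proportion = p_hat, moe = p_moe)
```

In this made-up tract the share is 0.04 with an MOE of roughly 0.03, which illustrates how a small group's share can be nearly as uncertain as it is large.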

3.3 Demographic Analysis

Your Task: Analyze the demographic patterns in your selected areas.

# Find the tract with the highest percentage of Hispanic/Latino residents
# Hint: use arrange() and slice() to get the top tract

highest_hispanic <- tract_analysis %>%
  arrange(desc(Per_Hispanic)) %>%
  slice(1:5)

# Calculate average demographics by county using group_by() and summarize()

county_demographics <- tract_analysis %>%
  group_by(county) %>%
  summarize(
    Per_Black = sum(Black_African_AmericanE)/sum(Total_PopulationE),
    Per_White = sum(White_AloneE)/sum(Total_PopulationE),
    Per_Hispanic = sum(Hispanic_LatinoE)/sum(Total_PopulationE)
  )

# Show: number of tracts, average percentage for each racial/ethnic group

county_table <- tract_analysis %>%
  group_by(county) %>%
  summarize(
    Per_Black = sum(Black_African_AmericanE)/sum(Total_PopulationE),
    Per_White = sum(White_AloneE)/sum(Total_PopulationE),
    # Use the Hispanic/Latino estimate here (not White_AloneE)
    Per_Hispanic = sum(Hispanic_LatinoE)/sum(Total_PopulationE),
    Tracts = n()
  )


# Create a nicely formatted table of your results using kable()

kable(county_table,
      col.names = c("County", "Proportion Black", "Proportion White", "Proportion Hispanic", "Number of Census Tracts"),
      caption = "Demographic Information at County Level, West Virginia")

Demographic Information at County Level, West Virginia
County	Proportion Black	Proportion White	Proportion Hispanic	Number of Census Tracts
Jefferson	0.0528473	0.8041540	[re-run to populate]	15
Kanawha	0.0613419	0.8624339	[re-run to populate]	57
Lewis	0.0032723	0.9122442	[re-run to populate]	5

Part 4: Comprehensive Data Quality Evaluation

4.1 MOE Analysis for Demographic Variables

Your Task: Examine margins of error for demographic variables to see if some communities have less reliable data.

Requirements:
- Calculate MOE percentages for each demographic variable
- Flag tracts where any demographic variable has MOE > 15%
- Create summary statistics

# Calculate MOE percentages for white, Black, and Hispanic variables
# Hint: use the same formula as before (margin/estimate * 100)

MOE_analysis <- tract_analysis %>%
  mutate(WhiteMOE = (White_AloneM/White_AloneE)*100,
         BlackMOE = (Black_African_AmericanM/Black_African_AmericanE)*100,
         HispanicMOE = (Hispanic_LatinoM/Hispanic_LatinoE)*100
  )


# Create a flag for tracts with high MOE on any demographic variable
# Use logical operators (| for OR) in an ifelse() statement

MOE_analysis <- MOE_analysis %>%
  mutate(Quality = case_when(
    WhiteMOE > 20 | BlackMOE > 10 | HispanicMOE > 10 ~ "Very Low Quality",
    # "Fine" requires every group to be under 10%, so combine with & rather than |
    WhiteMOE < 10 & BlackMOE < 10 & HispanicMOE < 10 ~ "Fine",
    .default = "Low Quality"
  ))

# Create summary statistics showing how many tracts have data quality issues

MOE_analysis %>% count(Quality)

# A tibble: 1 × 2
  Quality              n
  <chr>            <int>
1 Very Low Quality    77

4.2 Pattern Analysis

Your Task: Investigate whether data quality problems are randomly distributed or concentrated in certain types of communities.

# Group tracts by whether they have high MOE issues
# Calculate average characteristics for each group:
# - population size, demographic percentages
# Use group_by() and summarize() to create this comparison
# Create a professional table showing the patterns

MOE_Table <- MOE_analysis %>%
  group_by(Quality) %>%
  summarize(average_population = mean(Total_PopulationE),
            Per_Black = sum(Black_African_AmericanE)/sum(Total_PopulationE),
            Per_White = sum(White_AloneE)/sum(Total_PopulationE),
            Per_Hispanic = sum(Hispanic_LatinoE)/sum(Total_PopulationE))

kable(MOE_Table, 
      col.names = c("Quality Category", "Average Population", "Proportion Black", "Proportion White", "Proportion Hispanic"), 
      caption = "Demographic Characteristics by Data Quality Category")

Demographic Characteristics by Data Quality Category
Quality Category	Average Population	Proportion Black	Proportion White	Proportion Hispanic
Very Low Quality	3292.883	0.055531	0.8522788	0.0287239

Pattern Analysis: [Describe any patterns you observe. Do certain types of communities have less reliable data? What might explain this?]

Unfortunately, the data for my selected West Virginia counties are not reliable enough to use with confidence. This makes sense, however, as the average census tract in these counties has only about 3,300 people. Furthermore, these tracts are racially homogeneous, so every tract has at least one minority group small enough to produce very-low-confidence estimates and flag the entire tract as unreliable.

Part 5: Policy Recommendations

5.1 Analysis Integration and Professional Summary

Your Task: Write an executive summary that integrates findings from all four analyses.

Executive Summary Requirements:
1. Overall Pattern Identification: What are the systematic patterns across all your analyses?
2. Equity Assessment: Which communities face the greatest risk of algorithmic bias based on your findings?
3. Root Cause Analysis: What underlying factors drive both data quality issues and bias risk?
4. Strategic Recommendations: What should the Department implement to address these systematic issues?

Executive Summary:

The systematic pattern across my analyses is that ACS 5-year data are not suitable for confidently assessing racial composition in the three selected West Virginia counties. Because my tract-level sample included the most populous county in West Virginia, as well as the county with the highest proportion of Hispanic/Latino residents, census tracts in the rest of the state can be expected to have substantial margins of error due to small sample sizes. Median income data are of somewhat higher quality, with certain counties suitable for immediate algorithmic use. Even so, county population strongly affects data quality, and many rural, low-population counties have markedly lower data confidence.

Minority and rural communities face the greatest risk of algorithmic bias from ACS 5-year estimates. In the census-tract data, the margins of error flagged as low-confidence were consistently those for minority population estimates. These communities are therefore at risk of being over- or under-counted, and thus misrepresented in decisions that stem from public data analysis, such as funding and representation. Rural communities are at risk of having their incomes mischaracterized, as some counties' margins of error overlapped with one another's estimated values. Ranking counties on median income is therefore unreliable, and decisions based on a ranked approach to resource allocation may under- or over-provision resources to counties with low-confidence data.

Small population size appears to produce high margins of error. This is likely due to the small number of ACS survey respondents in rural and low-population communities. The effect is strongest for minority communities living in ethnically homogeneous, low-population regions, where the sample of minority residents can be extremely small. Indeed, if the ACS surveys 3% of the US population, then in communities where a minority group makes up less than 3% of the total population (as in the sampled WV census tracts), the survey risks reaching few or no Hispanic residents of that tract.

The department should reduce bias and error in its data analysis by flagging counties that are at risk of algorithmic bias and by supplementing ACS 5-year data with decennial census data, which come from a full count rather than a sample and so carry no sampling margin of error. This protocol should be expanded to require all analysis involving racial composition data to rely on the decennial census, as none of the three sampled West Virginia counties had high-confidence racial data profiles. These protocols will mitigate under- or overcounting of minority populations and support more accurate comparisons of income across counties.
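
The sampling argument in the root-cause paragraph can be made concrete with a rough back-of-the-envelope calculation. This sketch takes the 3% sampling rate from the summary as a hypothetical rather than an official ACS figure, and treats respondents as independent random draws, which real ACS sampling is not.

```r
# Figures from the pattern-analysis table above
avg_tract_population <- 3292.883
hispanic_share       <- 0.0287239

# Hypothetical sampling rate taken from the summary text
sampling_rate <- 0.03

sampled_people    <- avg_tract_population * sampling_rate
expected_hispanic <- sampled_people * hispanic_share

# Chance that a simple random sample of that size contains no Hispanic residents
p_none <- (1 - hispanic_share)^round(sampled_people)

round(c(sampled = sampled_people,
        expected_hispanic = expected_hispanic,
        p_none = p_none), 3)
```

Under these assumptions an average tract would contribute about 99 sampled people, of whom fewer than 3 would be expected to be Hispanic. That makes clear why tract-level estimates for small groups are so unstable.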

5.2 Specific Recommendations

Your Task: Create a decision framework for algorithm implementation.

# Create a summary table using your county reliability data
# Include: county name, median income, MOE percentage, reliability category

# Add a new column with algorithm recommendations using case_when():
# - High Confidence: "Safe for algorithmic decisions"
# - Moderate Confidence: "Use with caution - monitor outcomes"  
# - Low Confidence: "Requires manual review or additional data"

# Format as a professional table with kable()

County_Reliability_Table <- data_working %>%
  mutate(Algorithm_Rec = case_when(
    INC_reliability == "High Confidence" ~ "Safe for algorithmic decisions",
    INC_reliability == "Moderate Confidence" ~ "Use with caution - monitor outcomes",
    INC_reliability == "Low Confidence" ~ "Requires manual review or additional data")
  )

County_Reliability_Table <- County_Reliability_Table %>%
  select(new_name, median_hh_incomeE, INC_MOE_Per, INC_reliability, Algorithm_Rec)

kable(County_Reliability_Table,
      col.names = c("County Name", "Median Household Income", "MOE (%)", "Reliability", "Algorithmic Recommendation"),
      caption = "Data Reliability and Recommendations for West Virginia County Median Income Data")

Data Reliability and Recommendations for West Virginia County Median Income Data
County Name	Median Household Income	MOE (%)	Reliability	Algorithmic Recommendation
Barbour	44341	5.417108	Moderate Confidence	Use with caution - monitor outcomes
Berkeley	73619	2.675940	High Confidence	Safe for algorithmic decisions
Boone	56182	8.716315	Moderate Confidence	Use with caution - monitor outcomes
Braxton	42245	9.520653	Moderate Confidence	Use with caution - monitor outcomes
Brooke	51963	14.131209	Low Confidence	Requires manual review or additional data
Cabell	48944	7.030484	Moderate Confidence	Use with caution - monitor outcomes
Calhoun	39031	19.602367	Low Confidence	Requires manual review or additional data
Clay	41530	15.297375	Low Confidence	Requires manual review or additional data
Doddridge	56587	17.629491	Low Confidence	Requires manual review or additional data
Fayette	50090	8.390896	Moderate Confidence	Use with caution - monitor outcomes
Gilmer	51552	12.445686	Low Confidence	Requires manual review or additional data
Grant	52877	12.911096	Low Confidence	Requires manual review or additional data
Greenbrier	45519	6.669742	Moderate Confidence	Use with caution - monitor outcomes
Hampshire	55222	11.799645	Low Confidence	Requires manual review or additional data
Hancock	57515	7.356342	Moderate Confidence	Use with caution - monitor outcomes
Hardy	49205	10.838329	Low Confidence	Requires manual review or additional data
Harrison	56184	3.554393	High Confidence	Safe for algorithmic decisions
Jackson	55173	12.979175	Low Confidence	Requires manual review or additional data
Jefferson	93744	5.964115	Moderate Confidence	Use with caution - monitor outcomes
Kanawha	55226	2.987723	High Confidence	Safe for algorithmic decisions
Lewis	50552	12.151844	Low Confidence	Requires manual review or additional data
Lincoln	50985	11.997646	Low Confidence	Requires manual review or additional data
Logan	42194	10.029862	Low Confidence	Requires manual review or additional data
McDowell	28235	12.629715	Low Confidence	Requires manual review or additional data
Marion	59974	3.941708	High Confidence	Safe for algorithmic decisions
Marshall	58129	10.949784	Low Confidence	Requires manual review or additional data
Mason	53058	13.722718	Low Confidence	Requires manual review or additional data
Mercer	46409	6.061324	Moderate Confidence	Use with caution - monitor outcomes
Mineral	64728	8.180386	Moderate Confidence	Use with caution - monitor outcomes
Mingo	38305	11.155202	Low Confidence	Requires manual review or additional data
Monongalia	60893	4.302629	High Confidence	Safe for algorithmic decisions
Monroe	52392	7.598488	Moderate Confidence	Use with caution - monitor outcomes
Morgan	61021	7.312237	Moderate Confidence	Use with caution - monitor outcomes
Nicholas	48826	10.092983	Low Confidence	Requires manual review or additional data
Ohio	55521	5.554655	Moderate Confidence	Use with caution - monitor outcomes
Pendleton	52458	16.859202	Low Confidence	Requires manual review or additional data
Pleasants	59666	11.810747	Low Confidence	Requires manual review or additional data
Pocahontas	41680	10.928503	Low Confidence	Requires manual review or additional data
Preston	60136	5.302980	Moderate Confidence	Use with caution - monitor outcomes
Putnam	75725	9.031363	Moderate Confidence	Use with caution - monitor outcomes
Raleigh	47975	9.719646	Moderate Confidence	Use with caution - monitor outcomes
Randolph	51186	8.529676	Moderate Confidence	Use with caution - monitor outcomes
Ritchie	48973	7.551100	Moderate Confidence	Use with caution - monitor outcomes
Roane	41299	8.026829	Moderate Confidence	Use with caution - monitor outcomes
Summers	42991	16.042893	Low Confidence	Requires manual review or additional data
Taylor	52946	12.159559	Low Confidence	Requires manual review or additional data
Tucker	54053	9.522136	Moderate Confidence	Use with caution - monitor outcomes
Tyler	59167	12.894012	Low Confidence	Requires manual review or additional data
Upshur	49663	9.975233	Moderate Confidence	Use with caution - monitor outcomes
Wayne	52694	9.786693	Moderate Confidence	Use with caution - monitor outcomes
Webster	43409	12.384529	Low Confidence	Requires manual review or additional data
Wetzel	50715	6.144139	Moderate Confidence	Use with caution - monitor outcomes
Wirt	52776	12.484842	Low Confidence	Requires manual review or additional data
Wood	54350	4.903404	High Confidence	Safe for algorithmic decisions
Wyoming	44510	13.068973	Low Confidence	Requires manual review or additional data

Key Recommendations:

Your Task: Use your analysis results to provide specific guidance to the department.

Counties suitable for immediate algorithmic implementation:

Berkeley, Harrison, Kanawha, Marion, Monongalia, and Wood counties all have margins of error below 5%, so their income estimates are suitable for immediate implementation.

Counties requiring additional oversight: [List counties with moderate confidence data and describe what kind of monitoring would be needed]

The following counties require additional oversight including monitoring of data outcomes to ensure that moderate data quality is not negatively impacting rural populations. These counties have margins of error between 5 and 10%.

Barbour
Boone
Braxton
Cabell
Fayette
Greenbrier
Hancock
Jefferson
Mercer
Mineral
Monroe
Morgan
Ohio
Preston
Putnam
Raleigh
Randolph
Ritchie
Roane
Tucker
Upshur
Wayne
Wetzel

Counties needing alternative approaches: [List counties with low confidence data and suggest specific alternatives - manual review, additional surveys, etc.]

The following counties need further review because their margins of error exceed 10%. Additional data sources, such as the decennial census, should be used alongside continued monitoring for bias in recommendations.

Brooke
Calhoun
Clay
Doddridge
Gilmer
Grant
Hampshire
Hardy
Jackson
Lewis
Lincoln
Logan
McDowell
Marshall
Mason
Mingo
Nicholas
Pendleton
Pleasants
Pocahontas
Summers
Taylor
Tyler
Webster
Wirt
Wyoming

Questions for Further Investigation

[List 2-3 questions that your analysis raised that you’d like to explore further in future assignments. Consider questions about spatial patterns, time trends, or other demographic factors.]

Technical Notes

Data Sources:
- U.S. Census Bureau, American Community Survey 2018-2022 5-Year Estimates (county-level income and population); the tract-level demographic pull specified year = 2023
- Retrieved via tidycensus R package on [date]

Reproducibility:
- All analysis conducted in R version [your version]
- Census API key required for replication
- Complete code and documentation available at: [your portfolio URL]

Methodology Notes:
- For the tract-level racial data, I added a new category, Very Low Quality, applied when the white-alone MOE exceeds 20% or the Black or Hispanic MOE exceeds 10%. The county-level categories follow the 5% and 10% thresholds given in the assignment.

Limitations:
- Data limited to the ACS. I did not test for correlation between margin of error and population size or racial demographics.

Submission Checklist

Before submitting your portfolio link on Canvas:

All code chunks run without errors
All “[Fill this in]” prompts have been completed
Tables are properly formatted and readable
Executive summary addresses all four required components
Portfolio navigation includes this assignment
Census API key is properly set
Document renders correctly to HTML

Remember: Submit your portfolio URL on Canvas, not the file itself. Your assignment should be accessible at your-portfolio-url/labs/lab_1/your_file_name.html