
Data Analytics · Public Health

COVID-19 Data Analytics for Alabama Counties

Exploratory analysis of county-level COVID-19 data for Alabama, focusing on death rates, vaccination coverage, and provider distribution to understand which regions were most impacted.

Overview

This project explores COVID-19 outcomes across all Alabama counties using real-world case, death, testing, vaccination, and provider data. The goal was to clean and combine multiple datasets, engineer meaningful metrics (like death rate and vaccination rate), and visualize differences between counties.

The notebook walks through data cleaning, feature engineering, and exploratory visualizations, and can serve as a foundation for simple dashboards or further modeling.

Data & Methods

Datasets

  • County-level COVID-19 cases, deaths, and tests for Alabama.
  • Vaccination data for population 16+ by county.
  • Provider dataset listing vaccine locations by county.

Key Steps

  • Loaded CSV files into pandas and removed aggregate or invalid rows.
  • Converted numeric fields (cases, deaths, population, doses) to proper types.
  • Engineered features such as:
    • death_rate = (deaths / cases) * 100
    • vaccination_rate = fully_vaccinated / population_16_plus
    • Provider counts per county.
  • Merged datasets on county name to align outcome metrics (cases, deaths) with access metrics (vaccination coverage, provider counts).
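
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline CSV text, the column names (county, cases, deaths, fully_vaccinated, population_16_plus, provider_name), and the aggregate row labels ("Statewide", "Unknown") are all assumptions made for the example, and the numbers are made up.

```python
import io

import pandas as pd

# Tiny inline stand-ins for the real CSV files. Column names and values are
# hypothetical and exist only to make the sketch runnable.
CASES_CSV = '''county,cases,deaths
Autauga,"1,000",20
Baldwin,"4,000",40
Statewide,"5,000",60
Unknown,,
'''
VAX_CSV = '''county,fully_vaccinated,population_16_plus
Autauga,300,600
Baldwin,"1,200",2000
'''
PROVIDERS_CSV = '''provider_name,county
Pharmacy A,Autauga
Clinic B,Baldwin
Pharmacy C,Baldwin
'''

# Assumed labels for aggregate or invalid rows; the real files may differ.
AGGREGATE_LABELS = {"Statewide", "Unknown"}


def to_number(series):
    """Strip thousands separators and convert to numeric; bad values become NaN."""
    return pd.to_numeric(series.str.replace(",", "", regex=False), errors="coerce")


# Read everything as text first so the type conversion is explicit.
cases = pd.read_csv(io.StringIO(CASES_CSV), dtype=str)
vax = pd.read_csv(io.StringIO(VAX_CSV), dtype=str)
providers = pd.read_csv(io.StringIO(PROVIDERS_CSV), dtype=str)

# Remove aggregate rows, then convert numeric fields.
cases = cases[~cases["county"].isin(AGGREGATE_LABELS)].copy()
for col in ["cases", "deaths"]:
    cases[col] = to_number(cases[col])
for col in ["fully_vaccinated", "population_16_plus"]:
    vax[col] = to_number(vax[col])

# Drop rows that cannot produce a valid rate (missing values or zero cases).
cases = cases.dropna(subset=["cases", "deaths"])
cases = cases[cases["cases"] > 0]

# Engineered features, using the formulas listed above.
cases["death_rate"] = cases["deaths"] / cases["cases"] * 100
vax["vaccination_rate"] = vax["fully_vaccinated"] / vax["population_16_plus"]

# Provider counts per county.
provider_counts = (
    providers.groupby("county").size().rename("provider_count").reset_index()
)

# Merge on county name; left joins keep every county from the cases table.
merged = cases.merge(vax, on="county", how="left").merge(
    provider_counts, on="county", how="left"
)
merged["provider_count"] = merged["provider_count"].fillna(0).astype(int)

print(merged[["county", "death_rate", "vaccination_rate", "provider_count"]])
```

Left joins are used here so that a county missing from the vaccination or provider file still appears in the result, with NaN or 0 in the missing columns. Merging on county name also assumes the names are spelled identically in every file; in practice a normalization step (trimming whitespace, consistent casing) may be needed first.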

Tech Stack

Python, pandas, NumPy, matplotlib, seaborn, Jupyter Notebook

Key Metrics & Findings

You can customize this section with specific numbers (e.g., “Top county death rate was X%”) once you finalize the analysis.

Key Charts

A few of the visualizations exported from the notebook:

Bar chart of COVID-19 death rate by county
Death rate (%) by county, sorted from highest to lowest.
Bar chart of vaccination rate by county
Vaccination coverage for population 16+ across Alabama counties.
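
A sorted bar chart like the death rate chart described above could be produced along these lines. This is a sketch under stated assumptions: the DataFrame contents, the column names, and the output file name are hypothetical, and the values are placeholders rather than results from the analysis.

```python
from pathlib import Path

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder data; in the notebook this would be the merged county table.
rates = pd.DataFrame(
    {
        "county": ["County A", "County B", "County C"],
        "death_rate": [1.2, 2.5, 1.8],
    }
)

# Sort from highest to lowest so the bars read left to right in rank order.
ordered = rates.sort_values("death_rate", ascending=False)

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(ordered["county"], ordered["death_rate"])
ax.set_xlabel("County")
ax.set_ylabel("Death rate (%)")
ax.set_title("COVID-19 death rate by county")
ax.tick_params(axis="x", rotation=90)  # long county lists need rotated labels
fig.tight_layout()

# Save as a PNG in assets/ (hypothetical file name).
out_dir = Path("assets")
out_dir.mkdir(exist_ok=True)
fig.savefig(out_dir / "death_rate_by_county.png", dpi=150)
```

The same pattern applies to the vaccination chart by swapping in the vaccination_rate column and adjusting the axis label.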


Challenges & Learnings

Project Links