- Project Abstract
- Problem Statement: Diagnostic Analysis of Wholesale Trade Health
- Significance: Data Objectives and Utility
- Configuration and Methodology
- Tableau Dashboard Documentation
- Key Analytical Findings
- Contact
This repository contains an automated diagnostic analysis of the U.S. Wholesale Trade Sector (NAICS 42) health, translating raw U.S. Census Bureau time series data (1992–Present) into actionable economic signals. The project utilizes a reproducible Python script to clean and process data, calculate two crucial metrics—the Inventory-to-Sales (I/S) Ratio and Year-over-Year (YoY) Sales Growth—and outputs a final dataset for immediate visualization.
The resulting Tableau dashboard provides clear, objective insights, allowing users to instantly assess inventory risk (by comparing the I/S Ratio to its historical average) and market momentum (by tracking YoY growth against the 0% contraction baseline).
This project addresses the need for a rapid, objective assessment of the wholesale sector's health. The Python analysis pipeline transforms complex, raw government data into four simple, easy-to-read charts that provide immediate clarity on the sector's current financial and supply/demand dynamics.
The analysis is structured to answer two critical, practical questions using the full historical dataset (1992–Present):
- Question: What is the current velocity of the wholesale market—is it expanding, decelerating, or actively contracting?
- Metric: Year-over-Year (YoY) Sales Growth – the definitive measure of momentum, using the 0% baseline as the non-negotiable signal for market contraction.
- Question: Is the supply chain currently balanced, or are wholesalers holding an unsustainable volume of stock relative to consumption?
- Metric: Inventory-to-Sales (I/S) Ratio – diagnoses supply-demand health by comparing the current ratio against its long-term historical average, establishing a clear benchmark for risk.
The significance of this project lies in its ability to translate raw government data into clear, objective signals, bypassing subjective interpretation of large datasets. The analysis serves three primary utility goals:
Determines if the sector is overstocked (signaling future markdowns) or understocked (signaling potentially lost revenue) by measuring the I/S Ratio against its established historical average.
Calculates and charts the YoY Sales Growth to precisely locate drops below the 0% baseline, providing an objective, data-driven signal of market contraction necessary for economic forecasting.
Provides long-term context (Nominal Sales Trend) and validates the I/S Ratio by visualizing absolute dollar levels of Sales and Inventories to confirm any supply-demand disconnect.
This project processes monthly time series data for Total Merchant Wholesalers (NAICS 42), specifically the Seasonally Adjusted Nominal Estimates in millions of dollars, sourced from the U.S. Census Bureau.
- Original Data Format: Multi-sheet Excel file (.xlsx).
- Conversion: Extracted, cleaned, and saved as
.csv(Sales_Adjusted.csv and Inventories_Adjusted.csv). - Configuration: Uses
pathlibandPROJECT_ROOT = Path(__file__).parent.parentfor platform-independent execution.
Executed using analysis_script.py, structured into three core components:
Data Loading/Cleaning, Core Processing, and Chart Generation.
The load_and_clean_data function standardizes and sanitizes Census data.
| Code Feature | Implementation | Explanation |
|---|---|---|
| Header Handling | pd.read_csv(file_path, header=16) |
Skips the 16-line metadata, reading correct headers. |
| Footnote Stripping | .str.replace(r'[^\w\s]', '', regex=True) |
Removes statistical symbols like p or r. |
| Date Parsing | pd.to_datetime(..., format='%B %Y', errors='coerce') |
Ensures consistent and accurate date conversion. |
| Value Conversion | pd.to_numeric(..., errors='coerce') |
Cleans commas and ensures numeric type reliability. |
The run_data_processing function merges Sales and Inventories data and calculates economic indicators.
| Metric | Formula (Plain Text) | Python Implementation | Economic Interpretation |
|---|---|---|---|
| I/S Ratio | (Inventories_Nominal) / (Sales_Nominal) | merged_df['Inventories_Total_Nominal'] / merged_df['Sales_Total_Nominal'] |
Indicates months to clear stock; higher = supply outpacing demand. |
| YoY Sales Growth | ((Sales_Current - Sales_12_Months_Ago) / Sales_12_Months_Ago) * 100 | merged_df['Sales_Total_Nominal'].pct_change(periods=12) * 100 |
Shows annualized growth rate; adjusts for seasonality. |
generate_charts uses Matplotlib to produce verification charts for validation.
Example: YoY Growth Plot
plt.plot(growth_df['Date'], growth_df['Sales_YoY_Growth'], color='#2ca02c', label='YoY Sales Growth')
plt.axhline(0, color='red', linestyle='-', linewidth=1)- The red 0% baseline serves as the clear contraction threshold.
- Any data below this line signals economic stress.
The final output (merged_wts_data_nominal.csv) is visualized in Tableau through four analytical charts.
- Purpose: Establishes macroeconomic context and confirms long-term growth.
- Feature: Upward slope shows compounding inflation and real growth.
- Purpose: Diagnoses supply-demand balance.
- Feature: Average reference line acts as a risk benchmark.
- Purpose: Acts as a reliable expansion/contraction indicator.
- Feature: The red 0% line is the definitive signal for recessionary stress.
- Purpose: Validates signals generated by the I/S Ratio.
- Feature: Divergence of inventory from sales indicates supply-demand disconnect.
- Contraction Signal: When YoY Growth < 0%, the market is contracting.
- Historical Validation: Matches the 2008 and 2020 economic downturns.
- Overstocked (High Risk): Ratio above average → excess inventory buildup.
- Understocked (Low Risk): Ratio below average → demand outpaces supply.
- Supply/Demand Disconnect: Confirmed when inventories grow faster than sales.
- Market Context: Despite volatility, long-term sales growth remains strong.
Author: Michael Arg.
Email: michgw7@gmail.com