  • Clarkson University
  • Boston

Thooms-coder/README.md

Mutsa Mungoshi

I am an Applied Data Scientist, Data Engineer, and Forward-Deployed Engineer focused on building end-to-end data and machine learning systems that operate in real-world environments.

I am currently pursuing an M.S. in Applied Data Science at Clarkson University (GPA: 4.0), where my work centers on deploying production-grade pipelines, interactive analytical systems, and decision-support tools across industrial, operational, and policy domains.

My work sits at the intersection of:

  • Engineering-driven problem solving
  • Full-stack data systems (OLTP → ETL → analytics → application layer)
  • Applied machine learning and statistical modeling
  • Forward-deployed systems integrated directly with users and workflows

Technical Focus

Data Engineering & Systems
SQL (PostgreSQL, MySQL), data warehousing, dimensional modeling, star schemas, SCD Type 2, ETL/ELT pipelines, incremental processing

Machine Learning & Analytics
Python, R, regression, classification, clustering, PCA, feature engineering, statistical diagnostics

Visualization & Decision Support
Plotly, Tableau, Shiny, interactive dashboards, analytical reporting systems

Applications & Backend Systems
Flask, SQLAlchemy, REST APIs, full CRUD systems, Streamlit

Tools & Infrastructure
Git, Docker, Airflow, Snowflake, AWS (EC2, S3, Lambda)


Selected Projects

BREAK IT — AI Agent Red Teaming Platform (Hackathon Winner)

React (Vite), TypeScript, LLM APIs, Adversarial Testing
Built an adversarial evaluation system for LLM agents, enabling real-time exploitation and behavioral validation of vulnerabilities such as prompt injection, role impersonation, and data exfiltration. Designed a closed-loop pipeline combining agent parsing, automated vulnerability detection, simulation, and model-based exploit validation.

🔗 https://github.com/Thooms-coder/agent-breaker-studio


Gateway Cities Investigative Analytics Platform (Hackathon Winner)

Python, SQL, Streamlit, Plotly, PostgreSQL, LLMs
Developed a civic analytics platform powered by large-scale census data, including a normalized metric warehouse and an LLM-driven copilot for natural language querying, statistical analysis, and interactive visualization.

🔗 https://github.com/Thooms-coder/ma-gateway-cities-dashboard


Multimodal Traffic Sensor Validation System

Python, PyTorch, Signal Processing, Pandas, Plotly
Engineered a multi-branch ETL pipeline integrating audio, image, and sensor data to perform cross-modal validation of traffic systems. Built independent feature pipelines and statistical workflows, reducing false-positive anomaly alerts by 22%.

🔗 https://github.com/Thooms-coder/multimodal-taxi-data-analysis-big-data


ZAGI Data Warehouse

SQL, Data Engineering
Designed and implemented a full OLTP → staging → warehouse pipeline with dimensional modeling, SCD Type 2 handling, incremental loads, and analytical aggregation for a retail and rental system.

🔗 https://github.com/Thooms-coder/zagi-data-warehouse
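
For readers unfamiliar with SCD Type 2, the idea can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the repository (which is SQL): the `DimRow` fields, the `apply_scd2` helper, and the sample customer data are all hypothetical names chosen for the example. When a tracked attribute changes, the current row is closed out and a new version is appended, so history is preserved instead of overwritten.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DimRow:
    # Hypothetical dimension layout, assumed for illustration only.
    customer_id: int            # natural (business) key
    city: str                   # tracked attribute
    valid_from: date
    valid_to: Optional[date]    # None means the row is still current
    is_current: bool


def apply_scd2(dim: List[DimRow], customer_id: int, city: str, load_date: date) -> None:
    """Apply one incoming source record to the dimension, in place."""
    current = next(
        (r for r in dim if r.customer_id == customer_id and r.is_current), None
    )
    if current is not None and current.city == city:
        return  # attribute unchanged: keep the existing version
    if current is not None:
        # Close out the old version rather than overwriting it.
        current.valid_to = load_date
        current.is_current = False
    # Append the new current version.
    dim.append(DimRow(customer_id, city, load_date, None, True))


dim: List[DimRow] = []
apply_scd2(dim, 1, "Potsdam", date(2024, 1, 1))  # first load
apply_scd2(dim, 1, "Potsdam", date(2024, 2, 1))  # unchanged, ignored
apply_scd2(dim, 1, "Boston", date(2024, 3, 1))   # change, new version added
```

Skipping unchanged records is also what keeps a load incremental in this sketch: only rows whose tracked attributes differ from the current version produce writes.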


Applied Experience

Research Assistant — Applied Data Science (Clarkson University)
Built time-series pipelines and analytical models on 50,000+ high-frequency wastewater observations, developing predictive insights and decision-support tools for operational optimization.

Software Developer & Database Engineer (Clarkson University)
Designed and deployed a SQL-backed, forward-deployed system for a 200+ member rowing club, automating scheduling, reporting, and operational workflows.


Contact


Interests

I am interested in roles and collaborations involving:

  • Data Engineering
  • Applied Machine Learning
  • Forward-Deployed Engineering
  • Analytics Systems & Decision Support
  • Research-driven data applications

Pinned

  1. agent-breaker-studio (Public)

    Forked from ClarkOhlenbusch/agent-breaker-studio

    Interactive AI red teaming platform for discovering, exploiting, and validating agent vulnerabilities in real time.

    TypeScript

  2. multimodal-taxi-data-analysis-big-data (Public)

    IA626 Big Data project analyzing multimodal urban traffic data (image and audio) through reproducible ETL pipelines and cross-modal visual analytics to detect anomalies and data quality issues.

    Python

  3. zagi-data-warehouse (Public)

    End-to-end OLTP to data warehouse implementation with SCD Type 2 dimensions, incremental ETL, and analytical aggregates.

    SQL

  4. material_selector_shinyApp (Public)

    Interactive Shiny app for exploring and comparing construction materials. Includes dynamic filtering, Ashby-style plots, and radar chart comparisons, with a polished themed UI.

    R

  5. metro-status-prediction-pipeline (Public)

    Predicting U.S. metro status using structural cost-of-living shares and machine learning.

    Python

  6. ma-gateway-cities-dashboard (Public)

    Foreign-born and economic trends in Massachusetts Gateway Cities (ACS 2010–2024)

    Python