Research Analyst · Data Scientist · Statistician
Research analyst and data scientist with expertise in statistical modelling, ML pipelines, and Monitoring, Evaluation, and Learning (MEL) frameworks for evidence-based decision-making.
Translating complex data into actionable insights for policy, public health, and operational decision-making.
Flight Delay Prediction
Regression pipeline predicting airline delay minutes from flight volume and carrier data.
| Metric | Result |
|---|---|
| R² Score | ~0.95 on monthly aggregates |
| Key Insight | Carrier identity explains ~30% of delay variance beyond traffic volume |
| Stack | Python · Pandas · Scikit-learn · Seaborn · Jupyter |
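
As a rough illustration of how the carrier effect could be separated from traffic volume, the sketch below fits two linear models and compares their R². It is not the repository's pipeline: the data are synthetic, and the column names (`carrier`, `flights`, `delay_minutes`), carrier codes, and effect sizes are all assumptions made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Synthetic stand-in data: carrier codes, effects, and noise are invented.
rng = np.random.default_rng(0)
n = 600
carriers = rng.choice(["AA", "BB", "CC"], size=n)
flights = rng.integers(100, 1000, size=n)
carrier_effect = pd.Series(carriers).map({"AA": 0.0, "BB": 400.0, "CC": 900.0}).to_numpy()
delay = 2.0 * flights + carrier_effect + rng.normal(0.0, 100.0, size=n)
df = pd.DataFrame({"carrier": carriers, "flights": flights, "delay_minutes": delay})


def fit_r2(columns):
    """Fit ordinary least squares on the given columns and return in-sample R^2."""
    # One-hot encode any categorical column; numeric columns pass through unchanged.
    X = pd.get_dummies(df[columns], drop_first=True, dtype=float)
    model = LinearRegression().fit(X, df["delay_minutes"])
    return model.score(X, df["delay_minutes"])


r2_volume = fit_r2(["flights"])            # traffic volume only
r2_full = fit_r2(["flights", "carrier"])   # volume plus carrier identity
print(f"volume only: {r2_volume:.3f}, volume + carrier: {r2_full:.3f}")
```

The gap between the two scores is one simple way to express how much variance carrier identity adds beyond volume. On real data a held-out split would give a more honest comparison than in-sample R².
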
End-to-end ML workflow for diabetes risk prediction using the Pima Indians dataset.
| Metric | Result |
|---|---|
| Accuracy | ~75% with Logistic Regression |
| Segmentation | K-Means (k=3) identifies distinct metabolic-risk patient profiles |
| Stack | Python · Pandas · Scikit-learn · Matplotlib |
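
The two steps in this project, classification and segmentation, might look roughly like the sketch below. It uses a synthetic dataset from `make_classification` as a stand-in (768 rows, 8 features) rather than the actual Pima Indians data, so the accuracy it prints says nothing about the result reported above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumption for illustration only).
X, y = make_classification(n_samples=768, n_features=8, n_informative=4, random_state=0)

# Classification: scale features, then fit logistic regression on a training split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)

# Segmentation: K-Means with k=3 on the standardised features.
segmenter = make_pipeline(
    StandardScaler(), KMeans(n_clusters=3, n_init=10, random_state=0)
)
segments = segmenter.fit_predict(X)

print(f"held-out accuracy: {accuracy:.2f}")
print("segment sizes:", [int((segments == k).sum()) for k in range(3)])
```

Scaling inside the pipeline keeps the test split out of the scaler's statistics, and K-Means in particular is sensitive to feature scale.
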
Compartmental ODE model evaluating educational campaign impact on alcohol-use population dynamics.
| Metric | Result |
|---|---|
| Model Type | SIR-style compartmental ODE |
| Output | Equilibrium behaviour & sensitivity analysis via phase-plane visualisation |
| Stack | Python · SciPy · NumPy · Matplotlib |
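
For readers unfamiliar with compartmental models, here is a minimal SIR-style sketch of the idea. The compartments, the parameter names (`beta`, `gamma`, `epsilon`), their values, and the way the campaign enters the equations (scaling the uptake rate by `1 - epsilon`) are assumptions for illustration, not the formulation used in the repository.

```python
import numpy as np
from scipy.integrate import solve_ivp


def campaign_model(t, y, beta, gamma, epsilon):
    """Hypothetical three-compartment model as population fractions.

    s: susceptible, d: current problem drinkers, r: recovered.
    epsilon in [0, 1] is an assumed campaign efficacy that reduces uptake.
    """
    s, d, r = y
    uptake = beta * (1.0 - epsilon) * s * d
    return [-uptake, uptake - gamma * d, gamma * d]


def simulate(epsilon, beta=0.5, gamma=0.1, days=200.0):
    """Integrate the model from an assumed initial state of 1% drinkers."""
    t_eval = np.linspace(0.0, days, 401)
    return solve_ivp(
        campaign_model, (0.0, days), [0.99, 0.01, 0.0],
        args=(beta, gamma, epsilon), t_eval=t_eval,
    )


baseline = simulate(epsilon=0.0)   # no campaign
campaign = simulate(epsilon=0.5)   # campaign halves the uptake rate
peak_baseline = baseline.y[1].max()
peak_campaign = campaign.y[1].max()
print(f"peak drinking fraction: {peak_baseline:.3f} -> {peak_campaign:.3f}")
```

Sweeping `epsilon` over a grid and recording the peak or final state is a simple form of the sensitivity analysis mentioned in the table.
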
Crime Analysis
Statistical and network-based analysis of crime patterns for intelligence-led decision-making.
| Metric | Result |
|---|---|
| Method | Network centrality & exploratory spatial analysis |
| Insight | Identifies core distribution nodes and brokerage roles in urban networks |
| Stack | Python · NetworkX · Pandas · Matplotlib |
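
The distinction between well-connected nodes and brokers can be sketched on a toy graph. The nodes and edges below are invented for the example: two tight clusters joined only through node `X`. Degree centrality is used as a rough proxy for how connected a node is, and betweenness centrality as a proxy for brokerage.

```python
import networkx as nx

# Toy network (hypothetical): two triangles linked only through "X".
G = nx.Graph()
G.add_edges_from([
    ("A", "B"), ("A", "C"), ("B", "C"),   # cluster 1
    ("C", "X"), ("X", "D"),               # X bridges the clusters
    ("D", "E"), ("D", "F"), ("E", "F"),   # cluster 2
])

degree = nx.degree_centrality(G)            # share of other nodes each node touches
betweenness = nx.betweenness_centrality(G)  # share of shortest paths through each node

top_broker = max(betweenness, key=betweenness.get)
print("highest betweenness:", top_broker)
for node in sorted(G.nodes):
    print(f"{node}: degree={degree[node]:.2f}, betweenness={betweenness[node]:.2f}")
```

Here `X` has fewer direct ties than `C` or `D` but sits on every shortest path between the two clusters, which is the pattern a brokerage analysis looks for.
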
All featured repositories are built to these standards:
- Reproducible: Every repo includes `requirements.txt` and step-by-step execution instructions
- Documented: Methodologies, assumptions, and limitations are explicitly stated
- Licensed: Open-source licenses for reuse and collaboration
- Decision-focused: Outputs designed for policy-makers and operational teams, not just technical peers
- Statistical Modelling: Inference, regression, and experimental design
- Machine Learning: End-to-end pipelines for regression, classification, and clustering
- Public Health Research: Epidemiological analysis and intervention modelling
- Crime & Geospatial Analysis: Pattern detection, network analysis, and spatial statistics
- MEL Systems: Monitoring, Evaluation, and Learning framework design and implementation
- Mathematical Modelling: Compartmental ODEs for assessing public health intervention impact
| Category | Tools |
|---|---|
| Languages | Python · R · SQL · TypeScript |
| Data & ML | Pandas · Scikit-learn · Statsmodels · NumPy |
| Visualization | Matplotlib · Seaborn · Power BI · ggplot2 |
| Data Collection | KoboToolbox · REDCap · ODK · SPSS · STATA |
| Workflows | Git · Jupyter · RMarkdown · GitHub Actions |
Based in Nairobi, Kenya 🇰🇪
Work focuses on applying statistical and machine learning methods to public health, crime analysis, and decision-support systems.
Principles: Reproducibility · Interpretability · Policy-relevant insights
Contributions and replications are welcome. All flagship repositories include open-source licenses and end-to-end execution instructions.
- Developed instructional materials for undergraduate mathematics (Algebra, Calculus)
- Produced LaTeX-based academic and technical documents
- Supported training in data collection tools (KoboToolbox, ODK) and reproducible analysis workflows
- Crime pattern analysis using geospatial and statistical methods
- Strengthening reproducible research workflows with CI/CD for Jupyter notebooks
- Building MEL dashboards for public health programme evaluation
- Research collaborations in public health, epidemiology, and social policy
- MEL system design, implementation, and third-party evaluation
- Data science and analytics consultancy for NGOs and government agencies
- Academic partnerships, peer review, and teaching opportunities
Open to research collaborations, MEL consultancy, and data-driven projects.