Projects Completed

Disease Prediction System

Aug 2023 to Feb 2024

7 ML Models Trained
97.2% Symptom Prediction Accuracy
41 Diseases Covered

Overview

This is an AI-powered health tool that tackles disease prediction from three angles: predicting diseases from symptoms across 41 conditions, assessing risk for diabetes, heart disease, and stroke, and flagging weather-driven disease outbreaks like dengue, malaria, and chikungunya.

You pick from 130+ symptoms and the system runs them through an ensemble of Random Forest and Gradient Boosting classifiers. It gives you ranked predictions with confidence percentages, shows which symptoms matched, and explains everything in plain English. The health risk models take your clinical measurements and return clear risk levels with practical next steps.
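One way the ensemble step could turn two classifiers' outputs into ranked predictions with confidence percentages is soft voting: average each model's per-disease probabilities, then sort. The sketch below is a minimal plain-Python illustration. The write-up does not say how the Random Forest and Gradient Boosting outputs are combined, so the averaging rule, the function name `rank_predictions`, and the sample probabilities are all assumptions.

```python
from typing import Dict, List, Tuple


def rank_predictions(
    model_probs: List[Dict[str, float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Average per-disease probabilities across models (soft voting)
    and return the top_k diseases with confidence as a percentage."""
    if not model_probs:
        raise ValueError("need at least one model's probabilities")
    # Collect every disease that any model reported.
    diseases = set().union(*model_probs)
    averaged = {
        d: sum(p.get(d, 0.0) for p in model_probs) / len(model_probs)
        for d in diseases
    }
    # Highest probability first; ties broken alphabetically for stable output.
    ranked = sorted(averaged.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(d, round(prob * 100, 1)) for d, prob in ranked[:top_k]]


# Hypothetical outputs from the two classifiers for one set of symptoms.
rf_probs = {"Dengue": 0.70, "Malaria": 0.20, "Typhoid": 0.10}
gb_probs = {"Dengue": 0.60, "Malaria": 0.30, "Typhoid": 0.10}
top = rank_predictions([rf_probs, gb_probs], top_k=2)
```

Averaging probabilities rather than counting votes keeps a usable confidence score for every disease, which is what a ranked list needs.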

Key Features

  • Symptom Checker: select from 130+ symptoms, get ranked predictions across 41 diseases with confidence scores
  • Health Risk Assessment: separate models for Type 2 Diabetes (90.5% accuracy), Heart Disease (88.8% accuracy), and Stroke (94.7% accuracy)
  • Weather Disease Alerts: predicts dengue, malaria, chikungunya risk from temperature, humidity, and rainfall
  • Explainable predictions that show which symptoms matched and why each condition was flagged
  • Risk factor identification with tailored health recommendations for each assessment
  • 5 region types supported for weather alerts: Tropical, Subtropical, Temperate, Arid, Mediterranean
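As a rough illustration of how model probabilities might become the risk levels described above, the following sketch buckets a probability into Low, Moderate, or High and applies that to the three weather-driven diseases. The cut-off values and the names `risk_level` and `weather_alerts` are hypothetical; the write-up does not state the thresholds the system actually uses.

```python
from typing import Dict

# Hypothetical cut-offs; the source does not say where the boundaries sit.
LOW_MAX = 0.33
MODERATE_MAX = 0.66


def risk_level(probability: float) -> str:
    """Map a predicted probability in [0, 1] to a coarse risk label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < LOW_MAX:
        return "Low"
    if probability < MODERATE_MAX:
        return "Moderate"
    return "High"


def weather_alerts(disease_probs: Dict[str, float]) -> Dict[str, str]:
    """Label each disease's predicted probability with a risk level."""
    return {disease: risk_level(p) for disease, p in disease_probs.items()}


# Hypothetical probabilities for one temperature/humidity/rainfall reading.
alerts = weather_alerts({"dengue": 0.82, "malaria": 0.41, "chikungunya": 0.12})
```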

How It's Built

The system uses a layered architecture:

  • Frontend: React 18 SPA with TailwindCSS, Framer Motion animations, Recharts for data visualization, and React Select for symptom multi-select input
  • Backend: FastAPI with Pydantic validation, served by Uvicorn. One router per feature domain (symptoms, risk, weather), each backed by its own service layer
  • ML Pipeline: Ensemble of Random Forest + Gradient Boosting for symptom prediction (97.2% accuracy). Separate RF models for diabetes, heart, and stroke risk. Multi-output RF classifier for weather-disease correlations (93–95% accuracy)
  • Infrastructure: Models trained as part of Docker build, so no separate model storage needed. Containerized with Docker Compose for local development

Interesting Challenges

  • Ensemble Design: Combining multiple classifiers improved accuracy, but most gains came from dataset design, not algorithm tuning. Balancing synthetic data to mirror real-world medical distributions was the real engineering challenge
  • Multi-Output Classification: Weather-disease prediction required predicting three disease risks simultaneously. A multi-output Random Forest handled correlated outputs cleanly while maintaining 93–95% accuracy per disease
  • Explainability: Health predictions need transparency. Each result shows which specific inputs drove the prediction, building trust in the system's recommendations
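The "which symptoms matched" part of the explainability story can be sketched with simple set overlap between the user's selection and a per-disease symptom profile. This is only an assumption about the mechanism: the write-up does not describe how matches are computed, and the `profiles` data and the `explain_matches` helper below are made up for illustration. It is not a substitute for model-level feature attribution.

```python
from typing import Dict, List, Set


def explain_matches(
    selected: Set[str], profiles: Dict[str, Set[str]]
) -> List[dict]:
    """List, per disease, which selected symptoms appear in its profile
    and what share of the profile they cover."""
    results = []
    for disease, profile in profiles.items():
        matched = sorted(selected & profile)
        if matched:  # skip diseases with no overlapping symptoms
            results.append(
                {
                    "disease": disease,
                    "matched": matched,
                    "coverage": round(len(matched) / len(profile), 2),
                }
            )
    # Best-covered disease first; ties broken alphabetically.
    results.sort(key=lambda r: (-r["coverage"], r["disease"]))
    return results


# Hypothetical symptom profiles and user selection.
profiles = {
    "Dengue": {"fever", "headache", "joint pain", "rash"},
    "Common Cold": {"cough", "sneezing", "fever"},
    "Migraine": {"headache", "nausea"},
}
explanation = explain_matches({"fever", "headache", "joint pain"}, profiles)
```

Each entry pairs a disease with the exact inputs that pointed to it, which is the kind of detail a plain-English explanation can be built from.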
