Projects Completed

Disease Prediction System

Aug 2023 — Feb 2024

  • ML Models Trained: 7
  • Symptom Prediction Accuracy: 97.2%
  • Diseases Covered: 41

Overview

Disease Prediction System is an AI-powered health companion with three prediction modes: symptom-based prediction across 41 conditions, health risk assessment for diabetes, heart disease, and stroke, and weather-driven alerts for dengue, malaria, and chikungunya.

Users select from 130+ symptoms and the system runs them through an ensemble of Random Forest and Gradient Boosting classifiers. Each prediction includes ranked probable diseases with confidence percentages, matching symptoms, and plain-English descriptions. The health risk models evaluate clinical measurements and return clear risk levels with actionable recommendations.
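To make the ensemble step concrete, here is a minimal sketch of how two classifiers' outputs could be merged into a ranked list with confidence percentages. It assumes soft voting (averaging per-disease probabilities), which the project may or may not use; the function names, disease names, and probability values are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Tuple


def soft_vote(prob_sets: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-disease probabilities across classifiers (assumed soft voting)."""
    diseases = set().union(*prob_sets)
    return {
        d: sum(p.get(d, 0.0) for p in prob_sets) / len(prob_sets)
        for d in diseases
    }


def rank_predictions(avg: Dict[str, float], top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k diseases as (name, confidence percent), highest first."""
    ranked = sorted(avg.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, round(prob * 100, 1)) for name, prob in ranked[:top_k]]


# Hypothetical per-model outputs for one set of selected symptoms.
rf_probs = {"Dengue": 0.75, "Malaria": 0.25, "Typhoid": 0.0}
gb_probs = {"Dengue": 0.5, "Malaria": 0.25, "Typhoid": 0.25}

top = rank_predictions(soft_vote([rf_probs, gb_probs]), top_k=2)
print(top)
```

Averaging probabilities rather than counting votes keeps a usable confidence score for every disease, which is what a ranked result list needs.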

Key Features

  • Symptom Checker — select from 130+ symptoms, get ranked predictions across 41 diseases with confidence scores
  • Health Risk Assessment — separate models for Type 2 Diabetes (90.5%), Heart Disease (88.8%), and Stroke (94.7%)
  • Weather Disease Alerts — predicts dengue, malaria, chikungunya risk from temperature, humidity, and rainfall
  • Explainable predictions — shows which symptoms matched and why each condition was flagged
  • Risk factor identification with tailored health recommendations for each assessment
  • 5 region types supported for weather alerts: Tropical, Subtropical, Temperate, Arid, Mediterranean

How It's Built

The system uses a layered architecture:

  • Frontend: React 18 SPA with TailwindCSS, Framer Motion animations, Recharts for data visualization, and React Select for symptom multi-select input
  • Backend: FastAPI with Pydantic validation and Uvicorn server. One router per feature domain (symptoms, risk, weather) with a clean service layer pattern
  • ML Pipeline: Ensemble of Random Forest + Gradient Boosting for symptom prediction (97.2% accuracy). Separate Random Forest models for diabetes, heart disease, and stroke risk. Multi-output Random Forest classifier for weather-disease correlations (93–95% accuracy)
  • Infrastructure: Models are trained during the Docker build, so no separate model storage is needed. Containerized with Docker Compose for local development
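The router-plus-service split in the backend can be illustrated without any framework. The sketch below is framework-free on purpose: a dataclass stands in for the Pydantic model and a plain function stands in for a FastAPI route, so it is not the project's actual code. All names (SymptomRequest, SymptomService, symptoms_route, KNOWN_SYMPTOMS) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical subset of the symptom vocabulary.
KNOWN_SYMPTOMS = {"fever", "headache", "joint_pain"}


@dataclass(frozen=True)
class SymptomRequest:
    """Validated input; stands in for a Pydantic request model."""
    symptoms: List[str]

    def __post_init__(self) -> None:
        if not self.symptoms:
            raise ValueError("at least one symptom is required")
        unknown = sorted(set(self.symptoms) - KNOWN_SYMPTOMS)
        if unknown:
            raise ValueError(f"unknown symptoms: {unknown}")


class SymptomService:
    """Business logic lives here, independent of the web layer."""

    def predict(self, request: SymptomRequest) -> Dict[str, object]:
        # Placeholder: a real service would call the trained ensemble.
        return {"received": sorted(request.symptoms), "predictions": []}


def symptoms_route(payload: Dict[str, object], service: SymptomService) -> Dict[str, object]:
    """Thin handler: validate, delegate, and map errors to a status code."""
    try:
        request = SymptomRequest(symptoms=list(payload.get("symptoms", [])))
    except ValueError as exc:
        return {"status": 422, "detail": str(exc)}
    return {"status": 200, "body": service.predict(request)}


ok = symptoms_route({"symptoms": ["headache", "fever"]}, SymptomService())
bad = symptoms_route({"symptoms": ["sneezing"]}, SymptomService())
```

Keeping the handler this thin means the risk and weather domains can each get their own route and service with the same shape, and the services stay testable without starting a server.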

Interesting Challenges

  • Ensemble Design: Combining multiple classifiers improved accuracy, but most gains came from dataset design, not algorithm tuning. Balancing synthetic data to mirror real-world medical distributions was the real engineering challenge
  • Multi-Output Classification: Weather-disease prediction required predicting three disease risks simultaneously. A multi-output Random Forest handled correlated outputs cleanly while maintaining 93–95% accuracy per disease
  • Explainability: Health predictions need transparency. Each result shows which specific inputs drove the prediction, building trust in the system's recommendations
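One simple way to surface which inputs matched a predicted condition is a set overlap between the selected symptoms and a per-disease symptom profile. The sketch below assumes that approach; the project may instead derive explanations from the models themselves (for example, feature importances). The profiles and the explain function are hypothetical.

```python
from typing import Dict, Set

# Hypothetical symptom profiles; the real system covers 41 diseases.
DISEASE_PROFILES: Dict[str, Set[str]] = {
    "Dengue": {"fever", "headache", "joint_pain", "rash"},
    "Malaria": {"fever", "chills", "sweating"},
}


def explain(selected: Set[str], disease: str) -> Dict[str, object]:
    """List the selected symptoms that appear in the disease's profile."""
    profile = DISEASE_PROFILES[disease]
    matched = sorted(selected & profile)
    return {
        "disease": disease,
        "matched_symptoms": matched,
        "coverage": f"{len(matched)}/{len(profile)}",
    }


explanation = explain({"fever", "headache", "cough"}, "Dengue")
print(explanation)
```

Here "cough" is ignored because it is not part of the Dengue profile, so the user sees only the inputs that actually supported the flagged condition.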
