An Explainable Disease Surveillance System for Early Prediction of Multiple Chronic Diseases

Shaheer Ahmad Khan, Muhammad Usamah Shahid, Ahmad Abdullah, Ibrahim Hashmat, Muddassar Farooq · January 27, 2025

Summary

This paper presents an explainable disease surveillance system that uses routine EHR data (medical history, vitals, diagnoses, and medications) to predict multiple chronic diseases 3 to 12 months before diagnosis. Three models are trained for each disease; they are validated internally with F1 score and AUROC, then reviewed by expert physicians for clinical relevance. Explainability comes from three sources: Shapley attributes, surrogate models, and a new rule engineering framework. The goal is a clinically useful, practical, and explainable predictor that flags risk up to one year in advance, supporting preventive measures and lower healthcare costs.
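To make the idea of Shapley-based attribution concrete, here is a minimal sketch that computes exact Shapley values for a single patient record by enumerating feature subsets. Everything in it is an assumption for illustration: the `toy_risk` function, its three features (BMI, systolic blood pressure, a statin flag), the coefficients, and the baseline record are hypothetical and are not the paper's models or data. Exact enumeration is only feasible for a handful of features; practical systems typically rely on approximations.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for record x relative to a baseline record.

    Features absent from a coalition keep their baseline value.
    """
    n = len(x)
    features = list(range(n))

    def value(subset):
        # Build a record where only features in `subset` take the patient's values.
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score (illustration only, not the paper's model)."""
    bmi, sbp, statin = z
    score = 0.02 * (bmi - 25) + 0.01 * (sbp - 120) + 0.1 * statin
    if bmi > 30 and sbp > 140:
        score += 0.2  # assumed interaction between obesity and hypertension
    return score


patient = [34.0, 150.0, 1.0]   # hypothetical patient: BMI, systolic BP, on statin
baseline = [25.0, 120.0, 0.0]  # hypothetical reference record

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["bmi", "systolic_bp", "on_statin"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The attributions sum to the difference between the patient's score and the baseline score, which is what lets a clinician read each value as that feature's share of the predicted risk. In this toy model the interaction term is split evenly between BMI and blood pressure.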

Key findings

Introduction

Background

Overview of disease surveillance systems

Importance of early disease prediction

Challenges in traditional disease prediction methods

Objective

Aim of the explainable disease surveillance system

Key features and benefits

Method

Data Collection

Sources of routine EHR data

Data types included (medical history, vitals, diagnoses, medications)

Data Preprocessing

Data cleaning and normalization

Handling missing values

Model Training

Selection of models for each disease

Internal validation using F1 scores and AUROC

Clinical Relevance Evaluation

Expert physician review process

Criteria for clinical relevance

Enhancing Explainability

Shapley Attributes

Explanation of Shapley values

How they contribute to model interpretability

Surrogate Models

Use of simpler models to explain complex predictions

Benefits and limitations

Rule Engineering Framework

Development of rules for model predictions

Integration with clinical guidelines

System Evaluation

Performance Metrics

Metrics used for model evaluation

Comparison with existing systems

Clinical Utility

Assessment of system's impact on healthcare

Case studies or pilot project results

Conclusion

Future Directions

Ongoing research and development

Potential for scalability and integration

Impact on Healthcare

Expected improvements in preventive measures

Reduction in healthcare costs

Summary of Key Findings

Recap of system's capabilities and benefits

Basic info

papers

machine learning

artificial intelligence

Insights

What methods are used to enhance the explainability of the disease surveillance system?

What is the ultimate goal of developing this explainable disease surveillance system?

How does the system predict multiple chronic diseases 3-12 months before diagnosis?

What is the main idea behind the explainable disease surveillance system?