AUTOCT: Automating Interpretable Clinical Trial Prediction with LLM Agents
Fengze Liu, Haoyu Wang, Joonhyuk Cho, Dan Roth, Andrew W. Lo · June 04, 2025
Summary
The AUTOCT framework combines large language models (LLMs) with classical machine learning (ML) to autonomously generate features for clinical trial prediction. It uses Monte Carlo Tree Search (MCTS) for optimization and needs fewer iterations than current methods. AUTOCT constructs tabular features and draws on unstructured data to improve predictive capability. It proposes 10 features for phase 1 trial outcome prediction, covering intervention type, demographics, previous trial success, research team, funding, primary outcome, location, and eligibility criteria. Key factors for future feature design include route of administration, dosing regimen, previous exposure, safety profiles, and participant health status.
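To make the search idea concrete, the following is a minimal sketch of a generic Monte Carlo Tree Search over feature subsets using the standard UCT selection rule. It is not the authors' implementation: this summary does not describe how AUTOCT structures its tree or scores candidates. The names `CANDIDATES`, `toy_score`, `Node`, and `search` are hypothetical, and `toy_score` merely stands in for whatever validation metric a classical model would report for a given feature set.

```python
import math
import random

# Hypothetical candidate features, named after areas listed in the summary.
CANDIDATES = ["intervention_type", "demographics", "previous_trial_success",
              "research_team", "funding"]


def toy_score(features):
    """Stand-in for a model's validation score (an assumption, not from the paper)."""
    useful = {"intervention_type": 0.3, "previous_trial_success": 0.4, "funding": 0.1}
    # Reward useful features, with a small penalty per feature added.
    return sum(useful.get(f, 0.0) for f in features) - 0.02 * len(features)


class Node:
    """One tree node: a partial feature set plus visit statistics."""

    def __init__(self, features, parent=None):
        self.features = features
        self.parent = parent
        self.children = {}
        self.visits = 0
        self.total = 0.0

    def untried(self):
        return [f for f in CANDIDATES
                if f not in self.features and f not in self.children]


def uct(child, parent_visits, c=1.4):
    """Standard UCT: mean reward plus an exploration bonus."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(iterations=200, seed=0):
    rng = random.Random(seed)
    root = Node(())
    best = ((), toy_score(()))
    for _ in range(iterations):
        node = root
        # Selection: descend while the current node is fully expanded.
        while not node.untried() and node.children:
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: uct(ch, parent.visits))
        # Expansion: add one untried feature as a new child.
        options = node.untried()
        if options:
            feature = rng.choice(options)
            child = Node(node.features + (feature,), parent=node)
            node.children[feature] = child
            node = child
        # Evaluation: score the node's feature set directly (no random rollout).
        reward = toy_score(node.features)
        if reward > best[1]:
            best = (node.features, reward)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best


if __name__ == "__main__":
    features, score = search()
    print(features, round(score, 3))
```

In a real pipeline the scoring step would be the expensive part (proposing a feature, building it, retraining a model), which is why a search that needs fewer iterations matters.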
Introduction
  Background
    Overview of clinical trial prediction challenges
    Importance of accurate prediction in clinical research
  Objective
    Aim of the AUTOCT framework
    How it addresses limitations in current clinical trial prediction methods
Method
  Data Collection
    Types of data utilized (structured, unstructured)
    Sources of data (public databases, clinical trial registries)
  Data Preprocessing
    Techniques for handling unstructured data
    Methods for integrating structured and unstructured data
  Model Integration
    How large language models are combined with classical ML
    Role of Monte Carlo Tree Search in optimization
  Feature Construction
    Generation of tabular features from unstructured data
    Selection of 10 key features for phase 1 trial outcome prediction
  Evaluation
    Metrics for assessing prediction accuracy
    Comparison with existing methods
Key Features for Phase 1 Trial Outcome Prediction
  Intervention Type
    Importance in predicting trial success
  Demographics
    Influence on trial outcomes
  Previous Trial Success
    Relevance in assessing potential
  Research Team
    Impact on trial execution and success
  Funding
    Role in resource allocation and trial quality
  Primary Outcome
    Significance in defining trial objectives
  Location
    Considerations for geographical and logistical factors
  Eligibility Criteria
    Importance in participant selection and trial design
Future Feature Design Considerations
  Route of Administration
    Impact on drug efficacy and safety
  Dosing Regimen
    Influence on treatment outcomes and compliance
  Previous Exposure
    Relevance in understanding patient response
  Safety Profiles
    Importance in risk assessment and trial planning
  Participant Health Status
    Role in predicting potential complications and outcomes
Conclusion
  Summary of AUTOCT's contributions
  Future directions and potential improvements
  Impact on clinical trial design and management
Basic info
papers
machine learning
artificial intelligence
Insights
What are the 10 proposed features by AUTOCT for predicting phase 1 clinical trial outcomes, and what data sources do they utilize?
What are the key advantages of using Monte Carlo Tree Search within the AUTOCT framework compared to existing optimization methods in clinical trial feature engineering?
How does the AUTOCT framework integrate large language models with classical machine learning techniques for autonomous feature generation in clinical trial prediction?
According to the text, what are the key factors to consider when designing future features for clinical trial outcome prediction, beyond the initial 10 proposed by AUTOCT?