SELF-PACED | ONLINE | GO FROM BEGINNER TO ADVANCED LEVEL
🚨 NOTE 🚨: This course is offered only in self-paced mode. To access the course, enroll using the Enroll Now button and you will immediately get access to all course materials in your dashboard. There are no live cohort sessions. Please ignore any live start dates shown on the website. Support is available via Slack and email.
This is an end-to-end, hands-on machine learning program built specifically for molecular drug discovery. You will learn how to build real predictive models on chemical and bioactivity datasets, handle chemical data bias, validate models correctly, and deliver results that hold up under real-world constraints.
The curriculum is designed for professionals and advanced researchers who want practical ML capability in drug discovery, not generic ML theory. You will work with real molecular representations, partitioning strategies that prevent leakage, and model evaluation methods that reflect how discovery teams actually use ML.
This is not an “AI overview” course and it is not a collection of toy notebooks. You will implement full workflows from data collection to deployment-ready model evaluation, including the interpretability and explainability methods that are critical when ML outputs drive scientific decisions.
WHO THIS IS FOR
This course is for you if you want to apply machine learning to real drug discovery problems, using real molecular datasets and workflows, not toy examples.
You are a strong fit if you are:
- A medicinal or computational chemist transitioning into molecular machine learning
- A cheminformatics or drug discovery scientist building predictive models
- An ML engineer working in life sciences who wants domain-correct modeling workflows
- A PhD student or postdoc who wants industry-grade modeling skills in molecular property prediction
This course is also the right next step if your goal is generative AI for molecules, because it builds the foundations that determine whether your training data, objectives, and evaluation are even valid. Generative modeling without these foundations is guessing with confidence.
WHAT YOU WILL BE ABLE TO DO
By the end of the program, you will be able to build end-to-end molecular ML pipelines that are reliable, defensible, and usable for drug discovery decision-making.
You will be able to:
- Collect and structure molecular drug discovery datasets from real sources
- Run exploratory analysis of bioactivity data and visualize molecules
- Encode molecules using practical representations suited to ML workflows
- Build robust train-test splits using chemical clustering and scaffold-based partitioning to avoid leakage
- Train and tune classical and advanced ML models for molecular prediction tasks, including bias-aware workflows
- Handle chemical data bias using methods that reduce false confidence and improve generalization
- Interpret model behavior using explainability tools like feature importance, LIME, and SHAP, so outputs can be trusted in scientific settings
- Complete capstone-level molecular ML projects such as toxicity prediction, solubility prediction, and drug repurposing
Most importantly: you will be able to evaluate models like a real drug discovery team, instead of optimizing metrics that collapse in the real world.
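To make the scaffold-based partitioning mentioned above concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the course's own implementation: the function name `scaffold_split`, the example molecule IDs, and the scaffold labels are all hypothetical, and the scaffold strings are assumed to be precomputed (in practice they are often derived from SMILES with a cheminformatics toolkit such as RDKit). The idea is that every molecule sharing a scaffold lands on the same side of the split, so the test set contains only chemotypes the model has not seen.

```python
from collections import defaultdict


def scaffold_split(records, test_fraction=0.2):
    """Split (molecule_id, scaffold) pairs so that no scaffold spans both sets.

    Hypothetical helper for illustration; scaffolds are assumed precomputed.
    """
    # Group molecule IDs by their scaffold label.
    groups = defaultdict(list)
    for mol_id, scaffold in records:
        groups[scaffold].append(mol_id)

    # Process the largest scaffold groups first; ties are broken by the
    # scaffold string so the result is deterministic.
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    # Maximum number of molecules allowed in the training set.
    train_cap = len(records) - round(test_fraction * len(records))

    train, test = [], []
    for _scaffold, members in ordered:
        # A whole group goes to train if it fits, otherwise to test.
        if len(train) + len(members) <= train_cap:
            train.extend(members)
        else:
            test.extend(members)
    return train, test


# Hypothetical toy data: (molecule_id, precomputed scaffold label).
records = [
    ("mol01", "c1ccccc1"), ("mol02", "c1ccccc1"),
    ("mol03", "c1ccccc1"), ("mol04", "c1ccccc1"),
    ("mol05", "c1ccncc1"), ("mol06", "c1ccncc1"), ("mol07", "c1ccncc1"),
    ("mol08", "C1CCCCC1"), ("mol09", "C1CCCCC1"),
    ("mol10", "c1ccoc1"),
]

train_ids, test_ids = scaffold_split(records, test_fraction=0.2)
print("train:", train_ids)
print("test:", test_ids)
```

A random split of the same records could place close analogues of one scaffold in both sets, which tends to inflate test metrics. Grouping by scaffold trades an exact split ratio for a stricter, more realistic estimate of generalization.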






