Multimodal healthcare AI is challenging because images, video, and clinical metadata rarely arrive clean or aligned with one another. My main goal is to keep the pipeline reproducible, measurable, and interpretable.
What matters most
- Reliable preprocessing across heterogeneous medical sources.
- Strong experiment tracking and versioning for repeatable results.
- Explainability with SHAP and Grad-CAM for clinical review.
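As one illustration of the versioning point above, here is a minimal sketch of a run fingerprint, assuming an experiment is fully described by a config dictionary plus the raw bytes of its input files. The function name `run_fingerprint` and its arguments are hypothetical and not taken from any tracking library; a real setup would more likely rely on a dedicated tool.

```python
import hashlib
import json
from typing import Mapping


def run_fingerprint(config: Mapping[str, object], data_files: Mapping[str, bytes]) -> str:
    """Return a short, deterministic ID for one experiment run (illustrative helper)."""
    digest = hashlib.sha256()
    # sort_keys makes the JSON text independent of dict insertion order.
    digest.update(json.dumps(config, sort_keys=True).encode("utf-8"))
    # Hash files in name order so the ID does not depend on listing order.
    for name in sorted(data_files):
        digest.update(name.encode("utf-8"))
        digest.update(hashlib.sha256(data_files[name]).digest())
    # Twelve hex characters are enough to label runs in logs and filenames.
    return digest.hexdigest()[:12]


if __name__ == "__main__":
    config = {"seed": 7, "lr": 0.01}          # hypothetical settings
    files = {"metadata.csv": b"id,age\n1,54"}  # hypothetical input
    print(run_fingerprint(config, files))
```

Two runs with the same settings and the same input bytes get the same ID, while any change to either produces a different one, which makes it easy to tell whether a result is truly a repeat.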
My approach
I like to start with a simple baseline, then improve data quality, feature engineering, and evaluation before increasing model complexity. That usually gives faster and more trustworthy progress than jumping directly to a large architecture.
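The simplest baseline of this kind can be sketched as a majority-class predictor. This is only an illustration under the assumption of a plain classification task with label lists; `majority_baseline` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def majority_baseline(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    # The most frequent label in the training split becomes the only prediction.
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for y in test_labels if y == majority_label)
    return majority_label, hits / len(test_labels)


if __name__ == "__main__":
    label, accuracy = majority_baseline([0, 0, 0, 1], [0, 1, 0, 0])
    print(label, accuracy)
```

Any later model should beat this number on the same split before its added complexity is justified. On imbalanced clinical labels the baseline accuracy can already look high, which is a reason to report it alongside other metrics.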