I'm an MSc Health Data Science student at the University of Birmingham. I build and evaluate machine-learning systems for healthcare, on multimodal data that spans multi-omics, medical imaging, and clinical records. My focus is what decides whether a model is actually used: whether it can be explained, audited, and trusted. In healthcare that means designing for clinical sign-off and regulation from the start.
Right now I'm finishing my MSc dissertation on brain-tumour segmentation.
I work end to end, from raw data to the evidence a model needs before anyone signs it off. My main areas:
- Machine learning on multimodal data, across multi-omics, medical imaging, and clinical or tabular records
- Explainable and responsible AI, including model auditing, model risk, and SHAP-based interpretation
- Simulation, decision analysis, and risk modelling, from Monte Carlo cost-effectiveness to risk scoring
- Data governance and regulation, from data-fabric design to privacy and compliance frameworks
- Critical appraisal and communication, including how data and figures can mislead
A reproducible multi-omics pipeline that fuses metabolomics, biochemistry, and diet to classify colorectal cancer, comparing intermediate and late fusion.
- Methods: DIABLO, regularised CCA, Random Forests, stacked logistic regression, SHAP, surrogate trees
- Impact: Recovers coherent shared biology and verified markers, while showing fusion adds little to raw prediction
- Tools: Python, R (mixOmics), scikit-learn
A safety audit of a diabetic retinopathy classifier using the Medical Algorithmic Audit framework.
- Methods: algorithmic auditing, subgroup testing, adversarial robustness, FMEA risk scoring
- Impact: Exposes failure modes that headline accuracy hides, scored the way a model-risk review would
- Tools: Python
A Monte Carlo framework that estimates whether an AI triage tool is worth funding, running synthetic cohorts through two care pathways.
- Methods: Monte Carlo, Bayesian updating, ICER and QALYs, one-way sensitivity analysis
- Impact: Turns an accuracy question into a cost-effectiveness decision under uncertainty
- Tools: Python, NumPy, SciPy
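To make the idea concrete, here is a minimal sketch of the core calculation, not the project's actual code. It assumes one synthetic cohort run through a standard pathway and an AI-triage pathway, then reports the incremental cost-effectiveness ratio (ICER), i.e. the extra cost per QALY gained. The function names, distributions, and every parameter value (the fixed tool cost of 150, the QALY gain range) are hypothetical and chosen only for illustration.

```python
import numpy as np


def icer(cost_new, qaly_new, cost_old, qaly_old):
    """Incremental cost per QALY gained, computed from mean costs and QALYs."""
    d_cost = np.mean(cost_new) - np.mean(cost_old)
    d_qaly = np.mean(qaly_new) - np.mean(qaly_old)
    if d_qaly == 0:
        raise ValueError("No QALY difference between pathways; ICER is undefined.")
    return d_cost / d_qaly


def simulate_icer(n_patients=10_000, seed=0):
    """Run one synthetic cohort through two pathways (all inputs are assumptions)."""
    rng = np.random.default_rng(seed)

    # Standard pathway: illustrative per-patient costs and QALYs.
    cost_old = rng.gamma(shape=4.0, scale=250.0, size=n_patients)
    qaly_old = rng.beta(8.0, 2.0, size=n_patients)

    # AI-triage pathway: the same patients, plus a hypothetical fixed tool cost
    # and a small, uncertain QALY gain per patient.
    cost_new = cost_old + 150.0
    qaly_new = qaly_old + rng.uniform(0.0, 0.04, size=n_patients)

    return icer(cost_new, qaly_new, cost_old, qaly_old)


if __name__ == "__main__":
    print(f"Estimated ICER: {simulate_icer():,.0f} per QALY gained")
```

The resulting ICER would then be compared against a willingness-to-pay threshold, and rerunning the simulation while varying one input at a time gives a simple one-way sensitivity analysis.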
- Retinal Fundus Classification: transfer learning that compares CNNs and Vision Transformers for diabetic retinopathy grading, with Grad-CAM. (Python, PyTorch)
- Healthcare Revenue Cycle Risk Prediction: flags patient-level financial risk on synthetic EHR data so a billing team can intervene early, with about a 9.4x lift over baseline. (Python, scikit-learn)
- Pharmacogenomics-Guided Medication Safety in CVD: a health-data implementation plan for CYP2C19-guided antiplatelet prescribing in the UAE, using data-fabric design, CPIC and PharmCAT translation, and HL7 FHIR Genomics with CDS Hooks.
- Explainable AI in Cancer Research: a review of deep learning and explainable AI for multimodal data integration in oncology, across thirteen case studies.
- Bias in Genomic Data: a critical analysis of ancestry bias in GWAS and polygenic risk scores, using the 2024 All of Us controversy.
More projects are in my repositories.
I'm looking for roles where I take models from data to deployment in settings where the result has to be trusted: clinical, regulated, or otherwise high-stakes. I'm most engaged by:
- Applied machine learning and responsible AI
- Model risk, auditing, and evaluation
- Simulation, decision analysis, and risk modelling
I'm drawn to teams that treat explainability and real-world deployment as part of the engineering, not an afterthought.
Programming Languages: Python (pandas, scikit-learn, SHAP, PyTorch, matplotlib) • R (mixOmics, tidyverse, statistical modelling)
Machine Learning: Classification • Survival analysis • Model stacking and fusion • Explainability and model auditing (SHAP, surrogate models) • Transfer learning
Quantitative Methods: Monte Carlo simulation • Bayesian updating • Cost-effectiveness and decision analysis • Risk scoring • Sensitivity analysis
Health and Multi-Omics Data: Multi-omics integration (DIABLO, rCCA) • EHR and claims data • Pharmacogenomics • Medical imaging • Data governance (PDPL, ADHICS, SaMD)
Tools: Git/GitHub • Jupyter • RStudio • Reproducible workflows • BioRender
Based in Dubai, UAE. Open to roles in the UAE or remote.
I'm especially drawn to challenges that matter in the Gulf and Middle East:
- Precision medicine and genomics in diverse populations
- Equitable, well-governed health-data systems
- Risk, cost, and decision modelling for health and public services
I'm always happy to discuss machine learning, responsible AI, and work that moves from analysis to real decisions.
GitHub: @riyashet-hds • LinkedIn: linkedin.com/in/riyashet • Email: riyashet.psy@gmail.com
Updated: June 2026