Soham-Lodh/README.md

Soham Lodh

B.Tech CSE (AI & ML) · KIIT University · CGPA 9.31

Backend-heavy full-stack engineer with a focus on ML systems — from pipeline architecture through production inference. I build things that ship: deployed APIs, live ML endpoints, and multi-role applications with real users.

📍 Kolkata, India · sohamlodh06@gmail.com · LinkedIn · Portfolio


What I Build

End-to-end ML pipelines (feature engineering → model selection → FastAPI inference) and production full-stack systems (MERN, Next.js, TypeScript). I care about correctness under load, clean API contracts, and things that work in production, not just on localhost.


Featured Work

RiskLens — Credit Default Risk Prediction Platform

Logistic Regression · Optuna · SMOTETomek · SHAP · FastAPI · React

Live · GitHub

An 18-stage ML pipeline on 50,000 records across three relational data sources. Key engineering decisions: stratified splitting before EDA to prevent leakage, VIF-based multicollinearity elimination (5 features dropped), WoE/IV feature selection (10 retained, IV up to 2.38), and SMOTETomek resampling to address a 10.6% class imbalance. The Logistic Regression model, tuned by Bayesian optimisation via Optuna (50 trials, TPE sampler), outperformed a tuned XGBoost baseline.
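The WoE/IV selection step can be illustrated with a minimal sketch. It assumes a single categorical (or pre-binned) feature and a binary target where 1 means default. The function name `woe_iv` and the additive smoothing count `eps` are assumptions made for illustration, not details taken from the RiskLens code.

```python
import math
from collections import Counter


def woe_iv(values, targets, eps=0.5):
    """Return ({category: WoE}, IV) for one categorical or pre-binned feature.

    targets: 1 = default (bad), 0 = non-default (good).
    eps is an additive smoothing count (an assumption of this sketch)
    that keeps the logarithm finite when a category has no goods or no bads.
    """
    good, bad = Counter(), Counter()
    for value, target in zip(values, targets):
        if target == 1:
            bad[value] += 1
        else:
            good[value] += 1

    categories = set(good) | set(bad)
    total_good = sum(good.values()) + eps * len(categories)
    total_bad = sum(bad.values()) + eps * len(categories)

    woe, iv = {}, 0.0
    for cat in categories:
        share_good = (good[cat] + eps) / total_good
        share_bad = (bad[cat] + eps) / total_bad
        woe[cat] = math.log(share_good / share_bad)
        # Each category contributes (share_good - share_bad) * WoE to the IV.
        iv += (share_good - share_bad) * woe[cat]
    return woe, iv


values = ["own"] * 4 + ["rent"] * 4
targets = [0, 0, 0, 1, 0, 1, 1, 1]
woe, iv = woe_iv(values, targets)
```

A feature whose categories have identical good and bad shares gets an IV of zero, so ranking features by IV and keeping the top ones is one plausible way to arrive at a retained set like the ten mentioned above.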

| Metric | Result |
| --- | --- |
| AUC-ROC | 0.9834 |
| Macro F1 | 0.9776 |
| KS Statistic | 85.91 |

Inference served via FastAPI + Uvicorn on Render. Frontend on Vercel with SHAP explainability and a logit-space credit scoring algorithm mapping default probability to a 300–900 institutional risk scale.
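One common way to map a default probability onto a bounded score is log-odds scaling, sketched below. The anchor values (`base_score`, `base_odds`, and `pdo`, the points needed to double the odds) are illustrative assumptions. The calibration RiskLens actually uses is not stated here.

```python
import math


def credit_score(p_default, base_score=600.0, base_odds=1.0, pdo=50.0,
                 lo=300, hi=900):
    """Map a default probability to a score in [lo, hi] via log-odds scaling.

    base_score, base_odds, and pdo are hypothetical calibration values
    chosen for this sketch only.
    """
    # Clip so the logit stays finite at p = 0 or p = 1.
    p = min(max(p_default, 1e-9), 1 - 1e-9)
    good_odds = (1 - p) / p  # odds of NOT defaulting
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    raw = offset + factor * math.log(good_odds)
    # Clamp to the institutional scale and return a whole-number score.
    return int(round(min(max(raw, lo), hi)))
```

With these assumed anchors, a probability of 0.5 lands on the base score, and each doubling of the odds of repayment adds `pdo` points until the score saturates at the ends of the 300–900 range.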


PremiumIQ — Health Insurance Premium Predictor

XGBoost · RandomizedSearchCV · FastAPI · React · Vercel

Live · GitHub

End-to-end regression pipeline on 9,905 records. Designed a composite total_risk_score feature from compound medical history fields to replace high-cardinality categoricals, which materially improved model signal. Validated 18 features via VIF (all < 2.5), applied a hybrid ordinal + one-hot encoding pipeline, and tuned XGBoost via RandomizedSearchCV.
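A composite risk feature of this kind might be built as in the sketch below. It assumes the medical history arrives as a single string with conditions joined by `&`. The condition names, the weights in `RISK_WEIGHTS`, and the min-max normalisation are all hypothetical, since the real values are not given here.

```python
# Hypothetical per-condition weights; the project's real values are not stated.
RISK_WEIGHTS = {
    "diabetes": 6,
    "heart disease": 8,
    "high blood pressure": 6,
    "thyroid": 5,
    "no disease": 0,
}


def total_risk_score(medical_history, sep="&"):
    """Sum the weights of every condition named in a compound history field."""
    parts = [part.strip().lower() for part in medical_history.split(sep)]
    # Unknown conditions contribute 0 in this sketch.
    return sum(RISK_WEIGHTS.get(part, 0) for part in parts if part)


def normalise(scores):
    """Min-max scale a list of raw scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(score - lo) / (hi - lo) for score in scores]
```

Collapsing every combination of conditions into one numeric column avoids one-hot encoding each distinct combination, which is the high-cardinality problem the paragraph above describes.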

| Metric | Result |
| --- | --- |
| R² Score | 0.9944 |
| vs. Linear Regression baseline | 0.956 |
| vs. Tuned Random Forest | 0.978 |

Prescripto — Medical Appointment Platform

MERN · JWT · Cloudinary · Vercel

Live (Patient) · Live (Admin/Doctor) · GitHub

Production MERN platform across two independently deployed frontends backed by a centralised Express REST API on MongoDB Atlas. Role-based JWT authentication with separate login flows and permission-scoped dashboards for three account types. Appointment scheduling with slot-conflict detection; hardened with input validation, bcrypt hashing, and rate limiting.
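Slot-conflict detection usually reduces to an interval-overlap test per doctor. The sketch below is written in Python for illustration (the platform itself is MERN), and the tuple layout, the `can_book` name, and the 30-minute default slot length are assumptions.

```python
from datetime import datetime, timedelta


def overlaps(start_a, end_a, start_b, end_b):
    """Half-open intervals [start, end) overlap iff each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def can_book(existing, doctor_id, start, minutes=30):
    """Return True if the requested slot does not clash with the doctor's bookings.

    existing: iterable of (doctor_id, start, end) tuples (a hypothetical layout).
    """
    end = start + timedelta(minutes=minutes)
    for booked_doctor, booked_start, booked_end in existing:
        if booked_doctor == doctor_id and overlaps(start, end, booked_start, booked_end):
            return False
    return True


nine = datetime(2025, 1, 6, 9, 0)
existing = [("doc1", nine, nine + timedelta(minutes=30))]
```

Treating slots as half-open intervals lets back-to-back appointments share a boundary without counting as a conflict. In a concurrent API the check and the insert would also need to be atomic, for example through a unique index on doctor and slot start.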


Engineering Experience

Technical Member — Web Development · Coding Ninjas KIIT Chapter (Aug 2025 – Present)

  • Architected a full-stack event management platform (Next.js, Hono, Prisma, MongoDB) with a real-time leaderboard handling concurrent score streams from multiple judges; engineered the write pipeline to prevent race conditions under parallel load.
  • Built a secure quiz platform with server-side session tracking, tab-switch detection, timed execution, and randomised question delivery.
  • Developed a Project Management System with four role levels enforcing access-controlled workflows via middleware; authored API contracts and permission-logic docs for downstream contributors.
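The race condition in a leaderboard fed by several judges is a lost update: two writers read the same total, each adds to it, and one write overwrites the other. The sketch below shows the idea with an in-process lock in Python. The lock stands in for whatever mechanism the platform actually uses (for instance an atomic database increment), and the `Leaderboard` class is hypothetical.

```python
import threading
from collections import defaultdict


class Leaderboard:
    """Accumulates scores from many writers without losing updates."""

    def __init__(self):
        self._lock = threading.Lock()
        self._totals = defaultdict(int)

    def add_score(self, team, points):
        # The read-modify-write happens under the lock, so concurrent
        # judges cannot overwrite each other's increments.
        with self._lock:
            self._totals[team] += points

    def ranking(self):
        # Copy under the lock, then sort outside it to keep the
        # critical section short.
        with self._lock:
            snapshot = dict(self._totals)
        return sorted(snapshot.items(), key=lambda item: (-item[1], item[0]))


board = Leaderboard()


def judge():
    for _ in range(1000):
        board.add_score("alpha", 1)


threads = [threading.Thread(target=judge) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
board.add_score("beta", 5)
```

Eight simulated judges each submit 1,000 one-point scores for the same team, and the final total accounts for every submission.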

Open Source Contributor · GirlScript Summer of Code (Jul – Oct 2025)

  • Implemented authentication modules: OTP login, JWT sessions, Google OAuth, and password recovery — with unit tests across success and failure paths for all four flows.
  • Diagnosed and resolved production-level defects by reproducing failure conditions from logs; fixes reviewed and merged into main.
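As an illustration of the OTP login flow and its success and failure paths, here is a minimal Python sketch. The 5-minute validity window, the record layout, and the function names are assumptions, and a real deployment would persist the record server-side and deliver the code over email or SMS.

```python
import hashlib
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # assumed 5-minute validity window


def issue_otp(now=None):
    """Create a 6-digit code and the record a server would store for it."""
    now = time.time() if now is None else now
    code = f"{secrets.randbelow(10**6):06d}"
    record = {
        # Store only a digest so the raw code never sits in storage.
        "digest": hashlib.sha256(code.encode()).hexdigest(),
        "expires_at": now + OTP_TTL_SECONDS,
        "used": False,
    }
    return code, record


def verify_otp(record, submitted, now=None):
    """Accept a code once, before expiry, using a constant-time comparison."""
    now = time.time() if now is None else now
    if record["used"] or now > record["expires_at"]:
        return False
    digest = hashlib.sha256(submitted.encode()).hexdigest()
    if not hmac.compare_digest(digest, record["digest"]):
        return False
    record["used"] = True  # one-time use: a second attempt must fail
    return True
```

Passing `now` explicitly makes the expiry and reuse paths easy to unit test without waiting on the clock.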

Education

KIIT University — B.Tech, CSE (AI & ML) · 2024–2028 · CGPA 9.31

Relevant Coursework: Data Structures & Algorithms, Operating Systems, DBMS, OOP


Certifications

  • Classify Images with TensorFlow on Google Cloud — Google (2025)
  • DSA in Python — NPTEL (2025)
  • Postman API Student Expert — Postman (2025)

Popular repositories

  1. DSA-Journey (Public)

    Tracking my Data Structures & Algorithms journey in Java — solving problems consistently and improving problem-solving skills.

    Java 2

  2. Calculator (Public)

    A calculator to perform various tasks like Arithmetic, Matrix, Vector, Complex calculation

    C 1

  3. Prescripto (Public)

    A full-stack MERN web application that allows patients to browse doctors by specialty, view details, and book appointments online. Doctors can manage their schedules, while admins oversee the entir…

    JavaScript 1

  4. Soham-Lodh (Public)

    1

  5. PremiumIQ (Public)

    End-to-end ML pipeline that predicts annual health insurance premiums using XGBoost — featuring custom risk score engineering, hyperparameter tuning, and a deployed frontend.

    Jupyter Notebook 1

  6. Portfolio (Public)

    TypeScript 1