


👩‍💻 About Me

Data Analyst & ML Developer who builds and deploys real, working AI systems — not just notebooks.

I'm a Computer Science undergraduate at ITER, SOA University (graduating 2027). I was selected for the McKinsey Forward Program, the Google Gen AI Academy APAC, and the Guidewire Hackathon (all 2026), and I qualified for Round 2 of the L&T Technology Services Techgium national hackathon (Oct 2025).

My work spans RAG-based LLM systems, ML pipelines, business intelligence dashboards, and funnel analytics — all deployed end-to-end.

  • 🔭 Currently building Churn Autopsy — a production-style churn prediction system with a FastAPI inference layer and SHAP explainability
  • 🧠 Exploring LLMs, vector databases, and agentic AI architectures
  • 📊 Building analytics tools that translate raw data into business decisions
  • 📍 Based in India — open to Data Analyst / Junior Data Scientist internships & fresher roles (India & International)

🏆 Achievements & Recognition

| Programme | Organisation | Year |
| --- | --- | --- |
| Forward Program (Selected Participant) | McKinsey & Company | 2026 |
| Gen AI Academy APAC (Selected Participant) | Google | 2026 |
| Guidewire Hackathon (Seed 2 Phase) | Guidewire Software | 2026 |
| Techgium National Hackathon (Round 2 Qualifier) | L&T Technology Services | Oct 2025 |

🚀 Featured Projects

Why customers leave — predicted before they do

Stack: Python · Scikit-learn · SMOTE · SHAP · FastAPI · Streamlit · Joblib · Pydantic

Highlights:

  • 🧠 Trained Logistic Regression vs Random Forest vs Gradient Boosting — best model selected via 5-fold stratified CV (ROC-AUC: 0.744)
  • ⚖️ SMOTE oversampling inside the pipeline to handle 33% class imbalance; resampling runs on training folds only, so the test set is never touched
  • 🔌 Served predictions via FastAPI REST endpoint (POST /predict) returning churn probability + top-3 SHAP explanations + business retention action per customer
  • 📊 Interactive Streamlit dashboard consuming the API — risk banner, probability bar, SHAP reason cards, recommended intervention
  • 🏗️ Production-style separation: src/train.py → models/churn_pipeline.pkl → api/main.py → app.py
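
The response described above (churn probability, top-3 explanations, retention action) could be assembled roughly as follows. This is a minimal sketch, not the project's actual code: the function names, feature names, thresholds, and action labels are all hypothetical, and the contributions dictionary stands in for per-customer SHAP values computed elsewhere.

```python
from __future__ import annotations


def top_reasons(contributions: dict[str, float], k: int = 3) -> list[dict]:
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return [{"feature": name, "contribution": value} for name, value in ranked[:k]]


def retention_action(probability: float) -> str:
    """Map a churn probability to an action (thresholds are assumptions)."""
    if probability >= 0.7:
        return "priority retention call"
    if probability >= 0.4:
        return "targeted discount offer"
    return "no intervention"


def build_response(probability: float, contributions: dict[str, float]) -> dict:
    """Bundle probability, explanations, and action into one payload."""
    return {
        "churn_probability": round(probability, 3),
        "top_reasons": top_reasons(contributions),
        "recommended_action": retention_action(probability),
    }


# Hypothetical customer: signed contributions stand in for SHAP values.
example = build_response(
    0.82,
    {
        "tenure_months": -0.05,
        "monthly_charges": 0.30,
        "month_to_month_contract": 0.22,
        "support_tickets": 0.10,
    },
)
```

Ranking by absolute value keeps strongly protective features (negative contributions) eligible for the top three, which is one reasonable design choice among several.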

Ask questions over any PDF in natural language — powered by a full RAG pipeline

Stack: Python · LangChain · FAISS · Sentence Transformers · Groq LLM · Streamlit · PyPDF

Highlights:

  • 📄 Multi-PDF ingestion with semantic chunking
  • 🔍 FAISS vector store for fast similarity search
  • 💬 Context-aware Q&A with source-page citation
  • ⚡ Groq LLM for low-latency inference — answers across 100-page documents in under 5 seconds
  • 🖥️ Deployed chat-style UI via Streamlit
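
The retrieval step of such a pipeline can be illustrated without any external libraries. The sketch below is a plain-Python stand-in: a bag-of-words counter replaces Sentence Transformers embeddings, and a sorted scan replaces the FAISS index. The chunk texts, page numbers, and function names are hypothetical; the point is only that each chunk carries its source page so the answer can cite it.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (stand-in for a sentence-embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the question."""
    query = embed(question)
    return sorted(chunks, key=lambda c: cosine(query, embed(c["text"])), reverse=True)[:k]


# Hypothetical chunks; each keeps the page it came from for citation.
chunks = [
    {"page": 1, "text": "The warranty covers parts and labour for two years"},
    {"page": 7, "text": "Returns are accepted within thirty days of purchase"},
    {"page": 12, "text": "Shipping is free for orders above fifty dollars"},
]

hits = retrieve("how long is the warranty", chunks, k=1)
```

In a real pipeline the retrieved chunks and their page numbers would be passed to the LLM as context, and the page numbers returned alongside the answer.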

Production-deployed ML pipeline with real-time cluster predictions

Stack: Python · Scikit-learn · KMeans · Pandas · Joblib · Streamlit · Matplotlib

Highlights:

  • 🧮 StandardScaler + KMeans (5 clusters) pipeline with Joblib model persistence — iterated across 72 commits
  • 📈 Named business profiles: VIP · Upsell Targets · Retention Risk · Budget Loyalists · Discount Seekers
  • 💡 Dashboard auto-generates segment-specific marketing recommendations at prediction time
  • 📥 Downloadable CSV prediction reports
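
At prediction time, a StandardScaler + KMeans pipeline scales the incoming features and assigns the nearest centroid. The sketch below reproduces that step in plain Python under stated assumptions: the two features, the scaling statistics, the centroid positions, and the recommendation texts are all hypothetical; only the five profile names come from the list above.

```python
import math

# Hypothetical training statistics for two features:
# (annual_spend, visits_per_month).
FEATURE_MEAN = (500.0, 4.0)
FEATURE_STD = (250.0, 2.0)

# Hypothetical centroids in standardized space, keyed by profile name.
CENTROIDS = {
    "VIP": (1.5, 1.5),
    "Upsell Targets": (1.0, -1.0),
    "Retention Risk": (0.0, -1.5),
    "Budget Loyalists": (-1.0, 1.0),
    "Discount Seekers": (-1.5, -0.5),
}

# Hypothetical segment-specific recommendations.
RECOMMENDATIONS = {
    "VIP": "offer early access to new products",
    "Upsell Targets": "promote premium bundles",
    "Retention Risk": "send a win-back campaign",
    "Budget Loyalists": "reward with loyalty points",
    "Discount Seekers": "target with time-limited coupons",
}


def predict_segment(features: tuple) -> str:
    """Standardize the features, then pick the closest centroid."""
    scaled = tuple((x - m) / s for x, m, s in zip(features, FEATURE_MEAN, FEATURE_STD))
    return min(CENTROIDS, key=lambda name: math.dist(scaled, CENTROIDS[name]))


segment = predict_segment((1000.0, 8.0))
recommendation = RECOMMENDATIONS[segment]
```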

5,000-row employee survey → 6 publication-quality charts → HR policy brief

Stack: Python · Pandas · Seaborn · Plotly · Jupyter Notebooks

Highlights:

  • 📋 76.1% of employees reported a mental health condition; 51.1% lacked employer resources
  • 📉 Stress rates highest in Finance (35.6%), Healthcare (35.3%), Education (34.5%)
  • 📝 Findings structured as a 4-recommendation HR policy brief written for a non-analytical audience

Track user drop-off across 4 product lifecycle stages

Stack: Python · Pandas · Plotly

Highlights:

  • 📉 63.7% of users dropped at Site Visit → Sign-Up — identified as the single highest-leverage intervention point
  • 📱 Mobile conversion (0.6%) trailed desktop (4.2%) by 3.6 percentage points — representing $8,828 in recoverable monthly revenue
  • 📊 Output structured as a one-page prioritisation brief written for a product manager
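
The drop-off figures above follow from simple stage-to-stage arithmetic. In this sketch the stage counts are hypothetical, chosen so that the first transition reproduces the 63.7% drop; the stage names after Sign-Up are also assumptions. The revenue estimate is not reproduced because its inputs are not stated here.

```python
def stage_dropoff(stages: list) -> list:
    """Given (name, users) pairs in funnel order, return (transition, drop rate)."""
    result = []
    for (prev_name, prev_users), (name, users) in zip(stages, stages[1:]):
        result.append((f"{prev_name} -> {name}", 1 - users / prev_users))
    return result


# Hypothetical counts for a 4-stage funnel.
stages = [
    ("Site Visit", 10000),
    ("Sign-Up", 3630),
    ("Activation", 2000),
    ("Purchase", 1200),
]

# The transition with the largest drop is the first candidate for intervention.
worst = max(stage_dropoff(stages), key=lambda item: item[1])

# Desktop vs mobile conversion gap in percentage points.
gap_pp = round(4.2 - 0.6, 1)
```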

Revenue seasonality, top-SKU identification, and regional patterns

Stack: Python · Pandas · Plotly · SQL

Highlights:

  • 📦 SQL window functions and CTEs to identify top-margin SKUs driving disproportionate returns across 8 categories
  • 🗺️ Regional revenue pattern analysis via interactive Plotly dashboards
  • 💰 Monthly trend and category-mix views structured for a category manager audience
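
A query of the kind described above, a CTE feeding a window function, might look like the following. This is a sketch against an in-memory SQLite database (window functions require SQLite 3.25 or later); the table, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (sku TEXT, category TEXT, revenue REAL, cost REAL);
    INSERT INTO sales VALUES
        ('A1', 'Audio',   500, 300),
        ('A2', 'Audio',   400, 150),
        ('A1', 'Audio',   100,  60),
        ('K1', 'Kitchen', 300, 250),
        ('K2', 'Kitchen', 200,  90);
    """
)

# CTE 1 aggregates margin per SKU; CTE 2 ranks SKUs within each category.
QUERY = """
WITH sku_margin AS (
    SELECT category, sku, SUM(revenue - cost) AS margin
    FROM sales
    GROUP BY category, sku
),
ranked AS (
    SELECT category, sku, margin,
           RANK() OVER (PARTITION BY category ORDER BY margin DESC) AS margin_rank
    FROM sku_margin
)
SELECT category, sku, margin
FROM ranked
WHERE margin_rank = 1
ORDER BY category;
"""

top_skus = conn.execute(QUERY).fetchall()
```

Using RANK() rather than ROW_NUMBER() keeps ties, so two SKUs with equal margin in a category would both be reported.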

Automated dataset profiling — no code required for the end user

Stack: Python · Pandas · Streamlit · Plotly

Highlights:

  • 🔍 Flags implicit nulls, mixed column types, and silent zero-inflation — anomalies visual inspection misses
  • 📊 One-click stakeholder-ready Excel/CSV report generation
  • ⏱️ Reduces a 30+ minute manual profiling process to a single file upload
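
The three checks named above can be sketched for a single column in plain Python. The placeholder strings treated as implicit nulls, the 50% zero threshold, and the function names are assumptions for illustration, not the tool's actual rules.

```python
# Hypothetical set of strings that usually mean "missing".
IMPLICIT_NULLS = {"", "na", "n/a", "null", "none", "-", "?"}


def is_implicit_null(value) -> bool:
    """True for placeholder strings such as 'N/A' or '-'."""
    return isinstance(value, str) and value.strip().lower() in IMPLICIT_NULLS


def is_number(value) -> bool:
    """True for ints and floats, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def profile_column(values: list, zero_threshold: float = 0.5) -> list:
    """Return the list of quality flags raised by one column."""
    flags = []
    if any(is_implicit_null(v) for v in values):
        flags.append("implicit_nulls")

    present = [v for v in values if v is not None and not is_implicit_null(v)]
    kinds = {"number" if is_number(v) else type(v).__name__ for v in present}
    if len(kinds) > 1:
        flags.append("mixed_types")

    numbers = [v for v in present if is_number(v)]
    if numbers and sum(1 for v in numbers if v == 0) / len(numbers) >= zero_threshold:
        flags.append("zero_inflated")
    return flags
```

For example, `profile_column([0, 0, 0, 5, "N/A"])` raises both the implicit-null and zero-inflation flags, while a clean numeric column raises none.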

🛠️ Tech Stack

AI & Machine Learning

Python LangChain FAISS scikit-learn FastAPI SHAP

Data & Analytics

Pandas NumPy Plotly Matplotlib Seaborn

BI & Deployment

Streamlit Power BI Tableau Jupyter

Tools & Cloud

SQL Git VS Code Google Cloud




🏗️ What Sets Me Apart

| Strength | Evidence |
| --- | --- |
| Ships production-style systems | Churn Autopsy: training pipeline → REST API → dashboard (3 independent layers) |
| Explains ML decisions | SHAP per-customer explanations, not just global feature importance |
| Translates data into decisions | Mental Health EDA → HR policy brief; Funnel tool → $8,828 revenue opportunity identified |
| Selected by global organisations | McKinsey · Google · Guidewire · L&T Technology Services |
| Builds for real users | All deployed apps have clean UIs designed for non-technical audiences |
| End-to-end ML engineering | train.py → .pkl → FastAPI endpoint → Streamlit UI |

📬 Let's Connect

Actively seeking Data Analyst / Junior Data Scientist / ML Engineer internships and fresher roles — open to India and international opportunities.

LinkedIn Email



"Build things that are useful. Then make them better."

Pinned

  1. Kavyanjali-Karan-Amazon-BA-Intern-3100573 Public

    BA-style analysis built for Amazon's BA Intern role (Job ID 3100573) — revenue trends, customer RFM segmentation, delivery funnel & one-page recommendation brief. By Kavyanjali Karan | SOA Universi…

    Jupyter Notebook 1

  2. ai-pdf-rag-chatbot Public

    RAG-based PDF chatbot — upload any PDF and query it in natural language using LangChain, FAISS, and Groq LLM

    Python 1

  3. customer-segmentation-ml Public

    Production-ready customer segmentation using Scikit-learn KMeans pipeline with real-time Streamlit dashboard and marketing recommendations

    Jupyter Notebook 1

  4. -mental-health-eda Public

    Exploratory data analysis on 5,000-row employee mental health survey — 6 charts, key insights, HR policy recommendations

    HTML 1

  5. saas-funnel-analysis Public

    Python/Plotly funnel analytics tool tracking user drop-off across 4 SaaS lifecycle stages — identifies highest-churn conversion points

    Python 1

  6. ecommerce-analytics Public

    E-commerce sales analytics dashboard — revenue seasonality, top-SKU identification, and regional patterns using Pandas and Plotly

    Python 1