Shubhaditya14/README.md

Shubhaditya

I build ML systems from the ground up, not to avoid abstractions but to understand what's underneath them.

Third-year AI/ML student at RV College of Engineering, Bengaluru. I care about inference speed, distributed compute, and what actually happens when you run a model on constrained hardware.


What I've Built

Transformer Inference Accelerator — FPGA-based hardware accelerator for transformer inference on Artix-7. INT8 post-training quantization pipeline, UART host interface, custom memory layout for weight tiling. Built as a capstone with a PCB team; I own the ML side end-to-end.

NietzscheGPT — Character-level GPT trained on the complete works of Nietzsche. Then extended into a Tiny Inference Engine with KV-cache and INT8 PTQ. Followed Karpathy's nanoGPT then went further.
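For readers unfamiliar with INT8 post-training quantization (PTQ), here is a minimal sketch of the general idea, assuming symmetric per-tensor scaling. It is not code from either project; the function names `quantize_int8` and `dequantize_int8` are hypothetical, and a real pipeline would likely use per-channel scales and calibration data.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Returns the integer values and the scale needed to map them back.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float weights from the integer representation."""
    return [q * scale for q in quantized]


# Illustrative weights (made up for this example).
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize_int8(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

The rounding error per weight is bounded by half the scale, which is why the largest-magnitude weight in a tensor determines how much precision the rest of the tensor keeps.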

ScratchGPT — GPT-2 implemented from scratch. No Hugging Face, no shortcuts.

Distributed Training Orchestrator — Master-worker architecture over gRPC. Fault tolerance, checkpoint recovery, all-reduce semantics. Built to understand what frameworks like PyTorch Distributed are actually doing.
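As a rough illustration of what "all-reduce semantics" means, the sketch below simulates it in a single process: every worker contributes a gradient vector and every worker ends up with the same averaged result. This is a naive reduce-then-broadcast, not a ring all-reduce and not the orchestrator's gRPC implementation; `all_reduce_mean` is a hypothetical name.

```python
def all_reduce_mean(worker_grads):
    """Average gradients element-wise and hand every worker an identical copy.

    worker_grads: one equal-length list of floats per worker.
    """
    num_workers = len(worker_grads)
    # Reduce step: sum each gradient position across all workers.
    summed = [sum(values) for values in zip(*worker_grads)]
    mean = [s / num_workers for s in summed]
    # Broadcast step: each worker receives its own copy of the result.
    return [list(mean) for _ in range(num_workers)]


# Two simulated workers with made-up gradients.
grads = [[1.0, 2.0], [3.0, 4.0]]
synced = all_reduce_mean(grads)
```

After the call, both entries of `synced` hold the same averaged gradient, so each worker can apply an identical optimizer step.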

Federated Learning Aggregation Server — FL system designed for hospital deployment. Hadoop HDFS + MapReduce + Apache Hive for gradient aggregation, FastAPI for the coordination layer. Local simulation as showcase.

Music Recognition — Shazam clone built from scratch. Spectrogram fingerprinting, hash-based matching.
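The fingerprinting and hash-based matching idea can be sketched as follows, assuming spectrogram peaks have already been extracted as `(time_index, freq_bin)` pairs. This is a simplified illustration of the general Shazam-style approach, not the repository's code; the function names, `fan_out`, `max_dt`, and the sample peaks are all assumptions.

```python
from collections import Counter, defaultdict
import hashlib


def fingerprint(peaks, fan_out=3, max_dt=10):
    """Pair each peak with a few later peaks and hash (f1, f2, time delta).

    peaks: list of (time_index, freq_bin) tuples sorted by time.
    Returns a list of (hash, anchor_time) tuples.
    """
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:
                continue  # skip peaks in the same frame
            key = f"{f1}|{f2}|{dt}".encode()
            hashes.append((hashlib.sha1(key).hexdigest()[:10], t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def build_index(songs):
    """Map each hash to the (song_id, anchor_time) pairs that produced it."""
    index = defaultdict(list)
    for song_id, peaks in songs.items():
        for h, t in fingerprint(peaks):
            index[h].append((song_id, t))
    return index


def match(query_peaks, index):
    """Vote on (song, time offset); a true match piles votes on one offset."""
    votes = Counter()
    for h, query_t in fingerprint(query_peaks):
        for song_id, db_t in index.get(h, []):
            votes[(song_id, db_t - query_t)] += 1
    if not votes:
        return None
    (song_id, offset), count = votes.most_common(1)[0]
    return song_id, offset, count


# Made-up peaks; the query is an excerpt of song_a starting one frame in.
songs = {
    "song_a": [(0, 10), (1, 20), (3, 15), (5, 30), (8, 12)],
    "song_b": [(0, 11), (2, 21), (4, 16), (6, 31)],
}
index = build_index(songs)
result = match([(0, 20), (2, 15), (4, 30), (7, 12)], index)
```

Hashing the time delta rather than absolute times is what lets a short excerpt match anywhere in the track: the hashes agree, and the consistent offset between database and query times identifies where the excerpt starts.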


What I'm Interested In

  • ML inference at the edge — making models fast and cheap to run
  • Distributed training internals — what happens below the framework
  • Hardware-software co-design — FPGA, quantization, memory layout
  • Systems that actually ship, not just benchmarks

Stack

Python · C · PyTorch · gRPC · VHDL · FastAPI · Hadoop · PostgreSQL · Linux


Currently

Looking for a stipended ML engineering / AI infrastructure internship in Bengaluru.
If something I've built interests you, reach out.

📧 shubhadityabechan2004@gmail.com 🔗 https://www.linkedin.com/in/shubhaditya-bechan/

Pinned

  1. nietzsche-gpt (Public)

    GPT implemented from scratch, trained on the complete works of Friedrich Nietzsche

    Python

  2. ScratchGPT (Public)

    GPT-2 implemented from scratch

    Python

  3. Distributed (Public)

    Master-worker distributed training orchestrator with gRPC, fault tolerance, and checkpoint recovery.

    Python

  4. MovieShow (Public)

    A website for logging and rating the movies you've watched, with a recommendation engine

    TypeScript

  5. music-recognition (Public)

    A Shazam clone made from scratch.

    Python