sumeetkhillare/README.md

Hi, I'm Sumeet Khillare 👋

🎓 MS in Computer Science @ North Carolina State University (GPA: 4.0)
💻 Software Engineer with 3+ years experience in Distributed Systems, Security, and Machine Learning
⚡ Building scalable, real-time data pipelines and threat detection systems


🧠 Skills

Distributed Systems & Streaming

Kafka, Apache Flink, Siddhi CEP, Event-driven architectures

Backend Development

Java (Spring Boot), Python (Flask, Django), REST APIs

Cloud & DevOps

Kubernetes, Docker, Azure, Google Cloud, Rancher

Data & Storage

Elasticsearch, Redis, PostgreSQL, Logstash, Kibana

Machine Learning

Supervised & Unsupervised Learning, Neural Networks, RAG, Generative AI, Graph Neural Networks


🚀 Featured Projects

Distributed Security Threat Detection System

  • Designed and implemented a real-time container security framework on Kubernetes using Self-Supervised Hybrid Learning (SHIL) to detect anomalies from system call patterns with low false positives.
  • Built a multi-stage anomaly detection pipeline combining autoencoders, Isolation Forest, and Random Forest to accurately classify malicious behavior in dynamic container environments.
  • Developed a per-container monitoring architecture (Monitor Pods) with syscall streaming via Tetragon and Kafka, enabling scalable and fault-tolerant threat detection across distributed clusters.
  • Implemented similarity-based zero-day attack detection using Jaccard similarity on syscall signatures, allowing rapid identification and classification of previously unseen attacks.
  • Achieved near real-time detection (≤8s lead time) for critical vulnerabilities (e.g., Log4j RCE, path traversal), with optimized memory usage (reduced from ~700MiB to ~300MiB) through selective syscall processing.
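
The similarity-based matching described above can be illustrated with a minimal sketch. This is not the project's code: the signature names, the syscall sets, and the 0.5 threshold are hypothetical, and signatures are assumed to be plain sets of syscall names.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def closest_signature(observed, known, threshold=0.5):
    """Return (label, score) for the best-matching known signature.

    The label is None when even the best score is below the threshold,
    i.e. the observed behavior matches nothing we have seen before.
    """
    best_label, best_score = None, 0.0
    for label, signature in known.items():
        score = jaccard(observed, signature)
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score
    return best_label, best_score


# Hypothetical signatures: sets of syscall names observed during an attack.
KNOWN = {
    "reverse_shell": {"socket", "connect", "dup2", "execve"},
    "path_traversal": {"openat", "read", "stat", "getdents64"},
}

observed = {"socket", "connect", "dup2", "execve", "fork"}
label, score = closest_signature(observed, KNOWN)
print(label, round(score, 2))  # reverse_shell 0.8
```

An unseen attack that shares most of its syscalls with a known one still scores high, which is what makes a set-overlap measure a plausible fit for classifying variants.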

View Project


Improving the Efficiency of Graph Neural Networks (GNNs)

  • Designed and optimized a Graph Neural Network (GraphSAGE) pipeline on the OGBN-Products dataset (~2.4M nodes, ~124M edges), improving scalability for large-scale graph learning tasks.
  • Improved model efficiency by applying autoencoder-based feature compression (100 → 32 dims), achieving ~68% memory reduction and +2% accuracy gain.
  • Developed cosine similarity–based graph pruning, reducing edge count significantly and achieving up to 7× throughput improvement while maintaining model performance.
  • Implemented knowledge distillation (GNN → MLP) to eliminate graph dependency at inference, resulting in 100× faster inference time with comparable accuracy.
  • Evaluated multiple optimization strategies (quantization, pruning, sparsification, distillation) and identified best trade-offs between accuracy, compute, and latency for real-world deployment.
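
As a rough illustration of cosine-similarity-based edge pruning, the sketch below keeps an edge only when its two endpoint feature vectors are similar enough. It assumes that edges between dissimilar nodes are the ones dropped and uses a hypothetical threshold of 0.5 and toy features; the project's actual criterion and implementation may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def prune_edges(edges, features, threshold=0.5):
    """Keep only edges whose endpoint features have cosine similarity >= threshold."""
    return [
        (src, dst)
        for src, dst in edges
        if cosine(features[src], features[dst]) >= threshold
    ]


# Toy graph: nodes 0 and 1 have similar features, node 2 is orthogonal to node 0.
features = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
edges = [(0, 1), (1, 2), (0, 2)]

kept = prune_edges(edges, features)
print(kept)  # [(0, 1)]
```

Fewer edges mean fewer neighbors to sample and aggregate per node, which is where a throughput gain would come from.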

LLM-based Cache Replacement Policies

  • Designed and implemented CacheForge, an LLM-guided evolutionary framework to automatically generate and optimize cache replacement policies, improving system performance beyond traditional heuristics like LRU.
  • Developed a surrogate machine learning model (code embeddings + metadata fusion with MLP) to predict IPC, reducing expensive ChampSim simulations and enabling scalable exploration of thousands of policies.
  • Built an iterative optimization pipeline combining LLM-based refinement, semantic crossover, and redesign strategies, enabling structured exploration of the policy search space.
  • Implemented lineage tracking and episodic memory systems to maintain policy evolution history and guide future LLM generations using past successes and failures.
  • Integrated system with ChampSim cycle-accurate simulator, evaluating policies across diverse workloads and optimizing for harmonic mean IPC, achieving up to 0.4164 IPC.
  • Conducted sensitivity studies and threshold-based analysis (surrogate filtering at 0.39 vs 0.4) to balance exploration vs exploitation, improving model reliability and selection efficiency.
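
For readers unfamiliar with the metric, here is a small sketch of harmonic-mean IPC and of threshold-based surrogate filtering. The per-workload IPC values, policy names, and predicted scores are made up for illustration; only the 0.39 and 0.4 thresholds come from the description above, and the function names are hypothetical.

```python
def harmonic_mean(values):
    """Harmonic mean n / sum(1/x); penalizes a policy that is slow on any workload."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs at least one value, all positive")
    return len(values) / sum(1.0 / v for v in values)


def shortlist(predicted_ipc, threshold):
    """Names of policies whose surrogate-predicted IPC clears the threshold.

    Only these would be sent on to the expensive simulator.
    """
    return [name for name, ipc in predicted_ipc.items() if ipc >= threshold]


# Hypothetical per-workload IPC for one policy.
print(harmonic_mean([0.5, 0.25]))

# Hypothetical surrogate predictions for three candidate policies.
PREDICTED = {"policy_a": 0.395, "policy_b": 0.41, "policy_c": 0.37}
print(shortlist(PREDICTED, 0.39))  # ['policy_a', 'policy_b']
print(shortlist(PREDICTED, 0.40))  # ['policy_b']
```

A lower threshold lets more borderline candidates through to simulation (more exploration), while a higher one saves simulation time at the risk of discarding a policy the surrogate underestimated.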

View Project


AutoCodeSage

  • Designed and built AutoCodeSage, a multi-agent LLM system for automated code smell detection, explanation, and refactoring, improving maintainability beyond traditional linters.
  • Architected a 4-agent pipeline (Finder, Explainer, Patcher, Evaluator) with a centralized coordinator to enable modular, interpretable, and iterative code optimization.
  • Implemented AST-based static analysis + software metrics (cyclomatic complexity, Halstead metrics, Maintainability Index) to detect structural code smells with higher recall than tools like flake8/pylint.
  • Developed LLM-driven explanation and patch generation system producing structured, minimal diffs with safety checks, ensuring localized and readable refactoring.
  • Built a test-driven validation framework (pytest + isolated execution) to verify semantic correctness, achieving high patch success rates and preserving behavior (>0.98 similarity).
  • Evaluated system using CodeBERT-based metrics (BLEU, cosine similarity, F1) and demonstrated superior performance over single-agent LLMs in detection accuracy, explanation quality, and safe refactoring.
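
To make the AST-based metrics concrete, the sketch below approximates McCabe cyclomatic complexity with Python's standard `ast` module. It is a simplified count (one plus the number of decision points) written for illustration, not AutoCodeSage's analyzer; the set of node types counted and the sample function are assumptions.

```python
import ast

# Node types treated as a single decision point each (an illustrative choice).
BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths: one per additional operand.
            complexity += len(node.values) - 1
    return complexity


SAMPLE = '''
def classify(n):
    if n < 0 and n % 2:
        return "neg-odd"
    for i in range(n):
        if i > 10:
            return "big"
    return "small"
'''

print(cyclomatic_complexity(SAMPLE))  # 5
```

A detector could flag functions whose score exceeds a chosen limit as candidates for the later explanation and patching stages.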

LogJam: Multi-Stage Log4j Exploitation CTF Challenge

  • Designed and implemented a multi-stage Web CTF challenge (“LogJam”) simulating a real-world Log4j vulnerability leading to Remote Code Execution (RCE).
  • Built a Spring Boot-based distributed architecture involving a public API, internal vulnerable service, and LDAP server to emulate realistic attack surfaces.
  • Exploited and demonstrated Log4j 2.14.1 JNDI injection (CVE-style vulnerability pattern) to enable remote class loading and execution via crafted payloads.
  • Engineered a two-stage exploitation flow, requiring participants to upload a malicious Java .class payload and trigger execution through a controlled internal endpoint.
  • Implemented secure CTF infrastructure using nsjail-based sandboxing and isolated execution environments to prevent cross-player interference and ensure safe payload execution.
  • Collaborated with infrastructure team to refine system design, improving deployment isolation, internal service routing, and scalability for concurrent players.

View Project

Systems Project: XINU OS Enhancements

XINU OS Environment Setup and System Call Exploration

  • Worked with the XINU operating system to understand low-level system architecture, memory layout, and process management on x86.
  • Implemented core OS-level utilities in C and x86 assembly, including bit manipulation, stack inspection, and segment analysis.
  • Developed debugging and introspection tools to analyze stack behavior, process states, and memory segments using GDB and QEMU.
  • Extended kernel functionality by adding system call tracing and performance profiling (frequency and execution time tracking).
  • Gained hands-on experience with OS internals, compilation toolchains, and emulator-based kernel debugging.

Custom Process Scheduling Algorithms in XINU (Aging & Linux-like Scheduler)

  • Implemented custom process scheduling algorithms in XINU, including Aging-based and Linux-like schedulers to address process starvation.
  • Modified kernel components (e.g., resched, ready, create) to support dynamic priority adjustment and epoch-based scheduling.
  • Designed and integrated fair CPU allocation strategies using priority aging and time-quantum (goodness-based) scheduling.
  • Developed system-level APIs (setschedclass, getschedclass) to switch between scheduling policies at runtime.
  • Evaluated scheduler performance using controlled workloads, ensuring fairness and balanced CPU utilization across processes.
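
The way aging prevents starvation can be shown with a toy simulation. This is a Python model of the idea rather than the XINU kernel code: the `boost` increment, the reset-to-base rule after running, and the process names are assumptions made for the example.

```python
def simulate(base, rounds, boost=1):
    """Return the order in which processes get the CPU.

    base:  dict of process name -> base priority
    boost: amount added to every waiting process each round (the aging step)
    """
    effective = dict(base)
    order = []
    for _ in range(rounds):
        # Pick the process with the highest effective priority.
        chosen = max(effective, key=effective.get)
        order.append(chosen)
        for name in effective:
            if name == chosen:
                effective[name] = base[name]  # running resets accumulated age
            else:
                effective[name] += boost      # waiting processes age upward
    return order


procs = {"high": 5, "low": 1}
print(simulate(procs, 6))           # ['high', 'high', 'high', 'high', 'high', 'low']
print(simulate(procs, 6, boost=0))  # ['high', 'high', 'high', 'high', 'high', 'high']
```

With `boost=0` the low-priority process never runs, which is the starvation problem; with aging it is eventually scheduled once its effective priority overtakes the other process.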

Kernel-Level Synchronization with Readers-Writer Locks and Priority Inheritance in XINU

  • Implemented readers-writer locks in XINU with support for concurrent reads and exclusive writes, extending kernel synchronization primitives beyond semaphores.
  • Designed and enforced priority-based lock scheduling policies to ensure fairness between readers and writers while avoiding starvation.
  • Developed a priority inheritance mechanism to resolve priority inversion, including handling transitive inheritance across dependent processes.
  • Modified kernel-level data structures and system calls (lock, releaseall, ldelete) to support efficient lock management and safe deletion semantics.
  • Built and tested synchronization mechanisms under concurrent workloads, ensuring correctness, fairness, and performance in multi-process environments.
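
Transitive priority inheritance is easiest to see on a small wait-for chain. The sketch below models it in Python under simplifying assumptions: each process waits on at most one lock, priorities are plain integers where larger means more urgent, and the dictionaries `base`, `holds`, and `waiting_on` are hypothetical stand-ins for kernel tables.

```python
def effective_priority(proc, base, holds, waiting_on, _seen=frozenset()):
    """Priority of `proc` after inheriting from everyone blocked behind it.

    base:       process -> base priority
    holds:      process -> set of locks it currently holds
    waiting_on: process -> the lock it is blocked on
    """
    if proc in _seen:  # guard against cycles in a malformed wait-for graph
        return base[proc]
    seen = _seen | {proc}
    prio = base[proc]
    for waiter, lock in waiting_on.items():
        if lock in holds.get(proc, ()):
            # The waiter may itself have inherited a higher priority,
            # so recurse instead of reading its base priority.
            inherited = effective_priority(waiter, base, holds, waiting_on, seen)
            prio = max(prio, inherited)
    return prio


# C (30) waits on L2 held by B (20); B waits on L1 held by A (10).
base = {"A": 10, "B": 20, "C": 30}
holds = {"A": {"L1"}, "B": {"L2"}}
waiting_on = {"B": "L1", "C": "L2"}

for name in ("A", "B", "C"):
    print(name, effective_priority(name, base, holds, waiting_on))
# A 30
# B 30
# C 30
```

Process A never interacts with C directly, yet it runs at C's priority until it releases L1, which is what keeps a medium-priority process from delaying C indefinitely.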

Virtual Memory Management and Demand Paging Implementation in XINU

  • Implemented a virtual memory system with demand paging in XINU, enabling processes to use address spaces larger than physical memory via backing store abstraction.
  • Designed and developed core system calls (xmmap, xmunmap, vcreate, vgetmem, vfreemem) to support memory mapping and per-process virtual heaps.
  • Built key OS data structures including backing store maps and inverted page tables for efficient page lookup, tracking, and replacement.
  • Implemented page fault handling (ISR 14) with on-demand page table creation, address validation, and dynamic page loading from backing store.
  • Developed and compared page replacement algorithms (Second-Chance and FIFO), including dirty page handling and TLB invalidation.
  • Modified kernel-level memory management and context switching to support per-process page directories and address space isolation.
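
The difference between the two replacement policies can be sketched by counting page faults on a reference string. This is a user-space Python model, not the XINU implementation: it assumes a newly loaded page starts with its reference bit cleared, ignores dirty pages and TLB handling, and uses a made-up reference string.

```python
from collections import deque


def fifo_faults(references, num_frames):
    """Count page faults when the oldest resident page is always evicted."""
    frames, resident, faults = deque(), set(), 0
    for page in references:
        if page in resident:
            continue
        faults += 1
        if len(frames) == num_frames:
            resident.discard(frames.popleft())
        frames.append(page)
        resident.add(page)
    return faults


def second_chance_faults(references, num_frames):
    """Count page faults when recently referenced pages get one reprieve."""
    frames, ref_bit, faults = deque(), {}, 0
    for page in references:
        if page in ref_bit:
            ref_bit[page] = True  # hit: mark the page as recently used
            continue
        faults += 1
        if len(frames) == num_frames:
            while True:
                victim = frames.popleft()
                if ref_bit[victim]:
                    ref_bit[victim] = False  # clear the bit, move to the back
                    frames.append(victim)
                else:
                    del ref_bit[victim]      # unreferenced: evict it
                    break
        frames.append(page)
        ref_bit[page] = False
    return faults


refs = [1, 2, 3, 1, 4, 1, 2]
print(fifo_faults(refs, 3))           # 6
print(second_chance_faults(refs, 3))  # 5
```

On this string FIFO evicts page 1 just before it is needed again, while Second-Chance notices the recent reference and evicts page 2 instead.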

🏒 Experience

Forescout Technologies

Software Engineer Intern (2025–Present)
Software Engineer (2022–2024)

  • Built real-time threat detection pipelines (eyeAlert) handling 1M–3M+ events/day
  • Developed Spring Boot microservices for Azure Blob + Service Bus log ingestion
  • Processed 100k–300k messages/day using parallelized execution (ExecutorService)
  • Designed scalable batching pipelines for downstream event processing
  • Developed hybrid Autoencoder + Random Forest model for multi-class threat detection (78% accuracy)
  • Built Flask-based SOAR APIs to automate Palo Alto rule creation (80% reduction in analyst effort)
  • Developed public APIs exporting 500K events/day to external SIEMs
  • Optimized Google Pub/Sub streaming ingestion → 2× throughput improvement
  • Worked on GCP → Azure migration, including production debugging and retention automation
  • Solved critical issues in scaling, data consistency, and pipeline reliability

Qualys Inc – Context XDR

Software Engineer

  • Built CEP-based rule engine for real-time threat detection
  • Integrated Siddhi, Apache Flink, and Kafka for high-throughput stream processing
  • Optimized system for 10k events/sec per instance
  • Added custom query extensions and enabled parallel execution
  • Improved performance of distributed event processing pipelines

🎓 Education

North Carolina State University
MS in Computer Science (GPA: 4.0)

Walchand College of Engineering
B.Tech in Information Technology (GPA: 3.5)


🏆 Achievements

  • 🥈 Forescout Hackathon 2024 – Runner-up
  • 🏅 MLH 2026 – Best Use of Tech View Project
  • 🏅 MLH 2025 – Best Use of Generative AI View Project
  • 🏁 Smart India Hackathon 2020 – Finalist

📫 Connect with Me


⚑ What I'm Currently Working On

  • Real-time threat detection systems
  • Distributed streaming optimizations
  • Applying ML to cybersecurity problems

Pinned

  1. Improving-Performance-of-Evolutionary-Algorithms-using-Docker-Containerization (Public)

    Python 1

  2. Farmers-Friend-SIH2020 (Public)

    HTML

  3. CSC-724-SHIL (Public)

    Python 1

  4. dlba_project (Public)

    Jupyter Notebook

  5. OS_Projects (Public)

    C