Nagharjun17

Pinned

  1. MCP-Ollama-Client Public

    Lightweight MCP client that uses a local Ollama LLM to query multiple MCP servers defined in config.json

    Python · 7 stars · 2 forks

  2. CUDA-Custom-Kernels Public

    Contains my CUDA kernel implementations and benchmarks, such as Tiled Matrix Multiplication, written for learning.

    Cuda

  3. ECE-GY-9143---High-Performance-Machine-Learning Public

    Contains laboratory and project work for the course ECE-GY 9143 - High Performance Machine Learning

    Python · 3 stars · 1 fork

  4. Flash-Attention-Triton Public

    This repository contains the codebase for a Flash Attention implementation in Triton.

    Python

  5. MLIR-to-PTX-CUDA Public

    Creates an MLIR dialect that fuses Addition + ReLU, lowers it to NVVM and LLVM IR, and generates PTX to run the kernel on a CUDA GPU

    C++

  6. Multimodal-Architecture-Optimisation-on-RTX3060-using-TVM Public

    This repository contains the codebase for optimizing a Vision-to-Text model for an RTX3060 target device using Apache TVM

    Python