seyeong-han/README.md

Young (Seyeong) Han

Partner Engineer, Generative AI at Meta. I ship ExecuTorch workloads on Apple Silicon, NVIDIA, and edge devices—voice agents, tool-using local assistants, and multimodal inference—and publish runnable artifacts on Hugging Face.

Previously: co-founder of Resia, MLOps + edge ML at Nearthlab, Triton/TensorRT at SNUAILAB. Technical lead of memary (2.5K★). 1st place, Meta × AITX Hackathon 2024 (BookMind).

Illuminate the path to reducing technology gaps so everyone benefits from technological advancement.


Highlights

Full-stack AI website — Resia on YouTube

Meta × AITX Hackathon (1st place) — LinkedIn · BookMind code

What I ship

ExecuTorch across MLX, CUDA, Metal, and XNNPACK — multimodal and speech: LFM2.5 on MLX (330 tok/s decode), Voxtral-TTS on MLX and CUDA Ampere+Blackwell (RTF 0.81), Gemma 4-E2B, plus 16 model repos for ASR, TTS, VAD, and text on younghan-meta/models.

Partner + product delivery — Led LM Studio integration for local Whisper and Voxtral-Realtime on macOS and Windows. Shipped a macOS dictation voice agent (Parakeet-TDT Metal → LFM2.5 MLX → Voxtral-TTS MLX) that replaced planned SuperWhisper/WisprFlow spend (up to $3.72M/year avoided). Windows WPF real-time voice via ExecuTorch (executorch-examples).

OpenClaw — Optional ExecuTorch Parakeet-TDT on Metal for Talk Mode: CLI setup/status/transcribe, hardened transcript-only prompting (PR #50051).

Agents + developer tooling — Open-source Llama Stack across Python/Swift/Kotlin for mobile and hybrid agents; MCP server for AI coding assistants (Llama content → tool-ready guidance). Llama 4 HF conversion and validation via llama-stack-evals and vLLM; integrations across LangChain, LlamaIndex, LiteLLM, CrewAI, and Haystack.

Upstream work lives in pytorch/executorch (recurring contributor).


Skills

On-device / edge: ExecuTorch · MLX · Metal · XNNPACK · CUDA · TensorRT · Triton/DeepStream · ONNX · TFLite (NNAPI) · Snapdragon NPU · quantization · KV-cache

Agents: MCP · tool calling · Llama Stack · LlamaIndex · LangChain · CrewAI · LiteLLM · vLLM · PyTorch

Languages / MLOps: Python · C++ · Swift · Kotlin · TypeScript · AWS (SageMaker, Bedrock, Lambda, Amplify) · Docker · Airflow · W&B · Grafana


Contact

LinkedIn · YouTube · Website · illuminate.han@gmail.com

Pinned

  1. kingjulio8238/Memary (Public)

    The Open Source Memory Layer For Autonomous Agents

    Jupyter Notebook · 2.6k stars · 199 forks

  2. KnowledgeGraphRAG (Public)

    memAry @ University of Texas at Austin

    Jupyter Notebook · 21 stars · 2 forks

  3. Architectural-Intelligence (Public)

    AI Remodeling web app using React-js, Stable Diffusion and Llama3.2

    JavaScript · 5 stars

  4. BookMind (Public)

    Draw book mind maps to see the relationship between characters easily :)

    JavaScript · 13 stars