Partner Engineer, Generative AI at Meta. I ship ExecuTorch workloads on Apple Silicon, NVIDIA, and edge devices—voice agents, tool-using local assistants, and multimodal inference—and publish runnable artifacts on Hugging Face.
Previously: co-founder of Resia, MLOps + edge ML at Nearthlab, Triton/TensorRT at SNUAILAB. Technical lead of memary (2.5K★). 1st place, Meta × AITX Hackathon 2024 (BookMind).
Illuminate the path to reducing technology gaps so everyone benefits from technological advancement.
Full-stack AI website — Resia on YouTube
Meta × AITX Hackathon (1st place) — LinkedIn · BookMind code
ExecuTorch across MLX, CUDA, Metal, and XNNPACK — multimodal and speech: LFM2.5 on MLX (330 tok/s decode), Voxtral-TTS on MLX and CUDA Ampere+Blackwell (RTF 0.81), Gemma 4-E2B, plus 16 model repos for ASR, TTS, VAD, and text on younghan-meta/models.
Partner + product delivery — Led LM Studio integration for local Whisper and Voxtral-Realtime on macOS and Windows. Shipped a macOS dictation voice agent (Parakeet-TDT Metal → LFM2.5 MLX → Voxtral-TTS MLX) that replaced planned SuperWhisper/WisprFlow spend (up to $3.72M/year avoided). Windows WPF real-time voice via ExecuTorch (executorch-examples).
OpenClaw — Optional ExecuTorch Parakeet-TDT on Metal for Talk Mode: CLI setup/status/transcribe, hardened transcript-only prompting (PR #50051).
Agents + developer tooling — Open-source Llama Stack across Python/Swift/Kotlin for mobile and hybrid agents; MCP server for AI coding assistants (Llama content → tool-ready guidance). Llama 4 HF conversion and validation via llama-stack-evals and vLLM; integrations across LangChain, LlamaIndex, LiteLLM, CrewAI, and Haystack.
Upstream work lives in pytorch/executorch (recurring contributor).
On-device / edge: ExecuTorch · MLX · Metal · XNNPACK · CUDA · TensorRT · Triton/DeepStream · ONNX · TFLite (NNAPI) · Snapdragon NPU · quantization · KV-cache
Agents: MCP · tool calling · Llama Stack · LlamaIndex · LangChain · CrewAI · LiteLLM · vLLM · PyTorch
Languages / MLOps: Python · C++ · Swift · Kotlin · TypeScript · AWS (SageMaker, Bedrock, Lambda, Amplify) · Docker · Airflow · W&B · Grafana