# EdgeLLM

Run Large Language Models on iOS devices with just one line of code.

```swift
let response = try await EdgeLLM.chat("Hello, world!")
```

> **Note:** EdgeLLM is now fully functional! It supports multiple models, including Qwen, Gemma, and Phi-3.
## Quick Start

```swift
import EdgeLLM

// 1. Basic chat (uses the default model)
let response = try await EdgeLLM.chat("Hello!")
print(response)

// 2. Choose a specific model
let gemmaResponse = try await EdgeLLM.chat("Hello!", model: .gemma2b)

// 3. Stream responses in real time
for try await token in EdgeLLM.stream("Tell me a joke") {
    print(token, terminator: "")
}

// 4. Advanced usage with a reusable instance
let llm = try await EdgeLLM(model: .qwen06b)
let answer = try await llm.chat("Explain quantum computing")
```
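Every call above is `async` and can `throw` (for example, if a model download fails), so in app code you will typically wrap it in `do`/`catch`. A minimal sketch; the failure causes named in the comment are plausible examples, not a documented error list:

```swift
do {
    let reply = try await EdgeLLM.chat("Hello!")
    print(reply)
} catch {
    // Download, model-load, or generation failures all surface here
    print("EdgeLLM error: \(error)")
}
```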
> Demo: a fictional penguin article summarized offline.
## Features

- **Dead Simple** - Chat with LLMs in one line
- **iOS Optimized** - Metal GPU acceleration for blazing speed
- **Privacy First** - Everything runs on-device
- **Easy Install** - Swift Package Manager ready
- **Streaming Support** - Real-time responses
## Installation

In Xcode:

- File → Add Package Dependencies
- Enter the URL: `https://github.com/john-rocky/EdgeLLM`
- Select a version and click "Add Package"
Or add it to your `Package.swift`:

```swift
dependencies: [
    .package(url: "https://github.com/john-rocky/EdgeLLM", from: "1.0.0")
]
```
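If you are depending on EdgeLLM from a package rather than an app target, a fuller `Package.swift` looks roughly like this. The product name in `.product(name:package:)` is an assumption based on the `EdgeLLM` module imported above:

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.iOS(.v14)],
    dependencies: [
        .package(url: "https://github.com/john-rocky/EdgeLLM", from: "1.0.0")
    ],
    targets: [
        .target(
            name: "MyApp",
            // Product name assumed to match the module name
            dependencies: [.product(name: "EdgeLLM", package: "EdgeLLM")]
        )
    ]
)
```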
## Supported Models

- **Qwen 0.6B** (`.qwen06b`) - Smallest, fastest model (~1.2GB)
- **Gemma 2B** (`.gemma2b`) - Balanced performance (~2.5GB)
- **Phi-3.5 Mini** (`.phi3_mini`) - Most capable (~3.8GB)
Models are automatically downloaded on first use (WiFi recommended).
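If you want to choose a model automatically from the device's RAM, something like the sketch below works with the cases listed above. The `EdgeLLM.Model` type name and the RAM thresholds are assumptions for illustration:

```swift
import EdgeLLM
import Foundation

/// Picks the largest model that should fit comfortably in memory.
/// Thresholds are rough illustrations, not measured limits.
func preferredModel() -> EdgeLLM.Model {
    let ramGB = Double(ProcessInfo.processInfo.physicalMemory) / 1_073_741_824
    switch ramGB {
    case ..<4: return .qwen06b    // ~1.2GB of weights
    case ..<6: return .gemma2b    // ~2.5GB of weights
    default:   return .phi3_mini  // ~3.8GB of weights
    }
}
```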
## Usage

### Basic Chat

```swift
import EdgeLLM

// Chat in one line!
let response = try await EdgeLLM.chat("What's the weather like?")
print(response)
```

### Streaming

```swift
// Receive the response token by token
for try await token in EdgeLLM.stream("Tell me a story") {
    print(token, terminator: "")
}
```
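To show streamed tokens in a UI, append each one to view state as it arrives. A minimal SwiftUI sketch using only `EdgeLLM.stream` from above (the view itself is illustrative; `.task` requires iOS 15, so on iOS 14 start a `Task` from `onAppear` instead):

```swift
import SwiftUI
import EdgeLLM

struct StoryView: View {
    @State private var story = ""

    var body: some View {
        ScrollView { Text(story).padding() }
            .task {
                do {
                    // The Text view re-renders as each token is appended
                    for try await token in EdgeLLM.stream("Tell me a story") {
                        story += token
                    }
                } catch {
                    story = "Error: \(error)"
                }
            }
    }
}
```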
### Models and Options

```swift
// Specify the model and generation options
let response = try await EdgeLLM.chat(
    "Technical question",
    model: .gemma2b,      // Use a different model
    options: EdgeLLM.Options(
        temperature: 0.3, // Lower temperature, more deterministic output
        maxTokens: 500
    )
)
```

### Conversations

```swift
// Keep an LLM instance for multi-turn conversations
let llm = try await EdgeLLM(model: .gemma2b)
// Multiple exchanges
let response1 = try await llm.chat("My name is John")
let response2 = try await llm.chat("What's my name?")
// Reset conversation
await llm.reset()
```
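In an app you would typically hold the instance in a view model so context survives across turns. A sketch under that assumption; the class, its property names, and the transcript format are mine, not part of EdgeLLM:

```swift
import SwiftUI
import EdgeLLM

@MainActor
final class ChatViewModel: ObservableObject {
    @Published private(set) var transcript: [String] = []
    private var llm: EdgeLLM?

    func send(_ message: String) async {
        transcript.append("You: \(message)")
        do {
            // Create the instance lazily; keeping it alive preserves conversation state
            if llm == nil { llm = try await EdgeLLM(model: .gemma2b) }
            if let llm {
                let reply = try await llm.chat(message)
                transcript.append("Bot: \(reply)")
            }
        } catch {
            transcript.append("Error: \(error)")
        }
    }

    func clearConversation() async {
        await llm?.reset()      // Clear the model's context
        transcript.removeAll()  // Clear the UI transcript
    }
}
```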
## Example Apps

A basic chat interface in `Examples/SimpleChat`:

```bash
cd Examples/SimpleChat
open SimpleChat.xcodeproj
```

An advanced demo with real-time streaming and performance monitoring in `Examples/StreamingChat`:

```bash
cd Examples/StreamingChat
open StreamingChat.xcodeproj
```

Features:
- Real-time token streaming
- Live performance metrics (tokens/sec, latency)
- Model comparison (Qwen3, Gemma, Phi-3.5)
## Requirements

- iOS 14.0+
- Xcode 15.0+
- 4GB+ free storage for models
- Recommended: iPhone 12 or newer (Neural Engine support)
## Performance

On iPhone 15 Pro:
- Initial load: 2-3 seconds
- Token generation: 10-30 tokens/sec (model dependent)
- Memory usage: 1-4GB depending on model
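You can reproduce a rough tokens/sec figure yourself with the streaming API. This is measurement code of my own, not from the library; note that the first call also pays model download and load time:

```swift
import EdgeLLM
import Foundation

/// Streams one prompt and prints an approximate generation speed.
func benchmark(_ prompt: String) async throws {
    let start = Date()
    var tokens = 0
    for try await token in EdgeLLM.stream(prompt) {
        tokens += 1
        print(token, terminator: "")
    }
    let seconds = Date().timeIntervalSince(start)
    print(String(format: "\n%d tokens in %.1fs ≈ %.1f tok/s",
                 tokens, seconds, Double(tokens) / seconds))
}
```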
## Troubleshooting

Models are downloaded automatically on first run, so the first response can take a while (WiFi recommended).

If memory is tight, try a smaller model like `.qwen06b`:

```swift
let response = try await EdgeLLM.chat("Hello", model: .qwen06b)
```
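If you prefer to degrade gracefully instead of failing, you can retry with the smallest model. EdgeLLM's error types aren't documented here, so this hypothetical helper catches any error:

```swift
import EdgeLLM

/// Tries a mid-size model first, then falls back to the smallest one.
func chatWithFallback(_ prompt: String) async throws -> String {
    do {
        return try await EdgeLLM.chat(prompt, model: .gemma2b)
    } catch {
        // e.g. memory pressure on older devices; retry with the smallest model
        return try await EdgeLLM.chat(prompt, model: .qwen06b)
    }
}
```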
## License

Apache 2.0 License

## Contributing

Pull requests are welcome!
- Clone the repository
- Set up git hooks to prevent large files:

  ```bash
  git config core.hooksPath .githooks
  ```

Guidelines:

- Never commit binary files (`.xcframework`, `.zip`, `.mlmodel`, etc.)
- Maximum file size: 10MB
- Large files should be uploaded to GitHub Releases
- The pre-commit hook will block commits with large files
## Acknowledgments

EdgeLLM is built on top of the [MLC-LLM](https://github.com/mlc-ai/mlc-llm) project.
