Collaboration proposal: @sparseskip inference optimization + ternary MoE model + MCP tooling

Hi — we came across Off Grid via a cold email from Saiganesh and immediately recognized the overlap with our work.

**Who we are:** RFI-IRFOS, a small research team building the **Ternary Intelligence Stack** — a full-layer research and inference platform built on ternary computation:

- **albert.** — ternary MoE language model, trained from scratch
- **@sparseskip** — patent-pending sparse inference (skips zero-weight expert activations at runtime; 83 tok/s on modest hardware)
- **ternlang** — ternary programming language and runtime
- **TernStudio** — IDE built for ternary-native development
- **MCP infrastructure** — live endpoint on Smithery + Fly.io, auth, KPI pipeline

All MIT-licensed. github.com/rfi-irfos | ternlang.com

**Three concrete angles we'd like to explore:**

### 1. @sparseskip — sparse inference for your pipeline

We have a patent-pending technique that skips zero-weight expert activations at inference time. Ternary weights ({-1, 0, +1}) contain a large share of zeros by design, so the gains are especially large on ternary models. We're hitting 83 tok/s on modest hardware in our benchmarks. On mobile CPUs, where every cycle matters, this could meaningfully improve your tok/s numbers. Happy to discuss how it could fit into the llama.rn / llama.cpp layer.
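To make the idea concrete, here is a minimal Python sketch of why zeros in a ternary weight matrix translate into skipped work. It is a generic illustration, not the @sparseskip method itself: the function names (`compress_row`, `sparse_ternary_matvec`, `zero_fraction`) and the index-list layout are assumptions made up for this example. Each row is reduced to the positions of its +1 and -1 weights, so zero weights cost nothing and the nonzero ones need only additions and subtractions.

```python
from typing import List, Sequence, Tuple

Row = Tuple[List[int], List[int]]  # (indices of +1 weights, indices of -1 weights)


def dense_matvec(weights: Sequence[Sequence[int]], x: Sequence[float]) -> List[float]:
    """Reference implementation: touches every weight, including zeros."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def compress_row(row: Sequence[int]) -> Row:
    """Keep only the positions of nonzero ternary weights (hypothetical layout)."""
    plus = [j for j, w in enumerate(row) if w == 1]
    minus = [j for j, w in enumerate(row) if w == -1]
    return plus, minus


def sparse_ternary_matvec(rows: Sequence[Row], x: Sequence[float]) -> List[float]:
    """Zero weights are never visited; +1/-1 become add/subtract, no multiplies."""
    return [sum(x[j] for j in plus) - sum(x[j] for j in minus) for plus, minus in rows]


def zero_fraction(weights: Sequence[Sequence[int]]) -> float:
    """Share of weights that a zero-skipping kernel would not need to touch."""
    total = sum(len(row) for row in weights)
    zeros = sum(1 for row in weights for w in row if w == 0)
    return zeros / total


# Toy data, invented for illustration only.
weights = [[1, 0, -1, 0], [0, 0, 0, 1], [-1, -1, 0, 0]]
x = [2.0, 3.0, 5.0, 7.0]
compressed = [compress_row(row) for row in weights]
```

In this toy matrix 7 of 12 weights are zero, so the sparse path reads 5 activations where the dense path performs 12 multiply-adds. How much of that survives on a real mobile CPU depends on memory layout and vectorization, which this sketch does not model.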

### 2. albert. as a model in your browser

albert. will export to GGUF. A ternary MoE at 4–8GB would be the first model of its kind in a mobile app. Quality per byte is the whole point of ternary quantization, and that fits your 4GB device constraint story well.
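The size argument can be checked with simple arithmetic. The sketch below is an assumption-laden illustration, not albert.'s export format or any specific GGUF quantization type: it packs five ternary weights into one byte (possible because 3^5 = 243 fits in 256 values) and compares how many weights fit in a byte budget against 16-bit floats. It treats 1 GB as 10^9 bytes and ignores scales, embeddings, metadata, activations and the KV cache. All function names are hypothetical.

```python
from typing import List, Sequence

TRITS_PER_BYTE = 5  # 3**5 = 243 <= 256, i.e. 1.6 bits per ternary weight


def pack5(trits: Sequence[int]) -> int:
    """Pack exactly five weights from {-1, 0, +1} into one byte value (0..242)."""
    if len(trits) != TRITS_PER_BYTE or any(t not in (-1, 0, 1) for t in trits):
        raise ValueError("expected five values from {-1, 0, 1}")
    return sum((t + 1) * 3**i for i, t in enumerate(trits))


def unpack5(byte: int) -> List[int]:
    """Inverse of pack5: recover the five ternary weights from one byte."""
    out = []
    for _ in range(TRITS_PER_BYTE):
        out.append(byte % 3 - 1)
        byte //= 3
    return out


def ternary_weight_capacity(budget_bytes: int) -> int:
    """Weights that fit in the budget at five ternary weights per byte."""
    return budget_bytes * TRITS_PER_BYTE


def fp16_weight_capacity(budget_bytes: int) -> int:
    """Weights that fit in the same budget at two bytes per weight."""
    return budget_bytes // 2


budget = 4 * 10**9  # "4GB", taken here as 4 * 10^9 bytes
ratio = ternary_weight_capacity(budget) // fp16_weight_capacity(budget)
```

Under these assumptions a 4 GB budget holds 20 billion packed ternary weights versus 2 billion fp16 weights, a 10x difference in raw weight count. Real formats will land somewhat below that once per-block scales and non-ternary tensors are included.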

### 3. MCP tooling — we have a head start

We noticed MCP server support is on your Pro roadmap. We have a live MCP endpoint (published on Smithery, running on Fly.io) and have been building that infrastructure for a while. If you're building the client side and we have the server side, this is a natural handoff.
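For context on what the handoff looks like on the wire: MCP messages are JSON-RPC 2.0, and a client discovers and invokes server tools with the `tools/list` and `tools/call` methods. The Python sketch below only builds those request payloads. The helper `jsonrpc_request`, the tool name `example_tool` and its arguments are hypothetical placeholders, not anything exposed by our endpoint, and a real session would first go through the `initialize` handshake over whichever transport both sides agree on.

```python
import itertools
import json
from typing import Any, Dict, Optional

_ids = itertools.count(1)  # JSON-RPC requests need unique ids within a session


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request object (hypothetical helper)."""
    msg: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Ask the server which tools it offers.
list_tools = jsonrpc_request("tools/list")

# Invoke one tool by name; the name and arguments here are invented placeholders.
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "example_tool", "arguments": {"query": "hello"}},
)

# What the client would serialize and send over the transport.
wire_payload = json.dumps(call_tool)
```

The client side only needs to produce and parse messages of this shape, which is why an existing server and a new client can be developed independently.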

---

We use Claude Code, move fast, and are genuinely excited about where this could go — a fully offline ternary LLM app with a complete tool ecosystem is not a small thing. Not proposing anything formal — just opening the conversation.
