C# client for Woolball Server.

Transform idle browsers into a powerful distributed AI inference network.

For detailed examples and model lists, visit our GitHub repository.

This SDK is automatically generated by the Swagger Codegen project.

```bash
dotnet add package Woolball.CSharp.SDK
```

Woolball Server is an open-source network server that orchestrates AI inference jobs across a distributed network of browser-based compute nodes. Instead of relying on expensive cloud infrastructure, harness the collective power of idle browsers to run AI models efficiently and cost-effectively.

| 🔧 Provider | 🎯 Task | 🤖 Models | 📊 Status |
|---|---|---|---|
| Transformers.js | 🎤 Speech-to-Text | ONNX Models | ✅ Ready |
| Transformers.js | 🔊 Text-to-Speech | ONNX Models | ✅ Ready |
| Kokoro.js | 🔊 Text-to-Speech | ONNX Models | ✅ Ready |
| Transformers.js | 🌐 Translation | ONNX Models | ✅ Ready |
| Transformers.js | 📝 Text Generation | ONNX Models | ✅ Ready |
| WebLLM | 📝 Text Generation | MLC Models | ✅ Ready |
| MediaPipe | 📝 Text Generation | LiteRT Models | ✅ Ready |

Generate text with powerful language models

🤖 Available Models

| Model | Quantization | Description |
|---|---|---|
| `HuggingFaceTB/SmolLM2-135M-Instruct` | `fp16` | Compact model for basic text generation |
| `HuggingFaceTB/SmolLM2-360M-Instruct` | `q4` | Balanced performance and size |
| `Mozilla/Qwen2.5-0.5B-Instruct` | `q4` | Efficient model for general tasks |
| `onnx-community/Qwen2.5-Coder-0.5B-Instruct` | `q8` | Specialized for code generation |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with Transformers.js
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"What is the capital of Brazil?\"}]";
var response = api.TextGeneration(
    "transformers",                        // provider
    "HuggingFaceTB/SmolLM2-135M-Instruct", // model
    input,                                 // input
    50,                                    // topK
    1.0,                                   // topP
    0.7,                                   // temperature
    1.0,                                   // repetitionPenalty
    "fp16",                                // dtype
    20,                                    // maxLength
    250,                                   // maxNewTokens
    0,                                     // minLength
    null,                                  // minNewTokens
    true,                                  // doSample
    1,                                     // numBeams
    0,                                     // noRepeatNgramSize
    null,                                  // contextWindowSize
    null,                                  // slidingWindowSize
    null,                                  // attentionSinkSize
    null,                                  // frequencyPenalty
    null,                                  // presencePenalty
    null,                                  // bosTokenId
    null,                                  // maxTokens
    null                                   // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | - | 🤖 Model ID (e.g., "HuggingFaceTB/SmolLM2-135M-Instruct") |
| `dtype` | string | - | 🔧 Quantization level (e.g., "fp16", "q4") |
| `max_length` | number | 20 | 📏 Maximum length the generated tokens can have (includes input prompt) |
| `max_new_tokens` | number | null | 🆕 Maximum number of tokens to generate, ignoring prompt length |
| `min_length` | number | 0 | 📐 Minimum length of the sequence to be generated (includes input prompt) |
| `min_new_tokens` | number | null | 🔢 Minimum number of tokens to generate, ignoring prompt length |
| `do_sample` | boolean | false | 🎲 Whether to use sampling; use greedy decoding otherwise |
| `num_beams` | number | 1 | 🔍 Number of beams for beam search. 1 means no beam search |
| `temperature` | number | 1.0 | 🌡️ Value used to modulate the next token probabilities |
| `top_k` | number | 50 | 🔝 Number of highest probability vocabulary tokens to keep for top-k-filtering |
| `top_p` | number | 1.0 | 📊 If < 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept |
| `repetition_penalty` | number | 1.0 | 🔄 Parameter for repetition penalty. 1.0 means no penalty |
| `no_repeat_ngram_size` | number | 0 | 🚫 If > 0, all ngrams of that size can only occur once |

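
The `input` argument in the example above is a JSON array of chat messages written as a hand-escaped string. As a minimal sketch, the same string can be built from objects so the serializer handles the quoting. This assumes `System.Text.Json` is available (built into .NET Core 3.0 and later); the variable names are illustrative and not part of the SDK.

```csharp
using System;
using System.Text.Json;

// Build the chat history as objects and let the serializer handle escaping.
var messages = new[]
{
    new { role = "system", content = "You are a helpful assistant." },
    new { role = "user", content = "What is the capital of Brazil?" }
};

// Serialize to a compact JSON array string with the same structure as the hand-escaped literal.
string input = JsonSerializer.Serialize(messages);

Console.WriteLine(input);
```

The resulting `input` string can then be passed as the third argument to `TextGeneration`.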

🤖 Available Models

| Model | Description |
|---|---|
| `DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC` | DeepSeek R1 distilled model with reasoning capabilities |
| `DeepSeek-R1-Distill-Llama-8B-q4f16_1-MLC` | DeepSeek R1 distilled Llama-based model |
| `SmolLM2-1.7B-Instruct-q4f32_1-MLC` | Compact instruction-following model |
| `Llama-3.1-8B-Instruct-q4f32_1-MLC` | Meta's Llama 3.1 8B instruction model |
| `Qwen3-8B-q4f32_1-MLC` | Alibaba's Qwen3 8B model |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with WebLLM
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"What is the capital of Brazil?\"}]";
var response = api.TextGeneration(
    "webllm",                                  // provider
    "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC", // model
    input,                                     // input
    null,                                      // topK
    0.95,                                      // topP
    0.7,                                       // temperature
    null,                                      // repetitionPenalty
    null,                                      // dtype
    null,                                      // maxLength
    null,                                      // maxNewTokens
    null,                                      // minLength
    null,                                      // minNewTokens
    null,                                      // doSample
    null,                                      // numBeams
    null,                                      // noRepeatNgramSize
    null,                                      // contextWindowSize
    null,                                      // slidingWindowSize
    null,                                      // attentionSinkSize
    null,                                      // frequencyPenalty
    null,                                      // presencePenalty
    null,                                      // bosTokenId
    null,                                      // maxTokens
    null                                       // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Description |
|---|---|---|
| `model` | string | 🤖 Model ID from MLC (e.g., "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC") |
| `provider` | string | 🔧 Must be set to "webllm" when using WebLLM models |
| `context_window_size` | number | 🪟 Size of the context window for the model |
| `sliding_window_size` | number | 🔄 Size of the sliding window for attention |
| `attention_sink_size` | number | 🎯 Size of the attention sink |
| `repetition_penalty` | number | 🔄 Penalty for repeating tokens |
| `frequency_penalty` | number | 📊 Penalty for token frequency |
| `presence_penalty` | number | 👁️ Penalty for token presence |
| `top_p` | number | 📈 If < 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept |
| `temperature` | number | 🌡️ Value used to modulate the next token probabilities |
| `bos_token_id` | number | 🏁 Beginning of sequence token ID (optional) |

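
To make the `top_p` description concrete, here is a small sketch of nucleus (top-p) filtering on a toy probability distribution. It illustrates the general technique only; how each provider implements sampling internally is not covered here, and `TopPFilter` is a hypothetical helper, not an SDK function.

```csharp
using System;
using System.Linq;

// Keep the smallest set of most probable tokens whose cumulative probability reaches topP.
// Returns the indices of the kept tokens, most probable first.
static int[] TopPFilter(double[] probs, double topP)
{
    // Token indices sorted from most to least probable.
    var order = Enumerable.Range(0, probs.Length)
        .OrderByDescending(i => probs[i])
        .ToArray();

    double cumulative = 0.0;
    int kept = 0;
    foreach (var i in order)
    {
        cumulative += probs[i];
        kept++;
        if (cumulative >= topP) break; // nucleus is complete
    }
    return order.Take(kept).ToArray();
}

// Toy distribution over four tokens.
var probs = new[] { 0.5, 0.3, 0.15, 0.05 };
var keptTokens = TopPFilter(probs, 0.9);
Console.WriteLine(string.Join(", ", keptTokens)); // prints "0, 1, 2"
```

With `top_p` at 0.9, the least likely token is dropped before sampling; with `top_p` at 1.0, all four tokens remain candidates.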

🤖 Available Models

| Model | Device Type | Description |
|---|---|---|
| `https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma2-2b-it-cpu-int8.task` | CPU | Gemma2 2B model optimized for CPU inference |
| `https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma2-2b-it-gpu-int8.bin` | GPU | Gemma2 2B model optimized for GPU inference |
| `https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-1b-it-int4.task` | CPU/GPU | Gemma3 1B model with INT4 quantization |
| `https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-4b-it-int4-web.task` | Web | Gemma3 4B model optimized for web deployment |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with MediaPipe
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"Explain quantum computing in simple terms.\"}]";
var response = api.TextGeneration(
    "mediapipe", // provider
    "https://woolball.sfo3.cdn.digitaloceanspaces.com/gemma3-1b-it-int4.task", // model
    input,       // input
    40,          // topK
    null,        // topP
    0.7,         // temperature
    null,        // repetitionPenalty
    null,        // dtype
    null,        // maxLength
    null,        // maxNewTokens
    null,        // minLength
    null,        // minNewTokens
    null,        // doSample
    null,        // numBeams
    null,        // noRepeatNgramSize
    null,        // contextWindowSize
    null,        // slidingWindowSize
    null,        // attentionSinkSize
    null,        // frequencyPenalty
    null,        // presencePenalty
    null,        // bosTokenId
    500,         // maxTokens
    12345        // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Description |
|---|---|---|
| `model` | string | 🤖 Model ID for MediaPipe LiteRT models on DigitalOcean Spaces |
| `provider` | string | 🔧 Must be set to "mediapipe" when using MediaPipe models |
| `maxTokens` | number | 🔢 Maximum number of tokens to generate |
| `randomSeed` | number | 🎲 Random seed for reproducible results |
| `topK` | number | 🔝 Number of highest probability vocabulary tokens to keep for top-k-filtering |
| `temperature` | number | 🌡️ Value used to modulate the next token probabilities |

Convert audio to text with Whisper models

| Model | Quantization | Description |
|---|---|---|
| `onnx-community/whisper-large-v3-turbo_timestamped` | `q4` | 🎯 High accuracy with timestamps |
| `onnx-community/whisper-small` | `q4` | ⚡ Fast processing |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with WebLLM
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"What is the capital of Brazil?\"}]";
var response = api.TextGeneration(
    "webllm",                                  // provider
    "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC", // model
    input,                                     // input
    null,                                      // topK
    0.95,                                      // topP
    0.7,                                       // temperature
    null,                                      // repetitionPenalty
    null,                                      // dtype
    null,                                      // maxLength
    null,                                      // maxNewTokens
    null,                                      // minLength
    null,                                      // minNewTokens
    null,                                      // doSample
    null,                                      // numBeams
    null,                                      // noRepeatNgramSize
    null,                                      // contextWindowSize
    null,                                      // slidingWindowSize
    null,                                      // attentionSinkSize
    null,                                      // frequencyPenalty
    null,                                      // presencePenalty
    null,                                      // bosTokenId
    null,                                      // maxTokens
    null                                       // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Description |
|---|---|---|
| `model` | string | 🤖 Model ID from Hugging Face (e.g., "onnx-community/whisper-large-v3-turbo_timestamped") |
| `dtype` | string | 🔧 Quantization level (e.g., "q4") |
| `return_timestamps` | boolean \| 'word' | ⏰ Return timestamps ("word" for word-level). Default is false. |
| `stream` | boolean | 📡 Stream results in real-time. Default is false. |
| `chunk_length_s` | number | 📏 Length of audio chunks to process in seconds. Default is 0 (no chunking). |
| `stride_length_s` | number | 🔄 Length of overlap between consecutive audio chunks in seconds. If not provided, defaults to chunk_length_s / 6. |
| `force_full_sequences` | boolean | 🎯 Whether to force outputting full sequences or not. Default is false. |
| `language` | string | 🌍 Source language (auto-detect if null). Use this to potentially improve performance if the source language is known. |
| `task` | null \| 'transcribe' \| 'translate' | 🎯 The task to perform. Default is null, meaning it should be auto-detected. |
| `num_frames` | number | 🎬 The number of frames in the input audio. |

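
As a quick worked example of the `stride_length_s` default described in the table, the snippet below computes the overlap that would apply when only `chunk_length_s` is set. `DefaultStrideSeconds` is a hypothetical helper written for illustration; it is not part of the SDK.

```csharp
using System;

// Default overlap when stride_length_s is omitted: chunk_length_s / 6.
static double DefaultStrideSeconds(double chunkLengthS) => chunkLengthS / 6.0;

double chunkLengthS = 30.0;
double strideLengthS = DefaultStrideSeconds(chunkLengthS);

// For 30-second chunks, the default overlap is 5 seconds.
Console.WriteLine($"chunk_length_s={chunkLengthS}, stride_length_s={strideLengthS}");
```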
Generate natural speech from text

🤖 Available Models

| Language | Model | Flag |
|---|---|---|
| English | `Xenova/mms-tts-eng` | 🇺🇸 |
| Spanish | `Xenova/mms-tts-spa` | 🇪🇸 |
| French | `Xenova/mms-tts-fra` | 🇫🇷 |
| German | `Xenova/mms-tts-deu` | 🇩🇪 |
| Portuguese | `Xenova/mms-tts-por` | 🇵🇹 |
| Russian | `Xenova/mms-tts-rus` | 🇷🇺 |
| Arabic | `Xenova/mms-tts-ara` | 🇸🇦 |
| Korean | `Xenova/mms-tts-kor` | 🇰🇷 |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with Transformers.js
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"What is the capital of Brazil?\"}]";
var response = api.TextGeneration(
    "transformers",                        // provider
    "HuggingFaceTB/SmolLM2-135M-Instruct", // model
    input,                                 // input
    50,                                    // topK
    1.0,                                   // topP
    0.7,                                   // temperature
    1.0,                                   // repetitionPenalty
    "fp16",                                // dtype
    20,                                    // maxLength
    250,                                   // maxNewTokens
    0,                                     // minLength
    null,                                  // minNewTokens
    true,                                  // doSample
    1,                                     // numBeams
    0,                                     // noRepeatNgramSize
    null,                                  // contextWindowSize
    null,                                  // slidingWindowSize
    null,                                  // attentionSinkSize
    null,                                  // frequencyPenalty
    null,                                  // presencePenalty
    null,                                  // bosTokenId
    null,                                  // maxTokens
    null                                   // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Description | Required For |
|---|---|---|---|
| `model` | string | 🤖 Model ID | All providers |
| `dtype` | string | 🔧 Quantization level (e.g., "q8") | All providers |
| `stream` | boolean | 📡 Whether to stream the audio response. Default is false. | All providers |

🤖 Available Models

| Model | Quantization | Description |
|---|---|---|
| `onnx-community/Kokoro-82M-ONNX` | `q8` | High-quality English TTS with multiple voices |
| `onnx-community/Kokoro-82M-v1.0-ONNX` | `q8` | Alternative Kokoro model version |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Text generation with WebLLM
var input = "[{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"}, {\"role\": \"user\", \"content\": \"What is the capital of Brazil?\"}]";
var response = api.TextGeneration(
    "webllm",                                  // provider
    "DeepSeek-R1-Distill-Qwen-7B-q4f16_1-MLC", // model
    input,                                     // input
    null,                                      // topK
    0.95,                                      // topP
    0.7,                                       // temperature
    null,                                      // repetitionPenalty
    null,                                      // dtype
    null,                                      // maxLength
    null,                                      // maxNewTokens
    null,                                      // minLength
    null,                                      // minNewTokens
    null,                                      // doSample
    null,                                      // numBeams
    null,                                      // noRepeatNgramSize
    null,                                      // contextWindowSize
    null,                                      // slidingWindowSize
    null,                                      // attentionSinkSize
    null,                                      // frequencyPenalty
    null,                                      // presencePenalty
    null,                                      // bosTokenId
    null,                                      // maxTokens
    null                                       // randomSeed
);
Console.WriteLine("Response: " + response);
```

| Parameter | Type | Description | Required |
|---|---|---|---|
| `model` | string | 🤖 Model ID | Required |
| `dtype` | string | 🔧 Quantization level (e.g., "q8") | Required |
| `voice` | string | 🎭 Voice ID (see below) | Required |
| `stream` | boolean | 📡 Whether to stream the audio response. Default is false. | Optional |

🎭 Available Voice Options

🇺🇸 American Voices

- 👩 Female: `af_heart`, `af_alloy`, `af_aoede`, `af_bella`, `af_jessica`, `af_nova`, `af_sarah`
- 👨 Male: `am_adam`, `am_echo`, `am_eric`, `am_liam`, `am_michael`, `am_onyx`

🇬🇧 British Voices

- 👩 Female: `bf_emma`, `bf_isabella`, `bf_alice`, `bf_lily`
- 👨 Male: `bm_george`, `bm_lewis`, `bm_daniel`, `bm_fable`

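
The voice IDs listed above follow a visible pattern: the first letter marks the accent (`a` for American, `b` for British) and the second marks the gender (`f` or `m`). The sketch below checks a voice ID against that list before it is sent in a request. `DescribeVoice` and `knownVoices` are hypothetical names used for illustration; the SDK itself may not perform this validation.

```csharp
using System;
using System.Collections.Generic;

// Voice IDs copied from the lists above.
var knownVoices = new HashSet<string>
{
    "af_heart", "af_alloy", "af_aoede", "af_bella", "af_jessica", "af_nova", "af_sarah",
    "am_adam", "am_echo", "am_eric", "am_liam", "am_michael", "am_onyx",
    "bf_emma", "bf_isabella", "bf_alice", "bf_lily",
    "bm_george", "bm_lewis", "bm_daniel", "bm_fable"
};

// Hypothetical helper: maps the two-letter prefix to accent and gender,
// and throws for IDs that are not in the documented list.
string DescribeVoice(string voiceId)
{
    if (!knownVoices.Contains(voiceId))
        throw new ArgumentException($"Unknown voice ID: {voiceId}", nameof(voiceId));

    string accent = voiceId[0] == 'a' ? "American" : "British";
    string gender = voiceId[1] == 'f' ? "female" : "male";
    return $"{accent} {gender}";
}

Console.WriteLine(DescribeVoice("af_heart"));  // prints "American female"
Console.WriteLine(DescribeVoice("bm_george")); // prints "British male"
```

Validating locally gives a clear error for a mistyped voice before any request reaches the server.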
Translate between 200+ languages

| Model | Quantization | Description |
|---|---|---|
| `Xenova/nllb-200-distilled-600M` | `q8` | 🌍 Multilingual translation model supporting 200+ languages |

```csharp
using IO.Swagger.Api;
using IO.Swagger.Client;
using System;

// Configure the API client
var api = new TextGenerationApi();
api.Configuration.BasePath = "http://localhost:9002";

// Translation example
var response = api.Translation(
    "Xenova/nllb-200-distilled-600M", // model
    "q8",                             // dtype
    "Hello, how are you today?",      // input
    "eng_Latn",                       // srcLang
    "por_Latn"                        // tgtLang
);
Console.WriteLine("Translation: " + response);
```

Uses FLORES200 format - supports 200+ languages!

| Parameter | Type | Description |
|---|---|---|
| `model` | string | 🤖 Model ID (e.g., "Xenova/nllb-200-distilled-600M") |
| `dtype` | string | 🔧 Quantization level (e.g., "q8") |
| `srcLang` | string | 🌍 Source language code in FLORES200 format (e.g., "eng_Latn") |
| `tgtLang` | string | 🌍 Target language code in FLORES200 format (e.g., "por_Latn") |

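
FLORES200 codes such as "eng_Latn" and "por_Latn" pair a three-letter language code with a four-letter script code. The sketch below checks only that shape, which can catch mistakes like passing "en" or "pt-BR". It does not confirm that a code is actually supported by the model, and `LooksLikeFlores200` is a hypothetical helper rather than an SDK function.

```csharp
using System;
using System.Text.RegularExpressions;

// Shape check only: three lowercase letters, an underscore, then a
// capitalized four-letter script code (e.g. "eng_Latn").
static bool LooksLikeFlores200(string code) =>
    Regex.IsMatch(code, @"\A[a-z]{3}_[A-Z][a-z]{3}\z");

Console.WriteLine(LooksLikeFlores200("eng_Latn")); // prints "True"
Console.WriteLine(LooksLikeFlores200("por_Latn")); // prints "True"
Console.WriteLine(LooksLikeFlores200("en"));       // prints "False"
```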
We welcome contributions! Here's how you can help:
- 🐛 Report bugs via GitHub Issues
- 💡 Suggest features in our Discord
- 🔧 Submit PRs for improvements
- 📖 Improve documentation

This project is licensed under the MIT License. See the LICENSE file for details.

Made with ❤️ by the Woolball team