Latai is a Terminal UI application designed to evaluate the performance of Generative AI providers using either default prompts or your own custom inputs.
Latai's TUI is structured into three distinct views:
- Table View: the main interface displaying the list of AI models. On startup, `latai` verifies your access keys and loads models from providers that pass verification. If a key is missing, a notification appears in the Events panel.
- Information Panel: provides detailed insights into the current run. It's especially useful when running multiple prompts, as it displays key performance metrics such as jitter, average latency, and min/max latency.
- Events Panel: logs all system activities in real time. Since `latai` executes performance measurements in parallel without blocking the UI, you can monitor ongoing processes and check for any error messages here.
TLDR version (click on links to navigate to documentation sections):
- Install.
- Check your keys in the environment:
  - `OPENAI_API_KEY` for OpenAI models.
  - `GROQ_API_KEY` for Groq models.
  - `AWS_PROFILE` for AWS Bedrock.
- Run `latai`.
Two installation methods are available.
- Select a release from the Releases Page.
- Download the appropriate file for your platform:
  - Mac
  - Linux
  - Windows
- If you're unsure which release to download, always get the latest version.
For this method you need Go and its tooling to be installed.
```sh
# Install
go install github.com/pvlbzn/latai@latest

# Run
latai
```

Latai requires API keys to access LLM providers. Each provider is optional: by default, Latai attempts to load all providers and verifies their keys. If a key is missing, the corresponding provider will not be loaded.
If you don't need a specific provider, simply omit its key.
TLDR:
- Add the following API keys and values to your environment.
- Update your terminal environment to apply the changes.
```sh
# OpenAI API key.
export OPENAI_API_KEY=

# Groq API key.
export GROQ_API_KEY=

# AWS Bedrock. You can specify your AWS profile and region here.
# If you don't, but you have the AWS CLI installed, Latai will use
# the `default` profile and the `us-east-1` region.
export AWS_PROFILE=
export AWS_REGION=
```

Important

Transparency note. Keys never leave your machine. Latai has no telemetry and does not send your data anywhere. Each provider's code has two functions that reach the internet. The first is `VerifyAccess`, which sends a request to list all available models to check API key validity. The second is `Send`, which sends requests to LLMs to measure latency based on default or user prompts.
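For intuition, key verification is typically nothing more than a models-list request that fails on a bad key. Below is a minimal sketch of that idea against the OpenAI models endpoint; it is not the actual Latai implementation, just an illustration of what `VerifyAccess` amounts to.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// verifyAccess reports whether the given API key can list models.
// Illustration only: Latai's real VerifyAccess lives in the provider package.
func verifyAccess(apiKey string) bool {
	req, err := http.NewRequest(http.MethodGet, "https://api.openai.com/v1/models", nil)
	if err != nil {
		return false
	}
	req.Header.Set("Authorization", "Bearer "+apiKey)

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()

	// Any non-200 status (401, 403, ...) means the key is not usable.
	return resp.StatusCode == http.StatusOK
}

func main() {
	fmt.Println("OpenAI key valid:", verifyAccess(os.Getenv("OPENAI_API_KEY")))
}
```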
API key management is provider specific; here are the instructions for each supported provider.
OpenAI and Groq use the same API, so their key management is identical. To set the keys, add these to your environment:
```sh
# OpenAI API key.
export OPENAI_API_KEY=

# Groq API key.
export GROQ_API_KEY=
```

If you don't need Groq, simply don't add its key.
AWS uses its own authentication mechanism, which is based on the AWS CLI. Refer to the AWS documentation for details if you need it.
Latai loads your AWS profile in the following order:
- Read `AWS_PROFILE` and `AWS_REGION` from your environment.
- If not found, fall back to the default values `AWS_PROFILE=default`, `AWS_REGION=us-east-1`.
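In code, this fallback amounts to roughly the following (a sketch of the behaviour described above, not the actual Latai source):

```go
package main

import (
	"fmt"
	"os"
)

// resolveAWS returns the profile and region following the order above:
// environment variables first, then the documented defaults.
func resolveAWS() (profile, region string) {
	if profile = os.Getenv("AWS_PROFILE"); profile == "" {
		profile = "default"
	}
	if region = os.Getenv("AWS_REGION"); region == "" {
		region = "us-east-1"
	}
	return profile, region
}

func main() {
	p, r := resolveAWS()
	fmt.Printf("profile=%s region=%s\n", p, r)
}
```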
To set your profile and region, add:

```sh
export AWS_PROFILE=
export AWS_REGION=
```

Make sure you either launch Latai from the same terminal after running the exports, or add the exports to your shell rc file, e.g. `.bashrc`, `.zshrc`, etc.
Note
Make sure you have access to LLM models from your AWS account. They are not enabled by default. You have to navigate to https://REGION.console.aws.amazon.com/bedrock/home?region=REGION#/modelaccess and enable models from the console. Make sure to replace REGION with your actual region. Here is the link for us-east-1.
To verify your access you can use the AWS CLI.

```sh
aws bedrock \
    list-foundation-models \
    --region REGION \
    --profile PROFILE
```

Substitute REGION and PROFILE with your data. You can optionally pipe the output into `jq` to make it more readable.
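If you prefer to check access from Go instead of the CLI, something along these lines should work with the AWS SDK for Go v2 (the profile and region values below are placeholders to replace with your own):

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/bedrock"
)

func main() {
	ctx := context.Background()

	// Load the shared AWS config with an explicit profile and region.
	cfg, err := config.LoadDefaultConfig(ctx,
		config.WithSharedConfigProfile("default"),
		config.WithRegion("us-east-1"),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Listing foundation models fails if credentials or permissions are missing.
	out, err := bedrock.NewFromConfig(cfg).ListFoundationModels(ctx, &bedrock.ListFoundationModelsInput{})
	if err != nil {
		log.Fatal(err)
	}

	for _, m := range out.ModelSummaries {
		fmt.Println(aws.ToString(m.ModelId))
	}
}
```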
Latai uses a set of 3 pre-defined prompts by default. They are just good enough to measure latency to a model and back, e.g. `Respond with a single word: "optimistic".` You can find them here. Three pre-defined prompts means that, by default, all sampling happens over 3 runs.
If you need to measure compute time, or performance with your own particular prompts, you can add them to the `~/.latai` directory.
```sh
# Create a directory where prompts are stored.
mkdir -p ~/.latai/prompts

# Create your prompts in there.
cd ~/.latai/prompts
touch p1.prompt

# Or create many.
touch {p1,p2,p3,p4,p5}.prompt
```

You can create any number of prompts you wish; just mind throttling and rate limiting. All prompts must have the `.prompt` suffix; files with other suffixes will be ignored.
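For illustration, discovering custom prompts can be as simple as filtering the directory listing by the `.prompt` suffix. The sketch below is hypothetical and not Latai's actual loader:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// loadPrompts returns the contents of every *.prompt file in dir,
// ignoring files with any other suffix.
func loadPrompts(dir string) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}

	var prompts []string
	for _, e := range entries {
		if e.IsDir() || !strings.HasSuffix(e.Name(), ".prompt") {
			continue
		}
		data, err := os.ReadFile(filepath.Join(dir, e.Name()))
		if err != nil {
			return nil, err
		}
		prompts = append(prompts, string(data))
	}
	return prompts, nil
}

func main() {
	home, _ := os.UserHomeDir()
	prompts, err := loadPrompts(filepath.Join(home, ".latai", "prompts"))
	if err != nil {
		fmt.Println("no custom prompts:", err)
		return
	}
	fmt.Printf("loaded %d custom prompts\n", len(prompts))
}
```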
Definitions:
- A provider is a service that grants access to models via an API.
- A vendor is the company that owns, develops, trains, and aligns a model.
- A model is an LLM with specific properties such as performance, context length, and supported languages.
Providers serve models through APIs. Models can be grouped into families, such as the Claude family. Some providers are mono-family, like OpenAI, which uses a single unified API for all its models. Others are multi-family, like AWS Bedrock, which has its own API, but the communication format varies depending on the model family.
Vendors may have one or more model families, typically defined by their API. For example, if models A and B share the same API and belong to the same vendor, they are considered part of the same family.
Latai organizes models by provider, as the provider is the core entity that runs the models. However, models can often be available through multiple providers. This distinction creates two API layers:
- The provider API, which handles transport.
- The model API, which defines the data format.
Rate limits are commonly measured in the following metrics:
- RPM: Requests per minute
- RPD: Requests per day
- TPM: Tokens per minute
- TPD: Tokens per day
Verify these with your model provider; the information can be found in the provider's documentation (links below). Keep in mind that rate limits are almost always negotiable with your provider, and limits are generally applied to individual models, not to the provider itself.
Providers:
- Groq Rate Limits
- OpenAI Rate Limits
- AWS Bedrock (read below)
OpenAI uses tiered rate limits, from tier 1 to tier 5. For more details, consult their documentation.
AWS Bedrock serves models from multiple vendors under one name. Before using most of the models you have to request access to them via the AWS console.
Note
Make sure you have access to LLM models from your AWS account. They are not enabled by default. You have to navigate to https://REGION.console.aws.amazon.com/bedrock/home?region=REGION#/modelaccess and enable models from the console. Make sure to replace REGION with your actual region.
To verify your access you can use the AWS CLI.

```sh
aws bedrock \
    list-foundation-models \
    --region REGION \
    --profile PROFILE
```

Substitute REGION and PROFILE with your data. You can optionally pipe the output into `jq` to make it more readable.
Even though AWS Bedrock returns lots of models, not all of them can be accessed "as-is". For example, AWS Bedrock lists more than 20 Claude-family models; however, only 6 of them are available without provisioning. At the moment, Latai doesn't include models that require special access.
You can fork this repository and add the provisioned models you need by adding their IDs to the `NewBedrock` function in this file.
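The shape of such a change is roughly the following. This is purely illustrative: the actual constructor, struct, and field names in the Bedrock provider file may differ, and the model ID shown is only an example.

```go
// Hypothetical sketch of NewBedrock in a fork; check the real source file
// for the actual struct and field names before editing.
func NewBedrock() *Bedrock {
	return &Bedrock{
		models: []*Model{
			// ... models already shipped with Latai ...

			// A provisioned model added manually by its Bedrock model ID.
			{ID: "anthropic.claude-3-opus-20240229-v1:0", Name: "Claude 3 Opus"},
		},
	}
}
```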
All providers reside in the `provider` package. There is one main interface, defined in the `provider.go` file.
```go
// Comments omitted for brevity, check source file
// to see full version.
type Provider interface {
	Name() ModelProvider
	GetLLMModels(filter string) []*Model
	Measure(model *Model, prompt *prompt.Prompt) (*Metric, error)
	Send(message string, to *Model) (*Response, error)
	VerifyAccess() bool
}
```

If a struct satisfies this `Provider` interface, it is ready to be used along with all other providers.
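As a rough illustration, a new provider stub could start like the skeleton below. The method set mirrors the interface above; the struct name, constructor, and bodies are placeholders, and `Model`, `Metric`, `Response`, and `prompt.Prompt` are the existing types from the `provider` and `prompt` packages.

```go
// acme.go - hypothetical provider skeleton inside the provider package.
type Acme struct {
	apiKey string
}

func NewAcme(apiKey string) *Acme {
	return &Acme{apiKey: apiKey}
}

// Name identifies the provider (assuming ModelProvider is a string-based type).
func (a *Acme) Name() ModelProvider { return ModelProvider("Acme") }

// GetLLMModels returns the models this provider exposes, filtered by name.
func (a *Acme) GetLLMModels(filter string) []*Model {
	return nil // TODO: return the provider's model list
}

// Measure sends the prompt, times the round trip, and wraps it in a Metric.
func (a *Acme) Measure(model *Model, p *prompt.Prompt) (*Metric, error) {
	return nil, nil // TODO: call Send and record latency
}

// Send calls the provider's API and translates the reply into a Response.
func (a *Acme) Send(message string, to *Model) (*Response, error) {
	return nil, nil // TODO: perform the HTTP request
}

// VerifyAccess usually makes a cheap "list models" call that fails on a bad key.
func (a *Acme) VerifyAccess() bool {
	return a.apiKey != ""
}
```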
If the provider you are adding is OpenAI API compatible, check the `groq.go` implementation.
Do not forget to add tests. You can see existing test implementations inside the `provider` package.
Read the Events panel of the TUI; it generally explains what went wrong. The most common issues are related to AWS Bedrock model access.
