Prerequisites
Feature Description
ZAYA1-8B is a sparse Mixture-of-Experts (MoE) model with 760M active parameters (8.4B total) developed by Zyphra. It introduces a reasoning method called Markovian RSA (Rational Speech Acts) sampling, which optimizes the chain-of-thought during inference to improve logical accuracy.
Motivation
With only 760M of its 8.4B parameters active, it is an ultra-sparse model, which makes it well suited to local use.
Its performance is state-of-the-art (SOTA) for its size.
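To put a number on "ultra-sparse", here is a minimal sketch that computes the share of parameters that are active, using only the two counts quoted in the Feature Description. The constant and function names are illustrative, and the counts are taken from this issue rather than verified independently.

```python
# Parameter counts as quoted in this issue (assumed, not independently verified).
ACTIVE_PARAMS = 760e6   # 760M active parameters
TOTAL_PARAMS = 8.4e9    # 8.4B total parameters


def active_fraction(active: float, total: float) -> float:
    """Return the fraction of the model's parameters that are active."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share of parameters: {fraction:.1%}")
```

With these figures, roughly 9% of the parameters are active.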
Possible Implementation
Zyphra's GitHub fork of vLLM at the following commit could serve as a starting point: Zyphra/vllm@641dc7a
Zyphra's Technical Report: https://www.zyphra.com/zaya1-8b-technical-report