Zen 5
Next-generation agentic models with native chain-of-thought.
| Model | Size | Description |
|---|---|---|
| zen5 | TBA | Next-generation agentic frontier model trained on 10B+ tokens of real-world tool use, multi-step reasoning, and production workflows. 1M+ token context with native chain-of-thought. |
| zen5-pro | TBA | High-throughput agentic model for demanding production workloads. Trained on real-world development patterns with deep chain-of-thought reasoning. |
| zen5-max | TBA | Maximum-context agentic model for document-scale analysis. Trained on 10B+ tokens of real-world workflows with extended chain-of-thought. |
| zen5-ultra | TBA | Deepest reasoning model in the Zen family. Multi-pass chain-of-thought with self-verification. |
| zen5-mini | TBA | Efficient agentic model delivering zen5-class intelligence at a fraction of the cost. |
Zen 4
Latest-generation production models with MoE architecture.
| Model | Size | Description |
|---|---|---|
| zen4-max | N/A | Most capable model for complex reasoning, analysis, and agentic tasks. 1M-token context window. |
| zen4.1 | N/A | High-performance 1M-context model for long-document analysis, large-codebase reasoning, and agentic workflows. Best balance of intelligence and cost at million-token scale. |
| zen4 | 744B (40B active) | Flagship MoE model for complex reasoning and multi-domain tasks. |
| zen4-ultra | 744B (40B active) | Maximum reasoning capability with extended chain-of-thought on MoE architecture. |
| zen4-pro | 80B (3B active) | Efficient MoE model for demanding workloads with strong reasoning at production-grade cost. |
| zen4-thinking | 80B (3B active) | Dedicated reasoning model with explicit chain-of-thought capabilities. |
| zen4-mini | N/A | Ultra-fast lightweight model optimized for speed and cost efficiency. Ideal for the free tier. |
Code
Specialized models for code generation, review, and debugging.
| Model | Size | Description |
|---|---|---|
| zen4-coder | 480B (35B active) | Code-specialized MoE model for generation, review, debugging, and agentic programming. |
| zen4-coder-flash | 30B (3B active) | Lightweight code model optimized for speed and inline completions. |
| zen4-coder-pro | 480B | Full-precision BF16 code model for maximum accuracy on complex codebases. |
| zen-coder | 32B | Baseline code model for generation and completions. |
| zen-coder-flash | 7B | Fast code model for inline completions and suggestions. |
| zen-code | 14B | Legacy code model (superseded by the Zen4 Coder series). |
Zen 3
Previous generation API models — language, vision, multimodal, and safety.
| Model | Size | Description |
|---|---|---|
| zen3-omni | ~200B | Multimodal model supporting text, vision, audio, and structured output. |
| zen3-vl | 30B (3B active) | Vision-language model for image understanding and visual reasoning. |
| zen3-nano | 8B | Ultra-lightweight model for edge deployment and low-latency tasks. Available on the free tier. |
| zen3-guard | 4B | Content-safety classifier for moderation and guardrails. 9 safety categories, 119 languages. |
Embedding & Retrieval
Text embeddings and search reranking via API.
| Model | Size | Description |
|---|---|---|
| zen3-embedding | 3072 dimensions | High-quality text embeddings for RAG, search, and classification. |
| zen3-embedding-medium | 4B | Balanced embedding model for cost-effective retrieval workloads. |
| zen3-embedding-small | 0.6B | Lightweight embedding model for high-throughput, low-cost applications. |
| zen3-reranker | 8B | High-quality reranker for improving retrieval accuracy in RAG pipelines. |
| zen3-reranker-medium | 4B | Balanced reranker for cost-effective retrieval quality improvement. |
| zen3-reranker-small | 0.6B | Lightweight reranker for high-throughput reranking at minimal cost. |
| zen-embedding | 3072 dimensions | Foundation embedding model for search and retrieval. |
| zen-reranker | 568M | Cross-encoder reranker for search result quality. |
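To illustrate how the embedding and reranker models fit together in a retrieval pipeline: documents and the query are embedded as vectors, first-stage candidates are ranked by cosine similarity, and a cross-encoder reranker then rescores the top hits. The sketch below uses toy 3-dimensional vectors as stand-ins for real zen3-embedding outputs (3072 dimensions in practice); the document names and vector values are hypothetical.

```python
import math

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for real zen3-embedding vectors.
query = [0.9, 0.1, 0.0]
docs = {
    "doc_a": [0.8, 0.2, 0.1],  # semantically close to the query
    "doc_b": [0.0, 0.1, 0.9],  # unrelated
}

# First-stage retrieval: rank all documents by cosine similarity.
# A reranker such as zen3-reranker would then rescore the top-k
# (query, document) pairs jointly for higher final precision.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
print(ranked)  # doc_a ranks first
```

The same shape applies at scale: embed once and store the vectors, then rerank only the short candidate list, since cross-encoders are far more expensive per pair than a vector dot product.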
Image Generation
Text-to-image generation via API.
| Model | Size | Description |
|---|---|---|
| zen3-image | N/A | Best general-purpose image generation. |
| zen3-image-max | N/A | Maximum-quality image generation for professional creative work. |
| zen3-image-dev | N/A | Development model for experimentation and iteration. |
| zen3-image-fast | N/A | Fastest image model for real-time generation. |
| zen3-image-sdxl | N/A | High-resolution image generation at 1024px. |
| zen3-image-playground | N/A | Aesthetic model for artistic image generation. |
| zen3-image-ssd | 1B | Fastest diffusion model for real-time generation. |
| zen3-image-jp | N/A | Japanese-specialized image generation model. |
Audio & Speech
Speech-to-text, text-to-speech, and streaming ASR.
| Model | Size | Description |
|---|---|---|
| zen3-audio | 1.5B | Best-quality speech-to-text transcription. 100+ languages. |
| zen3-audio-fast | 809M | Fastest speech-to-text transcription for high-throughput workloads. |
| zen3-asr | N/A | Real-time streaming speech recognition for live transcription and voice agents. |
| zen3-asr-v1 | N/A | First-generation streaming ASR for legacy compatibility. |
| zen3-tts | 82M | High-quality text-to-speech with natural prosody. 40+ voices, 8 languages. |
| zen3-tts-hd | N/A | Maximum-fidelity text-to-speech for broadcast-quality audio production. |
| zen3-tts-fast | 82M | Low-latency text-to-speech for real-time voice agents and interactive applications. |
Foundation
General-purpose open-weight models from 0.6B to 235B parameters.
| Model | Size | Description |
|---|---|---|
| zen-nano | 0.6B | Ultra-lightweight LLM for edge and mobile deployment. |
| zen-eco | 4B | Efficient 4B model for general-purpose tasks. |
| zen | 8–32B | Standard model available in 8B and 32B variants. |
| zen-pro | 32B | Professional-grade 32B dense model for demanding workloads. |
| zen-max | 235B (22B active) | High-capability MoE model with 235B parameters. |
| zen-next | TBD | Next-generation preview model with cutting-edge capabilities. |
Vision (Open Weights)
Vision-language and multimodal open-weight models.
Safety
Content moderation and safety guardrail models.
Agents
Agent-optimized models for tool use and planning.
Capabilities Matrix
Each model specializes in different modalities and tasks
| Model | Text | Image | Video | Audio | 3D | Code | Agents |
|---|---|---|---|---|---|---|---|
| zen5 | ✓ | — | — | — | — | ✓ | ✓ |
| zen4 | ✓ | — | — | — | — | ✓ | ✓ |
| zen4-max | ✓ | — | — | — | — | ✓ | ✓ |
| zen4-ultra | ✓ | — | — | — | — | ✓ | ✓ |
| zen4-coder | ✓ | — | — | — | — | ✓ | ✓ |
| zen3-omni | ✓ | ✓ | — | ✓ | — | — | — |
| zen3-vl | ✓ | ✓ | — | — | — | — | — |
| zen3-nano | ✓ | — | — | — | — | ✓ | — |
| zen3-guard | ✓ | — | — | — | — | — | — |
| zen3-image | ✓ | ✓ | — | — | — | — | — |
| zen3-audio | ✓ | — | — | ✓ | — | — | — |
| zen3-tts | ✓ | — | — | ✓ | — | — | — |
| zen3-embedding | ✓ | — | — | — | — | — | — |
| zen3-reranker | ✓ | — | — | — | — | — | — |
Infrastructure
Production-ready tools for training and deploying Zen models
Zoo Engine
High-performance cloud inference — 60+ architectures, CUDA/Metal, OpenAI-compatible API
Zoo Edge
On-device AI inference — run models locally on any device, browser, or embedded system
Zen Gym
Unified training platform for all Zen models with LoRA, QLoRA, GRPO, and more
Zoo MCP
Model Context Protocol for AI context management and tool use
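Because Zoo Engine exposes an OpenAI-compatible API, any OpenAI-style client can target it simply by pointing at the engine's base URL. The sketch below builds the standard chat-completions request with only the Python standard library; the endpoint path follows the OpenAI wire format, while the base URL shown is a placeholder for your own deployment, not a documented Zoo address.

```python
import json

# Placeholder: substitute the address of your Zoo Engine deployment.
BASE_URL = "http://localhost:8000"

# OpenAI-compatible wire format: POST /v1/chat/completions
endpoint = BASE_URL + "/v1/chat/completions"
payload = {
    "model": "zen-eco-4b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
}
body = json.dumps(payload)  # JSON request body, sent with an HTTP client
```

In practice you would send `body` with any HTTP client (or reuse an existing OpenAI SDK configured with this base URL); the response follows the same OpenAI-compatible schema.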
Quick Start
Get started with any Zen model in seconds
```shell
# Install dependencies for local inference
pip install transformers torch
```

```python
# Run a model locally with Hugging Face Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("zenlm/zen-eco-4b-instruct")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-eco-4b-instruct")
```

```python
# Or call the hosted Zoo Cloud API
from zooai import Zoo

client = Zoo(api_key="zk-your-api-key")
response = client.chat.completions.create(
    model="zen-eco-4b-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
```