MiniMax

MiniMax Developer API Reference

Integrate MiniMax AI models into your applications with a clean REST API built for developers who value speed, clarity, and predictability.

API Overview

The MiniMax API is a RESTful interface that accepts JSON request bodies and returns JSON responses, with streaming support for real-time chat and asynchronous task polling for video generation.

The MiniMax API operates over HTTPS on api.minimax.gr.com. Every request requires a valid Bearer token sent in the Authorization header. Response codes follow standard HTTP semantics — 200 for success, 401 for authentication failures, 429 when you hit rate limits, and 5xx for server-side issues. All endpoints support idempotency keys via the Idempotency-Key header, letting you safely retry requests without duplicating operations.
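As a rough illustration of the request conventions above, the following sketch builds (but does not send) an authenticated request using only the Python standard library. The helper name build_request is hypothetical, and generating the idempotency key as a UUID is an assumption; the text only says the header is supported.

```python
import json
import urllib.request
import uuid

# Host and version prefix as given in the text above.
BASE_URL = "https://api.minimax.gr.com/v1"


def build_request(path, api_key, body=None, idempotency_key=None):
    """Build (but do not send) an authenticated JSON request.

    Hypothetical helper: GET when there is no body, POST otherwise.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        headers["Content-Type"] = "application/json"
        data = json.dumps(body).encode("utf-8")
        # Assumption: any unique string works as a key. Reuse the same key
        # when retrying the same logical operation.
        headers["Idempotency-Key"] = idempotency_key or str(uuid.uuid4())
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)
```

The returned object can be passed to urllib.request.urlopen(req, timeout=30) to actually send it. Keeping construction separate from sending makes it easy to retry with the same Idempotency-Key.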

The API is versioned through the URL path. The current version is /v1. Older versions receive a six-month deprecation window with clear migration guides. Breaking changes are announced through the changelog and the developer mailing list. Non-breaking additions (new model IDs, optional parameters, additional response fields) appear without version bumps.

Endpoint Quick Reference:

All endpoints live under https://api.minimax.gr.com/v1. Chat, embeddings, video, and model management endpoints share a common auth model and error format. Use /v1/models to list available models and their capabilities before building your integration.

Authentication

Every MiniMax API call authenticates with a Bearer token generated from your platform hub dashboard.

Create an API key in the platform hub under Settings > API Keys. Keys support scope restrictions: you can limit a key to read-only access, video-only endpoints, or specific models. Use separate keys for development and production environments. Rotate keys on a regular schedule and monitor key usage in the dashboard's activity log.

The authorization header format is Authorization: Bearer mmx_live_a1b2c3d4e5f6g7h8i9j0. Keys prefixed with mmx_live_ are production keys. Test keys use the mmx_test_ prefix and route to a sandbox environment with no billing impact. Never commit API keys to version control — use environment variables or a secrets manager.
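A minimal sketch of the key-handling advice above: read the key from an environment variable and check its prefix before use. The function names are hypothetical. The variable name MINIMAX_API_KEY follows the curl examples in this document.

```python
import os


def load_api_key(env_var="MINIMAX_API_KEY"):
    """Read the API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def key_environment(key):
    """Classify a key by the prefixes described in the text above."""
    if key.startswith("mmx_live_"):
        return "production"
    if key.startswith("mmx_test_"):
        return "sandbox"
    raise ValueError("unrecognized key prefix")
```

A deployment script might, for example, refuse to start a staging service when key_environment(load_api_key()) returns "production".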

Rate Limits

MiniMax enforces tiered rate limits with transparent headers so your application can adapt without guesswork.

Rate limits apply per API key, not per IP address. Every response includes X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers. When you exceed your limit, the API returns HTTP 429 with a Retry-After header indicating the number of seconds to wait. Free tier accounts receive 60 requests per minute. Pay-as-you-go accounts get 600 RPM, with separate limits on video endpoints (see the endpoint table). Enterprise customers receive dedicated capacity with burst allowances negotiated during onboarding.
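One way to read these headers in client code is sketched below. The function name read_rate_limit is hypothetical. Because the text does not say whether X-RateLimit-Reset is a timestamp or a number of seconds, the sketch keeps that value as a raw string rather than guessing.

```python
def read_rate_limit(headers):
    """Extract rate-limit state from a mapping of response headers."""

    def as_int(name):
        value = headers.get(name)
        return int(value) if value is not None else None

    return {
        "limit": as_int("X-RateLimit-Limit"),
        "remaining": as_int("X-RateLimit-Remaining"),
        # Format is not specified in the text above, so it is kept raw.
        "reset": headers.get("X-RateLimit-Reset"),
        # Present on 429 responses: seconds to wait before retrying.
        "retry_after": as_int("Retry-After"),
    }
```

With a plain dict the header names must match exactly. The headers object returned by urllib or http.client performs case-insensitive lookups, so it can be passed in directly.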

Implement exponential backoff in your client: wait Retry-After seconds on the first 429, double the wait on subsequent failures, and add jitter to avoid thundering-herd retry patterns. The official MiniMax SDKs handle this automatically.
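For clients that do not use an SDK, the backoff schedule described above might look like the following sketch. The function name, the 60-second cap, and the 25% jitter range are all illustrative assumptions, not values from the MiniMax documentation.

```python
import random


def backoff_delays(retry_after, max_retries=5, cap=60.0, rng=random.random):
    """Yield wait times: Retry-After first, then doubling, each with jitter.

    `rng` returns a float in [0, 1); it is injectable so the schedule can be
    made deterministic in tests.
    """
    base = float(retry_after)
    for attempt in range(max_retries):
        # Exponential growth from the server-provided starting point,
        # capped so a long outage does not produce unbounded waits.
        delay = min(cap, base * (2 ** attempt))
        # Add up to 25% random jitter so clients do not retry in lockstep.
        yield delay + delay * 0.25 * rng()
```

A caller would sleep for each yielded delay before retrying and stop as soon as a request succeeds; once the generator is exhausted, the error should be surfaced to the caller.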

Endpoint Categories

The MiniMax API groups endpoints into four categories — Chat, Embeddings, Video, and Models — each with its own request shape and response convention.

Chat: The chat completions endpoint (POST /v1/chat/completions) accepts a messages array with role/content pairs. It supports system prompts, multi-turn conversations, function calling, and streaming via SSE. Temperature, top_p, and max_tokens give you granular control over output.
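The request body for a chat completion could be assembled as in the sketch below. The helper name is hypothetical. The "system" role name and the "stream" flag are assumptions: the text mentions system prompts and SSE streaming but does not spell out those field values.

```python
def chat_payload(user_text, system_prompt=None, model="minimax-chat-v2",
                 temperature=None, top_p=None, max_tokens=None, stream=False):
    """Build a JSON-serializable body for POST /v1/chat/completions."""
    messages = []
    if system_prompt:
        # Assumed role name for system prompts.
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": user_text})

    payload = {"model": model, "messages": messages}
    # Only include sampling parameters the caller actually set, so the
    # server-side defaults apply otherwise.
    for name, value in (("temperature", temperature),
                        ("top_p", top_p),
                        ("max_tokens", max_tokens)):
        if value is not None:
            payload[name] = value
    if stream:
        payload["stream"] = True  # assumed flag name for SSE streaming
    return payload
```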

Embeddings: Call POST /v1/embeddings with an array of input strings and a model ID. Returns fixed-dimension float vectors suitable for semantic search, clustering, and RAG pipelines. Batch up to 2,048 input strings per request.
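To stay within the 2,048-string limit, a client can split a large corpus into batches before calling the endpoint. The sketch below shows one way to do that; the helper names are hypothetical, and the model ID is the one used in the curl examples in this document.

```python
def batch_inputs(texts, batch_size=2048):
    """Yield consecutive slices of `texts`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embedding_payloads(texts, model="minimax-embed-v1"):
    """Build one POST /v1/embeddings request body per batch."""
    return [{"model": model, "input": batch} for batch in batch_inputs(texts)]
```

For 5,000 input strings this produces three request bodies with 2,048, 2,048, and 904 inputs. Batches preserve input order, so the returned vectors can be concatenated back in sequence.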

Video: The video generation endpoint (POST /v1/video/generations) takes a text prompt, optional reference image, and output parameters including resolution and duration. Generation is asynchronous — you receive a task ID immediately, then poll GET /v1/video/generations/{task_id} for completion status.
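The submit-then-poll flow could be wrapped in a helper like the sketch below. It takes the status fetcher as a parameter, so it makes no assumption about the HTTP client. The "status" field and the terminal values "succeeded" and "failed" are hypothetical, since the text does not document the task object; adjust them to match the real response.

```python
import time

# Hypothetical terminal states; the text above does not list them.
TERMINAL_STATES = ("succeeded", "failed")


def wait_for_task(fetch_status, task_id, interval=5.0, timeout=120.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until the task reaches a terminal state or the timeout passes.

    `fetch_status(task_id)` should return the parsed task object, e.g. the
    result of GET /v1/video/generations/{task_id}.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish in {timeout}s")
        sleep(interval)
```

The polling interval should be chosen with the rate limit of the status endpoint in mind; a few seconds between polls keeps a single task well below it.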

Models: GET /v1/models returns available models with metadata including context window size, pricing per token, capabilities (chat, embeddings, vision), and deprecation status. GET /v1/models/{model_id} provides detailed information for a specific model.

Response Formats

All MiniMax API responses follow a consistent JSON envelope — success payloads go in data, errors go in error, and metadata lives in a top-level meta object.

A successful response looks like: {"data": {...}, "meta": {"request_id": "req_abc123", "model": "minimax-chat-v2", "usage": {"prompt_tokens": 42, "completion_tokens": 18}}}. The request_id field is critical for support inquiries — include it when reporting unexpected behavior. Streaming responses use SSE frames, each containing a JSON chunk with a delta object that accumulates into the full response.
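As a sketch of how a client might reassemble a streamed reply, the function below reads SSE lines and concatenates the text carried in each chunk. The location of the text (a "content" field inside a top-level "delta" object) is an assumption, as is the possibility of a non-JSON end-of-stream marker; the text above only says each frame holds a JSON chunk with a delta object.

```python
import json


def accumulate_sse(lines):
    """Join the delta text from an iterable of SSE lines into one string."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and event names
        payload = line[len("data:"):].strip()
        try:
            chunk = json.loads(payload)
        except json.JSONDecodeError:
            continue  # e.g. a non-JSON end-of-stream marker, if one is sent
        if not isinstance(chunk, dict):
            continue
        # Assumed chunk shape: {"delta": {"content": "..."}}
        delta = chunk.get("delta") or {}
        parts.append(delta.get("content", ""))
    return "".join(parts)
```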

Error responses use the structure {"error": {"code": "invalid_request", "message": "The 'model' field is required", "details": {"field": "model"}}}. Common error codes include invalid_request, authentication_failed, rate_limit_exceeded, model_not_found, and server_error.
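Given the envelope described above, a small helper can turn every response body into either a (data, meta) pair or a raised exception. This is a minimal sketch: the exception class and function name are hypothetical, while the field names come from the examples in this section.

```python
import json


class MiniMaxAPIError(Exception):
    """Raised when the response envelope contains an `error` object."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def unwrap(raw_body):
    """Parse a response body and return (data, meta), or raise on error."""
    envelope = json.loads(raw_body)
    error = envelope.get("error")
    if error:
        raise MiniMaxAPIError(error.get("code", "unknown"),
                              error.get("message", ""),
                              error.get("details"))
    return envelope.get("data"), envelope.get("meta", {})
```

Callers can then branch on err.code (for example rate_limit_exceeded) and log meta["request_id"] on success.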

SDK Installation & Quickstart

The MiniMax SDKs wrap the REST API in idiomatic language interfaces — install with one command, make your first call in under ten lines of code.

Python: pip install minimax-sdk. Import the client with from minimax import MiniMax, instantiate it with client = MiniMax(api_key="..."), and call client.chat.completions.create(model="minimax-chat-v2", messages=[{"role": "user", "content": "Hello"}]).

JavaScript: npm install @minimax/sdk. Import the client with import MiniMax from '@minimax/sdk', create it with const client = new MiniMax({ apiKey: '...' }), and call await client.chat.completions.create({ model: 'minimax-chat-v2', messages: [{ role: 'user', content: 'Hello' }] }).

Go: go get github.com/minimax/minimax-go. Import the package, create a client with client := minimax.NewClient("..."), and call client.Chat.Completions(ctx, &minimax.ChatCompletionRequest{Model: "minimax-chat-v2", Messages: []minimax.Message{{Role: "user", Content: "Hello"}}}).

Curl Examples

Test the MiniMax API directly from your terminal with these curl snippets. Set the MINIMAX_API_KEY environment variable to your API key, and change the model ID if you want to target a different model.

Chat completion: curl https://api.minimax.gr.com/v1/chat/completions -H "Authorization: Bearer $MINIMAX_API_KEY" -H "Content-Type: application/json" -d '{"model":"minimax-chat-v2","messages":[{"role":"user","content":"Explain REST APIs in one paragraph."}]}'

Generate embeddings: curl https://api.minimax.gr.com/v1/embeddings -H "Authorization: Bearer $MINIMAX_API_KEY" -H "Content-Type: application/json" -d '{"model":"minimax-embed-v1","input":["MiniMax provides powerful AI tools for developers."]}'

List models: curl https://api.minimax.gr.com/v1/models -H "Authorization: Bearer $MINIMAX_API_KEY"

Start video generation: curl https://api.minimax.gr.com/v1/video/generations -H "Authorization: Bearer $MINIMAX_API_KEY" -H "Content-Type: application/json" -d '{"prompt":"A golden retriever running through a field of sunflowers at golden hour","duration":5}'

Error Handling Patterns

Robust MiniMax integrations handle errors at three levels — network timeouts, HTTP error codes, and application-level response validation.

Set reasonable timeouts: 30 seconds for chat completions, 120 seconds for video generation polling. Wrap API calls in retry logic that respects the Retry-After header. Log the request_id from every response so you can trace failures back through the MiniMax infrastructure. For production services, implement circuit breakers that pause requests when error rates exceed a threshold, giving the API time to recover.
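A circuit breaker along these lines can be quite small. The sketch below is a simplified variant that trips on a count of consecutive failures rather than on a measured error rate, which is easier to show in a few lines; the class name, threshold, and cooldown values are illustrative assumptions.

```python
import time


class CircuitBreaker:
    """Pause outgoing calls after repeated failures, then allow a trial call."""

    def __init__(self, failure_threshold=5, cooldown=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self.opened_at is None:
            return True
        # After the cooldown, let a trial request through (half-open state).
        return self.clock() - self.opened_at >= self.cooldown

    def record_success(self):
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cooldown.
            self.opened_at = self.clock()
```

Client code would check allow_request() before each API call, then report the outcome with record_success() or record_failure(). Counting only 5xx responses and timeouts as failures keeps client-side errors such as invalid_request from tripping the breaker.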

API Endpoints Reference

The table below lists every available MiniMax API endpoint with its HTTP method, description, and rate limit tier.

Endpoint | Method | Description | Rate Limit
/v1/chat/completions | POST | Generate chat completions with streaming support | 600 RPM
/v1/embeddings | POST | Create vector embeddings for text inputs | 600 RPM
/v1/video/generations | POST | Submit video generation tasks | 120 RPM
/v1/video/generations/{id} | GET | Poll video generation task status | 300 RPM
/v1/models | GET | List all available models and their capabilities | 300 RPM
/v1/models/{id} | GET | Retrieve detailed metadata for a specific model | 300 RPM
/v1/files | POST | Upload files for fine-tuning or video reference | 60 RPM
/v1/files/{id} | GET | Retrieve file metadata and download URL | 300 RPM
/v1/fine-tunes | POST | Create a fine-tuning job on a base model | 30 RPM
/v1/fine-tunes/{id} | GET | Check fine-tuning job status and progress | 300 RPM
