Instant
AI Inference
via API
Integrate 30+ pre-optimized AI models into your applications with our developer-friendly API and robust Inference infrastructure.
- Llama3.1 70B
- Deepseek-R1
- Deepseek-V3
- Qw3-32B
- Qw3-14B
- Flux.V1-Schnell
- Flux.V1-Dev
Serverless Inference
Get access to 15+ models through API endpoints like Llama, DeepSeek, Qwen, Mistral, FLUX and many others.
Integrate powerful open-source and multimodal AI capabilities.
From conversational AI to image generation and code assistance – into your applications.
Deploy any of our 15+ open-source AI models instantly without infrastructure setup. Get your AI endpoints live in seconds with automatic scaling and enterprise-grade reliability.
Test and prototype with models directly in our web-based playground before integrating. Experiment with parameters, compare model outputs, and generate API code snippets to accelerate your development workflow.
Sub-second response times with 99.9% uptime SLA, automatic load balancing, and global edge deployment. Built on our optimized inference stack to handle production workloads at any scale.
Secure API access with token-based authentication, custom rate limits, and usage analytics. Monitor consumption in real-time.
Affordable text generation and image creation with pay-as-you-use pricing model available.
| Model | Input | Output |
|---|---|---|
| Deepseek-V3 | $2/M tokens | $2/M tokens |
| Deepseek-R1 | $3/M tokens | $8/M tokens |
| Llama3.1 8B | $0.05/M tokens | $0.2/M tokens |
| Llama3.1 70B | $1/M tokens | $1.5/M tokens |
| Gemma2 9B | $0.1/M tokens | $0.3/M tokens |
| Gemma3 27B | $0.25/M tokens | $0.7/M tokens |
| Qwen3 32B | $0.3/M tokens | $1.0/M tokens |
| Qwen3 235B A22B | $0.5/M tokens | $2/M tokens |
| Qwen3 Coder 480B A35B | $1/M tokens | $4/M tokens |
| GPT-OSS 120B | $0.25/M tokens | $0.75/M tokens |
| GPT-OSS 20B | $0.15/M tokens | $0.3/M tokens |
| GLM-4.5 | $1/M tokens | $5/M tokens |
| GLM-4.5 Air | $0.4/M tokens | $2.5/M tokens |
| Model | Price per image | Images per $1 |
|---|---|---|
| Flux Schnell | $0.003 | 333 |
| Flux Dev | $0.025 | 40 |
