AI Endpoint Model Serverless

Instant
AI Inference via API

Integrate 30+ pre-optimized AI models into your applications with our developer-friendly API and robust Inference infrastructure.

  • Meta Llama3.1 70B
  • DeepSeek Deepseek-R1
  • DeepSeek Deepseek-V3
  • Qwen Qw3-32B
  • Qwen Qw3-14B
  • Flux Flux.V1-Schnell
  • Flux Flux.V1-Dev
Serverless Inference

Get access to 15+ models through API endpoints like Llama, DeepSeek, Qwen, Mistral, FLUX and many others.

Features For which use cases?

What Models Will You Deploy Today?

Integrate powerful open-source and multimodal AI capabilities.

From conversational AI to image generation and code assistance – into your applications.

Pricing Build with Arkane Cloud

Affordable text generation and image creation with pay-as-you-use pricing model available.

Text Generation Text to text

Model Input Output
Deepseek-V3 $2/M tokens $2/M tokens
Deepseek-R1 $3/M tokens $8/M tokens
Llama3.1 8B $0.05/M tokens $0.2/M tokens
Llama3.1 70B $1/M tokens $1.5/M tokens
Gemma2 9B $0.1/M tokens $0.3/M tokens
Gemma3 27B $0.25/M tokens $0.7/M tokens
Qwen3 32B $0.3/M tokens $1.0/M tokens
Qwen3 235B A22B $0.5/M tokens $2/M tokens
Qwen3 Coder 480B A35B $1/M tokens $4/M tokens
GPT-OSS 120B $0.25/M tokens $0.75/M tokens
GPT-OSS 20B $0.15/M tokens $0.3/M tokens
GLM-4.5 $1/M tokens $5/M tokens
GLM-4.5 Air $0.4/M tokens $2.5/M tokens
Image Generation Text to Image

Model Price per image Images per $1
Flux Schnell $0.003 333
Flux Dev $0.025 40

Create your account