Volt Spark

Tokens-as-a-service, OpenAI-compatible

Frontier open-weights LLMs in production, served in your customer's metro. Bedrock-beating prices with zero egress.

$0.95/M tokens, Llama 70B standard
OpenAI drop-in
Change the base URL and key. Your existing SDK code keeps working.
Zero egress, in-metro serving
Tokens are served in your customer’s metro. Data never leaves the city.
Sovereign tier
Pod-pinned inference with attestation. $1.45/M tokens, about 45% below AWS Bedrock.
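To put the per-token rates in context, here is a minimal sketch of the arithmetic. It assumes a single blended rate per million tokens (the page does not split input and output pricing) and a hypothetical workload of 500M tokens per month. The reference price is only what the "about 45% below" claim implies; it is not a quoted AWS Bedrock rate.

```python
# Rates quoted on this page, in USD per million tokens.
STANDARD_USD_PER_M = 0.95    # Llama 70B standard
SOVEREIGN_USD_PER_M = 1.45   # Sovereign tier
SOVEREIGN_DISCOUNT = 0.45    # "about 45% below AWS Bedrock"


def monthly_cost(tokens: int, usd_per_million: float) -> float:
    """Cost of a token volume at a flat per-million rate (assumed blended)."""
    return tokens / 1_000_000 * usd_per_million


def implied_reference_price(price: float, discount: float) -> float:
    """Price that `price` would be `discount` below. Derived, not quoted."""
    return price / (1 - discount)


tokens_per_month = 500_000_000  # hypothetical workload, for illustration only

print(f"Standard:  ${monthly_cost(tokens_per_month, STANDARD_USD_PER_M):,.2f}")
print(f"Sovereign: ${monthly_cost(tokens_per_month, SOVEREIGN_USD_PER_M):,.2f}")
print(f"Implied reference rate: "
      f"${implied_reference_price(SOVEREIGN_USD_PER_M, SOVEREIGN_DISCOUNT):.2f}/M")
```

Swap in your own token volume to estimate a bill. Check current Bedrock pricing for the model you actually compare against before relying on the implied figure.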

Drop-in compatible

Point the OpenAI SDK at Volt. No rewrite, no new client library.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.voltcloud.ai/v1",
    api_key="volt-...",
)

resp = client.chat.completions.create(
    model="llama-3.3-70b",
    messages=[{"role": "user", "content": "Hello, Volt."}],
)

print(resp.choices[0].message.content)
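The snippet above hard-codes the base URL and key. One way to keep the switch out of application code is to build the client arguments from the environment, as in the sketch below. The variable names `USE_VOLT` and `VOLT_API_KEY` and the helper `client_kwargs` are hypothetical, chosen for illustration; they are not defined by Volt or by the OpenAI SDK.

```python
import os

# Base URL taken from the example above.
VOLT_BASE_URL = "https://api.voltcloud.ai/v1"


def client_kwargs(env=None):
    """Return keyword arguments for openai.OpenAI().

    USE_VOLT and VOLT_API_KEY are hypothetical names used only in this sketch.
    """
    env = os.environ if env is None else env
    if env.get("USE_VOLT") == "1":
        # Route requests to Volt: only the base URL and key change.
        return {"base_url": VOLT_BASE_URL, "api_key": env["VOLT_API_KEY"]}
    # Empty kwargs: the SDK falls back to its own defaults (OPENAI_API_KEY).
    return {}


# Usage (requires the openai package and a real key):
#   from openai import OpenAI
#   client = OpenAI(**client_kwargs())
```

With this shape, the rest of the code calls `client.chat.completions.create(...)` exactly as before, and only the environment decides which endpoint serves the request.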

  • Standard catalog: Llama 3.3/4, Mistral, Gemma 3, Phi-4, and more.
  • Bring your own LoRA or full-weights fine-tunes.
  • 99.9% uptime SLA, with service credits at two breach tiers: below 99.0% and below 98.0%.
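For a sense of what those uptime percentages allow, the sketch below converts each one into a downtime budget. It assumes a 30-day measurement window, which this page does not specify; the actual SLA terms may use a different period.

```python
def downtime_budget_minutes(uptime_pct: float, days: int = 30) -> float:
    """Minutes of downtime permitted at a given uptime percentage.

    The 30-day default window is an assumption, not a stated SLA term.
    """
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_pct) / 100


for pct in (99.9, 99.0, 98.0):
    minutes = downtime_budget_minutes(pct)
    print(f"{pct}% uptime -> {minutes:.1f} min ({minutes / 60:.1f} h) per 30 days")
```

Under that assumption, 99.9% allows roughly 43 minutes of downtime per window, while the 99.0% and 98.0% thresholds correspond to about 7.2 and 14.4 hours.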

Run frontier models in your metro.

Zero egress, in-metro serving, at Bedrock-beating prices. Your data never leaves the city.