Pricing that scales from token to bare metal

Three ways to run frontier open-weights models in your customer's metro. Zero egress on every tier.

Save up to 60% on 36-month reserved capacity
Volt Spark
Tokens-as-a-service. OpenAI-compatible. Llama 70B standard rate.
$0.95/M
Standard rate · committed volume
Get started
  • OpenAI drop-in: change base URL + key
  • Zero egress, in-metro serving
  • Sovereign tier: pod-pinned + attestation ($1.45/M)
Volt Forge
GPU-as-a-service. Dedicated leases in your namespace.
$2.36/GPU/hr
NVIDIA B200 · 36-mo reserved
Get started
  • Everything in Spark plus:
  • NVIDIA B200 + L40S capacity
  • Scoped kubeconfig into a dedicated namespace
  • Reserved: 45% off at 12-mo, 60% off at 36-mo
  • 31% below CoreWeave list
  • 99.9% uptime SLA (Tier III)
Volt Vault
Dedicated bare-metal. Sovereign by default.
$85,000/mo
8-GPU B200 rack · 36-mo
Get started
  • Everything in Forge plus:
  • Single-tenant bare-metal racks
  • Measured-boot attestation per node
  • SPIFFE federation into your trust domain
  • Zero ingress, egress, and inter-pod transfer
  • 8-GPU B200 rack, 36-mo

Frequently Asked Questions

Answers on zero egress, model catalog, pricing, and the sovereign tier behind Volt's inference cloud.

Platform

Models & Pricing

Reliability

Can't find what you're looking for? Reach our team

Run frontier models in your metro.

Zero egress, in-metro serving, at Bedrock-beating prices. Your data never leaves the city.