Pricing that scales from token to bare metal
Three ways to run frontier open-weights models in your customer's metro. Zero egress on every tier.
Save up to 60% on 36-month reserved capacity
Volt Spark
Tokens-as-a-service. OpenAI-compatible. Llama 70B standard rate.
Standard rate · committed volume
- OpenAI drop-in: change base URL + key
- Zero egress, in-metro serving
- Sovereign tier: pod-pinned + attestation ($1.45/M)
Volt Forge
GPU-as-a-service. Dedicated leases in your namespace.
NVIDIA B200 · 36-mo reserved
- Everything in Spark plus:
- NVIDIA B200 + L40S capacity
- Scoped kubeconfig into a dedicated namespace
- Reserved: 45% off at 12-mo, 60% off at 36-mo
- 31% below CoreWeave list
- 99.9% uptime SLA (Tier III)
Volt Vault
Dedicated bare-metal. Sovereign by default.
8-GPU B200 rack · 36-mo
- Everything in Forge plus:
- Single-tenant bare-metal racks
- Measured-boot attestation per node
- SPIFFE federation into your trust domain
- Zero ingress, egress, and inter-pod transfer
- 8-GPU B200 rack, 36-mo
Frequently Asked Questions
Answers on zero egress, model catalog, pricing, and the sovereign tier behind Volt's inference cloud.
Platform
Models & Pricing
Reliability
Can't find what you're looking for? Reach our team
Run frontier models in your metro.
Zero egress, in-metro serving, at Bedrock-beating prices. Your data never leaves the city.