Run 70B models in your customer's metro. At Bedrock prices. Without your data leaving the city.
A distributed Tier-3 inference fabric across 25+ metro pods. Multi-vendor GPU, zero egress, Kubernetes-native.
Built on a multi-vendor, CNCF-native stack
Built on the CNCF cloud-native AI stack
Capabilities
Zero Egress
Data residency is structural, not contractual. Zero ingress, zero egress, zero inter-pod transfer across every SKU.
Cryptographic Attestation
Measured-boot attestation per node and SPIFFE workload identity. Prove where your inference ran, signed end to end.
Multi-Vendor GPU
NVIDIA B200 and L40S today. AMD MI355X and Intel Gaudi 3 as the stack hardens. One control plane across all of it.
OpenAI-Compatible API
A drop-in endpoint. Change the base URL and key, keep your existing SDKs and tooling unchanged.
Kubernetes-Native Control Plane
CNCF-native by design: llm-d, KServe, Kueue, Cilium, and SPIRE. Scoped kubeconfig into your namespace, contributed back upstream.
Sovereign inference, by the numbers
Frontier open-weights LLMs in your customer's metro, with zero egress and Bedrock-beating prices across every pod.
Metro pods
Uptime SLA
Llama 70B tokens
Built on the tools your team already runs
Open standards end to end — OpenAI-compatible APIs, Kubernetes, and the CNCF stack. No proprietary lock-in.
OpenAI-compatible API
Spark is a drop-in OpenAI endpoint. Change the base URL and key — your existing SDK calls just work.
Kubernetes
Kubernetes-native by design. Forge leases land as scoped kubeconfigs into a dedicated namespace.
vLLM
vLLM-direct serving at the data plane for high-throughput open-weights inference, with llm-d as the stack hardens.
KServe
CNCF KServe drives model lifecycle and autoscaling across every pod. Contributed back upstream.
SPIRE / SPIFFE
Workload identity via SPIRE-issued SVIDs. Federate Vault nodes into your own trust domain.
Cilium
Cilium enforces default-deny egress at L3/L4. Data residency is structural, not contractual.
Terraform
Provision pods, namespaces, and leases as code. Reproducible, auditable infrastructure across metros.
voltctl CLI
Manage tenants, models, and reserved capacity from the command line. Scriptable and CI-friendly.
Pricing that scales from token to bare metal
Three ways to run frontier open-weights models in your customer's metro. Zero egress on every tier.
- OpenAI drop-in: change base URL + key
- Zero egress, in-metro serving
- Sovereign tier: pod-pinned + attestation ($1.45/M)
- Everything in Spark plus:
- NVIDIA B200 + L40S capacity
- Scoped kubeconfig into a dedicated namespace
- Reserved: 45% off at 12-mo, 60% off at 36-mo
- 31% below CoreWeave list
- 99.9% uptime SLA (Tier III)
- Everything in Forge plus:
- Single-tenant bare-metal racks
- Measured-boot attestation per node
- SPIFFE federation into your trust domain
- Zero ingress, egress, and inter-pod transfer
- 8-GPU B200 rack, 36-mo
Data residency you can prove beats data residency you promise. Zero egress is structural, not contractual — data never leaves the metro, and every request is bound to a tenant identity in an immutable audit log.
Volt
Platform thesis — Sovereign Inference Cloud
The team behind Volt

Angel Ramirez
CEO · CNCF Ambassador, founder of Cuemby

Cristher Castro
CCO · Talent, financial discipline, international ops

Hitomi Mizugaki
CPO · Product, agile, customer growth
Backed by operators
Advised by Nick Lashinsky, Jim Chappell, and James Leaverton. The founding team has completed 120+ tech due-diligence engagements.
Frequently Asked Questions
Answers on zero egress, model catalog, pricing, and the sovereign tier behind Volt's inference cloud.
Platform
Models & Pricing
Reliability
Can't find what you're looking for? Reach our team
Run frontier models in your metro.
Zero egress, in-metro serving, at Bedrock-beating prices. Your data never leaves the city.