Run 70B models in your customer's metro. At Bedrock prices. Without your data leaving the city.

A distributed Tier-3 inference fabric across 25+ metro pods. Multi-vendor GPU, zero egress, Kubernetes-native.

background
background

Built on a multi-vendor, CNCF-native stack

NVIDIAAMDIntelCNCFKubernetesvLLMCilium

Built on the CNCF cloud-native AI stack

NVIDIA
AMD
Intel
CNCF
Kubernetes
vLLM
KServe
Cilium
Summit HQ
Supermicro
Arista

Capabilities

Zero Egress

Data residency is structural, not contractual. Zero ingress, zero egress, zero inter-pod transfer across every SKU.

Cryptographic Attestation

Measured-boot attestation per node and SPIFFE workload identity. Prove where your inference ran, signed end to end.

Multi-Vendor GPU

NVIDIA B200 and L40S today. AMD MI355X and Intel Gaudi 3 as the stack hardens. One control plane across all of it.

OpenAI-Compatible API

A drop-in endpoint. Change the base URL and key, keep your existing SDKs and tooling unchanged.

Kubernetes-Native Control Plane

CNCF-native by design: llm-d, KServe, Kueue, Cilium, and SPIRE. Scoped kubeconfig into your namespace, contributed back upstream.

Sovereign inference, by the numbers

Frontier open-weights LLMs in your customer's metro, with zero egress and Bedrock-beating prices across every pod.

25+

Metro pods

99.9%

Uptime SLA

$0.95/M

Llama 70B tokens

Built on the tools your team already runs

Open standards end to end — OpenAI-compatible APIs, Kubernetes, and the CNCF stack. No proprietary lock-in.

OpenAI-compatible API

Spark is a drop-in OpenAI endpoint. Change the base URL and key — your existing SDK calls just work.

Kubernetes

Kubernetes-native by design. Forge leases land as scoped kubeconfigs into a dedicated namespace.

vLLM

vLLM-direct serving at the data plane for high-throughput open-weights inference, with llm-d as the stack hardens.

KServe

CNCF KServe drives model lifecycle and autoscaling across every pod. Contributed back upstream.

SPIRE / SPIFFE

Workload identity via SPIRE-issued SVIDs. Federate Vault nodes into your own trust domain.

Cilium

Cilium enforces default-deny egress at L3/L4. Data residency is structural, not contractual.

Terraform

Provision pods, namespaces, and leases as code. Reproducible, auditable infrastructure across metros.

voltctl CLI

Manage tenants, models, and reserved capacity from the command line. Scriptable and CI-friendly.

Pricing that scales from token to bare metal

Three ways to run frontier open-weights models in your customer's metro. Zero egress on every tier.

Save up to 60% on 36-month reserved capacity
Volt Spark
Tokens-as-a-service. OpenAI-compatible. Llama 70B standard rate.
$0.95/M
Standard rate · committed volume
Get started
  • OpenAI drop-in: change base URL + key
  • Zero egress, in-metro serving
  • Sovereign tier: pod-pinned + attestation ($1.45/M)
Volt Forge
GPU-as-a-service. Dedicated leases in your namespace.
$2.36/GPU/hr
NVIDIA B200 · 36-mo reserved
Get started
  • Everything in Spark plus:
  • NVIDIA B200 + L40S capacity
  • Scoped kubeconfig into a dedicated namespace
  • Reserved: 45% off at 12-mo, 60% off at 36-mo
  • 31% below CoreWeave list
  • 99.9% uptime SLA (Tier III)
Volt Vault
Dedicated bare-metal. Sovereign by default.
$85,000/mo
8-GPU B200 rack · 36-mo
Get started
  • Everything in Forge plus:
  • Single-tenant bare-metal racks
  • Measured-boot attestation per node
  • SPIFFE federation into your trust domain
  • Zero ingress, egress, and inter-pod transfer
  • 8-GPU B200 rack, 36-mo

Data residency you can prove beats data residency you promise. Zero egress is structural, not contractual — data never leaves the metro, and every request is bound to a tenant identity in an immutable audit log.

Volt

Platform thesis — Sovereign Inference Cloud

The team behind Volt

Angel Ramirez

Angel Ramirez

CEO · CNCF Ambassador, founder of Cuemby

Cristher Castro

Cristher Castro

CCO · Talent, financial discipline, international ops

Hitomi Mizugaki

Hitomi Mizugaki

CPO · Product, agile, customer growth

Backed by operators

Advised by Nick Lashinsky, Jim Chappell, and James Leaverton. The founding team has completed 120+ tech due-diligence engagements.

Frequently Asked Questions

Answers on zero egress, model catalog, pricing, and the sovereign tier behind Volt's inference cloud.

Platform

Models & Pricing

Reliability

Can't find what you're looking for? Reach our team

Run frontier models in your metro.

Zero egress, in-metro serving, at Bedrock-beating prices. Your data never leaves the city.