Pricing — Veri

GPU compute

Customer price

GPU-hour table: rates are loaded live on the page.

Fine-tuning

Estimated per 1M tokens

Model Size | SFT | Preference/RL
Up to 8B | $0.45 | $1.10
9B-34B | $0.90 | $2.20
35B-70B | $1.80 | $4.40
70B+ | Custom | Custom

The per-1M-token figures above are estimates. Billing is based on GPU-hours, runtime, and storage: you pay for compute time, not a per-token markup.
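As a rough illustration of how the estimated per-1M-token rates translate into a job budget, here is a minimal Python sketch. The tier keys, method keys, and the function name `estimate_finetune_cost` are hypothetical and chosen only for this example. Actual invoices depend on GPU-hours, runtime, and storage, so the result is a ballpark rather than a quote.

```python
# Estimated USD rates per 1M tokens, copied from the fine-tuning table.
# The dictionary keys are hypothetical labels, not official identifiers.
RATES_PER_1M = {
    "up_to_8b": {"sft": 0.45, "preference_rl": 1.10},
    "9b_34b": {"sft": 0.90, "preference_rl": 2.20},
    "35b_70b": {"sft": 1.80, "preference_rl": 4.40},
}


def estimate_finetune_cost(tier: str, method: str, tokens: int) -> float:
    """Return a ballpark cost in USD for training on `tokens` tokens.

    The 70B+ tier is custom-priced, so it is deliberately not covered.
    """
    if tier not in RATES_PER_1M:
        raise ValueError(f"no listed rate for tier {tier!r} (70B+ is custom)")
    if method not in RATES_PER_1M[tier]:
        raise ValueError(f"unknown method {method!r}")
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return round(tokens / 1_000_000 * RATES_PER_1M[tier][method], 2)


if __name__ == "__main__":
    # 200M tokens of SFT on a model of up to 8B parameters.
    print(estimate_finetune_cost("up_to_8b", "sft", 200_000_000))  # 90.0
```

The sketch takes a tier label rather than a parameter count because the table does not say which tier applies to sizes that fall between or on the listed boundaries (for example, exactly 70B).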

Deployments

GPU-hour based

Mode | Billing | Notes
On-demand | GPU-hour table | Autoscale by endpoint capacity
Dedicated | Contact us | Reserved GPUs for production traffic

Clusters

Capacity

Type | Availability | Price
On-demand | Auto-selected providers | GPU-hour table
Dedicated | Reserved capacity | Contact us

Spot pricing is not exposed as a user-facing option yet. Current jobs auto-select configured providers unless a provider is specified.

Storage

Monthly

Item | Price | Unit
Veri Volumes | $0.085 | GiB-month after 512 GiB free
Checkpoint storage | $0.023 | GiB-month
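The storage rates can be combined into a simple monthly estimate. The Python sketch below assumes the 512 GiB free allowance applies to Veri Volumes only and is counted once across total volume usage. The page does not state whether the allowance is per account or per volume, so treat that as an assumption. The function name is hypothetical.

```python
VOLUME_RATE = 0.085        # USD per GiB-month, Veri Volumes
VOLUME_FREE_GIB = 512      # free allowance (assumed to apply once, to volumes only)
CHECKPOINT_RATE = 0.023    # USD per GiB-month, checkpoint storage


def monthly_storage_cost(volume_gib: float, checkpoint_gib: float) -> float:
    """Return an estimated monthly storage bill in USD."""
    if volume_gib < 0 or checkpoint_gib < 0:
        raise ValueError("sizes must be non-negative")
    # Only volume usage beyond the free allowance is billed.
    billable_volume = max(0.0, volume_gib - VOLUME_FREE_GIB)
    total = billable_volume * VOLUME_RATE + checkpoint_gib * CHECKPOINT_RATE
    return round(total, 2)


if __name__ == "__main__":
    # 1 TiB of volumes plus 200 GiB of checkpoints.
    print(monthly_storage_cost(1024, 200))  # 48.12
```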

Dedicated serving

Sized to your traffic · compared vs frontier API list prices

Compute Costs | Input TPM/GPU | Cached TPM/GPU | Output TPM/GPU | Price GPU/Min
MiniMax M3 | 138,840 | 694,200 | 23,140 | —
GLM-5.2 | 35,731 | 192,400 | 9,620 | —
Qwen3 235B | 52,000 | 260,000 | 11,500 | —

Comparison

Your traffic on Veri vs frontier API list prices

Calculator inputs: Model | Compare | Peak requests/s | Cache Hit Rate (%) | Input tokens/Request | Output tokens/Request

Estimated number of H100s: 33

Estimated Monthly Cost and Estimated Monthly Savings are computed live from these inputs.

Estimates assume continuous 24/7 provisioning (43,800 min/mo) sized to your peak load, billed at an estimated $2.50/GPU-hr (live rate temporarily unavailable). Savings compare against the selected frontier model's published list price (cached input billed at the provider's caching discount) on the same traffic profile. Throughput per GPU varies with context length and serving configuration — talk to us for a sized quote.
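The provisioning arithmetic described above can be sketched in Python. The cost function follows the stated assumptions directly (43,800 minutes per month at an estimated $2.50 per GPU-hour). The GPU-count function is only a guess at a plausible sizing rule: it assumes uncached input, cached input, and output tokens each consume a share of a GPU in proportion to the per-GPU tokens-per-minute figures and that those shares add linearly. The page does not publish its sizing formula, so the function names and that rule are assumptions for illustration.

```python
import math

MINUTES_PER_MONTH = 43_800          # 24/7 provisioning, as stated on the page
FALLBACK_RATE_PER_GPU_HOUR = 2.50   # estimated USD rate quoted on the page

# Tokens per minute per GPU for one model from the dedicated serving table.
QWEN3_235B_TPM = {"input": 52_000, "cached": 260_000, "output": 11_500}


def estimate_gpus(tpm: dict, peak_rps: float, cache_hit_pct: float,
                  input_tokens: int, output_tokens: int) -> int:
    """Guess the GPU count needed at peak load (assumed additive sizing rule)."""
    if not 0 <= cache_hit_pct <= 100:
        raise ValueError("cache_hit_pct must be between 0 and 100")
    requests_per_min = peak_rps * 60
    hit = cache_hit_pct / 100
    cached_in = requests_per_min * input_tokens * hit
    fresh_in = requests_per_min * input_tokens * (1 - hit)
    out = requests_per_min * output_tokens
    # Sum the fraction of a GPU each token class would occupy.
    load = (fresh_in / tpm["input"]
            + cached_in / tpm["cached"]
            + out / tpm["output"])
    return max(1, math.ceil(load))


def monthly_cost(gpus: int,
                 rate_per_gpu_hour: float = FALLBACK_RATE_PER_GPU_HOUR) -> float:
    """Cost of keeping `gpus` GPUs provisioned for the whole month, in USD."""
    return gpus * MINUTES_PER_MONTH * rate_per_gpu_hour / 60


if __name__ == "__main__":
    print(monthly_cost(3))  # 5475.0
    gpus = estimate_gpus(QWEN3_235B_TPM, peak_rps=2, cache_hit_pct=50,
                         input_tokens=2000, output_tokens=500)
    print(gpus, monthly_cost(gpus))
```

Because throughput per GPU varies with context length and serving configuration, a sketch like this is useful only for order-of-magnitude planning before requesting a sized quote.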