White paper xxx · AI Vendors · PTU

AI vendors PTU commitment design.

Provisioned Throughput Units (PTUs) are the reserved-capacity commercial construct the major AI substrates have converged on for steady-state generative workloads. Azure OpenAI sells PTUs in deployment-zone reservations. AWS Bedrock sells the equivalent as Provisioned Throughput model units. Anthropic and OpenAI offer enterprise reserved-throughput agreements on their direct APIs. Google Vertex AI sells dedicated throughput as a reservation construct. The constructs are not interchangeable, and the buyer-side procurement methodology must read them as five separate commercial surfaces sized against a single workload. This 24-page paper sets out the cross-vendor PTU sizing arithmetic, the deployment-zone affinity rules, the model deprecation protection, the IP indemnification position across the substrates, the multi-cloud portability architecture and the renewal posture inside each vendor’s commercial cycle. Buyer-side. Independent. Gated.

Format: White paper, gated
Pages: 24
Audience: CIO, CDO, ML Platform, Procurement
Published: December 2025
Updated: May 2026

A senior Admodum advisor will follow up to confirm receipt and offer a private read of the document if you would prefer a guided walkthrough. There is no obligation. The paper is the deliverable.

Contents

Inside the 24 pages.

i. The PTU construct across five substrates
Provisioned Throughput Units as a converged commercial form. Azure OpenAI PTU, AWS Bedrock Provisioned Throughput, Anthropic enterprise reserved throughput, OpenAI enterprise reservations, Vertex AI dedicated throughput.
ii. Sizing arithmetic
Tokens-per-minute per PTU, the time-to-first-token guarantee, the latency-versus-throughput trade and the buyer-side workload-to-PTU calculation that anchors the reservation count.
iii. PTU versus PAYG break-even
The pay-as-you-go (per-token) construct read against the reserved (PTU per hour) construct. Utilisation thresholds, monthly amortisation, hidden costs in burst routing and the buyer-side break-even read.
iv. Deployment-zone affinity
Azure PTU regional binding, Bedrock cross-region inference, Vertex AI multi-region replication and the data-residency, latency and failover rules each construct imposes on the procurement.
v. Reservation tenor and re-commitment
Monthly, quarterly, annual and three-year tenor across the substrates. The discount curve, the cancellation posture, the over-commit allowance and the under-commit protection.
vi. Model deprecation protection
The published model lifecycle policies across the five substrates. Migration windows, deprecated-model continuity guarantees, the buyer-side reservation-versus-lifecycle alignment read.
vii. IP indemnification position
Azure Customer Copyright Commitment, AWS Bedrock IP indemnification, Anthropic and OpenAI enterprise indemnification language, Google Vertex AI indemnification. The carve-outs, caps and survival periods.
viii. Multi-cloud portability architecture
The abstraction interface, the routing protocol, the cost-versus-latency arbitration and the operational protocol for re-routing a workload between substrates without re-engineering the application layer.
ix. Renewal posture
The PTU renewal inside the parent commercial cycle: Azure MACC, AWS EDP, Google EDP. The standalone enterprise API renewals. The buyer-side position paper and the BATNA.
x. Reading list and references
Companion papers on Azure MACC, AWS Bedrock commitment, GCP EDP, Salesforce Agentforce, Workday Illuminate AI and the AI vendors practice.
Excerpt · Section II

The sizing is not the marketing forecast.

Every PTU substrate publishes a target tokens-per-minute throughput per reserved unit, conditional on a stated latency profile. The buyer-side sizing methodology begins with three inputs: the deployed workload’s sustained tokens-per-minute (drawn from telemetry, not from the marketing forecast), the workload’s latency tolerance (time-to-first-token in milliseconds), and the workload’s burst envelope (the ratio of peak to sustained call volume).

A PTU sized against an aspirational rollout is a reserved unit running at shelfware utilisation.

The reservation count is then sized so that the sustained throughput sits inside the reservation’s steady-state capacity, with the burst envelope handled either by a spillover routing rule to the pay-as-you-go endpoint or by an additional reservation that absorbs the peak. The resulting count is rarely the one the publisher account team will propose; the publisher’s forecast typically anchors at the rollout target rather than at the trailing-period sustained read.
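The reservation count described above can be sketched in a few lines of Python. This is a minimal illustration and not the paper's worked method: the names `Workload`, `reservation_count` and `spillover_tpm` are hypothetical, `tpm_per_unit` stands in for whatever per-unit throughput the vendor publishes at the workload's latency profile, and the sketch assumes token volume scales in proportion to call volume during a burst.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    sustained_tpm: float  # trailing-period sustained tokens per minute, from telemetry
    burst_ratio: float    # peak-to-sustained ratio, 1.0 or higher


def reservation_count(workload: Workload, tpm_per_unit: float,
                      absorb_peak: bool = False) -> int:
    """Reserved units needed to hold the sized load inside steady-state capacity.

    tpm_per_unit is an assumed input: the vendor's published per-unit
    throughput at the workload's latency profile.
    """
    if tpm_per_unit <= 0:
        raise ValueError("tpm_per_unit must be positive")
    if workload.burst_ratio < 1.0:
        raise ValueError("burst_ratio must be at least 1.0")
    # Size against sustained load by default; size against peak only when
    # the reservation itself is meant to absorb the burst.
    load = workload.sustained_tpm * (workload.burst_ratio if absorb_peak else 1.0)
    return math.ceil(load / tpm_per_unit)


def spillover_tpm(workload: Workload, tpm_per_unit: float, units: int) -> float:
    """Peak tokens per minute that exceed reserved capacity and route to pay-as-you-go."""
    peak = workload.sustained_tpm * workload.burst_ratio
    return max(0.0, peak - units * tpm_per_unit)


# Illustrative figures only, not vendor numbers.
w = Workload(sustained_tpm=120_000, burst_ratio=1.5)
units_sustained = reservation_count(w, tpm_per_unit=50_000)
units_peak = reservation_count(w, tpm_per_unit=50_000, absorb_peak=True)
overflow = spillover_tpm(w, tpm_per_unit=50_000, units=units_sustained)
```

With these illustrative inputs, sizing against the sustained read gives three units with the residual peak routed to pay-as-you-go, while absorbing the peak inside the reservation takes a fourth unit. Which of the two is cheaper depends on the per-token and per-unit prices, which this sketch does not model.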

The buyer-side discipline is therefore to size the reservation against the smaller of the two reads (trailing-period sustained, conservative forward forecast), and to leave the upside as pay-as-you-go capacity until the rollout confirms the larger figure. The Admodum AI vendors practice sizes PTU reservations against this methodology across every engagement.
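The smaller-of-two-reads rule can be expressed directly. The following is a minimal sketch under stated assumptions: the function name `sizing_basis` and its parameters are hypothetical, and both reads are assumed to be expressed in tokens per minute.

```python
def sizing_basis(trailing_sustained_tpm: float,
                 forward_forecast_tpm: float) -> tuple[float, float]:
    """Return (reserved_basis_tpm, payg_upside_tpm).

    The reservation is sized against the smaller read; the gap up to the
    larger read stays on pay-as-you-go until the rollout confirms it.
    """
    if trailing_sustained_tpm < 0 or forward_forecast_tpm < 0:
        raise ValueError("throughput reads must be non-negative")
    reserved = min(trailing_sustained_tpm, forward_forecast_tpm)
    upside = max(trailing_sustained_tpm, forward_forecast_tpm) - reserved
    return reserved, upside


# Illustrative figures only.
reserved_basis, payg_upside = sizing_basis(120_000, 200_000)
```

In this example the reservation is sized against 120,000 tokens per minute and the remaining 80,000 is left as pay-as-you-go headroom until telemetry confirms the larger figure.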

This paper sets out the ten-section cross-vendor protocol Admodum applies to the PTU decision, with the sizing arithmetic, the break-even, the deployment-zone affinity, the deprecation protection and the multi-cloud portability each carried through to the closing position.

Request the white paper

Request the 24-page paper.

Corporate email only. The paper is sent by a senior Admodum advisor. No marketing list, no third-party distribution. Submission confirms you have read the independence statement at the foot of this page.

Corporate email required · No marketing list · Reply within one business day
Independence
Admodum is not a partner, reseller, or affiliate of Microsoft, AWS, OpenAI, Anthropic, Google, Mistral, Meta, Cohere or any other software vendor. No reseller margin, no implementation-partner fee, no certified-consulting commission.
Software licensing white paper

Bring an advisor to the PTU decision.

The Admodum AI vendors practice sizes PTU commitments inside the Renewal Programme and the Benchmarking Programme. Engagements run as fixed fee, contingency or annual retainer.