White paper xxx · AI Vendors · PTU

AI vendors PTU commitment design.

Provisioned Throughput Units (PTUs) are the reserved-capacity commercial construct the major AI substrates have converged on for steady-state generative workloads. Azure OpenAI sells PTUs in deployment-zone reservations. AWS Bedrock sells the equivalent as Provisioned Throughput model units. Anthropic and OpenAI offer enterprise reserved-throughput agreements on their direct APIs. Google Vertex AI sells dedicated throughput as a reservation construct. The constructs are not interchangeable, and the buyer-side procurement methodology must read them as five separate commercial surfaces sized against a single workload. This 24-page paper sets out the cross-vendor PTU sizing arithmetic, the deployment-zone affinity rules, the model deprecation protection, the IP indemnification position across the substrates, the multi-cloud portability architecture and the renewal posture inside each vendor’s commercial cycle. Buyer-side. Independent. Gated.

Format: White paper, gated

Pages: 24

Audience: CIO, CDO, ML Platform, Procurement

Published: December 2025

Updated: May 2026

A senior Admodum advisor will follow up to confirm receipt and offer a private read of the document if you would prefer a guided walkthrough. There is no obligation. The paper is the deliverable.

Contents

Inside the 24 pages.

i.

The PTU construct across five substrates

Provisioned Throughput Units as a converged commercial form. Azure OpenAI PTU, AWS Bedrock Provisioned Throughput, Anthropic enterprise reserved throughput, OpenAI enterprise reservations, Vertex AI dedicated throughput.

ii.

Sizing arithmetic

Tokens-per-minute per PTU, the time-to-first-token guarantee, the latency-versus-throughput trade and the buyer-side workload-to-PTU calculation that anchors the reservation count.

iii.

PTU versus PAYG break-even

The pay-as-you-go (per-token) construct read against the reserved (PTU per hour) construct. Utilisation thresholds, monthly amortisation, hidden costs in burst routing and the buyer-side break-even read.

iv.

Deployment-zone affinity

Azure PTU regional binding, Bedrock cross-region inference, Vertex AI multi-region replication and the data-residency, latency and failover rules each construct imposes on the procurement.

v.

Reservation tenor and re-commitment

Monthly, quarterly, annual and three-year tenor across the substrates. The discount curve, the cancellation posture, the over-commit allowance and the under-commit protection.

vi.

Model deprecation protection

The published model lifecycle policies across the five substrates. Migration windows, deprecated-model continuity guarantees, the buyer-side reservation-versus-lifecycle alignment read.

vii.

IP indemnification position

Azure Customer Copyright Commitment, AWS Bedrock IP indemnification, Anthropic and OpenAI enterprise indemnification language, Google Vertex AI indemnification. The carve-outs, caps and survival periods.

viii.

Multi-cloud portability architecture

The abstraction interface, the routing protocol, the cost-versus-latency arbitration and the operational protocol for re-routing a workload between substrates without re-engineering the application layer.

ix.

Renewal posture

The PTU renewal inside the parent commercial cycle: Azure MACC, AWS EDP, Google EDP. The standalone enterprise API renewals. The buyer-side position paper and the BATNA.

x.

Reading list and references

Companion papers on Azure MACC, AWS Bedrock commitment, GCP EDP, Salesforce Agentforce, Workday Illuminate AI and the AI vendors practice.
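The break-even read named under section iii can be illustrated with a minimal sketch. It assumes a single blended pay-as-you-go price per 1,000 tokens and a flat hourly price per reserved unit; real price lists usually separate input and output tokens and vary by tenor, and the hidden burst-routing costs are not modelled. The function name, parameters and figures are hypothetical and are not taken from any vendor's price list.

```python
def break_even_utilisation(unit_cost_per_hour: float,
                           tpm_per_unit: float,
                           payg_price_per_1k_tokens: float) -> float:
    """Utilisation above which a reserved unit is cheaper than pay-as-you-go.

    Assumption: one blended per-token price. A result above 1.0 means the
    reservation never breaks even at the stated prices.
    """
    if unit_cost_per_hour < 0 or tpm_per_unit <= 0 or payg_price_per_1k_tokens <= 0:
        raise ValueError("prices and throughput must be positive")
    # Tokens one unit can serve in an hour when fully utilised.
    tokens_per_hour = tpm_per_unit * 60
    # What those tokens would cost on the pay-as-you-go construct.
    payg_cost_at_full_load = tokens_per_hour / 1000 * payg_price_per_1k_tokens
    return unit_cost_per_hour / payg_cost_at_full_load


# Hypothetical figures: 2.00 per unit-hour, 50,000 tokens per minute per unit,
# 0.001 per 1,000 tokens on pay-as-you-go.
threshold = break_even_utilisation(2.00, 50_000, 0.001)
print(f"Break-even utilisation: {threshold:.0%}")
```

With these illustrative inputs the reservation pays for itself only if the unit runs at roughly two thirds of its rated capacity or more, averaged over the billing period.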

Excerpt · Section II

The sizing is not the marketing forecast.

Every PTU substrate publishes a target tokens-per-minute throughput per reserved unit, conditional on a stated latency profile. The buyer-side sizing methodology begins with the deployed workload’s sustained tokens-per-minute (drawn from telemetry, not from the marketing forecast), the workload’s latency tolerance (time-to-first-token in milliseconds), and the workload’s burst envelope (the ratio of peak-to-sustained call volume).

A PTU sized against an aspirational rollout is a reserved unit running at shelfware utilisation.

The reservation count is then sized so that the sustained throughput sits inside the reservation’s steady-state capacity, with the burst envelope handled either by a spillover routing rule to the pay-as-you-go endpoint or by an additional reservation that absorbs the peak. The resulting count is rarely the one the vendor’s account team will propose; the account team’s forecast typically anchors at the rollout target rather than at the trailing-period sustained read.
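The workload-to-PTU calculation described here can be sketched as follows. This is an illustration under stated assumptions, not the paper's own arithmetic: it assumes token volume scales with call volume (so the peak-to-sustained ratio applies to tokens per minute), it ignores any vendor minimum purchase increment, and `tpm_per_unit` stands in for whatever throughput the substrate publishes at the workload's latency profile. All names and figures are hypothetical.

```python
import math


def reservation_count(sustained_tpm: float,
                      tpm_per_unit: float,
                      burst_ratio: float = 1.0,
                      absorb_peak: bool = False) -> tuple[int, float]:
    """Return (reserved units, tokens per minute left to spill over at peak).

    absorb_peak=False sizes to the sustained read and leaves the burst to a
    pay-as-you-go spillover rule; absorb_peak=True sizes to the peak instead.
    """
    if tpm_per_unit <= 0:
        raise ValueError("tpm_per_unit must be positive")
    if burst_ratio < 1.0:
        raise ValueError("burst_ratio is peak/sustained and cannot be below 1")
    peak_tpm = sustained_tpm * burst_ratio
    target_tpm = peak_tpm if absorb_peak else sustained_tpm
    units = math.ceil(target_tpm / tpm_per_unit)
    # Peak traffic beyond reserved capacity goes to the pay-as-you-go endpoint.
    spillover_tpm = max(0.0, peak_tpm - units * tpm_per_unit)
    return units, spillover_tpm


# Hypothetical workload: 120,000 sustained tokens per minute, 2x burst,
# 50,000 tokens per minute per reserved unit.
print(reservation_count(120_000, 50_000, burst_ratio=2.0))
print(reservation_count(120_000, 50_000, burst_ratio=2.0, absorb_peak=True))
```

In this example the spillover option reserves three units and routes up to 90,000 tokens per minute to pay-as-you-go at peak, while absorbing the peak requires five units and leaves nothing to spill.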

The buyer-side discipline is therefore to size the reservation against the smaller of the two reads (trailing-period sustained, conservative forward forecast), and to leave the upside as pay-as-you-go capacity until the rollout confirms the larger figure. The Admodum AI vendors practice sizes PTU reservations against this methodology across every engagement.
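The smaller-of-two-reads rule reduces to a short calculation. The sketch below is illustrative only; the function name and the figures are hypothetical, and both reads are assumed to be expressed in the same unit (sustained tokens per minute).

```python
def sizing_basis(trailing_sustained_tpm: float,
                 forward_forecast_tpm: float) -> tuple[float, float]:
    """Return (throughput to reserve against, upside left on pay-as-you-go)."""
    if trailing_sustained_tpm < 0 or forward_forecast_tpm < 0:
        raise ValueError("throughput reads cannot be negative")
    # Reserve against the smaller read; the gap stays on per-token pricing
    # until the rollout confirms the larger figure.
    reserved_basis = min(trailing_sustained_tpm, forward_forecast_tpm)
    payg_upside = max(trailing_sustained_tpm, forward_forecast_tpm) - reserved_basis
    return reserved_basis, payg_upside


# Hypothetical reads: telemetry shows 80,000 tokens per minute sustained,
# the conservative forward forecast is 140,000.
print(sizing_basis(80_000, 140_000))
```

Here the reservation is sized against 80,000 tokens per minute, and the remaining 60,000 is carried as pay-as-you-go capacity.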

This paper sets out the ten-section cross-vendor protocol Admodum applies to the PTU decision, with the sizing arithmetic, the break-even, the deployment-zone affinity, the deprecation protection and the multi-cloud portability each carried through to the closing position.

Request the white paper

Request the 24-page paper.

Corporate email only. The paper is sent by a senior Admodum advisor. No marketing list, no third-party distribution. Submission confirms you have read the independence statement at the foot of this page.


Corporate email required · No marketing list · Reply within one business day

