Provisioned Throughput Units (PTUs) are the reserved-capacity commercial construct the major AI substrates have converged on for steady-state generative workloads. Azure OpenAI sells PTUs in deployment-zone reservations. AWS Bedrock sells the equivalent as Provisioned Throughput model units. Anthropic and OpenAI offer enterprise reserved-throughput agreements on their direct APIs. Google Vertex AI sells dedicated throughput as a reservation construct. The constructs are not interchangeable, and the buyer-side procurement methodology must read them as five separate commercial surfaces sized against a single workload. This 24-page paper sets out the cross-vendor PTU sizing arithmetic, the deployment-zone affinity rules, the model deprecation protection, the IP indemnification position across the substrates, the multi-cloud portability architecture and the renewal posture inside each vendor’s commercial cycle. Buyer-side. Independent. Gated.
A senior Admodum advisor will follow up to confirm receipt and offer a private read of the document if you would prefer a guided walkthrough. There is no obligation. The paper is the deliverable.
Every PTU substrate publishes a target tokens-per-minute throughput per reserved unit, conditional on a stated latency profile. The buyer-side sizing methodology begins with the deployed workload’s sustained tokens-per-minute (drawn from telemetry, not from the marketing forecast), the workload’s latency tolerance (time-to-first-token in milliseconds), and the workload’s burst envelope (the ratio of peak to sustained call volume).
The reservation count is then sized so that the sustained throughput sits inside the reservation’s steady-state capacity, with the burst envelope handled either by a spillover routing rule to the pay-as-you-go endpoint or by an additional reservation that absorbs the peak. The resulting count is rarely the number the publisher account team will propose; the publisher’s forecast typically anchors at the rollout target rather than at the trailing-period sustained read.
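The reservation count described above can be sketched as a short calculation. This is an illustrative sketch only, not the paper's arithmetic: the function names, parameter names and figures are hypothetical, and it assumes the per-unit tokens-per-minute figure is the one the vendor publishes for the workload's latency profile.

```python
import math


def reservation_units(sustained_tpm, unit_tpm, burst_ratio=1.0, absorb_peak=False):
    """Count reserved units needed for a workload (hypothetical helper).

    sustained_tpm: trailing sustained tokens per minute, read from telemetry.
    unit_tpm:      published tokens per minute per reserved unit at the
                   workload's latency profile.
    burst_ratio:   peak-to-sustained ratio (must be >= 1.0).
    absorb_peak:   False sizes to the sustained load and leaves the burst to
                   pay-as-you-go spillover; True sizes to the peak.
    """
    if sustained_tpm < 0 or unit_tpm <= 0 or burst_ratio < 1.0:
        raise ValueError("expected sustained_tpm >= 0, unit_tpm > 0, burst_ratio >= 1")
    target_tpm = sustained_tpm * burst_ratio if absorb_peak else sustained_tpm
    # Round up: a fractional unit cannot be reserved.
    return math.ceil(target_tpm / unit_tpm)


def spillover_tpm(sustained_tpm, unit_tpm, burst_ratio, units):
    """Peak tokens per minute that exceed reserved capacity and spill to pay-as-you-go."""
    peak_tpm = sustained_tpm * burst_ratio
    return max(0.0, peak_tpm - units * unit_tpm)
```

With assumed figures of 120,000 sustained tokens per minute, 50,000 tokens per minute per unit and a burst ratio of 2.5, the sketch returns 3 units with 150,000 tokens per minute spilling over at peak, or 6 units if the reservation absorbs the peak.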
The buyer-side discipline is therefore to size the reservation against the smaller of the two reads (trailing-period sustained, conservative forward forecast), and to leave the upside as pay-as-you-go capacity until the rollout confirms the larger figure. The Admodum AI vendors practice sizes PTU reservations against this methodology across every engagement.
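The smaller-of-two-reads rule can be expressed as a minimal sketch. The function name and the figures in the note below are hypothetical and serve only to illustrate the arithmetic.

```python
def sizing_basis(trailing_sustained_tpm, forward_forecast_tpm):
    """Split demand into a reserved basis and a pay-as-you-go upside.

    Returns (basis_tpm, upside_tpm): the basis is the smaller of the two
    reads; the upside is the gap to the larger read, left on pay-as-you-go
    until the rollout confirms it.
    """
    if trailing_sustained_tpm < 0 or forward_forecast_tpm < 0:
        raise ValueError("throughput reads must be non-negative")
    basis_tpm = min(trailing_sustained_tpm, forward_forecast_tpm)
    upside_tpm = max(trailing_sustained_tpm, forward_forecast_tpm) - basis_tpm
    return basis_tpm, upside_tpm
```

For an assumed trailing read of 120,000 tokens per minute and a conservative forward forecast of 200,000, the reservation is sized against 120,000 and the remaining 80,000 stays on pay-as-you-go.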
This paper sets out the ten-section cross-vendor protocol Admodum applies to the PTU decision, with the sizing arithmetic, the break-even, the deployment-zone affinity, the deprecation protection and the multi-cloud portability each carried through to the closing position.
Corporate email only. The paper is sent by a senior Admodum advisor. No marketing list, no third-party distribution. Submission confirms you have read the independence statement at the foot of this page.
The Admodum AI vendors practice sizes PTU commitments inside the Renewal Programme and the Benchmarking Programme. Engagements run as fixed fee, contingency or annual retainer.