AI vendors, read in full

Inside the pillar

Why AI vendors are a category of their own
The five substrates
Provisioned Throughput Units
On-demand versus PTU break-even
IP indemnification, read in writing
Multi-cloud portability
Model deprecation and reservation tenor
Fine-tuning, retention and the data position
Renewal posture inside parent envelopes
Reading list

Section i

Why AI vendors are a category of their own.

The generative-AI publishers operate on a commercial surface that is structurally different from the established enterprise software catalogue. The metric is tokens (or generative credits, or PTUs). The unit cost moves with model deprecation cycles measured in months rather than years. The IP indemnification posture is contractually load-bearing in a way no on-premise licence has been. The deployment substrate is the hyperscaler envelope (Azure, AWS, Google) plus the publisher-direct API.

The procurement implication is that the buyer-side methodology must treat the AI catalogue as a category and not as a line item inside the parent hyperscaler commitment. The AI Vendors practice runs the methodology across the five substrates: Azure OpenAI Service, AWS Bedrock, publisher-direct APIs (OpenAI, Anthropic), Google Vertex AI and on-premise or self-hosted models.

This pillar groups the AI commentary into ten editorial sections. Each section names the load-bearing mechanic, links the deeper spoke articles and points to the practice page and the relevant white papers for the buyer who wants the engagement methodology.

Section ii

The five substrates.

The generative-AI commercial surface runs across five distinct substrates. The Microsoft Azure OpenAI Service surfaces OpenAI models inside the Azure consumption envelope (MACC inclusive where contracted). AWS Bedrock surfaces Anthropic, Mistral, Meta, Cohere, Stability and Amazon Titan models inside the AWS consumption envelope (EDP inclusive where contracted). The Anthropic and OpenAI direct APIs surface their models on the publisher-direct billing surface. Google Vertex AI surfaces Gemini and partner models inside the GCP consumption envelope (EDP inclusive where contracted). On-premise or self-hosted models run against the GPU compute envelope and a different commercial logic.

The buyer-side methodology reads the substrate choice as a commercial decision and not as a technical preference. The substrate determines the commitment instrument, the IP indemnification position, the data residency footprint, the renewal cadence and the BATNA position. The same Claude workload may sit on Bedrock or on the Anthropic direct API, and the same GPT-4o workload on Azure OpenAI or on the OpenAI direct API, each at a materially different commercial position.

The full methodology sits in two companion papers, AI Vendors PTU Design (the cross-substrate paper) and AWS Bedrock Commitment, and in the cloud-pillar reading list.

Section iii

Provisioned Throughput Units.

The Provisioned Throughput Unit is the converged form of reservation across the five substrates. Azure OpenAI calls it the PTU. AWS Bedrock calls it Provisioned Throughput (model units). Vertex AI calls it Provisioned Throughput. The publisher-direct APIs (OpenAI, Anthropic) operate enterprise commitments that sit alongside PTU but are billed differently.

The buyer-side methodology sizes the PTU envelope against trailing sustained token consumption, not against the publisher’s aspirational rollout. The sizing protocol reads the trailing-period consumption telemetry, identifies the sustained base (not the burst peak), converts the sustained base to the PTU-equivalent rate and sizes the reservation at the break-even utilisation ratio for the model.
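The sizing protocol can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the methodology itself: the sustained base is taken here as the median of the trailing per-minute samples, and the function names, the per-unit capacity and the target utilisation are hypothetical inputs that a real exercise would read from the buyer's telemetry and the publisher's rate card.

```python
import math
from typing import List, Tuple


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def size_reservation(
    tokens_per_minute: List[float],   # trailing-period telemetry samples
    unit_capacity_tpm: float,         # hypothetical: tokens per minute one unit sustains
    target_utilisation: float,        # hypothetical: e.g. the break-even ratio for the model
    base_pct: float = 50,             # assumption: the median stands in for the sustained base
) -> Tuple[float, int]:
    """Return (sustained_base, reserved_units), ignoring the burst peak."""
    base = percentile(tokens_per_minute, base_pct)
    # Enough units that the sustained base runs at the target utilisation.
    units = math.ceil(base / (unit_capacity_tpm * target_utilisation))
    return base, units


# One burst sample (900) sits far above the sustained base and does not drive the sizing.
samples = [100, 120, 110, 900, 105, 115]
print(size_reservation(samples, unit_capacity_tpm=50, target_utilisation=0.8))
# -> (110, 3)
```

The choice of percentile is the judgment call: a higher percentile moves burst traffic into the reservation, a lower one leaves more of it on the on-demand rate.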

The full reading sits in the PTU Design paper. The paper covers the five-substrate PTU equivalence map, the sizing arithmetic, the break-even point against on-demand and the reservation-tenor decision.

Section iv

On-demand versus PTU break-even.

The break-even point between on-demand token-based pricing and PTU-reserved capacity sits at a defined utilisation ratio per model. For Claude Sonnet on AWS Bedrock the break-even sits in one utilisation band; the band differs materially for Claude Opus, for GPT-4o on Azure OpenAI and for Gemini Pro on Vertex AI. The arithmetic depends on the published PTU hourly rate, the on-demand per-token rate and the realised token throughput.
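The arithmetic can be sketched as a single ratio: the reserved hourly rate divided by what the same unit's full-load throughput would cost on demand. The sketch below assumes one blended on-demand rate per thousand tokens (real rate cards usually price input and output tokens separately), and every figure in the usage example is a hypothetical placeholder rather than a published price.

```python
def break_even_utilisation(
    unit_hourly_rate: float,        # reserved price of one unit for one hour
    on_demand_rate_per_1k: float,   # assumption: blended on-demand price per 1,000 tokens
    unit_tokens_per_hour: float,    # tokens one unit can process in an hour at full load
) -> float:
    """Utilisation above which the reserved unit is cheaper than on-demand."""
    on_demand_cost_at_full_load = unit_tokens_per_hour / 1000 * on_demand_rate_per_1k
    return unit_hourly_rate / on_demand_cost_at_full_load


def reservation_pays(realised_tokens_per_hour: float, unit_tokens_per_hour: float,
                     break_even: float) -> bool:
    """True when the realised throughput keeps the unit above break-even."""
    return realised_tokens_per_hour / unit_tokens_per_hour >= break_even


# Hypothetical figures for illustration only.
ratio = break_even_utilisation(unit_hourly_rate=20.0,
                               on_demand_rate_per_1k=0.01,
                               unit_tokens_per_hour=4_000_000)
print(round(ratio, 2))                                   # -> 0.5
print(reservation_pays(3_000_000, 4_000_000, ratio))     # -> True
```

A ratio above 1.0 would mean the reservation never pays at that rate card, whatever the throughput.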

The publisher position on PTU sizing will be to over-provision to absorb growth. The buyer position is that growth above the reserved capacity is absorbed by pay-as-you-go (PAYG) consumption at the on-demand rate, and that over-provisioning in year one creates burnt capacity that the next renewal absorbs into the baseline.

The Admodum methodology runs the break-even arithmetic per model per substrate on the buyer’s telemetry. The output is the PTU envelope sized to the sustained base at the break-even utilisation, with PAYG headroom for the burst.
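The difference between sizing to the sustained base and sizing to the peak can be made concrete with a small cost model. This is a sketch under stated assumptions: the function name, the hourly demand series and all rates are hypothetical, and overflow above the reserved capacity is assumed to bill at a single blended on-demand rate.

```python
from typing import List


def period_cost(
    hourly_tokens: List[float],     # demand per hour over the period
    units: int,                     # reserved units
    unit_hourly_rate: float,        # reserved price per unit per hour
    unit_tokens_per_hour: float,    # capacity of one unit per hour
    on_demand_rate_per_1k: float,   # assumption: blended PAYG price per 1,000 tokens
) -> float:
    """Reserved cost for the whole period plus PAYG cost of tokens above capacity."""
    reserved = units * unit_hourly_rate * len(hourly_tokens)
    capacity = units * unit_tokens_per_hour
    overflow = sum(max(0, t - capacity) for t in hourly_tokens)
    return reserved + overflow / 1000 * on_demand_rate_per_1k


# Three hours at the sustained base, one burst hour (hypothetical figures).
demand = [3_000_000, 3_000_000, 3_000_000, 9_000_000]
for units in (0, 1, 3):
    cost = period_cost(demand, units, unit_hourly_rate=20.0,
                       unit_tokens_per_hour=4_000_000, on_demand_rate_per_1k=0.01)
    print(units, round(cost, 2))
# -> 0 180.0   (all PAYG)
# -> 1 130.0   (sized to the sustained base, burst on PAYG)
# -> 3 240.0   (sized to the peak)
```

In this illustration the envelope sized to the sustained base is the cheapest of the three; the ordering depends entirely on the rates and on the shape of the demand, which is why the arithmetic is run per model per substrate.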

PTU sizing is not aspiration. It is the trailing-period sustained throughput, sized at the published break-even.

Section v

IP indemnification, read in writing.

The IP indemnification position on generative-AI output is contractually load-bearing. The five substrates each publish an indemnification position, and the positions vary materially on four points: the coverage scope (outputs versus inputs, training data versus inference outputs); the carve-outs (for example, that the customer must use the published content filters and must not modify the system prompt outside named parameters); the cap (per-claim or aggregate); and the survival period (whether the indemnification survives contract termination).

The buyer-side methodology reads the indemnification position in the contractual document rather than in the publisher marketing. The standard wording across the five substrates has converged on a similar coverage scope, but the carve-outs and the cap vary materially.

The Admodum methodology produces the indemnification comparison table at engagement start. The table reads the five substrates against the buyer’s deployment topology and recommends the substrate combination that delivers the right coverage at the right cost. The full reading sits in the PTU Design paper.

Section vi

Multi-cloud portability.

The multi-cloud portability question reads against the lock-in posture of each substrate. The same Claude, Mistral or Llama model is available on multiple substrates with different commercial positions. The same OpenAI model is available on Azure OpenAI and the OpenAI direct API. Gemini is available on Vertex AI and (in limited form) on Google AI Studio.

The buyer-side methodology designs the deployment architecture for portability where the commercial position justifies it. The abstraction interface (LangChain, LiteLLM, custom routing) allows the workload to move between substrates at the model-API boundary. The portability is the renewal-cycle BATNA: a substrate that knows the workload can move will hold a different commercial position from a substrate that knows the workload cannot.
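A custom routing layer at the model-API boundary can be very small. The sketch below is a minimal illustration, not a reference to LangChain or LiteLLM: the `SubstrateRouter` class and the stub backends are hypothetical, and in a real deployment each backend would wrap the substrate's own client library behind the same call signature.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumption: every substrate is wrapped as "prompt in, completion text out".
Completion = Callable[[str], str]


@dataclass
class SubstrateRouter:
    backends: Dict[str, Completion]   # substrate name -> wrapped client call
    active: str                       # substrate currently serving the workload

    def complete(self, prompt: str) -> str:
        """Application code calls this and never names a substrate."""
        return self.backends[self.active](prompt)

    def switch(self, name: str) -> None:
        """Move the workload to another substrate without touching callers."""
        if name not in self.backends:
            raise KeyError(f"unknown substrate: {name}")
        self.active = name


# Stub backends standing in for real client calls (hypothetical).
router = SubstrateRouter(
    backends={
        "bedrock": lambda p: f"[bedrock] {p}",
        "direct": lambda p: f"[direct] {p}",
    },
    active="bedrock",
)
print(router.complete("hello"))   # -> [bedrock] hello
router.switch("direct")
print(router.complete("hello"))   # -> [direct] hello
```

The design point is that the switch is a configuration change rather than an application rewrite; that is what makes the move credible at the renewal table.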

The Admodum methodology designs the portability layer as a procurement instrument. The full reading sits in the PTU Design paper and the AI Vendors practice.

Section vii

Model deprecation and reservation tenor.

The generative-AI model lifecycle runs on a deprecation cycle measured in months rather than years. A model that is leading at contract signing may be deprecated or superseded within twelve months. The PTU reservation tenor (monthly, annual, three-year) must be read against the model deprecation risk.

The buyer-side methodology sizes the reservation tenor against the model deprecation cycle. A three-year PTU reservation on a model with a published two-year deprecation horizon is a commercial exposure that the buyer should not accept without an explicit migration clause in the contract.

The Admodum protocol negotiates the migration clause explicitly. The clause records that the publisher will honour the PTU reservation against the named successor model (at no incremental cost to the buyer) for the residual tenor. The full reading sits in the PTU Design paper.
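The tenor test reduces to a comparison of two durations. A minimal sketch, assuming both are expressed in months and that a migration clause of the kind described above removes the exposure entirely; the function name and that simplification are illustrative assumptions.

```python
def tenor_exposure_months(
    tenor_months: int,
    deprecation_horizon_months: int,
    has_migration_clause: bool,
) -> int:
    """Months of reservation left running after the model's deprecation horizon."""
    if has_migration_clause:
        # Assumption: the clause carries the reservation onto the successor model.
        return 0
    return max(0, tenor_months - deprecation_horizon_months)


# The three-year reservation on a two-year horizon from the text above.
print(tenor_exposure_months(36, 24, has_migration_clause=False))  # -> 12
print(tenor_exposure_months(36, 24, has_migration_clause=True))   # -> 0
print(tenor_exposure_months(12, 24, has_migration_clause=False))  # -> 0
```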

Section viii

Fine-tuning, retention and the data position.

The fine-tuning surface reads against the buyer’s data position. Fine-tuning produces a customised model that carries the buyer’s proprietary training data as a derivative. The contractual position must record that the fine-tuned weights remain the buyer’s property (and not the publisher’s), that the training data is retained only for the duration of the fine-tuning workload and that the publisher does not use the buyer’s data for the publisher’s model training.

The five substrates publish materially different retention positions. Azure OpenAI publishes a 30-day default retention with explicit opt-out. AWS Bedrock publishes a zero-retention default for Anthropic, Mistral, Cohere and Stability models (with publisher confirmation). The publisher-direct APIs (OpenAI, Anthropic) publish their own retention positions that vary by enterprise tier.

The Admodum methodology reads the retention position in writing in the contractual document, not in the publisher marketing. The data position is non-negotiable in regulated industries and the substrate choice must reflect the regulated position.

Section ix

Renewal posture inside parent envelopes.

The AI commitments interact with the parent hyperscaler commitment envelopes. Azure OpenAI consumption typically counts against the Microsoft Azure MACC (where the contractual document includes Azure OpenAI as an eligible service). AWS Bedrock consumption typically counts against the AWS EDP. Vertex AI consumption typically counts against the Google Cloud EDP. The publisher-direct APIs (OpenAI, Anthropic) sit outside the hyperscaler envelopes.

The buyer-side renewal posture runs on a dual cadence. The hyperscaler envelope (the Azure MACC, the AWS EDP or the Google Cloud EDP) renews on its own cadence and absorbs the eligible AI spend. The publisher-direct APIs renew on their own cadence and require their own negotiation. The Admodum methodology designs the dual cadence as a single procurement decision.
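The split between envelope-eligible spend and publisher-direct spend can be sketched as a simple grouping. The substrate keys and the spend figures below are hypothetical, and the eligibility mapping is an assumption that has to be confirmed in each contractual document, since eligibility applies only where the contract names the AI service.

```python
from typing import Dict

# Assumption: eligibility as described in the text, to be confirmed per contract.
ENVELOPE_BY_SUBSTRATE: Dict[str, str] = {
    "azure_openai": "Azure MACC",
    "aws_bedrock": "AWS EDP",
    "vertex_ai": "Google Cloud EDP",
}

OUTSIDE = "outside hyperscaler envelopes"


def split_by_envelope(spend_by_substrate: Dict[str, float]) -> Dict[str, float]:
    """Total the AI spend per commitment envelope; everything else renews separately."""
    totals: Dict[str, float] = {}
    for substrate, amount in spend_by_substrate.items():
        envelope = ENVELOPE_BY_SUBSTRATE.get(substrate, OUTSIDE)
        totals[envelope] = totals.get(envelope, 0) + amount
    return totals


# Hypothetical annual spend figures.
spend = {"azure_openai": 100, "aws_bedrock": 50, "openai_direct": 30, "anthropic_direct": 20}
print(split_by_envelope(spend))
# -> {'Azure MACC': 100, 'AWS EDP': 50, 'outside hyperscaler envelopes': 50}
```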

The Microsoft, AWS and Google Cloud knowledge hubs aggregate the wider reading. The Renewal Programme runs the cadence.

Section x

Reading list.

The pillar groups AI commentary into ten sections above. The spoke band for this cluster is in preparation and will publish in weekly batches. The white papers below sit alongside the pillar as the methodology deliverables; the practice page sits alongside as the engagement entry point.

A short follow-up checklist for the reader who is closing this page: visit the AI Vendors practice for the engagement entry point; visit the cloud knowledge hubs for the parent-envelope reading; request the two AI papers (PTU Design, AWS Bedrock Commitment); or open a private conversation with a senior Admodum AI advisor through /contact/.
