The Provider Landscape
Understanding every major AI provider — their strengths, weaknesses, and when to use each one.
Why Provider Choice Matters
Choosing the wrong AI provider can mean:
- 10x higher costs than necessary for simple tasks
- Compliance violations if data crosses geographic boundaries
- Vendor lock-in that's expensive to escape
- Downtime when a single provider has an outage
Smart organizations use multiple providers — routing each task to the best option based on cost, capability, privacy, and reliability.
OpenAI — The First Mover
The company that started the LLM revolution:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| GPT-5 | Frontier | $15 / $60 | Complex reasoning, code |
| GPT-4o | Mid-tier | $2.50 / $10 | General tasks, chat |
| GPT-4o mini | Budget | $0.15 / $0.60 | Simple tasks, classification |
| o3 | Reasoning | $10 / $40 | Math, logic, analysis |
Strengths: Largest ecosystem, best tooling, function calling, vision
Weaknesses: Expensive at scale, data handling concerns, rate limits
API: api.openai.com/v1/chat/completions
OpenAI set the API standard. Almost every other provider offers an "OpenAI-compatible" API endpoint.
Anthropic — The Safety Leader
Founded by ex-OpenAI researchers focused on AI safety:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| Claude Opus 4.6 | Frontier | $15 / $75 | Complex analysis, long context |
| Claude Sonnet 4 | Mid-tier | $3 / $15 | Balanced quality/cost |
| Claude Haiku 3.5 | Budget | $0.80 / $4 | Fast tasks, high volume |
Strengths: Long context (200K tokens), best instruction following, strong coding
Weaknesses: Higher output pricing, smaller model range
API: api.anthropic.com/v1/messages (also OpenAI-compatible via adapters)
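Anthropic's Messages API differs from the OpenAI shape in two ways: the system prompt is a top-level field rather than a message, and `max_tokens` is required. A sketch of a converter between the two formats (the default of 1024 tokens is an arbitrary choice for illustration):

```python
def to_anthropic_request(openai_payload: dict, max_tokens: int = 1024) -> dict:
    """Convert an OpenAI-style chat payload to Anthropic's Messages API shape."""
    # System prompts move from the messages list to a top-level "system" field.
    system_parts = [m["content"] for m in openai_payload["messages"] if m["role"] == "system"]
    messages = [m for m in openai_payload["messages"] if m["role"] != "system"]
    request = {
        "model": openai_payload["model"],
        "max_tokens": max_tokens,  # required by Anthropic, optional in OpenAI's API
        "messages": messages,
    }
    if system_parts:
        request["system"] = "\n".join(system_parts)
    return request

openai_style = {"model": "claude-sonnet", "messages": [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Summarize this report."},
]}
print(to_anthropic_request(openai_style))
```

This kind of translation is exactly what the "OpenAI-compatible via adapters" note refers to.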
Google — The Multimodal Giant
Leveraging decades of search and ML research:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| Gemini 2.5 Pro | Frontier | $1.25 / $10 | Multimodal, long context |
| Gemini 2.0 Flash | Mid-tier | $0.10 / $0.40 | Speed, cost efficiency |
| Gemini 2.5 Flash | Budget | $0.15 / $0.60 | Thinking tasks on a budget |
Strengths: Massive context (1M+ tokens), excellent multimodal, competitive pricing
Weaknesses: Inconsistent quality, less reliable for code, complex API
API: generativelanguage.googleapis.com or via Vertex AI
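Google's REST API uses a different request shape again: text is nested under `contents` → `parts` instead of a flat messages list. A sketch of the endpoint and payload (the API key is sent as a header or query parameter, omitted here):

```python
def gemini_endpoint(model: str) -> str:
    """generateContent endpoint on the generativelanguage API."""
    return f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"

def build_gemini_request(prompt: str) -> dict:
    """Gemini nests text under contents -> parts, unlike the flat OpenAI messages list."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

print(gemini_endpoint("gemini-2.0-flash"))
print(build_gemini_request("Describe this chart."))
```

The extra nesting exists because each `part` can be text, an image, audio, or video, which is where the multimodal strength comes from.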
AWS Bedrock & Azure — Enterprise Cloud
For organizations already in AWS or Azure:
AWS Bedrock:
- Access Claude, Llama, Mistral, Cohere, Amazon Nova through your AWS account
- Data stays in your AWS region — critical for compliance
- Pay-as-you-go pricing, no separate AI vendor contracts
- Cross-region inference for availability
Azure OpenAI:
- GPT-4, GPT-4o through your Azure subscription
- Enterprise security, VNet integration, managed identity
- Same API as OpenAI but with Azure compliance guarantees
- Content filtering built in
When to use cloud-managed AI:
- Your company already has an AWS/Azure contract
- You need SOC 2, HIPAA, or FedRAMP compliance
- You want a single invoice for all infrastructure
- Data residency requirements (EU data stays in EU)
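On Bedrock, the same Claude models are reached through AWS's Converse API instead of Anthropic's endpoint, so the request never leaves your AWS region. A sketch of the call parameters (the model ID is one example; in production these kwargs go to `boto3.client("bedrock-runtime", region_name=...).converse(...)`):

```python
def build_converse_call(model_id: str, prompt: str) -> dict:
    """Keyword arguments for the Bedrock Runtime Converse API.

    Passing these to a bedrock-runtime client keeps the request inside
    your chosen AWS region, which is the compliance point above.
    """
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 512},  # assumed limit for illustration
    }

call = build_converse_call("anthropic.claude-3-5-sonnet-20240620-v1:0",
                           "Redact PII from this record.")
print(call["modelId"])
```

Note the content is a list of typed blocks (`{"text": ...}`), a third request shape alongside OpenAI's and Anthropic's.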
Open Source — Llama, Qwen, Mistral
Run models yourself with zero per-token cost:
| Model | Parameters | Quality Level | VRAM Needed |
|---|---|---|---|
| Llama 4 Scout | 109B (17B active) | GPT-4 class | 64GB |
| Qwen 3 | 30B | Strong | 24GB |
| Mistral Large | 123B | GPT-4 class | 80GB |
| Llama 3.3 | 70B | Very strong | 48GB |
| Phi-4 | 14B | Good for size | 12GB |
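The VRAM figures above roughly track weight size plus runtime overhead. A back-of-envelope estimator (the 4-bit quantization default and the 20% overhead factor are assumptions; real usage varies with context length and KV-cache size):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: quantized weights plus ~20% (assumed) for
    KV cache and activations."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return round(weight_gb * overhead, 1)

# e.g. a 70B model quantized to 4 bits:
print(estimate_vram_gb(70))  # ~42 GB, in line with the ~48 GB listed above
```

Running at higher precision (8- or 16-bit) roughly doubles or quadruples the figure, which is why frontier-class open models need 64-80GB.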
How to run them:
- Ollama — easiest: `ollama run llama3.3`
- vLLM — fastest: optimized serving for production
- llama.cpp — most portable: runs on CPU, even Raspberry Pi
When open source wins:
- High volume (>10K requests/day) → zero marginal cost
- Air-gapped environments → no internet needed
- Full control → customize, fine-tune, inspect weights
- Privacy → data never leaves your infrastructure
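The high-volume break-even point falls out of simple arithmetic. A sketch with assumed numbers (the GPU rental price, token counts per request, and API prices are illustrative, not quotes):

```python
def breakeven_requests_per_day(gpu_cost_per_hour: float,
                               cloud_cost_per_request: float) -> float:
    """Daily request volume at which a dedicated GPU matches cloud API spend."""
    return (gpu_cost_per_hour * 24) / cloud_cost_per_request

# Assumed: one rented GPU at $1.50/hr; requests of ~1K input + 500 output
# tokens on a mid-tier API priced at $2.50 / $10 per 1M tokens.
cost_per_request = (1_000 * 2.50 + 500 * 10.00) / 1_000_000  # $0.0075
print(round(breakeven_requests_per_day(1.50, cost_per_request)))  # 4800
```

On these assumptions the crossover is roughly 4,800 requests/day; adding ops and engineering overhead is what pushes the practical rule of thumb toward 10K.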
Provider Comparison Matrix
Choosing the right provider at a glance:
| Factor | OpenAI | Anthropic | Google | Bedrock | Self-hosted |
|---|---|---|---|---|---|
| Quality | Excellent | Excellent | Very Good | Same models | Good-Excellent |
| Speed | Fast | Fast | Very Fast | Fast | Varies |
| Cost | High | High | Low-Medium | Medium | Fixed only |
| Privacy | Cloud | Cloud | Cloud | Your AWS | Full control |
| Compliance | Limited | Limited | Limited | SOC2/HIPAA | Your responsibility |
| Availability | 99.9% | 99.9% | 99.9% | 99.99% | Your SLA |
| Setup | Minutes | Minutes | Minutes | Hours | Days |
Recommendation: Start with a cloud API (OpenAI or Anthropic) for development. Add Bedrock or Azure for production compliance. Add self-hosted for high-volume or air-gapped needs. Use a gateway like Model Prism to route between them seamlessly.
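The multi-provider strategy above can be sketched as a routing rule: pick the cheapest provider that satisfies the task's quality and compliance constraints. A toy example (the provider entries and quality tiers are illustrative; a real gateway such as Model Prism would track live pricing and health):

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    quality: int               # 1 = budget, 2 = mid-tier, 3 = frontier
    cost_per_1m_out: float     # output price per 1M tokens, from the tables above
    data_stays_in_region: bool

PROVIDERS = [
    Provider("gpt-4o-mini", 1, 0.60, False),
    Provider("gpt-5", 3, 60.00, False),
    Provider("claude-sonnet", 2, 15.00, False),
    Provider("bedrock-claude", 2, 15.00, True),
    Provider("self-hosted-llama", 2, 0.0, True),
]

def route(min_quality: int, needs_residency: bool) -> Provider:
    """Cheapest provider meeting the quality and data-residency constraints."""
    candidates = [p for p in PROVIDERS
                  if p.quality >= min_quality
                  and (not needs_residency or p.data_stays_in_region)]
    return min(candidates, key=lambda p: p.cost_per_1m_out)

print(route(min_quality=2, needs_residency=True).name)   # self-hosted wins on cost
print(route(min_quality=3, needs_residency=False).name)  # only the frontier tier qualifies
```

The same skeleton extends naturally with latency targets, provider health checks, and fallback ordering for outages.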
---quiz
question: Why would an enterprise choose AWS Bedrock over calling OpenAI directly?
options:
- { text: "Bedrock models are always higher quality", correct: false }
- { text: "Data stays in their AWS region for compliance, and billing goes through their existing AWS contract", correct: true }
- { text: "Bedrock is always cheaper per token", correct: false }
feedback: AWS Bedrock keeps data within the organization's AWS infrastructure (critical for HIPAA, SOC 2, data residency), and billing flows through existing cloud contracts — simplifying procurement and compliance.
---quiz
question: When does self-hosting open-source models become more cost-effective than cloud APIs?
options:
- { text: "Always — self-hosting is always cheaper", correct: false }
- { text: "When making more than ~10,000 requests per day with consistent workloads", correct: true }
- { text: "Only when using models with fewer than 7B parameters", correct: false }
feedback: Self-hosting has a fixed GPU cost but zero per-token cost. At high volume (10K+ daily requests), the fixed cost is amortized and becomes significantly cheaper than per-token cloud pricing.
---quiz
question: What is the main advantage of using multiple AI providers instead of just one?
options:
- { text: "It's simpler to manage", correct: false }
- { text: "Optimal cost, capability matching, redundancy, and avoiding vendor lock-in", correct: true }
- { text: "Each provider requires a minimum spend", correct: false }
feedback: Multi-provider strategies let you route each task to the cheapest capable model, maintain availability if one provider goes down, and avoid vendor lock-in — which is exactly what gateway tools like Model Prism enable.