The Provider Landscape
Understanding every major AI provider — their strengths, weaknesses, and when to use each one.
Why Provider Choice Matters
Choosing the wrong AI provider can mean:
- 10x higher costs than necessary for simple tasks
- Compliance violations if data crosses geographic boundaries
- Vendor lock-in that's expensive to escape
- Downtime when a single provider has an outage
Smart organizations use multiple providers — routing each task to the best option based on cost, capability, privacy, and reliability.
OpenAI — The First Mover
The company that started the LLM revolution:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| GPT-5 | Frontier | $15 / $60 | Complex reasoning, code |
| GPT-4o | Mid-tier | $2.50 / $10 | General tasks, chat |
| GPT-4o mini | Budget | $0.15 / $0.60 | Simple tasks, classification |
| o3 | Reasoning | $10 / $40 | Math, logic, analysis |
Strengths: Largest ecosystem, best tooling, function calling, vision
Weaknesses: Expensive at scale, data handling concerns, rate limits
API: api.openai.com/v1/chat/completions
OpenAI set the API standard. Almost every other provider offers an "OpenAI-compatible" API endpoint.
Anthropic — The Safety Leader
Founded by ex-OpenAI researchers focused on AI safety:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| Claude Opus 4.6 | Frontier | $15 / $75 | Complex analysis, long context |
| Claude Sonnet 4 | Mid-tier | $3 / $15 | Balanced quality/cost |
| Claude Haiku 3.5 | Budget | $0.80 / $4 | Fast tasks, high volume |
Strengths: Long context (200K tokens), best instruction following, strong coding
Weaknesses: Higher output pricing, smaller model range
API: api.anthropic.com/v1/messages (also OpenAI-compatible via adapters)
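Anthropic's Messages API differs from the OpenAI shape in two ways: the system prompt is a top-level field rather than a message, and `max_tokens` is required. A sketch of a converter between the two formats (the default of 1024 tokens is an arbitrary choice for illustration):

```python
def to_anthropic_request(openai_payload: dict, max_tokens: int = 1024) -> dict:
    """Convert an OpenAI-style chat payload to Anthropic's Messages API shape."""
    # System prompts move from the messages list to a top-level "system" field.
    system_parts = [m["content"] for m in openai_payload["messages"] if m["role"] == "system"]
    messages = [m for m in openai_payload["messages"] if m["role"] != "system"]
    request = {
        "model": openai_payload["model"],
        "max_tokens": max_tokens,  # required by Anthropic, optional in OpenAI's API
        "messages": messages,
    }
    if system_parts:
        request["system"] = "\n".join(system_parts)
    return request

openai_style = {"model": "claude-sonnet", "messages": [
    {"role": "system", "content": "Be concise."},
    {"role": "user", "content": "Summarize this report."},
]}
print(to_anthropic_request(openai_style))
```

This kind of translation is exactly what the "OpenAI-compatible via adapters" note refers to.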
Google — The Multimodal Giant
Leveraging decades of search and ML research:
| Model | Tier | Cost (in/out per 1M) | Best For |
|---|---|---|---|
| Gemini 2.5 Pro | Frontier | $1.25 / $10 | Multimodal, long context |
| Gemini 2.0 Flash | Mid-tier | $0.10 / $0.40 | Speed, cost efficiency |
| Gemini 2.5 Flash | Budget | $0.15 / $0.60 | Thinking tasks on a budget |
Strengths: Massive context (1M+ tokens), excellent multimodal, competitive pricing
Weaknesses: Inconsistent quality, less reliable for code, complex API
API: generativelanguage.googleapis.com or via Vertex AI
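Google's REST API uses a different request shape again: text is nested under `contents` → `parts` instead of a flat messages list. A sketch of the endpoint and payload (the API key is sent as a header or query parameter, omitted here):

```python
def gemini_endpoint(model: str) -> str:
    """generateContent endpoint on the generativelanguage API."""
    return f"https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"

def build_gemini_request(prompt: str) -> dict:
    """Gemini nests text under contents -> parts, unlike the flat OpenAI messages list."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

print(gemini_endpoint("gemini-2.0-flash"))
print(build_gemini_request("Describe this chart."))
```

The extra nesting exists because each `part` can be text, an image, audio, or video, which is where the multimodal strength comes from.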
AWS Bedrock & Azure — Enterprise Cloud
For organizations already in AWS or Azure:
AWS Bedrock:
- Access Claude, Llama, Mistral, Cohere, Amazon Nova through your AWS account
- Data stays in your AWS region — critical for compliance
- Pay-as-you-go pricing, no separate AI vendor contracts
- Cross-region inference for availability
Azure OpenAI:
- GPT-4, GPT-4o through your Azure subscription
- Enterprise security, VNet integration, managed identity
- Same API as OpenAI but with Azure compliance guarantees
- Content filtering built in
When to use cloud-managed AI:
- Your company already has an AWS/Azure contract
- You need SOC 2, HIPAA, or FedRAMP compliance
- You want a single invoice for all infrastructure
- Data residency requirements (EU data stays in EU)
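On Bedrock, the same Claude models are reached through AWS's Converse API instead of Anthropic's endpoint, so the request never leaves your AWS region. A sketch of the call parameters (the model ID is one example; in production these kwargs go to `boto3.client("bedrock-runtime", region_name=...).converse(...)`):

```python
def build_converse_call(model_id: str, prompt: str) -> dict:
    """Keyword arguments for the Bedrock Runtime Converse API.

    Passing these to a bedrock-runtime client keeps the request inside
    your chosen AWS region, which is the compliance point above.
    """
    return {
        "modelId": model_id,
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "inferenceConfig": {"maxTokens": 512},  # assumed limit for illustration
    }

call = build_converse_call("anthropic.claude-3-5-sonnet-20240620-v1:0",
                           "Redact PII from this record.")
print(call["modelId"])
```

Note the content is a list of typed blocks (`{"text": ...}`), a third request shape alongside OpenAI's and Anthropic's.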
Open Source — Llama, Qwen, Mistral
Run models yourself with zero per-token cost:
| Model | Parameters | Quality Level | VRAM Needed |
|---|---|---|---|
| Llama 4 Scout | 109B (17B active) | GPT-4 class | 64GB |
| Qwen 3 | 30B | Strong | 24GB |
| Mistral Large | 123B | GPT-4 class | 80GB |
| Llama 3.3 | 70B | Very strong | 48GB |
| Phi-4 | 14B | Good for size | 12GB |
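The VRAM figures above roughly track weight size plus runtime overhead. A back-of-envelope estimator (the 4-bit quantization default and the 20% overhead factor are assumptions; real usage varies with context length and KV-cache size):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough VRAM estimate: quantized weights plus ~20% (assumed) for
    KV cache and activations."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits ~= 1 GB
    return round(weight_gb * overhead, 1)

# e.g. a 70B model quantized to 4 bits:
print(estimate_vram_gb(70))  # ~42 GB, in line with the ~48 GB listed above
```

Running at higher precision (8- or 16-bit) roughly doubles or quadruples the figure, which is why frontier-class open models need 64-80GB.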
How to run them:
- Ollama — easiest: `ollama run llama3.3`
- vLLM — fastest: optimized serving for production
- llama.cpp — most portable: runs on CPU, even Raspberry Pi
When open source wins:
- High volume (>10K requests/day) → zero marginal cost
- Air-gapped environments → no internet needed
- Full control → customize, fine-tune, inspect weights
- Privacy → data never leaves your infrastructure
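The high-volume break-even point falls out of simple arithmetic. A sketch with assumed numbers (the GPU rental price, token counts per request, and API prices are illustrative, not quotes):

```python
def breakeven_requests_per_day(gpu_cost_per_hour: float,
                               cloud_cost_per_request: float) -> float:
    """Daily request volume at which a dedicated GPU matches cloud API spend."""
    return (gpu_cost_per_hour * 24) / cloud_cost_per_request

# Assumed: one rented GPU at $1.50/hr; requests of ~1K input + 500 output
# tokens on a mid-tier API priced at $2.50 / $10 per 1M tokens.
cost_per_request = (1_000 * 2.50 + 500 * 10.00) / 1_000_000  # $0.0075
print(round(breakeven_requests_per_day(1.50, cost_per_request)))  # 4800
```

On these assumptions the crossover is roughly 4,800 requests/day; adding ops and engineering overhead is what pushes the practical rule of thumb toward 10K.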
Provider Comparison Matrix
Choosing the right provider at a glance:
| Factor | OpenAI | Anthropic | Google | Bedrock | Self-hosted |
|---|---|---|---|---|---|
| Quality | Excellent | Excellent | Very Good | Same models | Good-Excellent |
| Speed | Fast | Fast | Very Fast | Fast | Varies |
| Cost | High | High | Low-Medium | Medium | Fixed only |
| Privacy | Cloud | Cloud | Cloud | Your AWS | Full control |
| Compliance | Limited | Limited | Limited | SOC2/HIPAA | Your responsibility |
| Availability | 99.9% | 99.9% | 99.9% | 99.99% | Your SLA |
| Setup | Minutes | Minutes | Minutes | Hours | Days |
Recommendation: Start with a cloud API (OpenAI or Anthropic) for development. Add Bedrock or Azure for production compliance. Add self-hosted for high-volume or air-gapped needs. Use a gateway like Model Prism to route between them seamlessly.
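The multi-provider strategy above can be sketched as a routing rule: pick the cheapest provider that satisfies the task's quality and compliance constraints. A toy example (the provider entries and quality tiers are illustrative; a real gateway such as Model Prism would track live pricing and health):

```python
from dataclasses import dataclass

@dataclass
class Provider:
    name: str
    quality: int               # 1 = budget, 2 = mid-tier, 3 = frontier
    cost_per_1m_out: float     # output price per 1M tokens, from the tables above
    data_stays_in_region: bool

PROVIDERS = [
    Provider("gpt-4o-mini", 1, 0.60, False),
    Provider("gpt-5", 3, 60.00, False),
    Provider("claude-sonnet", 2, 15.00, False),
    Provider("bedrock-claude", 2, 15.00, True),
    Provider("self-hosted-llama", 2, 0.0, True),
]

def route(min_quality: int, needs_residency: bool) -> Provider:
    """Cheapest provider meeting the quality and data-residency constraints."""
    candidates = [p for p in PROVIDERS
                  if p.quality >= min_quality
                  and (not needs_residency or p.data_stays_in_region)]
    return min(candidates, key=lambda p: p.cost_per_1m_out)

print(route(min_quality=2, needs_residency=True).name)   # self-hosted wins on cost
print(route(min_quality=3, needs_residency=False).name)  # only the frontier tier qualifies
```

The same skeleton extends naturally with latency targets, provider health checks, and fallback ordering for outages.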
---quiz
question: Why would an enterprise choose AWS Bedrock over calling OpenAI directly?
options:
- { text: "Bedrock models are always higher quality", correct: false }
- { text: "Data stays in their AWS region for compliance, and billing goes through their existing AWS contract", correct: true }
- { text: "Bedrock is always cheaper per token", correct: false }
feedback: AWS Bedrock keeps data within the organization's AWS infrastructure (critical for HIPAA, SOC 2, data residency), and billing flows through existing cloud contracts — simplifying procurement and compliance.
---quiz
question: When does self-hosting open-source models become more cost-effective than cloud APIs?
options:
- { text: "Always — self-hosting is always cheaper", correct: false }
- { text: "When making more than ~10,000 requests per day with consistent workloads", correct: true }
- { text: "Only when using models with fewer than 7B parameters", correct: false }
feedback: Self-hosting has a fixed GPU cost but zero per-token cost. At high volume (10K+ daily requests), the fixed cost is amortized and becomes significantly cheaper than per-token cloud pricing.
---quiz
question: What is the main advantage of using multiple AI providers instead of just one?
options:
- { text: "It's simpler to manage", correct: false }
- { text: "Optimal cost, capability matching, redundancy, and avoiding vendor lock-in", correct: true }
- { text: "Each provider requires a minimum spend", correct: false }
feedback: Multi-provider strategies let you route each task to the cheapest capable model, maintain availability if one provider goes down, and avoid vendor lock-in — which is exactly what gateway tools like Model Prism enable.