AI Gateway
Multi-model routing with smart selection and cost control
The Fiduci AI Gateway combines paid frontier models, accessed through the Vercel AI Gateway, with free local Mistral inference, and automatically routes each request to the model recommended for its task type.
Model selection guide
Each task type maps to a recommended model, balancing capability against cost.
| Task | Recommended model | Reason | Relative cost |
|---|---|---|---|
| Code generation | gpt-4-turbo | Best at coding | $$ |
| Deep analysis | claude-3-opus | Most thorough | $$$ |
| Quick Q&A | gpt-3.5-turbo | Fast & cheap | $ |
| Marketing copy | gpt-4 | Creative & persuasive | $$ |
| Data analysis | claude-3-sonnet | Excellent reasoning | $$ |
| Local (free) | mistral (Ollama) | No API key needed | Free |
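As a minimal sketch, the table above can be expressed as a lookup. The task keys (`code_generation`, `quick_qa`, and so on) and the `select_model` helper are hypothetical names chosen for illustration, and falling back to the local model for unknown tasks is an assumption, not documented gateway behavior.

```python
# Task-to-model mapping copied from the selection table.
# The dictionary keys are hypothetical identifiers, not gateway API names.
RECOMMENDED_MODEL = {
    "code_generation": "gpt-4-turbo",
    "deep_analysis": "claude-3-opus",
    "quick_qa": "gpt-3.5-turbo",
    "marketing_copy": "gpt-4",
    "data_analysis": "claude-3-sonnet",
    "local": "mistral",
}


def select_model(task: str, default: str = "mistral") -> str:
    """Return the recommended model for a task.

    Unknown tasks fall back to `default` (assumed here to be the free
    local model, so an unrecognized task never incurs API cost).
    """
    return RECOMMENDED_MODEL.get(task, default)
```

For example, `select_model("deep_analysis")` returns `"claude-3-opus"`, while an unlisted task such as `select_model("translation")` returns `"mistral"`.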
Cost optimization strategies
Use the local model first
Run Mistral 7B locally on the M1 via Ollama for everyday prompts: inference is free and requires no API key.
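One way to call the local model is through Ollama's HTTP API. The sketch below assumes Ollama is running on its default port (11434) and that the model has been pulled under the name `mistral`. The helper names `build_ollama_request` and `generate_local` are hypothetical.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_ollama_request(prompt: str, model: str = "mistral") -> urllib.request.Request:
    """Build a POST request for a non-streaming generation call."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_local(prompt: str, model: str = "mistral", timeout: float = 60.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_ollama_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # With "stream": False, Ollama returns one JSON object whose
    # "response" field holds the full completion.
    return body["response"]


# Usage (requires a running Ollama server with the model pulled):
#   print(generate_local("Summarize this paragraph in one sentence: ..."))
```

No API key appears anywhere in the request, which is what makes this path free for everyday prompts.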
Route by complexity
Send simple Q&A to a cheap model (gpt-3.5-turbo) and reserve premium models for deep analysis.
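The gateway's actual complexity heuristic is not described here, so the following is only an illustrative sketch. It assumes a prompt counts as "simple" when it is short and contains none of a few analysis-style keywords. The threshold, the keyword list, and the function names are all hypothetical.

```python
# Hypothetical heuristic values; tune or replace them for real traffic.
SHORT_PROMPT_WORDS = 40
DEEP_ANALYSIS_HINTS = ("analyze", "compare", "evaluate", "trade-off")


def is_simple(prompt: str) -> bool:
    """Treat a prompt as simple if it is short and has no analysis keywords."""
    text = prompt.lower()
    is_short = len(text.split()) <= SHORT_PROMPT_WORDS
    has_hint = any(hint in text for hint in DEEP_ANALYSIS_HINTS)
    return is_short and not has_hint


def route_by_complexity(
    prompt: str,
    cheap_model: str = "gpt-3.5-turbo",
    premium_model: str = "claude-3-opus",
) -> str:
    """Send simple prompts to the cheap model and everything else to the premium one."""
    return cheap_model if is_simple(prompt) else premium_model
```

A keyword check like this is crude. It is shown only to make the routing decision concrete, and a production router would likely use an explicit task label or a classifier instead.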
Batch processing
Group multiple prompts into a single batch request for better rates and higher throughput.
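Grouping can be as simple as splitting a prompt list into fixed-size chunks before submission. In this sketch the `chunk_prompts` helper is hypothetical, and the default batch size of 50 is an assumption loosely based on the "50+ prompts" figure in the capabilities list. How each batch is actually submitted depends on the provider and is not shown.

```python
from typing import Iterator, List, Sequence


def chunk_prompts(prompts: Sequence[str], batch_size: int = 50) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `batch_size` prompts, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(prompts), batch_size):
        yield list(prompts[start:start + batch_size])
```

With 120 prompts and the default size, this yields batches of 50, 50, and 20 prompts.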
Gateway capabilities
- Access to 8 AI models
- Smart model selection by task
- Cost tracking across all models
- Local free inference (Ollama)
- Batch processing (50+ prompts)
- Streaming responses
- 2-3 second response times on the M1
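Cost tracking across models can be pictured as a running total per model. The sketch below is an assumption about how such a tracker might look, not the gateway's implementation. The `CostTracker` class is hypothetical, and per-token rates are supplied by the caller because no prices are given here.

```python
from collections import defaultdict
from typing import Dict


class CostTracker:
    """Accumulate spend per model from caller-supplied rates (USD per 1,000 tokens)."""

    def __init__(self, usd_per_1k_tokens: Dict[str, float]) -> None:
        self.rates = dict(usd_per_1k_tokens)
        self.spend: Dict[str, float] = defaultdict(float)

    def record(self, model: str, tokens: int) -> float:
        """Record one request and return its cost.

        Raises KeyError for a model with no configured rate, so an
        unpriced model is never silently counted as free.
        """
        cost = self.rates[model] * tokens / 1000
        self.spend[model] += cost
        return cost

    def total(self) -> float:
        """Return total spend across all models."""
        return sum(self.spend.values())
```

The local Ollama model can be registered with a rate of `0.0`, so its usage is still counted alongside the paid models without adding to the total.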