VANSHAJPOONIA.COM

© 2026 Vanshaj Poonia — All experiments reserved

Workbench · Architecture Note

Why AI Apps Need Model Routing, Not Just a Chat Box

Different tasks need different models, and cost, latency, quality, modality, and plan limits all matter.

Architecture Note · 2026-02-08 · 6 min read · Zenquanta AI

Short intro

A chat box is a starting interface, not a complete AI product architecture. Once tasks, users, and plans diverge, the system needs routing.

What I was trying to do

I wanted Zenquanta to avoid treating every prompt as the same kind of work. Planning, writing, debugging, analysis, and image tasks should not all hit the same model by default.

What I learned

  • Different tasks need different models.
  • Cost, latency, quality, modality, and plan limits shape the product experience.
  • Assistant families create structure for users before the model is even called.
  • Prompt precheck can improve UX by recommending a better assistant or warning about unsupported requests.
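The prompt precheck idea can be sketched with plain keyword rules. This is a minimal illustration, not Zenquanta's implementation: the assistant family names, the keyword hints, the plan names, and the helpers guess_family and precheck are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical assistant families and keyword hints (assumptions for this sketch).
FAMILY_HINTS = {
    "debugging": ("traceback", "stack trace", "exception", "bug"),
    "image": ("draw", "image", "illustration", "logo"),
    "writing": ("essay", "rewrite", "blog post", "email"),
}

# Hypothetical plans and the modalities each one may use.
PLAN_MODALITIES = {
    "free": {"text"},
    "pro": {"text", "image"},
}


@dataclass
class PrecheckResult:
    suggested_family: Optional[str]  # a better-fitting assistant, if any
    warning: Optional[str]           # set when the plan does not support the request


def guess_family(prompt: str) -> Optional[str]:
    """Return the first family whose keyword hints appear in the prompt."""
    text = prompt.lower()
    for family, hints in FAMILY_HINTS.items():
        if any(hint in text for hint in hints):
            return family
    return None


def precheck(prompt: str, current_family: str, plan: str) -> PrecheckResult:
    """Run before any model call: suggest a family and flag unsupported requests."""
    guessed = guess_family(prompt)
    suggestion = guessed if guessed and guessed != current_family else None
    warning = None
    if guessed == "image" and "image" not in PLAN_MODALITIES.get(plan, {"text"}):
        warning = f"Image requests are not available on the {plan} plan."
    return PrecheckResult(suggested_family=suggestion, warning=warning)
```

Because the check runs before any model is called, it costs nothing in provider usage. Keyword matching is crude and will misfire on ambiguous prompts, which is one reason the classifier question below stays open.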

Technical notes

  • Routing can start as a simple policy: assistant family + plan + modality -> model.
  • Raw provider cost should be tracked separately from user-facing usage.
  • Fallbacks matter because an unhandled provider failure is a product failure.
  • Streaming complicates accounting because final usage is not known until the stream ends, yet the response is already reaching the user while usage is being counted.
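The rule-based policy and the fallback note above can be sketched as a lookup table plus an ordered retry loop. Everything named here is an assumption for illustration: the model ids are placeholders, ProviderError stands in for whatever a real provider client raises, and call_model is a caller-supplied function rather than a real SDK call.

```python
from typing import Callable, Dict, List, Tuple

RouteKey = Tuple[str, str, str]  # (assistant family, plan, modality)

# Hypothetical policy table: each route maps to an ordered list of model ids,
# primary first, fallbacks after. The ids are placeholders, not real models.
ROUTES: Dict[RouteKey, List[str]] = {
    ("debugging", "pro", "text"): ["large-code-model", "mid-general-model"],
    ("debugging", "free", "text"): ["small-general-model"],
    ("writing", "pro", "text"): ["mid-general-model", "small-general-model"],
}


class ProviderError(Exception):
    """Stand-in for the error a provider client raises when a call fails."""


def resolve(family: str, plan: str, modality: str) -> List[str]:
    """Look up the model chain for a route; unknown routes are rejected."""
    chain = ROUTES.get((family, plan, modality))
    if chain is None:
        raise LookupError(f"no route for {(family, plan, modality)}")
    return chain


def call_with_fallback(
    chain: List[str],
    prompt: str,
    call_model: Callable[[str, str], str],
) -> Tuple[str, str, List[str]]:
    """Try each model in order; return (model_used, reply, failed_models)."""
    failed: List[str] = []
    for model in chain:
        try:
            return model, call_model(model, prompt), failed
        except ProviderError:
            failed.append(model)  # remember the failure, then try the next model
    raise ProviderError(f"all models failed: {failed}")
```

Returning the list of failed models alongside the reply keeps fallback behavior observable, so it can be logged or shown in an admin view instead of disappearing inside the retry loop.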

Problems / open questions

  • When does routing need a classifier instead of rules?
  • How transparent should model selection be to users?
  • What quality signals should feed future routing decisions?

Next steps

  • Document assistant family responsibilities.
  • Track model call latency and cost by route.
  • Add admin visibility into fallback behavior.
  • Connect this to the AI model routing blueprint.
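Tracking latency and cost by route could start as small as the sketch below. It is an assumption-heavy illustration: the route label format, the per-chunk cost (a stand-in for real token-based pricing), the credit counts, and the helpers record_call and metered_stream are all hypothetical. It keeps raw provider cost and user-facing usage in separate fields, and it records a streamed call only once the stream has ended.

```python
import time
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List


@dataclass
class RouteStats:
    calls: int = 0
    provider_cost_usd: float = 0.0  # raw cost billed by the provider
    user_credits: int = 0           # user-facing usage, tracked separately
    latencies_s: List[float] = field(default_factory=list)


# In-memory store keyed by a route label such as "debugging/pro/text".
STATS: Dict[str, RouteStats] = defaultdict(RouteStats)


def record_call(route: str, latency_s: float,
                provider_cost_usd: float, user_credits: int) -> None:
    """Add one finished call to the per-route totals."""
    stats = STATS[route]
    stats.calls += 1
    stats.provider_cost_usd += provider_cost_usd
    stats.user_credits += user_credits
    stats.latencies_s.append(latency_s)


def metered_stream(route: str, chunks: Iterable[str],
                   cost_per_chunk_usd: float,
                   credits_per_call: int) -> Iterator[str]:
    """Yield chunks unchanged; record usage once the stream ends or is closed."""
    start = time.perf_counter()
    count = 0
    try:
        for chunk in chunks:
            count += 1
            yield chunk
    finally:
        # Runs on normal completion and when the consumer closes the
        # generator early, so partial responses are still counted.
        record_call(route, time.perf_counter() - start,
                    count * cost_per_chunk_usd, credits_per_call)
```

A usage example under the same assumptions: wrapping a provider stream as metered_stream("debugging/pro/text", provider_chunks, 0.001, 1) passes chunks through to the user while the totals for that route accumulate in STATS.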

Tags

AI · Model Routing · OpenRouter · Zenquanta

Related links

Zenquanta AI · AI model routing blueprint