Hacker News — vinext + Cloudflare Workers


▲Save Claude Code Tokens with Smart Routing (github.com)

11 points by FrancescoMassa 1 day ago | 3 comments

nithiink 20 hours ago [-]

How do you handle prompt caching? A lot of the cost savings in a single-model chat come from cache hits on the conversation context, and switching models invalidates that cache: the new model has to reprocess everything at full input price.
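
To make the arithmetic concrete, here is a rough sketch comparing the input cost of one turn when staying on a model with a warm cache versus switching to a cheaper model with a cold cache. Everything in it is an assumption for illustration: the function name, the prices per million tokens, the 0.1 cached-read multiplier, and the token counts are made up and are not any provider's actual pricing. Cache-write surcharges and output tokens are ignored.

```python
def turn_input_cost(context_tokens, new_tokens, price_per_mtok,
                    cached_multiplier=0.1, cache_hit=True):
    """Input cost in dollars for one turn (hypothetical pricing model).

    context_tokens: tokens of prior conversation sent again this turn
    new_tokens: tokens added this turn (never cached yet)
    price_per_mtok: full input price per million tokens
    cached_multiplier: fraction of the full price paid for cached tokens
    """
    if cache_hit:
        context_cost = context_tokens * price_per_mtok * cached_multiplier
    else:
        # Cold cache: the whole context is billed at the full input price.
        context_cost = context_tokens * price_per_mtok
    return (context_cost + new_tokens * price_per_mtok) / 1_000_000


# Hypothetical scenario: 100k tokens of context, 1k new tokens.
# "Large" model at $3/Mtok with a warm cache vs. "small" model at $1/Mtok
# that has never seen this conversation.
stay = turn_input_cost(100_000, 1_000, price_per_mtok=3.0, cache_hit=True)
switch = turn_input_cost(100_000, 1_000, price_per_mtok=1.0, cache_hit=False)

print(f"stay on large model (cache hit):   ${stay:.3f}")
print(f"switch to small model (cache miss): ${switch:.3f}")
```

Under these made-up numbers the switch costs more on input even though the smaller model is 3x cheaper per token, so a router would need to keep a conversation on the new model long enough to amortize the cold start.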

patch_dev 20 hours ago [-]

What does this solve that well-used subagents don't already solve?

FrancescoMassa 20 hours ago [-]

In our tests, subagents and well-used workflows are 20-30% more expensive in context and token usage.
