Three routing dimensions in one call: intent (coding queries go to Anthropic, reviews to OpenAI), complexity (simple → fast model, deep analysis → large model), and scope (single query or multi-stage pipeline). One config file. Every provider.
Supported Providers
Three steps from question to answer.
POST /agent/run with your question. The router makes two decisions simultaneously: it reads the semantic intent — coding, reviewing, explaining — to pick the right provider, and it scores complexity to select the model tier. Coding queries go to Anthropic. Reviews go to OpenAI. Simple lookups use the fast model. Deep analysis uses the large one.
If context is needed, the agent calls search_code and graph_neighbors via pyckle-mcp. It keeps searching — up to 10 iterations — until it has enough context to answer well.
The final response cites specific files and functions from your actual codebase. Session memory is updated automatically — future questions benefit from what was just found.
Every feature in Pyckle Router at $10/mo.
Anthropic, OpenAI, Together, Groq, Mistral, Ollama. Automatic failover. Priority-ordered with health checks cached for 30s.
Autonomously calls search_code and graph_neighbors until it has enough context. Up to 10 iterations per query. No manual invocation needed.
Complexity scores queries into three tiers — fast, default, large — and budget overrides force local_only when you hit the cap. Simple lookups never pay for a large model. Deep analysis never gets a weak one.
The router reads what you're doing — coding, reviewing, explaining — and routes to the provider best suited for that task type. Coding intent → Anthropic. Review intent → OpenAI. Not just cheapest. Best fit.
Define multi-stage workflows in YAML. code_review runs understand → critique → fix, each stage on its own optimal model. Output of each stage flows into the next. Three built-in pipelines: code review, doc generation, refactor.
Past session decisions pre-loaded into system prompt. The agent knows what you worked on before the conversation starts.
POST /agent/run, GET /agent/history, POST /agent/reset. Deploy on Fly.io or self-host. Call from any language with a single HTTP request.
Route to Ollama for zero data egress. llama3.2 default, llama3.3:70b for large analysis. No API key required for local runs.
One endpoint. Any language.
curl
python
Define complex workflows in YAML. Each stage routes independently — intent, model tier, and provider chosen per stage.
python
understand → critique → fix
summarize → expand → format
analyze → plan → implement
Define your own in configs/pipelines.yaml — no code changes required.
Pyckle Router at $10/mo. Pair with Pyckle Pro MCP for the full two-layer stack — $25/mo bundle saves $5/mo.