Pyckle Router - The AI Router That Knows Your Code

How It Works

Three steps from question to answer.

1

You send a message

POST /agent/run with your question. The router makes two decisions simultaneously: it reads the semantic intent — coding, reviewing, explaining — to pick the right provider, and it scores complexity to select the model tier. Coding queries go to Anthropic. Reviews go to OpenAI. Simple lookups use the fast model. Deep analysis uses the large one.

2

The tool loop searches your codebase

If context is needed, the agent calls search_code and graph_neighbors via pyckle-mcp. It keeps searching — up to 10 iterations — until it has enough context to answer well.

3

You get a grounded answer

The final response cites specific files and functions from your actual codebase. Session memory is updated automatically — future questions benefit from what was just found.

What's Included

Every feature in Pyckle Router at $10/mo.

Multi-Provider Routing

Anthropic, OpenAI, Together, Groq, Mistral, Ollama. Automatic failover. Priority-ordered with health checks cached for 30s.

MCP Tool Loop

Autonomously calls search_code and graph_neighbors until it has enough context. Up to 10 iterations per query. No manual invocation needed.

Cost-Aware Dynamic Routing

Complexity scores queries into three tiers — fast, default, large — and budget overrides force local_only when you hit the cap. Simple lookups never pay for a large model. Deep analysis never gets a weak one.

Intent-Aware Provider Routing

The router reads what you're doing — coding, reviewing, explaining — and routes to the provider best suited for that task type. Coding intent → Anthropic. Review intent → OpenAI. Not just cheapest. Best fit.

Pipeline Execution Engine

Define multi-stage workflows in YAML. code_review runs understand → critique → fix, each stage on its own optimal model. Output of each stage flows into the next. Three built-in pipelines: code review, doc generation, refactor.

Pupil Memory Bootstrap

Past session decisions pre-loaded into system prompt. The agent knows what you worked on before the conversation starts.

REST API Deployment

POST /agent/run, GET /agent/history, POST /agent/reset. Deploy on Fly.io or self-host. Call from any language with a single HTTP request.

Local-First Option

Route to Ollama for zero data egress. llama3.2 default, llama3.3:70b for large analysis. No API key required for local runs.

Call It From Anywhere

One endpoint. Any language.

curl

curl -X POST https://pyckle-agent.fly.dev/agent/run \
  -H "Content-Type: application/json" \
  -d '{
    "message": "Where is auth middleware implemented?",
    "tools": true
  }'
                    

python

import requests

resp = requests.post(
    "https://pyckle-agent.fly.dev/agent/run",
    json={
        "message": "Find the router class",
        "tools": True,
        "model_tier": "fast",
    }
)
print(resp.json()["response"])
                    

Multi-Stage Pipelines

Define complex workflows in YAML. Each stage routes independently — intent, model tier, and provider chosen per stage.

python

from pyckle_agent import PycklAgent

agent = PycklAgent.from_yaml("providers.yaml")
result = agent.run_pipeline(
    "code_review",
    input_text=my_code,
)

print(result.final)  # reviewed and fixed, 3 stages
                    

code_review

understand → critique → fix

fast · coding default · review large · coding

doc_gen

summarize → expand → format

fast default fast

refactor

analyze → plan → implement

default · review default large · coding

Define your own in configs/pipelines.yaml — no code changes required.

Start routing smarter today.

Pyckle Router at $10/mo. Pair with Pyckle Pro MCP for the full two-layer stack — $25/mo bundle saves $5/mo.

See Pricing Integration Guide

The AI Router That Knows Your Code.