Your CI pipeline runs linters. Maybe it runs Copilot's PR review or some other AI code reviewer. The output arrives: a comment suggesting you add a docstring, a warning about an unused import, maybe a note about variable naming conventions.
All technically correct. All completely useless for catching the bug that ships to production next Tuesday.
The problem is not the AI. The problem is what the AI knows. Generic PR review tools see only the diff. They have no idea that the function you just modified is called from twelve different places in your authentication middleware. They cannot tell you that the error handling pattern you used contradicts the one established in the rest of the module. They do not know that last quarter's outage started with a change that looked exactly like this one.
These tools review code in isolation. Your code does not run in isolation.
What Generic PR Review Actually Catches
To be fair, generic AI reviewers are not worthless. They catch surface-level issues reliably:
- Syntax errors and obvious bugs
- Missing null checks (sometimes)
- Documentation gaps
- Simple security patterns like SQL string concatenation
- Style violations
Most of these are things a linter could catch. Some of them, a linter already did catch, and the AI reviewer duplicates the feedback with slightly different wording.
What generic review cannot do is understand context. It cannot know that user.role should always be checked before this function runs because of how your permission system works. It cannot flag that the test file you added does not cover the edge case from the incident two months ago. It cannot tell you that this change affects the caching layer in a way that will cause stale data under load.
Context is the difference between "this code looks fine" and "this code will break production."
Context-Aware Review
Pyckle's review_diff tool takes a different approach. Before generating any findings, it runs search_code on the changed files. It retrieves related code, past patterns, callers, dependencies. It builds context before forming opinions.
This means the reviewer knows:
- What code calls the functions you changed
- How similar code elsewhere in the codebase handles the same problem
- What patterns exist in the same module
- What dependencies might be affected by your change
The difference is immediate. Instead of "consider adding error handling," the review says "this function is called without a try/except in handlers/auth.py:142, and that call path expects an exception to propagate." Instead of generic warnings, you get specific risks tied to your actual code.
The review is not smarter. It is better informed.
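To make "building context" concrete, here is a minimal sketch of one piece of it: finding every call site of a changed function. It is not Pyckle's implementation. The in-memory CODEBASE, the file contents, and the find_callers helper are all hypothetical, and a real index would cover far more than call sites.

```python
import ast

# Hypothetical in-memory "codebase" (path -> source), used only for illustration.
CODEBASE = {
    "handlers/auth.py": (
        "from services.tokens import verify_token\n"
        "\n"
        "def login(request):\n"
        "    return verify_token(request.token)\n"
    ),
    "handlers/admin.py": (
        "from services.tokens import verify_token\n"
        "\n"
        "def dashboard(request):\n"
        "    user = verify_token(request.token)\n"
        "    return user.role\n"
    ),
    "services/tokens.py": (
        "def verify_token(token):\n"
        "    return token\n"
    ),
}


def find_callers(codebase, function_name):
    """Return sorted (path, line) pairs for every call to function_name."""
    callers = []
    for path, source in codebase.items():
        for node in ast.walk(ast.parse(source)):
            if not isinstance(node, ast.Call):
                continue
            func = node.func
            # Handle both plain calls (f()) and attribute calls (obj.f()).
            name = getattr(func, "id", None) or getattr(func, "attr", None)
            if name == function_name:
                callers.append((path, node.lineno))
    return sorted(callers)


callers = find_callers(CODEBASE, "verify_token")
for path, line in callers:
    print(f"{path}:{line}")
```

A reviewer that has this list can say which call paths a change to verify_token affects, instead of commenting on the diff alone.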
Severity Scoring
Not every finding deserves the same attention. A missing docstring is not the same as a potential null pointer in your payment flow.
Pyckle's review tags each finding with severity:
- HIGH — Things that could break production. Missing error handling on critical paths, security issues, race conditions, breaking changes to public interfaces.
- MEDIUM — Real improvements that matter. Inconsistent patterns, missing tests for important code paths, performance concerns.
- LOW — Suggestions and style. Nice-to-haves, documentation improvements, minor refactoring opportunities.
The severity comes from context, not guesswork. A null check is HIGH when the function handles payment data, MEDIUM when it processes optional metadata, LOW when it is in a test utility. Same code pattern, different risk based on where it lives and what calls it.
You scan for HIGH findings. You address MEDIUM when you have time. You ignore LOW unless you are refactoring anyway. The review respects your time by making priority obvious.
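As a rough illustration of severity following from context, the sketch below ranks the same finding differently depending on where the code lives and what calls it. The path prefixes and the score_severity function are assumptions made up for this example, not Pyckle's actual scoring rules.

```python
# Hypothetical path-prefix rules; a real system would derive risk from indexed context.
CRITICAL_PREFIXES = ("src/payments/", "src/auth/")
LOW_RISK_PREFIXES = ("tests/", "scripts/")


def score_severity(finding_path, caller_paths):
    """Rank a finding by where the code lives and what calls it."""
    touched = [finding_path, *caller_paths]
    # Anything on, or called from, a critical path is treated as HIGH.
    if any(path.startswith(CRITICAL_PREFIXES) for path in touched):
        return "HIGH"
    # Code that only lives in test or tooling directories is LOW.
    if finding_path.startswith(LOW_RISK_PREFIXES):
        return "LOW"
    return "MEDIUM"


# The same missing null check, in three different places.
print(score_severity("src/payments/charge.py", []))
print(score_severity("src/metadata/tags.py", ["src/api/routes.py"]))
print(score_severity("tests/utils.py", []))
```

Note that a helper in an unremarkable directory is still scored HIGH when one of its callers sits on a critical path, which is the point of scoring from context rather than from the code pattern alone.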
Setting It Up
One workflow file. Add .github/workflows/pyckle-review.yml to your repository:
name: Pyckle PR Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Run Pyckle Review
        env:
          PYCKLE_API_KEY: ${{ secrets.PYCKLE_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          pip install pyckle-cli
          pyckle review-diff --post-comment
The review-diff command fetches the diff, queries your indexed codebase for context, generates findings, and posts them as a PR comment. The PYCKLE_API_KEY authenticates against your indexed project. The GITHUB_TOKEN allows posting the comment.
Full setup instructions, including indexing your codebase and configuration options, are at /integrations/github.html.
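For a sense of what the GITHUB_TOKEN is used for, here is a sketch of how a tool could post a PR comment through GitHub's REST API, where pull request comments go through the issues comments endpoint. Whether pyckle-cli does it this way is an assumption. The build_comment_request helper, the repository name, and the token value are placeholders, and the request is built but never sent.

```python
import json
import urllib.request


def build_comment_request(repo, pr_number, body, token):
    """Build (but do not send) a request that posts a comment on a pull request."""
    url = f"https://api.github.com/repos/{repo}/issues/{pr_number}/comments"
    return urllib.request.Request(
        url,
        data=json.dumps({"body": body}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; in CI the token would come from the GITHUB_TOKEN variable.
request = build_comment_request("octo-org/example", 42, "HIGH: ...", "dummy-token")
print(request.get_method(), request.full_url)
```

Passing the result to urllib.request.urlopen would send it. The token needs permission to write to pull requests for the call to succeed.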
Non-Blocking by Design
The review workflow always exits with code 0. Errors are logged but do not fail the build. Findings are posted as comments, not status checks.
This is intentional.
The review is information, not a gate. It surfaces risks for human judgment. You decide what matters. Maybe that HIGH finding is a known tradeoff. Maybe the MEDIUM suggestion contradicts a decision made last sprint. The review does not know your roadmap, your deadlines, your technical debt budget.
Blocking PR reviews create perverse incentives. Teams start ignoring findings to ship on time. Or they spend hours arguing with an AI about whether a warning is valid. Neither outcome helps code quality.
Non-blocking review means you see the risks, you make the call, you own the outcome. The AI assists. It does not gatekeep.
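The exit-code behavior described above can be sketched in a few lines. This is an illustration of the pattern, not Pyckle's code. The run_non_blocking wrapper and the failing_review stand-in are hypothetical.

```python
import logging

logger = logging.getLogger("review")


def run_non_blocking(review_step):
    """Run review_step, log any failure, and always return exit code 0."""
    try:
        review_step()
    except Exception:
        # Log the full traceback so the failure stays visible in CI output.
        logger.exception("review failed; continuing without blocking the build")
    return 0


def failing_review():
    # Stand-in for a review run that hits an error.
    raise RuntimeError("index unavailable")


exit_code = run_non_blocking(failing_review)
print(exit_code)
```

A CLI entry point would pass that value to sys.exit, so the job stays green even when the review itself breaks.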
What the Comment Looks Like
A review comment appears on your PR with findings grouped by file:
src/handlers/checkout.py
HIGH: process_payment now catches PaymentError but the caller in api/routes.py:89 expects the exception to propagate for retry logic. This will silently swallow failures.
MEDIUM: The timeout of 30s differs from the 10s timeout used in process_refund. Consider extracting to a shared constant for consistency.
LOW: Missing type hints for return value.
Each finding references specific code locations. HIGH findings explain the impact. You can click through to the relevant lines and understand exactly what the reviewer is flagging and why it matters in this codebase, not in theory.
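One way to produce a comment grouped by file, with the most severe findings first, is sketched below. The shape of the finding dictionaries and the format_comment function are assumptions for illustration. The source does not describe Pyckle's internal data format.

```python
from collections import defaultdict

SEVERITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


def format_comment(findings):
    """Group findings by file and list the most severe first within each file."""
    by_file = defaultdict(list)
    for finding in findings:
        by_file[finding["file"]].append(finding)

    lines = []
    for path in sorted(by_file):
        lines.append(path)
        ranked = sorted(by_file[path], key=lambda f: SEVERITY_ORDER[f["severity"]])
        for finding in ranked:
            lines.append(f"{finding['severity']}: {finding['message']}")
    return "\n".join(lines)


# Hypothetical findings, deliberately listed out of severity order.
findings = [
    {"file": "src/handlers/checkout.py", "severity": "LOW",
     "message": "Missing type hints for return value."},
    {"file": "src/handlers/checkout.py", "severity": "HIGH",
     "message": "process_payment now catches PaymentError."},
]
comment = format_comment(findings)
print(comment)
```

Sorting by severity inside each file keeps the HIGH findings at the top of their group, which matches how the comment is meant to be scanned.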
The Feedback Loop
Review quality improves as your codebase index improves. The more Pyckle knows about your code — its structure, its patterns, its history — the more relevant the findings become.
Early reviews might flag issues that turn out to be non-issues. That is expected. As the index captures more of your codebase's context, the signal-to-noise ratio improves. The reviewer learns what "normal" looks like for your project and focuses on deviations that actually matter.
This is different from rule-based systems that stay static. Context-aware review gets better over time because the context gets richer.
What It Costs
PR review is included in Pyckle Pro. No per-review fees, no token metering, no surprise charges at the end of the month. Index your codebase once, run reviews on every push.
If you are evaluating whether this replaces your existing AI reviewer, the answer depends on what you value. If you want generic style feedback, free tools exist. If you want reviews that understand your actual codebase and surface real risks before they ship, that requires context — and context requires indexing.
Most teams find the HIGH findings alone justify the switch. One prevented production incident pays for a lot of subscription months.
Pyckle's PR review is available for GitHub repositories. GitLab and Bitbucket support is in development.