This guide walks you through running your first semantic search with Pyckle and reading the results well enough to act on them.
Prerequisites
- Pyckle installed (see Set Up Pyckle in Claude Code)
- At least one codebase indexed
Step 1: Understand How Semantic Search Differs from Grep
Grep matches characters. Semantic search matches meaning. When you search for "authentication middleware," grep finds files containing that exact phrase. Pyckle finds files that handle authentication — even if the code calls it auth_guard, verify_token, or check_session.
Under the hood, Pyckle encodes your query and every code chunk into the same vector space using PyckLM embeddings, then ranks results by cosine similarity fused with BM25 keyword scoring. You don't need to know the variable names. You need to know what the code does.
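To make the ranking idea concrete, here is a minimal sketch of scoring by cosine similarity and fusing the result with a keyword score. It is not Pyckle's implementation. The weighted-sum fusion, the alpha weight, the function names, and the toy vectors are all assumptions for illustration, and the keyword scores are assumed to be already normalized to the 0 to 1 range.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def fuse_scores(semantic, keyword, alpha=0.7):
    """Weighted sum of two score dicts keyed by chunk id.

    alpha is a hypothetical weight, not a Pyckle setting.
    A chunk missing from one dict counts as 0.0 for that signal.
    """
    ids = set(semantic) | set(keyword)
    return {
        i: alpha * semantic.get(i, 0.0) + (1 - alpha) * keyword.get(i, 0.0)
        for i in ids
    }


def rank(fused):
    """Chunk ids ordered from highest fused score to lowest."""
    return sorted(fused, key=fused.get, reverse=True)


# Toy data: a 2-D "embedding" per chunk and a made-up keyword score.
query_vec = [1.0, 0.0]
chunk_vecs = {"auth_guard": [0.9, 0.1], "logger": [0.1, 0.9]}
keyword_scores = {"auth_guard": 0.0, "logger": 0.2}

semantic_scores = {
    name: cosine_similarity(query_vec, vec) for name, vec in chunk_vecs.items()
}
# auth_guard ranks first despite having no keyword match.
print(rank(fuse_scores(semantic_scores, keyword_scores)))
```

The point of the toy data is the one made above: a chunk can rank first on meaning alone, even when none of your query words appear in it.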
Semantic search closes the gap between how you think about code and how the code is actually named. The larger and more inconsistently named the codebase, the bigger the payoff.
Step 2: Run a Natural Language Query
Open Claude Code with your indexed project active. Call search_code() with a plain English description of what you're looking for — not a filename, not a function name, just what the code does.
search_code("rate limiting on API endpoints")
Pyckle returns ranked chunks with file paths, line ranges, and relevance scores. The top results are the chunks that scored highest against your query. Read the file paths first: they tell you whether you're in the right layer of the stack before you read a single line of code.
Run the same search twice with different phrasings, for example "rate limiting on API endpoints" and "throttle requests per user." Compare the top-five results. Overlap suggests the signal is strong. Divergence suggests the two phrasings map to different parts of the codebase, often because it uses inconsistent abstractions.
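If you want a number for that comparison, one option is the Jaccard overlap of the top-five file paths from each phrasing. This is a sketch, not a Pyckle feature: top_k_overlap is a hypothetical helper, and it assumes you have copied the returned file paths into two plain lists in ranked order.

```python
def top_k_overlap(paths_a, paths_b, k=5):
    """Jaccard overlap of the top-k file paths from two result lists.

    Returns 1.0 when both queries surface the same files and 0.0 when
    they share none. Ordering within the top k is ignored.
    """
    top_a = set(paths_a[:k])
    top_b = set(paths_b[:k])
    union = top_a | top_b
    if not union:
        return 0.0  # both lists empty: nothing to compare
    return len(top_a & top_b) / len(union)
```

A value near 1.0 means both phrasings land on the same files. A value near 0.0 is your cue to read both result sets before trusting either.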
Step 3: Refine Results with Context
Your first query is rarely your last. If the results are too broad, add specificity. If they're too narrow, step back to the concept.
Suppose your first query returned ten files across three packages. Narrow it:
search_code("rate limiting middleware applied per user token in the API gateway")
Adding domain context — "API gateway," "per user token" — shifts the embedding closer to the implementation you actually care about. Drop terms that don't belong to the concept and add terms that do. Treat it like talking to a senior engineer who knows the system, not like constructing a regex.
Longer, more specific queries usually outperform short ones in semantic search because the model has more signal to work with. A few added words of context can eliminate several irrelevant results.
Step 4: Combine Semantic Search with Keyword Filtering
Semantic search surfaces the right concepts. Keyword filtering pins you to the right file. Use them together when you already know part of the answer.
Find all rate-limiting logic, then confirm which module owns it:
search_code("rate limiting middleware applied per user token in the API gateway")
Once you have candidate file paths from Pyckle, run a targeted grep to verify the exact symbol:
grep -rn "RateLimiter\|throttle" src/gateway/middleware/
Semantic search gets you to the right neighborhood. Grep gets you to the right door. Neither alone is as fast as both together.
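If you would rather do the verification step in a script than in a shell, the same check can be sketched in Python. confirm_symbols is a hypothetical helper, not part of Pyckle. It assumes you pass in the candidate file paths from the search results yourself, and it only scans those files rather than a whole directory tree.

```python
import re
from pathlib import Path


def confirm_symbols(candidate_paths, pattern):
    """Return (path, line number, line text) for every regex match.

    Paths that do not point to a readable file are skipped, so a stale
    search result does not abort the scan.
    """
    regex = re.compile(pattern)
    hits = []
    for path in candidate_paths:
        p = Path(path)
        if not p.is_file():
            continue
        text = p.read_text(errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((str(p), lineno, line.strip()))
    return hits
```

Called with the pattern r"RateLimiter|throttle", it reports the same kind of hits as the grep command, restricted to the files semantic search already pointed you to.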
Don't skip reading the returned chunks. A high relevance score means the embedding space says "close" — it doesn't mean the code does exactly what you need. Skim the top three results before you open any file.
Step 5: Read Search Quality Signals
Pyckle surfaces two signals worth watching: relevance scores and result distribution. High scores clustered in one directory mean your query is well-targeted. Scores spread flat across unrelated files mean your query is too abstract or your index needs a refresh.
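One rough way to quantify result distribution is to measure what share of the returned paths sit in the single most common directory. This is a sketch under assumptions: directory_concentration is a hypothetical helper, it expects POSIX-style paths copied from the results, and any threshold you apply to its output is your own judgment call.

```python
from collections import Counter
from pathlib import PurePosixPath


def directory_concentration(paths):
    """Return (most common parent directory, share of results in it)."""
    if not paths:
        return None, 0.0
    parents = Counter(str(PurePosixPath(p).parent) for p in paths)
    directory, count = parents.most_common(1)[0]
    return directory, count / len(paths)
```

A high share means the results cluster in one place. A low share across many directories matches the flat, spread-out pattern described above.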
Check your index health anytime:
index_stats()
If the chunk count is lower than expected, files may have been added or changed since the last index run. Re-index by passing the path you want refreshed:
index_codebase("/home/user/myproject")
Token usage matters too, especially on large codebases. Check how much each search is consuming:
token_stats(last_n=10)
If token cost per query is climbing, your queries are probably pulling in too many chunks. Tighten the phrasing before you tighten the budget.
After ten queries in a session, run token_stats(last_n=10) and look at the distribution. Outlier queries — the ones that cost three times the average — are usually the vague ones. Rewrite them and compare.
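The outlier check can be sketched in a few lines. The input format here is an assumption: the output shape of token_stats() is not specified in this guide, so flag_outliers is a hypothetical helper that takes a plain dict of query text to token count that you fill in by hand.

```python
from statistics import mean


def flag_outliers(costs, factor=3.0):
    """Return the queries whose token cost is at least factor x the average.

    costs maps query text to token count. The average includes the
    outliers themselves, so this is a conservative filter.
    """
    if not costs:
        return []
    average = mean(costs.values())
    return [query for query, cost in costs.items() if cost >= factor * average]
```

Because the average includes the expensive queries, anything this flags is well above the rest of the session and is a good first candidate for a rewrite.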