---
title: "Security Auditing Your Codebase with AI"
subtitle: "Pattern-Based Vulnerability Discovery, Risk Mapping, and Remediation Planning"
author: "David Kelly Price"
version: "1.0"
date: 2026-04-20
status: draft
type: ebook
target_audience: "Security engineers, senior engineers with security responsibility, and engineering managers evaluating security posture — not pen testers, but developers who own the code"
estimated_pages: 80
chapters:
  - "Why Code Security Audits Miss Things"
  - "The Pattern-Based Approach to Vulnerability Discovery"
  - "Semantic Search for Security: Finding Misuse Patterns"
  - "Authentication and Authorization Flows"
  - "Input Handling and Injection Surfaces"
  - "Cryptography Misuse and Secrets in Code"
  - "Dependency Risk and Supply Chain"
  - "Prioritizing and Planning Remediation"
  - "Continuous Security: Audit as Process"
tags:
  - pyckle
  - ebook
  - security
  - code-audit
  - ai-tools
  - vulnerability
  - semantic-search
  - draft
---

<!-- DESIGN & LAYOUT NOTES

Target formats:
- Primary: Markdown (source of truth)
- Export: PDF via Pandoc, web page
- Print-ready: Letter size, 1" margins

Typography:
- Headers: Sans-serif (brand-consistent)
- Body: Serif or clean sans-serif for readability
- Code: Monospace, syntax highlighted, line-numbered where helpful

Callout box types:
- **Try This** — Exercises and hands-on activities
- **Key Insight** — Important concepts worth remembering
- **Warning** — Common mistakes or gotchas

Figures:
- Captioned and numbered (Figure 1, Figure 2, etc.)
- Referenced by number in body text
-->

---

# Security Auditing Your Codebase with AI

## Pattern-Based Vulnerability Discovery, Risk Mapping, and Remediation Planning

**By David Kelly Price**

Version 1.0 — April 2026

---

## Table of Contents

- About This Guide
- Chapter 1: Why Code Security Audits Miss Things
- Chapter 2: The Pattern-Based Approach to Vulnerability Discovery
- Chapter 3: Semantic Search for Security: Finding Misuse Patterns
- Chapter 4: Authentication and Authorization Flows
- Chapter 5: Input Handling and Injection Surfaces
- Chapter 6: Cryptography Misuse and Secrets in Code
- Chapter 7: Dependency Risk and Supply Chain
- Chapter 8: Prioritizing and Planning Remediation
- Chapter 9: Continuous Security: Audit as Process
- Conclusion
- Appendix A: Glossary
- Appendix B: Tools & Resources
- Appendix C: Further Reading

---

## About This Guide

Most security guides are written for people who already think of themselves as security people. This one is not.

It is written for the engineer who owns a production codebase and knows, somewhere in the back of their mind, that it has not been audited properly. It is written for the engineering manager who has heard the phrase "security posture" in three consecutive planning meetings and wants to understand what actually needs to happen. It is written for the senior engineer who is sharp enough to find vulnerabilities once they know where to look, but has not had a systematic framework for finding where to look.

The central argument of this guide is narrow and specific: traditional code security audits miss things not because the people doing them are bad at their jobs, but because the tools and methods they use are structurally misaligned with how vulnerabilities actually live in real codebases. Pattern-based discovery, powered by semantic search and modern AI tooling, addresses that structural misalignment directly.

This is not a guide about penetration testing. It does not cover network scanning, social engineering, or runtime exploitation. The focus is entirely on the code — reading it, searching it, understanding how it assembles at runtime, and identifying the patterns that produce vulnerabilities before an attacker finds them first.

The approach described here does not require a security background. It requires the ability to read code, an understanding of how your system is assembled, and a willingness to think systematically about where things can go wrong. If you have those three things, this guide will give you a method.

Code examples throughout this guide use Python and JavaScript because they are the languages most commonly found in web application backends and frontends. The patterns, however, apply across languages. A SQL injection is a SQL injection whether it is written in Java, Go, or Ruby. The specific syntax changes. The underlying misuse pattern does not.

---

## Chapter 1: Why Code Security Audits Miss Things

Security audits are not uncommon. Most organizations of any meaningful size run them. They hire firms, engage contractors, run automated scanners, check compliance boxes. And then, six months later, a breach happens and the post-mortem reveals a vulnerability that the audit should have caught.

This is not primarily a people problem. The engineers conducting audits are often technically sophisticated. The issue is methodological. The tools and approaches most teams use for security audits are poorly matched to the actual distribution of where vulnerabilities live in modern codebases.

### The Grep Problem

The oldest and still most common approach to code security review is manual grep — searching for known-dangerous function calls, string patterns, and identifiers. Look for `eval(`, look for `execute(`, look for `password`, look for `md5`. When you find a hit, inspect it manually.

This works for the most obvious cases. It will catch the student-project mistake where a developer wrote `query = "SELECT * FROM users WHERE id=" + user_id`. But real codebases are not student projects. They have layers of abstraction, helper functions, framework conventions, and years of accumulated refactoring that separate the dangerous operation from the string that would trigger a grep hit.

Consider this Python example:

```python
# utils/db.py
def run_query(sql, params=None):
    conn = get_connection()
    cursor = conn.cursor()
    if params:
        cursor.execute(sql, params)
    else:
        cursor.execute(sql)
    return cursor.fetchall()

# api/users.py
def get_user_profile(user_id):
    query = build_user_query(user_id)
    return run_query(query)

# api/helpers.py
def build_user_query(user_id):
    return f"SELECT * FROM users WHERE id = {user_id}"
```

A grep for `cursor.execute` in `utils/db.py` looks clean, because `run_query` supports the parameterized form. A grep for `f"SELECT` in `api/helpers.py` might catch the injection, but only if your pattern library includes f-strings, and many older rule sets do not. The dangerous string is built in one file, passed through a second, and executed in a third, so no single file shows the whole problem.

This kind of cross-file, multi-function vulnerability pattern is the norm in production codebases, not the exception.
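One way to close the hole in the example above is to keep the SQL text static and pass the value separately. The sketch below is illustrative only: it assumes the standard-library `sqlite3` module and an in-memory table in place of the guide's unspecified `get_connection()`, and the `_unsafe` / `_safe` function names are hypothetical.

```python
import sqlite3


def make_db():
    # In-memory stand-in for the real database (an assumption for this sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (id, name) VALUES (?, ?)",
        [(1, "alice"), (2, "bob")],
    )
    return conn


def build_user_query_unsafe(user_id):
    # Mirrors api/helpers.py: the value becomes part of the SQL text.
    return f"SELECT * FROM users WHERE id = {user_id}"


def build_user_query_safe(user_id):
    # Static SQL text plus a separate parameter tuple.
    return "SELECT * FROM users WHERE id = ?", (user_id,)


def run_query(conn, sql, params=None):
    cursor = conn.cursor()
    cursor.execute(sql, params or ())
    return cursor.fetchall()


conn = make_db()
payload = "1 OR 1=1"  # attacker-controlled input

unsafe_rows = run_query(conn, build_user_query_unsafe(payload))
safe_sql, safe_params = build_user_query_safe(payload)
safe_rows = run_query(conn, safe_sql, safe_params)

print(len(unsafe_rows), len(safe_rows))
```

With the unsafe builder the payload rewrites the query and every row comes back. With the safe builder the payload is bound as a single value that matches no `id`.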

> **Key Insight**
>
> Many static analysis tools, particularly lightweight linters and pattern matchers, reason about code locally: within a function, within a file. Most vulnerabilities in production systems are relational. They emerge from how code connects across files, modules, and layers of abstraction.

### The Coverage Problem

Even if your audit method is sound, coverage is a real constraint. A meaningful production codebase can have hundreds of files and hundreds of thousands of lines. A security auditor doing a manual review is going to sample it. They will look at the authentication module, the payment handling code, the admin API. They will not look at the internal analytics endpoint that someone added eight months ago, or the data export utility that runs as a cron job, or the third-party webhook handler that receives POST requests from five different external services.

The vulnerability is often not in the obvious place. It is in the less-scrutinized corner.

Automated scanners help with coverage but introduce a different problem: false positive noise. A scanner that flags every `eval(` call in your codebase regardless of context will produce a report that is practically useless. Security engineers learn very quickly to tune out scanner output because most of it is wrong. That tuning creates blind spots.

> **Warning**
>
> Scanner fatigue is real. When your automated tool produces 400 findings and 380 of them are false positives, the signal-to-noise ratio destroys trust in the tool. Engineers stop reading the reports. The 20 real findings get buried.

### The Static-Context Problem

Code does not execute the way it reads. A security audit that reads code statically — which most audits do — is reasoning about a model of runtime behavior that may not match actual runtime behavior.

Consider authorization. In most web frameworks, authorization checks happen in middleware, decorators, or interceptors that are applied to route handlers. To know whether a particular endpoint is properly protected, you need to understand the full middleware stack applied to that route — which is often not visible by reading any single file. You have to understand how the framework assembles the request pipeline.

```python
# Flask example — what is actually protecting this route?

@app.route('/admin/export')
@login_required         # checks authentication
def export_users():
    return dump_all_users()
```

`@login_required` confirms the user is authenticated. It says nothing about whether they are authorized. The missing `@admin_required` decorator is invisible to a grep-based audit — there is no string to search for. The vulnerability is the absence of a pattern, not the presence of one.

Finding absences is fundamentally harder than finding presences. Most tooling is built to find things that exist. Finding things that should exist but do not requires reasoning about what the code is supposed to do, not just what it does.
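Searching for an absence becomes tractable once the expectation is written down. The sketch below assumes a house rule that every route under `/admin` must carry an `admin_required` decorator, then uses the standard-library `ast` module to list routes that break it. The rule, the sample source, and the helper names are all assumptions made for illustration, and a real framework may apply protection in places this check cannot see (blueprints, global middleware).

```python
import ast

# Hypothetical route file, modeled on the Flask example above.
SOURCE = '''
@app.route('/admin/export')
@login_required
def export_users():
    return dump_all_users()

@app.route('/admin/stats')
@login_required
@admin_required
def admin_stats():
    return compute_stats()

@app.route('/profile')
@login_required
def profile():
    return current_profile()
'''


def decorator_name(node):
    """Return the bare decorator name, e.g. 'login_required' or 'route'."""
    target = node.func if isinstance(node, ast.Call) else node
    if isinstance(target, ast.Attribute):
        return target.attr
    if isinstance(target, ast.Name):
        return target.id
    return None


def route_path(node):
    """Return the URL path if this decorator is a .route('...') call."""
    if isinstance(node, ast.Call) and decorator_name(node) == "route" and node.args:
        first = node.args[0]
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            return first.value
    return None


def find_unguarded_routes(source, required="admin_required", prefix="/admin"):
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        names = {decorator_name(d) for d in node.decorator_list}
        for path in filter(None, map(route_path, node.decorator_list)):
            if path.startswith(prefix) and required not in names:
                findings.append((path, node.name))
    return findings


unguarded = find_unguarded_routes(SOURCE)
print(unguarded)
```

The output is a review queue rather than a verdict: each listed route still needs a human to confirm that no other layer enforces the role check.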

### The Semantic Gap

Here is the deepest problem with conventional auditing: security vulnerabilities are semantic. They are about the meaning and intent of code, not just its syntax.

A hard-coded API key is a vulnerability. But whether a string constant in a file is an API key depends on what the string is, what it is named, how it is used, and what service it authenticates to. A simple regex can find string constants. Determining which ones are sensitive secrets requires semantic understanding.

The same function call can be safe or dangerous depending on context. `subprocess.run(cmd)` is dangerous when `cmd` includes unsanitized user input (most severely when the command runs through a shell) and safe when `cmd` is a static string defined at module load time. Distinguishing these two cases requires following the data flow from input to execution, not just recognizing the function call.
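A first, purely syntactic cut at this distinction is to separate calls whose command is a literal from calls whose command is computed. The sketch below does that with the standard-library `ast` module. The sample functions are hypothetical, and the check only sorts calls into "static" and "needs review". It does not decide whether a computed command is actually reachable from user input.

```python
import ast

# Hypothetical module with one static and one computed command.
SOURCE = '''
import subprocess

def rotate_logs():
    subprocess.run(["logrotate", "/etc/logrotate.conf"])

def convert(filename):
    subprocess.run("convert " + filename + " out.png", shell=True)
'''


def is_literal(node):
    """True when the argument is a constant, or a list/tuple of constants."""
    if isinstance(node, ast.Constant):
        return True
    if isinstance(node, (ast.List, ast.Tuple)):
        return all(is_literal(element) for element in node.elts)
    return False


def is_subprocess_run(node):
    return (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "run"
        and isinstance(node.func.value, ast.Name)
        and node.func.value.id == "subprocess"
        and bool(node.args)
    )


def classify_subprocess_calls(source):
    verdicts = {}
    tree = ast.parse(source)
    for func in (n for n in tree.body if isinstance(n, ast.FunctionDef)):
        for node in ast.walk(func):
            if is_subprocess_run(node):
                literal = is_literal(node.args[0])
                verdicts[func.name] = "static" if literal else "needs data flow review"
    return verdicts


verdicts = classify_subprocess_calls(SOURCE)
print(verdicts)
```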

This semantic gap is where conventional tools consistently fail. Syntax-aware tools (linters, basic static analyzers) reason about what code says. Vulnerability identification requires reasoning about what code means.

> **Key Insight**
>
> The semantic gap — the distance between what code says syntactically and what it means semantically — is the primary reason automated scanners produce so many false positives and false negatives simultaneously. They are optimized for the wrong signal.

### Why AI Changes the Equation

Large language models and semantic search tools do not fully solve the semantic gap. Nothing does completely. But they change the cost structure of code analysis in ways that matter for security auditing.

Semantic search lets you find code by describing what it does, not by matching syntax. You can ask "where does this codebase handle user permissions" and get a useful answer even if the relevant code never contains the word "permission", because the logic lives in functions with names like `check_access_level` spread across four files.

AI-assisted code analysis can follow data flows across function boundaries, reason about what a block of code is doing in context, and flag cases where code does something that looks structurally similar to a known vulnerability class — even if the specific syntax would not trigger a conventional pattern match.

The combination of these capabilities produces a fundamentally different kind of audit: one that is coverage-driven rather than sample-driven, semantic rather than syntactic, and capable of finding the absence of patterns as well as their presence.

The following chapters lay out exactly how to use these tools and methods systematically. But this chapter's point stands on its own: if your current security audit relies primarily on grep, a checklist, and manual sampling of obvious hotspots, you are going to miss things. Not occasionally. Routinely.

---

### Key Takeaways

1. Grep-based pattern matching fails in real codebases because vulnerabilities routinely span multiple files and abstraction layers.
2. Coverage is a structural problem with manual audits — the vulnerability is often in the less-scrutinized corner, not the obvious one.
3. Static analysis tools reason about what code says; security requires reasoning about what code means.
4. Absence of a pattern (missing authorization check, missing input validation) is as dangerous as the presence of a bad one, and harder to find.
5. AI-assisted semantic search changes the cost structure of auditing by making coverage and semantic reasoning tractable.

### Chapter Exercise

Take any module in your codebase that handles data from external sources. Trace one data path — from the point where data enters the system to the point where it is written to a database or returned to a caller. Count how many function calls and file boundaries that path crosses. That count is a rough proxy for how deep your audit needs to go to fully evaluate that path. If your current audit method does not follow that entire path, it is leaving risk on the table.

---

## Chapter 2: The Pattern-Based Approach to Vulnerability Discovery

Pattern-based vulnerability discovery starts from a specific premise: vulnerabilities in production code are not random. They cluster around a predictable set of misuse patterns — places where code does something structurally similar to a known class of vulnerability. These patterns are learnable, searchable, and repeatable across codebases and programming languages.

This is not a new idea. Security research has been cataloguing vulnerability patterns for decades. What is new is the tooling available to find those patterns in a systematic, coverage-driven way across an entire codebase, rather than through the sample-based, point-in-time approach of a conventional audit.

### What Is a Vulnerability Pattern?

A vulnerability pattern is an abstraction over a class of security bugs. It describes the structural conditions under which a vulnerability can exist, independent of the specific code that instantiates those conditions.

SQL injection, as a pattern, has this structure:
- User-controlled input reaches a string construction operation
- The result of that construction is executed as a database query
- No sanitization or parameterization happens between input and execution

Any code that matches this structure has a SQL injection vulnerability, regardless of which language it is written in, which database driver it uses, or what variable names the developer chose.

This abstraction is what makes patterns useful. You do not need to find every possible SQL injection in all its specific forms. You need to find code that matches the pattern. Once you have that, the specific instances are just search results.

> **Key Insight**
>
> Vulnerability patterns are structural, not syntactic. The structure describes the dangerous relationship between data flow, operations, and missing safeguards — not the specific code tokens that implement those relationships.

### The Pattern Library

A working security audit needs a pattern library — a structured catalog of the vulnerability patterns you are looking for, with enough specificity to guide a search and enough abstraction to cover the real cases you will find.

The following is a starting point, not a complete list. These are the patterns that appear most frequently in production web application codebases.

**Injection patterns:**
- User input → string construction → database execution (SQL injection)
- User input → string construction → shell execution (command injection)
- User input → HTML/template rendering without escaping (XSS)
- User input → XML parser with external entity resolution enabled (XXE)
- User input → file path construction without path traversal protection (path traversal)

**Authentication patterns:**
- Unauthenticated access to endpoints that should require authentication (missing auth middleware)
- Authentication bypass via type confusion or encoding tricks
- Weak session token generation or insufficient entropy
- Session tokens that survive logout (no invalidation)

**Authorization patterns:**
- Missing ownership check on resource access (IDOR)
- Horizontal privilege escalation via parameter manipulation
- Missing role check on privileged operations
- Authorization decided client-side with server trust

**Cryptography patterns:**
- Weak algorithm (MD5, SHA1, DES) for security-sensitive operations
- Predictable IV or nonce in symmetric encryption
- Hard-coded cryptographic keys or secrets
- Missing signature verification on signed data

**Secrets patterns:**
- Credentials embedded in source code
- API keys in configuration files committed to version control
- Secrets in log output
- Secrets passed as command-line arguments

**Dependency patterns:**
- Packages with known CVEs
- Packages pulling from mutable references (no pinning)
- Packages with excessive permissions or access

This library is the starting point for every audit. It defines what you are looking for before you look.
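A pattern library is easier to search with, and to keep current, when it is stored as data rather than prose. The following is a minimal sketch of one possible representation. The `VulnerabilityPattern` class, its field names, and the three sample entries are assumptions made for illustration, with syntax anchors taken from identifiers used elsewhere in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VulnerabilityPattern:
    name: str
    category: str
    source: str             # where attacker-influenced data enters
    sink: str               # the dangerous operation it reaches
    missing_safeguard: str  # what should sit between source and sink
    syntax_anchors: tuple = ()


# Three entries from the library above, encoded as data.
PATTERN_LIBRARY = [
    VulnerabilityPattern(
        name="SQL injection",
        category="injection",
        source="user input",
        sink="database execution",
        missing_safeguard="parameterization",
        syntax_anchors=("cursor.execute", "db.query", "session.execute"),
    ),
    VulnerabilityPattern(
        name="Command injection",
        category="injection",
        source="user input",
        sink="shell execution",
        missing_safeguard="argument validation",
        syntax_anchors=("subprocess.run", "subprocess.Popen"),
    ),
    VulnerabilityPattern(
        name="IDOR",
        category="authorization",
        source="resource identifier in request",
        sink="resource access",
        missing_safeguard="ownership check",
    ),
]


def patterns_in(category):
    """Return the names of all patterns in one category."""
    return [p.name for p in PATTERN_LIBRARY if p.category == category]


injection_patterns = patterns_in("injection")
print(injection_patterns)
```

Note that the IDOR entry has no syntax anchors. Absence patterns often have none, which is why they depend on the semantic and data flow searches described later.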

### Pattern Specificity and Generality

Pattern definition involves a tradeoff between precision and recall. A very specific pattern finds fewer things but produces fewer false positives. A very general pattern catches more real cases but produces more hits to triage.

For an initial audit of an unfamiliar codebase, start general. You want coverage. You can tighten the pattern later when you understand how the codebase is structured. For ongoing monitoring of a codebase you know well, tighter patterns produce more actionable results.

The SQL injection pattern can be stated at multiple levels of specificity:

**Very general:** Any place where a string is assembled from multiple parts and passed to a database call.

**Moderate:** String formatting or concatenation operations where at least one operand traces back to request data, within call chains that terminate in database cursor execute operations.

**Specific:** f-string or `%` formatting in functions that are called from request handlers, where the formatted string is passed to `cursor.execute()` or `connection.query()` rather than supplied as a parameterized query.

The very general version will catch cases you would miss otherwise. It will also flag a lot of code that is fine. The specific version is more surgical but will miss edge cases — especially when developers use naming conventions or abstractions you have not seen before.
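The effect of tightening a pattern can be seen on a handful of lines. The sketch below approximates the three levels with regular expressions and counts hits per level. The regexes and sample lines are assumptions for illustration, and regexes cannot check the "traces back to request data" condition, so the moderate level here only adds a SQL keyword requirement.

```python
import re

# Hypothetical lines pulled from a codebase.
SAMPLE_LINES = [
    'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))',
    'cursor.execute(f"SELECT * FROM users WHERE id = {user_id}")',
    'query = "SELECT * FROM orders WHERE status = " + status',
    'label = "Total: " + str(total)',
    'rows = db.query(query)',
]

# Very general: any f-string, concatenation, or %-formatting of a string.
STRING_ASSEMBLY = re.compile(r"""\bf["']|["']\s*\+|\+\s*["']|["']\s*%\s""")
# Moderate: string assembly on a line that also mentions a SQL verb.
SQL_KEYWORD = re.compile(r"\b(select|insert|update|delete)\b", re.IGNORECASE)
# Specific: an f-string passed directly to cursor.execute().
DIRECT_FSTRING_EXECUTE = re.compile(r"""cursor\.execute\(\s*f["']""")


def matches(line, level):
    if level == "general":
        return bool(STRING_ASSEMBLY.search(line))
    if level == "moderate":
        return bool(STRING_ASSEMBLY.search(line) and SQL_KEYWORD.search(line))
    if level == "specific":
        return bool(DIRECT_FSTRING_EXECUTE.search(line))
    raise ValueError(f"unknown level: {level}")


hit_counts = {
    level: sum(matches(line, level) for line in SAMPLE_LINES)
    for level in ("general", "moderate", "specific")
}
print(hit_counts)
```

The general level also flags the harmless `label` line, while the specific level misses the concatenated `query` line, which is the tradeoff described above in miniature.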

> **Warning**
>
> Defining patterns too specifically too early is a common mistake. You end up auditing the patterns you understand, not the vulnerabilities that actually exist. Start broad, triage the results, and tighten iteratively.

### Building the Search Strategy

With a pattern library in hand, the search strategy is straightforward in principle: for each pattern, define a search that can find code matching that pattern's structure, execute it across the codebase, evaluate the results, and record findings.

In practice, this requires three kinds of searches working together.

**Syntax-level search** finds the anchoring points — specific function calls, keywords, or identifiers that are characteristic of a pattern. `cursor.execute`, `subprocess.run`, `eval`, `innerHTML`. These are fast, high-precision starting points that localize the search.

**Semantic search** finds the conceptual connections — code that does something relevant to the pattern, described in natural language. "functions that handle database queries", "request input validation logic", "authentication middleware". These catch code that does not contain the expected keywords because it uses different naming or abstractions.

**Data flow analysis** follows the connections — tracing how data moves from an input source through intermediate operations to a dangerous sink. This is the hardest of the three and often requires manual or AI-assisted tracing rather than automated tooling.

The three searches are complementary. Syntax search gives you anchors. Semantic search gives you context and coverage. Data flow analysis confirms the connection between source and sink.

```
Pattern: SQL Injection
│
├─ Syntax search: cursor.execute, db.query, session.execute
│   └─ Finds: all database execution points
│
├─ Semantic search: "functions that construct SQL queries"
│   └─ Finds: helper functions, ORM bypasses, raw query builders
│
└─ Data flow: trace request params → query construction → execution
    └─ Confirms: which database calls receive unsanitized user input
```

*Figure 1: Three-layer search strategy for SQL injection pattern.*
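The data flow layer can be approximated cheaply before any deep analysis by asking a narrower question: is a sink reachable from a given entry point at all? The sketch below builds a name-based call graph with the standard-library `ast` module and searches it breadth-first. It is a reachability check under strong assumptions (a single source string, calls resolved by bare name), not true taint tracking, and the sample functions are adapted from the Chapter 1 example.

```python
import ast
from collections import deque

# Hypothetical single-file version of the Chapter 1 example.
SOURCE = '''
def run_query(sql, params=None):
    cursor = get_connection().cursor()
    cursor.execute(sql)
    return cursor.fetchall()

def build_user_query(user_id):
    return f"SELECT * FROM users WHERE id = {user_id}"

def get_user_profile(user_id):
    query = build_user_query(user_id)
    return run_query(query)

def health_check():
    return "ok"
'''


def called_names(func):
    """Collect the bare names of everything called inside a function."""
    names = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                names.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                names.add(node.func.attr)
    return names


def build_call_graph(source):
    tree = ast.parse(source)
    return {
        node.name: called_names(node)
        for node in tree.body
        if isinstance(node, ast.FunctionDef)
    }


def path_to_sink(graph, entry, sink="execute"):
    """Breadth-first search for a call path from entry to the sink name."""
    queue = deque([[entry]])
    seen = {entry}
    while queue:
        path = queue.popleft()
        callees = graph.get(path[-1], set())
        if sink in callees:
            return path + [sink]
        for callee in sorted(callees):
            if callee in graph and callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None


graph = build_call_graph(SOURCE)
profile_path = path_to_sink(graph, "get_user_profile")
health_path = path_to_sink(graph, "health_check")
print(profile_path, health_path)
```

A path only shows that the sink can be reached. Whether the value arriving there is unsanitized still has to be confirmed by reading the functions along the path.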

### Organizing Findings

Every pattern search produces a list of candidate findings. Each candidate needs to be evaluated: is this an actual vulnerability, a false positive, or something requiring further investigation?

A simple three-state classification works:

- **Confirmed:** The code matches the pattern and represents a real vulnerability.
- **Possible:** The code matches structurally but context is unclear — further investigation needed.
- **False positive:** The code matched the search but on inspection is safe.

Track these systematically. A spreadsheet works. A security finding tracker works better. The key is that every candidate has a status and a disposition, so nothing falls through the cracks.

When recording confirmed findings, capture:
- The file and line number
- The pattern it matches
- The data flow path (input source → dangerous sink)
- The exploitability assessment (how hard is this to trigger?)
- The impact assessment (what does exploitation produce?)
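If a spreadsheet feels too loose, the same record can be a small typed structure. The sketch below is one possible shape, assuming the three statuses and the five captured fields listed above. The class names, the sample entries, and their line numbers are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    CONFIRMED = "confirmed"
    POSSIBLE = "possible"
    FALSE_POSITIVE = "false positive"


@dataclass
class Finding:
    file: str
    line: int
    pattern: str
    status: Status
    data_flow: str = ""       # input source -> dangerous sink
    exploitability: str = ""  # how hard is this to trigger?
    impact: str = ""          # what does exploitation produce?


# Illustrative entries only; files and line numbers are made up.
findings = [
    Finding(
        file="api/helpers.py",
        line=3,
        pattern="SQL injection",
        status=Status.CONFIRMED,
        data_flow="request user_id -> build_user_query -> run_query -> cursor.execute",
        exploitability="single crafted request parameter",
        impact="read access to the users table",
    ),
    Finding("workers/job_queue.py", 41, "Unsafe deserialization", Status.POSSIBLE),
    Finding("utils/db.py", 8, "SQL injection", Status.FALSE_POSITIVE),
]

summary = Counter(f.status for f in findings)
needs_followup = [f for f in findings if f.status is Status.POSSIBLE]
print(summary[Status.CONFIRMED], len(needs_followup))
```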

> **Try This**
>
> Before running any searches, write down your codebase's top 5 entry points for external data — the places where data from users, APIs, or external systems enters your system. These are the source endpoints for virtually all injection-class vulnerabilities. Having them listed explicitly focuses your data flow analysis dramatically.

### Pattern-Based vs. Checklist-Based Auditing

Security checklists such as the OWASP Top 10, CIS Benchmarks, and NIST SP 800-53 controls are valuable reference material. They are not a substitute for pattern-based searching.

A checklist tells you what categories of vulnerability to look for. A pattern tells you what to search for to find them. The checklist is the "what"; the pattern is the "how."

The teams that use checklists as their primary audit method tend to produce shallow results. They can confirm that they "checked for SQL injection" on the checklist, but the checking often amounts to: we looked at the database layer and it uses an ORM, so we marked it safe. The pattern-based approach would ask: does the ORM layer anywhere receive raw SQL strings? Are there any raw query bypasses? Do any of those raw queries include unsanitized input? The checklist gets a checkbox; the pattern-based approach gets an answer.

---

### Key Takeaways

1. Vulnerability patterns are structural abstractions over classes of security bugs — they describe the conditions under which a vulnerability can exist, independent of specific code.
2. A working audit requires a pattern library defined before you start searching.
3. Effective pattern search uses three complementary methods: syntax search for anchors, semantic search for context and coverage, and data flow analysis for confirmation.
4. Start pattern definitions broad, then tighten based on what you find.
5. Checklists define categories; patterns define search strategies. Both are necessary; neither is sufficient alone.

### Chapter Exercise

Select one vulnerability class from the pattern library above — whichever is most relevant to your codebase's technology stack. Define the pattern at three levels of specificity: very general, moderate, and specific. Then identify the two or three syntax anchors you would use to start the search for that pattern in your own code. This is the first step toward an executable search strategy.

---

## Chapter 3: Semantic Search for Security: Finding Misuse Patterns

Conventional code search is keyword search. You know what token you are looking for, you search for it, and you find its occurrences. This works well when the code you are looking for uses the vocabulary you expect. It fails whenever there is a gap between your search terms and the code's actual naming conventions.

Semantic search closes that gap. Instead of matching tokens, it matches meaning. You describe what you are looking for conceptually, and the search returns code that is conceptually similar, regardless of whether it uses your exact words.

For security auditing, this capability is not a nice-to-have. Keyword search finds the vulnerabilities that happen to use predictable naming. Semantic search also finds the ones written by a developer who named things differently, used an abstraction you did not anticipate, or wrote a helper function that does something dangerous but is called something innocent.

### How Semantic Code Search Works

Semantic search over code works by converting code into high-dimensional numerical representations called embeddings. Each chunk of code — a function, a class, a module — gets represented as a vector in a space where similar code ends up geometrically close together.

When you submit a search query, it is converted into a vector using the same embedding model. The search returns the code chunks whose vectors are closest to the query vector. "Closest" in this context means semantically similar — which, when the embedding model is trained well, corresponds to similar meaning and purpose.

The practical result: you can search for "functions that validate user permissions before accessing resources" and get back code that validates permissions, even if none of the matching functions use the word "permission" — because the model understands that `check_access_level()`, `assert_authorized()`, and `verify_role()` all relate to the same semantic concept.
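The ranking step itself is simple once vectors exist. The toy sketch below uses hand-made three-dimensional vectors in place of real embeddings, which typically have hundreds of dimensions and come from a trained model. The vectors, the axis meanings, and the `format_timestamp` function are invented for illustration. Only the cosine similarity calculation is the real technique.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings.
# Illustrative axes: [access control, database, formatting]
CODE_VECTORS = {
    "check_access_level": [0.9, 0.1, 0.0],
    "verify_role": [0.8, 0.2, 0.1],
    "format_timestamp": [0.0, 0.1, 0.9],
}

# Stand-in for the embedded query
# "functions that validate user permissions before accessing resources".
QUERY_VECTOR = [1.0, 0.1, 0.0]


def rank(query, vectors):
    """Return (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = [name for name, _ in rank(QUERY_VECTOR, CODE_VECTORS)]
print(ranking)
```

Neither of the top two function names contains the word "permission". They rank highly because their vectors point in nearly the same direction as the query's.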

> **Key Insight**
>
> Embedding-based semantic search is particularly effective at finding code by behavioral description rather than implementation detail. This is exactly what security auditing requires — you often know what dangerous behavior you are looking for before you know what it is called in any specific codebase.

### Hybrid Search: Semantic + Keyword

Pure semantic search has a weakness: it can miss highly specific technical identifiers. If you are searching for all calls to a specific dangerous API function (`pickle.loads`, `subprocess.Popen`, `eval`), keyword search will find them more reliably than semantic search. An exact token either appears in a chunk or it does not, whereas an embedding summarizes the whole chunk, and a single identifier may carry little weight in that summary.

The solution is hybrid search: run semantic search and keyword search in parallel, then combine the results. This is typically done with a fusion algorithm like Reciprocal Rank Fusion (RRF), which merges ranked lists from multiple retrievers. RRF scores each result by summing 1/(k + rank) over every list it appears in, where k is a smoothing constant, so results that rank highly in multiple lists rise to the top.

```
Query: "unsafe deserialization of user-provided data"
│
├─ Semantic search results:
│   1. api/upload.py: load_payload() — 0.89 similarity
│   2. utils/serializers.py: deserialize_config() — 0.84 similarity
│   3. workers/job_queue.py: process_task() — 0.79 similarity
│
├─ Keyword search results (terms: pickle, deserialize, loads):
│   1. utils/serializers.py: deserialize_config() — exact match
│   2. workers/job_queue.py: process_task() — exact match
│   3. legacy/importer.py: import_data() — exact match
│
└─ RRF-fused results:
    1. utils/serializers.py: deserialize_config() — in both lists, high confidence
    2. workers/job_queue.py: process_task() — in both lists, high confidence
    3. api/upload.py: load_payload() — semantic only, worth investigating
    4. legacy/importer.py: import_data() — keyword only, worth investigating
```

*Figure 2: Hybrid search combining semantic and keyword results via RRF.*

The fused list is more useful than either list alone. Results appearing in both lists have higher confidence. Results appearing in only one list are still worth reviewing but require more scrutiny.
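The fusion step in Figure 2 can be reproduced in a few lines. The sketch below implements Reciprocal Rank Fusion over the two ranked lists from the figure, assuming the commonly used smoothing constant `k=60`. Real tools may weight retrievers differently or use another constant.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse ranked lists: each item scores sum(1 / (k + rank)) over all lists."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, item in enumerate(ranked, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)


# Ranked results from Figure 2 (best match first).
semantic = ["api/upload.py", "utils/serializers.py", "workers/job_queue.py"]
keyword = ["utils/serializers.py", "workers/job_queue.py", "legacy/importer.py"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

RRF uses only rank positions, never the raw scores, which is why it can combine a cosine similarity list with a keyword list whose scores are on a completely different scale.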

### Constructing Security-Focused Queries

The quality of semantic search results depends heavily on query quality. For security auditing, there are predictable query patterns that produce consistently useful results.

**Behavior-based queries** describe what dangerous code does:
- "functions that construct SQL queries from string concatenation"
- "code that executes shell commands with user-provided arguments"
- "request handlers that return user data without authentication check"
- "cryptographic operations using MD5 or SHA1"

**Role-based queries** describe what secure code should do but might be missing:
- "input sanitization and validation functions"
- "authentication middleware applied to request handlers"
- "rate limiting or throttling logic"
- "error handling that avoids exposing stack traces"

**Data flow queries** trace the movement of specific data types:
- "where session tokens are generated and stored"
- "how user-uploaded files are processed and saved"
- "where database queries receive their parameters"
- "functions that log request data"

**Pattern absence queries** find the missing safeguard:
- "endpoints that do not check user role before returning data"
- "database functions without parameterized queries"
- "file operations without path sanitization"

The absence-pattern queries are the ones that keyword search cannot handle at all. There is no keyword that marks the absence of an authorization check. Semantic search, when combined with code understanding, can return candidate functions for manual review — not to flag them definitively as vulnerable, but to surface them for inspection.

> **Warning**
>
> Absence-pattern queries produce a high proportion of false positives by design — you are asking for everything that might be missing something. Use them to build a review queue, not as definitive findings. The value is coverage: ensuring that every authentication-requiring endpoint is evaluated, not just the ones that fail an obvious test.

### A Practical Audit Query Set

The following is a working set of semantic search queries organized by vulnerability category. These are starting points — adapt them to your codebase's vocabulary and framework conventions.

**For injection vulnerabilities:**
```
"raw SQL query construction using string formatting or concatenation"
"shell command execution with variable arguments"
"template rendering with unescaped user input"
"XML parsing with external entity resolution"
"file path construction using user-supplied strings"
```

**For authentication and authorization:**
```
"request handlers that access user-specific data"
"endpoints that perform privileged or administrative operations"
"session token generation and validation logic"
"middleware that checks authentication or login status"
"permission or role verification before data access"
```

**For cryptography:**
```
"password hashing and storage"
"data encryption and decryption operations"
"cryptographic key generation and management"
"signature generation and verification"
"random number generation for security purposes"
```

**For secrets and configuration:**
```
"API key or credential configuration loading"
"database connection string construction"
"third-party service authentication setup"
"environment variable reading for sensitive values"
```

**For error handling and logging:**
```
"exception handling that returns error details to caller"
"logging statements that include request data or user input"
"error responses that include stack traces or internal state"
```

Running these queries against a codebase produces a mapped picture of where each category of concern lives — not just whether vulnerable code exists, but where in the codebase the relevant logic is concentrated.
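
One way to keep the query set maintainable is to store it as data and let a small helper turn search hits into a review queue. The sketch below is illustrative only: `search` is a hypothetical callable standing in for whatever semantic search tool you use, assumed here to return `(file_path, score)` pairs, and `build_review_queue` is a made-up helper name. Only a few of the queries above are included to keep it short.

```python
from collections import defaultdict
from typing import Callable, Iterable

# Hypothetical signature: a query string in, (file_path, score) pairs out.
SearchFn = Callable[[str], Iterable[tuple[str, float]]]

AUDIT_QUERIES = {
    "injection": [
        "raw SQL query construction using string formatting or concatenation",
        "shell command execution with variable arguments",
    ],
    "authn_authz": [
        "request handlers that access user-specific data",
        "permission or role verification before data access",
    ],
}

def build_review_queue(search: SearchFn, queries=AUDIT_QUERIES):
    """Group hits by file so that concentrated areas stand out."""
    queue = defaultdict(set)
    for category, texts in queries.items():
        for text in texts:
            for path, _score in search(text):
                queue[path].add(category)
    # Files touched by the most categories come first, then alphabetical.
    return sorted(queue.items(), key=lambda item: (-len(item[1]), item[0]))
```

Files that surface under several categories at once are often a reasonable place to start reading.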

### Reading Search Results

The output of a semantic search is a ranked list of code chunks with similarity scores. Reading these results effectively is a skill worth developing.

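High-similarity results (roughly above 0.85 for cosine similarity, although the scale differs between embedding models and is worth calibrating against code you already know is relevant) are likely to be relevant. They match the semantic description closely and deserve close inspection.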

Mid-range results (0.65–0.85) are worth reviewing but will include more noise. The semantic model found similarity, but the match may be partial or tangential.

Low-similarity results (below 0.65) can usually be deprioritized unless you have specific reason to believe the search is poorly calibrated to this codebase.

When a result looks relevant, follow it. Read the function, trace its call chain, check its callers. A single relevant result often leads to a cluster of related code that was not in the initial search results.
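
The three tiers described above can be applied mechanically before any manual reading starts. This is a minimal sketch: the `Hit` type and `triage` function are hypothetical, and the 0.85 and 0.65 cut-offs are the rough values from the text, not constants of any particular tool.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    path: str
    score: float

def triage(hits, high=0.85, low=0.65):
    """Bucket hits into review tiers, highest score first within each tier."""
    tiers = {"inspect": [], "review": [], "deprioritize": []}
    for hit in sorted(hits, key=lambda h: h.score, reverse=True):
        if hit.score > high:
            tiers["inspect"].append(hit)
        elif hit.score >= low:
            tiers["review"].append(hit)
        else:
            tiers["deprioritize"].append(hit)
    return tiers
```

Keeping the thresholds as parameters makes it easy to adjust them once you have seen how scores are distributed for your own codebase.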

> **Try This**
>
> Run the query "functions that handle database queries" against your codebase using a semantic search tool. Note how many results return. Then manually categorize them: how many use an ORM? How many use parameterized queries directly? How many use raw string construction? The distribution tells you where to focus your injection audit.

### Semantic Search Tools

Several tools support semantic code search with varying levels of maturity and capability.

Pyckle's code-mcp provides hybrid semantic + BM25 keyword search with ChromaDB-backed vector storage. It supports indexed codebases, graph-based dependency traversal, and session-aware context that tracks which files have been read and edited during an audit session.

GitHub Copilot and similar tools provide semantic code understanding but are optimized for completion assistance, not systematic security searching. They are useful for interpreting specific code segments during audit review but less useful for codebase-wide coverage searches.

Sourcegraph offers structural search and some semantic capabilities, particularly for large multi-repository environments.

The workflow described in this guide assumes a tool that supports: natural language queries over indexed code, hybrid semantic + keyword search, and the ability to return ranked chunks with enough context to understand what the code does and where it sits in the codebase.

---

### Key Takeaways

1. Semantic search matches code by meaning rather than tokens, making it effective for finding code regardless of naming conventions.
2. Hybrid search combining semantic and keyword retrieval outperforms either method alone, especially for security-relevant technical identifiers.
3. Security audit queries should cover behavior, role, data flow, and pattern absence.
4. High-similarity results deserve close inspection; mid-range results require more triage; treat all results as candidates, not findings.
5. Semantic search is most valuable for finding what keyword search cannot: absence patterns, non-standard naming, and conceptual relationships across files.

### Chapter Exercise

Using whichever semantic search tool you have available, run five queries from the practical audit query set above against your codebase. For each query, note: the number of results returned, the top three results by similarity score, and whether any results surprised you by being relevant when you expected them not to be. Those surprises are where the audit value lives.

---

## Chapter 4: Authentication and Authorization Flows

Authentication and authorization failures are among the most consistently exploited vulnerability classes in production applications. They appear on the OWASP Top 10 list every cycle because they remain common. The cause is rarely that developers misunderstand the concepts (most understand them well); it is that the implementation is spread across enough of a codebase that gaps form at the edges.

Authentication answers the question: who is this? Authorization answers the question: what is this person allowed to do? These are distinct questions, and they require distinct checks. The most common failure pattern is conflating them: confirming that a user is authenticated and then trusting, without verification, that the authenticated user is authorized to do what they are attempting.

### Mapping the Authentication Surface

Before evaluating authentication quality, map the authentication surface. This means identifying every endpoint, function, or operation that requires authentication to access — and confirming that each one is actually protected.

In a well-structured application, this map is produced directly from the routing and middleware configuration. In practice, it is rarely that clean.

```python
# Django example — what's actually protected?

urlpatterns = [
    path('api/users/', views.UserListView.as_view()),
    path('api/users/<int:pk>/', views.UserDetailView.as_view()),
    path('api/admin/export/', views.ExportView.as_view()),
    path('api/profile/', views.ProfileView.as_view()),
    path('api/login/', views.LoginView.as_view()),
    path('api/reset-password/', views.PasswordResetView.as_view()),
]
```

Reading this URL configuration tells you what routes exist. It does not tell you whether each view applies authentication. For that, you need to read each view class's implementation, or check whether `DEFAULT_AUTHENTICATION_CLASSES` and `DEFAULT_PERMISSION_CLASSES` in Django REST Framework are set globally and whether individual views override them.

The dangerous case is a view that inherits from a base class that was written to not require authentication — perhaps because it was originally public — and was later used as the base for a new view that should be protected, without the developer noticing that the base class had no authentication enforcement.
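
The inherited-base-class problem can be made visible by asking, for each route, which class in the view's inheritance chain actually decides the authentication setting. The sketch below uses plain Python classes and a hypothetical `requires_auth` attribute; it is not Django's or Django REST Framework's API, and the route table and allowlist are invented for illustration.

```python
class PublicBaseView:
    """Written for a public page, so it requires no login."""
    requires_auth = False

class ProtectedBaseView:
    requires_auth = True

class StatusView(PublicBaseView):
    pass

class ExportView(PublicBaseView):  # should have extended ProtectedBaseView
    pass

class ProfileView(ProtectedBaseView):
    pass

ROUTES = {
    "api/status/": StatusView,
    "api/admin/export/": ExportView,
    "api/profile/": ProfileView,
}
INTENTIONALLY_PUBLIC = {"api/status/"}

def auth_source(view_cls):
    """Return (requires_auth, name of the class that defines the setting)."""
    for cls in view_cls.__mro__:
        if "requires_auth" in vars(cls):
            return cls.requires_auth, cls.__name__
    return False, None

def unprotected_routes(routes, allowlist):
    """List routes that need no auth and are not on the public allowlist."""
    findings = []
    for route, view in routes.items():
        required, source = auth_source(view)
        if not required and route not in allowlist:
            findings.append((route, view.__name__, source))
    return findings
```

The useful part of the output is the third element: it names the base class the setting came from, which is where the review should continue.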

> **Key Insight**
>
> Authentication enforcement is most likely to be absent at the edges: newly added endpoints, views that inherit from unexpected base classes, routes that were added for debugging and never removed, and internal APIs that were assumed to be inaccessible from the outside but are not.

### Common Authentication Bypass Patterns

Authentication bypasses in production code tend to follow a small set of patterns. Knowing these patterns makes them easier to find.

**Order-dependent bypass:** An authentication check sits after a conditional branch, and one branch of the code returns before reaching the check.

```python
def get_user(request, user_id):
    if user_id == 'me':
        # Shortcut path — but where's the auth check?
        return get_current_user(request)

    require_auth(request)  # Only reached for explicit user IDs
    return User.objects.get(pk=user_id)
```

**Unverified claim bypass:** Authorization logic checks a property of the user object, but the object was populated from client-supplied data that was never verified, so a crafted input can produce the expected value regardless of actual authentication state.

```javascript
// Express.js example
function requireAdmin(req, res, next) {
  if (req.user && req.user.role === 'admin') {
    return next();
  }
  return res.status(403).json({ error: 'Forbidden' });
}

// If req.user.role can be set by the client via a JWT payload
// and the JWT signature is not properly verified, this fails
```

**Method bypass:** Authentication is enforced on POST but not GET, or vice versa. A read-only GET endpoint that returns sensitive data without authentication is as problematic as an unprotected write endpoint.

**Path normalization bypass:** The routing middleware applies authentication to `/api/admin/...` but not to `/api/Admin/...`, because the middleware matches paths case-sensitively while the router, filesystem, or downstream service does not.

**Header injection bypass:** Authentication is skipped when a specific header is present — originally added for internal service-to-service calls — but the header is not restricted to internal traffic and can be set by external clients.

```python
def authenticate(request):
    # Internal bypass — problematic if this header isn't
    # stripped at the load balancer
    if request.headers.get('X-Internal-Request') == 'true':
        return True

    token = request.headers.get('Authorization')
    return validate_token(token)
```
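
If internal callers do need a separate path, one alternative to a bare flag header is to require proof of a shared secret. The following is a rough sketch, not a complete protocol: the function names are hypothetical, the secret is assumed to be distributed out of band, and replay protection (a timestamp or nonce in the signed message) is deliberately left out for brevity.

```python
import hashlib
import hmac

def sign_internal_request(secret: bytes, method: str, path: str) -> str:
    """Compute the signature an internal caller would attach to a request."""
    message = f"{method.upper()} {path}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def is_trusted_internal(secret: bytes, method: str, path: str, signature) -> bool:
    """Accept the internal path only if the supplied signature matches."""
    expected = sign_internal_request(secret, method, path)
    supplied = (signature or "").encode()
    # compare_digest avoids leaking how much of the value matched via timing.
    return hmac.compare_digest(expected.encode(), supplied)
```

An external client that sets the header to `true` now fails the check, because it cannot produce a valid signature without the secret.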

### Authorization: The IDOR Problem

Insecure Direct Object Reference (IDOR) is the authorization failure that appears most frequently in bug bounty programs and red team exercises. It is also one of the easiest to miss in a code review because the code often looks correct at a glance.

The pattern: a user requests a resource by its identifier, and the server returns the resource associated with that identifier without verifying that the requesting user has permission to access it.

```python
# Vulnerable
@app.route('/api/documents/<int:doc_id>')
@login_required
def get_document(doc_id):
    doc = Document.objects.get(pk=doc_id)
    return doc.to_dict()

# Fixed
@app.route('/api/documents/<int:doc_id>')
@login_required
def get_document(doc_id):
    doc = Document.objects.get(pk=doc_id, owner=request.user)
    return doc.to_dict()
```

The first version confirms the user is authenticated. It does not confirm the document belongs to the authenticated user. Any authenticated user can access any document by guessing or iterating through IDs.

Finding IDOR vulnerabilities semantically: search for "functions that retrieve database records by ID from request parameters" and then manually verify whether each result includes an ownership or permission check alongside the ID lookup.
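
During manual verification, the shape to look for is a lookup that takes the requester into account. A framework-free sketch of that shape follows; the `Document` type, the in-memory `DOCUMENTS` table, and `get_document_for` are all hypothetical stand-ins for your own models and data access layer.

```python
from dataclasses import dataclass

class NotFound(Exception):
    """Raised for missing and for forbidden records alike."""

@dataclass(frozen=True)
class Document:
    id: int
    owner_id: int
    title: str

DOCUMENTS = {
    1: Document(id=1, owner_id=10, title="Alice's notes"),
    2: Document(id=2, owner_id=20, title="Bob's notes"),
}

def get_document_for(user_id: int, doc_id: int) -> Document:
    """Look up by ID and owner together, never by ID alone."""
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc.owner_id != user_id:
        # Same error in both cases, so valid IDs cannot be enumerated.
        raise NotFound(doc_id)
    return doc
```

A search result that calls a by-ID lookup without passing any notion of the current user is a candidate for closer review.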

> **Warning**
>
> `@login_required` (or its equivalent in your framework) confirms authentication, not authorization. The two checks serve different purposes. A codebase where every endpoint passes authentication review but none of the resource-specific endpoints pass authorization review has confirmed that only logged-in users can exploit its IDORs.

### Privilege Escalation Patterns

Horizontal privilege escalation: a user accesses resources belonging to other users at the same privilege level (the IDOR pattern described above).

Vertical privilege escalation: a user performs operations intended only for higher-privilege roles. This happens most commonly when role checks are missing from specific endpoints, when role logic is checked client-side rather than server-side, or when a lower-privilege user can modify their own role attribute.

```javascript
// Vertical escalation via self-modification
router.patch('/api/users/:id', authenticate, async (req, res) => {
  const user = await User.findById(req.params.id);

  // Missing: verify req.user.id === req.params.id OR req.user.isAdmin
  // Missing: filter req.body to allowed fields

  Object.assign(user, req.body);  // User can set their own role: 'admin'
  await user.save();
  res.json(user);
});
```

The semantic search query for this pattern: "endpoints that update user attributes from request body without field filtering". The manual review question: does each result restrict which fields can be updated, and does it verify that the requester has permission to update those fields for the target user?
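
Both review questions can be answered by a single allowlist filter applied before any assignment happens. The sketch below shows the idea in Python rather than Express; the field names, the dictionary representation of the requester, and `filter_update` are assumptions made for illustration.

```python
ALLOWED_SELF_FIELDS = {"display_name", "email"}
ALLOWED_ADMIN_FIELDS = ALLOWED_SELF_FIELDS | {"role", "is_active"}

def filter_update(requester: dict, target_id: int, body: dict) -> dict:
    """Return only the fields this requester may change on the target user."""
    is_admin = requester.get("role") == "admin"
    if not is_admin and requester.get("id") != target_id:
        raise PermissionError("cannot update another user")
    allowed = ALLOWED_ADMIN_FIELDS if is_admin else ALLOWED_SELF_FIELDS
    # Unknown or disallowed keys are dropped rather than assigned.
    return {key: value for key, value in body.items() if key in allowed}
```

With this in place, a request body that includes `role` from a non-admin user simply loses that key before the record is touched.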

### Session Management

Session security is a distinct concern from authentication logic. A strong authentication mechanism can be undermined by weak session management. The key areas:

**Session token entropy:** Tokens should be generated from a cryptographically secure random source with sufficient entropy. 128 bits minimum; 256 bits preferred.

```python
# Weak — predictable
import random
session_token = str(random.randint(1000000, 9999999))

# Strong — cryptographically secure
import secrets
session_token = secrets.token_hex(32)  # 256 bits
```

**Session invalidation on logout:** Logout must invalidate the server-side session, not just clear the client-side cookie. If the server does not track valid sessions and simply validates the token's signature, a stolen token remains valid indefinitely.

**Session fixation:** If a session token is assigned before authentication and reused after authentication, an attacker who knows or has planted the pre-authentication token (for example, through a crafted link or XSS) can use it once the victim logs in. Tokens should be regenerated at login and on any other privilege change.

**Cookie security attributes:** Session cookies should have `HttpOnly`, `Secure`, and `SameSite` set appropriately. Missing `HttpOnly` lets JavaScript read the cookie. Missing `Secure` allows transmission over plain HTTP. Missing `SameSite` leaves CSRF protection to browser defaults, which vary.
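
The four areas above fit together in a small amount of code. This is a sketch using only the standard library: `SessionStore` is a hypothetical in-memory stand-in for a real server-side session table, and the cookie name and `SameSite=Lax` choice are assumptions rather than recommendations for every application.

```python
import secrets
from http.cookies import SimpleCookie

class SessionStore:
    """In-memory stand-in for a server-side session table."""

    def __init__(self):
        self._sessions = {}

    def create(self, user_id=None) -> str:
        token = secrets.token_urlsafe(32)  # 256 bits from the OS CSPRNG
        self._sessions[token] = user_id
        return token

    def login(self, old_token: str, user_id: int) -> str:
        """Discard the pre-login token and issue a fresh one (fixation defense)."""
        self._sessions.pop(old_token, None)
        return self.create(user_id)

    def logout(self, token: str) -> None:
        """Invalidate server-side, so a stolen copy of the token stops working."""
        self._sessions.pop(token, None)

    def user_for(self, token: str):
        return self._sessions.get(token)

def session_cookie(token: str) -> str:
    """Build a Set-Cookie value carrying the three security attributes."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Lax"
    cookie["session"]["path"] = "/"
    return cookie["session"].OutputString()
```

When auditing, the equivalent questions are whether login replaces the token, whether logout deletes server-side state, and where the cookie attributes are set.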

---

### Key Takeaways

1. Authentication and authorization are distinct concerns requiring distinct checks; conflating them produces access control vulnerabilities.
2. Authentication gaps appear most often at the edges: new endpoints, inherited base classes, internal routes.
3. IDOR is the most common authorization failure — confirming that every resource retrieval includes an ownership or permission check is non-negotiable.
4. Vertical privilege escalation often occurs through unfiltered object updates; mass assignment is a common vector.
5. Session management failures can undermine strong authentication; token entropy, invalidation, and cookie attributes all require explicit verification.

### Chapter Exercise

Map every route in one service of your application. For each route: (1) confirm what authentication middleware is applied, (2) confirm whether authorization checks the ownership or role relationship for any resource it returns, and (3) identify whether the endpoint accepts object updates and whether field assignment is filtered. Any route that fails one of these checks needs remediation; a route that fails all three is the highest priority.

---

## Chapter 5: Input Handling and Injection Surfaces

Every application has an attack surface. For most web applications, the majority of that surface is input handling — every place where data from outside the system enters and gets processed. SQL injection, command injection, path traversal, XSS, and XML injection all share the same root cause: user-controlled data reaches a dangerous operation without sufficient sanitization, escaping, or parameterization.

The specifics differ by operation type, but the structural pattern is identical. Understanding this shared structure makes it possible to audit all injection surfaces systematically rather than checking each type independently.

### The Source-Sink Model

Injection vulnerabilities are always described by a source and a sink. The source is where untrusted data enters the system. The sink is the dangerous operation that processes it. The vulnerability exists when a path connects source to sink without adequate protection.

**Common sources:**
- HTTP request parameters (query string, POST body, path parameters)
- HTTP request headers (User-Agent, Referer, X-Forwarded-For, Cookie)
- File uploads (file content, file name, file metadata)
- Webhook payloads from external services
- Data read from external APIs
- Data read from shared storage (database, cache, message queue) that was originally written by untrusted parties

**Common sinks:**
- Database query execution
- Shell command execution
- File system operations (read, write, path construction)
- HTML template rendering
- Redirect location construction
- Log statement content
- Email content construction
- PDF or document generation

The audit task is to trace paths from sources to sinks and verify that adequate protection exists at each point.
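
Tracing can be pictured as a graph search from sources to sinks that reports every path with no protection step on it. The sketch below runs over a tiny hand-written call graph; the function names in the graph are invented, and treating any protection node as sufficient for whatever sink follows it is a simplification a real audit would not make.

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it passes data to.
CALL_GRAPH = {
    "read_query_param": ["build_report", "escape_html"],
    "escape_html": ["render_template"],
    "build_report": ["run_sql"],
    "run_sql": [],
    "render_template": [],
}
SOURCES = {"read_query_param"}
SINKS = {"run_sql", "render_template"}
PROTECTIONS = {"escape_html"}  # steps that neutralize data for the next sink

def unprotected_paths(graph, sources, sinks, protections):
    """Breadth-first search for source-to-sink paths with no protection step."""
    findings = []
    queue = deque([source] for source in sources)
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in sinks:
            findings.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt in protections or nxt in path:
                continue  # protected branch, or a cycle
            queue.append(path + [nxt])
    return findings
```

Here the template path is protected by `escape_html`, while the path through `build_report` reaches `run_sql` unprotected and is reported.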

> **Key Insight**
>
> Most codebases treat data sanitization as something that happens at the input — clean it once, trust it everywhere. This is the wrong model. Protection should be applied at the sink — using parameterized queries at query execution time, using escaping at template rendering time, using validation at path construction time. Input-layer sanitization is a defense-in-depth measure, not a primary defense.

### SQL Injection

SQL injection remains the most consequential injection class because the impact — arbitrary data extraction, modification, or deletion — is severe and the database is almost always connected to the most sensitive data the application holds.

The primary defense is parameterized queries. The parameterized form separates query structure from query data, so user-provided values are never interpreted as SQL syntax.

```python
# Vulnerable
def get_user_by_email(email):
    query = f"SELECT * FROM users WHERE email = '{email}'"
    return db.execute(query)

# Vulnerable — using old-style formatting
def get_user_by_email(email):
    query = "SELECT * FROM users WHERE email = '%s'" % email
    return db.execute(query)

# Safe — parameterized
def get_user_by_email(email):
    return db.execute("SELECT * FROM users WHERE email = ?", (email,))

# Safe — ORM
def get_user_by_email(email):
    return User.objects.filter(email=email).first()
```

The ORM case is safe precisely because it generates parameterized queries internally. But ORMs have escape hatches — `raw()` in Django, `text()` in SQLAlchemy, `query()` with `${}` interpolation in Sequelize. These bypass the ORM's protection and require explicit parameterization.

```python
# Dangerous ORM escape hatch
from django.db import connection

def search_users(name):
    with connection.cursor() as cursor:
        # cursor.execute() with f-string interpolation: vulnerable
        cursor.execute(f"SELECT * FROM users WHERE name LIKE '%{name}%'")
        return cursor.fetchall()

# Safe — parameterized even in raw SQL
def search_users(name):
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT * FROM users WHERE name LIKE %s",
            [f"%{name}%"]
        )
        return cursor.fetchall()
```

The semantic search query for this pattern: "functions that construct SQL queries using string formatting or f-strings". Pay particular attention to search results involving `raw()`, `execute()`, `query()`, and any function whose name suggests database interaction.
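
The difference between the two forms is easy to reproduce against an in-memory SQLite database. This is a self-contained demonstration, assuming a made-up `users` table and helper names; it uses the standard library `sqlite3` module, whose placeholder style is `?`.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT, is_admin INTEGER)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice@example.com", 1), ("bob@example.com", 0)],
    )
    return conn

def find_vulnerable(conn, email):
    # String formatting lets the input rewrite the query structure.
    query = f"SELECT email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()

def find_safe(conn, email):
    # The driver binds the value separately; it is never parsed as SQL.
    return conn.execute(
        "SELECT email FROM users WHERE email = ?", (email,)
    ).fetchall()

# A classic payload that turns the WHERE clause into a tautology.
PAYLOAD = "nobody' OR '1'='1"
```

Calling `find_vulnerable(conn, PAYLOAD)` returns every row in the table, while `find_safe(conn, PAYLOAD)` returns nothing because no user has that literal email address.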

### Command Injection

Command injection occurs when user input reaches shell command execution. The consequences are severe — arbitrary code execution on the server. The pattern appears most often in:
- File processing utilities that call external programs
- Image/video conversion pipelines
- PDF generation tools
- System administration features (backup, log rotation, diagnostics)
- CI/CD and automation tooling

```python
import subprocess

# Vulnerable — shell=True with string interpolation
def convert_image(filename):
    cmd = f"convert {filename} -resize 800x600 output.jpg"
    subprocess.run(cmd, shell=True)

# Attack: filename = 'x.jpg; rm -rf / #'
# The shell runs: convert x.jpg; rm -rf / # -resize 800x600 output.jpg

# Safe — list form, no shell interpretation
def convert_image(filename):
    # Also validate filename against an allowlist first
    subprocess.run(
        ["convert", filename, "-resize", "800x600", "output.jpg"],
        shell=False
    )
```

The list form prevents shell interpretation. Even if `filename` contains shell metacharacters, they are passed as literal arguments to `convert`, not interpreted by the shell.
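
This behavior can be checked without ImageMagick installed. The sketch below launches a child Python process that writes back the single argument it received; `echo_argument` is a hypothetical helper written for this demonstration only.

```python
import subprocess
import sys

def echo_argument(value: str) -> str:
    """Pass value as one argv entry to a child process and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout

# Shell metacharacters that would split the command under shell=True.
HOSTILE = "photo.jpg; echo INJECTED"
```

The child process receives the hostile string unchanged as one argument, semicolon included, because no shell ever parses it.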

> **Warning**
>
> `shell=True` in Python's subprocess is the primary command injection vector in Python codebases. Any code that combines `shell=True` with a string that includes variable data is worth treating as a likely vulnerability until proven otherwise.

### Path Traversal

Path traversal occurs when user input is included in a file path construction without validation that the resulting path stays within the intended directory.

```python
# Vulnerable
def serve_file(filename):
    base_dir = "/var/app/files"
    file_path = os.path.join(base_dir, filename)
    with open(file_path, 'rb') as f:
        return f.read()

# Attack: filename = "../../../../etc/passwd"
# Resolved path: /var/app/files/../../../../etc/passwd = /etc/passwd
```

`os.path.join` does not protect against traversal: it concatenates the components and leaves `..` segments in place for the operating system to resolve, and an absolute `filename` discards `base_dir` entirely. The fix is to resolve the canonical path and verify it starts with the intended base directory.

```python
import os

def serve_file(filename):
    base_dir = os.path.realpath("/var/app/files")  # canonical form, in case of symlinks
    requested_path = os.path.realpath(os.path.join(base_dir, filename))

    if not requested_path.startswith(base_dir + os.sep):
        raise ValueError("Path traversal detected")

    with open(requested_path, 'rb') as f:
        return f.read()
```
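
The same check can be written with `pathlib`, which compares whole path components instead of string prefixes. This is a sketch assuming Python 3.9 or later (for `Path.is_relative_to`); `resolve_within` is a hypothetical helper name.

```python
from pathlib import Path

def resolve_within(base_dir: str, user_path: str) -> Path:
    """Return the resolved path, or raise if it escapes base_dir."""
    base = Path(base_dir).resolve()
    candidate = (base / user_path).resolve()
    # Component-wise comparison: /var/app/files-old is not treated as a
    # child of /var/app/files, and the base directory itself is rejected.
    if candidate == base or not candidate.is_relative_to(base):
        raise ValueError("path escapes base directory")
    return candidate
```

Because an absolute `user_path` replaces the base when joined, it resolves outside the base directory and is rejected by the same check as a `..` sequence.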

### Cross-Site Scripting (XSS)

XSS occurs when user-controlled data is rendered in an HTML context without proper escaping. The impact ranges from session theft to account takeover to malware distribution depending on what the application allows JavaScript to do.

Most modern web frameworks enable auto-escaping in their default template configuration. The vulnerabilities appear where developers opt out of auto-escaping, use raw HTML insertion APIs, or build HTML strings outside the template system.

```javascript
// Vulnerable — innerHTML assignment
function displayUsername(username) {
    document.getElementById('welcome').innerHTML = `Welcome, ${username}!`;
}

// Safe — textContent assignment
function displayUsername(username) {
    document.getElementById('welcome').textContent = `Welcome, ${username}!`;
}

// Vulnerable — dangerouslySetInnerHTML in React
function UserBio({ bio }) {
    return <div dangerouslySetInnerHTML={{ __html: bio }} />;
}

// Safe — only if bio has been sanitized server-side with an allowlist
// Otherwise, do not render HTML from user input at all
```

Server-side XSS often appears in template engines where developers use unescaped output markers:

```
{# Jinja2 — vulnerable #}
{{ user.bio | safe }}

{# Jinja2 — safe #}
{{ user.bio }}
```

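The `| safe` filter in Jinja2 (and its equivalent in other template engines) disables escaping. Search for these markers and verify that the content being marked safe has been sanitized before rendering. Also confirm that autoescaping is actually on: a bare Jinja2 `Environment` does not autoescape unless configured to, whereas Flask enables it for HTML templates.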
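
Where HTML is built outside the template system, the escaping has to be applied by hand at the point of output. A minimal sketch with the standard library follows; `render_bio` is a hypothetical function, and the surrounding markup is invented for the example.

```python
from html import escape

def render_bio(bio: str) -> str:
    """Escape at the sink: the point where text is placed into HTML."""
    # quote=True also escapes quote characters, which matters if the
    # value is ever moved into an attribute.
    return f'<div class="bio">{escape(bio, quote=True)}</div>'
```

Code that concatenates user data into markup without a call like this is the server-side equivalent of an `innerHTML` assignment.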

### XXE and Deserialization

XML External Entity (XXE) injection targets XML parsers that resolve external entities by default. Most modern XML parsers in mainstream languages have this disabled by default, but older configurations and specific libraries still have it enabled.

```python
# Vulnerable — lxml with external entity resolution
from lxml import etree

def parse_xml(xml_data):
    parser = etree.XMLParser()  # Relies on defaults; older lxml releases resolve external entities
    return etree.fromstring(xml_data, parser)

# Safe — external entities disabled
def parse_xml(xml_data):
    parser = etree.XMLParser(
        resolve_entities=False,
        no_network=True,
        load_dtd=False
    )
    return etree.fromstring(xml_data, parser)
```

Deserialization of untrusted data is similar in structure: user-provided binary or text data is processed by a deserialization operation that can trigger code execution. Python's `pickle`, Java's native serialization, PHP's `unserialize()`, and Ruby's `Marshal.load()` are all capable of executing arbitrary code during deserialization if the input is attacker-controlled.

The rule is simple: never deserialize data from untrusted sources using these mechanisms. Use data formats that cannot execute code (JSON, Protocol Buffers, MessagePack) or implement strict type validation before deserialization.
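
As an illustration of the safer route, the sketch below parses untrusted bytes as JSON and then accepts only an expected shape. The `theme` and `page_size` fields and the `load_preferences` name are hypothetical; the point is that parsing produces plain data, and validation decides what is kept.

```python
import json

def load_preferences(raw: bytes) -> dict:
    """Parse untrusted bytes as JSON and accept only the expected shape."""
    data = json.loads(raw)  # raises a ValueError subclass on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    theme = data.get("theme")
    page_size = data.get("page_size")
    if theme not in {"light", "dark"}:
        raise ValueError("invalid theme")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(page_size, bool) or not isinstance(page_size, int):
        raise ValueError("invalid page_size")
    if not 1 <= page_size <= 100:
        raise ValueError("page_size out of range")
    # Unknown keys are dropped instead of being passed along.
    return {"theme": theme, "page_size": page_size}
```

Unlike unpickling, nothing in this path can cause attacker-chosen code to run; the worst case for malformed input is a rejected request.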

---

### Key Takeaways

1. All injection vulnerabilities share the source-sink model: untrusted data reaches a dangerous operation without adequate protection.
2. Protection belongs at the sink, not just at the input — parameterized queries, escaping at render time, validated paths.
3. SQL injection survives ORMs because raw query escape hatches exist; audit those explicitly.
4. `shell=True` combined with variable data is a command injection flag; prefer list-form subprocess calls.
5. Deserialization of untrusted data from pickle-like mechanisms is an arbitrary code execution risk; switch to safe serialization formats.

### Chapter Exercise

Find every place in your codebase where the application writes data to a file or reads a file at a path that includes any portion of user-provided data. For each location: (1) trace where the filename or path component originates, (2) verify whether path canonicalization and base directory validation are applied, and (3) confirm whether the file operation is on the read path, write path, or both. Write path vulnerabilities are particularly severe.

---

## Chapter 6: Cryptography Misuse and Secrets in Code

Cryptography is a domain where mistakes are not always visible. A system using broken cryptography often looks, from the outside, exactly like a system using strong cryptography. The cipher text exists. The hash exists. The signature exists. The fact that they can be broken or forged is not apparent until it is exploited.

Cryptographic misuse in production code is extraordinarily common — not because developers are careless, but because cryptography is genuinely hard to use correctly, the standard libraries expose dangerous options without obvious warnings, and the consequences of getting it wrong are delayed and indirect.

### Password Storage

Password storage is the most widely understood cryptography problem in web development, and yet poor implementations remain common. The correct approach has been settled for years: use a purpose-built, adaptive hashing algorithm with built-in salting. bcrypt, scrypt, and Argon2 are the appropriate choices. SHA-256, SHA-512, and MD5 are not, regardless of whether salting is applied.

```python
# Wrong — SHA-256 is not a password hashing function
import hashlib

def hash_password(password):
    return hashlib.sha256(password.encode()).hexdigest()

# Also wrong — salted SHA-256 is still wrong
def hash_password(password, salt):
    return hashlib.sha256((salt + password).encode()).hexdigest()

# Right — bcrypt
import bcrypt

def hash_password(password):
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))

def verify_password(password, hashed):
    return bcrypt.checkpw(password.encode(), hashed)

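# Right: Argon2
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher()

def hash_password(password):
    return ph.hash(password)

def verify_password(password, hashed):
    # verify() raises on a mismatch instead of returning False
    try:
        return ph.verify(hashed, password)
    except VerifyMismatchError:
        return False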
```

The reason SHA-256 is wrong for password storage is computational speed. SHA-256 can be computed billions of times per second on modern GPU hardware. bcrypt with a cost factor of 12 takes on the order of a few hundred milliseconds per hash on a typical CPU core, which limits an attacker to a handful of guesses per second per core. That difference is what makes bcrypt resistant to brute-force attacks after a breach and makes SHA-256 catastrophically unsuitable.
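
Where adding a third-party dependency is not an option, scrypt is available through the standard library's `hashlib` on Python builds linked against a recent OpenSSL. The following is a sketch under that assumption; the cost parameters are illustrative values to be tuned for your own hardware and latency budget, not a recommendation.

```python
import hashlib
import hmac
import os

# Illustrative parameters; tune to your hardware and latency budget.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); store both alongside the user record."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute with the stored salt and compare in constant time."""
    digest = hashlib.scrypt(password.encode(), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(digest, expected)
```

Raising `n` increases both the time and the memory each guess costs, which is the property that a fast general-purpose hash lacks.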

Semantic search query: "functions that hash or store user passwords". Review every result. If any result applies `md5`, `sha1`, `sha256`, or `sha512` directly to passwords, rather than inside a key derivation function such as PBKDF2 with a high iteration count, treat it as a confirmed vulnerability.

> **Warning**
>
> Salting does not fix SHA-256 for password storage. Salting prevents rainbow table attacks; it does nothing about brute force speed. This is one of the most common misconceptions in password storage. A salted SHA-256 password database can still be cracked at billions of guesses per second.

### Symmetric Encryption

Symmetric encryption misuse typically appears in one of three forms: using an outdated algorithm, using a secure algorithm incorrectly, or using a secure algorithm with a weak key.

**Algorithm choice:** AES is the standard. DES and 3DES are deprecated and should not appear in new code. RC4 is broken and should not appear anywhere.

**Mode choice:** AES-GCM is the correct choice for most applications. It provides both encryption and authentication (preventing ciphertext tampering). AES-CBC without a message authentication code (MAC) is vulnerable to padding oracle attacks. AES-ECB reveals patterns in plaintext and should never be used.

```python
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os

# Right — AES-GCM
def encrypt(data: bytes, key: bytes) -> tuple[bytes, bytes]:
    nonce = os.urandom(12)  # 96-bit nonce for GCM
    aesgcm = AESGCM(key)
    ciphertext = aesgcm.encrypt(nonce, data, None)
    return nonce, ciphertext

def decrypt(nonce: bytes, ciphertext: bytes, key: bytes) -> bytes:
    aesgcm = AESGCM(key)
    return aesgcm.decrypt(nonce, ciphertext, None)

# Wrong — ECB mode (never use this)
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def encrypt_ecb(data: bytes, key: bytes) -> bytes:
    cipher = Cipher(algorithms.AES(key), modes.ECB())
    encryptor = cipher.encryptor()
    return encryptor.update(data) + encryptor.finalize()
```

**IV/nonce reuse:** In modes like GCM, reusing a nonce with the same key is catastrophic: it reveals the XOR of the affected plaintexts and lets an attacker recover the authentication subkey, which makes forged ciphertexts possible. Nonces must be generated freshly for each encryption operation using a cryptographically secure random source.

**Key management:** Encryption keys must not be hard-coded, derived from weak sources, or stored alongside the data they protect. Key derivation from passwords must use a KDF (Key Derivation Function) like PBKDF2, scrypt, or Argon2 — not a simple hash.

### Token and Secret Generation

Random number generation for security purposes must use a cryptographically secure PRNG. Language standard libraries typically provide both a fast, non-secure PRNG (for simulations, games, statistical sampling) and a CSPRNG (for security-sensitive operations). Using the wrong one is an easy mistake.

```python
import random
import secrets

# Wrong: Mersenne Twister output is predictable, and six digits is a tiny space
reset_token = str(random.randint(100000, 999999))

# Wrong — larger space but still predictable
reset_token = hex(random.getrandbits(128))

# Right — cryptographically secure
reset_token = secrets.token_urlsafe(32)  # 256 bits of entropy

# Right — UUID v4 uses os.urandom() internally
import uuid
reset_token = str(uuid.uuid4())
```

In JavaScript:

```javascript
// Wrong — Math.random() is not cryptographically secure
const token = Math.random().toString(36).substring(2);

// Right — Web Crypto API
const array = new Uint8Array(32);
crypto.getRandomValues(array);
const token = Array.from(array, byte => byte.toString(16).padStart(2, '0')).join('');
```

### Secrets in Code

Hard-coded credentials, API keys, and secrets are among the most reliably exploitable vulnerabilities in any codebase. They appear in version control history, are accessible to anyone with repository access, and often survive long after they should have been rotated.

The pattern appears in several forms:

```python
# Direct embedding
DATABASE_URL = "postgresql://admin:s3cr3tpassword@db.internal/prod"

# Configuration file checked into version control
# config/settings.py
STRIPE_SECRET_KEY = "sk_live_4xK2..."
SENDGRID_API_KEY = "SG.4xK2..."

# Fallback defaults that are real secrets
SECRET_KEY = os.environ.get('SECRET_KEY', 'my-hardcoded-secret')

# Secrets in test files (often overlooked)
# tests/test_payment.py
STRIPE_TEST_KEY = "sk_test_actualRealTestKey..."
```

The third form is particularly insidious: using a hard-coded secret as the fallback when an environment variable is not set. In development, this looks fine. In production, if the environment variable is not configured correctly, the fallback kicks in and the hard-coded value is used — possibly without anyone noticing.

Secrets in code are audited using a combination of pattern matching and semantic search.

Pattern matching: regular expressions targeting common credential formats — AWS access key patterns (`AKIA[0-9A-Z]{16}`), API key formats (`sk_live_`, `ghp_`, `Bearer `), private key headers (`-----BEGIN RSA PRIVATE KEY-----`).

Semantic search: "code that loads or configures API credentials or service authentication". This catches cases where the credential is stored in a variable name that does not match common patterns.

Git history: secrets that were removed from code often remain in git history. A complete audit checks historical commits, not just the current HEAD.

```bash
# Search git history for common credential patterns
git log --all -p | grep -E "(password|secret|key|token)\s*=\s*['\"][^'\"]{8,}['\"]"

# Search for AWS key patterns in history
git log --all -p | grep -E "AKIA[0-9A-Z]{16}"
```
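
The same patterns can be applied from a script, which makes it easier to add formats and to report which pattern matched. This is a minimal sketch, assuming a hypothetical `scan_lines` helper and a small pattern table built from the examples above. The character classes after the `sk_live_` and `ghp_` prefixes are a loose assumption, not the providers' documented formats.

```python
import re

# Patterns taken from the examples in this section; real scanners ship many more.
CREDENTIAL_PATTERNS = {
    "aws_access_key_id": re.compile(r"AKIA[0-9A-Z]{16}"),
    "stripe_live_key": re.compile(r"sk_live_[0-9A-Za-z]+"),
    "github_token": re.compile(r"ghp_[0-9A-Za-z]+"),
    "private_key_header": re.compile(r"-----BEGIN RSA PRIVATE KEY-----"),
    "quoted_assignment": re.compile(
        r"(?i)(password|secret|key|token)\s*=\s*['\"][^'\"]{8,}['\"]"
    ),
}


def scan_lines(lines):
    """Yield (line_number, pattern_name) for every pattern that matches a line."""
    for number, line in enumerate(lines, start=1):
        for name, pattern in CREDENTIAL_PATTERNS.items():
            if pattern.search(line):
                yield number, name
```

Passing `sys.stdin` to `scan_lines` in a script fed by `git log --all -p` would cover history as well as the working tree. Expect false positives; every hit still needs a human look.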

> **Try This**
>
> Run `git log --all --full-history -- "*.env" "*.json" "*.yaml" "*.yml" "*.py" "*.js"` to find commits that touched configuration files. Then examine those commits for any credentials that may have been added and "removed" from the current state of the file. Removal from the file does not remove from history.

### Signature and Integrity Verification

Cryptographic signatures provide integrity guarantees — they allow a receiver to verify that data has not been tampered with. Failing to verify signatures is as dangerous as not signing at all.

The most common failure: receiving a JWT, extracting the payload, and using the payload data without verifying the signature. Some JWT libraries, particularly in older versions, accepted tokens whose header declared `"alg": "none"` and skipped signature verification entirely.

```python
import jwt

# Vulnerable — no signature verification
def get_user_from_token(token):
    # verify_signature=False turns off signature checking, so forged tokens are accepted
    payload = jwt.decode(token, options={"verify_signature": False})
    return payload['user_id']

# Vulnerable — accepts algorithm from token header
def get_user_from_token(token):
    header = jwt.get_unverified_header(token)
    algorithm = header['alg']  # Attacker can set this to 'none'
    payload = jwt.decode(token, PUBLIC_KEY, algorithms=[algorithm])
    return payload['user_id']

# Safe — explicit algorithm, mandatory verification
def get_user_from_token(token):
    payload = jwt.decode(
        token,
        PUBLIC_KEY,
        algorithms=["RS256"],  # Explicit, not from token header
        options={"require": ["exp", "iat", "sub"]}
    )
    return payload['sub']
```
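
The verify-before-use rule is not specific to JWTs. For any signed payload (a webhook body, a signed cookie), the pattern is the same: recompute the signature, compare in constant time, and only then read the data. Here is a minimal standard-library sketch using HMAC-SHA256; the function names are illustrative.

```python
import hashlib
import hmac


def sign(payload: bytes, key: bytes) -> str:
    """Return a hex-encoded HMAC-SHA256 tag for the payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_then_use(payload: bytes, received_tag: str, key: bytes) -> bytes:
    """Return the payload only if its tag verifies; raise otherwise."""
    expected_tag = sign(payload, key)
    # compare_digest runs in constant time, which avoids a timing side channel.
    if not hmac.compare_digest(expected_tag, received_tag):
        raise ValueError("signature mismatch: payload rejected")
    return payload
```

The important design choice is that the caller never sees the payload unless verification succeeded; there is no code path that parses first and checks later.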

---

### Key Takeaways

1. Use purpose-built password hashing algorithms (bcrypt, scrypt, Argon2); SHA-256, even salted, is not appropriate for passwords.
2. AES-GCM is the correct choice for symmetric encryption; ECB mode reveals patterns and must never be used.
3. Security-sensitive random values must use a CSPRNG; `Math.random()` and `random.random()` are not cryptographically secure.
4. Hard-coded secrets in source code expose credentials to anyone with repository access and often survive rotation attempts because they persist in git history.
5. JWT and signature verification must explicitly specify the allowed algorithm and must not trust the algorithm specified in the token header.

### Chapter Exercise

Search your codebase for all uses of `md5`, `sha1`, `sha256`, and `sha512`. For each result, determine: is this being applied to a password? If so, it is a confirmed vulnerability. Is this being used for data integrity verification (not security)? Then the algorithm choice depends on the threat model. Is this being used for HMAC-based authentication? Then the algorithm choice matters — SHA-256 is acceptable for HMAC but the key length must be sufficient. Categorize each usage by purpose, then assess whether the algorithm is appropriate for that purpose.

---

## Chapter 7: Dependency Risk and Supply Chain

The code your team wrote is only part of your attack surface. Every library, framework, package, and tool your application depends on extends that surface. In most production applications, the majority of the code that executes — by line count — was written by someone other than your team. The security of that code is largely outside your direct control, but its vulnerabilities become your vulnerabilities.

Supply chain security has moved from a theoretical concern to an operational one. The SolarWinds attack, the Log4Shell vulnerability, the event-stream compromise, and dozens of smaller incidents have established that attackers actively target upstream dependencies as a way to compromise downstream systems at scale.

### Mapping Your Dependency Graph

The first step is knowing what you depend on. This sounds obvious but is often not well understood in practice.

Direct dependencies are the packages you explicitly declare — in `requirements.txt`, `package.json`, `go.mod`, `Gemfile`. Transitive dependencies are everything those packages depend on, and everything those packages depend on, and so on. In a moderately complex JavaScript application, a `package.json` with 50 direct dependencies might produce a `node_modules` directory with 500 or 1,000 total packages.

```bash
# Node.js — count total installed packages
npm ls --all --parseable | wc -l

# Python — list all packages including transitive
pip freeze | wc -l

# Go — list module dependencies
go list -m all | wc -l
```

Each of those packages is code your application executes. Each has its own dependency tree, its own update cadence, its own maintainer security practices, and its own vulnerability history.
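
To see the gap between direct and total dependencies for a Node.js project, the lockfile can be read directly. This sketch assumes a `package-lock.json` in lockfile version 2 or 3 format, where a top-level `packages` map holds a root entry under the empty key plus one entry per installed package. `count_dependencies` is an illustrative name.

```python
import json


def count_dependencies(lockfile_text: str) -> tuple[int, int]:
    """Return (direct, total) package counts from a package-lock.json string."""
    packages = json.loads(lockfile_text).get("packages", {})
    # The root project is stored under the empty-string key.
    root = packages.get("", {})
    direct = len(root.get("dependencies", {})) + len(root.get("devDependencies", {}))
    # Every other key is an installed package, including nested duplicates.
    total = sum(1 for path in packages if path != "")
    return direct, total
```

Reading the file with `Path("package-lock.json").read_text()` and comparing the two numbers gives a concrete ratio for your own project rather than a rule of thumb.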

> **Key Insight**
>
> The surface area of your supply chain is usually an order of magnitude larger than developers intuitively estimate. The number people quote when asked "how many dependencies does your application have?" is almost always the direct dependency count — which may be 5-10% of the actual total.

### Known Vulnerabilities: CVE Scanning

The most tractable part of dependency security is known vulnerabilities. Security researchers publish discovered vulnerabilities as CVEs (Common Vulnerabilities and Exposures), and the major package registries maintain databases that map CVEs to affected package versions.

Scanning your dependency lockfiles against these databases is automated and fast. Tools:

- `npm audit` — Node.js packages, uses npm's advisory database
- `pip-audit` — Python packages, uses PyPI advisory database and OSV
- `trivy` — Multi-language, pulls from multiple CVE databases
- `snyk test` — Commercial tool with broader coverage and developer workflow integration
- `dependabot` — GitHub-native, automated pull requests for dependency updates with known CVEs
- `OSV-Scanner` — Google's open-source tool, uses the Open Source Vulnerabilities database

```bash
# Python
pip-audit --requirement requirements.txt

# Node.js
npm audit --audit-level=high

# Trivy (cross-language, works on lockfiles or containers)
trivy fs --scanners vuln .
```

Running one of these tools against your current lockfile is the fastest possible security audit action. If it returns findings, those are publicly known vulnerabilities in versions you have installed; whether each one is exploitable in your application is a separate assessment.

> **Warning**
>
> CVE scanners report on known vulnerabilities in published versions. They do not detect novel supply chain attacks (malicious packages that have not yet been flagged), dependency confusion attacks, or typosquatting. Known CVEs are the floor of supply chain risk, not the ceiling.

### Transitive Risk and Severity Assessment

Not all dependency vulnerabilities are equally relevant. A CVE in a package affects your application only if:

1. The vulnerable code path in the package is reachable from your application's usage
2. The exploitation condition can be triggered in your deployment environment
3. The attacker has the ability to trigger the condition

A remote code execution vulnerability in a package your application uses only for log formatting has a different practical risk than an RCE in a package that processes untrusted user input directly.

Assess each finding in context:

- What functionality does the vulnerable package provide?
- Is the vulnerable code path exercised by your application?
- What is the exploit condition (unauthenticated access, specific input format, specific configuration)?
- Is that exploit condition reachable in your environment?

This contextualization is where the scanner output requires human judgment. Most scanners assign a CVSS score but have no knowledge of your specific application architecture. A critical CVSS score does not automatically mean critical risk to your specific system.
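
The three conditions listed at the start of this section can be recorded per finding so that the judgment is explicit and reviewable. This is a sketch only; the class name, field names, and the two outcome labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DependencyFinding:
    advisory_id: str               # placeholder identifier, not a real CVE
    package: str
    code_path_reachable: bool      # does the application call the vulnerable code?
    condition_triggerable: bool    # can the exploit condition occur in this deployment?
    attacker_can_trigger: bool     # can an attacker actually cause that condition?


def triage(finding: DependencyFinding) -> str:
    """Return 'relevant' only when all three conditions hold."""
    if (
        finding.code_path_reachable
        and finding.condition_triggerable
        and finding.attacker_can_trigger
    ):
        return "relevant"
    # Not exploitable as deployed; record the reasoning and re-check when usage changes.
    return "document-and-monitor"
```

Storing the three answers, rather than only the conclusion, lets a later reviewer see which assumption to revisit when the application's use of the package changes.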

### Dependency Pinning and Integrity

Pinning means specifying exact dependency versions in your lockfiles (`package-lock.json`, `poetry.lock`, `Gemfile.lock`) rather than floating ranges. Pinning ensures reproducible builds — the same code is installed every time.

Without pinning, a dependency update that happens between your last build and your next deployment can introduce a new version with a new vulnerability or, in the case of a supply chain attack, malicious code.

Integrity verification goes further: the lockfile records a cryptographic hash of each package. When the package is installed, the hash is verified. If the package content has been modified — even by the package registry — the hash will not match and installation will fail.

```json
// package-lock.json: integrity hash
{
  "node_modules/lodash": {
    "version": "4.17.21",
    "integrity": "sha512-v2kDEe57lecTulaDIuNTPy3Ry4gLGJ6Z1O3vE1krgXZNrsQ+LFTGHVxVjcXPs17LhbZVGedAJv8XZ1tvj5FvSg==",
    "resolved": "https://registry.npmjs.org/lodash/-/lodash-4.17.21.tgz"
  }
}
```
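
The check itself is simple to reproduce. The `integrity` field uses the Subresource Integrity format: the algorithm name, a hyphen, and the base64-encoded digest. Below is a sketch of the comparison a package manager performs, with illustrative function names.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Return an SRI-style string: 'sha512-' followed by the base64 SHA-512 digest."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def verify_integrity(data: bytes, expected: str) -> bool:
    """True if the downloaded bytes hash to the value recorded in the lockfile."""
    return sri_sha512(data) == expected
```

Any change to the tarball, even a single byte, produces a different digest, so `verify_integrity` returns `False` and the install should stop.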

`npm ci` and the equivalent lockfile-respecting install commands in other modern package managers enforce these integrity checks during installation. Confirming that your CI/CD pipeline uses the lockfile-respecting install command (for npm, `npm ci` rather than `npm install`) is a basic but important configuration check.

### Typosquatting and Dependency Confusion

Beyond known CVEs, supply chain attacks include two categories that scanners typically do not detect.

**Typosquatting:** Publishing malicious packages under names similar to popular legitimate packages. `requests` becomes `requestss`, `lodash` becomes `1odash` (with a numeral 1). Developers who mistype a package name in an install command pick up the attacker's package instead.
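
Names like these can often be flagged mechanically before anything is installed. The sketch below compares a requested name against a short list of well-known packages using the standard library's `difflib`. The list, the `lookalikes` name, and the 0.8 threshold are assumptions for illustration; a real check would seed the list from the registry's most-downloaded packages.

```python
from difflib import SequenceMatcher

# Hypothetical reference list of legitimate, widely used package names.
POPULAR_PACKAGES = ("requests", "lodash", "numpy", "express", "django")


def lookalikes(candidate: str, known=POPULAR_PACKAGES, threshold: float = 0.8):
    """Return known names the candidate closely resembles but does not equal."""
    matches = []
    for name in known:
        if candidate == name:
            return []  # exact match: this is the real package, not a lookalike
        if SequenceMatcher(None, candidate, name).ratio() >= threshold:
            matches.append(name)
    return matches
```

A non-empty result does not prove malice; it is a prompt to stop and confirm the package name before installing.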

Mitigation: review every new dependency before adding it. Look up the package on the registry before installing, and verify the author, the download count, and the publication date. A package published last week with 12 downloads that claims to be a utility library is worth scrutinizing.

**Dependency confusion:** Exploits the resolution order in package managers that check both public and private registries. If your organization has a private package named `internal-utils` and an attacker publishes a package with the same name to the public registry at a higher version number, some package manager configurations will install the public version.

Mitigation: use scoped packages (`@yourorg/internal-utils`) for internal packages, and map that scope to your private registry in `.npmrc` (for example `@yourorg:registry=https://npm.yourorg.example/`, a placeholder URL) so that scoped names are only ever resolved there. Registering the scope on the public registry as well prevents an attacker from claiming it.

### The Maintainer Risk Problem

Some supply chain risk has no automated solution. A well-maintained package with no CVEs can be compromised if its maintainer credentials are stolen, if the maintainer is socially engineered, or if the maintainer intentionally adds malicious code. The event-stream compromise in 2018 involved a maintainer transferring package ownership to an attacker, who then published a new release that pulled in a malicious dependency.

Reducing this risk:

- Prefer packages with multiple active maintainers over single-maintainer packages for critical dependencies
- Monitor packages for unusual activity (rapid version releases, new maintainers added, changes to CI/CD configuration)
- Consider package registry monitoring tools or registry webhooks that alert on new versions of your dependencies
- Pin critical dependencies to exact versions rather than floating minor or patch ranges, and review the changes before adopting a new version; a major version bump in particular can indicate significant code changes worth reviewing

> **Try This**
>
> For your five most critical dependencies (the ones that handle authentication, database access, or data parsing), look at the maintainer profile on the package registry. How many maintainers does it have? When was the last release? How many open security issues are there? How frequently do releases happen? This fifteen-minute exercise often produces surprising findings about the packages you implicitly trust the most.

---

### Key Takeaways

1. Transitive dependencies often outnumber direct dependencies by an order of magnitude — the supply chain surface is much larger than it appears.
2. CVE scanning against lockfiles is the fastest audit action available and should be part of every CI/CD pipeline.
3. Not all CVEs are equally relevant to your application; contextual assessment of reachability and exploit conditions is required.
4. Dependency pinning and integrity verification prevent unintended version updates and detect tampered packages.
5. Typosquatting and dependency confusion attacks cannot be detected by CVE scanners; they require process controls around dependency addition.

### Chapter Exercise

Run a CVE scan on your current lockfiles using `pip-audit`, `npm audit`, or an equivalent tool for your language stack. For each finding above your severity threshold: (1) look up the CVE and understand the exploit condition, (2) determine whether that exploit condition is reachable in your application, and (3) record a remediation action with a target date. Any critical findings with reachable exploit conditions should be treated as immediate priorities.

---

## Chapter 8: Prioritizing and Planning Remediation

An audit without a remediation plan is documentation. It records vulnerabilities but does not fix them. The gap between "we know this is broken" and "we fixed it" is where the real work happens, and it requires a different skillset than the audit itself — specifically, the ability to prioritize under real-world constraints and plan work in a way that engineering teams can execute.

### Why Prioritization Is Hard

Security findings are not created equal, but the factors that determine severity in practice are not the same as the factors that determine severity in scanner output.

CVSS scores measure technical severity: how bad this could get in a worst-case scenario. They do not measure:

- Exploitability in your specific deployment context
- Likelihood that an attacker would target this particular vulnerability
- Actual business impact of exploitation
- Cost to remediate
- Risk introduced by the remediation itself

A critical CVSS vulnerability in a service that is not internet-facing and sits behind multiple layers of access control is lower practical risk than a medium CVSS vulnerability in a publicly accessible API that processes payment data.

Prioritization needs to account for all of these dimensions, not just technical severity. The goal is to maximize risk reduction per unit of engineering effort — which is a different optimization target than "fix the highest CVSS scores first."

> **Key Insight**
>
> Fixing a critical vulnerability in an internal service that is difficult to access is usually lower priority than fixing a medium vulnerability in a public endpoint that handles sensitive data. Exploitability and exposure determine real-world risk; CVSS alone does not.

### A Practical Risk Scoring Framework

A simple risk scoring model that works in practice:

```
Risk Score = Severity × Exploitability × Exposure × Impact
```

Where each factor is scored on a 1-3 scale:

**Severity (technical):**
- 3: Remote code execution, authentication bypass, or direct data exposure
- 2: Privilege escalation, injection requiring authentication, or indirect data exposure
- 1: Information disclosure, denial of service, or low-impact logic flaw

**Exploitability:**
- 3: No authentication required, no preconditions, public-facing
- 2: Requires authentication or specific conditions
- 1: Requires elevated access or complex preconditions

**Exposure:**
- 3: Internet-facing endpoint
- 2: Internal network with broad access
- 1: Isolated service, internal-only, significant access controls

**Impact:**
- 3: Customer PII, payment data, authentication credentials, or system integrity
- 2: Internal data, audit logs, or non-critical business data
- 1: Public data, low-sensitivity internal data

Maximum possible score: 81. Minimum: 1. In practice, scores above 50 are high priority, 20-50 are medium, and below 20 are lower priority.

This model is a starting point, not a formula. The specific weights and thresholds should be adjusted for your organization's risk tolerance and business context.
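
The model is small enough to encode directly, which keeps scoring consistent across reviewers. This is a minimal sketch using the factor names and priority bands described above; the `RiskFactors` class name is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskFactors:
    severity: int
    exploitability: int
    exposure: int
    impact: int

    def __post_init__(self):
        # Each factor must be on the 1-3 scale used by the model.
        for name in ("severity", "exploitability", "exposure", "impact"):
            value = getattr(self, name)
            if value not in (1, 2, 3):
                raise ValueError(f"{name} must be 1, 2, or 3 (got {value!r})")

    @property
    def score(self) -> int:
        return self.severity * self.exploitability * self.exposure * self.impact

    @property
    def priority(self) -> str:
        if self.score > 50:
            return "high"
        if self.score >= 20:
            return "medium"
        return "lower"
```

Because every factor is an integer from 1 to 3, only fifteen distinct scores are possible (1, 2, 3, 4, 6, 8, 9, 12, 16, 18, 24, 27, 36, 54, 81). With the bands above, a finding needs at least three factors at the maximum to reach the high band (54 or 81), which is worth keeping in mind when adjusting thresholds.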

### Grouping by Remediation Type

Before building a remediation plan, group findings by remediation type. Many vulnerabilities in different locations require the same kind of fix, and batching them produces more efficient remediation.

**Remediation type groups typically look like:**

- *Parameterization fixes:* Replace string-formatted SQL with parameterized queries. These are mechanical changes with low risk.
- *Library upgrades:* Update vulnerable dependencies to patched versions. Risk depends on how much the API changed.
- *Authentication additions:* Add missing authentication checks to unprotected endpoints. These require testing to verify the check does not break existing functionality.
- *Authorization additions:* Add ownership checks to resource retrieval operations. These require understanding which users should have access to which resources.
- *Secret rotation and externalization:* Remove hard-coded secrets, externalize to environment variables or secrets management, and rotate the exposed credentials. These involve operational coordination beyond the code change.
- *Algorithm replacements:* Replace weak cryptographic algorithms with appropriate ones. These require careful handling to maintain backward compatibility with existing data.
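
Once each finding carries a remediation type, the grouping is mechanical. This sketch assumes findings are dictionaries with hypothetical `id` and `remediation_type` keys; adapt the field names to whatever your tracker exports.

```python
from collections import defaultdict


def group_by_remediation(findings):
    """Map each remediation type to the list of finding IDs that need it."""
    groups = defaultdict(list)
    for finding in findings:
        groups[finding["remediation_type"]].append(finding["id"])
    return dict(groups)


def summarize(groups):
    """Return one 'count type' line per group, largest group first."""
    ordered = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
    return [f"{len(ids)} {kind}" for kind, ids in ordered]
```

The summary lines are what goes into the plan; the ID lists are what goes into the tickets.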

Grouping makes the remediation plan legible. Instead of "28 separate security findings," you have "6 parameterization fixes, 3 library upgrades, 8 authentication additions, 5 authorization (IDOR) fixes, 4 secret externalizations, and 2 algorithm replacements." That is a plan engineering teams can work with.

### Remediation Risk

Every fix introduces risk. A parameterized query fix has minimal risk — the behavior change is intended and testable. Adding an authentication check to an endpoint that was previously public has higher risk — it will break any client that was not sending authentication tokens, which might include legitimate internal services, monitoring systems, or legacy clients.

Before implementing each fix:

1. Understand the current behavior and who depends on it
2. Assess whether the fix changes visible behavior (it usually does)
3. Identify what will break and coordinate with affected teams
4. Plan for backward compatibility where necessary (e.g., a deprecation period for a changed endpoint)

Fixes that carry high remediation risk should be planned and communicated rather than implemented ad-hoc.

> **Warning**
>
> Security fixes that break production systems create pressure to revert. A reverted security fix is worse than a delayed one: it creates the impression that security is negotiable and makes it harder to get the fix reapplied later. Plan carefully, communicate broadly, and roll out gradually so that problems surface early rather than as emergencies.

### Building the Remediation Roadmap

A remediation roadmap answers four questions for each finding:

1. Who owns the fix?
2. What is the target date?
3. What does "done" look like?
4. What is the mitigation in the interim?

**Ownership** should be unambiguous. A finding that belongs to "the backend team" will not get fixed. A finding that belongs to a specific named engineer with the context to address it will.

**Target dates** should be based on risk score, not scheduling convenience. High-risk findings should have dates measured in days, not quarters.

**Definition of done** matters because security fixes often require more than a code change — credential rotation, configuration change, infrastructure update. Make the full requirement explicit.

**Interim mitigation** is often overlooked. If a critical vulnerability will take three weeks to fix properly, what is the plan for those three weeks? A WAF rule, a network control, a feature flag, or additional monitoring can reduce exposure while the proper fix is underway.

```markdown
## Finding: SQL Injection in User Search API
- **Severity:** High (Risk Score: 54)
- **Owner:** Sarah Chen, Backend Team
- **Target date:** 2026-05-01
- **Done when:** Parameterized queries replace all string formatting in
  `api/users/search.py`; unit tests confirm injection attempts do not
  execute; verified by code review
- **Interim mitigation:** WAF rule added to block common SQL metacharacters
  in the `q` parameter of `/api/users/search`
```
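
If roadmap entries live in a tracker or a structured file, the four questions can be enforced rather than remembered. This is a sketch under that assumption; the `RoadmapEntry` class and its field names are illustrative.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class RoadmapEntry:
    finding: str
    owner: Optional[str] = None
    target_date: Optional[date] = None
    done_when: Optional[str] = None
    interim_mitigation: Optional[str] = None

    def unanswered(self) -> list[str]:
        """Names of the roadmap questions that still have no answer."""
        return [
            f.name
            for f in fields(self)
            if f.name != "finding" and not getattr(self, f.name)
        ]
```

An entry is ready for the roadmap only when `unanswered()` returns an empty list; anything else is still a finding, not a plan.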

### Tracking and Verification

A finding is not remediated until the fix is verified. Verification means:

- The code change is reviewed and merged
- The fix addresses the full vulnerability path, not just the obvious trigger
- Tests exist that confirm the fix (and would catch a regression)
- If credentials were exposed, they have been rotated
- If a configuration was changed, the change is applied in all relevant environments

The temptation to mark a finding as remediated when the code change is merged without verification is strong. Resist it. A code change that fixes the visible case but leaves an adjacent case open is not a fix.

---

### Key Takeaways

1. CVSS scores measure technical severity; practical risk requires weighting exploitability, exposure, and business impact alongside technical severity.
2. Grouping findings by remediation type makes the work legible and enables efficient batching.
3. Security fixes carry their own risk; fixes that break production behavior require planning, communication, and testing.
4. A remediation plan must answer: who owns it, when is it done, what does done mean, and what is the mitigation in the interim.
5. A finding is not remediated until the fix is verified — code merge alone is not sufficient.

### Chapter Exercise

Take the top five findings from your most recent audit (or from the exercise outputs in previous chapters). Apply the risk scoring model above to each one, adjusting the weights to reflect your organization's context. Then write a one-paragraph remediation plan for the highest-scored finding: owner, target date, definition of done, and interim mitigation. The act of writing that plan will surface any ambiguities in the finding itself.

---

## Chapter 9: Continuous Security: Audit as Process

A point-in-time security audit is a photograph. It shows you what the codebase looked like on the day you took it. The moment new code is committed, the photograph is out of date. The vulnerabilities you found are being fixed while new ones may be introduced.

Security as a one-time or annual exercise is a compliance approach, not a security approach. It satisfies the checkbox. It does not actually reduce risk over time because the codebase continues to change between audits. A sustainable security posture requires making security discovery and remediation continuous processes, embedded in the engineering workflow, not bolted on as a periodic external activity.

### What Continuous Security Actually Means

Continuous security is not running the same scanner every day and hoping the output changes. It is building the feedback loops that ensure new vulnerabilities are found quickly after they are introduced, and that the cost of finding and fixing them stays low because fixes happen close to the code change that introduced them.

The core components of a continuous security process:

**Prevention:** Making it hard to introduce vulnerabilities in the first place. Pre-commit hooks, linter rules, IDE integrations, and developer training reduce the rate at which vulnerabilities enter the codebase.

**Detection:** Automated checks that run on every change — in CI pipelines — that catch vulnerability patterns, dependency updates with known CVEs, and configuration regressions.

**Response:** Defined processes for handling findings from automated detection, with clear ownership, severity tiers, and remediation SLAs.

**Measurement:** Tracking trends over time — mean time to detect, mean time to remediate, vulnerability rates by team and by type — to identify systemic issues and measure whether the process is working.

These four components work together. A process that only does detection without response is alert fatigue waiting to happen. A process that only does prevention without detection misses vulnerabilities that slip through. Measurement is what tells you which components are underperforming.

### Integrating Security Into CI/CD

The CI/CD pipeline is the natural integration point for automated security checks. Code does not reach production without passing through the pipeline, which makes it the right place to enforce security baselines.

**Dependency scanning:** Every build should run a dependency vulnerability scan. This is low-cost (fast, automated) and high-return (catches known CVEs before they reach production).

```yaml
# GitHub Actions example
- name: Dependency audit
  run: |
    # pip-audit exits non-zero when it finds vulnerabilities, which fails the step
    pip-audit --requirement requirements.txt
```

**Static analysis:** A linter configured with security rules runs on every pull request. This catches simple but high-frequency issues — shell injection via `shell=True`, use of `eval`, weak cryptographic algorithms, `pickle.loads` applied to untrusted input.

```yaml
- name: Security linting
  run: |
    bandit -r src/ --severity-level medium --confidence-level medium
```

**Secret scanning:** Every commit should be scanned for accidentally committed credentials.

```yaml
- name: Secret detection
  uses: trufflesecurity/trufflehog@main
  with:
    path: ./
    base: ${{ github.event.repository.default_branch }}
    head: HEAD
```

**SAST (Static Application Security Testing):** A deeper static analysis tool than a linter. Tools like Semgrep allow defining custom rules that match your codebase's specific patterns and technology stack.

```yaml
- name: Semgrep scan
  uses: returntocorp/semgrep-action@v1
  with:
    config: >-
      p/python
      p/jwt
      p/security-audit
```

The pipeline integration should enforce a minimum bar — failing the build on critical findings — while routing medium and lower findings to a review queue rather than blocking deployment. A security check that blocks every deployment on medium-severity findings will be disabled by engineers who need to ship.

> **Key Insight**
>
> The security enforcement threshold in CI should be set where you are willing to block a deployment. Setting it too low creates friction that engineers route around. Setting it too high misses significant issues. Critical findings always block; high findings should block with a defined exception process; medium and lower should route to a review queue.
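
The blocking policy can be expressed as a small gate that runs after the scanners and decides the pipeline's exit code. This is a sketch, assuming findings have been normalized into dictionaries with hypothetical `id` and `severity` keys and that approved exceptions are tracked by finding ID.

```python
def gate(findings, approved_exceptions=frozenset()):
    """Return (exit_code, review_queue) for a list of normalized findings.

    Critical findings always block. High findings block unless an approved
    exception exists. Everything else is routed to the review queue.
    """
    blocking, review_queue = [], []
    for finding in findings:
        severity = finding["severity"]
        if severity == "critical":
            blocking.append(finding)
        elif severity == "high" and finding["id"] not in approved_exceptions:
            blocking.append(finding)
        else:
            review_queue.append(finding)
    exit_code = 1 if blocking else 0
    return exit_code, review_queue
```

Keeping the exception list as explicit input means every bypass is visible in review, which is what distinguishes a defined exception process from simply lowering the threshold.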

### Developer-Facing Security Tooling

Prevention requires reaching developers at the moment of code writing, not after the fact in a CI report. Developer-facing security tooling runs in the IDE or in pre-commit hooks, providing feedback before the code is committed.

Pre-commit hooks can run fast checks — secret detection, obvious injection patterns, dangerous function calls — with low enough latency that developers experience them as part of the commit workflow rather than as an interruption.

```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/trufflesecurity/trufflehog
    rev: v3.63.7
    hooks:
      - id: trufflehog
        args: ['--only-verified']

  - repo: https://github.com/PyCQA/bandit
    rev: 1.7.6
    hooks:
      - id: bandit
        args: ["-lll"]

  - repo: https://github.com/awslabs/git-secrets
    rev: ad82d68ee924906a0401dfd48de5057731a9bc84
    hooks:
      - id: git-secrets
```

IDE integrations such as the Snyk IDE plugins and the Semgrep VS Code extension highlight potential issues inline as code is written. The feedback loop shortens from "hours later in a CI report" to "seconds after typing."

### Security-Aware Code Review

Automated tools catch patterns but miss intent. Human code review is still the best mechanism for catching authorization logic errors, business logic bypasses, and architectural security flaws that require understanding context.

Security-aware code review does not require every reviewer to be a security expert. It requires:

- A documented list of things to check for in code review (a security review checklist specific to your stack and patterns)
- The habit of asking security-relevant questions: where does this input come from? What happens if this value is controlled by an attacker? Is there an authorization check before this data access?
- Escalation paths for code that handles security-sensitive operations — a second review from someone with security expertise for authentication, cryptography, and payment handling code

A lightweight code review checklist for web applications:

```
Authentication & Authorization:
□ Does this endpoint require authentication if it returns user data?
□ Does this resource retrieval verify ownership (not just existence)?
□ Is there a role check before any privileged operation?

Input Handling:
□ Is user-provided data used in any database query? Is it parameterized?
□ Is user-provided data reflected in HTML output? Is it escaped?
□ Is user-provided data used in file paths? Is traversal prevented?

Cryptography:
□ Are any passwords being hashed? Is bcrypt/Argon2 used?
□ Are any secrets hardcoded or logged?

Dependencies:
□ Are any new dependencies being added? Were they audited?
□ Is the lockfile updated correctly?
```

### Periodic Deep Audits

Continuous automated checks are not a replacement for periodic human-led deep audits. Automation catches what it knows to look for. A human-led audit finds systemic issues, architectural flaws, and the patterns that fall outside automated rule sets.

The cadence depends on the rate of change and the sensitivity of the system. A fast-moving codebase handling financial data warrants a comprehensive audit at least twice a year. A slower-moving internal tool handling less sensitive data might warrant only an annual audit.

The difference between a periodic deep audit and the traditional annual compliance audit is this: the periodic deep audit is informed by the continuous monitoring data. You know where changes have been concentrated. You know which vulnerability categories automated tools have flagged recently. You know which teams have been shipping under time pressure. The audit focuses on those areas rather than sampling randomly.

> **Warning**
>
> If your continuous monitoring catches nothing between periodic audits, the monitoring is not configured correctly. Real codebases generate a steady stream of dependency updates, occasionally committed secrets, and periodic static analysis findings. Silence is a sign the tools are not working, not that the code is perfect.

### Measuring the Security Program

What gets measured gets managed. A security program that cannot articulate its metrics cannot demonstrate its value or identify where it is underperforming.

Useful security metrics:

**Mean time to detect (MTTD):** How long after a vulnerability is introduced does the monitoring detect it? This measures the effectiveness of the automated detection layer.

**Mean time to remediate (MTTR):** How long from detection to verified fix? This measures the effectiveness of the response process and reveals whether the priority and ownership assignments are working.

**Vulnerability rate by type:** Are SQL injections increasing or decreasing over time? This reveals whether developer training and linting rules are working for specific vulnerability classes.

**Escaped vulnerabilities:** Vulnerabilities found by external parties (bug bounty reporters, penetration testers, incident response) that were not found by internal processes. These are failures of the detection layer and deserve root cause analysis.

**Dependency age:** The average age of your dependency versions versus available versions. Old, unupdated dependencies are a proxy metric for supply chain risk.

These metrics are only useful if they are tracked over time and reviewed regularly. A monthly security metrics review — 30 minutes, the right people, the right data — produces more program improvement than quarterly reports that sit unread.
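
As a minimal sketch of how MTTD and MTTR might be computed, assume each finding is stored with three timestamps: when the vulnerable change was introduced, when monitoring detected it, and when the fix was verified. The `Finding` record and its field names are hypothetical, not the schema of any particular tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass
class Finding:
    """Hypothetical finding record; field names are illustrative assumptions."""
    category: str
    introduced_at: datetime
    detected_at: datetime
    remediated_at: Optional[datetime] = None  # None while the fix is unverified


def mean_time_to_detect(findings: list[Finding]) -> timedelta:
    """Average gap between introduction and detection."""
    seconds = mean((f.detected_at - f.introduced_at).total_seconds() for f in findings)
    return timedelta(seconds=seconds)


def mean_time_to_remediate(findings: list[Finding]) -> timedelta:
    """Average gap between detection and verified fix, over closed findings only."""
    closed = [f for f in findings if f.remediated_at is not None]
    seconds = mean((f.remediated_at - f.detected_at).total_seconds() for f in closed)
    return timedelta(seconds=seconds)


# Made-up sample data for illustration.
findings = [
    Finding("sql-injection", datetime(2026, 1, 1), datetime(2026, 1, 3), datetime(2026, 1, 10)),
    Finding("hardcoded-secret", datetime(2026, 1, 5), datetime(2026, 1, 9), datetime(2026, 1, 12)),
    Finding("idor", datetime(2026, 2, 1), datetime(2026, 2, 4)),  # still open
]

mttd = mean_time_to_detect(findings)
mttr = mean_time_to_remediate(findings)
```

Open findings are excluded from MTTR here, which flatters the number when old findings linger. Tracking the count and age of open findings alongside it keeps the metric honest.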

### Security Champions

Security expertise cannot be concentrated entirely in a security team, if you have one, or in one or two security-conscious engineers. At scale, that model creates bottlenecks and leaves most code without security-aware oversight.

The security champions model distributes security knowledge across engineering teams. Each team has one or two engineers who are more deeply invested in security — who stay current on relevant vulnerability classes, who lead security discussions in their team's code reviews, and who are the escalation point for security questions from their teammates.

Champions are not responsible for all security work in their team. They are responsible for raising the baseline security awareness and capability of their team, and for ensuring that security work gets the attention it needs rather than being deferred indefinitely.

Building a champions program:
- Identify interested engineers through voluntary opt-in
- Provide structured learning resources and a monthly community of practice
- Give champions access to security tooling and vulnerability reports for their team's code
- Acknowledge the contribution — it takes time and should count in performance evaluation

---

### Key Takeaways

1. Point-in-time audits are photographs; continuous security is the process that keeps the photograph current as the codebase changes.
2. CI/CD is the natural enforcement point for automated security checks; the threshold for blocking should be critical findings only, with medium and lower routed to review queues.
3. Developer-facing tooling (pre-commit hooks, IDE plugins) reduces the cost of security feedback by shortening the loop between writing code and detecting a problem in it.
4. Human code review with security checklists catches what automated tools miss; escalation paths for sensitive code ensure depth where it matters most.
5. Measurement — MTTD, MTTR, vulnerability rates, escaped findings — is what tells you whether the program is working and where to invest next.

### Chapter Exercise

Audit your current CI/CD pipeline for security integration. Answer these four questions: (1) Does every pipeline run include dependency vulnerability scanning? (2) Does every pipeline run include secret scanning? (3) Does every pipeline run include any static security analysis? (4) Are there pre-commit hooks configured for developer machines? For any "no" answer, identify the specific tool you would add and write the pipeline configuration snippet for it. Getting all four to "yes" is achievable in a single sprint and significantly raises your continuous security baseline.
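
One way to start the exercise is a rough first-pass check of the pipeline file itself. The sketch below is a heuristic under stated assumptions: the marker lists are hypothetical and incomplete, the sample pipeline text is made up, and a substring match only shows that a tool is mentioned, not that it runs or blocks on findings.

```python
# Hypothetical tool-name markers for the first three exercise questions.
# Extend these to match the tools your organization actually uses.
TOOL_MARKERS = {
    "dependency scanning": ("pip-audit", "npm audit", "trivy", "osv-scanner", "snyk"),
    "secret scanning": ("gitleaks", "trufflehog", "detect-secrets", "git-secrets"),
    "static analysis": ("semgrep", "bandit", "codeql", "sonar"),
}


def audit_pipeline(pipeline_config: str, has_precommit_config: bool) -> dict[str, bool]:
    """Report which of the four exercise questions appear to be answered 'yes'."""
    lowered = pipeline_config.lower()
    report = {
        check: any(marker in lowered for marker in markers)
        for check, markers in TOOL_MARKERS.items()
    }
    # Question 4 is about developer machines, so it is passed in separately.
    report["pre-commit hooks"] = has_precommit_config
    return report


# Illustrative pipeline text, not a complete or recommended configuration.
SAMPLE_PIPELINE = """
jobs:
  security:
    steps:
      - run: pip-audit -r requirements.txt
      - run: gitleaks detect
"""

report = audit_pipeline(SAMPLE_PIPELINE, has_precommit_config=False)
gaps = [check for check, present in report.items() if not present]
```

The `gaps` list is the to-do list for the sprint. Each entry still needs a human to confirm the finding and choose the tool.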

---

## Conclusion

A codebase is a living artifact. It changes daily. Engineers join and leave. Features are added and refactored. Third-party dependencies update, sometimes with vulnerabilities and sometimes with patches. The security posture of a codebase is not a property that gets established and stays fixed — it is a state that must be actively maintained.

The argument of this guide has been specific: the methods most teams use for security auditing are structurally misaligned with how vulnerabilities actually exist in production codebases. Grep-based searches miss the cross-file, multi-abstraction paths that vulnerabilities travel. Sample-based manual reviews miss the corners where vulnerabilities hide. Point-in-time assessments miss the vulnerabilities introduced after the review date.

Pattern-based discovery with semantic search tools addresses these structural misalignments. It makes coverage tractable. It finds code by what it does rather than what it is named. It surfaces absence patterns — the missing authorization check, the unsigned data that should be verified — not just the presence of dangerous code.

The methods in this guide are not theoretical. They are executable today with tools that exist and are accessible to engineering teams that are not dedicated security organizations. Semantic search over a codebase, hybrid retrieval that combines meaning and keyword, dependency scanning in CI, secret detection in pre-commit hooks — none of these require specialized security infrastructure. They require deliberate setup and the discipline to act on what they find.

The gap that most organizations face is not a knowledge gap. They know, at some level, that their codebase has vulnerabilities. The gap is methodological: they do not have a systematic approach to finding those vulnerabilities before an attacker does. This guide provides that approach.

Use it. Run the chapter exercises. Set up the CI integrations. Build the remediation plan. The vulnerabilities in your codebase will not disappear on their own.

---

## Appendix A: Glossary

**CSRF (Cross-Site Request Forgery):** An attack that tricks an authenticated user's browser into making a request to a web application, causing it to perform an action the user did not intend. Mitigated by CSRF tokens or the `SameSite` cookie attribute.

**CSPRNG (Cryptographically Secure Pseudo-Random Number Generator):** A random number generator whose output is unpredictable to an attacker. Required for security-sensitive random values like session tokens, password reset links, and cryptographic keys. Examples: `secrets` module in Python, `crypto.getRandomValues()` in JavaScript.
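
A minimal sketch using Python's standard `secrets` module. The 32-byte length is an assumption that suits most token uses, not a universal requirement.

```python
import secrets

# 32 random bytes, URL-safe base64 encoded: usable in a reset link or session cookie.
reset_token = secrets.token_urlsafe(32)

# 32 random bytes as raw key material.
key = secrets.token_bytes(32)


def tokens_match(presented: str, expected: str) -> bool:
    """Compare tokens in constant time to avoid leaking information through timing."""
    return secrets.compare_digest(presented, expected)
```

The general-purpose `random` module is not a substitute here: its output is predictable to an attacker who observes enough values.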

**CVE (Common Vulnerabilities and Exposures):** A standardized identifier and database entry for publicly disclosed security vulnerabilities. Format: CVE-YEAR-NUMBER. Published by MITRE and distributed through the National Vulnerability Database (NVD).

**CVSS (Common Vulnerability Scoring System):** A framework for rating the severity of security vulnerabilities on a 0-10 scale, based on factors including attack vector, attack complexity, privileges required, and impact.
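
For triage, scores are usually bucketed into the qualitative ratings defined by CVSS v3.x. A small sketch of that mapping follows; the function name is illustrative.

```python
def cvss_severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0
```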

**Data flow analysis:** A form of program analysis that traces how data moves through a program — from input sources to output sinks — to identify potentially dangerous paths. Used in static analysis tools to find injection vulnerabilities.

**Embedding:** A numerical vector representation of a piece of data (text, code) in a high-dimensional space. Items with similar meaning have geometrically similar embeddings, enabling similarity-based search.
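
"Geometrically similar" is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example.
query = [0.9, 0.1, 0.0]          # "where do we check user permissions?"
auth_check = [0.8, 0.2, 0.1]     # a function that verifies ownership
css_helper = [0.0, 0.1, 0.9]     # an unrelated styling utility

closest = max(
    {"auth_check": auth_check, "css_helper": css_helper}.items(),
    key=lambda item: cosine_similarity(query, item[1]),
)[0]
```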

**Hybrid search:** A retrieval method that combines results from multiple search techniques — typically semantic (embedding-based) and keyword (BM25 or TF-IDF) — using a fusion algorithm to produce a single ranked result list.

**IDOR (Insecure Direct Object Reference):** An authorization vulnerability where a user can access resources belonging to other users by manipulating the resource identifier in a request, without the server verifying ownership.
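
The fix is an ownership check on every lookup. A minimal sketch, with a hypothetical `Invoice` model and an in-memory store standing in for a database:

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    id: int
    owner_id: int
    total_cents: int


class Forbidden(Exception):
    """Raised when the caller may not see the requested resource."""


# Hypothetical in-memory store for illustration.
INVOICES = {
    1: Invoice(id=1, owner_id=10, total_cents=5_000),
    2: Invoice(id=2, owner_id=20, total_cents=9_900),
}


def get_invoice(invoice_id: int, current_user_id: int) -> Invoice:
    """Return the invoice only if it belongs to the authenticated user."""
    invoice = INVOICES.get(invoice_id)
    # Same error for "missing" and "not yours", so IDs cannot be enumerated.
    if invoice is None or invoice.owner_id != current_user_id:
        raise Forbidden(f"invoice {invoice_id} is not available")
    return invoice
```

The vulnerable version of this function is the one that returns `INVOICES[invoice_id]` directly. When auditing, the search target is lookups by ID with no reference to the current user nearby.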

**Injection vulnerability:** A class of vulnerability where untrusted data is sent to an interpreter (SQL engine, shell, HTML renderer) in a way that causes the interpreter to execute the data as a command or query. SQL injection, command injection, and XSS are all injection vulnerabilities.

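**IV (Initialization Vector) / Nonce:** A value used in conjunction with a cryptographic key to ensure that encrypting the same plaintext multiple times produces different ciphertexts. Must be unique per encryption operation under the same key; some modes, such as CBC, also require it to be unpredictable, which in practice means randomly generated.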

**JWT (JSON Web Token):** A compact, URL-safe means of representing claims between parties. Consists of a header, payload, and signature. The signature must be verified using the appropriate algorithm and key before trusting the payload.

**KDF (Key Derivation Function):** A cryptographic function that derives one or more secret keys from a master key or password. Password-based KDFs are designed to be intentionally slow, and often memory-hard, to resist brute-force attacks. Examples: PBKDF2, bcrypt, scrypt, Argon2.
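
A minimal sketch of password hashing with a slow KDF, using PBKDF2 only because it ships in Python's standard library; Argon2id via a dedicated library remains the stronger choice. The iteration count is an assumption and should be set from current guidance and your own hardware benchmarks.

```python
import hashlib
import hmac
import os
from typing import Optional

ITERATIONS = 600_000  # Assumption: tune to current guidance and your hardware.


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Derive a 32-byte hash from a password; generates a fresh salt if none is given."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


salt, stored = hash_password("correct horse battery staple")
```

Store the salt and iteration count next to the hash so the count can be raised later without invalidating existing records.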

**Mass assignment:** A vulnerability pattern where an application automatically assigns values from user-supplied input (e.g., a request body) to an object's attributes without filtering, allowing attackers to set attributes they should not be able to modify.
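
The usual defense is an explicit allowlist of writable fields. A sketch with hypothetical field names:

```python
# Hypothetical allowlist: the only attributes a user may change on their own profile.
ALLOWED_PROFILE_FIELDS = ("display_name", "email")


def apply_profile_update(user: dict, payload: dict) -> dict:
    """Copy only allowlisted fields from the request payload onto the user record."""
    updated = dict(user)
    for field in ALLOWED_PROFILE_FIELDS:
        if field in payload:
            updated[field] = payload[field]
    return updated


user = {"id": 7, "display_name": "Ada", "email": "ada@example.com", "is_admin": False}
hostile_payload = {"display_name": "Ada L.", "is_admin": True}

updated = apply_profile_update(user, hostile_payload)
```

The vulnerable pattern is the equivalent of `user.update(payload)`: it is shorter, it works in every test that uses honest input, and it lets the request set `is_admin`.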

**Parameterized query:** A database query that separates the query structure from the data values, preventing data from being interpreted as SQL syntax. The primary defense against SQL injection.
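
The difference is easiest to see side by side. This sketch uses Python's built-in `sqlite3` with an in-memory database; the table and input are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

hostile = "alice' OR '1'='1"

# Vulnerable: the input is spliced into the SQL text and changes the query's structure.
unsafe_rows = conn.execute(
    f"SELECT name FROM users WHERE name = '{hostile}'"
).fetchall()

# Safe: the placeholder keeps the input as a data value, never as SQL syntax.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (hostile,)
).fetchall()
```

The string-built query matches every row in the table. The parameterized query matches none, because no user is literally named `alice' OR '1'='1`.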

**Path traversal:** A vulnerability where user-supplied input is used to construct a file path, allowing an attacker to navigate outside the intended directory using `../` sequences.
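
A common defense is to resolve the final path and confirm it still sits under the intended directory. A sketch, assuming Python 3.9 or later for `Path.is_relative_to`; the base directory is hypothetical.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_path: str) -> Path:
    """Resolve user_path under base_dir, rejecting anything that escapes it."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the base directory")
    return candidate


UPLOADS = Path("/srv/uploads")  # Hypothetical upload root.
```

Checking the raw string for `../` is not enough: absolute paths, encoded separators, and symlinks all bypass a string filter, which is why the check runs on the resolved path.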

**RRF (Reciprocal Rank Fusion):** An algorithm for combining multiple ranked lists into a single unified ranking, used in hybrid search. Each item's combined score is the sum of `1/(k + rank_i)` across all lists where the item appears.
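
A compact sketch of the formula. The default `k = 60` is a commonly used constant rather than a requirement, and the file names are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: each item scores the sum of 1 / (k + rank) over its appearances."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):  # ranks are 1-based
            scores[item] += 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Illustrative result lists from a semantic search and a keyword search.
semantic = ["auth.py", "session.py", "views.py"]
keyword = ["session.py", "utils.py", "auth.py"]

fused = reciprocal_rank_fusion([semantic, keyword])
```

`session.py` ranks first because it places well in both lists, ahead of `auth.py`, which tops one list but sits third in the other. Items found by only one retriever still appear, just lower down.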

**SAST (Static Application Security Testing):** Security analysis performed on source code without executing it. Identifies patterns that correspond to known vulnerability classes.

**Semantic search:** Search that retrieves results based on meaning and conceptual similarity rather than exact keyword matching, enabled by embedding-based vector similarity.

**Session fixation:** A session management vulnerability where an attacker sets a user's session ID to a known value before authentication, allowing the attacker to use that session ID to impersonate the authenticated user.

**Supply chain attack:** An attack that compromises a target by inserting malicious code into a software component the target depends on — a library, build tool, or CI/CD pipeline.

**Typosquatting:** A social engineering attack where malicious packages are published under names similar to popular legitimate packages, targeting developers who mistype package names during installation.

**XSS (Cross-Site Scripting):** An injection vulnerability where an attacker injects malicious scripts into content delivered to other users' browsers. Reflected XSS echoes a payload from the request straight back in the HTTP response; stored XSS persists the payload in the application's data and serves it to later visitors.
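
The core defense is escaping untrusted data for the context it is rendered in. A minimal sketch for the HTML body context using Python's standard `html` module; the `render_comment` helper is hypothetical, and a template engine with auto-escaping does the same job in practice.

```python
import html


def render_comment(comment: str) -> str:
    """Wrap user-supplied text in a paragraph, escaping HTML metacharacters first."""
    return f"<p>{html.escape(comment)}</p>"


rendered = render_comment("<script>alert(1)</script>")
```

The browser displays the escaped payload as text instead of executing it. Escaping for HTML body content does not make a value safe inside a script block, a URL, or an event-handler attribute; each context needs its own encoding.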

**XXE (XML External Entity):** An XML injection vulnerability where a maliciously crafted XML document causes the parser to retrieve and process external resources or disclose internal files.

---

## Appendix B: Tools & Resources

### Semantic and Hybrid Search

**Pyckle Code-MCP**
Hybrid semantic + BM25 search over indexed codebases. Supports natural language queries, graph-based dependency traversal, and session-aware context tracking. ChromaDB-backed vector storage. Suitable for per-file and codebase-wide security queries.

**Sourcegraph**
Code intelligence platform with structural search and some semantic capabilities. Particularly useful for large multi-repository environments. Commercial product with an open-source community edition.

**GitHub Copilot (Code Analysis Mode)**
AI-assisted code understanding useful for interpreting specific code segments during audit review. Less suited for systematic codebase-wide coverage searches; optimized for completion assistance.

### Static Analysis and SAST

**Semgrep**
Open-source static analysis framework supporting custom rule patterns. Extensive rule registry for security patterns across Python, JavaScript, Go, Java, Ruby, and other languages. Integrates with CI/CD via GitHub Actions and GitLab CI. Free for open-source; commercial offering for advanced features.

**Bandit**
Python-specific security linter. Checks for common Python security issues: shell injection, weak cryptography, hardcoded passwords, unsafe YAML loading. Fast enough for pre-commit hooks.

**ESLint with security plugins**
JavaScript/TypeScript. `eslint-plugin-security` adds security rules for common JavaScript patterns. `eslint-plugin-no-unsanitized` flags unsafe DOM manipulation.

**CodeQL**
GitHub's semantic code analysis engine. Queries written in a purpose-built language (QL) that can reason about data flow across function boundaries. Deep analysis but slower than linters; suitable for CI runs on pull requests.

**SonarQube / SonarCloud**
Comprehensive code quality and security analysis platform. Security Hotspots feature identifies code requiring security review. Community edition is free; commercial editions add advanced security features.

### Dependency Scanning

**pip-audit**
Python dependency vulnerability scanner. Uses the Python Packaging Advisory Database and the Open Source Vulnerabilities (OSV) database. Fast, accurate, integrates cleanly into CI.

**npm audit**
Node.js built-in dependency auditing. Checks installed packages against npm's advisory database. `npm audit fix` attempts automatic remediation of fixable findings.

**Trivy**
Multi-language, multi-target vulnerability scanner from Aqua Security. Scans dependency files, container images, and infrastructure as code. Pulls from NVD, GitHub Security Advisories, and other databases. Open source.

**Snyk**
Commercial dependency scanning and developer security platform. Broader coverage than free tools, developer workflow integrations, and automated fix pull requests. Free tier for open-source projects; commercial for private repositories.

**OWASP Dependency-Check**
Open-source tool supporting Java, .NET, Python, Ruby, and other languages. Uses the NVD database. Slower than Trivy or pip-audit but well-established.

**OSV-Scanner**
Google's open-source vulnerability scanner. Uses the Open Source Vulnerabilities database, which aggregates CVEs and security advisories from multiple sources.

### Secret Detection

**TruffleHog**
Open-source secret detection tool. Searches git history for secrets using high-entropy detection and pattern matching. Verifies detected secrets against the relevant APIs to confirm validity, reducing false positives.

**Gitleaks**
Open-source secret detection tool focused on git repositories. Fast, configurable, supports custom regex rules. Integrates with pre-commit and CI/CD.

**git-secrets**
AWS Labs tool that prevents committing secrets to git repositories. Configures git hooks to scan commits for secret patterns before they are recorded.

**Detect-secrets (Yelp)**
Open-source secrets detection with a baseline approach — establishes a known baseline of non-secrets and alerts on new detections.

### Password and Cryptography

**passlib (Python)**
Comprehensive password hashing library with support for bcrypt, scrypt, Argon2, and other modern algorithms. Recommended for Python applications.

**argon2-cffi (Python)**
Python bindings for Argon2, the winner of the Password Hashing Competition. Argon2id is the current recommended password hashing algorithm.

**cryptography (Python)**
Comprehensive cryptographic library for Python. Provides high-level recipes (Fernet symmetric encryption, X.509 certificate handling) and low-level primitives (AES, RSA, ECDSA). Prefer the high-level recipes where they cover your use case.

**bcrypt (Node.js)**
Well-established bcrypt implementation for Node.js. Use for password hashing in Node.js applications.

### OWASP Resources

**OWASP Top 10**
The definitive list of the ten most critical web application security risks. Updated periodically. Useful as a checklist category reference, not as a complete audit methodology.

**OWASP ASVS (Application Security Verification Standard)**
Detailed requirements framework for application security. Organized by security control area. More comprehensive than the Top 10 and suitable for security requirements definition.

**OWASP Web Security Testing Guide (WSTG)**
Comprehensive guide to web application security testing. Covers manual testing techniques, automation, and specific vulnerability classes in depth.

**OWASP Cheat Sheet Series**
Concise security guidance on specific topics: SQL injection prevention, authentication, session management, cryptographic storage, and many others. Practical and implementation-focused.

---

## Appendix C: Further Reading

### Books

**"The Web Application Hacker's Handbook" (Stuttard, Pinto)**
Comprehensive coverage of web application attack techniques. Essential for understanding how the vulnerabilities in this guide are actually exploited — which improves the quality of both finding and fixing them.

**"Hacking: The Art of Exploitation" (Erickson)**
Deeper into the low-level mechanics of exploitation. Most relevant for systems programming security rather than web application security, but builds foundational understanding of how computer systems fail.

**"Cryptography Engineering" (Ferguson, Schneier, Kohno)**
The practical cryptography reference for engineers who need to use cryptographic primitives correctly, not just implement them. Clear, actionable, and does not require a mathematics background.

**"Security Engineering" (Anderson)**
Ross Anderson's comprehensive treatment of security as a systems and engineering discipline. Covers threat modeling, cryptography, economics of security, and organizational security. Updated editions cover modern concerns.

**"Alice and Bob Learn Application Security" (Janca)**
More accessible than most security texts. Good starting point for developers building security awareness without a dedicated security background.

### Papers and Research

**"SoK: Eternal War in Memory" (Szekeres et al.)**
Survey paper on memory corruption vulnerabilities and defenses. More systems-focused than web application security but foundational for understanding how classes of vulnerabilities persist despite mitigation efforts.

**"The Seven Properties of Highly Secure Devices" (Microsoft)**
Framework for thinking about security as a designed-in property rather than a retrofitted one. Applicable beyond device security to software system design.

**NIST SP 800-53**
NIST's security and privacy controls for information systems and organizations. Comprehensive and authoritative. More useful as a policy and requirements reference than as a day-to-day engineering guide.

### Online Resources

**CVE database (cve.org, formerly cve.mitre.org)**
Authoritative source for CVE descriptions and references. Always check the CVE entry directly when assessing severity and exploitability of a known vulnerability.

**NVD (nvd.nist.gov)**
National Vulnerability Database. Enriches CVE data with CVSS scores, severity ratings, and links to patches and advisories.

**OSV (osv.dev)**
Open Source Vulnerabilities database. Aggregates security advisories from multiple ecosystems in a machine-readable format. Particularly useful for querying vulnerability data programmatically.

**Snyk Vulnerability DB (snyk.io/vuln)**
Searchable database of known vulnerabilities in open-source packages. Often includes more detail on exploitability and remediation than NVD.

**PortSwigger Web Security Academy**
Free, hands-on training in web application security vulnerabilities. Includes labs that allow direct exploitation of real vulnerability instances in a safe environment. Highly effective for building deep intuition about how injection attacks, authentication bypasses, and other vulnerabilities work in practice.

**OWASP Web Security Testing Guide (owasp.org/www-project-web-security-testing-guide)**
Comprehensive, continuously updated testing guide. Covers manual testing methodology and automation for all major web application vulnerability classes.

**GitHub Security Advisories (github.com/advisories)**
Security advisories for open-source packages across ecosystems such as npm, PyPI, Maven, and Go, curated by GitHub. Entries sometimes carry detailed technical descriptions before other databases do.

**Google Project Zero blog (googleprojectzero.blogspot.com)**
Deep technical analysis of discovered vulnerabilities from Google's elite vulnerability research team. Advanced reading, but invaluable for understanding how sophisticated vulnerabilities are found and the techniques used to discover them.

---

*Security Auditing Your Codebase with AI — Version 1.0*

*David Kelly Price — April 2026*

*The tools change. The patterns do not.*
