# How to Migrate from OpenAI Embeddings to Pyckle for Code Search

If you're running a code search or RAG pipeline on OpenAI embeddings (`text-embedding-3-small` or `text-embedding-ada-002`), migrating to Pyckle requires three changes:

1. Update the API endpoint and key
2. Re-index your codebase (vectors are not compatible between models)
3. Update your similarity search setup (dimensions change from 1536 to 768)

That's it. The response format is compatible, so most existing code works without modification.

---

## Why Migrate

OpenAI's embedding models are excellent general-purpose models. For natural language retrieval — documentation, articles, prose — they're hard to beat.

For code-to-query retrieval, PyckLM outperforms them on L1 queries, where the query uses different vocabulary than the code it should match (the vocabulary translation problem), because it was trained specifically on code-to-query pairs with hard negatives. For example, a query like "where does session verification happen" returns better results against a codebase indexed with PyckLM than with `text-embedding-3-small`.

The tradeoff: PyckLM is tuned for code. If your use case is primarily documentation or prose retrieval, OpenAI's models may be a better fit. If it's code search, migrate.

---

## Before You Migrate

### Check Compatibility

The Pyckle API response format is compatible with the OpenAI Embeddings API: the `data[].embedding` field is in the same position, so you can swap the endpoint without changing the code that parses the response.

What changes:
- **Dimensions**: PyckLM returns 768-dimensional vectors. OpenAI `text-embedding-3-small` returns 1536 (or configured lower). `ada-002` returns 1536.
- **Stored vectors**: Incompatible. You must re-index.
- **Cost model**: Pyckle charges per API key tier, not per token. Cost is fixed monthly.
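
A dimension mismatch usually only surfaces as an error at upsert or query time, so it can help to check vector length right after embedding. The following is a minimal sketch; `check_dimensions` and `EXPECTED_DIM` are hypothetical names used for illustration, not part of any Pyckle SDK:

```python
EXPECTED_DIM = 768  # PyckLM vector size stated above


def check_dimensions(vectors: list[list[float]], expected: int = EXPECTED_DIM) -> None:
    """Raise ValueError if any vector does not have the expected length."""
    for i, vec in enumerate(vectors):
        if len(vec) != expected:
            raise ValueError(
                f"vector {i} has {len(vec)} dimensions, expected {expected}"
            )
```

Calling it on the output of your embedding function before the first upsert catches a leftover OpenAI code path early.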

### Estimate Re-Indexing Time

Calculate your codebase chunk count:

```bash
# Count Python function and method definitions, sync and async (rough chunk estimate)
find . -name "*.py" -exec grep -cE '^[[:space:]]*(async[[:space:]]+)?def ' {} \; 2>/dev/null | awk '{s+=$1} END {print s+0}'
```

At 60 requests/minute (Pro tier) with 100 chunks per request, you can index 6,000 chunks per minute. A 60,000-chunk codebase takes about 10 minutes.
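
The same arithmetic can be wrapped in a small helper so you can plug in your own chunk count. This is a sketch that uses the Pro-tier figures quoted above as defaults; `estimate_reindex_minutes` is a hypothetical name, and the estimate ignores retries and network latency:

```python
import math


def estimate_reindex_minutes(
    total_chunks: int,
    requests_per_minute: int = 60,  # Pro-tier limit quoted above
    chunks_per_request: int = 100,  # batch size quoted above
) -> float:
    """Estimate wall-clock minutes to embed `total_chunks`, ignoring retries."""
    requests_needed = math.ceil(total_chunks / chunks_per_request)
    return requests_needed / requests_per_minute


print(estimate_reindex_minutes(60_000))  # 10.0
```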

---

## Migration: Python

### Before (OpenAI)

```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")

def embed(texts: list[str]) -> list[list[float]]:
    response = client.embeddings.create(
        input=texts,
        model="text-embedding-3-small",
    )
    return [item.embedding for item in response.data]
```

### After (Pyckle)

```python
import httpx

def embed(texts: list[str]) -> list[list[float]]:
    response = httpx.post(
        "https://api.pyckle.co/v1/embeddings",
        headers={"Authorization": "Bearer pk_live_your_key"},
        json={"input": texts, "model": "pycklelm-1"},
        timeout=30,
    )
    response.raise_for_status()
    data = response.json()["data"]
    return [item["embedding"] for item in sorted(data, key=lambda x: x["index"])]
```

The function signature and return type match the OpenAI version, so callers of `embed` don't change. If your code calls `client.embeddings.create` directly in several places, either wrap the `httpx` call in a helper like this one or replace each call site.
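
During re-indexing, the embed function is typically called batch by batch while staying under the rate limit. The sketch below assumes the 100-chunk batch size and 60 requests/minute figures from earlier in this guide. `embed_in_batches` is a hypothetical helper, and `embed_fn` is any callable with the same signature as `embed` above:

```python
import time
from typing import Callable


def embed_in_batches(
    texts: list[str],
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
    requests_per_minute: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> list[list[float]]:
    """Embed texts batch by batch, pausing between requests to respect the rate limit."""
    min_interval = 60.0 / requests_per_minute
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        began = time.monotonic()
        vectors.extend(embed_fn(texts[start:start + batch_size]))
        # Pause only if more batches remain and the request finished early
        remaining = min_interval - (time.monotonic() - began)
        if start + batch_size < len(texts) and remaining > 0:
            sleep(remaining)
    return vectors
```

The `sleep` parameter is injectable so the pacing logic can be exercised in tests without actually waiting.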

### If You're Using LangChain

LangChain has a generic `Embeddings` base class. Implement it for Pyckle:

```python
from langchain_core.embeddings import Embeddings
import httpx

class PyckleEmbeddings(Embeddings):
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.client = httpx.Client(timeout=30)

    def embed_documents(self, texts: list[str]) -> list[list[float]]:
        all_embeddings = []
        for i in range(0, len(texts), 100):
            batch = texts[i:i+100]
            all_embeddings.extend(self._embed(batch))
        return all_embeddings

    def embed_query(self, text: str) -> list[float]:
        return self._embed([text])[0]

    def _embed(self, texts: list[str]) -> list[list[float]]:
        response = self.client.post(
            "https://api.pyckle.co/v1/embeddings",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"input": texts, "model": "pycklelm-1"},
        )
        response.raise_for_status()
        data = response.json()["data"]
        return [item["embedding"] for item in sorted(data, key=lambda x: x["index"])]

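# Usage: drop-in replacement (requires the langchain-chroma package;
# `docs` is your existing list of LangChain Document objects)
from langchain_chroma import Chroma

embeddings = PyckleEmbeddings(api_key="pk_live_...")
vectorstore = Chroma.from_documents(docs, embeddings)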
```

### If You're Using LlamaIndex

```python
from llama_index.core.embeddings import BaseEmbedding
import httpx

class PyckleEmbedding(BaseEmbedding):
    api_key: str

    def _get_text_embedding(self, text: str) -> list[float]:
        return self._get_text_embeddings([text])[0]

    def _get_text_embeddings(self, texts: list[str]) -> list[list[float]]:
        response = httpx.post(
            "https://api.pyckle.co/v1/embeddings",
            headers={"Authorization": f"Bearer {self.api_key}"},
            json={"input": texts, "model": "pycklelm-1"},
            timeout=30,
        )
        response.raise_for_status()
        data = response.json()["data"]
        return [item["embedding"] for item in sorted(data, key=lambda x: x["index"])]

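    # BaseEmbedding declares the query methods as abstract, so both must be
    # defined; queries go through the same endpoint as documents.
    def _get_query_embedding(self, query: str) -> list[float]:
        return self._get_text_embeddings([query])[0]

    async def _aget_query_embedding(self, query: str) -> list[float]:
        return (await self._aget_text_embeddings([query]))[0]

    async def _aget_text_embedding(self, text: str) -> list[float]:
        return (await self._aget_text_embeddings([text]))[0]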

    async def _aget_text_embeddings(self, texts: list[str]) -> list[list[float]]:
        async with httpx.AsyncClient() as client:
            response = await client.post(
                "https://api.pyckle.co/v1/embeddings",
                headers={"Authorization": f"Bearer {self.api_key}"},
                json={"input": texts, "model": "pycklelm-1"},
                timeout=30,
            )
            response.raise_for_status()
            data = response.json()["data"]
            return [item["embedding"] for item in sorted(data, key=lambda x: x["index"])]
```

---

## Migration: Vector Store Update

### ChromaDB

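When you change embedding models, you must recreate the collection because the dimensions differ. The snippet below replaces the collection in place; if you want a rollback path, create the new collection under a different name instead, as described in the Rollback Plan section: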

```python
import chromadb

client = chromadb.PersistentClient(path="./codebase_index")

# Delete old collection (was 1536-dim OpenAI)
try:
    client.delete_collection("codebase")
except Exception:
    pass  # collection may not exist yet

# Create new collection (768-dim Pyckle)
collection = client.create_collection(
    name="codebase",
    metadata={"hnsw:space": "cosine"}
)

# Re-index with new embeddings
# (see indexing code in the Getting Started guide)
```
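
For orientation, a re-index loop against the new collection might look like the sketch below. It is not the Getting Started guide's code: `reindex_chunks` and the chunk dictionary shape are assumptions made for illustration, and `embed_fn` stands for any function with the signature of `embed` shown earlier. The only ChromaDB call used is `collection.add`.

```python
from typing import Callable


def reindex_chunks(
    collection,  # a chromadb Collection, e.g. the one created above
    chunks: list[dict],  # assumed shape: {"id": str, "text": str, "metadata": dict}
    embed_fn: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
) -> int:
    """Embed chunks in batches and add them to the collection; returns the count added."""
    added = 0
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        texts = [c["text"] for c in batch]
        collection.add(
            ids=[c["id"] for c in batch],
            embeddings=embed_fn(texts),
            documents=texts,
            # Some ChromaDB versions reject empty metadata dicts,
            # so give every chunk at least one key
            metadatas=[c["metadata"] for c in batch],
        )
        added += len(batch)
    return added
```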

### Pinecone

Pinecone indexes are fixed-dimension. You need a new index:

```python
from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="...")

# Create new index for Pyckle (768 dimensions)
pc.create_index(
    name="codebase-pyckle",
    dimension=768,
    metric="cosine",
    spec=ServerlessSpec(cloud="aws", region="us-east-1")
)

new_index = pc.Index("codebase-pyckle")

# Migrate: re-embed all your chunks and upsert into new index
```

### Qdrant

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

client = QdrantClient("localhost", port=6333)

# Delete old collection
client.delete_collection("codebase")

# Create new collection (768-dim)
client.create_collection(
    collection_name="codebase",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

# Re-index
```

### Weaviate

```python
import weaviate
import weaviate.classes as wvc

client = weaviate.connect_to_local()

# Delete old collection
client.collections.delete("CodeChunk")

# Recreate (768-dim, bring your own vectors)
client.collections.create(
    name="CodeChunk",
    vectorizer_config=wvc.config.Configure.Vectorizer.none(),
    properties=[
        wvc.config.Property(name="content", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="filepath", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="functionName", data_type=wvc.config.DataType.TEXT),
        wvc.config.Property(name="lineStart", data_type=wvc.config.DataType.INT),
    ]
)
client.close()
```

---

## Validate the Migration

After re-indexing, run a set of test queries and compare results against your OpenAI baseline:

```python
test_queries = [
    # L0 — exact name queries (both models should perform similarly)
    "validate_jwt_token",
    "create_user function",

    # L1 — vocabulary translation (Pyckle should outperform)
    "where does session verification happen",
    "how are new accounts created",
    "where is rate limiting enforced",

    # L2 — behavioral (Pyckle should outperform)
    "what runs when a login attempt fails",
    "how does the system recover from a failed webhook",
]

def evaluate_query(query: str, collection) -> list[str]:
    """Return top-5 function names for a query."""
    results = search(query, collection, top_k=5)
    return [r["metadata"].get("name", "unknown") for r in results]

for query in test_queries:
    results = evaluate_query(query, collection)
    print(f"Query: {query}")
    print(f"  Top results: {results}")
    print()
```

For L1 and L2 queries, Pyckle's results should be more semantically relevant — you should see the actual implementation functions in the top results rather than documentation mentions or unrelated utilities.
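
Printed results are hard to compare by eye across two indexes. One option, sketched here on the assumption that you hand-label one or more expected function names per query, is to compute a hit rate at k for each index and compare the two numbers. `hit_rate_at_k` and its input shapes are hypothetical, not part of Pyckle or any vector store:

```python
def hit_rate_at_k(
    results: dict[str, list[str]],   # query -> ranked function names from one index
    expected: dict[str, set[str]],   # query -> hand-labeled correct function names
    k: int = 5,
) -> float:
    """Fraction of queries whose top-k results contain at least one expected name."""
    hits = sum(
        1
        for query, names in results.items()
        if expected.get(query, set()) & set(names[:k])
    )
    return hits / len(results) if results else 0.0
```

Run `evaluate_query` against both the old and the new collection to build one `results` dictionary per index, then compare the two scores on the same labeled queries.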

---

## Rollback Plan

Keep your old index around for 24-48 hours after migrating. ChromaDB has no rename operation, so create the new collection under a different name during migration, then swap your code to point at it:

```python
# During migration: keep old collection, create new one under a different name
new_collection = client.create_collection(
    name="codebase_pyckle",
    metadata={"hnsw:space": "cosine"}
)
# Re-index into codebase_pyckle, validate, then update your code to use it.

# After validation, delete the backup
# client.delete_collection("codebase")  # old OpenAI-indexed collection
```

If something goes wrong, revert your code to point at the old `codebase` collection and revert the API key change.
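
Reverting is quicker if the choice of backend lives in configuration rather than in code. The sketch below assumes an `EMBEDDING_BACKEND` environment variable and a `BACKENDS` table; both names are hypothetical, and only the collection names and dimensions come from this guide:

```python
import os

# Hypothetical settings table; adjust to your own configuration scheme
BACKENDS = {
    "openai": {"collection": "codebase", "dimension": 1536},
    "pyckle": {"collection": "codebase_pyckle", "dimension": 768},
}


def active_backend(env=os.environ) -> dict:
    """Pick the embedding backend from EMBEDDING_BACKEND, defaulting to Pyckle."""
    name = env.get("EMBEDDING_BACKEND", "pyckle")
    if name not in BACKENDS:
        raise ValueError(f"unknown EMBEDDING_BACKEND: {name!r}")
    return {"name": name, **BACKENDS[name]}
```

With this in place, rolling back means setting `EMBEDDING_BACKEND=openai` and restarting, with no code change.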

---

## Summary

| Step | Action |
|------|--------|
| 1 | Get a Pyckle API key at pyckle.co/products |
| 2 | Update embedding call to use `https://api.pyckle.co/v1/embeddings` |
| 3 | Recreate vector store collection with `dimension=768` |
| 4 | Re-index codebase with Pyckle embeddings |
| 5 | Run validation queries to confirm quality improvement |
| 6 | Delete old OpenAI index after 24-48 hours |

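Total time for a mid-size codebase (~50K chunks): approximately 30 minutes end to end. Re-indexing itself takes roughly 8-9 minutes of that at the Pro-tier rate of 6,000 chunks per minute.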

---

*The Pyckle Embeddings API is live. Get started at pyckle.co/products.*
