The Hard Part of Chunking Isn't Chunking

Tim Marois / Jun 2, 2026

This is how you give an AI agent effectively unlimited memory: a large knowledge base it searches instead of cramming everything into the prompt. The context window stays small while the knowledge behind it keeps growing. Two things have to be right for that to work, and most setups get one of them wrong.

Don't embed the whole document. One vector for a long document averages everything and matches nothing in particular, so search comes back vague. Atlas splits each document into small, overlapping chunks, so a query lands on the exact paragraph that answers it.

Don't re-embed the whole document. Edit one line and the naive approach re-encodes all of it, paying the full cost over and over. Atlas hashes the content and re-embeds only the chunks that actually changed.

Chunking, embedding, searching by meaning: none of that is new. The hard part was never the chunking. It's doing this consistently as content changes, across the different kinds of documents every project stores. That's what I built Atlas to handle.

Keeping It in Sync

Atlas hashes the content of each document. If nothing changed, nothing happens. If something did, it re-chunks, compares against what's stored, and embeds only the chunks that are new or different. Edit one paragraph in a 20-chunk document and a chunk or two re-embed, not all twenty.
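To make the compare step concrete, here is a minimal sketch in plain PHP. It is not Atlas's actual code: the `planReembedding` function, its return shape, and the use of SHA-256 are all assumptions made for illustration. It only shows the idea of hashing each chunk and embedding just the ones whose hash isn't already stored.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of Atlas): decide which chunks need embedding.
 *
 * @param list<string> $storedHashes SHA-256 hashes of the chunks already embedded
 * @param list<string> $newChunks    chunk texts produced by re-chunking the edited document
 * @return array{embed: list<string>, keep: list<string>, delete: list<string>}
 */
function planReembedding(array $storedHashes, array $newChunks): array
{
    $known = array_flip($storedHashes); // hash => index, for constant-time lookups
    $embed = [];
    $keep = [];

    foreach ($newChunks as $chunk) {
        $hash = hash('sha256', $chunk);
        if (isset($known[$hash])) {
            $keep[] = $hash;   // unchanged: reuse the stored vector
        } else {
            $embed[] = $chunk; // new or edited: needs a fresh embedding
        }
    }

    // Stored chunks that no longer appear in the document can be removed.
    $delete = array_values(array_diff($storedHashes, $keep));

    return ['embed' => $embed, 'keep' => $keep, 'delete' => $delete];
}

// Illustrative data: one chunk out of three is edited.
$old = ['Intro paragraph.', 'Sick leave: ten days.', 'Closing notes.'];
$new = ['Intro paragraph.', 'Sick leave: twelve days.', 'Closing notes.'];

$storedHashes = array_map(fn (string $c): string => hash('sha256', $c), $old);
$plan = planReembedding($storedHashes, $new);
// $plan['embed'] holds only the edited chunk; the other two are kept as-is.
```

Only the entries in `embed` would cost an embedding call. A real pipeline would also need to track chunk order and metadata, which this sketch leaves out.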

There's a debounce on top, so a burst of edits collapses into a single pass against the final text, and the work runs on a queue so saves stay instant. It's tedious to build and easy to get wrong, which is exactly why it belongs in a shared package instead of every app's code.
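The debounce is easiest to see as a simulation. The sketch below is an assumption about the general technique, not how Atlas schedules its queue jobs: `debounceEdits` and the timestamps are hypothetical. An edit triggers a pass only if no other edit follows it within the quiet window, so a burst collapses to its final text.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical simulation of a debounce (not Atlas's scheduler).
 *
 * @param list<array{at: float, text: string}> $edits saves, sorted by time in seconds
 * @return list<string> the texts that would actually be processed
 */
function debounceEdits(array $edits, float $windowSeconds): array
{
    $passes = [];
    $count = count($edits);

    foreach ($edits as $i => $edit) {
        $isLast = $i === $count - 1;
        // An edit is processed only if the quiet window elapses before the next save.
        if ($isLast || $edits[$i + 1]['at'] - $edit['at'] >= $windowSeconds) {
            $passes[] = $edit['text'];
        }
    }

    return $passes;
}

// Three quick saves, then one more ten seconds in.
$passes = debounceEdits([
    ['at' => 0.0, 'text' => 'a'],
    ['at' => 0.4, 'text' => 'ab'],
    ['at' => 0.9, 'text' => 'abc'],
    ['at' => 10.0, 'text' => 'abcd'],
], 2.0);
// The burst of three saves yields one pass on 'abc'; the later save gets its own pass.
```

In a live system the same effect usually comes from delaying the queued job and skipping it if a newer save has arrived, but the outcome is the one shown here.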

The Setup Is Small

A trait, an interface, and the column that holds your content:

php

use Atlasphp\Atlas\Embeddings\Chunkable;
use Atlasphp\Atlas\Persistence\Concerns\HasChunkedEmbeddings;
use Illuminate\Database\Eloquent\Model;

class Document extends Model implements Chunkable
{
    use HasChunkedEmbeddings;

    protected string $chunkableField = 'body';
}

Then register the model with Atlas:

php

Atlas::registerChunkable(\App\Models\Document::class);

The default MarkdownChunker splits on headings, keeps code blocks and tables intact, records the heading path each chunk sits under, and caps each chunk near 512 tokens with a little overlap. It covers most documents out of the box. When a model holds something else, like call transcripts or source code, you point it at your own chunker:

php

class Transcript extends Model implements Chunkable
{
    use HasChunkedEmbeddings;

    protected string $chunkableField = 'body';
    protected ?string $chunker = TranscriptChunker::class;
}

Your chunker implements the Chunker interface, or extends BaseTokenAwareChunker to inherit the token math, so you decide how each document type gets split while the rest of the pipeline stays exactly the same. Full setup, including the migration columns, is in the chunked embeddings docs.
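For a feel of what heading-aware splitting involves, here is a rough standalone sketch. It is not the MarkdownChunker and does not implement Atlas's Chunker interface, whose signature isn't shown here. The `chunkByHeadings` function is hypothetical, it needs PHP 8 for `str_starts_with`, and it skips the token cap and overlap entirely. It only shows how a heading path can be tracked while leaving code fences intact.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: split Markdown on headings and record each chunk's heading path.
 *
 * @return list<array{headingPath: string, content: string}>
 */
function chunkByHeadings(string $markdown): array
{
    $chunks = [];
    $path = [];    // heading level => heading text at the current position
    $buffer = [];  // lines collected for the chunk being built
    $inFence = false;
    $fence = str_repeat('`', 3); // code-fence marker

    $flush = function () use (&$chunks, &$path, &$buffer): void {
        $content = trim(implode("\n", $buffer));
        if ($content !== '') {
            $chunks[] = ['headingPath' => implode(' > ', $path), 'content' => $content];
        }
        $buffer = [];
    };

    foreach (explode("\n", $markdown) as $line) {
        if (str_starts_with(ltrim($line), $fence)) {
            $inFence = !$inFence; // a "#" inside a code block is not a heading
        }
        if (!$inFence && preg_match('/^(#{1,6})\s+(.*)$/', $line, $m) === 1) {
            $flush();
            $level = strlen($m[1]);
            // Drop headings at this level or deeper, then record the new one.
            $path = array_filter($path, fn (int $l): bool => $l < $level, ARRAY_FILTER_USE_KEY);
            $path[$level] = trim($m[2]);
            continue;
        }
        $buffer[] = $line;
    }
    $flush();

    return $chunks;
}

$doc = <<<MD
# Handbook
Welcome.
## Leave
### Sick Leave
Full-time employees accrue one sick day per month.
### Vacation
Request vacation two weeks ahead.
MD;

$chunks = chunkByHeadings($doc);
// Each chunk carries a path such as "Handbook > Leave > Sick Leave".
```

A custom chunker for transcripts or source code would follow the same shape, swapping the heading rule for speaker turns or function boundaries.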

Searching the Knowledge Base

Point a support agent at a company handbook. An employee asks how many sick days they get. One call returns the answer, no matter how the model stores its vectors:

php

$results = Atlas::similaritySearch(Document::class, 'how many sick days do I get', [
    'limit' => 5,
]);

foreach ($results as $result) {
    echo $result->content;       // the matched chunk
    echo $result->headingPath;   // "Handbook > Leave > Sick Leave"
    echo $result->record->title; // the parent document
}

The call embeds the query, finds the closest chunks, and returns the parent record plus the section that matched. Here's what comes back, one chunk out of a forty-page handbook:

similarity   0.86
headingPath  "Handbook > Leave > Sick Leave"
content      "Sick Leave. Full-time employees accrue one sick day for every
              month worked, up to a maximum of twelve days per year. Unused
              sick days do not roll over into the next year."

That's the whole point. The agent gets the few hundred words that answer the question instead of skimming forty pages. And because the chunker tracks the heading hierarchy, each chunk carries its full path down to the subheading, so the agent knows exactly where in the document it's reading and cites the real section instead of guessing.
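The "finds the closest chunks" step can be sketched in a few lines. This assumes cosine similarity, a common choice for embedding search. Which metric Atlas or your vector store actually uses is not stated here, and in practice the database does this work. The `cosineSimilarity` and `topChunks` functions and the toy three-dimensional vectors are hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Cosine similarity between two equal-length vectors.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineSimilarity(array $a, array $b): float
{
    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;

    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA === 0.0 || $normB === 0.0) {
        return 0.0; // a zero vector has no direction to compare
    }

    return $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Hypothetical in-memory search: score every chunk and keep the best $limit.
 *
 * @param list<float> $query
 * @param list<array{content: string, vector: list<float>}> $chunks
 * @return list<array{similarity: float, content: string}>
 */
function topChunks(array $query, array $chunks, int $limit): array
{
    $scored = [];
    foreach ($chunks as $chunk) {
        $scored[] = [
            'similarity' => cosineSimilarity($query, $chunk['vector']),
            'content' => $chunk['content'],
        ];
    }

    // Highest similarity first.
    usort($scored, fn (array $x, array $y): int => $y['similarity'] <=> $x['similarity']);

    return array_slice($scored, 0, $limit);
}

// Toy vectors standing in for real embeddings.
$top = topChunks([1.0, 0.0, 0.0], [
    ['content' => 'Parking rules', 'vector' => [0.0, 1.0, 0.0]],
    ['content' => 'Vacation policy', 'vector' => [0.6, 0.0, 0.8]],
    ['content' => 'Sick leave policy', 'vector' => [0.9, 0.1, 0.0]],
], 2);
```

Because each scored item is a chunk rather than a whole document, the top results are already paragraph-sized.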

The edges of each chunk overlap its neighbors by about fifty tokens, so a passage never starts or ends mid-sentence. You can see it where one chunk hands off to the next:

chunk 5 ends:   ...up to twelve days per year. Unused sick days do not roll over.
chunk 6 starts: Unused sick days do not roll over. To take sick leave, notify your manager...

That shared sentence is the overlap. Without it, a chunk could open halfway through a thought and the agent would answer from half a rule.
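Here is one way that hand-off could be produced, as a sketch under simplifying assumptions. It overlaps by whole sentences and budgets by word count, whereas the chunker described above works in tokens with roughly fifty tokens of overlap. The `chunkWithOverlap` function is hypothetical and needs PHP 8.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch: pack sentences into chunks, repeating the tail of each
 * chunk at the start of the next one.
 *
 * @param list<string> $sentences
 * @return list<string>
 */
function chunkWithOverlap(array $sentences, int $maxWords, int $overlapSentences = 1): array
{
    $chunks = [];
    $current = [];
    $words = 0;

    foreach ($sentences as $sentence) {
        $count = str_word_count($sentence);

        if ($current !== [] && $words + $count > $maxWords) {
            $chunks[] = implode(' ', $current);
            // Seed the next chunk with the last sentence(s) of this one: the overlap.
            $current = $overlapSentences > 0 ? array_slice($current, -$overlapSentences) : [];
            $words = array_sum(array_map(fn (string $s): int => str_word_count($s), $current));
        }

        $current[] = $sentence;
        $words += $count;
    }

    if ($current !== []) {
        $chunks[] = implode(' ', $current);
    }

    return $chunks;
}

$overlapped = chunkWithOverlap([
    'Full-time employees accrue one sick day per month.',
    'The maximum is twelve days per year.',
    'Unused sick days do not roll over.',
    'To take sick leave, notify your manager.',
    "A doctor's note is required after three days.",
], 25);
// The first chunk ends with the roll-over sentence and the second chunk starts with it.
```

Overlapping on sentence boundaries is what keeps a chunk from opening halfway through a rule.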

The same call works on every model because Atlas stores embeddings two ways, one vector per row for short items and many chunks per row for long ones, and routes to the right one for you. Your code doesn't branch. The similarity search docs cover scoping, like limiting a query to one user's own records.

Best of all, you can hand it to an agent as a tool without writing any tool code:

php

use Atlasphp\Atlas\Agent;
use Atlasphp\Atlas\Tools\SimilaritySearch;

class SupportAgent extends Agent
{
    public function tools(): array
    {
        return [
            SimilaritySearch::usingModel(Document::class, limit: 5)
                ->withName('search_knowledge')
                ->withDescription('Search the knowledge base for relevant answers.'),
        ];
    }
}

The agent calls it, gets the closest chunks, and answers from your docs, with no idea whether the model is chunked or whole-record. And because it searches by meaning, a question about dinner can surface a peanut-allergy note that never says "dinner." Grep can't do that, though grep is still better when you know the exact words.

Compared to the Laravel AI SDK

People ask how this compares to the first-party Laravel AI SDK, so here's the honest version as of June 2026, since both move fast.

The AI SDK handles the basics well, with broad provider support, Laravel's native vector queries, and a SimilaritySearch tool for agents. For small or rarely changing data it's perfectly fine. The difference is that it stores one vector per row and does no chunking, which gets expensive on large documents that change.

You pay for that twice. Edit a line and you re-embed the entire document, because nothing tracks what changed, whereas Atlas re-embeds only the chunks that changed. And one vector per document means a search hands the agent a big blob of text to pay for on every question, whereas chunking hands it just the relevant paragraph.

So it isn't that the SDK can't do retrieval. It gives you the raw pieces and leaves the chunking pipeline for you to build and pay for, while Atlas builds it so you're not paying to redo work that never changed. If you're reading this much later, check the current AI SDK docs before trusting where the line sits.

Why I Built It

I didn't set out to reinvent retrieval. It's a well-worn road, and plenty of people walk it. I built Atlas because I kept needing the tedious, correctness-sensitive parts handled well in Laravel, so I could build on top of them instead of rebuilding them in every project.

That is what this feature does. It keeps the chunking, the re-embedding, and the search in sync for you, so the boring parts stay out of the way and your time goes to the agent and the product. It's the same machinery running under Rundesk, my latest system, which I've been building for months.

The Laravel AI SDK is a solid first-party package, and it will keep improving. But on the parts that matter most for large, changing knowledge bases, the chunking and the incremental re-embedding, it's still behind. Atlas solves the problems I have today, and I can't wait for the SDK to catch up.

If you're building retrieval into a Laravel app and hit the same rough edges, I'd like to hear how you're handling them. Reach out on GitHub or X.
