Batching, Reasoning, and Why Atlas Keeps Pulling Ahead

Tim Marois / Jun 12, 2026

With every release of Atlas, I ask the same question: what am I rebuilding by hand in my own projects that should just live in the SDK? v3.6 is three answers to that question: batch processing, reasoning, and token counting.

The headline is batching, and it's the one I'm most excited about, because it saves real money. But the bigger story is the one I keep coming back to. Atlas isn't trying to keep up with the alternatives anymore. It's covering ground most of them haven't touched yet.

Batching: The Same Work for Half the Cost

If you've ever needed to run the same kind of request thousands of times (captioning a media library, classifying a support backlog, embedding a corpus, summarizing every record in a table overnight), you've felt this pain. You loop through your records, fire one synchronous API call per record, and watch the bill climb while your queue crawls through rate limits.

Every major provider has a batch API that solves exactly this. You hand them a big set of independent requests, they process them within a completion window (up to 24 hours, usually much faster), and they charge you roughly half of what the same calls would cost in real time. The catch is that nobody in the Laravel ecosystem wrapped it in a way that felt native. So you either wrote the JSONL upload, polling, and result-stitching yourself, or you paid full price and pretended the batch API didn't exist.
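
For a sense of what that hand-written plumbing involves, here is a minimal sketch of only the first step, building the JSONL file. It assumes the request-line shape of OpenAI's Batch API (custom_id, method, url, body). The function name buildBatchJsonl and the sample prompts are hypothetical, and the upload, polling, and result-stitching steps are left out entirely.

```php
<?php

// Hand-rolled sketch of the "JSONL upload" step, assuming the request-line
// shape of OpenAI's Batch API. The helper name and sample data are hypothetical.

/**
 * @param array<int|string, string> $promptsById record id => prompt text
 */
function buildBatchJsonl(array $promptsById, string $model): string
{
    $lines = [];

    foreach ($promptsById as $id => $prompt) {
        $lines[] = json_encode([
            'custom_id' => (string) $id, // echoed back on the matching result line
            'method'    => 'POST',
            'url'       => '/v1/chat/completions',
            'body'      => [
                'model'    => $model,
                'messages' => [['role' => 'user', 'content' => $prompt]],
            ],
        ], JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES);
    }

    // JSONL is simply one JSON object per line.
    return implode("\n", $lines) . "\n";
}

$jsonl = buildBatchJsonl([
    101 => 'Summarize ticket 101.',
    102 => 'Summarize ticket 102.',
], 'gpt-5');
```

Even this small piece has details that are easy to get wrong, such as keeping each request on exactly one line and choosing a custom_id you can map back later.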

Atlas v3.6 makes it one fluent builder:

php

use Atlasphp\Atlas\Atlas;
use Atlasphp\Atlas\Input\Image;

$batch = Atlas::batch('openai');

foreach ($images as $image) {
    $batch->add(
        Atlas::text()->model('gpt-5')->message('Caption this image.', [Image::fromUrl($image->url)]),
        key: (string) $image->id, // echoed back on the result, so it maps to your record
    );
}

$response = $batch->submit();
$response->batchId; // "batch_abc123"

That's the stateless path. No database, no migrations, no setup. You get a batch id back immediately and poll the provider yourself whenever you want:

php

$results = Atlas::provider('openai')->batchResults($response->batchId);

foreach ($results as $result) {
    if ($result->succeeded()) {
        Image::find($result->customId)?->update(['caption' => $result->response->text]);
    }
}

Notice the key on each line. It's echoed back on the matching result, so you use your own model's primary key and the answers map straight back to your records. No fragile ordering assumptions, no manual index bookkeeping.
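
To make the ordering point concrete, here is a small framework-free sketch. The array shapes are hypothetical stand-ins for your records and the provider's results, not Atlas types. The results are deliberately shuffled and one has failed, and matching on the key still writes each caption to the right record.

```php
<?php

// Sketch: results may come back in any order, so match on the echoed key,
// never on position. These array shapes are hypothetical stand-ins.

$records = [
    '7' => ['caption' => null],
    '8' => ['caption' => null],
    '9' => ['caption' => null],
];

// Shuffled relative to submission order, with one failed request.
$results = [
    ['key' => '9', 'ok' => true,  'text' => 'A red bicycle.'],
    ['key' => '7', 'ok' => true,  'text' => 'Two dogs on a beach.'],
    ['key' => '8', 'ok' => false, 'text' => null],
];

foreach ($results as $result) {
    if (! $result['ok'] || ! isset($records[$result['key']])) {
        continue; // failed or unknown key: leave the record untouched
    }

    $records[$result['key']]['caption'] = $result['text'];
}
```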

Turn on persistence and it tracks itself

This is where it gets nice. If you have Atlas persistence enabled, the exact same submit() call becomes self-managing. It persists a BatchJob, and a scheduled command brings the results in for you:

php

use Illuminate\Support\Facades\Schedule;

Schedule::command('atlas:batch-poll')->everyFiveMinutes();

Then you just listen for completion and write the results back to your own models:

php

use Atlasphp\Atlas\Events\BatchCompleted;

class WriteCaptionsBack
{
    public function handle(BatchCompleted $event): void
    {
        foreach ($event->job->results()->where('status', 'succeeded')->cursor() as $result) {
            Image::where('id', $result->custom_id)->update(['caption' => $result->response['text']]);
        }
    }
}

Submit, walk away, get an event when it's done. The rolled-up token usage lands on the job row too, so batch spend shows up in your cost reporting instead of vanishing into a provider dashboard.

It scales to the jobs you actually have

A single batch can hold tens of thousands of requests, but real workloads get split up for file-size limits or progress granularity. Atlas has batch groups for that. Chunk a giant query into many batches, attach them all to one group, and ask the group how it's doing:

php

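use App\Models\Image; // your Eloquent model (namespace assumed)
use Atlasphp\Atlas\Atlas;
use Atlasphp\Atlas\Input\Image as ImageInput; // aliased so it doesn't clash with the model

$group = Atlas::batchGroup('caption-run');

Image::whereNull('caption')->chunkById(1000, function ($images) use ($group) {
    $batch = Atlas::batch('openai')->group($group);

    foreach ($images as $image) {
        $batch->add(
            Atlas::text()->model('gpt-5')->message('Caption this image.', [ImageInput::fromUrl($image->url)]),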
            key: (string) $image->id,
        );
    }

    $batch->submit();
});

$group->progress();   // ['total' => 4000, 'succeeded' => 3820, 'completed_jobs' => 3, ...]
$group->isComplete(); // true once every job in the group is terminal
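
As a rough illustration of what a group-level roll-up involves, the sketch below sums per-job counters in plain PHP. The per-job array shape, the status names, and the summarizeGroup helper are assumptions made for the example, not Atlas's actual schema.

```php
<?php

// Sketch of a group-level roll-up. The per-job array shape and the set of
// terminal statuses are assumptions, not Atlas's actual schema.

const TERMINAL_STATUSES = ['completed', 'failed', 'expired', 'cancelled'];

function summarizeGroup(array $jobs): array
{
    $progress = ['total' => 0, 'succeeded' => 0, 'failed' => 0, 'completed_jobs' => 0];

    foreach ($jobs as $job) {
        $progress['total']     += $job['total'];
        $progress['succeeded'] += $job['succeeded'];
        $progress['failed']    += $job['failed'];

        if (in_array($job['status'], TERMINAL_STATUSES, true)) {
            $progress['completed_jobs']++;
        }
    }

    // The group is done only when every job has reached a terminal status.
    $progress['is_complete'] = $jobs !== [] && $progress['completed_jobs'] === count($jobs);

    return $progress;
}

$progress = summarizeGroup([
    ['status' => 'completed',   'total' => 1000, 'succeeded' => 990,  'failed' => 10],
    ['status' => 'completed',   'total' => 1000, 'succeeded' => 1000, 'failed' => 0],
    ['status' => 'in_progress', 'total' => 1000, 'succeeded' => 400,  'failed' => 0],
]);
```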

It works the same across OpenAI (text, vision, and embeddings), Anthropic, and Google Gemini. Swap the provider string and Atlas maps it to that provider's native batch endpoint. One command bounds the history so the tables don't grow forever (atlas:batch-prune, 90-day retention by default).

The honest part: tools and per-request middleware don't go into a batch, because a batch line is a single one-shot request resolved later, and the tool loop needs synchronous round-trips. Atlas rejects those at build time instead of silently dropping them. Everything that's part of the request body survives: messages, vision input, structured output schemas, reasoning effort, temperature, provider options.
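
The fail-fast idea can be sketched in a few lines. The request array shape, the assertBatchable helper, and the choice of exception are all hypothetical. Only the behavior (reject at build time rather than drop silently) comes from the description above.

```php
<?php

// Sketch of build-time rejection. The request shape and exception type are
// hypothetical; only the fail-fast behavior is taken from the text.

function assertBatchable(array $request): void
{
    foreach (['tools', 'middleware'] as $unsupported) {
        if (! empty($request[$unsupported])) {
            throw new InvalidArgumentException(
                "Batch requests cannot include {$unsupported}: a batch line is a single one-shot request."
            );
        }
    }
}

$plain     = ['messages' => ['Caption this image.'], 'temperature' => 0.2];
$withTools = $plain + ['tools' => ['lookup_order']];

assertBatchable($plain); // request-body settings pass through untouched

try {
    assertBatchable($withTools);
    $rejected = false;
} catch (InvalidArgumentException $e) {
    $rejected = true; // surfaced while building the batch, not after it has run
}
```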

Reasoning, Normalized Across Every Provider

The reasoning models are good now, and every provider exposes the dial differently. OpenAI takes a reasoning.effort string. Anthropic wants a thinking.budget_tokens integer and quietly forbids you from also setting temperature. Gemini buries it under generationConfig.thinkingConfig. xAI collapses it to two levels. Keeping that straight in application code is exactly the kind of provider trivia Atlas exists to absorb.

So now there's one knob:

php

use Atlasphp\Atlas\Enums\ReasoningEffort;

$response = Atlas::text('openai', 'gpt-5')
    ->reasoning(ReasoningEffort::High)
    ->message('Prove the square root of 2 is irrational.')
    ->asText();

$response->reasoning;              // the thought summary, when the model returns one
$response->usage->reasoningTokens; // reasoning tokens, when reported

Pick Minimal, Low, Medium, or High, and Atlas translates it to each provider's native format, including deriving a sensible token budget for the budget-based providers. You can override the budget and ask for thought summaries when you want them:

php

Atlas::text('anthropic', 'claude-sonnet-4-5')
    ->reasoning(ReasoningEffort::High, budgetTokens: 24000, includeSummary: true)
    ->message('...')
    ->asText();
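
To show the kind of translation involved, here is a rough sketch that maps one effort level to provider-specific request fragments. The Effort enum, the reasoningFragment function, and the token budgets are illustrative assumptions, not Atlas's real mapping. The field names beyond those mentioned above (type, thinkingBudget) follow the providers' public request formats as best understood here.

```php
<?php

// Sketch of translating one effort level into provider-specific request
// fragments. Enum, function, and budgets are illustrative assumptions.

enum Effort: string
{
    case Minimal = 'minimal';
    case Low = 'low';
    case Medium = 'medium';
    case High = 'high';

    // Hypothetical budgets for providers that take a token count.
    public function budgetTokens(): int
    {
        return match ($this) {
            self::Minimal => 1024,
            self::Low     => 4096,
            self::Medium  => 8192,
            self::High    => 16384,
        };
    }
}

function reasoningFragment(string $provider, Effort $effort, ?int $budgetTokens = null): array
{
    // An explicit budget overrides the one derived from the effort level.
    $budget = $budgetTokens ?? $effort->budgetTokens();

    return match ($provider) {
        // Effort-based: pass the level through as a string.
        'openai' => ['reasoning' => ['effort' => $effort->value]],
        // Budget-based: send an integer token budget instead.
        'anthropic' => ['thinking' => ['type' => 'enabled', 'budget_tokens' => $budget]],
        'gemini' => ['generationConfig' => ['thinkingConfig' => ['thinkingBudget' => $budget]]],
        default => throw new InvalidArgumentException("Unknown provider: {$provider}"),
    };
}

$openai    = reasoningFragment('openai', Effort::High);
$anthropic = reasoningFragment('anthropic', Effort::High, budgetTokens: 24000);
$gemini    = reasoningFragment('gemini', Effort::Low);
```

The point of the single enum is that application code never branches on the provider. Only this one translation function does.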

The part I care about most is that reasoning survives the tool loop. Atlas replays each provider's signed reasoning context across steps automatically (Anthropic's signed thinking blocks, OpenAI's encrypted reasoning items), so a multi-step agent conversation keeps thinking the whole way through instead of losing its train of thought after the first tool call. v3.6 also fixed reasoning through persisted-conversation reloads, so it holds up when you rehydrate a saved chat. Set it once as an agent default with ->reasoning() and forget about it.

Token Counting Before You Spend a Cent

This one is small and I use it constantly. You can now count the input tokens a request would consume before you send it, for free:

php

$count = Atlas::text('anthropic', 'claude-sonnet-4-5')
    ->instructions('You are a helpful assistant.')
    ->message('Summarize the attached report.')
    ->countTokens();

$count->inputTokens; // 1287
$count->estimated;   // false, this was an exact provider count

The reason this beats a local strlen / 4 estimate is that it counts the actual payload Atlas would send, including the system prompt, every tool's JSON schema, and any attached images or PDFs. Those are often the largest part of a request, and a local estimate built from the message text can't see any of them. On Anthropic, OpenAI, and Google it calls the provider's own count endpoint, so the number is exact. On the others it falls back to a heuristic and sets estimated to true so you always know which you got.
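
A small sketch shows why measuring the whole payload matters even for a heuristic. The TokenEstimate class, the estimateInputTokens function, the payload shape, and the four-characters-per-token ratio are all assumptions for illustration, not how Atlas computes its fallback.

```php
<?php

// Sketch of a heuristic fallback count. The class, function, payload shape,
// and 4-characters-per-token ratio are assumptions for illustration.

final class TokenEstimate
{
    public function __construct(
        public readonly int $inputTokens,
        public readonly bool $estimated,
    ) {}
}

function estimateInputTokens(array $payload): TokenEstimate
{
    // Measure the whole serialized payload, not just the user message,
    // so instructions and tool schemas are part of the estimate.
    $json = json_encode($payload, JSON_THROW_ON_ERROR);

    return new TokenEstimate((int) ceil(strlen($json) / 4), estimated: true);
}

$message = 'Summarize the attached report.';

$payload = [
    'instructions' => 'You are a helpful assistant.',
    'messages'     => [$message],
    'tools'        => [[
        'name'       => 'fetch_report',
        'parameters' => ['type' => 'object', 'properties' => ['id' => ['type' => 'string']]],
    ]],
];

$messageOnly = (int) ceil(strlen($message) / 4); // what a naive estimate would see
$fullPayload = estimateInputTokens($payload);    // what would actually be sent
```

With even one tool schema attached, the full-payload estimate is several times larger than the message-only figure.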

It makes budget enforcement trivial. Gate a request before it's ever sent:

php

$count = Atlas::text('anthropic', 'claude-sonnet-4-5')
    ->message($userInput)
    ->countTokens();

if ($count->inputTokens > $tenant->remainingTokenBudget()) {
    throw new BudgetExceededException('Request would exceed the remaining token budget.');
}

Atlas ships the count, not the policy, and it deliberately ships no price table. Model prices change and a stale one is worse than none, so you apply your own current rates to the number. That's the right division of labor.
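
Applying your own rates is simple arithmetic. A minimal sketch, assuming you keep per-million-token prices in your own config: the estimateInputCost function, the model name, and the rate below are made-up placeholders, not real prices.

```php
<?php

// Sketch: turn a token count into a cost using rates you supply.
// The rate and model name below are placeholders, not real prices.

function estimateInputCost(int $inputTokens, float $usdPerMillionTokens): float
{
    return $inputTokens / 1_000_000 * $usdPerMillionTokens;
}

$rates = [
    // Hypothetical figure; load current ones from your own config.
    'example-model' => 3.00,
];

$cost = estimateInputCost(1287, $rates['example-model']);
```

Keeping the rate table in your own code means a price change is a config update on your side, not a package upgrade.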

The Pattern Behind All Three

Look at what these features have in common. Batch APIs existed before v3.6. Reasoning dials existed. Token-counting endpoints existed. Providers shipped all of it. What didn't exist was a single Laravel-native layer that gave you all of it with one consistent API, the same way across every provider, with persistence and cost tracking wired in when you want it and zero ceremony when you don't.

That's the whole thesis of Atlas, and it's why I keep choosing it over the alternatives even though I'm the one who has to maintain it. The other packages in this space are good. But when I line up what I actually need to ship production AI features (thirteen modalities, real-time voice, agent delegation, runtime middleware, incremental embeddings, and now batch processing, normalized reasoning, and pre-flight token counting), the gap isn't close. Most of them are still working on the basics. Atlas is shipping the advanced stuff.

I don't say that to dunk on anyone. I say it because comprehensiveness is the entire point. The value of an SDK isn't the one feature you came for. It's that every adjacent thing you need next is already there, built the same way, so you never drop out of the framework to go wrestle a raw provider API. Every release closes another one of those gaps before I hit it in my own work.

Why I Built It

Same reason as always. I have products to ship, and I got tired of rebuilding the tedious, correctness-sensitive plumbing in every project. Batching was the most recent thing I caught myself about to hand-roll. So I built it into Atlas instead, the way I'd want to use it.

Check out Atlas on GitHub, read the batch docs, and let me know what you build with it. I've got bigger things coming that run on top of all of this, and I can't wait to share them soon.
