Why AI Agents Are a New Casino
Token subscriptions, iterative nudges, and agent retries create a spending model that looks less like software procurement and more like gambling economics.
The Problem
AI agents are often sold as productivity infrastructure, but the day-to-day interaction model feels closer to a casino. First you buy chips, except the chips are tokens or a monthly subscription. Then you place a bet, except the bet is a prompt. Then you wait to see whether your number comes up, except the outcome is whether the agent actually did the task you had in mind.
The analogy is not rhetorical decoration. In both systems, the cost of entry is intentionally abstracted. You do not think about money once you hold chips in your hand, and you do not think about spend once the platform says you still have credits, tokens, or a plan allowance. The transaction becomes lighter precisely so that repeated use becomes easier.
The second similarity is variance. One query may produce exactly the refactor, dataset summary, or deployment note you wanted. The next one may be slightly wrong in a way that forces another round. The third may be plausible but incomplete. The result is a spending loop where each extra iteration feels small even if the cumulative burn is not.
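The retry economics can be made concrete with a small sketch. All numbers here are invented for illustration: if each attempt succeeds with probability p and failures trigger independent retries, the number of attempts until success is geometric, so the expected attempt count is 1/p.

```python
# Illustrative sketch (hypothetical prices and token counts): why
# per-prompt cost understates true cost when outcomes are variable.

def expected_task_cost(tokens_per_attempt: int,
                       price_per_1k_tokens: float,
                       p_success: float) -> float:
    """Expected spend to finish one task, assuming independent retries."""
    expected_attempts = 1 / p_success  # mean of a geometric distribution
    total_tokens = expected_attempts * tokens_per_attempt
    return total_tokens / 1000 * price_per_1k_tokens

# A task that "costs" 2,000 tokens per try at $0.01 per 1k tokens:
print(round(expected_task_cost(2000, 0.01, 1.0), 4))  # → 0.02 (first try lands)
print(round(expected_task_cost(2000, 0.01, 0.4), 4))  # → 0.05 (40% hit rate)
```

A 40% first-try hit rate more than doubles the effective price of the same task, which is exactly the gap between the advertised cost and the realized cost.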
The Challenge: The Subway Upsell Loop
The original sandwich price looks fine until every step invites one more paid refinement.
How resources disappear
The second analogy is Subway. The advertised base sandwich price is acceptable. The real spend appears through a sequence of small add-ons: cheese, bacon, avocado, a larger drink, an extra cookie. AI helper products do the same thing with iteration prompts.
Typical upsell moments:
- The coding assistant completes the requested method, then asks whether it should apply the same change to all similar methods.
- It drafts the release note, then asks whether it should also rewrite the changelog in the house style.
- It summarizes a dataset, then asks whether it should run one more pass to verify edge cases, one more after that to generate charts, and one more to produce a polished stakeholder version.
This is where the Subway analogy reconnects to the casino one. The system does not need to force reckless behavior. It only needs to make one more step sound cheap, then another, then another. The spending model thrives on low-friction continuation.
The Solution: Put the House Rules in Writing
An explicit `AGENTS.md` does not eliminate stochastic behavior, but it reduces needless bets and blocks many upsell loops before they start.
How to change the economics
The practical fix is not moral outrage about tokens. It is constraint design. If the agent has to infer your preferences every time, it will keep asking. If the scope boundary is vague, it will keep proposing extra scope. If the default behavior is "try one more pass," you will keep paying for one more pass.
A good `AGENTS.md` shifts those decisions from the moment of interaction into project policy. It tells the agent when to stop, when to ask, when to avoid optional polish, how to write outputs, and what counts as complete. That reduces two kinds of waste at once:
- Less casino waste: fewer retries caused by ambiguous expectations.
- Less Subway waste: fewer incremental add-ons framed as harmless improvements.
In other words, the point of a house rule file is not only quality control. It is budget control. You are narrowing the probability distribution of acceptable agent behavior before the first prompt is even sent.
Without: subscribe -> ask -> retry -> extend scope -> polish -> retry again
With: define rules -> ask once -> receive bounded output -> stop or escalate intentionally
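The two flows above can be priced out with a toy model. The per-step token figures below are invented for illustration; only the shape matters: without house rules, every continuation adds a paid step, while rules cost a one-time bit of extra context.

```python
# Hypothetical token costs for the two interaction loops above.

WITHOUT_RULES = {  # subscribe -> ask -> retry -> extend scope -> polish -> retry
    "ask": 2000,
    "retry": 2000,
    "extend_scope": 3000,
    "polish": 1500,
    "retry_again": 2000,
}

WITH_RULES = {  # define rules -> ask once -> bounded output
    "ask_once": 2200,  # slightly larger prompt: the rules ride along as context
}

def total_tokens(steps: dict) -> int:
    """Sum the token cost of every step in a loop."""
    return sum(steps.values())

print(total_tokens(WITHOUT_RULES))  # → 10500
print(total_tokens(WITH_RULES))     # → 2200
```

The absolute numbers are made up, but the asymmetry is the point: the rules file is a fixed cost paid once per project, while the unbounded loop is a variable cost paid on every task.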
Code: Two AGENTS.md Patterns
The mechanism is simple: tell the agent what "done" means before it starts spending your tokens.
Example 1: Product web app
```markdown
# Scope and stopping rules
Do exactly the requested change.
Do not add optional refactors, polish passes, or extra cleanup unless asked.
If a change could expand to more than 3 files, stop and ask first.

# Output policy
Return the result, tests run, and any known risk.
Do not suggest "one more pass" unless there is a concrete failing test or bug.

# Cost control
Prefer one strong implementation over several exploratory variants.
Do not rewrite copy, comments, or naming outside the changed scope.
Do not add follow-up tasks unless they are required for correctness.

# Verification
Run only the smallest test command that covers the changed behavior.
If no targeted test exists, say so plainly instead of inventing extra work.
```
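Scope rules like the "more than 3 files" threshold above do not have to rely on the agent's self-restraint; tooling can enforce them. A minimal sketch, assuming a hypothetical pre-flight check in an agent harness (the threshold and file names are illustrative):

```python
# Enforce the scope rule in code rather than trusting the agent to obey it.

MAX_FILES_WITHOUT_APPROVAL = 3  # mirrors the AGENTS.md rule above

def gate_change(files_to_touch: list[str]) -> str:
    """Return 'proceed' when the change is in scope, 'ask' when it is not."""
    if len(files_to_touch) > MAX_FILES_WITHOUT_APPROVAL:
        return "ask"  # stop and ask first, per the scope rule
    return "proceed"

print(gate_change(["app.py", "models.py"]))           # → proceed
print(gate_change(["a.py", "b.py", "c.py", "d.py"]))  # → ask
```

Moving the check out of the conversation means scope expansion becomes a hard gate instead of a polite suggestion the agent may talk its way past.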
Example 2: Research workflow
```markdown
# Research workflow
State the hypothesis before proposing experiments.
Prefer one reproducible notebook or script over multiple speculative analyses.

# Cost control
Do not run extra exploratory passes unless they answer a stated hypothesis.
Do not suggest additional plots, ablations, or parameter sweeps unless requested.
If a result is uncertain, explain the uncertainty in one paragraph before proposing more work.

# Deliverables
Return:
1. the direct answer,
2. the code or notebook cell that produced it,
3. the smallest next experiment only if needed.

# Stop condition
When the requested analysis is complete, stop.
Do not append optional "could also analyze" ideas by default.
```
Why this works
- It removes hidden upsells: optional refinement is no longer the default.
- It reduces retries: the agent has clearer acceptance criteria from the start.
- It makes cost visible: scope expansion becomes an explicit decision rather than a conversational drift.
Summary: Better Rules, Fewer Bets
AI agents are useful, but their commercial interaction model often rewards repeated wagering and incremental upselling unless you actively constrain it.
The casino analogy explains the variance: you pay first, then discover whether the outcome was exact enough. The Subway analogy explains the drain: the first task is affordable, but the system keeps offering one more addition. Together they describe why agent spend can feel modest per interaction and excessive in aggregate.
The solution is not to avoid agents. It is to stop treating every prompt as a fresh negotiation. A disciplined `AGENTS.md` sets scope, stopping conditions, and verification policy in advance. That does not make the model deterministic, but it does make the process less expensive and less manipulative.
Fewer retries
Clear acceptance criteria narrow the gap between what you mean and what the agent tries.
Less scope creep
Optional add-ons become explicit choices instead of the default conversation path.
Lower spend variance
Token consumption becomes easier to predict because the agent has fewer reasons to ask for another pass.