My relationship with the product
Full stack user: consumer app on my phone, Claude Code in my terminal most of the day, Pro subscription I renew without really thinking about it, API wired into a few side projects. I'm not reviewing this from a distance.
The honest reason I'm writing this is that I like the product. A lot, actually. Which makes the specific way it fails me more frustrating, because it doesn't feel like a hard problem.
The job Claude actually does
What Claude is actually useful for — at least for me — isn't question-answering in the search engine sense. It's staying in motion on something complex. You're mid-way through architecting something, or three threads deep into debugging, and Claude holds the whole context so you don't have to re-explain everything every turn. Not a smarter Google. A collaborator who remembers where we were.
When the context window fills up, that collaborator is just gone. Mid-sentence sometimes.
What this post is and isn't about
Before I get into it: I'm not touching memory or cross-session persistence. That's genuinely hard and there's real money and serious engineering going into it from several directions right now. I don't have anything to add that isn't already being built somewhere.
What I'm looking at is narrower. Session-level UX. What happens within a single working session when limits kick in. That's a product problem, not a research problem, and it's fixable without waiting for the next model.
Where the product breaks down
Everyone who uses Claude seriously knows about the token limit. That's not the problem. The problem is what happens when you hit it — the session just ends. No warning early enough to act on, no structured summary of where things stood, nothing. Just: start a new chat.
If you've been working for two hours on something with real state — decisions made, context built up, a whole working understanding between you and the model — you're now reconstructing that from memory. Or you're giving up. Most people give up.
Then there's the surface fragmentation. Claude.ai, Claude Code, and the API have different limit behaviors and no shared state. I've started something in Claude Code and wanted to continue in the browser — that's not really a thing you can do. You're not picking up where you left off, you're starting over with extra steps.
The constraint isn't the token limit. The constraint is that hitting it feels like abandonment rather than a pause.
The Canva comparison
Canva lets you build an entire presentation before it asks you to pay. You invest the time, you see the finished thing, and the paywall is at the export step. The conversion ask comes after you've already experienced the value.
Claude works the opposite way. The wall hits mid-build. You haven't finished anything. You've done two hours of collaborative work and then the product stops before you can see where it was going. That's not a neutral experience — it actively damages how you feel about the subscription.
I'm not saying Anthropic should copy Canva's pricing structure. The lesson is simpler: make the product earn the upgrade decision. Don't make the limit event be the thing that makes someone wonder if they should cancel.
What I'd prioritize: Handoff Mode
One thing. I'd build what I'd call Handoff Mode: proactive and structured rather than abrupt.
Phase 1: Proactive visibility
When Claude detects it's approaching roughly 80% of the context window, it surfaces a lightweight status update inline: token usage as a percentage, a plain-language read on what's left, and a projection of what can realistically be completed in the current session. Not a warning, not a hard wall. A clear look at the situation while there's still time to act on it.
Phase 2: Structured checkpoint
If the session is going to hit the limit mid-task, Claude proactively generates a checkpoint document before cutting off. Not a transcript. A structured handoff:
- What's been completed
- Decisions made and the reasoning behind them
- What's remaining, in priority order
- The exact prompt to resume intelligently in a new session
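One possible shape for that handoff, with the four sections above as structured fields rather than prose. Every name here is illustrative, not a real Claude feature:

```python
from dataclasses import dataclass, asdict
import json

@dataclass
class Checkpoint:
    completed: list[str]       # what's been completed
    decisions: dict[str, str]  # decision -> the reasoning behind it
    remaining: list[str]       # what's left, in priority order
    resume_prompt: str         # paste into a fresh session to continue

cp = Checkpoint(
    completed=["Schema migration written", "Unit tests passing"],
    decisions={"Use UUIDv7 keys": "sortable, avoids hot-shard inserts"},
    remaining=["Wire up the API route", "Load-test the endpoint"],
    resume_prompt=("We are mid-task. Read the checkpoint below, "
                   "then continue from the first remaining item."),
)
print(json.dumps(asdict(cp), indent=2))
```

Because it's structured rather than a transcript, the same object can be rendered as a card in the UI, stored server-side, or fed back to the model as the opening message of the next session.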
Phase 3: Cross-surface continuity
The checkpoint becomes a resumption point that works across surfaces. Start in Claude Code, pick up in claude.ai, continue via the API. One user, one task, coherent context regardless of where you're working.
On autonomous continuation
The obvious next idea: let Claude keep working in the background when your quota resets. Queue the remaining tasks, run them when limits refresh, send a notification when something's done. I've thought about this and I think it's the right long-term direction, with a real caveat.
If Claude misreads something on step two of an autonomous run, steps three through five build on that misread. You come back to output that looks finished but isn't. Debugging that is worse than resuming a paused session yourself — the time you'd save doesn't cover the trust repair when it goes wrong.
Build the checkpoint and persistence layer first. Let people get comfortable with Claude accurately representing where work stands. Then add autonomous continuation as something you opt into, with clear guardrails on what Claude can and can't do without checking in. The mobile notification at that stage is a progress card — what got done, what's next, one tap to review. Not a screenshot of code. Think how a good CI/CD pipeline surfaces a build result: proof of work, not just a status.
The metric I care about
The number I'd track first is session continuation rate: how many users who hit a limit come back to the same task within 24 hours. My guess is it's pretty low. Hitting a limit feels like a hard stop, not a pause. People close the tab.
Handoff Mode is specifically built to move that number. And it's clean to measure — run it against a control group, watch continuation rates and time-to-resume over 30 days, and you know pretty quickly if it's working.
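The metric itself is simple to compute. A sketch, assuming an event log with user, task, event type, and timestamp fields (all illustrative names):

```python
from datetime import datetime, timedelta

def continuation_rate(events: list[dict],
                      window: timedelta = timedelta(hours=24)) -> float:
    """Share of limit-hit events where the same user resumed the
    same task within the window."""
    hits = [e for e in events if e["event"] == "limit_hit"]
    resumes = [e for e in events if e["event"] == "task_resumed"]
    continued = sum(
        any(r["user"] == h["user"] and r["task"] == h["task"]
            and timedelta(0) < r["ts"] - h["ts"] <= window
            for r in resumes)
        for h in hits
    )
    return continued / len(hits) if hits else 0.0

t0 = datetime(2025, 1, 1, 9, 0)
events = [
    {"user": "a", "task": "t1", "event": "limit_hit",    "ts": t0},
    {"user": "a", "task": "t1", "event": "task_resumed", "ts": t0 + timedelta(hours=3)},
    {"user": "b", "task": "t2", "event": "limit_hit",    "ts": t0},
]
print(continuation_rate(events))  # → 0.5
```

Run the same computation for a treatment group with Handoff Mode and a control group without it, and the 30-day comparison falls straight out.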
What this is really about
Anthropic doesn't have a capability problem. Claude is good enough that people pay for it without being asked twice. What they do have is a product that regularly creates moments where users reconsider. A badly handled limit event is exactly that moment — you're invested in something, it stops, and now you're thinking about whether this subscription makes sense.
Handoff Mode doesn't make the context window bigger. It makes the end of the context window feel less like the product giving up on you. Session continuation rate and upgrade conversion from limit events are both things you can move with better UX, without needing to wait on model research to ship.
The upgrade conversation should happen when the product is working for you, not when it's stopping you.