March 2026 · 10 min read · Entry 01

If I Was a PM @ Anthropic

I've been on Claude long enough that the token limit isn't a surprise anymore — it's something I plan around. That shouldn't be normal. This is about why it is, and what I'd do about it.

My relationship with the product

Full-stack user: the consumer app on my phone, Claude Code in my terminal most of the day, a Pro subscription I renew without really thinking about it, and the API wired into a few side projects. I'm not reviewing this from a distance.

The honest reason I'm writing this is that I like the product. A lot, actually. Which makes the specific way it fails me more frustrating, because it doesn't feel like a hard problem.

The job Claude actually does

What Claude is actually useful for — at least for me — isn't question-answering in the search engine sense. It's staying in motion on something complex. You're mid-way through architecting something, or three threads deep into debugging, and Claude holds the whole context so you don't have to re-explain everything every turn. Not a smarter Google. A collaborator who remembers where we were.

When the context window fills up, that collaborator is just gone. Mid-sentence sometimes.

What this post is and isn't about

Before I get into it: I'm not touching memory or cross-session persistence. That's genuinely hard and there's real money and serious engineering going into it from several directions right now. I don't have anything to add that isn't already being built somewhere.

What I'm looking at is narrower. Session-level UX. What happens within a single working session when limits kick in. That's a product problem, not a research problem, and it's fixable without waiting for the next model.

Where the product breaks down

Everyone who uses Claude seriously knows about the token limit. That's not the problem. The problem is what happens when you hit it: the session just ends. No meaningful advance warning, no structured summary of where things stood, nothing. Start a new chat.

If you've been working for two hours on something with real state — decisions made, context built up, a whole working understanding between you and the model — you're now reconstructing that from memory. Or you're giving up. Most people give up.

Then there's the surface fragmentation. Claude.ai, Claude Code, and the API have different limit behaviors and no shared state. I've started something in Claude Code and wanted to continue in the browser — that's not really a thing you can do. You're not picking up where you left off, you're starting over with extra steps.

The constraint isn't the token limit. The constraint is that hitting it feels like abandonment rather than a pause.

The Canva comparison

Canva lets you build an entire presentation before it asks you to pay. You invest the time, you see the finished thing, and the paywall is at the export step. The conversion ask comes after you've already experienced the value.

Claude works the opposite way. The wall hits mid-build. You haven't finished anything. You've done two hours of collaborative work and then the product stops before you can see where it was going. That's not a neutral experience — it actively damages how you feel about the subscription.

I'm not saying Anthropic should copy Canva's pricing structure. The lesson is simpler: make the product earn the upgrade decision. Don't let the limit event be the thing that makes someone wonder if they should cancel.


What I'd prioritize: Handoff Mode

One thing. I'd build what I'd call Handoff Mode: proactive and structured rather than abrupt. It breaks down into three phases.

Feature spec: Handoff Mode

Phase 1: Proactive visibility

When Claude detects it's approaching roughly 80% of the context window, it surfaces a lightweight status update inline: token usage as a percentage, a plain-language read on what's left, and a projection of what can realistically be completed in the current session. Not a warning, not a hard wall. A clear look at the situation while there's still time to act on it.
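To make the Phase 1 behavior concrete, here's a minimal sketch of the status check. Every name here is mine, not a real Anthropic API; it assumes the client can see tokens used, the window size, and a rolling average of tokens per exchange.

```python
# Hypothetical Phase 1 check: surface a status line at ~80% usage, not 98%.
# None of these names are real Anthropic APIs; token counts are assumed inputs.
from dataclasses import dataclass
from typing import Optional

WARN_THRESHOLD = 0.80  # tell the user while they can still act on it

@dataclass
class ContextStatus:
    used_tokens: int
    window_tokens: int
    avg_tokens_per_turn: float  # rolling average over recent exchanges

    @property
    def used_pct(self) -> float:
        return self.used_tokens / self.window_tokens

    @property
    def turns_left(self) -> int:
        # Projection: how many more exchanges like the recent ones fit
        remaining = self.window_tokens - self.used_tokens
        return int(remaining // self.avg_tokens_per_turn)

    def status_line(self) -> Optional[str]:
        if self.used_pct < WARN_THRESHOLD:
            return None  # below threshold: show nothing, no alarm bells
        return (f"Context {self.used_pct:.0%} used: "
                f"roughly {self.turns_left} exchanges like the last few remain.")

status = ContextStatus(used_tokens=165_000, window_tokens=200_000,
                       avg_tokens_per_turn=4_000)
print(status.status_line())
```

The point of the projection is that "82% used, about 8 exchanges left" is actionable in a way a raw percentage isn't: it tells you whether to wrap up or keep going.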

Phase 2: Structured checkpoint

If the session is going to hit the limit mid-task, Claude proactively generates a checkpoint document before cutting off. Not a transcript. A structured handoff:

  • What's been completed
  • Decisions made and the reasoning behind them
  • What's remaining, in priority order
  • The exact prompt to resume intelligently in a new session
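The four bullets above are really a data structure, not prose. A rough sketch of what that checkpoint could look like, with field names that are my guesses at a handoff schema rather than any real spec:

```python
# Sketch of the Phase 2 checkpoint as structured data, not a transcript.
# Field names and the markdown rendering are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Checkpoint:
    completed: list       # what's been completed
    decisions: dict       # decision -> the reasoning behind it
    remaining: list       # what's left, in priority order
    resume_prompt: str    # the exact prompt to resume in a new session

    def to_markdown(self) -> str:
        lines = ["## Handoff checkpoint", "", "### Completed"]
        lines += [f"- {item}" for item in self.completed]
        lines += ["", "### Decisions"]
        lines += [f"- {d}: {why}" for d, why in self.decisions.items()]
        lines += ["", "### Remaining (priority order)"]
        lines += [f"{i}. {item}" for i, item in enumerate(self.remaining, 1)]
        lines += ["", "### Resume prompt", self.resume_prompt]
        return "\n".join(lines)

cp = Checkpoint(
    completed=["Schema migration written"],
    decisions={"Use optimistic locking": "low contention expected"},
    remaining=["Wire up retry logic", "Add integration test"],
    resume_prompt="We are mid-refactor of the sync service; start from step 2.",
)
print(cp.to_markdown())
```

The structure is the feature: a transcript makes the user re-read two hours of conversation, while a checkpoint like this is skimmable in thirty seconds.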

Phase 3: Cross-surface continuity

The checkpoint becomes a resumption point that works across surfaces. Start in Claude Code, pick up in claude.ai, continue via the API. One user, one task, coherent context regardless of where you're working.
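Mechanically, cross-surface continuity just means the checkpoint lives in a store keyed by user and task, not by client. A toy version, with the storage layer and key scheme entirely assumed:

```python
# Toy sketch of surface-agnostic resumption: a checkpoint written by one
# client, read by another. The in-memory store and (user, task) key scheme
# are assumptions for illustration, not how Anthropic actually persists state.
import json

_store: dict = {}  # (user_id, task_id) -> checkpoint JSON

def save_checkpoint(user_id: str, task_id: str, checkpoint: dict) -> None:
    _store[(user_id, task_id)] = json.dumps(checkpoint)

def load_checkpoint(user_id: str, task_id: str) -> dict:
    return json.loads(_store[(user_id, task_id)])

# Saved at the end of a Claude Code session...
save_checkpoint("u_123", "sync-refactor",
                {"remaining": ["retry logic"], "resume_prompt": "Start at step 2."})

# ...loaded later from claude.ai or an API client. Same user, same task,
# same context, regardless of surface.
resumed = load_checkpoint("u_123", "sync-refactor")
```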

On autonomous continuation

The obvious next idea: let Claude keep working in the background when your quota resets. Queue the remaining tasks, run them when limits refresh, send a notification when something's done. I've thought about this and I think it's the right long-term direction, with a real caveat.

If Claude misreads something on step two of an autonomous run, steps three through five build on that misread. You come back to output that looks finished but isn't. Debugging that is worse than resuming a paused session yourself — the time you'd save doesn't cover the trust repair when it goes wrong.

Sequencing take

Build the checkpoint and persistence layer first. Let people get comfortable with Claude accurately representing where work stands. Then add autonomous continuation as something you opt into, with clear guardrails on what Claude can and can't do without checking in. The mobile notification at that stage is a progress card: what got done, what's next, one tap to review. Not a screenshot of code. Think of how a good CI/CD pipeline surfaces a build result: proof of work, not just a status.


The metric I care about

The number I'd track first is session continuation rate: how many users who hit a limit come back to the same task within 24 hours. My guess is it's pretty low. Hitting a limit feels like a hard stop, not a pause. People close the tab.
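The definition above is computable from an ordinary event log. One way to do it, with an event shape I've invented for illustration:

```python
# Session continuation rate, as defined in the post: of all limit-hit events,
# the share where the same user resumed the same task within 24 hours.
# The event dict shape is invented for this sketch.
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)

def continuation_rate(events: list) -> float:
    hits = [e for e in events if e["type"] == "limit_hit"]
    if not hits:
        return 0.0
    continued = 0
    for hit in hits:
        continued += any(
            e["type"] == "resumed"
            and e["user"] == hit["user"]
            and e["task"] == hit["task"]
            and hit["at"] < e["at"] <= hit["at"] + WINDOW
            for e in events
        )
    return continued / len(hits)

t0 = datetime(2026, 3, 1, 9, 0)
log = [
    {"user": "a", "task": "t1", "type": "limit_hit", "at": t0},
    {"user": "a", "task": "t1", "type": "resumed", "at": t0 + timedelta(hours=2)},
    {"user": "b", "task": "t2", "type": "limit_hit", "at": t0},  # never came back
]
print(continuation_rate(log))  # 1 of 2 hits continued -> 0.5
```

Keying on task rather than just user matters: a user who comes back to start something unrelated shouldn't count as a continuation.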

Handoff Mode is specifically built to move that number. And it's clean to measure — run it against a control group, watch continuation rates and time-to-resume over 30 days, and you know pretty quickly if it's working.

Success metrics

  • Primary: session continuation rate. Target: meaningful lift within 30 days of launch vs. a control group.
  • Business: upgrade conversion from limit events. Window: users who hit a limit and upgrade within 7 days, Handoff vs. no Handoff.
  • User outcome: task completion rate across sessions. Signal: are multi-step tasks actually finishing, or just being resumed and abandoned again?
  • Engagement: time to resume. Signal: how quickly users return after a limit event. Shorter means the handoff was useful.
  • Failure signal: checkpoint generation rate vs. checkpoint usage rate. If users generate checkpoints but don't use them to resume, the problem is intent loss, not UX. That's a harder bet.

What this is really about

Anthropic doesn't have a capability problem. Claude is good enough that people pay for it without being asked twice. What it does have is a product that regularly creates moments where users reconsider. A badly handled limit event is exactly that moment: you're invested in something, it stops, and now you're thinking about whether this subscription makes sense.

Handoff Mode doesn't make the context window bigger. It makes the end of the context window feel less like the product giving up on you. Session continuation rate and upgrade conversion from limit events are both things you can move with better UX, without needing to wait on model research to ship.

The upgrade conversation should happen when the product is working for you, not when it's stopping you.