New ask Hacker News story: Cancelled 2x Cursor Ultra plans, here's why

4 by throwawayround | 4 comments on Hacker News.
Posting this because it took me way too long to figure out what was going on, and I wish I had seen a post like this earlier.

I just canceled two Cursor Ultra plans. My usage went from a steady ~$60–100/month to $500+ in a few days, projecting ~$1,600/month. Support told me this was “expected.” I did not suddenly start doing 10x more work.

Cursor shows a 200k context window and says content is summarized to stay within limits. Pricing is shown as $ per million tokens. Based on that, I monitored my call count and thought I was being careful.

What I did not realise:

- Cursor builds a very large hidden prompt state: conversation history, tool traces, agent state, extended reasoning, codebase context.
- That state is prompt-cached.
- On every call, the entire cached prefix is replayed.
- Anthropic bills cache read tokens for every replay.
- Cache reads are billed even if that content is later summarised or truncated before inference.

So the UI says “max 200k context”, but billing says otherwise.

Concrete example from my usage:

- MAX mode: off
- Actual user input: ~4k tokens
- Cache read tokens: ~21 million
- Total tokens billed: ~22 million
- Cost for one call: about $12

Claude never attended to 21M tokens. I still paid for them. This was not just Opus. It happened with Sonnet too.

Support explained that this is exactly how the API is billed, so there was no error, and that I should just use these models more carefully because they can consume a lot of tokens when they are thinking. But there is a limit to that, and what I was charged was way too high.

There is ZERO transparency about how the cache is used. And the cache breakpoints are decided by Cursor, so I don't think it's fair to throw the ball to Anthropic here.

The dangerous part is that cost becomes decoupled from anything you can see or reason about as a user. You think you are operating inside a 200k window, but you are paying for a much larger hidden history being replayed over and over.

I am not claiming a bug in Anthropic’s API.
This is a product transparency issue. If a tool can silently turn a few hundred dollars of usage into four figures because of hidden caching behaviour, users need much better visibility and controls. Support suggested spend controls, but my complaint is about how my pre-paid package was consumed.

If you use Cursor with long-running chats, agents, or large codebases, check your cache read tokens carefully. The UI will not warn you. The only thing you will see, a few days into your subscription, is: "You are projected to run out of your usage allowance in a few days."

I canceled and moved on, giving Claude Code a shot until this is fixed. Posting so others do not find out the hard way.
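
For anyone who wants to sanity-check their own numbers, the concrete example above can be worked through with a few lines of arithmetic. This is only a rough sketch. The token counts and the ~$12 cost come from the post. The helper names, and the idea that one visible call fans out into many model round trips that each re-read a 200k-token cached prefix, are assumptions made for illustration and not documented Cursor or Anthropic behaviour.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Helper names and the "round trips" model are hypothetical illustrations.

def cache_read_tokens(prefix_tokens: int, round_trips: int) -> int:
    """Cache-read tokens billed if the same cached prefix is re-read
    once per model round trip (simplest case: the prefix does not grow)."""
    return prefix_tokens * round_trips


def cache_read_share(cache_tokens: int, total_tokens: int) -> float:
    """Fraction of billed tokens that were cache reads."""
    return cache_tokens / total_tokens


def implied_rate_per_million(cost_usd: float, total_tokens: int) -> float:
    """Blended price per million tokens implied by a cost and a token count."""
    return cost_usd / (total_tokens / 1_000_000)


# Figures taken from the post (all approximate).
CACHE_READ = 21_000_000   # cache read tokens for one call
TOTAL = 22_000_000        # total tokens billed for that call
COST_USD = 12.0           # cost of that call

share = cache_read_share(CACHE_READ, TOTAL)
rate = implied_rate_per_million(COST_USD, TOTAL)

# Assumption: the cached prefix is capped at the advertised 200k window.
# How many re-reads of such a prefix would add up to the billed cache reads?
ASSUMED_PREFIX = 200_000
round_trips_needed = CACHE_READ // ASSUMED_PREFIX

print(f"cache reads as share of billed tokens: {share:.1%}")
print(f"implied blended rate: ${rate:.2f} per million tokens")
print(f"re-reads of a {ASSUMED_PREFIX:,}-token prefix needed: {round_trips_needed}")
```

On these figures, cache reads make up roughly 95% of the billed tokens, and the implied blended rate is about $0.55 per million tokens. Under the 200k-prefix assumption it would take on the order of a hundred re-reads to reach 21 million cache read tokens, which is why counting visible calls says little about the bill.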
