LLM plans (API/agentic)

What do folks use for their (personal) LLM needs?

In particular for non-chat (agentic) usage? (E.g. OpenCode)

I was thinking about trying something like OpenRouter to avoid being tied to a particular provider.

(E.g. if I get Claude I’m afraid I’ll be stuck with Claude Code; if I get OpenAI, I’ll use Codex; Gemini would need Antigravity, etc.)

Any alternatives to OpenRouter? (From what I understand, you pay a 5% surcharge on top of each provider’s API pricing, so you don’t need to create N accounts and can switch easily if a better provider comes along.)

Edit: I forgot another requirement: interactions/calls must not be persisted long term (or used for training). (I think this rules out free usage.)
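To make the surcharge concrete, here is a minimal sketch of the arithmetic. The 5% rate is only my understanding from above, and the $40 monthly spend is a made-up figure for illustration, not a real quote from any provider.

```python
def cost_with_surcharge(provider_cost_usd: float, surcharge_rate: float = 0.05) -> float:
    """Total cost when a router adds a percentage fee on top of provider pricing.

    surcharge_rate defaults to 5%, which is an assumption taken from the post,
    not a verified figure.
    """
    return provider_cost_usd * (1 + surcharge_rate)


# Hypothetical monthly spend at provider API prices (illustrative only).
monthly_provider_spend = 40.0
total = cost_with_surcharge(monthly_provider_spend)
print(f"Direct: ${monthly_provider_spend:.2f}, via router: ${total:.2f}")
```

So at that rate the convenience of one account costs about $2 on every $40 of usage.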

I don’t have experience with OpenCode and the like, but this (calls not being persisted long term or used for training) is in theory only possible with local models.

Yes, if you pay, at most providers your chat is (officially) not used for training or kept for long, unless something in the conversation triggers a safety warning. Then the whole conversation is stored (at least with OpenAI).

Since you cannot be 100% sure you will never trigger such a scenario by accident, I just treat all input as potentially being used for training.

My largest use by volume is coding. I use Claude Code, but not with Anthropic’s LLMs, as they are too expensive, though I acknowledge they are the best. GLM-5 is narrowing the gap substantially, though.

I use GLM-5 by z.ai. It’s good enough for what I do and has much higher limits than Claude’s coding plans (where you can run out after a few prompts and end up having to sign up for the $100/$200 plans - I know people who sign up for multiple $200 Claude plans because the allowance is so low).

I previously gave a discounted sign-up link for z.ai, but I’m not sure you can even sign up any more. They were so in demand that they stopped taking new sign-ups, so I was glad I got the Pro plan while it was still available, with the discount link and the special 50% Christmas discount stacked on top.

I’ve also used Qwen Code CLI and Google’s version. Qwen had a very generous free tier, but I actually preferred GLM to both Qwen and Gemini.

I also run my own LLMs on local hardware for availability and speed. I run HY-MT for translation and Qwen3 for general tasks. I also run my own STT and TTS models for transcribing audio and for converting text to audio, so I can listen to stuff when on the go.

I used to use Gemini 3 Pro a lot but they really heavily curtailed the free tier. I also use ChatGPT which still has a generous free tier. Of course, for your use case, you will pay to avoid privacy leaks.

The new MiniMax is also making waves, but I haven’t tried it. It is just small enough that you could buy hardware to run it locally.

The Chinese AI labs released a lot just now, before Chinese New Year, so we have the new GLM-5, MiniMax M2.5 and Qwen3.5, and DeepSeek should release soon too. So we have very good open source offerings. Anthropic just released a new version of Sonnet too.

We’ve been spoiled: there are a lot of open source models of all sizes to suit hardware from powerful GPU clusters down to CPU-only inference.

In terms of the coding subscription, I’m quite a light user. I don’t use it every day, only sporadically (I know people who literally schedule their sleep around the 5-hour window and strategically trigger windows at optimal times to maximize usage).

Even so, I burned through 117 million tokens in the last 7 days, and 3 of those days had no usage at all. Lord only knows how much that would have cost if I paid API costs directly…
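For a rough idea of how one could estimate that, here is a minimal sketch. Only the 117 million total comes from my usage; the input/output split and both per-million-token prices are hypothetical placeholders, so plug in your provider’s real numbers before trusting the result.

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate pay-as-you-go cost from token counts and per-million-token prices."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m


# 117M tokens in total (from the post). The split below is an assumption:
# agentic coding tends to be input-heavy, but the real ratio is unknown.
input_tokens = 111_150_000
output_tokens = 5_850_000

# Hypothetical prices in USD per million tokens (placeholders, not real quotes).
estimate = api_cost_usd(input_tokens, output_tokens,
                        input_price_per_m=1.00, output_price_per_m=4.00)
print(f"Estimated weekly API cost: ${estimate:.2f}")
```

Prompt caching, which many providers discount, is ignored here, so a real bill could differ a lot in either direction.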

Checking my internal Qwen3 usage, I have only 18k tokens in the last week and about a million since the start of the year. I tend to use my local model only when I’m out of free tier on both ChatGPT and Gemini, or if I need fast, guaranteed responses.


Even in the case you cite it wouldn’t be used for training (and afaik wouldn’t have infinite retention), though I didn’t check the OpenAI ToS specifically. I know many companies that would find that unacceptable, so I kind of trust that you get decent conditions when you use the API, and I did check what Gemini and OpenRouter say.

Anyway, let’s say I trust what the ToS/FAQ of the major providers say (for the purpose of this post, to avoid derailing too much).

Lite tier?

(Thanks for the example usage, nice to get an idea)

I started with the lowest paid tier, Lite, but moved to the intermediate one. I recommend you do the same, as Lite is crippled in a few ways.

I was so impressed with GLM that I bought the company’s stock. It was up 150% in a month, so it more than paid for the subscription! :joy:

I bought a lifetime subscription to 1min.ai through StackSocial.

This allows me to test several AIs. It is not unlimited, but it is sufficient for my usage.


Do you know what the retention policy is?

I used Claude Code but switched to Antigravity recently. The current Pro subscription is discounted to 3 USD per month with loads of usage included (5-hour window). The new Gemini 3.1 Pro is really strong, and you can also use the included Opus 4.6 for certain tasks. Particularly for building web applications, the native browser integration is super neat.

The Pro subscription lets you opt out of having your interactions used for training.

Can you give a link to the plan?

Seems like they changed the promo.
At the moment it’s a 1-month free trial.