DeepSeek makes the V4 Pro price discount permanent

(api-docs.deepseek.com)

147 points | by Tiberium 3 hours ago

20 comments

  • alyxya 2 hours ago

    Once they have their own coding agent which they seem to be working towards, I may start predominantly using their models. They seem to be doing all the "right" things, open sourcing models, publishing research, and keeping prices low for everyone.

    • LaurensBER 23 minutes ago

      It works very well with OpenCode. My team keeps hitting the 5h limits on other subscriptions and it's pretty good to have Deepseek as a backup. I just put 50 bucks on there and it feels like it'll never run out.

      It's not good enough to fully replace any of the frontier models yet but it's definitely great to have as a backup!

      • ammar_x 1 hour ago

        You can use V4 Pro with Claude Code [1].

        I tried it and it's impressive.

        [1]: https://api-docs.deepseek.com/quick_start/agent_integrations...

        • thisisit 42 minutes ago

          I am curious - Is there a way to switch between models depending on the task? Because I believe Deepseek V4 is not multimodal and it will be good to switch back to Claude if vision or other capabilities are required.
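One way to approach the switching question above is a small routing step in front of the harness. This is only a sketch: it assumes chat messages whose `content` is either a plain string or a list of typed parts, and the two model identifiers are hypothetical placeholders, not names from any provider's catalogue.

```python
from typing import Any

# Hypothetical identifiers; substitute whatever your harness or gateway exposes.
TEXT_MODEL = "text-only-model"
VISION_MODEL = "vision-capable-model"


def needs_vision(messages: list[dict[str, Any]]) -> bool:
    """Return True if any message carries a non-text content part (e.g. an image)."""
    for message in messages:
        content = message.get("content")
        # Assumption: multimodal messages use a list of {"type": ...} parts.
        if isinstance(content, list):
            if any(part.get("type") != "text" for part in content):
                return True
    return False


def pick_model(messages: list[dict[str, Any]]) -> str:
    """Route to the vision-capable model only when the request needs it."""
    return VISION_MODEL if needs_vision(messages) else TEXT_MODEL
```

Text-only requests stay on the cheaper model, and anything carrying an image part falls back to the multimodal one.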

          • wiradikusuma 38 minutes ago

            That's interesting. I thought Claude Code is not as good, therefore people want to use Claude model with other alternatives. This is the other way around.

            Which begs the question, regardless of the model, which Claude Code alternative is better? (I keep saying "Claude Code alternative" because I don't know the term... LLM CLI?)

            • wrs 28 minutes ago

              The common term for a tool that wraps an LLM with a workflow is “harness”.

            • Scarbutt 1 hour ago

              Surprised Anthropic hasn't done anything to restrict Claude Code from using other providers.

              • wolttam 44 minutes ago

                The value of Claude Code the harness isn't that great. There's a lot of other good harnesses out there.

                • chandureddyvari 16 minutes ago

                  What’s your favourite harness? Are there any benchmarks for harnesses, like SWE-bench Verified for LLMs?

                  • crooked-v 14 minutes ago

                    And it gets dragged down by Anthropic actively injecting unhelpful things into prompts without telling users about them (https://github.com/anthropics/claude-code/issues/58262).

                    • koolba 36 minutes ago

                      Good or better? Curious which would be in either bucket.

                      • wolttam 31 minutes ago

                        Probably a matter of taste. I prefer the harness I wrote, I don't want to go near Anthropic's bloated mess of a harness with a 10-meter pole.

                    • cortesoft 1 hour ago

                      At this point in the AI wars, it is probably better to have more users of Claude code rather than restrict which LLMs it can connect to. Claude code is probably (currently at least) stickier than the LLM model itself. Getting people into the Claude code ecosystem is worth it.

                      Later, they can always lock it down more or add Claude LLM only features to it.

                  • lambda 1 hour ago

                    Why do you need them to provide a coding agent? Just use their model with any off the shelf coding agent. I happen to prefer Pi, but use whatever works for you.

                    • alyxya 1 hour ago

                      I probably have an unfounded assumption that whatever coding agent they make will work really well with their models, better than external harnesses. I don't have a good sense for how all the model + harness combinations compare, nor any good way to compare them myself, but generally believe model companies train their models to work best with their own harness.

                      • wolttam 42 minutes ago

                        I've noticed that models have gotten less finicky with this over time. Harnesses don't need to be complex to get good coding performance from models, they just need to implement some sane primitives for code exploration and editing.

                      • hootz 1 hour ago

                        Yeah, I'm using Pi with their models through an OpenCode Go subscription and it works pretty well. 10 bucks and V4-Flash is virtually infinite.

                        • satvikpendem 1 hour ago

                          RL with the harness inputs and outputs of users is one of the primary improvers of model performance, a self perpetuating flywheel.

                          • apitman 1 hour ago

                            What's the best way to use it with Pi, OpenRouter?

                          • tequila_shot 1 hour ago

                            You no longer need "their coding agent". You can hook up claude code to use Deepseek. Works perfectly.

                            • zozbot234 1 hour ago

                              antirez's ds4-agent works quite fine. It runs on any Apple Silicon device with 96GB RAM or more.

                              • cultofmetatron 1 hour ago

                                open code works with them today. I've been using it fulltime for 2 weeks so far.

                                • sunaookami 1 hour ago

                                    Using it with Pi and can only report good things so far. I'm very impressed by how good it is (also it's way slower than Claude Sonnet and GPT-5.5 and often thinks "too much" before starting).

                              • wg0 1 hour ago

                                  If you have not tried DeepSeek V4 you're missing out. The pricing makes it unbelievably good.

                                The chains of thought for Deepseek are very very interesting reads. Open code won't show them but do read them and you'll be surprised at how underrated the model is.

                                My model usage is very low but I still do pay directly to Deepseek regularly as my tribute and contribution to them open sourcing their models as my gratitude and showing support for what I deem positive for overall social good.

                                • abyssin 1 hour ago

                                    It’s good and cheap, but don’t talk about politics to it or it might trigger some sort of censorship rule. You can see it think, then suddenly erase everything and suggest switching to another subject, without explaining anything. I also had it output some sort of generic message about how the news outlets are in the service of the people. Both times I was surprised because I didn’t make any sensitive requests, neither illegal nor subversive. But it was a remotely political topic and it was enough. There was something both chilling and refreshing about it, since censorship in the west is usually more subtle.

                                  • tequila_shot 1 hour ago

                                    Yes - the model is REALLY good. I try Claude at work and Deepseek personally and this is the only model that works without trying to actively bankcrypt me.

                                    • seemaze 1 hour ago

                                        Perhaps unintentional, but I find 'bankrypt' to be a thoroughly interesting portmanteau.

                                        I'm not sure if it's when you run out of crypto, or when your bank gets hit by ransomware.

                                  • Sphax 2 hours ago

                                    That is some insane value. I've been using GLM Coding Plan Max with GLM 5.1 for a while and I've tested DeepSeek V4 Pro for maybe 3 weeks now, and I found it to be better than GLM 5.1 for complex coding tasks. I've used 65m tokens and with that price it cost me $1.5, that's really cheap.
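As a rough check on the figures in the comment above, the blended rate can be computed directly. This assumes "65m tokens" counts every token type together (input, cache reads, and output), which the comment does not specify.

```python
# Figures as reported in the comment; the token-type breakdown is unknown.
total_tokens_millions = 65
total_cost_usd = 1.5

# Blended price across all token types, in USD per million tokens.
blended_rate = total_cost_usd / total_tokens_millions
print(f"${blended_rate:.4f} per million tokens")
```

That works out to a little over two cents per million tokens, so most of those tokens were probably billed at something much lower than the output rate quoted elsewhere in the thread.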

                                    • DeathArrow 1 hour ago

                                      I think Deepseek uses many more tokens than other models.

                                    • wolttam 45 minutes ago

                                      I was hoping they were going to do this.

                                      I'll keep running Flash locally for the stuff where I care about data privacy, but the value of Pro through their API is unreal for anything else (and I want to give them my training data as long as they keep putting out open models).

                                      • gertlabs 36 minutes ago

                                        Even with the V4 Pro discount, the V4 Flash model gives you the best performance per unit dollar, and better performance overall for agentic, tool-heavy workloads. V4 Pro is smarter in one-shot reasoning, but at a significant speed difference. The performance, cost, and speed make V4 Flash our top flash model today by far.

                                        Data at https://gertlabs.com/rankings

                                        • cold_harbor 2 hours ago

                                          their MLA architecture cuts KV cache by ~5-13x vs standard attention. that's why inference is actually cheaper to run, not just a price war to gain market share.
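To make the KV-cache point concrete, here is a back-of-the-envelope sketch. Multi-head latent attention (MLA), as described in DeepSeek's published work, caches one compressed latent per token instead of full per-head keys and values. All dimensions below are illustrative assumptions, not DeepSeek V4's actual configuration, and the resulting ratio depends heavily on which baseline you compare against.

```python
def kv_elements_standard(n_kv_heads: int, head_dim: int) -> int:
    """Cached elements per token per layer: one key and one value vector per KV head."""
    return 2 * n_kv_heads * head_dim


def kv_elements_latent(latent_dim: int, rope_dim: int) -> int:
    """Cached elements per token per layer: a shared latent plus a small positional key."""
    return latent_dim + rope_dim


def cache_bytes(elements: int, n_layers: int, n_tokens: int, bytes_per_element: int = 2) -> int:
    """Total cache size for one sequence (2 bytes per element assumes 16-bit storage)."""
    return elements * n_layers * n_tokens * bytes_per_element


# Hypothetical dimensions chosen only for illustration.
standard = kv_elements_standard(n_kv_heads=16, head_dim=128)
latent = kv_elements_latent(latent_dim=512, rope_dim=64)
reduction = standard / latent

print(f"standard: {standard} elements/token/layer")
print(f"latent:   {latent} elements/token/layer")
print(f"reduction: {reduction:.1f}x")
```

With these made-up numbers the reduction lands at roughly 7x. A baseline with more KV heads gives a larger factor and one with fewer gives a smaller factor.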

                                          • zozbot234 1 hour ago

                                            That's also a game changer for local inference. It unlocks long contexts, batched inference and storing the KV cache to disk on ordinary consumer platforms.

                                          • margorczynski 1 hour ago

                                            Maybe the Chinese are playing the long game by trying to bankrupt the US competition? Because there's no way this is financially viable.

                                            • ecommerceguy 1 hour ago

                                              Small team, cheap electricity, very efficient models. Many western companies operate at a loss to gain market share. Why can't the Chinese?

                                              • missedthecue 37 minutes ago

                                                DeepSeek hasn't raised enough money to be actively selling tokens at a loss. They have a small team, extremely low overhead relative to other labs, operate in a place with essentially the cheapest commercial electricity rates in the world, and their architecture lends itself very well to cheap inference.

                                                • odie5533 1 hour ago

                                                  Inference is cheap. I bet the financials of these Chinese companies are much saner looking than any of the big US AI companies which are bloated by investors.

                                                  • tencentshill 1 hour ago

                                                    Federal ban incoming then. They did it with cars already.

                                                    • jdgoesmarching 1 hour ago

                                                      If you think heavily subsidizing AI models isn’t financially viable, I have some bad news for you about US AI companies.

                                                      Deepseek has made some incredible advancements in model efficiency, and more importantly actually publishes those advancements so everyone can benefit from them.

                                                      • zozbot234 1 hour ago

                                                        US suppliers are fine and won't go bankrupt, they can just focus on serving bigger "Pro" class models from their large datacenters. In fact cheap AI makes the bigger and smarter models more useful because it's smart enough to draft a clear question to the model, which helps minimize wasted tokens.

                                                      • Reubend 2 hours ago

                                                        Props to them. That makes DeepSeek v4 Pro extremely cheap compared to others, even in the same category. Look at these prices per million output tokens:

                                                        DeepSeek V4 Pro: $0.87

                                                        Qwen 3.7 Max: $7.50

                                                        Grok 4.3: $2.50

                                                        GLM 1.5: $3.08

                                                        Opus 4.7: $25.00

                                                        GPT-5.5: $30.00
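The list above is easier to compare as multiples of the DeepSeek price. A minimal sketch using the prices exactly as quoted in the comment (they are not independently verified here; "GLM 1.5" is entered as "GLM 5.1", following the correction posted in a reply):

```python
# USD per million output tokens, as quoted in the comment.
PRICES_PER_M_OUTPUT = {
    "DeepSeek V4 Pro": 0.87,
    "Qwen 3.7 Max": 7.50,
    "Grok 4.3": 2.50,
    "GLM 5.1": 3.08,
    "Opus 4.7": 25.00,
    "GPT-5.5": 30.00,
}


def price_multiples(prices: dict[str, float], baseline: str) -> dict[str, float]:
    """Express every price as a multiple of the baseline model's price."""
    base = prices[baseline]
    return {name: round(price / base, 1) for name, price in prices.items()}


multiples = price_multiples(PRICES_PER_M_OUTPUT, "DeepSeek V4 Pro")
for name, multiple in sorted(multiples.items(), key=lambda item: item[1]):
    print(f"{name:16s} {multiple:5.1f}x")
```

On these numbers the quoted competitors range from about 2.9x (Grok 4.3) to about 34.5x (GPT-5.5) the DeepSeek V4 Pro output price.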

                                                        • Arcuru 1 hour ago

                                                          It's actually even cheaper when you look at the cache read costs. Those costs can dominate in agent workflows and DeepSeek's cost for cache reads is insanely low comparatively. At $.003626/M tokens, the cheapest other thing on your list is >$.2/M tokens. That's on the scale of 100x cheaper.
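The cache-read comparison can be checked with the two figures in the comment. This is a sketch: the $0.20 value is only the lower bound the commenter gives for the other models, and the session sizes are hypothetical.

```python
# USD per million tokens, taken from the thread.
DEEPSEEK_CACHE_READ = 0.003626   # quoted by the commenter
DEEPSEEK_OUTPUT = 0.87           # from the price list in the parent comment
OTHER_CACHE_READ_FLOOR = 0.20    # "cheapest other thing on your list is >$.2/M"


def session_cost(cache_read_m: float, output_m: float,
                 cache_price: float, output_price: float) -> float:
    """Cost of a session given token counts in millions; uncached input is ignored."""
    return cache_read_m * cache_price + output_m * output_price


# Lower bound on how much cheaper DeepSeek cache reads are.
min_ratio = OTHER_CACHE_READ_FLOOR / DEEPSEEK_CACHE_READ

# Hypothetical agent session: 50M cache-read tokens and 1M output tokens.
deepseek_cost = session_cost(50, 1, DEEPSEEK_CACHE_READ, DEEPSEEK_OUTPUT)

print(f"cache reads at least {min_ratio:.0f}x cheaper")
print(f"hypothetical session cost: ${deepseek_cost:.2f}")
```

The $0.20 floor alone gives a gap of about 55x, and since the other prices sit above that floor the real gap is larger. In the hypothetical session, 50M cached tokens cost less than the 1M output tokens.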

                                                          • marksully 48 minutes ago

                                                            *GLM 5.1

                                                          • doctoboggan 1 hour ago

                                                            I am more worried about accidental data leak (agent reading env file for example) with the Chinese hosted models compared to the US hosted models. Am I wrong to suspect that the Chinese government might be more likely to scan all chats and save useful information compared to the US government or company?

                                                            I hesitated to even post this comment as it sounds biased and xenophobic. I would love for someone to convince me I am wrong. Does anyone have any insight into the company behind deepseek hosting, and what their history of respecting data privacy is?

                                                            • opsnooperfax 23 minutes ago

                                                              I would not be shocked if they do that. I would not be terribly shocked that the US-headquartered models do that for another government either. As far as data confidentiality goes, I wouldn’t hold my breath. Microsoft checks all those enterprise boxes, right? Yet, Azure still gets breached once in a while.

                                                              • wkcheng 1 hour ago

                                                                Just use it through something like Azure. They host the entire model and serve it from the US. I'm sure that there are other providers like this.

                                                                We use it that way and it works great.

                                                                • 3s 1 hour ago

                                                                  It's not an unreasonable concern, which is why most US companies prefer to go with AWS bedrock, or even one of the AI labs, and typically request zero data retention agreements. But leaking is a concern no matter where it's hosted, it's just the incentives that change IMO. For example, the labs do scan every chat and train on data not covered under enterprise ZDR agreements. Law enforcement can request access to all user data with a valid warrant or in an emergency context [1]

                                                                  If you're interested in trying DeepSeek V4 privately, you can try Tinfoil (tinfoil.sh) where all models are hosted in an attested secure hardware enclave, making the inference end-to-end private. Full disclosure: I'm one of the cofounders.

                                                                  [1] https://cdn.openai.com/trust-and-transparency/openai-law-enf...

                                                                  • dualvariable 18 minutes ago

                                                                    I'm not important enough for anyone in China to go out of their way to attack me. And DeepSeek has to maintain a sufficient level of trust so that users keep using their platform--they can't just act like a keylogger attacking everyone's crypto wallets or trust collapses.

                                                                    If I was working on something that the Chinese government considered of strategic importance, then I would certainly be worried about it. But I don't do that.

                                                                    I'm much more worried about techbros in this country using their LLMs to extensively profile me and produce something vastly more dystopian in this country than the real or imagined social credit scores in China. The people trying to convince you that the Chinese government are the people you should be worried about (as an individual in the United States) are probably the people you really need to be worried about.

                                                                    • giwook 1 hour ago

                                                                      I think there is a nonzero chance of that happening. Beijing could at any point decide that DeepSeek has become too powerful and/or is a major export and start to insert themselves (assuming they have not already).

                                                                      There are widespread reports about how foreign actors (not limited to China) have infiltrated critical networks across many industries in the US en masse and are simply waiting for the right time to exploit them. Frontier models are simply another attack vector (and much more easily exploitable when you think about it).

                                                                      The fact is that there is potential for this with any cloud-hosted model, whether it is intentional by the actual company building the models or a malicious actor is able to exploit a vulnerability.

                                                                      • jug 1 hour ago

                                                                        This is a risk although then this is fortunately a model that isn't tied to Chinese hosting. But indeed something to consider if using straight DeepSeek.com.

                                                                        • nivekney 1 hour ago

                                                                          User data integrity definitely should be a concern. It's also known that regulation is being outpaced, so the cost of being/using frontier products is a double-edged sword for sure.

                                                                          • jdgoesmarching 1 hour ago

                                                                            More likely? US tech leaders have been fully capitulating to the surveillance state for over a decade. Why do I care what China does with my data? I don’t live in China and never plan to.

                                                                            The tech bro threat model has always been pure jingoism and xenophobia. Ironically, the worst thing a Chinese company has done with my data is sell Tiktok to an American technofascist.

                                                                          • dburkland 59 minutes ago

                                                                            I've had a ton of success when pairing Opus 4.7 for planning w/ DeepSeek V4 Flash in opencode. Best part is DeepSeek V4 Flash is Free through opencode Zen.

                                                                            • velomash 1 hour ago

                                                                              I found that DSV4 wasn't as cheap as its token price. It burns tokens at a pretty high rate

                                                                              • onlyrealcuzzo 49 minutes ago

                                                                                I just canceled Claude Code and Codex today.

                                                                                RIP.

                                                                                Claude literally refuses to finish tasks in auto mode and just keeps saying, now is a good stopping point, when it's 1% done.

                                                                                Codex literally does not follow directions.

                                                                                May as well pay 1/20th the price.

                                                                                Claude seems to have something that looks at how long you've been a customer and then just massively degrades quality.

                                                                                When I started my subscription, Claude had none of these problems.

                                                                                When I first started using Codex, it followed directions and performed well (and fast).

                                                                                2 months into subscriptions they are both unusably terrible.

                                                                                • eiek 28 minutes ago

                                                                                  They’re playing games behind the scenes to massage and manage their earnings.

                                                                                  China is gonna win long term there’s no doubt. The fact that the American firms haven’t created immense escape velocity despite the disparity in spending is quite telling.

                                                                                • bel8 2 hours ago

                                                                                  Great! I have been using DeepSeek 4 Flash high for everything lately.

                                                                                  First accessible model with useable 1 million context window for me.

                                                                                  • belinder 2 hours ago

                                                                                    Anyone using deepseek through a gateway (not sure if right term) so there's no data retention? At work we're going through a few hundred million tokens a day in our app (using anthropic models), and we're looking for something significantly cheaper

                                                                                    • wkcheng 1 hour ago

                                                                                      Use it through Azure! Azure hosts DeepseekV4-Pro and DeepseekV4-Flash themselves. We're using it and it works great.

                                                                                      You don't get the discount that Deepseek is providing, but it's still a cheap model (v4-pro is cheaper than sonnet)

                                                                                      • bel8 2 hours ago

                                                                                        opencode allegedly has contractual no-data-retention policies with their providers.

                                                                                        I recall reading about that in an issue or in their Discord server.

                                                                                        But I would contact them formally to verify that.

                                                                                        • BeetleB 27 minutes ago

                                                                                          They claim it on their OpenCode Zen page.

                                                                                          What's frustrating is that they give no information on who the provider(s) are!

                                                                                        • mlcruz 1 hour ago

                                                                                          I have been using deepseek via deepinfra, afaik they provide no data retention. I'm probably going to deploy the full model on their infra instead of paying credits at some point, so far the experience has been pretty good

                                                                                          • goobatrooba 1 hour ago

                                                                                            But do these prices apply if you use a third party go-between? I would expect they then charge their own prices?

                                                                                        • vladgur 58 minutes ago

                                                                                          Which models do folks use for openclaw nowadays?

                                                                                          • Havoc 2 hours ago

                                                                                            Neat. I like DS for secondary checks on code. Sometimes spots things other models don't

                                                                                            • rvz 40 minutes ago

                                                                                              Someone can afford to race everyone to zero.

                                                                                              Remember Jevons paradox? [0] It isn't at Anthropic or Microsoft [0], but it is at DeepSeek.

                                                                                              [0] https://www.thelowdownblog.com/2026/05/microsoft-cancels-int...

                                                                                              • sourcecodeplz 1 hour ago

                                                                                                Honestly I haven't even tried the Pro model. Flash was just so much more than I expected I just keep working with it. Thank you deepseek team

                                                                                                • kingjimmy 2 hours ago

                                                                                                  is this the Huawei chip difference?

                                                                                                  • chvid 44 minutes ago

                                                                                                    That is probably why they were a few months delayed. But could be interesting to see their hosting / network / colocation setup.

                                                                                                  • guelo 1 hour ago

                                                                                                    Even at these prices I find claude and codex subscriptions to be cheaper than per-token pricing when my usage is hovering around the session limits. I guess the subscriptions are heavily subsidized.