DeepSeek V4: The Open-Source Model Frontier Labs Feared

helloai.com

76 points by HelloAi 2 days ago


sschueller - 3 hours ago

The pricing of DeepSeek V4 Flash is incredible. I have been hammering it with Kilo Code and end up spending only cents per day.

ndgold - a day ago

I didn’t read the article, but I will say that the value for performance of DeepSeek V4 Flash is so good that it is a lifesaver, and I’m thrilled with it.

Lucasoato - a day ago

Do you know what kind of machine I need to run the original DeepSeek V4 Pro model with good tok/s throughput?

superasn - a day ago

I think that is sale pricing: a 75% discount that only runs until the end of May.

ruxiz - 16 hours ago

Am I able to play with it at home?

LizaBabella - 2 days ago

The cost angle is what most coverage misses. We're using Claude Haiku in production for a small consumer app and the per-call cost is genuinely fine, but the second you have any kind of multilingual fan-out the bill grows non-linearly because the same query gets re-issued in N localized contexts.

Open-weight models with strong multilingual support change the math because you can self-host at marginal cost once you have GPU capacity. DeepSeek's earlier versions already punched above their weight on non-English benchmarks (especially CJK and some Indic languages where the gap to GPT-4 was much narrower than English-only benchmarks suggested).

Two questions for anyone who's actually deployed V4 in production yet:

1. How does it handle Turkish / Slavic morphology compared to V3? In our tests V3 was solid for Russian and respectable for Turkish, but handled compound morphology in agglutinative languages a bit awkwardly.

2. Is the long-context window actually usable end-to-end or does quality degrade past ~64k like with most open models?
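
To make the cost argument above concrete, here is a minimal back-of-the-envelope sketch. Every name and number in it is a hypothetical placeholder, not real pricing for Claude Haiku, DeepSeek, or any GPU provider. It only models the fan-out itself (one call per locale per query) against a flat self-hosting bill; retries, longer localized prompts, and ops overhead are left out.

```python
def api_cost(queries, locales, cost_per_call):
    """Per-call pricing: each query is re-issued once per locale."""
    return queries * locales * cost_per_call


def self_hosted_cost(gpu_hours, cost_per_gpu_hour):
    """Self-hosting: cost follows provisioned GPU time, not call count."""
    return gpu_hours * cost_per_gpu_hour


def break_even_queries(locales, cost_per_call, gpu_hours, cost_per_gpu_hour):
    """Query volume at which the API bill equals the self-hosting bill."""
    return self_hosted_cost(gpu_hours, cost_per_gpu_hour) / (locales * cost_per_call)


if __name__ == "__main__":
    # All values below are made-up assumptions for illustration only.
    locales = 8
    cost_per_call = 0.002      # hypothetical API cost per call, in dollars
    gpu_hours = 720            # one GPU running for a 30-day month
    cost_per_gpu_hour = 2.0    # hypothetical GPU rental price, in dollars

    volume = break_even_queries(locales, cost_per_call, gpu_hours, cost_per_gpu_hour)
    print(f"Break-even at roughly {volume:,.0f} queries per month")
```

Under these assumptions, adding locales lowers the break-even volume proportionally, which is the sense in which multilingual fan-out tilts the math toward self-hosting.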

shivang2607 - 2 days ago

In my personal experience, no model comes close to Claude when it comes to coding performance. It does not matter what any of the benchmarks say.

Having said that, I really hope this DeepSeek model performs on par with the Claude Sonnet model.