Human coders are still better than LLMs
antirez.com
655 points by longwave 10 months ago
This matches my experience. I actually think a fair amount of the value I get from LLM assistants is having a reasonably intelligent rubber duck to talk to. Now the duck can occasionally disagree and sometimes even refine.
https://en.m.wikipedia.org/wiki/Rubber_duck_debugging
I think the big question everyone wants to skip right past this conversation to is: will this continue to be true two years from now? I don't know how to answer that question.
LLMs aren't my rubber duck, they're my wrong answer.
You know that saying that the best way to get an answer online is to post a wrong answer? That's what LLMs do for me.
I ask the LLM to do something simple but tedious, it does it spectacularly wrong, and then I get pissed off enough that I have the rage-induced energy to do it myself.
I probably have undiagnosed ADHD, and will get stuck and spend minutes picking a function name and then writing a docstring. LLMs do help with this even if they get the code wrong, because I usually won't bother to fix their variable names or docstrings unless needed. LLMs can reliably solve the blank-page problem.
This. I have ADHD and starting is the hardest part for me. With an LLM it gets me from 0 to 20% (or more) and I can nail it for the rest. It’s way less stressful for me to start now.
Very much agree, although lately, with how good it is, I get hyperfocused and spend more time than I allocated, because I end up wanting to implement more than I planned.
I've been suffering the same; I'm used to having so many days (weeks/months) when I just don't get much done. With LLMs I can take those days and hack around / watch videos / play games while the LLM works in the background, and just check its work. The best part is that it often leads to some problematic situation that pulls me in, and often I'll end up getting a real day of work out of it after I get started.
> LLMs can reliably solve the problem of a blank-page.
This has been the biggest boost for me. The number of choices available when facing a blank page is staggering. Even a bad/wrong implementation helps collapse those possibilities into a countable few that take far less time to think about.
Yeah, keeping me in the flow when I hit one of those silly tasks my brain just randomly says "no let's do something else" to has been the main productivity improving feature of LLMs.
Yes! So many times my brain just skips right over some tasks because it takes too much effort to start. The LLM can give you something to latch onto and work with. It can lay down the starting shape of a function or program and even when it's the wrong shape, you still have something to mold into the correct shape.
The thing about ADHD is that taking a task from nothing to something is often harder than turning that something into the finished product. It's really weird and extremely not fun.
This is the complete opposite for me! I really like a blank page, the thought of writing a prompt destroys my motivation as does reviewing the code that an LLM produces.
As an aside, I'm seeing more and more crap in PRs: nonsensical use of language features, really poorly structured code. But that's a different story.
I'm not anti LLMs for coding. I use them too. Especially for unit tests.
So much this, the blank page problem is almost gone. Even if it's riddled with errors.
This is my experience, too. As a concrete example, I'll need to write a mapper function to convert between a protobuf type and Go type. The types are mirror reflections of each other, and I feed the complete APIs of both in my prompt.
I've yet to find an LLM that can reliably generate mapping code between proto.Foo{ID string} and gomodel.Foo{ID string}.
It still saves me time, because even 50% accuracy is still half the code I don't have to write myself.
But it makes me feel like I'm taking crazy pills whenever I read about AI hype. I'm open to the idea that I'm prompting wrong, need a better workflow, etc. But I'm not a luddite, I've "reached up and put in the work" and am always trying to learn new tools.
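For context, the kind of mapper the comment above describes can be sketched like this. The ProtoFoo and ModelFoo types here are hypothetical stand-ins for the generated protobuf struct and the mirrored Go model; real generated code would live in separate packages:

```go
package main

import "fmt"

// ProtoFoo is a hypothetical stand-in for a protoc-generated struct.
type ProtoFoo struct {
	Id   string
	Name string
}

// ModelFoo is a hypothetical domain model mirroring the proto message.
type ModelFoo struct {
	ID   string
	Name string
}

// fooFromProto is the tedious field-by-field mapping in question:
// trivially mechanical, but easy for an LLM to fumble at scale.
func fooFromProto(p ProtoFoo) ModelFoo {
	return ModelFoo{
		ID:   p.Id,
		Name: p.Name,
	}
}

func main() {
	m := fooFromProto(ProtoFoo{Id: "42", Name: "widget"})
	fmt.Println(m.ID, m.Name)
}
```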
An LLM's ability to do a task is roughly correlated with the number of times that task has been done on the internet before. If you want to see the hype version, write a todo web app in TypeScript or similar. So it's probably not something you can fix with prompts, but a model with more focus on relevant training data might help.
These days, they'll sometimes also RL on a task if it's easy to validate outputs and if it seems worth the effort.
This honestly seems like something that could be better handled with pre-LLM technology, like a 15-line Perl script that reads one on stdin, applies some crufty regexes, and writes the other to stdout. Are there complexities I'm not seeing?
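The regex idea above, transposed into Go rather than Perl, might look something like this. It is purely illustrative: it mechanically rewrites a hypothetical proto struct literal into the mirrored gomodel one, in the spirit of the "crufty regexes" suggestion:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// protoToModelSource rewrites source text referring to the (hypothetical)
// proto.Foo type into the mirrored gomodel.Foo form. Real codegen would
// parse the AST instead, but simple renames are often just this mechanical.
func protoToModelSource(src string) string {
	src = strings.ReplaceAll(src, "proto.Foo", "gomodel.Foo")
	// Normalize protoc-style field names like "Id" to Go-style "ID".
	re := regexp.MustCompile(`\bId\b`)
	return re.ReplaceAllString(src, "ID")
}

func main() {
	fmt.Println(protoToModelSource(`proto.Foo{Id: "42"}`))
}
```

One real complexity the regex approach does skip: nested message types and repeated fields need recursive handling, which is where a parser (or, indeed, an LLM) starts to earn its keep.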
LLMs are a decent search engine a la Google circa 2005.
It's been 20 years since then, so I think people have simply forgotten that a search engine can actually be useful, as opposed to ad-infested SEO sewage sludge.
The problem is that the conversational interface, for some reason, seems to turn off the natural skepticism that people have when they use a search engine.
> LLMs are a decent search engine a la Google circa 2005.
Statistical text (token) generation made from an unknown (to the user) training data set is not the same as a keyword/faceted search of arbitrary content acquired from web crawlers.
> The problem is that the conversational interface, for some reason, seems to turn off the natural skepticism that people have when they use a search engine.
For me, my skepticism of using a statistical text generation algorithm as if it were a search engine is because a statistical text generation algorithm is not a search engine.
Search engines can be really good still if you have a good idea what you're looking for in the domain you're searching.
Search engines can suck when you don't know exactly what you're looking for and the phrases you're using have invited spammers to fill up the first 10 pages.
They also suck if you want to find something that's almost exactly like a very common thing, but different in some key aspect.
For example, I wanted to find some texts on solving a partial differential equation numerically using 6th-order or higher finite differences, because I wanted to know how to handle boundary conditions (the interior is simple enough).
Searching only turned up the usual low-order methods that I already knew.
Asking some LLMs, I got some decent answers and could proceed.
Back in the day you could force search engines to restrict their search scope, but these days they all seem so eager to return results at any cost that they're useless for niche topics.
I agree completely. Personally, I actually like the list of links, because I like to compare different takes on a topic. It's also fascinating to see how a scientific study propagates through the media, or how the same news story is treated over time as trends change. I don't want a single mashed-up answer to a question, and maybe that makes me weird. More worrying: whenever I've asked an LLM for an answer on a topic I happen to know a LOT about, the response has been either incorrect or inadequate ("there is currently no information collected on that topic"). I do like Perplexity for questions like "without any preamble whatsoever, what is the fastest way to remove a <whatever> stain from X material?"
I almost never bother using Google anymore. When I search for something, I'm usually looking for an answer to a question. Now I can just ask the question and get the answer without all the other stuff.
I will often ask the LLM to give me web pages to look at when I want to do further reading.
As LLMs get better, I can't see myself going back to Google as it is or even as it was.
You get an answer.
Whether that's the answer, or even the best answer, is impossible to tell without doing the research you're trying to avoid.
If I do research, I get an answer. Whether that's the answer, or even the best answer, is impossible to tell. When do I stop looking for the best answer?
If ChatGPT needs to, it will actually do the search for me and then collate the results.
By that logic, it's barely worth reading a newspaper or a book. You don't know if they're giving you accurate information without doing all the research you're trying to avoid.
Recognised newspapers will curate by hiring smart, knowledgeable reporters and funding them to get reliable information. Recognised books will be written by a reliably informed author, and reviewed by other reliably informed people. There are no recognised LLMs, and their method of working precludes reliability.
Malcolm Gladwell, Jonah Lehrer, Daniel Kahneman, Matthew Walker, Stephen Glass? The New York Times, featuring Judith Miller on the existence of WMD, or their award winning podcast "Caliphate"? (Award returned when it became known the whole thing was made up, in case you haven't heard of that one).