Senior Developer Skills in the AI Age
manuel.kiessling.net
379 points by briankelly a day ago
I find myself spending so much time correcting bad code that I constantly wonder if I'm saving time. I think I am, but a much higher percentage of my time now goes into the hard work of evaluating and thinking about things, rather than the mentally easy things the AI is good at, which used to give me a little bit of a break.
Sometimes I liken it to my experience with stereoscopic images (I have never been able to perceive them) -- I know there's something there but I frequently don't get it.
The premise might possibly be true, but as an actually seasoned Python developer, I've taken a look at one file: https://github.com/dx-tooling/platform-problem-monitoring-co...
All of it smells of a (lousy) junior software engineer: from configuring the root logger at the top, at module level (which relies on module import caching not to be reapplied), to not using a stdlib config file parser and building one themselves, to a raciness in load_json where file existence is checked with an if and the code then carries on as if the file is certainly there...
In a nutshell, if the rest of it is like this, it simply sucks.
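(To make two of those concrete, here's a minimal sketch: a named module-level logger instead of reconfiguring the root logger at import time, and an EAFP-style load_json instead of check-then-open. The empty-dict fallback is my assumption for illustration, not necessarily the repo's actual behaviour.)

    import json
    import logging
    from pathlib import Path
    from typing import Any

    # A named module-level logger; configuring handlers/levels is left to
    # whoever runs the application, instead of mutating the root logger on import.
    logger = logging.getLogger(__name__)

    # Check-then-use (racy): the file can vanish between exists() and open().
    def load_json_racy(path: str) -> dict[str, Any]:
        if Path(path).exists():
            with Path(path).open() as f:
                return json.load(f)
        return {}

    # EAFP: just try to open it and handle the failure, leaving no race window.
    def load_json(path: str) -> dict[str, Any]:
        try:
            with Path(path).open() as f:
                return json.load(f)
        except FileNotFoundError:
            logger.debug("File %s not found, falling back to empty dict", path)
            return {}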
The more I browse through this, the more I agree. I feel like one could delete almost all comments from that project without losing any information – which means, at least the variable naming is (probably?) sensible. Then again, I don't know the application domain.
Also…
    def _save_current_date_time(current_date_time_file: str, current_date_time: str) -> None:
        with Path(current_date_time_file).open("w") as f:
            f.write(current_date_time)
there is a lot of obviously useful abstraction being missed, wasting lines of code that will all need to be maintained.

The scary thing is: I have seen professional human developers write worse code.
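(For what it's worth, one concrete instance of the missed abstraction: pathlib already provides it. A minimal sketch of the same helper:)

    from pathlib import Path

    # Path.write_text does open/write/close in one call, so the three-line
    # helper above collapses to a single expression.
    def _save_current_date_time(current_date_time_file: str, current_date_time: str) -> None:
        Path(current_date_time_file).write_text(current_date_time)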
> I feel like one could delete almost all comments from that project without losing any information
I'm far from a heavy LLM coder, but I've noticed a massive excess of unnecessary comments in most output. I'm always deleting the obvious ones.
But then I started noticing that the comments seem to help the LLM navigate additional code changes. It’s like a big trail of breadcrumbs for the LLM to parse.
I wouldn’t be surprised if vibe coders get trained to leave the excess comments in place.
More tokens -> more compute involved. Attention-based models work by attending every token to every other token, so more tokens mean not only more time to "think" but also the ability to think "better". That is also at least part of the reason why o1/o3/R1 can sometimes solve what other LLMs could not.
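(Back-of-the-envelope sketch of that scaling; the numbers are purely illustrative:)

    # Self-attention scores every token against every other token, so the
    # pairwise work grows roughly quadratically with context length.
    def attention_pairs(num_tokens: int) -> int:
        return num_tokens * num_tokens

    print(attention_pairs(1_000))   # 1,000,000 pairwise scores
    print(attention_pairs(10_000))  # 100,000,000: 10x the tokens, ~100x the work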
Anyway, I don't think any of the current LLMs are really good for coding. What it's good at is copy-pasting (with some minor changes) from the massive code corpus it has been pre-trained on. For example, give it some Zig code and it's straight-up unable to solve even basic tasks. Same if you give it a really unique task, or if you simply ask for potential improvements of your existing code. Very, very bad results, no signs of out-of-the-box thinking whatsoever.
BTW: I think what people are missing is that LLMs are really great at language modeling. I've had great results, and boosts in productivity, just by being able to prepare the task specification and make quick changes to it really easily. Once I have a good understanding of the problem, I can usually implement everything quickly, and do it in a much, much better way than any LLM currently can.
It doesn't hurt that the model vendors get paid by the token, so there's zero incentive to correct this pattern at the model layer.
Or the model got trained on teaching code, which naturally contains lots of comments.
The dev is just too lazy to include them anymore, whereas the model doesn't really need to be lazy, since it's paid by the token.
What’s worse, I get a lot of comments left saying what the AI did, not what the code does or why. Eg “moved this from file xy”, “code deleted because we have abc”, etc. Completely useless stuff that should be communicated in the chat window, not in the code.
LLMs are also good at commenting on existing code.
It’s trivial to ask Claude via Cursor to add comments to illustrate how some code works. I’ve found this helpful with uncommented code I’m trying to follow.
I haven’t seen it hallucinate an incorrect comment yet, but sometimes it will comment a TODO that a section should be made more clear. (Rude… haha)
I have seldom seen insightful comments from LLMs. It is usually better than "comment what the line does", useful for getting a hint about undocumented code, but not by much. My experience is limited, but what I have seen I do agree with: as long as you keep on the beaten path it is OK, and insightful comments are not on that path.
> there is a lot of obviously useful abstraction being missed, wasting lines of code that will all need to be maintained.
This is a human sentiment because we can fairly easily pick up abstractions during reading. AIs have a much harder time with this - they can do it, but it uses up their very limited cognitive resources. In contrast, rewriting the entire software for a change is cheap and easy. So to a point, flat and redundant code is actually beneficial for an LLM.
Remember, the code is written primarily for AIs to read and only incidentally for humans to execute :)
>The scary thing is: I have seen professional human developers write worse code.
This is kind of the rub of it all. If the code works, passes all relevant tests, is reasonably maintainable, and can be fitted into the system correctly with a well-defined interface, does it really matter? I mean, at that point it's kind of like looking at the output of a bytecode compiler and being like "wow, what a mess". And it's not like they can't write code up to your stylistic standards; it's just literally a matter of prompting for that.
> If the code works, passes all relevant tests, is reasonably maintainable, and can be fitted into the system correctly with a well defined interface, does it really matter?
You're not wrong here, but there's a big difference between programming one-off tooling or prototype MVPs and programming things that need to be maintained for years and years.
We did this song and dance pretty recently with dynamic typing. Developers thought it was so much more productive to use dynamically typed languages, because it is in the initial phases. Then years went by, those small, quick-to-make dynamic codebases ended up becoming unmaintainable monstrosities, and those developers who hyped up dynamic typing invented Python/PHP type hinting and Flow for JavaScript, later moving to TypeScript entirely. Nowadays nobody seriously recommends building long-lived systems in untyped languages, but they are still very useful for one-off scripting and more interactive/exploratory work where correctness is less important, i.e. Jupyter notebooks.
I wouldn't be surprised to see the same pattern happen with low-supervision AI code; it's great for popping out the first MVP, but because it generates poor code, the gung-ho junior devs who think they're getting 10x productivity gains will wise up and realize the value of spending an hour thinking about proper levels of abstraction instead of YOLO'ing the first thing the AI spits out when they want to build a system that's going to be worked on by multiple developers for multiple years.
I think the productivity gains of dynamically typed languages were real, and based on two things: dynamic typing can provide certain safety properties trivially, and dynamic typing neatly kills off the utterly inadequate type systems found in mainstream languages when they were launched (the 90s, mostly).
You'll notice the type systems being bolted onto dynamic languages, or found in serious attempts at new languages, are radically different from the type systems rejected by the likes of JavaScript, Python, Ruby and Perl.
> those small, quick-to-make dynamic codebases ended up becoming unmaintainable monstrosities
In my experience, type checking / type hinting already starts to pay off when more than one person is working on even a small-ish code base. Just because it helps you keep in mind what comes from and goes to the other guy's code.
And in my experience "me 3 months later" counts as a whole second developer that needs accommodating. The only time I appreciate not having to think about types is on code that I know I will never, ever come back to—stuff like a one-off bash script.
> "me 3 months later" counts as a whole second developer
A fairly incompetent one, in my experience. And don't even get me started on "me 3 months ago", that guy's even worse.
"How has that shit ever worked?"
Me, looking at code 100% written by me last year.
It gets worse with age and size of the project. I’m getting the same vibes, but for code written by me last month.
Yep, I've seen type hinting be helpful even without a type checker in Python. Just as a way for devs to tell each other what they intend to pass. Even when a small percentage of the hints are incorrect, having them there can still pay off.
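(A tiny illustration of hints-as-documentation; the function and its names are made up for the example, and no checker ever needs to run:)

    from datetime import datetime, timedelta, timezone

    # Even unchecked, the signature tells the next dev what to pass in and
    # what comes back, without them having to read the body.
    def parse_event_timestamp(raw: str, tz_offset_minutes: int = 0) -> datetime:
        tz = timezone(timedelta(minutes=tz_offset_minutes))
        return datetime.fromisoformat(raw).astimezone(tz)

    print(parse_event_timestamp("2024-05-01T12:00:00+00:00", 120))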
I'm certainly extremely happy to have an extensive type system in my daily-driver languages, especially when working with AI coding assistance — it's yet another very crucial guard rail that keeps the AI on track and makes a lot of fuckups downright impossible.
The ML world being nearly entirely in Python, much of it untyped (and the Python type system being pretty weak), is really scary.
>The ML world being nearly entirely in Python, much of it untyped (and that the Python type system is pretty weak) is really scary
I think this has a ton to do with the mixed results from "vibe coding" we've seen as the codebase grows in scope and complexity. Agents seem to break down without a good type system. Same goes for JS.
I've just recently started on an Objective-C project using Cline, and it's like nirvana. I can code out an entire interface and have it implemented for me as I'm going. I see no reason it couldn't scale infinitely to massive LOC with good coding practices. The real killer feature is header files. Being able to have your entire project's headers in context at all times, along with a proper compiler for debugging, changes the game for how agents can reason about the whole codebase.
> You're not wrong here, but there's a big difference in programming one-off tooling or prototype MVPs and programming things that need to be maintained for years and years.
Humans also worry about their jobs, especially in PIP-happy companies; they are very well known for writing intentionally over-complicated code that only they understand so that they are irreplaceable.
I'm not convinced this actually happens. Seems more like something people assume happens because they don't like whatever codebase is at the new job.
If your TC is 500k-1M and you don’t feel like job hopping anymore, you’d certainly not want to get hit by a random layoff due to insufficient organizational masculinity or whatever. Maintaining a complex blob of mission critical code is one way of increasing your survival chances, though of course nothing is guaranteed.