Software factories and the agentic moment

factory.strongdm.ai

161 points by mellosouls 12 hours ago

If you'd like to try this yourself, you can build an "attractor" by just pointing Claude Code at their llms.txt. Or if you'd like to save some tokens, you can clone my Go version (https://github.com/danshapiro/kilroy). This version has a Claude Code skill to help. Tell it to use its skill to create a dotfile from your requirements. Then tell it to run that dotfile with kilroy.

xyzsparetimexyz - 4 minutes ago

Most of what people like this are actually creating is blogslop.

Alex_L_Wood - 6 hours ago

>If you haven't spent at least $1,000 on tokens today per human engineer, your software factory has room for improvement

…What am I even reading? Am I crazy to think this is a crazy thing to say, or is it actually crazy?

nine_k - 5 hours ago

$1k per day, 50 work weeks, 5 days a week → $250k a year. That is, to be worth it, the AI should work as well as an engineer who costs a company $250k. Between taxes, social security, and cost of office space, that engineer would be paid, say, $170-180k a year, like an average-level senior software engineer in the US.
This is not an outrageous amount of money, if the productivity is there. More likely the AI would work like two $90k junior engineers, but without a need to pay for a vacation, office space, social security, etc. If the productivity ends up higher than this, it's pure profit; I suppose this is their bet.
The human engineer would be like a tech lead guiding a team of juniors, only designing plans and checking results above the level of code proper, except in exceptional cases, like when a human engineer would look at the assembly code a compiler has produced.
This does sound exaggeratedly optimistic now, but does not sound crazy.
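
A rough sketch of the arithmetic above, for anyone who wants to plug in their own numbers. The $175k salary is an assumed midpoint of the $170-180k range, and applying the same overhead ratio to a junior is my own simplification, not something stated in the comment.

```python
# Figures quoted in the comment above.
TOKEN_SPEND_PER_DAY = 1_000   # USD per human engineer per working day
WORK_WEEKS_PER_YEAR = 50
WORK_DAYS_PER_WEEK = 5

# Assumptions for illustration only.
SENIOR_SALARY = 175_000       # assumed midpoint of the $170-180k range
JUNIOR_SALARY = 90_000        # the "$90k junior engineer" from the comment


def annual_token_spend(per_day: int, weeks: int, days_per_week: int) -> int:
    """Yearly token spend for one human engineer."""
    return per_day * weeks * days_per_week


annual = annual_token_spend(
    TOKEN_SPEND_PER_DAY, WORK_WEEKS_PER_YEAR, WORK_DAYS_PER_WEEK
)

# Implied ratio of total employer cost to salary for the senior engineer.
overhead_ratio = annual / SENIOR_SALARY

# How many juniors the same budget covers if they carry the same overhead ratio.
junior_equivalents = annual / (JUNIOR_SALARY * overhead_ratio)

print(f"annual token spend: ${annual:,}")
print(f"implied overhead ratio: {overhead_ratio:.2f}")
print(f"junior equivalents: {junior_equivalents:.2f}")
```

Under these assumptions the budget covers a little under two juniors, which is in line with the "two $90k junior engineers" estimate.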
- richardw - 3 hours ago
  
  It’s a $90k engineer that sometimes acts like a vandal, who never has thoughts like “this seems to be a bad way to go. Let me ask the boss” or “you know, I was thinking. Shouldn’t we try to extract this code into a reusable component?” The worst developers I’ve worked with have better instincts for what’s valuable. I wish it would stop with “the simplest way to resolve this is X little shortcut” -> boom.
  It basically stumbles around generating tokens within the bounds (usually) of your prompt, and rarely stops to think. Goal is token generation, baby. Not careful evaluation. I have to keep forcing it to stop creating magic inline strings and rather use constants or config, even though those instructions are all over my Claude.md and I’m using the top model. It loves to take shortcuts that save GPU but cost me time and money to wrestle back to rational. “These issues weren’t created by me in this chat right now so I’ll ignore them and ship it.” No, fix all the bugs. That’s the job.
  Still, I love it. I can hand code the bits I want to, let it fly with the bits I don’t. I can try something new in a separate CLI tab while others are spinning. Cost to experiment drops massively.
  - latch - an hour ago
    
    Claude Code has those "thoughts" you say it never will. In plan mode, it isn't uncommon for it to ask you: do you want to do this the quick and simple way, or would you prefer to "extract this code into a reusable component"? It will also back out and say "Actually, this is getting messy, 'boss' what do you think?"
    I could just be lucky that I work in a field with a thorough specification and numerous reference implementations.
    
    devin - 26 minutes ago
    
    I agree that Claude does this stuff. I also think the Chinese menus of options it provides are weak in their imagination, which means that for thoroughly specified problem spaces with reference implementations you're in good shape, but if you want to come up with a novel system, experience is required, otherwise you will end up in design hell. I think the danger is in juniors thinking the Chinese menu of options provided are "good" options in the first place. Simply because they are coherent does not mean they are good, and the combinations of "a little of this, a little of that" game of tradeoffs during design is lost.
    
    throwaway7783 - 28 minutes ago
    
    This has happened to me too. Claude has stopped on occasion and said "this is a big refactor, and will affect UI as well. Do you want me to do it?"
- lbreakjai - 5 hours ago
  
  $250k a year, for now. What's to stop Anthropic from doubling the price if your entire business depends on it? What are you gonna do, close up shop?
  - ikr678 - an hour ago
    
    Yeah this is just trading largely known & controllable labour management risks for some fun new unknown software ones.
    You can negotiate with your human engineers for comp; you may not be able to negotiate with as much power against Anthropic etc. (or stop them if they start to change their services for the worse).
  - teaearlgraycold - 5 hours ago
    
    What’s to stop them? Competition.
    
    lbreakjai - 4 hours ago
    
    From whom? OpenAI and Google? Who else has the sort of resources to train and run SOTA models at scale?
    You just reduced the supply of engineers from millions to just three. If you think it was expensive before ...
    
    simonw - 4 hours ago
    
    > Who else has the sort of resources to train and run SOTA models at scale?
    Google, OpenAI, Anthropic, Meta, Amazon, Reka AI, Alibaba (Qwen), 01 AI, Cohere, DeepSeek, Nvidia, Mistral, NexusFlow, Z.ai (GLM), xAI, Ai2, Princeton, Tencent, MiniMax, Moonshot (Kimi) and I've certainly missed some.
    All of those organizations have trained what I'd class as a GPT-4+ level model.
    
    lbreakjai - 4 hours ago
    
    Ah, but I said "... and run SOTA models at scale"
    
    simonw - 4 hours ago
    
    Of the list I gave you, at a guess:
    Google, OpenAI, Anthropic, Meta, Amazon, Alibaba (Qwen), Nvidia, Mistral, xAI - and likely more of the Chinese labs but I don't know much about their size.
    
    lbreakjai - 2 hours ago
    
    I guess what I was leading to is who owns the compute that runs those models. Mistral, for example, lists Microsoft and Google as subprocessors (1). Anthropic is (was?) running on GCP and AWS.
    So, we have multiple providers, but for how long? They're all competing for the same hardware and the same energy, and it will naturally converge into an oligopoly. So, if competition doesn't set the floor, what does?
    Local models? If you're not running the best model as fast as you can, then you'll be outpaced by someone who is.
    1. https://trust.mistral.ai/subprocessors