Provenance Is the New Version Control

aicoding.leaflet.pub

24 points by gpi 4 hours ago


mmoustafa - a minute ago

I wrote an article on this exact issue (albeit more simple-minded) and I suggested a rudimentary way of tracking provenance with "reasoning traces".

Would love people's thoughts on this: https://0xmmo.notion.site/Preventing-agent-doom-loops-with-p...

gritzko - an hour ago

LLMs can implement red-black trees with impressive speed, quality and even some level of determinism. Here I buy the argument. Once we take something that is not already on GitHub in a thousand different flavors, it becomes an adventure. Like real adventure.

So what did you say about version control?

alphabetag675 - 6 minutes ago

If you could regenerate some code from another code in a deterministic manner, then congrats you have developed a compiler and a high-level language.

RHSeeger - an hour ago

I'm a bit confused by this because a given set of inputs can produce a different output, and different behaviors, each time it is run through the AI.

> By regenerable, I mean: if you delete a component, you can recreate it from stored intent (requirements, constraints, and decisions) with the same behavior and integration guarantees.

That statement just isn't true. And, as such, you need to keep track of the end result... _what_ was generated. The why is also important, but not sufficient.

Also, and unrelated, the "reject whitespace" part bothered me. It's perfectly acceptable to have whitespace in an email address.
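(For reference, RFC 5322 does allow whitespace inside a quoted local part. A quick illustration with Python's standard library, not from the article itself:)

```python
# RFC 5322 permits spaces in the local part of an address when it is
# quoted, e.g. "john smith"@example.com. Python's email.headerregistry
# applies the necessary quoting automatically.
from email.headerregistry import Address

addr = Address(username="john smith", domain="example.com")
print(addr.addr_spec)  # local part comes back quoted
```

So a validator that flatly rejects whitespace rejects some syntactically valid addresses.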

viraptor - an hour ago

I'm not sure this actually needs a new system. Git commits already have the message, arbitrary trailers, and note objects. If this flavor of source control is useful, I'm sure it could be prototyped on top of git first.
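(As a concrete starting point, commit trailers could carry the generation metadata. A rough sketch, where the `Prompt`/`Model` trailer names are invented for illustration and `--trailer` requires Git 2.32 or newer:)

```python
# Sketch: record provenance as git commit trailers and read it back.
# The trailer keys here are made up; nothing in git standardizes them.
import subprocess

def commit_with_provenance(message: str, prompt: str, model: str) -> None:
    # --trailer appends "Key: value" lines to the commit message body.
    subprocess.run(
        ["git", "commit", "-m", message,
         "--trailer", f"Prompt: {prompt}",
         "--trailer", f"Model: {model}"],
        check=True,
    )

def provenance_of(rev: str = "HEAD") -> str:
    # %(trailers:key=...) extracts just the matching trailer lines.
    out = subprocess.run(
        ["git", "log", "-1", rev,
         "--format=%(trailers:key=Prompt,key=Model)"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout.strip()
```

Notes objects would work too, and have the advantage of being attachable after the fact without rewriting the commit.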

jayd16 - 2 hours ago

What if I told you a specification can also be measured (and source controlled) in lines?

klodolph - an hour ago

> Once an AI can reliably regenerate an implementation from specification…

I’m sorry, but it feels like I got hit in the head when I read this; it’s so bad. For decades, people have been dreaming of making software where you can just write the specification and don’t have to actually get your hands dirty with implementation.

1. AI doesn’t solve that problem.

2. If it did, then the specification would be the code.

Diffs of pure code never really represented decisions and reasoning of humans very well in the first place. We always had human programmers who would check code in that just did stuff without really explaining what the code was supposed to do, what properties it was supposed to have, why the author chose to write it that way, etc.

AI doesn’t change that. It just introduces new systems which can, like humans, write unexplained, shitty code. Your review process is supposed to catch this. You just need more review now, compared to previously.

You capture decisions and specifications in the comments, test cases, documentation, etc. Yeah, it can be a bit messy because your specifications aren’t captured nice and neat as the only thing in your code base. But this is because that futuristic, Star Trek dream of just giving the computer broad, high-level directives is still a dream. The AI does not reliably reimplement specifications, so we check in the output.

The compiler does reliably reimplement functionally identical assembly, which is why we don’t check in the assembly output of compilers. Compilers are getting higher and higher level, and we’re getting a broader range of compiler tools to work with, but AIs are just a different category of tool and we work with them differently.

akoboldfrying - an hour ago

Yes, in theory you can represent every development state as a node in a DAG labelled with "natural language instructions" to be appended to the LLM context, hash each of the nodes, and have each node additionally point to an (also hashed) filesystem state that represents the outcome of running an agent with those instructions on the (outcome code + LLM context)s of all its parents (combined in some unambiguous way for nodes with multiple in-edges).
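(That scheme fits in a few lines. Everything below, including the names and the JSON canonicalization, is illustrative rather than any existing tool:)

```python
# A node in the provenance DAG is content-addressed by its natural-language
# instructions, its parents' hashes, and a hash of the filesystem state the
# agent run produced.
import hashlib
import json

def h(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def hash_tree(files: dict[str, str]) -> str:
    # Hash a {path: contents} snapshot in a canonical order.
    return h(json.dumps(sorted(files.items())).encode())

def node_id(instructions: str, parent_ids: list[str],
            outcome: dict[str, str]) -> str:
    payload = json.dumps({
        "instructions": instructions,
        "parents": sorted(parent_ids),  # unambiguous merge of in-edges
        "outcome": hash_tree(outcome),
    }).encode()
    return h(payload)
```

Replaying a node only lands on the same hash if the generator is deterministic, which is exactly the obstacle quoted next.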

The only practical obstacle is:

> Non-deterministic generators may produce different code from identical intent graphs.

This would not be an obstacle if you restrict to using a single version of a local LLM, turn off all nondeterminism and record the initial seed. But for now, the kinds of frontier LLMs that are useful as coding agents run on Someone Else's box, meaning they can produce different outcomes each time you run them -- and even if they promise not to change them, I can see no way to verify this promise.

hekkle - an hour ago

TL;DR, the author claims that you should record the reasons for change, rather than the code changes themselves...

CONGRATULATIONS: you have just 'invented' documentation, specifically a CHANGE_LOG.