FAWK: LLMs can write a language interpreter

martin.janiczek.cz

194 points by todsacerdoti 13 hours ago


williamcotton - 10 hours ago

I've been working on my own web app DSL, with most of the typing done by Claude Code, e.g.:

  GET /hello/:world
    |> jq: `{ world: .params.world }`
    |> handlebars: `<p>hello, {{world}}</p>`
  
  describe "hello, world"
    it "calls the route"
      when calling GET /hello/world
      then status is 200
      and output equals `<p>hello, world</p>`
Here's a WIP article about the DSL:

https://williamcotton.com/articles/introducing-web-pipe

And the DSL itself (written in Rust):

https://github.com/williamcotton/webpipe

And an LSP for the language:

https://github.com/williamcotton/webpipe-lsp

And of course my blog is built on top of Web Pipe:

https://github.com/williamcotton/williamcotton.com/blob/mast...

It is absolutely amazing that a solo developer (with a demanding job, kids, etc) with just some spare hours here and there can write all of this with the help of these tools.

jph00 - 6 hours ago

There’s already a language that provides all the features of awk plus modern language conveniences, and is available on every system you can think of. It’s Perl.

It even comes with an auto translator for converting awk to Perl: https://perldoc.perl.org/5.8.4/a2p

It also provides all the features of sed.

The command line flags to learn about to get all these features are: -p -i -n -l -a -e
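For instance, a couple of illustrative one-liners (the filenames here are just placeholders):

    perl -F, -lane 'print $F[1]' data.csv     # awk-style: print the second comma-separated field
    perl -pi -e 's/foo/bar/g' somefile.txt    # sed-style: in-place substitution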

Philpax - 7 hours ago

I've also had success with this. One of my hobby horses is a second, independent implementation of the Perchance language for creating random generators [0]. Perchance is genuinely very cool, but it was never designed to be embedded into other things, and I've always wanted a solution for that.

Anyway, I have/had an obscene amount of Claude Code Web credits to burn, so I set it to work on implementing a completely standalone Rust implementation of Perchance using documentation and examples alone, and, well, it exists now [1]. And yes, it was done entirely with CCW [2].

It's deterministic, can be embedded anywhere that Rust compiles to (including WASM), has pretty readable code, is largely pure (all I/O is controlled by the user), and features high-quality diagnostics. As proof of it working, I had it build and set up the deploys for a React frontend [3]. This also features an experimental "trace" feature that Perchance-proper does not have, but it's experimental because it doesn't work properly :p

Now, I can't be certain it's 1-for-1-spec-accurate, as the documentation does not constitute a spec, and we're dealing with randomness, but it's close enough that it's satisfactory for my use cases. I genuinely think this is pretty damn cool: with a few days of automated PRs, I have a second, independent mostly-complete interpreter for a language that has never had one (previous attempts, including my own, have fizzled out early).

[0]: https://perchance.org/welcome [1]: https://github.com/philpax/perchance-interpreter [2]: https://github.com/philpax/perchance-interpreter/pulls?q=is%... [3]: https://philpax.me/experimental/perchance/

badsectoracula - 10 hours ago

A related test i did around the beginning of the year: i came up with a simple stack-oriented language and asked an LLM to solve a simple problem (calculate the squared distance between two points, the coordinates of which are already in the stack) and had it figure out the details.

The part i found neat was that i used a local LLM (some quantized version of QwQ from around December or so i think) that had a thinking mode, so i was able to follow the thought process. Since it was running locally (and it wasn't a MoE model) it was slow enough for me to follow it in realtime, and i found it fun to watch the LLM trying to understand the language.

One other interesting part is that the language description had a mistake, but the LLM managed to figure things out anyway.

Here is the transcript, including a simple C interpreter for the language and a test for it at the end with the code the LLM produced:

https://app.filen.io/#/d/28cb8e0d-627a-405f-b836-489e4682822...

vidarh - 12 hours ago

It's a fun post, and I love language experiments with LLMs (I'm close to hitting the weekly limit of my Claude Max subscription because I have a near-constantly running session working on my Ruby compiler; Claude can fix -- albeit with messy code sometimes -- issues that require complex tracing of backtraces with gdb, and fix complex parser interactions almost entirely unaided as long as it has a test suite to run).

But here's the Ruby version of one of the scripts:

    BEGIN {
      result = [1, 2, 3, 4, 5]
        .filter {|x| x % 2 == 0 }
        .map {|x| x * x}
        .reduce {|acc,x| acc + x }
     puts "Result: #{result}"
    }
The point being that running a script with the "-n" switch runs BEGIN/END blocks and puts an implicit "while gets ... end" around the rest. Adding "-a" auto-splits the line like awk. Adding "-p" also prints $_ at the end of each iteration.

So here's a more typical Awk-like experience:

    ruby -pe '$_.upcase!' somefile.txt   # $_ holds the whole line
Or:

    ruby -F, -ane 'puts $F[1]'   # extracts the second field
-F sets the default character to split on, and -a adds an implicit $F = $_.split.
That is not to detract from what he's doing, because it's fun. But if your goal is just a better Awk, then Ruby is usually a better Awk, and so, for that matter, is Perl. For most things where an Awk script doesn't fit on the command line, the only real reason to use Awk is that it is more likely to be available.

ikari_pl - 10 hours ago

Today, Gemini wrote a Python script for me that connects to the Fibaro API (a local home automation system) and renames all the rooms and devices to English automatically.

It worked on the first run. Well, the second, because the first run is by default a dry run that prints a beautiful table; the actual run requires a CLI arg, and it also makes a backup.

It was a complete solution.

moriturius - 44 minutes ago

It sure can! I'm creating my own language to do this year's AoC in! https://github.com/viro-lang/viro

qsort - 12 hours ago

The money shot: https://github.com/Janiczek/fawk

A purely interpretive implementation of the kind you'd write in school; still, it's above and beyond anything I'd have any right to complain about.

low_tech_love - 9 hours ago

Slightly off-topic: I have an honest question for all of you out there who love Advent of Code, please don't take this the wrong way, it is a real curiosity: what is it for you that makes the AoC challenge so special when compared with all of the thousands of other coding challenges/exercises/competitions out there? I've been doing coding challenges for a long time and I never got anything special out of AoC, so I'm really curious. Is it simply that it reached a wider audience?

Y_Y - 12 hours ago

I've been trying to get LLMs to make Racket "hashlangs"† for years now, both for simple almost-lisps and for honest-to-god different languages, like C. It's definitely possible; raco has packages‡ for C, Python, J, Lua, etc.

Anyway so far I haven't been able to get any nice result from any of the obvious models, hopefully they're finally smart enough.

† https://williamjbowman.com/tmp/how-to-hashlang/

‡ https://pkgd.racket-lang.org/pkgn/search?tags=language

l9o - 6 hours ago

I've been working on something similar, a typed shell scripting language called shady (hehe). haven't shared it because like 99% of the code was written by claude and I'm definitely not a programming language expert. it's a toy really.

but I learned a ton building this thing. it has an LSP server now with autocompletion and go to definition, a type checker, a very much broken auto formatter (this was surprisingly harder to get done than the LSP), the whole deal. all the stuff that previously would take months or a whole team to build. there's tons of bugs and it's not something I'd use for anything, nu shell is obviously way better.

the language itself is pretty straightforward. you write functions that manipulate processes and strings, and any public function automatically becomes a CLI command. so like if you write "public deploy $env: str $version: str = ..." you get a ./script.shady deploy command with proper --help and everything. it does so by converting the function signatures into clap commands.
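not shady's actual code, but a minimal sketch of that idea using clap's builder API (the FnSig type and its fields are hypothetical stand-ins for whatever the parser produces):

    // Hypothetical parsed signature for a public shady function,
    // e.g. "public deploy $env: str $version: str = ..."
    struct FnSig {
        name: String,        // "deploy"
        params: Vec<String>, // ["env", "version"]
    }

    // Map each public function onto a clap subcommand, so that
    // `./script.shady deploy <env> <version>` gets parsing and --help for free.
    fn build_cli(sigs: &[FnSig]) -> clap::Command {
        let mut cmd = clap::Command::new("script.shady");
        for sig in sigs {
            let mut sub = clap::Command::new(sig.name.clone());
            for p in &sig.params {
                sub = sub.arg(clap::Arg::new(p.clone()).required(true));
            }
            cmd = cmd.subcommand(sub);
        }
        cmd
    }

(only required positional args shown; typed flags and help text would hang off the same builder.)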

while building it I had lots of process pipelines deadlocking, type errors pointing at the wrong spans, that kind of thing. it seems like LLMs really struggle to understand race conditions and the concept of time, but they seem to be getting better. fixed a 3-process pipeline hanging bug last week that required actually understanding how the pipe handles worked. but as others pointed out, I have also been impressed at how frequently sonnet 4.5 writes working code if given a bit of guidance.

one thing that blew my mind: I started with pest for parsing but when I got to the LSP I realized incremental parsing would be essential. because I was diligent about test coverage, sonnet 4.5 perfectly converted the entire parser to tree-sitter for me. all tests passed. that was wild. earlier versions of the model like 3.5 or 3.7 struggled with Rust quite a bit from my experience.

claude wrote most of the code but I made the design decisions and had to understand enough to fix bugs and add features. learned about tree-sitter, LSP protocol, stuff I wouldn't have touched otherwise.

still feels kinda lame to say "I built this with AI" but also... I did build it? and it works? not sure where to draw the line between "AI did it" and "AI helped me do it"

anyway just wanted to chime in from someone else doing this kind of experiment :)

nl - an hour ago

I've done something similar here but for Prolog: https://github.com/nlothian/Vibe-Prolog

It's interesting comparing what different LLMs can get done.

evacchi - 4 hours ago

I've also been thinking about generating DSLs https://blog.evacchi.dev/posts/2025/11/09/the-return-of-lang...

andsoitis - 10 hours ago

> And it did it.

it would be nice if people doing these things gave us a transcript or recording of their dialog with the LLM, so that more people can learn.

skydhash - 12 hours ago

Commendable effort, but I expected at least a demo, which would showcase working code (even if it's hacky). It's like someone talking about a piece of sheet music without playing it once.

fsmv - 5 hours ago

I await your blog post about how it only appeared to work at first and then had major problems when you actually dug in.

slybot - 12 hours ago

I did AoC 2021 up to day 10 using awk; it was fun but not easy, and I couldn't proceed further: https://github.com/nusretipek/Advent-of-Code-2021

nbardy - 10 hours ago

They have been able to write languages for two years now.

I think I was the first to write an LLM language, and the first to use LLMs to write a language, with this project (right at ChatGPT launch, GPT-3.5): https://github.com/nbardy/SynesthesiaLisp

runeks - 7 hours ago

I think it would be super interesting to see how the LLM handles extending/modifying the code it has written, i.e. adding/removing features, in order to simulate the life cycle of a normal software project. After all, LLM-produced code would only be of limited use if the LLM is worse than humans at adding new features to it.

As I understand it, this would require somehow "saving the state" of the LLM as it exists after the last prompt, since I don't think the LLM can arrive at the same state just by being fed the code it has written.

jamesu - 12 hours ago

A few months ago I used ChatGPT to rewrite a bison-based parser to recursive descent and was pretty surprised at how well it held up, though I still needed to keep prompting the AI to fix things or add elements it skipped, and in the end I probably rewrote 20% of it because I wasn't happy with its strange use of C++ features, which made certain parts hard to follow.

artpar - 12 hours ago

I wrote two:

jslike (acorn-based parser)

https://github.com/artpar/jslike

https://www.npmjs.com/package/jslike

wang-lang (I couldn't get ASI to work like JavaScript in this nearley-based grammar)

https://www.npmjs.com/package/wang-lang

https://artpar.github.io/wang/playground.html

https://github.com/artpar/wang

runeks - 10 hours ago

> I only interacted with the agent by telling it to implement a thing and write tests for it, and I only really reviewed the tests.

Did you also review the code that runs the tests?

girishso - 9 hours ago

> the basic human right of being allowed to return arrays from functions

While working in C, I can't count the number of times I wanted to return an array.

keepamovin - 12 hours ago

Yes! I'm currently using Copilot + Antigravity to implement a language with ergonomic syntax and semantics that lowers cleanly to machine code targeting multiple platforms, with a focus on safety, determinism, auditability, and failing fast on bugs. It's more work than I thought, but the LLMs are very capable.

I was dreaming of a JS-to-machine-code compiler, but then thought, why not just start from scratch and have what I want? It's a lot of fun.