Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican

162 points by simonw a day ago

Very mildly related: Pelicans on Bicycles - Raytracer Edition:

https://blog.nawaz.org/posts/2025/Oct/pelican-on-a-bike-rayt...

simonw - a day ago

Hah, love the weird floating eggs!
- dr_dshiv - 14 hours ago
  
  Perhaps not eggs, but misplaced pelican neck pouches?
  Opus4.1: https://claude.ai/public/artifacts/b47c2dd5-41a6-452c-8701-5...
  Sonnet 4.5: https://gemini.google.com/share/a8ebea2c31dd
  Gemini 2.5pro: https://gemini.google.com/share/a8ebea2c31dd
  “a pelican riding on a bicycle in 3d. Works for mobile”
- simonw - 21 hours ago
  
  I blogged about this here, including running your benchmark against gpt-5-codex-mini, which produced the worst example I've seen by far: https://simonwillison.net/2025/Nov/9/pelican-on-a-bike-raytr...
  Here's the codex-mini attempt: https://static.simonwillison.net/static/2025/povray-pelican-...

Installing Rust projects is usually as simple as calling `cargo install`. No need to wait 5-30 minutes for an LLM to figure this out.

People are delegating way too much to LLMs. In turn, this makes your own research or problem-solving skills less sharp.

simonw - a day ago
Quoting my article:
> This is a useful starting point for a project like this—in figuring out the compile step the coding agent gets seeded with a little bit of relevant information about the project, and if it can compile that means it can later partially test the code it is writing while it works.
"Figure out how to build this" is a shortcut for getting a coding agent primed for future work. If you look at the transcript you can see what it did: https://gistpreview.github.io/?ddabbff092bdd658e06d8a2e8f142...
```
   Ran ls
    └ AGENTS.md
      CHANGELOG.md
        +13 lines
      scripts
      sdk
  
   Ran cat README.md
    └ <p align="center"><code>npm i -g @openai/codex</code><br />or <code>brew
      install --cask codex</code></p>
  
        +105 lines
  
      This repository is licensed under the [Apache-2.0 License](LICENSE).
  
   Ran rg -n "rust" -n docs/install.md
    └ 23:curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
      25:rustup component add rustfmt
      26:rustup component add clippy
  
   Ran sed -n '1,200p' docs/install.md
    └ ## Install & build
  
        +36 lines
      cargo test
      ```
  
   I'm gathering the build steps for the Rust tool, focusing on running cargo
    build inside the codex-rs directory. I'll check dependencies and confirm
    the instructions before actually building, aiming to provide a clear summary
    afterward.
  
   Ran cargo build
    └     Updating git repository `https://github.com/nornagon/crossterm`
          Updating git repository `https://github.com/nornagon/ratatui`
        +825 lines
         Compiling codex-cli v0.0.0 (/Users/simon/Dropbox/dev/codex/codex-rs/cli)
          Finished `dev` profile [unoptimized + debuginfo] target(s) in 1m 27s
  
   Ran ls target/debug | head
    └ apply_patch
      apply_patch.d
        +6 lines
      codex-execpolicy
```
That's a decent starting point on seeding the context with information that's relevant to making and then testing the modifications I'm about to ask for.
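
A rough sketch of why a clean first build helps with later testing: once a baseline result is recorded, a failure after edits can be attributed to those edits. This is an illustration only, not how Codex CLI works internally; the function names `build_succeeds` and `attribute_failure` are hypothetical, and Python is used purely for convenience.

```python
import subprocess
import sys


def build_succeeds(cmd, cwd=None):
    """Run a build command and report whether it exited with status 0."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    return result.returncode == 0


def attribute_failure(baseline_ok, current_ok):
    """Interpret a later build result in light of the baseline result."""
    if current_ok:
        return "ok"
    if baseline_ok:
        # The tree built cleanly before the edits, so the edits are the suspect.
        return "caused by modifications"
    # The tree never built, so the failure may predate the edits.
    return "pre-existing or environmental"
```

In use, the baseline would come from something like `build_succeeds(["cargo", "build"], cwd="codex-rs")` before any changes (the working directory is an assumption taken from the transcript), with the same call repeated after each round of edits.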
- adastra22 - a day ago
  
  What useful context is in there? How to call “cargo build”? It already knows that.
  - simonw - a day ago
    
    The README files that it looked at, and the directory structure it explored.
    It now knows what the project is, what dependencies it uses, how it's laid out and the set of binaries that it generates.
    Even more importantly: it knows that the project can be built without errors. If it tries a build later and sees an error it will know that the error was caused by code it had modified.
    
    risyachka - a day ago
    
    You could just tell it to check out the README, but I suspect it would have done that anyway, or figured out the type of project and how it is structured, as the first step of any other command you give it. Without that, it is impossible to add to or update the project.
    
    cadamsdotcom - a day ago
    
    For a Rust developer, neglecting their ability to debug cargo build issues puts their career at risk. For someone like that, letting AI handle it would be a really shortsighted move.
    But Simon isn’t a Rust developer - he’s a motivated individual with a side project. He can now speedrun the part he’s not interested in. That doesn’t affect anyone else’s decisions; you can still choose to learn the details. The ability to skip it if you wish is a huge win for everyone.
    
    antonvs - 19 hours ago
    
    > He can now speedrun the part he’s not interested in.
    The reductio that people tend to be concerned about is, what if someone is not interested in any aspect of software development, and just wants to earn money by doing it? The belief is that the consequences then start becoming more problematic.
    
    simonw - 19 hours ago
    
    Those people are their own worst enemies.
    Some people will always look for ways to "cheat". I don't want to hold back everyone else just because a few people will harm themselves by using this stuff as a replacement for learning and developing themselves.
    
    antonvs - 18 hours ago
    
    Do you genuinely believe that this only applies to "a few people"?
    This new post gets at the issue: https://news.ycombinator.com/item?id=45868271