Determination of the fifth Busy Beaver value

arxiv.org

244 points by marvinborner 15 hours ago


nathanrf - 12 hours ago

Here's a high level overview for a programmer audience (I'm listed as an author but my contributions were fairly minor):

[See specifics of the pipeline in Table 3 of the linked paper]

* There are 181 million ish essentially different Turing Machines with 5 states, first these were enumerated exhaustively

* Then, each machine was run for about 100 million steps. Of the 181 million, about 25% of them halt within this memany step, including the Champion, which ran for 47,176,870 steps before halting.

* This leaves 140 million machines which run for a long time.

So the question is: do those TMs run FOREVER, or have we just not run them long enough?

The goal of the BB challenge project was to answer this question. There is no universal algorithm that works on all TMs, so instead a series of (semi-)deciders were built. Each decider takes a TM, and (based on some proven heuristic) classifies it as either "definitely runs forever" or "maybe halts".

Four deciders ended up being used:

* Loops: run the TM for a while, and if it re-enters a previously-seen configuration, it definitely has to loop forever. Around 90% of machines do this or halt, so covers most.

6.01 million TMs remain.

* NGram CPS: abstractly simulates each TM, tracking a set of binary "n-grams" that are allowed to appear on each side. Computes an over-approximation of reachable states. If none of those abstract states enter the halting transition, then the original machine cannot halt either.

Covers 6.005 million TMs. Around 7000 TMs remain.

* RepWL: Attempts to derive counting rules that describe TM configurations. The NGram model can't "count", so this catches many machines whose patterns depend on parity. Covers 6557 TMs.

There are only a few hundred TMs left.

* FAR: Attempt to describe each TM's state as a regex/FSM.

* WFAR: like FAR but adds weighted edges, which allows some non-regular languages (like matching parentheses) to be described

* Sporadics: around 13 machines had complicated behavior that none of the previous deciders worked for. So hand-written proofs (later translated into Rocq) were written for these machines.

All of the deciders were eventually written in Rocq, which means that they are coupled with a formally-verified proof that they actually work as intended ("if this function returns True, then the corresponding mathematical TM must actually halt").

Hence, all 5-states TMs have been formally classified as halting or non-halting. The longest running halter is therefore the champion- it was already suspected to be the champion, but this proves that there wasn't any longer-running 5-state TM.

ks2048 - 9 hours ago

An interesting thing about this is that they figured out a high-level description of what this Busy Beaver program is doing - it's computing a Collatz-like sequence until it terminates.

I'm not sure if that is described in this paper, but I learned about it in this Scott Aaronson talk,

https://www.youtube.com/watch?v=VplMHWSZf5c

e.g see the slide at 31:40.

fedeb95 - 13 hours ago

what's most interesting to me about this research is that it is an online collaborative one. I wonder how many more project such as this there are, and if it could be more widespread, maybe as a platform.

jsjsjxxnnd - 11 hours ago

BB(5) was discovered last year, so why is this paper coming out now? Is there a new development?

TheAmazingRace - 8 hours ago

I can't help but immediately think of Busy Beaver stores from the tri-state (PA-OH-WV) area.

https://busybeaver.com/

purplejacket - 7 hours ago

This paper is delightfully readable.

DarkNova6 - 11 hours ago

As somebody not familiar with this field, is there any tangible benefit to this solution or is it purely academic?

onraglanroad - 9 hours ago

I was wondering if this had any implications for BB(6) but it seems that can't be written down exactly (not enough room in this universe) so BB(5) is the last one we'll see an exact value for.

bravura - 8 hours ago

Question: Is the computational cost of verifying the proof significantly less than the computational cost of creating the proof?

ape4 - 13 hours ago

I can't quite understand - did they use brute force?