Determination of the fifth Busy Beaver value

244 points by marvinborner 15 hours ago

Here's a high level overview for a programmer audience (I'm listed as an author but my contributions were fairly minor):

[See specifics of the pipeline in Table 3 of the linked paper]

* There are 181 million ish essentially different Turing Machines with 5 states, first these were enumerated exhaustively

* Then, each machine was run for about 100 million steps. Of the 181 million, about 25% of them halt within this memany step, including the Champion, which ran for 47,176,870 steps before halting.

* This leaves 140 million machines which run for a long time.

So the question is: do those TMs run FOREVER, or have we just not run them long enough?

The goal of the BB challenge project was to answer this question. There is no universal algorithm that works on all TMs, so instead a series of (semi-)deciders were built. Each decider takes a TM, and (based on some proven heuristic) classifies it as either "definitely runs forever" or "maybe halts".

Four deciders ended up being used:

* Loops: run the TM for a while, and if it re-enters a previously-seen configuration, it definitely has to loop forever. Around 90% of machines do this or halt, so covers most.

6.01 million TMs remain.

* NGram CPS: abstractly simulates each TM, tracking a set of binary "n-grams" that are allowed to appear on each side. Computes an over-approximation of reachable states. If none of those abstract states enter the halting transition, then the original machine cannot halt either.

Covers 6.005 million TMs. Around 7000 TMs remain.

* RepWL: Attempts to derive counting rules that describe TM configurations. The NGram model can't "count", so this catches many machines whose patterns depend on parity. Covers 6557 TMs.

There are only a few hundred TMs left.

* FAR: Attempt to describe each TM's state as a regex/FSM.

* WFAR: like FAR but adds weighted edges, which allows some non-regular languages (like matching parentheses) to be described

* Sporadics: around 13 machines had complicated behavior that none of the previous deciders worked for. So hand-written proofs (later translated into Rocq) were written for these machines.

All of the deciders were eventually written in Rocq, which means that they are coupled with a formally-verified proof that they actually work as intended ("if this function returns True, then the corresponding mathematical TM must actually halt").

Hence, all 5-states TMs have been formally classified as halting or non-halting. The longest running halter is therefore the champion- it was already suspected to be the champion, but this proves that there wasn't any longer-running 5-state TM.

tsterin - 12 hours ago

In Coq-BB5 machines are not run for 100M steps but directly thrown in the pipeline of deciders. Most halting machines are detected by Loops using low parameters (max 4,100 steps) and only 183 machines are simulated up to 47M steps to deduce halting.
In legacy bbchallenge's seed database, machines were run for exactly 47,176,870 steps, and we were left with about 80M nonhalting candidates. Discrepancy comes from Coq-BB5 not excluding 0RB and other minor factors.
Also, "There are 181 million ish essentially different Turing Machines with 5 states", it is important to note that we end up with this number only after the proof is completed, knowing this number is as hard as solving BB(5).
- cvoss - 11 hours ago
  
  Why is knowing that there are 181 M 5-state machines difficult? Is it not just (read*write*move*(state+halt))^state? About 48^5, modulo "essentially different".
  - tsterin - 11 hours ago
    
    Number of 5-state TMs is 21^10 = 16,679,880,978,201; coming from (1 + writemovestate)^2*state; difference with your formula is that in our model, halting is encoded using undefined transition.
    "essentially different" is not a statically-checked property. It is discovered by the enumeration algorithm (Tree Normal Form, see Section 3 of https://arxiv.org/pdf/2509.12337); in particular, for each machine, the algorithm needs to know it if will ever reach an undefined transition because if it does it will visit the children of that machine (i.e. all ways to set that reached undefined transition).
    Knowing if an undefined transition will be ever reached is not computable, hence knowing the number of enumerated machines is not computable in general and is as hard as solving BB(5).
    
    schoen - 11 hours ago
    
    I guess the meaning of "essentially different" essentially depends on the strength of the mathematical theory that you used to classify them!
    When I first heard it I thought about using some kind of similar symmetry arguments (e.g. swapping left-move and right-move). Maybe there are also more elaborate symmetry arguments of some kind.
    Isn't it fair to say that there is no single objective definition of what differences between machines are "essential" here? If you have a stronger theory and stronger tools, you can draw more distinctions between TMs; with a weaker theory and weaker tools, you can draw fewer distinctions.
    By analogy, suppose you were looking at groups. As you get a more sophisticated group theory you can understand more reasons or ways that two groups are "the same". I guess there is a natural limit of group isomorphism, but perhaps there are still other things where group structure or behavior is "the same" in some interesting or important ways even between non-isomorphic groups?
    
    nickdrozd - 10 hours ago
    
    Turing machine program states are conventionally represented with letters: A, B, C, etc. The starting state is A.
    Now suppose you are running a Turing machine program from the beginning. The only state it has visited so far is state A. It runs until it reaches a state that has not been visited yet. What state is it? B? C? D? According to "tree normal form", the name for that next state just is the earliest unused state name, which in this case is B.
    Visited states so far are A and B. Run until an unvisited state is reached again. What is its name? C? D? E? Tree normal form says that the state will be called C. And the next newly visited state will be D, etc. In general, the canonical form for a Turing machine program will be the one that puts initial state visits in alphabetical order. (This concept also applies to the multi-color case.)
    It's not possible to tell of an arbitrary TM program whether or not that program is in tree normal form. But this proof doesn't look at arbitrary programs. It generates candidate programs by tracing out the normal tree starting from the root, thereby bypassing non-normal programs altogether.
    That is what "essentially different" means here.
    
    tromp - 10 hours ago
    
    If you count arbitrary transition tables, then you get a count of 63403380965376 [1].
    If you count transition tables in which states are reachable and canonically ordered, then you get a count of 632700 * 4^10 = 663434035200 [2]. These machines can behave differently on arbitrary tape contents.
    TNF further reduces these counts by examining machine behaviour when starting on an empty tape.
    [1] https://oeis.org/A052200
    [2] https://oeis.org/A107668
  - dgacmu - 10 hours ago
    
    Just to directly address this - the key thing wrong in this formula is that it's ^2*state, not ^state. The state transition table operates both on the current state and the input you read from the tape, so for a binary turing machine with 5 states, you have a 10-entry transition table.
    
    cvoss - 8 hours ago
    
    Ah, yes, of course the read bit should move into the exponent, since it's an input, not an output of the function. But the key point I was making is that there exists a formula. (I don't really care what the formula is.) The part I was not understanding was the complexity of "essentially different".
    
    LegionMammal978 - 7 hours ago
    
    In this context, two syntactically-different TMs are considered "essentially the same" if all reachable states are the same up to reordering their labels (except for the fixed starting label A) and globally swapping the L/R directions.
    The problem is knowing how many states are actually reachable, and how many are dead code. This is impossible to decide in general thanks to Rice's theorem and whatnot. In this case, it involves deciding all 4-state machines.
  - Lerc - 11 hours ago
    
    My intuition is that this would include every 4 state (and fewer) machine that has a fifth state that is never used. For every different configuration of that fifth state I would consider the machine to be essentially the same.
hackingonempty - 11 hours ago

Thank you for the write-down, it is very interesting!
Is there some reason why Rocq is the proof assistant of choice and not one of the others?
also.... https://www.youtube.com/watch?v=c3sOuEv0E2I
- nathanrf - 10 hours ago
  
  It is as simple as: the person who contributed the proofs/implementations chose Rocq.
  I did some small proofs in Dafny for a few of the simpler deciders (most of which didn't end up being used).
  At the time, I found Dafny's "program extraction" (i.e. transpilation to a "real" programming language) very cumbersome, and produced extremely inefficient code. I think since then, a much improved Rust target has been implemented.
- LegionMammal978 - 8 hours ago
  
  Its focus around constructible objects does lend itself well to the kind of proofs needed for non-halting. Usually they involve lemmas about the 'syntax' of a TM's configuration and how it evolves over time, and the actual number-based math is relatively simple. (With the exception of Skelet #17, which involves fiddly invariants of finite sequences.)
AdamH12113 - 10 hours ago

This is a fantastic summary that's very easy to understand. Thank you very much for writing it!
EvgeniyZh - 8 hours ago

So how many sporadics would be left if we run the same on 6-state TM?
- chtl - 6 hours ago
  
  Very many - I think the number is several thousand. Several sporadic 6-state machines have been solved, though, and there are currently about 2400(?) unsolved machines. Among these are several Cryptids, machines whose halting problem is known to be mathematically hard
pred_ - 7 hours ago

Thanks for writing that up! Was there anything to be learned from those 13 sporadics or are they simply truly sporadic?
_zoltan_ - 7 hours ago

how much computing time does it take to evaluate a machine to 100 million steps on a modern epyc system?
generalizations - 9 hours ago

> There is no universal algorithm that works on all TMs
Is this suspected or proven? I would love to read the proof if it exists.
- mwenge - 9 hours ago
  
  It is indeed proven and the reason they're called Turing machines! https://en.wikipedia.org/wiki/Halting_problem
  - webstrand - 8 hours ago
    
    Doesn't the discovery of the fifth Busy Beaver value indicate that there is a decider for 5-state Turing machines?
    
    tromp - 8 hours ago
    
    Yes, there are deciders for all finite sets of TMs. You just cannot have one for all TMs.
    
    tialaramex - 3 hours ago
    
    I think actually for relatively small n we get cases where mathematics says nope, you can't decide that, the machine goes recursive and so now your decider may be looking at a machine which is itself running deciders and Kurt Gödel says "No".
    
    orlp - 8 hours ago
    
    Yes. But there is no decider for n-state Turing machines that works regardless of n.
- Y_Y - 9 hours ago
  
  Scooping the Loop Snooper (an elementary proof of the undecidability of the halting problem)
  No program can say what another will do. Now, I won't just assert that, I'll prove it to you: I will prove that although you might work til you drop, you can't predict whether a program will stop.
  Imagine we have a procedure called P that will snoop in the source code of programs to see there aren't infinite loops that go round and around; and P prints the word "Fine!" if no looping is found.
  You feed in your code, and the input it needs, and then P takes them both and it studies and reads and computes whether things will all end as the should (as opposed to going loopy the way that they could).
  Well, the truth is that P cannot possibly be, because if you wrote it and gave it to me, I could use it to set up a logical bind that would shatter your reason and scramble your mind.
  Here's the trick I would use - and it's simple to do. I'd define a procedure - we'll name the thing Q - that would take and program and call P (of course!) to tell if it looped, by reading the source;
  And if so, Q would simply print "Loop!" and then stop; but if no, Q would go right back to the top, and start off again, looping endlessly back, til the universe dies and is frozen and black.
  And this program called Q wouldn't stay on the shelf; I would run it, and (fiendishly) feed it itself. What behaviour results when I do this with Q? When it reads its own source, just what will it do?
  If P warns of loops, Q will print "Loop!" and quit; yet P is supposed to speak truly of it. So if Q's going to quit, then P should say, "Fine!" - which will make Q go back to its very first line!
  No matter what P would have done, Q will scoop it: Q uses P's output to make P look stupid. If P gets things right then it lies in its tooth; and if it speaks falsely, it's telling the truth!
  I've created a paradox, neat as can be - and simply by using your putative P. When you assumed P you stepped into a snare; Your assumptions have led you right into my lair.
  So, how to escape from this logical mess? I don't have to tell you; I'm sure you can guess. By reductio, there cannot possibly be a procedure that acts like the mythical P.
  You can never discover mechanical means for predicting the acts of computing machines. It's something that cannot be done. So we users must find our own bugs; our computers are losers!
  by Geoffrey K. Pullum Stevenson College University of California Santa Cruz, CA 95064
  From Mathematics Magazine, October 2000, pp 319-320.
  - gowld - 7 hours ago
    
    In principle, analysts could could analyze all machines of size K or less, classifying many of them as being halting or not, and all the rest could in fact be non-halting, but the analyst would never (at any finite time) know for sure whether they are all non-halting.
- icepush - 9 hours ago
  
  This is proven. It's known as the halting problem and is the central pillar of computational complexity theory. The proof was invented by Alan Turing and is online.
- tekne - 9 hours ago
  
  This is just the Halting Problem, many good introductory expositions of the proof exist.
seberino - 10 hours ago

Thanks. Amazing work.

ks2048 - 9 hours ago

An interesting thing about this is that they figured out a high-level description of what this Busy Beaver program is doing - it's computing a Collatz-like sequence until it terminates.

I'm not sure if that is described in this paper, but I learned about it in this Scott Aaronson talk,

https://www.youtube.com/watch?v=VplMHWSZf5c

e.g see the slide at 31:40.

bobbylarrybobby - 3 hours ago

Interesting, it seems like a possible contender for BB6 (“Antihydra”) also does something Collatz-like. Is Collatz just a good blueprint for constructing long, complex, finite sequences?
gowld - 7 hours ago

Collatz-like except on one critical aspect: it is known to terminate!

fedeb95 - 13 hours ago

what's most interesting to me about this research is that it is an online collaborative one. I wonder how many more project such as this there are, and if it could be more widespread, maybe as a platform.

marvinborner - 12 hours ago

In recent years there has been a movement to collaborate on math proofs via blueprints (dependency graphs) in the Lean language, which seems related.
For example:
https://teorth.github.io/equational_theories/
https://teorth.github.io/pfr/
- fedeb95 - 9 hours ago
  
  thanks, these are interesting indeed!
longwave - 11 hours ago

This comment reminded me to check whether https://www.distributed.net/ was still in existence. I hadn't thought about the site for probably two decades, I ran the client for this back in the late 1990s back when they were cracking RC5-64, but they still appear to be going as a platform that could be used for this kind of thing.
- schoen - 11 hours ago
  
  I was also excited about those projects and ran DESchall as well as distributed.net clients. Later on I was running the EFF Cooperative Computing Award (https://www.eff.org/awards/coop), as in administering the contest, not as in running software to search for solutions!
  The original cryptographic challenges like the DES challenge and the RSA challenges had a goal to demonstrate something about the strength of cryptosystems (roughly, that DES and, a fortiori, 40-bit "export" ciphers were pretty bad, and that RSA-1024 or RSA-2048 were pretty good). The EFF Cooperative Computing Award had a further goal -- from the 1990s -- to show that Internet collaboration is powerful and useful.
  Today I would say that all of these things have outlived their original goals, because the strength of DES, 40-bit ciphers, or RSA moduli are now relatively apparent; we can get better data about the cost of brute-force cryptanalytic attacks from the Bitcoin network hashrate (which obviously didn't exist at all in the 1990s), and the power and effectiveness of Internet collaboration, including among people who don't know each other offline and don't have any prior affiliation, has, um, been demonstrated very strongly over and over and over again. (It might be hard to appreciate nowadays how at one time some people dismissed the Internet as potentially not that important.)
  This Busy Beaver collaboration and Terence Tao's equational theories project (also cited in this paper) show that Internet collaboration among far-flung strangers for substantive mathematics research, not just brute force computation, is also a reality (specifically now including formalized, machine-checked proofs).
  There's still a phenomenon of "grid computing" (often with volunteer resources), working on a whole bunch of computational tasks:
  https://en.wikipedia.org/wiki/List_of_grid_computing_project...
  It's really just the specific "establish the empirical strength of cryptosystems" and "show that the Internet is useful and important" 1990s goals that are kind of done by this point. :-)
arethuza - 13 hours ago

The BB Challenge site is really well structured:
https://bbchallenge.org/13650583
tsterin - 11 hours ago

In the paper we mention two other communities which seem to have similar structure and size:
- https://conwaylife.com/, on Conway's GoL and other cellular automata
- Googology, https://googology.fandom.com/wiki/Googology_Wiki and https://googology.miraheze.org/wiki/Main_Page, on big numbers

jsjsjxxnnd - 11 hours ago

BB(5) was discovered last year, so why is this paper coming out now? Is there a new development?

nathanrf - 10 hours ago

This is the (pre-print) paper produced from that project - i.e. it is a self-contained, complete description of the work completed so that it can be reviewed, and then cited.
The underlying result has not changed, but the presentation of the details is now complete.

TheAmazingRace - 8 hours ago

I can't help but immediately think of Busy Beaver stores from the tri-state (PA-OH-WV) area.

https://busybeaver.com/

purplejacket - 7 hours ago

This paper is delightfully readable.

DarkNova6 - 11 hours ago

As somebody not familiar with this field, is there any tangible benefit to this solution or is it purely academic?

schoen - 11 hours ago

The Busy Beaver problem is one of the most theoretical, one of the most purely mathematical, of all theoretical computer science questions. It is really about what is possible in a very abstract sense.
Working on it did make people cleverer at analyzing the behavior of software, but it's not obvious that those skills or associated techniques will directly transfer to analyzing the behavior of much more complex software that does practical stuff. The programs that were being analyzed here are extraordinarily small by typical software developer standards.
To be more specific, it seems conceivable to me that some of these methods could inspire deductive techniques that can be used in some way in proof assistants or optimizing compilers, in order to help ensure correctness of more complicated programs, but again that application isn't obvious or guaranteed. The people working on this collaboration would definitely have described themselves as doing theoretical computer science rather than applied computer science.
- arethuza - 9 hours ago
  
  "extraordinarily small by typical software developer standards"
  That's the incredible thing about these TMs - they are amazingly small but still can exhibit very complex behaviours - it's like microscopy for programs.
non_aligned - 11 hours ago

The research into BB numbers is purely academic and unlikely to be more than that, unless some other part of mathematics turns out to be wrong. Our current understanding is that the numbers are essentially guaranteed to be useless.
This particular proof is doubly-academic in the sense that the value was already known, this is just a way to make it easier to independently verify the result.
It's a part of a broader movement to provide machine proofs for other stuff (e.g., Fermat's last theorem), which may be beneficial in some ways (e.g., identifying issues, making parts of proofs more "reusable").
- cwillu - 11 hours ago
  
  I'm going to push back on “the value was already known”; my understanding is that the candidate value was known, but while there was a single machine with behaviour that was not yet characterized, it quite easily could have been the real champion.
  - non_aligned - 11 hours ago
    
    Ah yeah, I stand corrected. I didn't realize this is a paper by the original team that found the number a while back, I thought it's an independent formalization in 2025.
YesThatTom2 - 11 hours ago

It is purely academic... or is it?
G. H. Hardy's autobiography was titled "A Mathematician's Apology" because he, somewhat jokingly, wanted to apologize for a life of pure math... totally academic and completely useless.
Then a few years after he passed away, his math was key for The Manhattan Project to build the first successful nuclear bomb. His math lead to the device that changed the world and affected every aspect of human life for a century.
It's only "useless" until someone finds a "use".
P.S. The book documents his time with Ramanujan. If you liked the film "The Man Who Knew Infinity", you should read Apology
- DarkNova6 - 11 hours ago
  
  Look, if you tell me that this is just an interesting problem, I'm happy to hear that. If it helps prove some exotic theorem, I'm happy to hear that. I'm also happy to hear that this has a direct application somewhere.
  I just want to know.
  - YesThatTom2 - 10 hours ago
    
    I'm sorry if it sounded like I was scolding you. My intent was to confirm your assumption but give you an appreciation for the fact that it's hard to know.
  - - 6 hours ago
    
    [deleted]
  - onraglanroad - 7 hours ago
    
    Nobody knows.
    That's the reason why you were pointed to Hardy's Mathematician's Apology. He was talking about how his works would never have a practical use.
    It turns out he was wrong, because Computers are mathematics made flesh. All these abstract notions suddenly had a practical use.
    There is something fundamental about the limits of computing so there might well be something that can be made of the BB function but at the moment it's just pure maths.
cwillu - 11 hours ago

Purely academic, there are no direct applications or anything unlocked by knowing the value; the benefits are all in what was learned along the way.
- vintermann - 9 hours ago
  
  But there were reasons this problem was looked at rather than another problem. We can see that not all problems are equally interesting to mathematicians, and it's not simply about complexity or difficulty.
  So the goals are hard to pin down exactly, but there are goals.
  - cwillu - 9 hours ago
    
    I didn't say or suggest otherwise.
bubblyworld - 7 hours ago

Purely academic, at least if you mean the result itself. People are bringing up all sorts of apologetic stuff about how it might be useful in the future, and as an ex-maths-phd I can sympathise, but to me this seems more like a large-scale bookkeeping exercise than a conceptual breakthrough. It's a very nice result though =)
aeve890 - 10 hours ago

100 years ago you could be asking if number theory or non-euclidean geometry has any tangible benefit of is purely academic. For the bb problem, the answer right now is no. But nobody knows what's down the road when finding useful applications in pure maths.
ivanjermakov - 10 hours ago

Researching methods to solve busy beaver problem also helps in exploring and developing methods to tame halting problem in various applications, such as static analysis.
bonoboTP - 9 hours ago

It's a beautiful achievement of humanity. Like going to the moon. Someone might say there are some mining or rock-related applications around going to the moon, but it's mainly not about that.
I don't like to call this "purely academic" because that sounds dusty and smells moth-chewed and utterly boring. It's purely spiritual, purely heart-warming, purely brain-flexing.
It's like asking a mountain climber whether reaching K2 has practical benefits or if it's "academic".
YetAnotherNick - 11 hours ago

Define tangible benefits and give example of any pure mathematics in last 200 years having tangible benefits.
One benefit is that it is related to foundations of mathematics. If you can find busy beaver number 745, you can prove or disprove Maths. If you can find busy beaver 6, you can prove few collatz like conjecture which is an open problem.
Most core research doesn't help directly, but through indirect means. e.g. This would have led to finding the areas of improvements in Lean due to its complexity. I know for fact that Lean community learns a lot from projects like Mathlib and this. And Lean is used in all sort of scenarios like making AWS better to improving safety of flights.
twosdai - 11 hours ago

I am also unfamiliar, but a naive guess I can make is helping solve other math proofs, and maybe better encryption for the public. Basically everytime I see some progress in math, it usually means encryption in some way is impacted.

onraglanroad - 9 hours ago

I was wondering if this had any implications for BB(6) but it seems that can't be written down exactly (not enough room in this universe) so BB(5) is the last one we'll see an exact value for.

sapiogram - 8 hours ago

Wikipedia give a lower bound for BB(6) of 2^2^2^2^2^2... repeated 33554432 times, that's definitely a fast-growing function!
- mostly_a_lurker - 3 hours ago
  
  Hmm, I think you might be a bit underestimating?
  My understanding of up-arrow notation is: 2 ^^^ 5 means
  2 ^^ (2 ^^ (2 ^^ (2 ^^ 2))) (with five 2's)
  = 2 ^^ (2 ^^ (2 ^^ (2 ^ 2)))
  = 2 ^^ (2 ^^ (2 ^^ 4))
  = 2 ^^ (2 ^^ 65536), since Wikipedia calculates 2 ^^ 4 = 65536.
  Now 2 ^^ 65536 means 2 ^ (2 ^ (2 ^ ...))) with 65536 two's. Call that number N (too large to really describe).
  Then 2 ^^ N is 2 ^ (2 ^ ( ...)) with N two's.
- jojomodding - 6 hours ago
  
  And it gets that figure from this paper and the authors of this paper continuing to work on BB(6).

bravura - 8 hours ago

Question: Is the computational cost of verifying the proof significantly less than the computational cost of creating the proof?

crdrost - 8 hours ago

Like I haven't run Coq but I assume that in this particular instance, the answer is a rather trivial "yes"? You'd think about, creating this proof is a gigantic research slog and publishable result; verifying it is just looking at the notes you wrote down about what worked (this turing machine doesn't halt and here's the input and here's the repetitive sequence it generates, that turing machine always halts and here's the way we proved it...).
But taken more generally, not for this specific instance, your question is actually a Millenium Prize problem worth a million dollars.
- LegionMammal978 - 7 hours ago
  
  In principle, you could reduce the problem into running a few very costly programs, so that the proof is completely verified as soon as someone eats that cost. (Some 6-state machines are already looking like they'll take a galactic cost to prove, and others will take a cost just barely within reason.) But this definitely isn't the case for the BB(5) project.
Kranar - 6 hours ago

This is true in general for every mathematical proof in ZFC (and even in more powerful theories). The decision problem "Given a formula F and an integer n, is there a ZFC proof of F of length <= n?" is NP complete, meaning that verifying the proof can be done in polynomial time while deriving the proof can require an exponential amount of time.
LegionMammal978 - 7 hours ago

The compiled proof does take a few hours to finish verifying on a typical laptop. But obviously the many experiments and enumerations along the way took much more total processing power.

ape4 - 13 hours ago

I can't quite understand - did they use brute force?

bc569a80a344f9c - 13 hours ago

Not quite, I think this is the relevant part of the paper:
> Structure of the proof. The proof of our main result, Theorem 1.1, is given in Section 6. The structure of the proof is as follows: machines are enumerated arborescently in Tree Normal Form (TNF) [9] – which drastically reduces the search space’s size: from 16,679,880,978,201 5-state machines to “only” 181,385,789; see Section 3. Each enumerated machine is fed through a pipeline of proof techniques, mostly consisting of deciders, which are algorithms trying to decide whether the machine halts or not. Because of the uncomputability of the halting problem, there is no universal decider and all the craft resides in creating deciders able to decide large families of machines in reasonable time. Almost all of our deciders are instances of an abstract interpretation framework that we call Closed Tape Language (CTL), which consists in approximating the set of configurations visited by a Turing machine with a more convenient superset, one that contains no halting configurations and is closed under Turing machine transitions (see Section 4.2). The S(5) pipeline is given in Table 3 – see Table 4 for S(2,4). All the deciders in this work were crafted by The bbchallenge Collaboration; see Section 4. In the case of 5-state machines, 13 Sporadic Machines were not solved by deciders and required individual proofs of nonhalting, see Section 5.
So, they figured out how to massively reduce the search space, wrote some generic deciders that were able to prove whether large amounts of the remaining search spaces would halt or not, and then had to manually solve the remaining 13 machines that the generic deciders couldn't reason about.
- ufo - 13 hours ago
  
  Last but not least, those deciders were implemented and verified in the Rocq proof assistant, so we know they are correct.
  - lairv - 12 hours ago
    
    We know that they correctly implement their specification*
    
    meithecatte - 6 hours ago
    
    No, they are correct, because the deciders themselves are just a cog in the proof of the overall theorem. The specification of the deciders is not part of the TCB, so to speak.
- jjgreen - 12 hours ago
  
  Those 13 sporadics are the interesting ones then ...
Xcelerate - 12 hours ago

There are a few concepts at play here. First you have to consider what can be proven given a particular theory of mathematics (presumably a consistent, recursively axiomatizable one). For any such theory, there is some finite N for which that theory cannot prove the exact value of BB(N). So with "infinite time", one could (in principle) enumerate all proofs and confirm successive Busy Beaver values only up to the point where the theory runs out of power. This is the Busy Beaver version of Gödel/Chaitin incompleteness. For BB(5), Peano Arithmetic suffices and RCA₀ likely does as well. Where do more powerful theories come from? That's a bit of a mystery, although there's certainly plenty of research on that (see Feferman's and Friedman's work).
Second, you have to consider what's feasible in finite time. You can enumerate machines and also enumerate proofs, but any concrete strategy has limits. In the case of BB(5), the authors did not use naive brute force. They exhaustively enumerated the 5-state machines (after symmetry reductions), applied a collection of certified deciders to prove halting/non-halting behavior for almost all of them, and then provided manual proofs (also formalized) for some holdout machines.
arethuza - 13 hours ago

I think you have to exhaustively check each 5-state TM, but then for each one brute force will only help a bit - brute force can't tell you that a TM will run forever without stopping?
olmo23 - 13 hours ago

You can not rely on brute force alone to compute these numbers. They are uncomputable.
- karmakaze - 13 hours ago
  
  The busy beaver numbers form an uncomputable sequence.
  For BB(5) the proof of its value is an indirect computation. The verification process involved both computation (running many machines) and proofs (showing others run forever or halt earlier). The exhaustiveness of crowdsourced proofs was a tour de force.
- arethuza - 13 hours ago
  
  Isn't it rather that the Busy Beaver function is uncomputable, particular values can be calculated - although anything great than BB(5) is quite a challenge!
  https://scottaaronson.blog/?p=8972
  - CaptainNegative - 12 hours ago
    
    Only finitely many values of BB can be mathematically determined. Once your Turing Machines become expressive enough to encode your (presumably consistent) proof system, they can begin encoding nonsense of the form "I will halt only after I manage to derive a proof that I won't ever halt", which means that their halting status (and the corresponding Busy Beaver value) fundamentally cannot be proven.
    
    Veserv - 6 hours ago
    
    The bound on that is known to be no more than BB(745) which is independent of ZFC [1].
    [1] https://scottaaronson.blog/?p=7388
    
    Kranar - 5 hours ago
    
    That's a misinterpretation of what the article says. There is no actual bound in principle to what can be computed. There is a fairly practical bound which is likely BB(10) for all intents and purposes, but in principle there is no finite value of n for which BB(n) is somehow mathematically unknowable.
    ZFC is not some God given axiomatic system, it just happens to be one that mathematicians in a very niche domain have settled on because almost all problems under investigation can be captured by it. Most working mathematicians don't really concern themselves with it one way or another, almost no mathematical proofs actually reference ZFC, and with respect to busy beavers, it's not at all uncommon to extend ZFC with even more powerful axioms such as large cardinality axioms in order to investigate them.
    Anyhow, just want to dispel a common misconception that comes up that somehow there is a limit in principle to what the largest BB(n) is that can be computed. There are practical limits for sure, but there is no limit in principle.
    
    IsTom - 4 hours ago
    
    You can compute a number that is equal to BB(n), but you can't prove that it is the right number you are looking for. For any fixed set of axioms you'll eventually run into BB(n) too big that gets indepentent.
    
    Kranar - 4 hours ago
    
    >You can compute a number that is equal to BB(n), but you can't prove that it is the right number you are looking for.
    You can't categorically declare that something is unprovable. You can simply state that within some formal theory a proposition is independent, but you can't state that a proposition is independent of all possibly formal theories.
    
    tromp - 3 hours ago
    
    They didn't claim that. They claimed that any (sound and consistent) finitely axiomatizable theory (basically, any recursively enumerable set of theorems) can only prove finitely many theorems of the form BB(n) = N.