Reverse math shows why hard problems are hard
quantamagazine.org
162 points by gsf_emergency_6 a day ago
I recently had, for various reasons, to improve my math skills.
I was surprised at how difficult I found math. Now, I was never really great at math; logic and calculation in my head I could do fairly well (above average), but foundational knowledge was hard and mathematical theory even harder. But now I even had trouble with integration and differentiation, and even with understanding a problem well enough to put it down as a formula. I am far from being the youngest anymore, but I was surprised at how shockingly bad I have become over the last 25+ years. So I decided to change this in the coming months. I think in a way computers actually made our brains worse; many problems can be auto-solved (python numpy, sympy etc...) and the computers work better than hand-held calculators, but math is actually surprisingly difficult without a computer. (Here I also include algorithms by the way, or rather, the theory behind algorithms. And of course I also forgot a lot of the mathematical notation - somehow programming is a lot easier than higher math.)
I've grown to see math as far more pattern matching than I remembered it being as a kid. I think that explains why "rote memorization" works better for it than folks want to admit.
This doesn't end at kid-level maths either; I have seen people get bachelors and masters degrees in maths without understanding much of it intuitively or being able to apply it.
Mostly because they have rote memorised it (and partly because much of the education system is a game to be played, e.g. speaking with professors during office hours can give very strong hints about the exams).
Right, I've also seen people who couldn't get some higher math concepts because they hadn't learned to recognize certain things on sight. Curves are a good example. You should be able to roughly identify basic curves on sight. Or distributions based on their shape. With obvious caveats.
I suspect this is a lot like being able to attribute a piece of art to its artist by sight. Strictly, not required. But a lot of great artists can do it.
For real fun, I saw an interview with Magnus Carlsen where someone was quizzing him on famous games. He was able to name the game from the first 2-3 moves a remarkable number of times.
If I may share, the stuff happening inside a computer wouldn't really qualify as math … math is more about intuition and a sense for describing things in quantitative language …
Neuroscience suggests global connectivity changes after 40, rather than changes confined to specialized areas. Overall declines do not start until the late 40s, though.
I started going down that road decades ago myself. I had a degree in computer science already, but I'd only learned just enough math to graduate and then forgotten everything, deliberately.
Years after I graduated, I was browsing comp.lang.java (I think it was) and somebody asked for help developing an applet that could draw a 3-D arrow that could orient itself in any direction. For some reason, that sounded interesting to me, so I started trying to work on it and I realized I needed to go back and re-learn all of that "point-slope" stuff that I'd made a point of learning just enough of to squeak through college.
That sent me down the path of re-learning all the things I now wish I'd put more effort into learning when I was a teenager. I ended up working through my old undergraduate calculus textbook a few times and I understand undergraduate calculus _really_ well now. I was able to coach both of my kids through high school calculus and they both remarked that none of their friends' parents were able to help them beyond algebra.
It makes me wonder how many people are great (or even adequate) at math and how many are just faking it - as interesting as I now find it, math skills aren't actually all that useful in practice.
Whenever the pigeon-hole principle is name-dropped, we should also drop Dijkstra's commentary "The undeserved status of the pigeon-hole principle".
https://www.cs.utexas.edu/~EWD/transcriptions/EWD10xx/EWD109...
I love Dijkstra's writing, but I don't think this is his strongest piece. In general parlance, when we say "by pigeonhole" we mean "any variant of it". I'd still call what he's doing "pigeonhole" lol. You can even further generalize it, e.g. by making expected value arguments.
This is not uncommon: we can say that "by the fundamental theorem of algebra" two polynomials of degree N that agree on N+1 points are identically equal. "By induction" includes Cauchy induction, sometimes with "this and that are the same" we mean "up to isomorphism" and so on.
The advice he ends on is extremely solid, though:
The moral of the story is that we are much better off with the neutral, general standard procedure: name the unknown(s) so that you can formalize all conditions given or to be established and simplify by formula manipulation.
The math will always math. Combining that with the OP article, the obstruction to showing hardness is therefore not technical but foundational, i.e. the required axiom lies outside the standard minimal toolkit.
Wow. I thought the metaphor was stupid when I was taught this in college, and only now, decades later, I find out Dijkstra agreed.
Interesting! Were there other examples from your course that had similar delays in insight? What changed your perspective?
The first time I saw the Pigeonhole Principle was in the following:
Problem: A plane has every point colored red, blue, or yellow. Prove that there exists a rectangle whose four vertices are all the same color.
Solution: Consider a grid of points, 4 rows by 82 columns. There are 3^4 = 81 possible color patterns for a column, so by the Pigeonhole Principle, at least two columns have the same color pattern. Also by the Pigeonhole Principle, each column of 4 points must have at least two points of the same color. Those two same-colored points, taken in both of the matching columns, form the four vertices of a monochromatic rectangle. QED.
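For concreteness, here is a minimal Python sketch of that argument run as a search procedure on an arbitrary coloring (the random coloring and the variable names are mine, purely for illustration):

    # Sketch: given ANY 3-coloring of a 4 x 82 grid, find a monochromatic
    # rectangle by following the two pigeonhole steps in the proof above.
    import random
    from collections import defaultdict

    ROWS, COLS, COLORS = 4, 82, 3
    grid = [[random.randrange(COLORS) for _ in range(COLS)] for _ in range(ROWS)]

    # Step 1 (pigeonhole on columns): 3^4 = 81 possible column patterns but
    # 82 columns, so some pattern must repeat.
    seen = {}
    for c in range(COLS):
        pattern = tuple(grid[r][c] for r in range(ROWS))
        if pattern in seen:
            c1, c2 = seen[pattern], c
            break
        seen[pattern] = c

    # Step 2 (pigeonhole on rows): 4 points, 3 colors, so two rows within the
    # repeated pattern share a color.
    rows_by_color = defaultdict(list)
    for r in range(ROWS):
        rows_by_color[pattern[r]].append(r)
    r1, r2 = next(rs[:2] for rs in rows_by_color.values() if len(rs) >= 2)

    print(f"monochromatic rectangle: rows {r1},{r2} x columns {c1},{c2}, "
          f"color {pattern[r1]}")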
The Pigeonhole Principle is very neat. It would be hard not to use it for the proof.
Part of that article argues against proof by contradiction, which does seem to be overused.
Unless by plane they mean airplane, since on a curved 3D surface this is not automatically guaranteed to be true.
Even if it was talking about airplanes, it doesn’t mention “surface”. So it would still hold, given that airplane parts aren’t infinitely thin.
Though if you fly economy it seems like they're bent on approaching it as a limit.
Topologically, an airplane is identical to a universe of curvature=+1. Since the size of the grid versus the airplane/universe is not given, I will assume there are infinitely many grid points.
Specifically, reverse math (a subfield of metamathematics that studies which axioms are needed to prove given theorems, in effect swapping the roles of axioms and theorems) allows us to show that some hard problems are equivalent to each other.
EDIT: I think this line is the most telling:
> But he cautioned that the reverse mathematics approach may be most useful for revealing new connections between theorems that researchers have already proved. "It doesn’t tell us much, as far as we can say, about the complexity of statements which we do not know how to prove."
So, at this point, it helps us understand more about problems we already understand a little about, but nothing yet about new problems.
> So, at this point, it helps us understand more about problems we already understand a little about, but nothing yet about new problems.
I don't think this caution makes any sense!
The more we learn about theorem/axiom equivalences (or more generally, the lattice of such connections) between existing proofs, the more insights we will gain into how unproved conjectures may relate to existing proofs or each other.
Only in the strictest possible sense does showing X tell us nothing about showing Y: a proof or identification of X is not itself a proof or identification of the related thing Y. But that is an obviously pedantic statement.
Not to critique the person being quoted; I feel like an offhand remark may have been unduly elevated by being quoted, thanks to a writer's "two sides to every story" dramatization reflex.
That's par for a field whose name seems to have been inspired by "reverse-engineering", which by construction doesn't try to understand products that have not yet reached the market :)
Well, but you can also reverse engineer nature (or the output of gradient descent) or the products of an adversary.
The pigeonhole principle informs us that if we have more pigeons than holes, some holes have two or more pigeons. But also that if we have fewer pigeons than holes, some holes will necessarily be empty.
Given two bit strings of length n, if we compare fewer than n pairs of bits, we cannot tell whether the strings are equal. The strings being equal is the proposition (a_0 = b_0) ∧ (a_1 = b_1) ∧ ... ∧ (a_{n-1} = b_{n-1}), and you cannot simplify this formula in a way that drops any of the atomic comparisons a_i = b_i.
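A minimal sketch of the adversary argument behind that claim (the function name and interface are mine, purely illustrative): any tester that skips some index can be handed two strings that agree on every index it looked at and differ only at the skipped one.

    # Hypothetical adversary: given the set of indices a tester compares,
    # return two unequal strings the tester cannot tell apart, if possible.
    def fool(tester_indices, n):
        skipped = next((i for i in range(n) if i not in tester_indices), None)
        if skipped is None:
            return None  # all n pairs were compared; the tester cannot be fooled
        a = [0] * n
        b = list(a)
        b[skipped] = 1   # differ only where the tester never looked
        return a, b

    print(fool({0, 1, 3}, 4))  # ([0, 0, 0, 0], [0, 0, 1, 0])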
Backwards game of life: https://m.youtube.com/watch?v=g8pjrVbdafY
I couldn't see any citations or references in that video or its description. It presents it as him solving the problem himself, but I'm sure other people have written about solving the Game of Life in reverse with SAT solvers prior to this...
Edit: here's a paper on it from 2006, https://link.springer.com/chapter/10.1007/978-3-540-76928-6_...
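For anyone curious what that looks like in practice, here is a small sketch of a predecessor search using the Z3 solver's Python bindings (not the video's or the paper's exact method; the grid size and target pattern are made up for the example):

    # Find a Game-of-Life predecessor of a target pattern with z3
    # (pip install z3-solver).
    from z3 import Bool, BoolVal, If, Sum, Or, And, Solver, sat, is_true

    N = 5  # small grid; cells outside it are treated as dead

    # Target: a vertical "blinker" we want a previous generation for.
    target = [[False] * N for _ in range(N)]
    for r in (1, 2, 3):
        target[r][2] = True

    # One Boolean unknown per cell of the previous generation.
    prev = [[Bool(f"c_{r}_{c}") for c in range(N)] for r in range(N)]

    def neighbours(r, c):
        # Live-neighbour count of cell (r, c) in the previous generation.
        terms = []
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0) and 0 <= r + dr < N and 0 <= c + dc < N:
                    terms.append(If(prev[r + dr][c + dc], 1, 0))
        return Sum(terms)

    s = Solver()
    for r in range(N):
        for c in range(N):
            n = neighbours(r, c)
            # Conway's rule: alive next step iff 3 live neighbours,
            # or alive now with exactly 2.
            next_alive = Or(n == 3, And(prev[r][c], n == 2))
            s.add(next_alive == BoolVal(target[r][c]))

    if s.check() == sat:
        m = s.model()
        for r in range(N):
            print("".join("#" if is_true(m.evaluate(prev[r][c], model_completion=True))
                          else "." for c in range(N)))
    else:
        print("no predecessor fits on this small grid")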
The approach reminds me of NP-completeness (computational hardness vs. mathematical proof hardness). Am I over-simplifying?
You’re right on the money. It’s intimately tied to computability theory, complexity theory’s more abstract sibling (i.e. the Halting problem and, further, Turing degrees). Both rely on the core techniques of diagonalization and reductions. The meat of it can differ a lot, because estimating time bounds and determining logical equivalence rapidly become different problem spaces, so it’s not like results in one are really applicable to the other. But a researcher in one will usually be well versed in the other.
I would say that's on the wrong track and that "hard" is the wrong term for what reverse math (RM) tells you about problems. RM studies what axioms you need to prove a given theorem, but it's more like showing that "to pound in a nail of size X you need a hammer of size Y and a smaller hammer just can't do it", than saying pounding in the nail is difficult or complicated. Once you have the big enough hammer, pounding the nail can be very simple.
I'm pretty far away from learning about these things in school, but this made me wonder on the connection between the mentioned communication complexity lower bound and special relativity limits on how fast information can travel.
In the Game of Life community, people use "c" to refer to the maximum speed at which a GoL figure can travel, which is one cell per generation vertically and one cell per generation horizontally.
Overall complexity (work required) is a conserved quantity. You can move it around and claim that a new algorithm has reduced the complexity, but in essence it has shifted the complexity elsewhere.
Also, whether some problem has polynomial complexity or exponential complexity depends on what you consider as the problem space. Complexity of b^c is polynomial if b is the problem space and exponential if c is the problem space.
Complexity of traveling salesman problem depends on what you consider as the problem space - number of cities or number of connections between cities.
I'm not sure if you're just choosing intentionally obtuse verbiage or if you're actually saying something completely incoherent.
"Overall Complexity" is a meaningless term without you defining it. I suppose you mean that there is some lower bound limit to the amount of work relative to a model of computation, but this is a trivial statement, because the limit is always at least no work.
We have many problems where we don't know the least upper bound, so even an interesting formulation of your idea is not necessarily true: work need not be conserved at the least upper bound, because reaching the bound may not be possible and epsilon improvement might always be possible.
Finally, algorithmic complexity is the wrong analogy for reverse mathematics anyway.
To give an example, we consider binary search to require less work than linear search. But there are costs and use-case considerations involved. Inserting a new record needs a binary search to keep the data sorted. Also, if the number of lookups is far smaller than the number of records, the overall cost is higher than simply appending and doing a linear search. That's what I mean by moving the complexity around.
A problem scenario doesn't have absolute characteristics. It's relative to your way of looking at it, and your definition of a problem.
You are right, but this doesn't mean that the amount of work is conserved as your original message implies. The correct statement would be that "algorithmic complexity is just one aspect of actual practical complexity and an algorithm with better algorithmic complexity can end up performing worse in reality due to practical considerations of the data and the processor doing the computations".
> Complexity of traveling salesman problem depends on what you consider as the problem space - number of cities or number of connections between cities.
lmao what?
The Travelling Salesman Problem in 1 dimension, on a line, is trivial; I wonder what the connection is between the number of dimensions and the hardness of problems like this.
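For the trivial 1-D case, a minimal sketch (the helper name and example coordinates are made up): an optimal closed tour just sweeps from the leftmost city to the rightmost and back.

    # 1-D TSP: visit cities in sorted order; an optimal closed tour has
    # length 2 * (rightmost - leftmost).
    def tsp_1d(cities):
        order = sorted(cities)
        return order, 2 * (order[-1] - order[0])

    print(tsp_1d([3.0, -1.5, 7.2, 0.0]))  # ([-1.5, 0.0, 3.0, 7.2], 17.4)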
I think the reason this is interesting to mathematicians is that he was working with an axiomatic system that is fairly new, and _in particular_ is thought not to be strong enough to prove the pigeonhole principle. Since he proved that all these other theorems are equivalent to the pigeonhole principle, all of those other theorems are probably also not provable in PV1.
I usually don't say this about Quantamagazine, but thank you for covering this and informing me about it. The perspective is insightful, and after understanding the concepts a little more, I don't think it's hyperbole. I'm currently reading through the mentioned paper, "An Introduction to Feasible Mathematics and Bounded Arithmetic for Computer Scientists" by Jiatu Li, and I find it greatly elucidating.
If you're reading the paper, I recommend Section 1.3, where it goes over the examples of binary search and Dijkstra's algorithm. The idea from the preface that "natural numbers are encoded by binary strings just as other data structures" runs throughout the constructions of their proofs. As a computer scientist, I find this great because I intuitively recognize that both algorithms and memory consist of only 1's and 0's underneath all the abstractions and code.
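As a reminder of what that Section 1.3 example is about (this is just the textbook algorithm, not the paper's bounded-arithmetic formalization of it): binary search touches only about log2(n) elements, a step count bounded by the bit length of the input size, which is the sense in which it is "feasible".

    # Textbook binary search; the loop runs at most ~log2(len(xs)) times.
    def binary_search(xs, target):
        lo, hi = 0, len(xs) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if xs[mid] == target:
                return mid
            if xs[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1  # not found

    print(binary_search([1, 3, 5, 8, 13], 8))  # 3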
This work ties together practical applications and complexity theory to create a new bridge in mathematics that I'm excited about exploring. I'm especially interested in reverse mathematics applied to space complexity.
Here are some additional resources I found on this: a talk by Jiatu Li (joint work with Lijie Chen and Igor Carboni Oliveira), "Reverse Mathematics of Complexity Lower Bounds": https://www.youtube.com/watch?v=g5EqAgDxxE0