2026, Year of Reinforcement Learning?

aimlbling-about.ninerealmlabs.com

10 points by namnnumbr 2 days ago

I think the next big thing will actually be test-time training. It will represent another unbelievable increase in compute, but it will produce an even bigger jump than what thinking models provided.

Some food for thought: if you think AGI should dynamically learn and get better at arbitrary skills on the fly, then LLMs + SGD is already a sort of slow-moving AGI.

thtgrisdjdjdh - a day ago

Works only for verifiable rewards, since humans (thankfully) don't have a good theory of knowledge (epistemology).

There's only so far that these agents can go.
