Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an ample token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low-complexity tasks