Tweet by Dwarkesh Patel: The most interesting part for me is where @karpathy describes why LLMs aren't able to learn like humans. As you would expect, he comes up with a wonderfully evocative phrase to describe RL: “sucking supervision bits through a straw.” A single end reward gets broadcast across… https://t.co/3guOwdewKd pic.twitter.com/lYonLgrukB