Tweet by Eli Lifland: Takeaways re: AI R&D performance: 1. Claude 3.5 Sonnet reaches ~50th percentile human baseline 8-hour performance. 2. Sonnet Old -> New is a 0.2 jump in 4 months. We're 0.6 away from 90th percentile baselines. I think this significantly shortens my timelines. (caveats in reply) https://t.co/6svf0cu8EF pic.twitter.com/VzldSvcMCU
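The arithmetic behind point 2 can be sketched as a naive linear extrapolation. This is only an illustration, not the author's stated method: it assumes the 0.2 gain per 4 months continues at a constant rate, which the tweet does not claim (and the "caveats in reply" may qualify). The function name `months_to_close_gap` and its parameters are hypothetical.

```python
def months_to_close_gap(gap: float, gain: float, period_months: float) -> float:
    """Months needed to close `gap`, ASSUMING progress stays linear.

    gap:           remaining score gap (0.6 in the tweet)
    gain:          score gained over the observed period (0.2 in the tweet)
    period_months: length of the observed period (4 months in the tweet)
    """
    if gain <= 0:
        raise ValueError("gain must be positive for the extrapolation to make sense")
    return gap / gain * period_months


# Figures taken directly from the tweet; the linear trend is an assumption.
months = months_to_close_gap(gap=0.6, gain=0.2, period_months=4.0)
print(f"Naive linear estimate: about {months:.0f} months to the 90th percentile baseline")
```

Under that assumption the gap closes in roughly a year; faster or slower progress between model releases would shift the estimate proportionally.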