Summarizer

LLM Input

llm/2ad2a7bb-5462-4391-a2da-bf11064993c9/topic-17-1d8f316b-9503-493f-9acd-8ed7e57142b4-input.json


prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Academic vs Practical Intelligence # Distinction between Gemini excelling at academic benchmarks while feeling less useful for practical tasks, discussion of book smart vs street smart analogies for AI capabilities
</topic>

<comments_about_topic>
1. My experience also shows that Gemini has unique strength in “generalized” (read: not coding) tasks. Gemini 2.5 Pro and 3 Pro seem stronger at math and science for me, and their Deep Research usually works the hardest, as long as I run it during off-hours. Opus seems to beat Gemini almost “with one hand tied behind its back” in coding, but Gemini is so cheap that it’s usually my first stop for anything that I think is likely to be relatively simple. I never worry about my quota on Gemini like I do with Opus or ChatGPT.

Comparisons generally seem to change much faster than I can keep my mental model updated. But the performance lead of Gemini on more ‘academic’ explorations of science, math, engineering, etc has been pretty stable for the past 4 months or so, which makes it one of the longer-lasting trends for me in comparing foundation models.

I do wish I could more easily get timely access to the “super” models like Deep Think or o3 pro. I never seem to get a response to requesting access, and have to wait for public access models to catch up, at which point I’m never sure if their capabilities have gotten diluted since the initial buzz died down.

They all still suck at writing an actually good essay/article/literary or research review, or other long-form things which require a lot of experienced judgement to come up with a truly cohesive narrative. I imagine this relates to their low performance in humor - there’s just so much nuance and these tasks represent the pinnacle of human intelligence. Few humans can reliably perform these tasks to a high degree of performance either. I myself am only successful some percentage of the time.

2. Yes, agentic-wise, Claude Opus is best. Complex coding is GPT-5.x. But for smartness, I always felt Gemini 3 Pro is best.

3. Can you give an example of smartness where Gemini is better than the other 2? I have found Gemini 3 Pro the opposite of smart on the tasks I gave it (evaluation, extraction, copywriting, judging, synthesising), with GPT 5.2 xhigh first and Opus 4.5/4.6 second. Not to mention it likes to hallucinate quite a bit.

4. I must be holding these things wrong because I'm not seeing any of these godlike superpowers everyone seems to enjoy.

5. Who said they’re godlike today?

And yes, you are probably using them wrong if you don’t find them useful or don’t see the rapid improvement.

6. It's hard to evaluate "logic" and "math", since they're made up of many largely disparate things. But I think modern AI models are clearly better at coding, for example, than 99% of the population. If you asked 100 people at your local grocery store why your goroutine system was failing, do you think multiple of them would know the answer?

7. Their models might be impressive, but their products absolutely suck donkey balls. I’ve given Gemini web/CLI two months and ran away back to ChatGPT. Seriously, it would just COMPLETELY forget context mid-dialog. When asked about improving air quality it just gave me a list of (mediocre) air purifiers without asking for any context whatsoever, and I can list thousands of conversations like that. Shopping or comparing options is just nonexistent.
It uses Russian propaganda sources for answers and switches to Chinese mid-sentence (!), while explaining some generic Python functionality.
It’s an embarrassment and I don’t know how they justify the 20 euro price tag on it.

8. I agree. On top of that, in true Google style, basic things just don't work.

Any time I upload an attachment, it just fails with something vague like "couldn't process file". Whether that's a simple .MD or .txt with less than 100 lines or a PDF. I tried making a gem today. It just wouldn't let me save it, with some vague error too.

I also tried having it read and write stuff to "my stuff" and Google drive. But it would consistently write but not be able to read from it again. Or would read one file from Google drive and ignore everything else.

Their models are seriously impressive. But as usual Google sucks at making them work well in real products.

9. These benchmarks are super impressive. That said, Gemini 3 Pro benchmarked well on coding tasks, and yet I found it abysmal. A distant third behind Codex and Claude.

Tool calling failures, hallucinations, bad code output. It felt like using a coding model from a year ago.

Even just as a general-use model, somehow ChatGPT has a smoother integration with web search (than Google!!), knowing when to use it, and not needing me to prompt it directly multiple times to search.

Not sure what happened there. They have all the ingredients in theory but they've really fallen behind on actual usability.

Their image models are kicking ass though.

10. I gather this isn't intended as a consumer product. It's for academia and research institutions.

11. Gemini has always felt like someone who was book smart to me. It knows a lot of things. But if you ask it to do anything that is off-script, it completely falls apart.

12. ARC-AGI 2 is an IQ test. IQ tests have been shown over and over to have predictive power in humans. People who score well on them tend to be more successful

13. Top 10 Elo on Codeforces is pretty absurd.

14. Nonsense releases. Until they allow for medical diagnosis and legal advice who cares? You own all the prompts and outputs but somehow they can still modify them and censor them? No.

These 'AI' are just sophisticated data collection machines, with the ability to generate meh code.
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.

topic

Academic vs Practical Intelligence # Distinction between Gemini excelling at academic benchmarks while feeling less useful for practical tasks, discussion of book smart vs street smart analogies for AI capabilities

commentCount
