Summarizer

LLM Input

llm/302a36fb-79e1-4f4b-b047-e145d20e4497/batch-2-3d952bdc-6bd2-4444-9d97-a7186f6f021e-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. CMU Database Group Teaching
   Related: Praise for CMU's eccentric teaching style including gangsta intros, DJ sets before lectures, and unique course materials on YouTube covering database internals for building systems
2. SQLite Production Usage
   Related: Discussion of SQLite's viability in production, WAL mode for concurrent writes, single-file simplicity, Litestream backups, limitations for multi-user systems, and comparisons to traditional databases
3. DuckDB Use Cases
   Related: Enthusiasm for DuckDB's columnar storage, JSON handling, WASM support, S3 integration, and use as analytical complement to SQLite for OLAP workloads
4. SQLite-DuckDB Integration
   Related: Interest in combining SQLite for writes/OLTP with DuckDB for reads/analytics, discussing watermarks, sync strategies, and latency tradeoffs between row and columnar storage
5. MCP Security Concerns
   Related: Skepticism about MCP database access opposing least privilege principles, risks of unfettered LLM access, hallucination-driven SQL injection, and need for guardrails and monitoring
6. Immutable Bi-temporal Databases
   Related: Advocacy for XTDB and Datomic for fintech compliance, discussion of audit requirements, time-travel queries, and lack of production-ready options in this category
7. PostgreSQL vs MySQL Popularity
   Related: Debate over metrics measuring database popularity, distinguishing installed base from new project adoption, noting momentum shift toward PostgreSQL despite MySQL's larger deployment footprint
8. Embedded Database Benefits
   Related: Discussion of local databases without network overhead, caching implications, RAM management differences from server databases, and when to migrate to PostgreSQL
9. MySQL Project Concerns
   Related: Commentary on Oracle firing MySQL open-source team, project becoming rudderless, MariaDB financial problems, and potential impact on ecosystem
10. Database Consolidation Trends
   Related: Concern about software development gravitating toward same tools like PostgreSQL and React, loss of diversity and nuance in technical decisions
11. JSON in Databases
   Related: Appreciation for JSON field support in modern databases, the -> and ->> JSON operators in SQLite, and DuckDB's superior JSON handling with columnar extraction
12. EdgeDB/Gel Acquisition Impact
   Related: Disappointment about Gel sunsetting after Vercel acquisition, appreciation for EdgeQL language design, and discussion of community fork efforts
13. Time Series Databases
   Related: Questions about time series database developments, mentions of QuestDB, ClickHouse's experimental time series engine, and need for InfluxDB alternatives
14. Enterprise Database Omissions
   Related: Noting absence of Oracle, MS SQL Server, DB2 from article despite being top-ranked databases, discussion of boring enterprise tech that powers critical systems
15. Database Caching Strategies
   Related: Discussion of PostgreSQL's built-in caching benefits versus SQLite requiring custom read caching, Redis/memcached integration, and CDN layer caching
16. Write Scalability Patterns
   Related: Analysis of SQLite's write throughput capabilities, serial write handling, edge sharding with Cloudflare D1, and when single-node architecture suffices
17. Vector Database Developments
   Related: Brief mentions of Milvus features for RAG, vector indexing in DuckDB, and general traction of vector databases in AI ecosystem
18. Nested Transactions for Agents
   Related: Technical discussion of MVCC databases providing isolated snapshots for agent playgrounds, nested transaction support, and preventing accidental commits
19. File Format Competition
   Related: Interest in new formats challenging Parquet including Vortex, F3, AnyBlox, discussion of format interoperability problems and WASM decoder approaches
20. TiDB Momentum
   Related: Question about TiDB adoption in Silicon Valley as OLTP/OLAP hybrid, seeking commentary on its position in database landscape
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "46498817",
  "text": "> I can't believe that article has no mention of SQLite ??\n\nhttps://www.cs.cmu.edu/~pavlo/blog/2026/01/2025-databases-re..."
}
,
  
{
  "id": "46501912",
  "text": "No MSSQL, DB2 or Oracle either. Anything this proven & stable is probably not worth blogging about in this context. SQLite gets a lot of attention on HN but that's a bit of an exception."
}
,
  
{
  "id": "46503221",
  "text": "Same. CMD-F, 'sqlite', no hits, skip and go straight to comments."
}
,
  
{
  "id": "46499360",
  "text": "> Acquisitions ... Gel → Vercel\n\nis a bit misleading. Gel (formerly EdgeDB) is sunsetting its development. The (extremely talented) team joins Vercel to work on other stuff.\n\nThat was a hard hit for me in December. I loved working with EdgeQL so much."
}
,
  
{
  "id": "46500059",
  "text": "It is a beautifully designed language and would make a great starting point for future DB projects."
}
,
  
{
  "id": "46498512",
  "text": "No mention of DuckDB? Surprising."
}
,
  
{
  "id": "46502502",
  "text": "Also somewhat surprised. DuckDB traction is impressive and on par with vector databases in their early phases. I think there's a good chance it will earn an honorable mention next year if adoption holds and becomes more mainstream. But my impression is that it's still early in its adoption curve where only those \"in the know\" are using it as a niche tool. It also still has some quirks and foot-guns that need moderately knowledgeable systems people to operate (e.g. it will happily OOM your DB)"
}
,
  
{
  "id": "46498689",
  "text": "Same surprise here. However in practice, the community tends to talk about DuckDB more like a client-side tool than a traditional database"
}
,
  
{
  "id": "46498341",
  "text": "I would like to mention that vector databases like Milvus got lots of new features to support RAG, Agent development, features like BM25, hybrid search etc.."
}
,
  
{
  "id": "46498939",
  "text": "Also emmer (which is perhaps too niche to get mentioned in an article like this), which focuses more on being a quick/flexible 'data scratchpad', rather than just scale.\n\nhttps://hub.docker.com/r/tiemster/emmer"
}
,
  
{
  "id": "46498972",
  "text": "nice to see it get mentioned here :), I like using it also for scripts etc. Quite flexible because you can do everything with the api."
}
,
  
{
  "id": "46497742",
  "text": "Didn't know MongoDB was suing the company behind FerretDB. That's disgusting."
}
,
  
{
  "id": "46501670",
  "text": "Andy has a balanced and appropriate take here."
}
,
  
{
  "id": "46503863",
  "text": "With a trend towards immutable single writer databases MMAP seems like a massive win."
}
,
  
{
  "id": "46498331",
  "text": "Barely any mention of Oracle or MS SQL Server, commonly reckoned to be #1 and #3 most used databases in the world\n\nhttps://db-engines.com/en/ranking"
}
,
  
{
  "id": "46498388",
  "text": "Oracle is mentioned at the start, where he proclaims the \"dominance\" of Postgres and then admits its newest features have been in Oracle for nearly a quarter of a century already. The dominance he's talking about is only about how many startups raise how many millions from investors, not anything technical.\n\nAnd then of course at the end he has a whole section about Larry Ellison, like always."
}
,
  
{
  "id": "46498983",
  "text": "Isn't it because it's about news , as in what's changing, rather than being about what's staying the same? He's a researcher, so his interests are always going to be more oriented toward new systems and new companies more than the big dominant systems."
}
,
  
{
  "id": "46500713",
  "text": "There's nothing technically new that he's covering here though? It's all just startups adding stuff to Postgres that Oracle had for decades already."
}
,
  
{
  "id": "46500931",
  "text": "The startups are new."
}
,
  
{
  "id": "46496867",
  "text": "Nothing about time series-oriented databases?"
}
,
  
{
  "id": "46498819",
  "text": "> Nothing about time series-oriented databases?\n\nhttps://www.cs.cmu.edu/~pavlo/blog/2026/01/2025-databases-re..."
}
,
  
{
  "id": "46497043",
  "text": "Not much happened I guess. Clickhouse has got an experimental time series engine : https://clickhouse.com/docs/engines/table-engines/special/ti..."
}
,
  
{
  "id": "46506147",
  "text": "QuestDB at least is gaining some popularity: https://questdb.com/\n\nI was hoping to learn about some new potentially viable alternatives to InfluxDB, alas it seems I'll continue using it for now."
}
,
  
{
  "id": "46497278",
  "text": "Over here, it is DB2, SQL Server or Oracle if using a plain RDBMS, or whatever DB abstraction layer is provided on top of a SaaS product, where we get to query with some kind of ORM abstraction preventing raw SQL, or GraphQL, without knowing the implementation details."
}
,
  
{
  "id": "46497620",
  "text": "This sounds like a flashback to J2EE. Which I know is still alive and well. Banks, insurance companies and the tax agency do not much care for fancy new stuff, but that it works."
}
,
  
{
  "id": "46500415",
  "text": "I describe these techs like garbage trucks. No one likes to see them but they’re there every day doing a decent part of what it takes to hold society together hah."
}
,
  
{
  "id": "46500534",
  "text": "Scott Hanselman has a good term for all these kind of jobs, the dark matter developers.\n\nhttps://www.hanselman.com/blog/dark-matter-developers-the-un..."
}
,
  
{
  "id": "46498126",
  "text": "Yep, Fortune 500 enterprise consulting, boring technology that pays the bills.\n\nJava, .NET, C++, nodejs, Sitecore, Adobe Experience Manager, Optimizely, SAP, Dynamics, headless CMSes,..."
}
,
  
{
  "id": "46499218",
  "text": "Never felt so old, seeing nodejs in a list of old boring stuff."
}
,
  
{
  "id": "46500545",
  "text": "Yeah, it is on the edge, but unavoidable in many Web projects."
}
,
  
{
  "id": "46501479",
  "text": "Andy is probably the only person who adores Larry Ellison (Oracle) unironically."
}
,
  
{
  "id": "46504549",
  "text": "Ironically unironically."
}
,
  
{
  "id": "46496833",
  "text": "I love these yearly review posts. Thanks Andy and team."
}
,
  
{
  "id": "46499720",
  "text": "we had to restrict ours to views only because it kept trying to run updates. still breaks sometimes when it hallucinates column names but at least it can't do anything destructive"
}
,
  
{
  "id": "46498637",
  "text": "> \"The Dominance of PostgreSQL Continues\"\n\nIt seems like the author is more focused on database features than user base. Every metric I can find online says that MySQL/MariaDB is more popular than PostgreSQL. PostgreSQL seems \"better\" (more features, better standards compliance) but MySQL/MariaDB works fine for many people. Am I living in a bubble?"
}
,
  
{
  "id": "46503942",
  "text": "> Every metric I can find online says that MySQL/MariaDB is more popular than PostgreSQL\n\nWhat are those metrics? If you're talking about things like db-engines rankings, those are heavily skewed by non-production workloads. For example, MySQL still being the database for Wordpress will forever have a high number of installations and developers using and asking StackOverflow questions. But when a new company or established company is deciding which new database to use for their custom application, MySQL is seldom in the running like it was 8-10 years ago."
}
,
  
{
  "id": "46501142",
  "text": "Popularity can mean multiple things. Are we talking about how frequently a database is used or how frequently a database is chosen for new projects? MySQL will always be very popular because some very popular things use it like WordPress.\n\nIt does feel like a lot of the momentum has shifted to PostgreSQL recently. You even see it in terms of what companies are choosing for compatibility. Google has a lot more MySQL work historically, but when they created a compatibility interface for Cloud Spanner, they went with PostgreSQL. ClickHouse went with PostgreSQL. More that I'm forgetting at the moment. It used to be that everyone tried for MySQL wire compatibility, but that doesn't feel like what's happening now.\n\nIf MySQL is making you happy, great. But there has certainly been a shift toward PostgreSQL. MySQL will continue to be one of the most used databases just as PHP will remain one of the most used programming languages. There's a lot of stuff already built with those things. I think most metrics would say that PHP is more widely deployed than NodeJS, but I think it'd be hard to argue that PHP is what the developer community is excited about.\n\nEven search here on HN. In the past year, 4 MySQL stories with over 100 points compared to 28 PostgreSQL stories with over 100 points (and zero MariaDB stories above 100 points and 42 SQLite). What are we talking about here on HN? Not nearly as frequently MySQL - we're talking about SQLite and PostgreSQL. That's not to say that MySQL doesn't work great for you or that it doesn't have a large installed base, but it isn't where our mindshare is about the future."
}
,
  
{
  "id": "46504999",
  "text": "> ClickHouse went with PostgreSQL.\n\nWhat do you mean by this? AFAIK they added MySQL wire protocol compatibility long before they added Postgres. And meanwhile their cloud offering still doesn't support Postgres wire protocol today, but it does support MySQL wire protocol.\n\n> Even search here on HN.\n\nfwiw MySQL has been extremely unpopular on HN for a decade or more, even back when MySQL was a more common choice for startups. So there's a bit of a self-fulfilling prophecy where MySQL ecosystem folks mostly stopped submitting stories here because they never got enough upvotes to rank high enough to get eyeballs and discussion.\n\nThat all said, I do agree with your overall thesis."
}
,
  
{
  "id": "46498693",
  "text": "I think author is basing his observations on where the money is flowing.\nPostgreSQL adjacent startups and businesses are seeing a lot of investment."
}
,
  
{
  "id": "46498898",
  "text": "> Am I living in a bubble?\n\nThere are rumblings that the MySQL project is rudderless after Oracle fired the team working on the open-source project in September 2025. Oracle is putting all its energy in its closed-source MySQL Heatwave product. There is a new company that is looking to take over leadership of open-source MySQL but I can't talk about them yet.\n\nThe MariaDB Corporation financial problems have also spooked companies and so more of them are looking to switch to Postgres."
}
,
  
{
  "id": "46501485",
  "text": "> There are rumblings that the MySQL project is rudderless after Oracle fired the team working on the open-source project in September 2025.\n\nNot just the open-source project; 80%+ (depending a bit on when you start counting) of the MySQL team as a whole was let go, and the SVP in charge of MySQL was, eh, “moving to another part of the org to spend more time with his family”. There was never really a separate “MySQL Community Edition team” that you could fire, although of course there were teams that worked mostly or entirely on projects that were not open-sourced."
}
,
  
{
  "id": "46499898",
  "text": "How is SpacetimeDB not mentioned here?"
}
,
  
{
  "id": "46500018",
  "text": "> How is SpacetimeDB not mentioned here?\n\nhttps://www.cs.cmu.edu/~pavlo/blog/2026/01/2025-databases-re..."
}
,
  
{
  "id": "46502851",
  "text": "Why does \"database\" in surveys like this not include DuckDB and SQLite, which are great [1] embedded answers to Clickhouse and PostgreSQL. Both are excellent and useful databases; DuckDB's reasonable syntax, fast vectorized everything, and support for ingesting the hairiest of data as in-DB ETL make me reach for it first these days, at least for the things I want to do.\n\nWhy is it that in \"I'm a serious database person\" circles, the popular embedded databases don't count?\n\n[1] Yes, I know it's not an exact comparison."
}
,
  
{
  "id": "46502986",
  "text": "TiDB has gained some momentum in silicon valley with companies looking to adopt it. Does he have any commentary on TiDB which is an OLTP and OLAP hybrid?"
}
,
  
{
  "id": "46498813",
  "text": "Can we even say that Anyblox is a file format? By my understanding of the project it's \"just\" a decoder for other file formats to solve the MxN problem."
}
,
  
{
  "id": "46500178",
  "text": "It's so weird how everyone nowadays is using Postgres. It's not like end users can see your database.\n\nIt's disturbing how everyone is gravitating towards the same tools. This started happening since React and kept getting worse. Software development sucks nowadays.\n\nAll technical decisions about which tools to use are made by people who don't have to use the tools. There is no nuance anymore. There's a blanket solution for every problem and there isn't much to choose from. Meanwhile, software is less reliable than it's ever been.\n\nIt's like a bad dream. Everything is bad and getting worse."
}
,
  
{
  "id": "46504091",
  "text": "What's wrong with postgres?"
}
,
  
{
  "id": "46503203",
  "text": "Which alternatives to PostgreSQL would you like to see get more attention?"
}
,
  
{
  "id": "46505428",
  "text": "All of them. Nothing wrong with Postgres, I like Postgres. But the more alternatives the better. My favorite database is RethinkDB but officially, it's a dead project. Unofficially it's still pretty great."
}

]
</comments_to_classify>

Based on the comments above, assign each to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  
{
  "id": "comment_id_1",
  "topics": [
    1,
    3,
    5
  ]
}
,
  
{
  "id": "comment_id_2",
  "topics": [
    2
  ]
}
,
  
{
  "id": "comment_id_3",
  "topics": [
    0
  ]
}
,
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.

commentCount

50
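
The output rules stated in the prompt (a bare JSON array, one entry per comment, at most 3 topics each, indices from 0 to 20) can be checked mechanically before a response is accepted. The following is a minimal sketch of such a check, not part of the Summarizer itself: the function name `validate_classification`, the constants, and the idea of passing in the expected comment ids are all assumptions made for illustration.

```python
import json

# Assumptions taken from the prompt above: topics are numbered 1..20,
# 0 means "does not fit", and a comment gets at most 3 topics.
MAX_TOPIC_INDEX = 20
MAX_TOPICS_PER_COMMENT = 3


def validate_classification(raw_response, expected_ids):
    """Return a list of problems found in a model response (empty if none)."""
    try:
        items = json.loads(raw_response)
    except json.JSONDecodeError as exc:
        return [f"response is not valid JSON: {exc}"]
    if not isinstance(items, list):
        return ["top-level value is not a JSON array"]

    expected = set(expected_ids)
    problems = []
    seen = set()
    for item in items:
        # Every entry must be an object with "id" and "topics".
        if not isinstance(item, dict) or "id" not in item or "topics" not in item:
            problems.append(f"malformed entry: {item!r}")
            continue
        cid, topics = item["id"], item["topics"]
        if not isinstance(cid, str):
            problems.append(f"id is not a string: {cid!r}")
            continue
        if cid not in expected:
            problems.append(f"unknown comment id: {cid}")
        if cid in seen:
            problems.append(f"duplicate comment id: {cid}")
        seen.add(cid)
        if not isinstance(topics, list) or len(topics) > MAX_TOPICS_PER_COMMENT:
            problems.append(f"{cid}: topics must be a list of at most 3 entries")
            continue
        for topic in topics:
            # bool is a subclass of int in Python, so reject it explicitly.
            is_int = isinstance(topic, int) and not isinstance(topic, bool)
            if not is_int or not 0 <= topic <= MAX_TOPIC_INDEX:
                problems.append(f"{cid}: invalid topic index {topic!r}")

    # Report comments the model skipped entirely.
    for cid in sorted(expected - seen):
        problems.append(f"missing comment id: {cid}")
    return problems
```

A caller would pass the raw model output together with the ids from the `comments_to_classify` block (50 of them for this batch) and retry or flag the batch if the returned list is non-empty.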
