New ask Hacker News story: Ask HN: How would you architect a RAG system for 10M+ documents today?

2 by Ftrea | 0 comments on Hacker News.
I'm tasked with building a private AI assistant for a corpus of 10 million text documents (living in PostgreSQL). The goal is semantic search and chat, with a requirement for regular incremental updates.

I'm trying to decide between:
Bleeding edge: Implementing something like LightRAG or GraphRAG.
Proven stack: Standard Hybrid Search (Weaviate/Elastic + Reranking) orchestrated by tools like Dify.

For those who have built RAG at this scale:
What is your preferred stack for 2025?
Is the complexity of Graph/LightRAG worth it over standard chunking/retrieval for this volume?
How do you handle maintenance and updates efficiently?

Looking for architectural advice and war stories.
