https://www.youtube.com/watch?v=UabBYexBD4k&list=PPSV&t=404s
Here are both versions.
Tighter blog-style summary
Is RAG still needed? Yes — but not everywhere.
As large language models gain ever-larger context windows, the central design question in AI systems is no longer whether we can inject external knowledge, but how we should do it.
That is where the RAG versus long-context discussion becomes interesting.
The core idea is simple: long-context prompting lets you place more source material directly into the model’s prompt, while RAG retrieves only the most relevant pieces at runtime. As context windows expand, some use cases that once required retrieval pipelines can now be solved with a simpler architecture and fewer moving parts.
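To make the contrast concrete, here is a minimal sketch of the two prompt-assembly strategies. All names are hypothetical, and the keyword-overlap retriever is a toy stand-in: a production RAG system would use embedding-based vector search and a real LLM API.

```python
def build_long_context_prompt(question: str, documents: list[str]) -> str:
    """Long-context approach: place all source material directly in the prompt."""
    context = "\n\n".join(documents)
    return f"Context:\n{context}\n\nQuestion: {question}"


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the question.
    A real system would use vector similarity instead."""
    q_terms = set(question.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_terms & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_rag_prompt(question: str, documents: list[str]) -> str:
    """RAG approach: retrieve and send only the most relevant pieces."""
    context = "\n\n".join(retrieve(question, documents))
    return f"Context:\n{context}\n\nQuestion: {question}"
```

The long-context prompt grows linearly with the corpus; the RAG prompt stays bounded by `k`, which is exactly the cost-versus-simplicity tradeoff discussed above.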
But that does not make RAG obsolete.
RAG still matters when the knowledge base is large, frequently updated, proprietary, or too expensive to send in full with every request. In those environments, retrieval remains a practical way to control cost, reduce prompt bloat, and keep responses grounded in the most relevant information.
So the real takeaway is not that one approach has replaced the other. It is that the decision has become architectural.
Use long context when the dataset is bounded, the workflow benefits from simplicity, and the model needs broad visibility across the material. Use RAG when scale, freshness, selectivity, or enterprise data access matter more.
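That decision rule can be sketched as a small helper. This is an illustrative heuristic, not a prescription: the function name, parameters, and the 0.8 headroom factor are all assumptions made for the example.

```python
def choose_strategy(corpus_tokens: int,
                    context_window: int,
                    frequently_updated: bool,
                    proprietary_at_scale: bool) -> str:
    """Pick a knowledge-injection strategy using the criteria above:
    bounded data that fits the window favors long context; scale,
    freshness, or enterprise data access favor retrieval."""
    # Leave ~20% of the window for instructions and the model's answer
    # (an assumed margin, tune for your own system).
    fits_in_window = corpus_tokens < context_window * 0.8
    if fits_in_window and not frequently_updated and not proprietary_at_scale:
        return "long-context"
    return "RAG"
```

Note that the dynamic and proprietary flags override raw size: even a small corpus that changes hourly is usually better served by retrieval than by re-sending stale context.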
The future is not “RAG or long context.” It is choosing the right tradeoff between simplicity, cost, scale, and grounding.
⸻
Bullet notes for LinkedIn
Is RAG still needed? Yes — but it is no longer the automatic answer for every LLM system.
A few key takeaways:
• Bigger context windows are changing the design space.
• Some workflows that used to require RAG can now be handled with long-context prompting.
• Long context reduces architectural complexity: fewer moving parts, less retrieval orchestration.
• But RAG still makes sense when data is:
 • large
 • dynamic
 • proprietary
 • too expensive to include in every prompt
• So this is not about one approach “winning.”
• It is about choosing the right tradeoff between:
 • simplicity
 • cost
 • scale
 • response grounding
Bottom line: Long context reduces the need for RAG in some cases. It does not eliminate the need for retrieval in serious knowledge-heavy systems.
I can also turn this into a more polished CorebaseIT-style post with a stronger hook and closing paragraph.