How SimplyRAG works
Turn your data into searchable, chat-ready knowledge in five steps. No infrastructure to manage—connect, create, index, and ask.
Add your data sources
Organize into knowledge bases
Chunk, embed, store
Chat or API
Accurate, traceable
Step-by-step breakdown
Connect your data
Upload documents (PDF, text), connect SQL databases, crawl websites, or sync Firebase/Firestore. Each source feeds into a collection—your knowledge base.
Create collections
Create a collection for each use case—e.g. product docs, internal wiki, support FAQs. Add your OpenAI and Pinecone keys (BYOK, bring your own keys). SimplyRAG handles the rest.
Automatic indexing
Your content is chunked, embedded with OpenAI, and stored in Pinecone. No infrastructure to manage. For SQL collections, the schema is analyzed for natural-language queries.
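The chunking step can be pictured in a few lines of plain Python. This is an illustrative sketch, not SimplyRAG's actual implementation; the chunk size and overlap values here are arbitrary assumptions.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so context
    isn't lost at chunk boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

doc = "SimplyRAG indexes your documents. " * 20
chunks = chunk_text(doc)
# Consecutive chunks share `overlap` characters, so a sentence split
# across a boundary still appears whole in at least one chunk.
```

Each chunk would then be sent to an embedding model and the resulting vector stored in Pinecone alongside its source metadata.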
Ask questions
Use the built-in Chat UI or call REST APIs. Ask in plain language—SimplyRAG retrieves relevant chunks, sends them to the LLM, and returns answers with citations.
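Calling the REST API from your own code might look like the sketch below. The endpoint URL, auth header, and payload field names here are hypothetical placeholders, not the documented API; check the API reference for the real ones.

```python
import json

# Hypothetical endpoint and payload shape -- consult the actual API
# reference for the real URL, auth scheme, and field names.
API_URL = "https://api.example.com/v1/collections/product-docs/ask"
headers = {
    "Authorization": "Bearer YOUR_API_KEY",  # placeholder key
    "Content-Type": "application/json",
}
payload = {"question": "What file types can I upload?"}

# With the `requests` library installed, the call would be:
#   resp = requests.post(API_URL, headers=headers, json=payload)
#   data = resp.json()  # expected to contain the answer and its citations
print(json.dumps(payload))
```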
Get answers with citations
Answers include source references so you can verify. Embed the chat widget in your app, or use the API to power search, support bots, or internal tools.
Understanding RAG
What is RAG?
Retrieval-Augmented Generation (RAG) is a technique that combines a large language model (LLM) with your own data. When you ask a question, the system first searches your documents for relevant content, then sends that content to the LLM as context. The model generates an answer grounded in your data—with source citations—instead of relying only on its training.
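In code, the idea reduces to "retrieve, then generate." The toy sketch below uses simple word overlap for retrieval and a placeholder where the LLM call would go; production systems (including SimplyRAG) use embeddings instead, and the document names and functions here are made up for illustration.

```python
docs = {
    "refunds.md": "Refunds are issued within 14 days of purchase.",
    "shipping.md": "Orders ship within 2 business days worldwide.",
}

def retrieve(question: str) -> tuple[str, str]:
    """Return the (source, text) pair whose words overlap the question most."""
    q_words = set(question.lower().split())
    def score(item):
        return len(q_words & set(item[1].lower().split()))
    return max(docs.items(), key=score)

def answer(question: str) -> str:
    source, context = retrieve(question)
    prompt = f"Answer using only this context:\n{context}\n\nQ: {question}"
    # A real system sends `prompt` to an LLM here; we return it with a
    # citation to show the grounding-plus-citation pattern.
    return f"{prompt}\n[source: {source}]"

print(answer("How long do refunds take?"))
```

The two steps are the whole trick: retrieval narrows the model's input to your content, and the citation ties the answer back to its source.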
Why RAG?
Plain LLMs only know what they were trained on—and that data has a cutoff date. They can't access your internal docs, your product specs, or your latest policies. RAG fixes that: you connect your data, and the model answers from it. You get accurate, up-to-date answers without retraining. It's the standard way to make AI useful for your business.
- Answers from your content, not generic training
- Always up to date when you update your data
- Citations so you can verify sources
RAG vs basic LLM search
A basic LLM chat answers from its training data only—no access to your files. RAG adds a retrieval step: it searches your indexed documents first, finds the relevant chunks, and feeds them to the LLM as context. The result is an answer grounded in your data, with citations, and far fewer hallucinations from outdated or irrelevant training data.
Basic LLM:
Question → LLM → Answer (from training only)
RAG:
Question → Search your data → LLM + context → Answer (from your data, cited)
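The "search your data" step is typically a vector similarity search: the question and each chunk are embedded, and the nearest chunks win. A minimal sketch with hand-made vectors (real embeddings come from a model such as OpenAI's, and these toy numbers are assumptions):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings"; a real index holds model-generated vectors.
chunks = {
    "pricing chunk": [0.9, 0.1, 0.0],
    "setup chunk": [0.1, 0.8, 0.2],
}
question_vec = [0.85, 0.15, 0.05]  # pretend embedding of "How much does it cost?"

best = max(chunks, key=lambda name: cosine(chunks[name], question_vec))
print(best)  # the pricing chunk points in nearly the same direction
```

A vector database like Pinecone performs this same nearest-neighbor comparison, just at scale over millions of stored vectors.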
The RAG flow in SimplyRAG
Your documents are chunked, embedded, and stored. When you ask a question, SimplyRAG retrieves the most relevant chunks, sends them to the LLM, and returns an answer with citations.
Ready to try it?
Get started in minutes. No credit card required.