How SimplyRAG works

Turn your data into searchable, chat-ready knowledge in five steps. No infrastructure to manage: connect, organize, index, ask, and verify.

1. Add your data sources (Documents, SQL, Websites, Firebase)
2. Organize into knowledge bases (Collections)
3. Chunk, embed, store (Index)
4. Chat or API (Chat)
5. Accurate, traceable answers (Citations)

Step-by-step breakdown

1. Connect your data

Upload documents (PDF, text), connect SQL databases, crawl websites, or sync Firebase/Firestore. Each source feeds into a collection—your knowledge base.

2. Create collections

Create a collection for each use case—e.g. product docs, internal wiki, support FAQs. Add your OpenAI and Pinecone keys (BYOK). SimplyRAG handles the rest.
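Under the hood this is a single REST call. Below is a minimal sketch of what a collection-creation request might look like, assuming a JSON API; the endpoint URL, headers, and field names are illustrative placeholders, not SimplyRAG's documented schema.

```python
# Sketch: creating a BYOK collection via a hypothetical SimplyRAG endpoint.
# Endpoint path and field names are assumptions, not the real API.

def build_create_collection_request(name: str, openai_key: str, pinecone_key: str) -> dict:
    """Assemble the request a client might send (hypothetical schema)."""
    return {
        "method": "POST",
        "url": "https://api.simplyrag.example/v1/collections",  # placeholder URL
        "headers": {"Content-Type": "application/json"},
        "json": {
            "name": name,
            # BYOK: you supply your own provider keys
            "openai_api_key": openai_key,
            "pinecone_api_key": pinecone_key,
        },
    }

req = build_create_collection_request("product-docs", "sk-...", "pc-...")
```

One collection per use case keeps retrieval focused: a question about refunds never has to compete with wiki pages for context space.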

3. Automatic indexing

Your content is chunked, embedded with OpenAI, and stored in Pinecone. No infrastructure to manage. For SQL collections, the schema is analyzed for natural-language queries.
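The chunking step can be sketched in a few lines. This is a naive fixed-size chunker with overlap; the window and overlap sizes are illustrative assumptions (SimplyRAG's actual parameters aren't documented here), and the embed/upsert steps are indicated only in comments.

```python
# Sketch of the indexing step: fixed-size character chunking with overlap.

def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap  # step forward, keeping `overlap` chars of context
    return chunks

doc = "RAG grounds LLM answers in your own data. " * 20
chunks = chunk_text(doc)
# Each chunk would then be embedded (e.g. with an OpenAI embedding model)
# and upserted into a Pinecone index alongside source metadata for citations.
```

The overlap keeps sentences that straddle a chunk boundary retrievable from at least one chunk.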

4. Ask questions

Use the built-in Chat UI or call REST APIs. Ask in plain language—SimplyRAG retrieves relevant chunks, sends them to the LLM, and returns answers with citations.
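The "sends them to the LLM" step amounts to assembling the retrieved chunks into a prompt. A minimal sketch, assuming a numbered-context template so the model can cite sources by index; SimplyRAG's actual prompt is not public.

```python
# Sketch: assembling retrieved chunks into an LLM prompt (template is illustrative).

def build_prompt(question: str, chunks: list[str]) -> str:
    """Number each chunk so the model can cite sources as [1], [2], ..."""
    context = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer using only the context below and cite sources by number.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

prompt = build_prompt(
    "How long do refunds take?",
    ["Refunds are issued within 14 days.", "Support is available 24/7."],
)
```

Numbering the context is what makes the citation step downstream possible: the model's "[1]" maps back to a specific source chunk.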

5. Get answers with citations

Answers include source references so you can verify. Embed the chat widget in your app, or use the API to power search, support bots, or internal tools.


Understanding RAG

What is RAG?

Retrieval Augmented Generation (RAG) is a technique that combines a large language model (LLM) with your own data. When you ask a question, the system first searches your documents for relevant content, then sends that content to the LLM as context. The model generates an answer grounded in your data—with source citations—instead of relying only on its training.

Your data → Retrieve → Generate
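The retrieve-then-generate loop can be shown end to end with toy embeddings. Here simple word-count vectors stand in for a real embedding model so the example runs offline; a production system would use learned embeddings and a vector database, and would send the retrieved chunk to an LLM rather than returning it directly.

```python
# Toy RAG loop: index chunks, retrieve the best match, then "answer".
from collections import Counter
import math

def embed(text: str) -> Counter:
    """Fake embedding: bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# 1. Index: embed each chunk once, up front.
chunks = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
]
index = [(c, embed(c)) for c in chunks]

# 2. Retrieve: rank chunks by similarity to the question.
question = "When are refunds issued?"
best, _ = max(index, key=lambda item: cosine(embed(question), item[1]))

# 3. Generate: a real system sends `best` to the LLM as grounding context;
# here we just surface the retrieved chunk, which doubles as the citation.
```

Even with this crude similarity measure, the refund question retrieves the refund chunk, which is the core idea: search first, then let the model answer from what was found.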

Why RAG?

Plain LLMs only know what they were trained on—and that data has a cutoff date. They can't access your internal docs, your product specs, or your latest policies. RAG fixes that: you connect your data, and the model answers from it. You get accurate, up-to-date answers without retraining. It's the standard way to make AI useful for your business.

  • Answers from your content, not generic training
  • Always up to date when you update your data
  • Citations so you can verify sources

RAG vs basic LLM search

A basic LLM chat answers from its training data only, with no access to your files. RAG adds a retrieval step: it searches your indexed documents first, finds the relevant chunks, and feeds them to the LLM as context. The result is an answer that comes from your data, with citations, and far fewer hallucinations from outdated or irrelevant training data.

Basic LLM:

Question → LLM → Answer (from training only)

RAG:

Question → Search your data → LLM + context → Answer (from your data, cited)

The RAG flow in SimplyRAG

Your documents are chunked, embedded, and stored. When you ask a question, SimplyRAG retrieves the most relevant chunks, sends them to the LLM, and returns an answer with citations.

Your data → Embed & index → Retrieve & answer

Ready to try it?

Get started in minutes. No credit card required.