RAG application architecture
A production-style RAG application that searches culinary knowledge, retrieves the most relevant passages, and generates clear answers with visible source evidence. Built to show practical AI engineering, not just a prompt box.
This page explains the engineering behind the demo from an audience perspective: what problem it solves, what technologies are used, and why the implementation is more professional than a simple AI wrapper.
The system follows a retrieval-first approach: documents are split into chunks, each chunk is converted into an embedding, queries are matched by vector similarity, and the top-ranked passages are passed to the AI model as grounded context.
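The retrieval-first flow above can be sketched end to end. This is a minimal, self-contained illustration, not the app's actual code: the `embed` function is a toy hash-based stand-in for a real embedding model, and the chunk size, document text, and prompt wording are assumptions chosen to keep the sketch runnable offline.

```python
import hashlib
import math

def embed(text: str, dims: int = 16) -> list[float]:
    # Toy stand-in for a real embedding model (an assumption for this sketch):
    # hashes words into a fixed-size vector so the pipeline runs offline.
    vec = [0.0] * dims
    for word in text.lower().split():
        h = int(hashlib.md5(word.encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(document: str, size: int = 12) -> list[str]:
    # Split the document into fixed-size word windows.
    words = document.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank chunks by dot product against the query embedding, keep top k.
    q = embed(query)
    scored = sorted(chunks, key=lambda c: -sum(a * b for a, b in zip(q, embed(c))))
    return scored[:k]

doc = ("Searing meat at high heat triggers the Maillard reaction, building browned flavor. "
       "Emulsions such as vinaigrette combine oil and vinegar with the help of an emulsifier.")
passages = retrieve("why does browning create flavor", chunk(doc))
# The retrieved passages become grounded context for the model.
prompt = "Answer using only these sources:\n" + "\n".join(passages)
```

In the real system each of these steps is backed by proper infrastructure (an embedding model, a vector index, and a generation API), but the shape of the flow is the same.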
Stores documents, their chunks, page metadata, and embeddings in a structure that supports scalable search and clear display of sources alongside answers.
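One way to picture that storage layout is a single chunk table carrying the parent document, page metadata, and embedding together. The schema, table name, and sample row below are assumptions for illustration, not the app's actual database:

```python
import json
import sqlite3

# Assumed schema: one row per chunk, keeping the metadata needed to show
# source evidence (document title, page) next to each retrieved passage.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE chunks (
        id INTEGER PRIMARY KEY,
        doc_title TEXT,
        page INTEGER,
        text TEXT,
        embedding TEXT  -- JSON-encoded vector; production systems use a vector store
    )
""")
conn.execute(
    "INSERT INTO chunks (doc_title, page, text, embedding) VALUES (?, ?, ?, ?)",
    ("Culinary Techniques", 42,
     "Maillard browning accelerates at high surface temperatures.",
     json.dumps([0.1, 0.3, 0.2])),
)
row = conn.execute("SELECT doc_title, page FROM chunks").fetchone()
# row carries exactly the metadata the UI needs for visible source display.
```

Keeping page metadata on every chunk is what makes the "visible source evidence" in the demo possible: each retrieved passage can be traced back to a document and page.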
Uses vector similarity to find meaning-based matches, allowing the app to retrieve relevant material even when the question is worded differently from the source text.
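The meaning-based matching rests on cosine similarity between embedding vectors. The vectors below are hypothetical three-dimensional examples (real embeddings have hundreds of dimensions); they show how differently worded but related texts point in similar directions:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    # Cosine similarity: how closely two vectors point in the same direction,
    # independent of their magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical embeddings for illustration only:
browning = [0.9, 0.1, 0.2]   # "why does searing brown food?"
maillard = [0.8, 0.2, 0.3]   # passage about the Maillard reaction
weather  = [0.1, 0.9, 0.1]   # unrelated passage

cosine(browning, maillard)   # high: related meaning despite different wording
cosine(browning, weather)    # low: unrelated meaning
```

Because the comparison is geometric rather than lexical, a query about "browning" can still surface a passage that only ever says "Maillard reaction".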
Supports strict, balanced, creative, and recipe modes so the same backend can respond differently depending on the user’s goal.
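Mode handling like this is often a small lookup from mode name to system prompt and sampling settings. The mode names come from the text above, but the temperatures, prompt wording, and request shape below are assumptions sketched for illustration:

```python
# Assumed per-mode settings; the real app's prompts and parameters may differ.
MODES = {
    "strict":   {"temperature": 0.0, "system": "Answer only from the retrieved sources."},
    "balanced": {"temperature": 0.4, "system": "Prefer the sources; fill small gaps carefully."},
    "creative": {"temperature": 0.9, "system": "Use the sources as inspiration."},
    "recipe":   {"temperature": 0.3, "system": "Format the answer as a step-by-step recipe."},
}

def build_request(mode: str, question: str, context: str) -> dict:
    # Same backend, same retrieved context; only the system prompt and
    # temperature change with the selected mode.
    cfg = MODES[mode]
    return {
        "temperature": cfg["temperature"],
        "messages": [
            {"role": "system", "content": cfg["system"]},
            {"role": "user", "content": f"{question}\n\nSources:\n{context}"},
        ],
    }
```

The design choice here is that modes never change retrieval: every mode sees the same evidence, and only the generation behavior varies.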
Designed with environment variables, service deployment, secret handling, and Google Cloud Run compatibility in mind.
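A deployment-minded configuration layer typically reads everything from the environment. The variable names below (other than `PORT`, which Cloud Run injects into every container) are assumptions for this sketch:

```python
import os

# Cloud Run sets PORT at runtime; default to 8080 for local development.
PORT = int(os.environ.get("PORT", "8080"))

# Assumed variable names for illustration: model choice and secrets come from
# the environment, never from source code.
EMBED_MODEL = os.environ.get("EMBED_MODEL", "text-embedding-3-small")
API_KEY = os.environ.get("LLM_API_KEY", "")  # supplied by the secret manager in production
```

Reading configuration this way keeps the same container image usable across local development and Cloud Run, with secrets injected at deploy time rather than baked in.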
Choose a mode, ask a question, and inspect the retrieved sources. The search covers cooking theory, techniques, and the source pages they come from.