Building RAG Applications
This project-based (theory-light) course explores how to build vector-search applications integrated with the capabilities of generative Large Language Models (LLMs). Along the way we’ll dive into best practices for text preprocessing, vectorization, indexing, and reranking using a Weaviate database. We’ll compare keyword-based and semantic search, and use industry-standard evaluation metrics to benchmark our results. Finally, we’ll bring the project together by plugging into OpenAI’s GPT-3.5 Turbo for Question Answering and wrapping it in a Streamlit user interface. Optional portions of the course include fine-tuning a vector embedding model to further improve retrieval results.
Course taught by expert instructors
Chris Sanchez
Senior Data Science Manager at Microsoft
Chris Sanchez is a Senior Data Scientist at Microsoft working in the Office of the CTO under the Strategic Mission & Technologies division. Prior to his current role, he focused on building Information Retrieval (IR) systems for customers in the national security domain. During that time, he pioneered semantic search methods that focused on relevance at the whole-document level. His domain knowledge in the national security arena stems from his prior military career with Naval Special Warfare. He holds a Master’s degree in Data Science from UC Berkeley.
The course
Learn and apply skills with real-world projects.
Software engineers - integrate vector search as part of an overall application’s tech stack
Data scientists - gain a better understanding of the tradeoffs between keyword and vector-based search and create benchmark datasets that allow for direct comparisons between the two methods
Students/recent college grads - gain hands-on experience building your first vector search application tied to the Question Answering capability of an LLM (OpenAI GPT-3.5 Turbo)
Uplimit Search Fundamentals course or professional/academic experience working with search engines such as OpenSearch/Elasticsearch/Solr/Vespa. We will not be teaching search fundamentals in this course.
Minimum 1 year of coding in Python, including the following skills: OOP (including inheritance), dictionary and list comprehensions, lambda functions, and virtual environments
Ability to comfortably navigate and launch applications from the command line
Familiarity with Docker
Nice to have but not strictly required: experience fine-tuning an ML model, familiarity with the Streamlit API, familiarity with the OpenAI API
- Learn
- Embedding Theory: What is a document?
- Preprocessing and chunking strategies
- Vector Indexing on an OpenSearch database
- Comparing embedding approaches
- Comparison of Keyword and Vector retrieval
Project: Create a search system in a development environment using a popular podcast series as the data.
- Compare and evaluate the initial system by benchmarking Keyword and Vector retrieval approaches.
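As a rough illustration of the chunking topic listed above, here is a minimal sketch of one common strategy: fixed-size word windows with overlap. The function name `chunk_words` and its default parameters are hypothetical and chosen only for this example; the course may well use a different splitter, such as sentence-based or token-based chunking.

```python
def chunk_words(text: str, chunk_size: int = 5, overlap: int = 2) -> list[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Hypothetical helper for illustration; not taken from the course materials.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text,
        # so we do not emit a trailing chunk that is pure overlap.
        if start + chunk_size >= len(words):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_words("a b c d e f g h", chunk_size=4, overlap=2):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of indexing some words twice.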
- Learn
- Hybrid Search with a Reranker
- Evaluation with golden dataset
- Experimenting with different values
- Answer synthesis with LLM
- Best performance contest
Project: Build on the previous week’s system and evaluate retrieval performance after adding a Reranker model.
- Integrate system with OpenAI’s ChatGPT LLM to answer questions about data.
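Evaluation against a golden dataset, as listed above, usually means checking whether each query's known-relevant document appears in the top results. The sketch below computes two widely used metrics, hit rate and mean reciprocal rank (MRR), for two hypothetical retrievers. The query IDs, document IDs, and rankings are made-up toy data, not course results, and the course may use different metrics or tooling.

```python
def hit_rate(results: dict[str, list[str]], golden: dict[str, str], k: int = 3) -> float:
    """Fraction of queries whose relevant doc appears in the top-k results."""
    hits = sum(1 for query, doc in golden.items() if doc in results.get(query, [])[:k])
    return hits / len(golden)


def mrr(results: dict[str, list[str]], golden: dict[str, str], k: int = 3) -> float:
    """Mean reciprocal rank of the relevant doc within the top-k (0 if absent)."""
    total = 0.0
    for query, doc in golden.items():
        top_k = results.get(query, [])[:k]
        if doc in top_k:
            total += 1.0 / (top_k.index(doc) + 1)
    return total / len(golden)


# Toy golden dataset: each query maps to its single relevant document ID.
golden = {"q1": "d1", "q2": "d2"}

# Hypothetical ranked outputs from two retrieval approaches.
keyword_results = {"q1": ["d1", "d5", "d7"], "q2": ["d9", "d8", "d3"]}
vector_results = {"q1": ["d4", "d1", "d6"], "q2": ["d2", "d3", "d9"]}

if __name__ == "__main__":
    for name, results in [("keyword", keyword_results), ("vector", vector_results)]:
        print(name, hit_rate(results, golden), mrr(results, golden))
```

Hit rate only asks whether the relevant document was retrieved at all, while MRR also rewards ranking it higher, so reporting both gives a fuller comparison between approaches.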
- Learn
- Adding context to retrieved results
- System evaluation
- Displaying your results through a Streamlit UI
- Optional - Embedder fine-tuning and reranker fine-tuning
Project: Build on the previous week’s system and evaluate overall system performance.
- Display the final product through a Streamlit UI.
A course you'll actually complete. AI-powered learning that drives results.
AI-powered learning
Transform your learning programs with personalized learning: real-time feedback, hints at just the right moment, and support for learners when they need it, driving 15x engagement.
Live courses by leading experts
Our instructors are renowned experts in AI, data, engineering, product, and business. Deep dive through always-current live sessions and round-the-clock support.
Practice on the cutting edge
Accelerate your learning with projects that mirror the work done at industry-leading tech companies. Put your skills to the test and start applying them today.
Flexible schedule for busy professionals
We know you’re busy, so we made it flexible. Attend live events or review the materials at your own pace. Our course team and global community will support you every step of the way.
Completion certificates
Each course comes with a certificate for learners to add to their resume.
Best-in-class outcomes
15-20x engagement compared to async courses
Support & accountability
You are never alone; we provide support throughout the course.
Get reimbursed by your company
More than half of learners get their Courses and Memberships reimbursed by their company.
Hundreds of companies have dedicated L&D and education budgets that have covered the costs.