Space is limited

Building Advanced RAG Applications

This project-based (theory-light) course explores building vector-search applications integrated with Generative Large Language Models (LLMs). Along the way we’ll dive into best practices for text preprocessing, vectorization, indexing, and reranking using a Weaviate database. We’ll compare keyword-based and semantic search, and use industry-standard evaluation metrics to benchmark our results. Ultimately we’ll bring the project together by plugging into OpenAI’s GPT-3.5 Turbo for Question Answering, and wrap it up in a Streamlit user interface. An optional portion of the course covers fine-tuning a vector embedding model to further enhance our retrieval results.

Chris Sanchez
Senior Data Science Manager at Microsoft
Price
US$ 400
or included with membership
Duration
3 weeks

Course taught by expert instructors

Chris Sanchez is a Senior Data Scientist at Microsoft working in the Office of the CTO under the Strategic Mission & Technologies division. Prior to his current role he focused on building Information Retrieval (IR) systems for customers in the national security domain. During that time he pioneered semantic-based search methods that focused on relevance at the whole-document level. His domain knowledge in the national security arena stems from his prior military career with Naval Special Warfare. He holds a Master’s degree in Data Science from UC Berkeley.

The course

Learn and apply skills with real-world projects.

Who is it for?
  • Software engineers - integrate vector search as part of an overall application’s tech stack

  • Data scientists - gain a better understanding of the tradeoffs between keyword and vector-based search and create benchmark datasets that allow for direct comparisons between the two methods

  • Students/recent college grads - gain hands-on experience building your first vector search application tied to the Question Answering capability of an LLM (OpenAI GPT-3.5 Turbo)

Prerequisites
  • Uplimit Search Fundamentals course or professional/academic experience working with search engines such as OpenSearch/Elasticsearch/Solr/Vespa. We will not be teaching search fundamentals in this course.

  • Minimum of 1 year of coding in Python, including the following skills: OOP (including inheritance), dictionary and list comprehensions, lambda functions, and virtual environments

  • Ability to comfortably navigate and launch applications from the command line

  • Familiarity with Docker

  • Nice to have but not strictly required: experience fine-tuning an ML model, familiarity with the Streamlit API, familiarity with the OpenAI API

Learn
  • Embedding Theory: What is a document?
  • Preprocessing and chunking strategies
  • Vector Indexing on an OpenSearch database
  • Comparing embedding approaches
  • Comparison of Keyword and Vector retrieval
Project
  • Create a search system in a development environment using a popular podcast series as the data.
  • Compare and evaluate the initial system by benchmarking Keyword and Vector retrieval approaches.
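As a rough illustration of what benchmarking Keyword and Vector retrieval can look like, here is a minimal sketch that scores two retrievers against a small golden dataset using hit rate and mean reciprocal rank (MRR). The metric choice, the function name `hit_rate_and_mrr`, and the toy queries and document IDs are assumptions made for this example; they are not taken from the course materials.

```python
from typing import Callable, Dict, List


def hit_rate_and_mrr(
    golden: Dict[str, str],
    retrieve: Callable[[str], List[str]],
    k: int = 5,
) -> Dict[str, float]:
    """Score a retriever against a golden dataset.

    golden maps each query to the ID of its single relevant document.
    retrieve returns a ranked list of document IDs for a query.
    """
    if not golden:
        raise ValueError("golden dataset must not be empty")
    hits = 0
    reciprocal_rank_sum = 0.0
    for query, relevant_id in golden.items():
        top_k = retrieve(query)[:k]
        if relevant_id in top_k:
            hits += 1
            # Rank is 1-based, so the first position contributes 1.0.
            reciprocal_rank_sum += 1.0 / (top_k.index(relevant_id) + 1)
    total = len(golden)
    return {"hit_rate": hits / total, "mrr": reciprocal_rank_sum / total}


# Hypothetical golden dataset and canned results standing in for real searches.
golden = {"q1": "d1", "q2": "d2"}
keyword_results = {"q1": ["d1", "d3"], "q2": ["d4", "d5"]}
vector_results = {"q1": ["d3", "d1"], "q2": ["d2", "d4"]}

keyword_scores = hit_rate_and_mrr(golden, lambda q: keyword_results[q])
vector_scores = hit_rate_and_mrr(golden, lambda q: vector_results[q])
print("keyword:", keyword_scores)
print("vector:", vector_scores)
```

In a real project the two lambdas would be replaced by calls to the keyword and vector search backends, so both methods are scored on the same queries with the same cutoff `k`.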
Learn
  • Hybrid Search with a Reranker
  • Evaluation with golden dataset
  • Experimenting with different values
  • Answer synthesis with LLM
  • Best performance contest
Project
  • Build on the previous week’s system and evaluate retrieval performance after adding a Reranker model.
  • Integrate the system with OpenAI’s ChatGPT LLM to answer questions about the data.
Learn
  • Adding context to retrieved results
  • System evaluation
  • Displaying your results through a Streamlit UI
  • Optional - Embedder fine-tuning and reranker fine-tuning
Project
  • Build on the previous week’s system and evaluate overall system performance.
  • Display the final product through a Streamlit UI.

A course you'll actually complete. AI-powered learning that drives results.

AI-powered learning

Transform your learning programs with personalized learning. Real-time feedback, hints at just the right moment, and support for learners when they need it drive 15x engagement.

Live courses by leading experts

Our instructors are renowned experts in AI, data, engineering, product, and business. Deep dive through always-current live sessions and round-the-clock support.

Practice on the cutting edge

Accelerate your learning with projects that mirror the work done at industry-leading tech companies. Put your skills to the test and start applying them today.

Flexible schedule for busy professionals

We know you’re busy, so we made it flexible. Attend live events or review the materials at your own pace. Our course team and global community will support you every step of the way.

Completion certificates

Each course comes with a certificate for learners to add to their resume.

Best-in-class outcomes

15-20x engagement compared to async courses

Support & accountability

You are never alone; we provide support throughout the course.

Get reimbursed by your company

More than half of learners get their Courses and Memberships reimbursed by their company.

Hundreds of companies have dedicated L&D and education budgets that have covered the costs.

Still not sure?

Get in touch and we'll help you decide.

Keep in touch for updates, discounts, and new courses.

Questions? Ask us anything at hello@uplimit.com

© 2021-2024 Uplimit