Introduction
From IBM Watson to OpenAI’s ChatGPT, today’s developments in artificial intelligence are being put to work, built on increasingly capable foundation models trained on vast swathes of human knowledge. AI agents can answer frequently-asked questions, help with programming, and classify input according to different criteria.
OpenAI Assistants allow us to build chatbots with custom knowledge and functionality that we can tailor to our particular needs. For example, we can instruct the bot to use language appropriate for different audiences, modifying its responses to suit the general public or adopting the terminology used by subject matter experts.
Beyond the knowledge already included in general-use LLMs, we can use a technique called Retrieval-Augmented Generation (RAG) to supplement them with additional knowledge to pull from.
What is Retrieval-Augmented Generation?
Retrieval-Augmented Generation (RAG) is a way to supplement an LLM by pulling from external knowledge bases, and can be as simple as uploading a file whose contents are then made available to the LLM.
This can be used to equip a bot with information specific to its users (e.g. internal company policy) or information about things that frequently change over time, like a set of documents we’re actively working on. Typically, an external process will be used to select the most relevant document to give the AI—very much like giving it a cheat sheet in the middle of a conversation.
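That document-selection step can be sketched with a toy example. Below, a bag-of-words frequency vector stands in for a real embedding model, and cosine similarity picks the document closest to the question; the function names and sample documents are illustrative, not part of any OpenAI API:

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words frequency vector.
    A real RAG pipeline would use a learned embedding model instead."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_relevant(question, documents):
    """Pick the document most similar to the question -- the 'cheat sheet'
    that gets handed to the LLM alongside the user's message."""
    q = embed(question)
    return max(documents, key=lambda d: cosine(q, embed(d)))

docs = [
    "Project 1 submissions must be exported as PDF.",
    "The final exam is open-book for all textbooks in the syllabus.",
]
print(most_relevant("What format should I submit for Project 1?", docs))
# → Project 1 submissions must be exported as PDF.
```

Production systems swap the toy vectors for dense embeddings and a vector database, but the shape of the retrieval step is the same.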
An important consideration in generative AI is the risk of confabulation (more commonly referred to as “hallucination”). Due to the statistical nature of LLMs, they can output plausible-sounding but incorrect information, especially if the answer isn’t in their original knowledge base. Insufficient measures to prevent this can have very tangible consequences. The reference document we’re giving the model focuses its attention on the facts it needs to answer the user’s question.
We can also configure it to cite its sources. By giving specific references, we improve explainability and verifiability: the bot specifies exactly what part of the document it’s referencing in its answer, allowing us to check what the source document itself says. It acts more like an expert at searching for information and interpreting it, without us having to worry about remembering whether a document’s keywords were “football” or “soccer”.
Lastly, RAG is more flexible and cost-effective than other ways of adding custom information, such as fine-tuning, which involves training the LLM further and can be expensive and time-consuming in dataset preparation.
Creating a RAG-powered Assistant with OpenAI
Let’s create a Teaching Assistant using the OpenAI Playground.
We’re teaching a hypothetical course, and would like to have a bot that’s always available to answer frequently-asked questions. Our sample FAQ sheet looks like this:
DS 101: Intro to Data Science FAQ Sheet
Project 1 Questions
1. What format should I submit? All submissions for Project 1 must be exported as PDF. No images or Word files, please.
2. Can I share data with others? Sharing is encouraged. You will be scored on your analysis of the data.
Project 1 Deadline: May 5, 2024 8:00 PM Philippine Time
Project 2 Questions
1. Can I use the data collected in Project 1? Yes.
2. What format should I submit? Submissions should be CSV or TSV.
Deadline: October 5, 2024 8:00 PM Philippine Time
Final Exam Guidelines
1. The exam will be open-book for all textbooks listed in the syllabus.
2. Notebooks are not allowed.
3. Laptops and phones are not allowed.
4. Answers must be written on A4-sized paper in black ink. No pencils allowed.
Simply press the Create button to get started.
The first step is to give our Assistant a distinctive and easy-to-remember Name. If you’re part of an organization, this will be visible to all members of the organization.
If you have multiple Assistants with similar functions, you’ll want to make it easy to distinguish between them.
The Instructions field guides the personality of the Assistant, and defines its goals. It is always visible to the Assistant regardless of how long the chat goes on.
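For our Teaching Assistant, the Instructions might look something like the following. This is just one illustrative wording, not a prescribed template:

```
You are a Teaching Assistant for DS 101: Intro to Data Science.
Answer student questions using only the uploaded FAQ sheet.
Cite the relevant FAQ entry in every answer.
If the answer is not in the FAQ, say so and suggest the student
contact the instructor instead of guessing.
```

Note the last line: explicitly telling the Assistant what to do when it doesn’t know the answer helps reduce confabulation.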
On the Instructions field, you can press the arrows to expand the text area for easier typing.
The Model field specifies the base LLM that will run your Assistant. GPT-4 models are the latest and most powerful of OpenAI’s models, and perform better at logic and instruction-following. However, this comes with tradeoffs in response speed and pricing.
For this exercise, we’ll use gpt-4-turbo-preview, but it’s a great idea to try out the other models to familiarize yourself with their respective strengths and weaknesses. For example, only some GPT-3.5 models are equipped to do Retrieval.
Function Calling and the Code Interpreter are amazing features that should have their own articles. For now, let’s focus on the third tool, Retrieval.
By simply toggling it on and uploading our files, we now have a RAG-powered Teaching Assistant!
We can take it for a test run by typing our queries into the main chat box on the right, and pressing Run. For small files, the entire file is put into the chatbot’s memory. For bigger external knowledge bases, only the most relevant chunks (according to their embeddings) are made available to the bot.
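To see what that chunking looks like, here is a minimal sketch of splitting a document into overlapping pieces before they would be embedded. The chunk size and overlap values are illustrative choices for this sketch, not OpenAI’s actual parameters:

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character chunks.
    The overlap keeps sentences that straddle a chunk boundary
    retrievable from at least one chunk."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# A repeated FAQ line stands in for a longer document (940 characters).
faq = "Project 1 submissions must be exported as PDF. " * 20
chunks = chunk_text(faq)
print(len(chunks), len(chunks[0]))
# → 7 200
```

Each chunk would then be embedded, and only the chunks closest to the user’s question are placed in the model’s context window.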
In the screenshot below, the [1] marker identifies which file the answer was taken from, which corresponds to our uploaded FAQ sheet.
On the top-right side of the chat is the cleanup button, which you can use to reset the conversation state and start from scratch. This is useful for testing how consistently the Assistant responds to the same question asked multiple times, or how its behavior changes after uploading new data or modifying its instructions.
On the lower-left side is the clone button, which can be used to roll out multiple Assistants that largely share the same features. For example, we could have a different Teaching Assistant for each year level we’re teaching, each with its own set of FAQs.
Conclusion
With just a few steps and zero coding, we can build a chatbot powered by Retrieval-Augmented Generation and quickly explore what it can do with our own documents.
This is just the simplest case for getting started with RAG. Check out OpenAI’s documentation for how to work with the Assistants API in order to embed it into your applications, write and execute code, query external APIs for live data, and more. What knowledge would you want to see in an Assistant?
While building a simple RAG chatbot yourself is possible, for real-world business applications, you need a robust and expertly crafted solution. That’s where Bonito Tech can help! We can craft custom RAG chatbots tailored to your specific needs. Contact us today for a free consultation and see how AI can empower your workforce and elevate your customer experience.