Data Retrieval

When creating a RAG-enabled chat endpoint, we can either use the getDataRetriever method to obtain an instance of a vector store retriever, or we can directly provide configurations for data retriever to the chat endpoint as the retrieverConfig.

This may or may not include data ingestion whereby we load the data, generate embeddings and store them in a vector store. To learn about data ingestion, check Data Ingestion.

Retrieving Data

When a RAG-enabled chat endpoint received a query, it uses the provided data retriever to retrieve the relevant context information from the vector store to which the data retriever belongs.

The underlying process is as follows:

Query Embedding Generation: Each data retriever is attached to a vector store, and a vector store has an embedding model specified for it. This embedding model is used to generate embedding vectors for the given user query.
Vector Store Search: The generated embedding vectors are used to search the vector store to retrieve the most relevant context information. You can configure your data retriever to use different types of search algorithms, such as similarity matching or Maximal Marginal Relevance (MMR), or specify the number of documents to retrieve. All such configurations will greatly impact the performance of your RAG chat endpoint. We recommend you start with default parameters, but always test with different configurations to find the best one for your use case.
Conversion to String: The retrieved context information is converted back to a string format so it can be embedded in the query that will be sent to the LLM model. A specific prompt template maybe used to structure the query and context information in a way that the LLM model can understand and generate a meaningful response.
Response Generation: The query with the context information is sent to the LLM model, which generates a response based on the query and context information.

Retriever Config

The data retriever configuration object is used to specify the data loader, data splitter, and vector store to use. The configuration object can also include options for generating embeddings, search options, and other configurations.

The getDataRetriever method and the retrieverConfig property in the chat endpoint configuration accept the same configuration object. Below are the properties that can be used to configure the retriever:

Required properties

filePath: The path to the file to load the data from.

Optional properties

dataType: The type of data to load. This helps ascertain the best splitting strategy. When providing filePath, specifying dataType is optional. If not specified, the data type is inferred from the file extension. If not providing filePath, i.e., you are specifying docs or splitDocs, you must provide the dataType, since it cannot be inferred from the file extension.
docs: An array of Document objects containing the data to load. This is useful when you want to load data from a source not supported by QvikChat by default. You can use any LangChain-supported data loader (opens in a new tab) to load the data and provide the documents as the docs property. If providing docs, you do not need to provide filePath but you must provide the dataType.
splitDocs: An array containing documents that have been processed through a data splitter. This is useful when you want to load data from a source not supported by QvikChat by default. You can use any LangChain-supported text splitter (opens in a new tab) to split the data and provide the split documents as the splitDocs property. If providing splitDocs, you do not need to provide docs. You must provide the dataType.
jsonLoaderKeysToInclude: An object containing the keys to include when loading JSON data. This is useful when you want to load only specific keys from the JSON data.
csvLoaderOptions: An object containing options to specify when loading CSV data. This is useful when you want to specify the delimiter and other options when loading CSV data.
pdfLoaderOptions: An object containing options to specify when loading PDF data. This is useful when you want to specify additional options when loading PDF data.
dataSplitterType: Use this to specify a specific text splitter you want to use. If not specified, the default text splitter is used based on the data type.
chunkingConfig: An object containing the data chunking configuration. This is useful when you want to specify the data chunking configuration.
splitterConfig: An object containing the data splitter configuration. Use this to specify or override data splitting strategy.
vectorStore: A vector store instance to use for retrieving relevant context data for a given query. When generateEmbeddings is set to true, the generated embeddings are stored in the vector store. If using a hosted vector store, you can provide the instance to the vectorStore property. To learn more, check out the Vector Store page.
embeddingModel: An embedding model instance to use for generating embeddings for the data. When generateEmbeddings is set to true, the embeddings are generated using the provided embedding model. If you want to use a specific embedding model, you can provide the instance to the embeddingModel property. By default, QvikChat will use either Gemini API or OpenAI for embedding model, depending on how you have configured the project. To learn more, check out the Embedding Models page.
generateEmbeddings: A boolean value to specify whether to generate embeddings for the data. If set to true, embeddings are generated for the data. If set to false, embeddings are not generated for the data, only the retriever instance is returned.
retrievalOptions: Use this to configure the data retrieval strategy. Check out the Data Retrieval page for more information.

Data Retriever

The getDataRetriever method is used to obtain a vector store retriever. You can provide configurations as an argument to the method. The method returns a retriever instance that can be used to retrieve context information from the vector store.

Below is an example of a custom configured data retriever:

import { getDataRetriever } from "@oconva/qvikchat/data-retrievers";
 
// Index inventory data and get retriever
const inventoryDataRetriever = await getDataRetriever({
  filePath: "data/knowledge-bases/inventory-data.csv",
  generateEmbeddings: true,
  retrievalOptions: {
    k: 10, // return top 10 matching documents
    searchType: "mmr", // use Maximal Marginal Relevance (MMR) for search
  },
});

The getDataRetriever method returns a retriever instance that can be used to retrieve context information from the vector store. This retriever instance can be passed to the chat endpoint configuration as the retriever property.

import { defineChatEndpoint } from "@oconva/qvikchat/endpoints";
 
// open-ended RAG chat endpoint
defineChatEndpoint({
  enableRAG: true,
  topic: "Inventory Data", // topic of data for which RAG is enabled
  retriever: inventoryDataRetriever,
});

Endpoint Retriever Config

Instead of creating a data retriever separately, you can also provide the data retriever configuration directly in the chat endpoint configuration. Data retriever will be created using the provided configuration when the chat endpoint is initialized.

Be mindful though that when sharing the same retriever configurations (for example, same data) and the generateEmbeddings flag is set to true, the data will be loaded and embeddings will be generated each time any chat endpoint using that retriever configuration is initialized. So, if you have multiple chat endpoints that will be sharing the same retriever config and you are generating embeddings for the data, it is recommended to create a retriever instance separately and share it across multiple chat endpoints as the retriever instance.

// Inventory Data chat endpoint with support for chat history, authentication, caching and RAG
defineChatEndpoint({
  endpoint: "rag-chat-inventory",
  enableChatHistory: true,
  chatHistoryStore, // chat history store instance
  enableAuth: true,
  apiKeyStore, // API key store instance
  enableCache: true,
  cacheStore, // cache store instance
  enableRAG: true,
  topic: "Store Inventory Data", // topic of data for which RAG is enabled
  retrieverConfig: {
    filePath: "rag/knowledge-bases/inventory-data.csv",
    generateEmbeddings: true,
  },
});

Vector Store

The data retriever is attached to a vector store, which is used to store the embeddings generated for the data. The vector store is responsible for storing the embeddings and providing search functionality to retrieve the most relevant context information based on the user query.

By default, QvikChat uses an in-memory vector store. Check Vector Store to learn how to use a hosted vector store supported by LangChain.

Data Ingestion Data Loaders