Update the prompt

You can replace the built-in prompt by passing a promptFn to chat. The callback receives the user's question, the formatted chat history, and the retrieved context, and returns the prompt string sent to the LLM.

const result = await ragChat.chat("THIS_IS_USERS_QUESTION", {
  promptFn: ({ question, chatHistory }) => {
    return `
      Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.
      Chat History:
      ${chatHistory}
      Follow Up Input: ${question}
      Standalone question:
    `;
  },
});
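
Since the callback also receives the retrieved context, you can build a fully custom answer prompt instead of a question-condensing one. A minimal sketch, assuming the context argument is the concatenated retrieval results:

const result = await ragChat.chat("THIS_IS_USERS_QUESTION", {
  promptFn: ({ context, question, chatHistory }) => `
    Answer the question using only the context below.
    Context: ${context}
    Chat history: ${chatHistory}
    Question: ${question}
    Answer:`,
});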

Adjust the context quality

If irrelevant context is hurting your chatbot's answers, you can filter retrieval results with the similarityThreshold option. It is a number between 0 and 1, with a default of 0.5; only documents whose similarity score meets the threshold are included as context. Raising it yields more relevant context, but the chatbot may reject more queries for lack of relevant data.

ragChat.chat("Tell me about machine learning", {
  similarityThreshold: 0.5
});

Disable RAG to bypass context

When RAG is disabled, RAG Chat skips the vector store and sends the question directly to the LLM.

const result = await ragChat.chat("THIS_IS_USERS_QUESTION", {
  disableRAG: true
});

Enable debug logs

Passing debug: true to the RAGChat constructor logs each step of the pipeline to the console, which helps diagnose retrieval and prompting issues.

new RAGChat({ debug: true });

Access the context

You can access and update the retrieved context with the onContextFetched callback. It is called after the context is fetched from the vector store and before the prompt is sent to the LLM; the value you return is used as the context.

let response = await ragChat.chat(question, {
  onContextFetched: (context) => {
    console.log("Retrieved context:", context)
    return context
  },
});
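
Because the returned value replaces the fetched context, the same callback can filter or rewrite entries before they reach the prompt. A sketch that drops short entries (the 20-character cutoff is an arbitrary example value):

let response = await ragChat.chat(question, {
  onContextFetched: (context) => {
    // Keep only entries long enough to be useful (example heuristic).
    return context.filter((entry) => entry.data.length > 20);
  },
});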

Access the chat history

You can access chat history with the onChatHistoryFetched callback. It is called after the history is fetched from Redis (or from in-memory storage) and before the prompt is sent to the LLM; the messages you return are the ones included in the prompt.

let response = await ragChat.chat(question, {
  onChatHistoryFetched: (messages) => {
    console.log("Retrieved chat history:", messages)
    return messages
  },
});
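
Since the returned messages are what the model sees, the same callback can trim or redact history. A sketch that keeps only the five most recent messages (the cutoff is an arbitrary example):

let response = await ragChat.chat(question, {
  onChatHistoryFetched: (messages) => {
    // Bound prompt size by keeping only the most recent messages.
    return messages.slice(-5);
  },
});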

Add metadata into context

You can attach metadata when inserting a message into the context. The metadata of the context entries used to answer a question is returned with the response.

export const rag = new RAGChat();
await rag.context.add({
  type: "text",
  data: "Mississippi is a state in the United States of America",
  options: { metadata: { source: "wikipedia" }}
});
const response = await rag.chat("Where is Mississippi?");

console.log(response.metadata);
// [ {source: "wikipedia"} ]

Access the ratelimit details

You can see how much allowance a user has left with the ratelimitDetails callback, which receives the result of the rate-limit check.

await ragChat.chat("You shall not pass", {
  streaming: false,
  ratelimitDetails: (response) => {
    remainingLimit = response.remaining;
  },
});
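
The callback only fires when rate limiting is configured. A minimal setup sketch, assuming the ratelimit constructor option accepts an instance from @upstash/ratelimit and that Upstash Redis credentials are in the environment:

import { RAGChat } from "@upstash/rag-chat";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ragChat = new RAGChat({
  // Allow 10 chat calls per 10 seconds per identifier (example values).
  ratelimit: new Ratelimit({
    redis: Redis.fromEnv(),
    limiter: Ratelimit.slidingWindow(10, "10 s"),
  }),
});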

Access the output during streaming

You can read or modify streamed chunks as they are produced using the onChunk callback.

await ragChat.chat("", {
  onChunk({ content, inputTokens, chunkTokens, totalTokens, rawContent }) {
    // Change or read your streamed chunks
    // Examples:
    // - Log the content: console.log(content);
    // - Modify the content: content = content.toUpperCase();
    // - Track token usage: console.log(`Tokens used: ${totalTokens}`);
    // - Analyze raw content: processRawContent(rawContent);
    // - Update UI in real-time: updateStreamingUI(content);
  },
});
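
onChunk observes chunks, but with streaming enabled you still consume the stream yourself to drive the response. A sketch, assuming response.output is a ReadableStream of strings as in the SDK's streaming examples:

const response = await ragChat.chat("THIS_IS_USERS_QUESTION", { streaming: true });

const reader = response.output.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(value); // print each chunk as it arrives
}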

Store metadata in the chat history and access context

The onChatHistoryFetched and onContextFetched callbacks are invoked with this bound to a context whose metadata is stored with the chat message, so declaring them as regular functions (not arrow functions) lets you record which history and context were used to produce the answer. LLM_MODELS and llmModel below are application-specific values.

// Assumption: both types are exported from the SDK's root entry point.
import type { UpstashMessage, PrepareChatResult } from "@upstash/rag-chat";

let messages: UpstashMessage[] = []
let context: PrepareChatResult = []

const response = await ragChat.chat(question, {
  onChatHistoryFetched(_messages) {
    messages = _messages
    // Strip usage metadata stored on earlier turns so it doesn't nest,
    // then record the history used for this answer.
    this.metadata = {
      ...this.metadata,
      usedHistory: JSON.stringify(
        messages.map((message) => {
          delete message.metadata?.usedHistory
          delete message.metadata?.usedContext
          return message
        })
      ),
    }
    return _messages
  },
  onContextFetched(_context) {
    context = _context
    // Record which context entries were used for this answer.
    this.metadata = { ...this.metadata, usedContext: context.map((x) => x.data.replace("-", "")) }
    return _context
  },
  streaming: true,
  metadata: {
    modelNameWithProvider: `${LLM_MODELS[llmModel].provider}_${llmModel}`,
  },
})
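
The stored metadata can then be read back from the history. A sketch, assuming the history accessor exposes a getMessages method as in the SDK's chat-history docs:

const stored = await ragChat.history.getMessages({ amount: 10 });
console.log(stored.map((message) => message.metadata));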