Basic review summary with Chroma
Introduction
This guide walks you through implementing a basic Retrieval-Augmented Generation (RAG) application using the Chroma vector database together with OpenAI's generative AI.
LLM - Large Language Model.
A Large Language Model (LLM) is an advanced type of AI trained on vast datasets that include text from books, articles, websites, and other publicly available sources. By learning patterns, context, and relationships within language, LLMs can generate and understand human-like text. This makes them versatile for a wide range of tasks, such as writing code, summarizing content, answering questions, and even creative storytelling.
From Specialized Models to LLMs
Before LLMs, we relied on specialized, niche AI models to handle specific tasks, such as:
- Text Summarization: Condensing long pieces of text into concise summaries.
- Document Q&A: Extracting answers from structured documents.
- Sentiment Analysis: Determining the emotional tone of a piece of text.

While these models were effective within their focused domains, they lacked flexibility. LLMs revolutionized this approach by unlocking a wide spectrum of possibilities. Instead of requiring separate models for each task, LLMs provide a unified solution that can adapt to various use cases.
Accessibility and Cloud Integration
One of the game-changing aspects of LLMs is their accessibility. With the availability of cloud services and web APIs, developers can now harness the power of LLMs without needing to train or host the models themselves. Platforms like OpenAI, Google, and AWS offer robust APIs for integrating LLMs into your applications. For developers seeking customizable options, platforms like Hugging Face provide access to numerous pre-trained and fine-tunable models.
Why LLMs Matter
LLMs have brought a significant shift in the AI landscape by:
- Eliminating Complexity: Developers can now focus on building applications without worrying about training or managing multiple models.
- Enhancing Flexibility: A single LLM can handle diverse tasks, from generating creative content to answering domain-specific questions, making it a powerful tool for innovation.
- Driving Accessibility: With APIs and cloud services, small businesses and individual developers can access cutting-edge AI without requiring extensive computational resources.

LLMs represent the future of AI, enabling smarter, more adaptable, and more efficient applications. Whether you’re developing a chatbot, building intelligent search features, or performing advanced analytics, LLMs provide the foundation to innovate and scale.
Why RAG?
Think of an LLM as an exceptionally bright student with a strong foundational knowledge of almost everything public on the internet. However, even the brightest student wouldn’t excel at a new job without first learning the company’s internal processes and reading its documentation. That’s where RAG (Retrieval-Augmented Generation) comes in.
RAG bridges the gap by giving LLMs access to specific, relevant context, such as internal documents or FAQs, enabling them to provide accurate and context-aware answers. For example, if you're building a chatbot for eCommerce, you want it to respond only based on your store's FAQs or instructions—not randomly generated answers from its vast general knowledge. RAG ensures your chatbot behaves like a well-trained employee, capable of accessing and applying internal knowledge to its tasks.
By combining the intelligence of an LLM with the precision of context retrieval from Chroma, you can create smarter, more reliable AI applications tailored to your specific needs.
Vector Database
Vector databases, like Chroma and Weaviate, act as the memory for AI applications powered by LLMs. Just as traditional applications rely on SQL or NoSQL databases to store and manage data, vector databases are designed to store knowledge in a format that AI can process and retrieve effectively.
What are Text Embeddings?
When LLMs process text, they generate text embeddings—mathematical representations of words, phrases, or documents. These embeddings capture the meaning, context, and relationships between words. Think of it as mapping content to a multi-dimensional space where similar ideas are closer together. For example, the embeddings for "cat" and "animal" would be much closer than those for "cat" and "car."
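To make this concrete, here is a minimal sketch of generating embeddings with OpenAI's Node.js SDK (the model choice and API key placeholder are assumptions; any embedding model works the same way):
const OpenAI = require('openai');
const openai = new OpenAI({apiKey: 'OPENAI_API_KEY'});
(async () => {
  const {data} = await openai.embeddings.create({
    model: 'text-embedding-3-large',
    input: ['cat', 'animal', 'car']
  });
  // Each result is a high-dimensional vector (3,072 dimensions for this model);
  // the vectors for "cat" and "animal" sit closer together than "cat" and "car"
  console.log(data[0].embedding.length);
})();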
Why Do We Need a Vector Database?
In Retrieval-Augmented Generation (RAG), the AI needs to fetch relevant knowledge before generating a response. However, LLMs can’t "memorize" all the information needed for specific use cases (e.g., company policies, FAQs, or documentation). This is where vector databases come in.
Vector databases store the text embeddings generated from your documents. When a user asks a question, the database performs a semantic search, retrieving contextually relevant embeddings—even if the query doesn't exactly match the original content. This process enables AI to provide precise, context-aware answers.
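As a preview of what this looks like in code (the full setup is covered later in this guide; collection is assumed to already hold your documents):
// Semantic search matches by meaning, not by exact keywords
const results = await collection.query({
  queryTexts: 'refund policy',
  nResults: 3
});
// This can surface a document like "Items may be returned within 30 days"
// even though it shares no words with the query.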
Bridging LLM and RAG
By combining LLMs with vector databases, you can create smarter applications that use the AI’s reasoning power while grounding its responses in your specific knowledge base. For your Chroma-based app, this is how you’ll store, manage, and retrieve the content needed for delivering accurate and contextually aware outputs.
In essence, vector databases enable AI to "understand" and "remember" relevant information, making them an essential component for building robust RAG applications.
Chroma vs Weaviate
In this guide, we'll use Chroma due to its simplicity and ease of use, especially for prototyping and basic applications. Chroma's schemaless API allows developers to get started quickly without worrying about defining complex structures or schemas upfront. It’s ideal for experimenting and learning how to build RAG-powered applications.
However, for production-grade applications or projects requiring advanced features, Weaviate might be a better choice. Here's why:
Advantages of Weaviate for Production
- Structured Querying: Weaviate supports structured querying similar to SQL, enabling you to filter, sort, and retrieve data with precision. This makes it a great choice for applications with complex data retrieval requirements.
- Cloud Hosting: Weaviate offers managed cloud services, eliminating the need to set up and maintain your own infrastructure. This can save time and resources while ensuring scalability and reliability.
- Extensive Integrations: Weaviate integrates seamlessly with external tools and data pipelines, making it suitable for enterprise applications that rely on multiple systems working together.
When to Use Chroma
Chroma shines when:
- You’re building a prototype or a simple proof of concept.
- You don’t need complex filtering or sorting features for your data.
- You prefer a lightweight and schemaless API for rapid iteration.

For this guide, Chroma is perfect for demonstrating how to build a basic review summary application with RAG, focusing on the core concepts without the overhead of advanced configurations.
In summary, Chroma is an excellent starting point for learning and small-scale projects, while Weaviate is better suited for production environments that demand more flexibility, scalability, and robustness.
Install Chroma
We recommend installing Chroma on your local machine for testing, using Docker. Start Chroma with:
docker run -p 8000:8000 chromadb/chroma
Create a Node.js file and test your Chroma connection:
const {ChromaClient} = require('chromadb');
const chromaClient = new ChromaClient({
  path: 'http://localhost:8000'
});
(async () => {
  // heartbeat() resolves with a timestamp when the server is reachable
  console.log('Chroma heartbeat:', await chromaClient.heartbeat());
})();
Note that this local setup has no authentication mechanism for now, so do not expose it publicly.
Example reviews data
Use the following review data.
[
{
"id": "OeNCvF7XfpuZrLxEaYFe",
"customer": "nappee",
"reviewDate": "2024-11-21T00:00:00.000Z",
"rating": 1,
"reviewContent": "Would definitely recommend this app.",
"app": "seoon-blog"
},
{
"id": "P4mxqhGVgpwcLm4Q3fQW",
"customer": "All gadgets Market",
"reviewDate": "2025-01-01T00:00:00.000Z",
"rating": 5,
"reviewContent": "The support guidelines helped a lot",
"app": "seoon-blog"
},
{
"id": "PskFsABhfoJkFSBPc32z",
"customer": "Rusty Mug Coffee",
"reviewDate": "2024-12-11T00:00:00.000Z",
"rating": 5,
"reviewContent": "Support is very quick to resolve issues. Tiana and Raymond were a big help. The SEO tools are great, and it feels like a paid app, or something that I would have to pay an expensive subscription for, like Shogun.",
"app": "seoon-blog"
},
{
"id": "VhpIk7GEv1S1Kgli4Gvp",
"customer": "HEART BREAKER CAMERAS",
"reviewDate": "2024-12-23T00:00:00.000Z",
"rating": 5,
"reviewContent": "Update: within 24 hours they sent me an email and response saying the implemented draft history and looks like autosave. in my history is my old version. Probably quickest and most effective response to an app issue I have ever experience. Thank you <3",
"app": "seoon-blog"
},
{
"id": "VjLCLMEIMVnAycdeqlLw",
"customer": "Mari’Anna Tees",
"reviewDate": "2024-12-02T00:00:00.000Z",
"rating": 5,
"reviewContent": "Very helpful",
"app": "seoon-blog"
}
]
Insert data into Chroma
Use the following snippet to insert your review data into the Chroma database. In this snippet, we use the text-embedding-3-large embedding model. The better the embedding model, the better your queries will perform later on: as mentioned earlier, a larger embedding model captures the connections between concepts better and finds the corresponding matches more accurately. You can compare it with the default embedding function to see the difference.
const {ChromaClient, DefaultEmbeddingFunction, OpenAIEmbeddingFunction} = require('chromadb');
const {chunk} = require('@avada/utils');
const embeddingFunction = new OpenAIEmbeddingFunction({
  openai_api_key: 'OPENAI_API_KEY',
  openai_model: 'text-embedding-3-large'
});
const chromaClient = new ChromaClient({
  path: 'http://localhost:8000'
});
(async () => {
  const reviewData = [
    // your reviews
  ];
  // Start from a clean collection so re-runs don't duplicate documents
  await chromaClient.deleteCollection({name: 'reviews'});
  const reviewCollection = await chromaClient.getOrCreateCollection({
    name: 'reviews',
    embeddingFunction: embeddingFunction
  });
  // Insert in batches of 1,000 documents to keep request payloads small
  const reviewChunks = chunk(reviewData, 1000);
  for (const reviewChunk of reviewChunks) {
    console.log('One chunk at a time', reviewChunk.length);
    await reviewCollection.add({
      ids: reviewChunk.map(review => review.id), // Unique ID for each document
      documents: reviewChunk.map(review => review.reviewContent), // Text to embed
      metadatas: reviewChunk.map(review => ({
        app: review.app,
        customer: review.customer,
        rating: review.rating,
        // reviewDate arrives as an ISO string, so parse it before converting to Epoch ms
        reviewDate: new Date(review.reviewDate).getTime()
      }))
    });
  }
})();
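To compare retrieval quality with the default embedding function mentioned above, you can build a second collection using the DefaultEmbeddingFunction already imported in the snippet. A minimal sketch, reusing the chromaClient from above (the collection name is just an example):
const defaultEmbeddingFunction = new DefaultEmbeddingFunction();
const comparisonCollection = await chromaClient.getOrCreateCollection({
  name: 'reviews-default', // hypothetical collection for the comparison
  embeddingFunction: defaultEmbeddingFunction
});
// Add the same documents to both collections, then run identical queries
// against each to see which retrieves the more relevant reviews.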
Summarize reviews
Chroma does not support generative features out of the box, so you will need to call the OpenAI API directly to generate a summary of the reviews matching the query. You can add a where clause to your query to filter by metadata such as app or reviewDate. Note that reviewDate has to be stored as Epoch time (a number), since Chroma does not have a Date data type.
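For instance, a where filter on the metadata fields stored earlier might look like this (a minimal sketch; the query text and cutoff date are just examples):
const filtered = await reviewCollection.query({
  nResults: 10,
  queryTexts: 'editor experience',
  where: {
    $and: [
      {app: {$eq: 'seoon-blog'}},
      // reviewDate is stored as Epoch milliseconds, so compare numerically
      {reviewDate: {$gte: new Date('2024-12-01T00:00:00Z').getTime()}}
    ]
  }
});
The full snippet below retrieves the matching reviews and asks GPT-4o to summarize them: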
const OpenAI = require('openai');
const {ChromaClient, OpenAIEmbeddingFunction} = require('chromadb');
const embeddingFunction = new OpenAIEmbeddingFunction({
  openai_api_key: 'OPENAI_API_KEY',
  openai_model: 'text-embedding-3-large'
});
const chromaClient = new ChromaClient({
  path: 'http://localhost:8000'
});
const openAIClient = new OpenAI({
  apiKey: 'OPENAI_API_KEY'
});
(async () => {
  const query = 'Do customers talk about the app editor experience?';
  const reviewCollection = await chromaClient.getCollection({
    name: 'reviews',
    embeddingFunction: embeddingFunction
  });
  // Semantic search: retrieve the 10 reviews most relevant to the query
  const reviews = await reviewCollection.query({
    nResults: 10,
    queryTexts: query
  });
  const input = reviews.documents[0].join('\n');
  // Generate a summary of the retrieved reviews using OpenAI
  const response = await openAIClient.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are an assistant that answers questions about insights about customer support via customer reviews.'
      },
      {role: 'user', content: `${query}. Here are the reviews: \n\n${input}\n\n`}
    ],
    max_tokens: 1000 // Adjust the token limit as needed
  });
  const summary = response.choices[0]?.message?.content;
  console.log('The summary', summary);
})();
Lastly
With the above example, you can develop a simple RAG-powered application that summarizes and provides insights from customer review data. The more reviews the database holds, the better the results will be.
Try it with a chatbot for internal documentation or FAQs, a chatbot for eCommerce, a bot that summarizes the day's notifications, and so on. The possibilities are endless.
What's next after RAG?
If you find RAG amazing, the next thing you should explore is Table-Augmented Generation (TAG). This is a more advanced version of RAG that leverages structured data in tables to enhance the AI's understanding. Imagine an AI that can query your database and generate responses grounded in that structured data. This opens up new opportunities for features like AI report generation, AI data analysis, and AI data visualization.