Generative AI models have the potential to revolutionize business operations, but businesses must carefully consider how to harness their power while overcoming challenges such as protecting data and ensuring the quality of AI-generated content.
The Retrieval-Augmented Generation (RAG) framework augments prompts with external data from multiple sources, such as document repositories, databases, or APIs, to make foundation models effective for domain-specific tasks. This article introduces the capabilities of the RAG framework and highlights the transformative potential of MongoDB Atlas with its Vector Search functionality.
MongoDB Atlas is an integrated suite of data services that accelerates and simplifies the development of data-driven applications. Its vector data store integrates seamlessly with operational data storage, eliminating the need for a separate database. This integration enables powerful semantic search capabilities through Atlas Vector Search, a quick way to build semantic and AI-driven search applications.
Amazon SageMaker allows businesses to build, train, and deploy machine learning (ML) models. Amazon SageMaker JumpStart provides pre-trained models and data to help you get started with ML. You can access, customize, and deploy pre-trained models and data through the SageMaker JumpStart home page in Amazon SageMaker Studio in just a few clicks.
Amazon Lex is a conversational interface that helps businesses create chatbots and voice bots that engage in natural, realistic interactions. By integrating Amazon Lex with generative AI, businesses can create a holistic ecosystem where user inputs seamlessly transform into coherent, contextually relevant responses.
Solution Overview
The following diagram illustrates the solution architecture.
In the following sections, we review the steps for implementing this solution and its components.
Configure a MongoDB cluster
To create a free tier MongoDB Atlas cluster, follow the instructions in Create a cluster. Configure database access and network access.
Deploy the SageMaker integration template
You can choose the embedding model (all-MiniLM-L6-v2) on the SageMaker JumpStart Models, notebooks, solutions page.
Choose Deploy to deploy the model.
Verify that the model is deployed successfully and the endpoint is created.
Embedding vectors
Embedding is the process of converting text or images into a vector representation. With the following code, we can generate vector embeddings with SageMaker JumpStart and update the collection with the created vector for each document:
payload = {"text_inputs": [document[field_name_to_be_vectorized]]}
query_response = query_endpoint_with_json_payload(json.dumps(payload).encode('utf-8'))
embeddings = parse_response_multiple_texts(query_response)
# update the document with the generated embedding
update = {'$set': {vector_field_name: embeddings[0]}}
collection.update_one(query, update)
The preceding code shows how to update a single object in a collection. To update all objects, repeat these steps for each document in the collection.
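As a sketch, the loop over the whole collection could look like the following. Here `embed` stands in for the SageMaker endpoint call shown earlier (any callable that maps text to a list of floats), and the collection object only needs pymongo-style `find()` and `update_one()` methods; both names are illustrative, not part of the original solution.

```python
# Sketch: compute an embedding for each document's text field and store
# it back on the document. `embed` wraps the SageMaker JumpStart
# embedding endpoint; `collection` is a pymongo collection.

def vectorize_collection(collection, embed, field_name, vector_field_name):
    """Embed every document that has `field_name` and save the vector."""
    for doc in collection.find({field_name: {"$exists": True}}):
        vector = embed(doc[field_name])
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {vector_field_name: vector}},
        )
```

For large collections, batching the endpoint calls (the model accepts a list of `text_inputs`) will be significantly faster than one request per document.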
MongoDB vector data store
MongoDB Atlas Vector Search is a new feature that allows you to store and search vector data in MongoDB. Vector data is a type of data that represents a point in a high-dimensional space. This type of data is often used in ML and artificial intelligence applications. MongoDB Atlas Vector Search uses a technique called k-nearest neighbors (k-NN) to search for similar vectors. k-NN works by finding the k vectors most similar to a given vector; the most similar vectors are those closest to the given vector in terms of Euclidean distance.
Storing vector data alongside operational data can improve performance by reducing the need to move data between different storage systems. This is particularly beneficial for applications that require real-time access to vector data.
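To make the k-NN idea concrete, here is a brute-force version of the search Atlas performs with an index: rank the stored vectors by Euclidean distance to a query vector and keep the k closest. This is purely illustrative; Atlas Vector Search does this at scale without scanning every vector.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn(query, vectors, k):
    """Return the k stored vectors closest to `query` (brute force)."""
    return sorted(vectors, key=lambda v: euclidean(query, v))[:k]
```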
Create a vector search index
The next step is to create a MongoDB Vector Search index on the vector field you created in the previous step. MongoDB uses the knnVector type to index vector embeddings. The vector field must be represented as an array of numbers (BSON int32, int64, or double data types only). Refer to the knnVector type limitations for more information on the limits of the knnVector type.
The following code is an example of an index definition:
{
"mappings": {
"dynamic": true,
"fields": {
"egVector": {
"dimensions": 384,
"similarity": "euclidean",
"type": "knnVector"
}
}
}
}
Note that the dimensions must match the dimensions of your embedding model.
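A small helper that builds this index definition can guard against the dimension mismatch just mentioned (all-MiniLM-L6-v2 produces 384-dimensional vectors). The helper below is a hypothetical convenience, not part of the MongoDB API; pass its output to Atlas when creating the search index.

```python
# Build a knnVector index definition like the one above, validating the
# similarity option ("euclidean", "cosine", and "dotProduct" are the
# metrics Atlas Vector Search supports).

def knn_index_definition(field, dimensions, similarity="euclidean"):
    if similarity not in ("euclidean", "cosine", "dotProduct"):
        raise ValueError(f"unsupported similarity: {similarity}")
    return {
        "mappings": {
            "dynamic": True,
            "fields": {
                field: {
                    "dimensions": dimensions,
                    "similarity": similarity,
                    "type": "knnVector",
                }
            },
        }
    }
```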
Query the vector data store
You can query the vector data store using the Vector Search aggregation pipeline. It uses the vector search index and performs a semantic search on the vector data store.
The following code is an example search definition:
{
$search: {
"index": "<index name>", // optional, defaults to "default"
"knnBeta": {
"vector": [<array-of-numbers>],
"path": "<field-to-search>",
"filter": {<filter-specification>},
"k": <number>,
"score": {<options>}
}
}
}
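From Python, the search definition above can be assembled as an aggregation pipeline and run with pymongo. The helper below is a sketch: it embeds nothing itself (the query vector would come from the same embedding endpoint used earlier), and the field and index names are the ones assumed in this article.

```python
# Build a $search / knnBeta aggregation pipeline for a semantic search,
# surfacing the relevance score on each result via $addFields.

def knn_search_pipeline(query_vector, path, k, index="default"):
    return [
        {
            "$search": {
                "index": index,
                "knnBeta": {
                    "vector": query_vector,
                    "path": path,
                    "k": k,
                },
            }
        },
        {"$addFields": {"score": {"$meta": "searchScore"}}},
    ]

# Usage (assuming a pymongo `collection` and an `embed` helper):
# results = collection.aggregate(
#     knn_search_pipeline(embed("user question"), "egVector", 3)
# )
```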
Deploy the SageMaker large language model
SageMaker JumpStart foundation models are pre-trained large language models (LLMs) used to solve a variety of natural language processing (NLP) tasks, such as text summarization, question answering, and natural language inference. They are available in a variety of sizes and configurations. In this solution, we use the Hugging Face FLAN-T5-XL model.
Search for the FLAN-T5-XL model in SageMaker JumpStart.
Choose Deploy to deploy the FLAN-T5-XL model.
Verify that the model is deployed successfully and the endpoint is active.
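Once the endpoint is active, it can be invoked with boto3. The payload and response shapes below follow the JumpStart text2text convention (`text_inputs` in, `generated_texts` out), and the endpoint name is a placeholder; verify both against the example notebook that accompanies the model.

```python
import json

def build_payload(prompt, max_length=200):
    """Serialize a prompt in the JumpStart text2text request format."""
    return json.dumps({"text_inputs": prompt, "max_length": max_length})

def parse_generated_text(response_body):
    """Extract the first generated answer from the endpoint response."""
    return json.loads(response_body)["generated_texts"][0]

# Usage sketch (endpoint name is hypothetical):
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(
#     EndpointName="jumpstart-dft-hf-text2text-flan-t5-xl",
#     ContentType="application/json",
#     Body=build_payload("Answer using the context: ..."),
# )
# answer = parse_generated_text(response["Body"].read())
```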
Create an Amazon Lex bot
To create an Amazon Lex bot, follow these steps:
- On the Amazon Lex console, choose Create bot.
- For Bot name, enter a name.
- For Execution role, select Create a role with basic Amazon Lex permissions.
- Specify your language settings, then choose Done.
- Add a sample utterance in the NewIntent UI and choose Save intent.
- Navigate to the FallbackIntent that was created for you by default and toggle Active in the Fulfillment section.
- Choose Build, and after the build succeeds, choose Test.
- Before testing, choose the gear icon.
- Specify the AWS Lambda function that will interact with MongoDB Atlas and the LLM to provide answers. To create the Lambda function, follow these steps.
- You can now interact with the LLM.
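The fulfillment Lambda ties the pieces together: it reads the user's utterance from the Lex V2 event, runs the RAG lookup, and returns a Close dialog action. The sketch below shows the event and response shapes only; `answer_question` is a hypothetical placeholder for the Atlas vector search plus LLM call built in the previous sections.

```python
def answer_question(question):
    """Placeholder for the RAG pipeline (vector search + LLM call)."""
    return f"(answer for: {question})"

def close_response(intent_name, message):
    """Build a Lex V2 'Close' fulfillment response."""
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": {"name": intent_name, "state": "Fulfilled"},
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }

def lambda_handler(event, context):
    # Lex V2 puts the raw utterance in inputTranscript and the current
    # intent under sessionState.intent.
    question = event["inputTranscript"]
    intent = event["sessionState"]["intent"]["name"]
    return close_response(intent, answer_question(question))
```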
Clean up
To clean up your resources, follow these steps:
- Delete the Amazon Lex bot.
- Delete the Lambda function.
- Delete the SageMaker endpoint for the LLM.
- Delete the SageMaker endpoint for the embedding model.
- Delete the MongoDB Atlas cluster.
Conclusion
In this article, we showed how to create a simple bot that uses MongoDB Atlas semantic search and integrates with a model from SageMaker JumpStart. This bot allows you to quickly prototype user interactions with different LLMs in SageMaker JumpStart while pairing them with context from MongoDB Atlas.
As always, AWS welcomes your feedback. Please leave your comments and questions in the comments section.
About the authors

Igor Alekseev is a Senior Partner Solutions Architect at AWS in the data and analytics space. In his role, Igor works with strategic partners to help them create complex architectures optimized for AWS. Before joining AWS, as a Data/Solutions Architect, he implemented numerous projects in the Big Data space, including several data lakes in the Hadoop ecosystem. As a data engineer, he has been involved in applying AI/ML to fraud detection and office automation.
Babu Srinivasan is a Senior Partner Solutions Architect at MongoDB. In his current role, he works with AWS to create the technical integrations and reference architectures for AWS and MongoDB solutions. He has over two decades of experience in database and cloud technologies. He is passionate about providing technical solutions to clients working with multiple Global Systems Integrators (GSIs) across multiple geographies.