# Vector Search using Azure Cosmos DB for NoSQL

This notebook demonstrates how to use an Azure OpenAI embedding model to vectorize documents already stored in Azure Cosmos DB for NoSQL, store the resulting embedding vectors, and create a vector index. It then shows how to query the vector index to find similar documents.

This lab expects the data that was loaded in Lab 2. A current limitation is that the vector search feature for Azure Cosmos DB for NoSQL is supported only on new containers: the vector policy must be applied when the container is created and can't be modified later. For that reason, this notebook creates a new container, `product_v`, to hold the products used in this guide.

In [1]:
import os
import json
from models import Product
from pydantic import BaseModel
from typing import Type, TypeVar, List
from azure.cosmos import CosmosClient, DatabaseProxy, ContainerProxy, PartitionKey
from dotenv import load_dotenv
import time
from openai import AzureOpenAI
from tenacity import retry, wait_random_exponential, stop_after_attempt

## Load settings

This lab expects the `.env` file that was created in Lab 1 to obtain the connection string for the database.

Add the following entries into the `.env` file to support the connection to Azure OpenAI API, replacing the values for `<your key>` and `<your endpoint>` with the values from your Azure OpenAI API resource.

```text
AOAI_ENDPOINT="<your endpoint>"
AOAI_KEY="<your key>"
```

In [2]:
load_dotenv()
CONNECTION_STRING = os.environ.get("COSMOS_DB_CONNECTION_STRING")
EMBEDDINGS_DEPLOYMENT_NAME = "embeddings"
COMPLETIONS_DEPLOYMENT_NAME = "completions"
AOAI_ENDPOINT = os.environ.get("AOAI_ENDPOINT")
AOAI_KEY = os.environ.get("AOAI_KEY")
AOAI_API_VERSION = "2024-06-01"
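
Because `os.environ.get` returns `None` for a missing entry, a typo in the `.env` file only surfaces later as a connection error. The following is a minimal sketch of an early check; the helper name `missing_settings` is hypothetical and not part of the lab, and the setting names are the ones read in the cell above.

```python
import os
from typing import List, Mapping

# Setting names read by the notebook's settings cell.
REQUIRED_SETTINGS = ["COSMOS_DB_CONNECTION_STRING", "AOAI_ENDPOINT", "AOAI_KEY"]

def missing_settings(env: Mapping[str, str], required: List[str] = REQUIRED_SETTINGS) -> List[str]:
    """Return the names of required settings that are absent or blank (hypothetical helper)."""
    return [name for name in required if not (env.get(name) or "").strip()]

missing = missing_settings(os.environ)
if missing:
    print(f"Missing settings in .env: {', '.join(missing)}")
```

Running a check like this right after `load_dotenv()` points at the missing entry by name before any client is created.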

## Establish connectivity to the database

In [None]:
# Initialize the Azure Cosmos DB client
client = CosmosClient.from_connection_string(CONNECTION_STRING)

# Create or load the cosmic_works_pv database
database_name = "cosmic_works_pv"
db = client.create_database_if_not_exists(id=database_name)

## Establish Azure OpenAI connectivity

In [4]:
ai_client = AzureOpenAI(
    azure_endpoint = AOAI_ENDPOINT,
    api_version = AOAI_API_VERSION,
    api_key = AOAI_KEY
    )

## Vectorize and store the embeddings in each document

The vector embedding field only needs to be created once per document. However, whenever a document changes, its embedding must be regenerated so that the stored vector continues to reflect the document's content.
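
One way to decide when a vector needs regenerating is to store a hash of the document's content next to the vector and compare it on each update. This is only a sketch under stated assumptions: the `contentHash` field and both helper functions are hypothetical, and keys starting with an underscore are skipped on the assumption that they are system properties (such as `_etag`) that change on every write.

```python
import hashlib
import json

def content_hash(document: dict, vector_field: str = "contentVector", hash_field: str = "contentHash") -> str:
    """Hash the document's own content, ignoring the vector, the stored hash, and system properties."""
    content = {
        key: value
        for key, value in document.items()
        if key not in (vector_field, hash_field) and not key.startswith("_")
    }
    canonical = json.dumps(content, sort_keys=True, default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def needs_embedding(document: dict, hash_field: str = "contentHash") -> bool:
    """True when no hash is stored yet or the content no longer matches the stored hash."""
    return document.get(hash_field) != content_hash(document, hash_field=hash_field)

doc = {"id": "1", "name": "Road Bike", "price": 100}
doc["contentHash"] = content_hash(doc)
print(needs_embedding(doc))  # False: content unchanged since the hash was stored
doc["price"] = 120
print(needs_embedding(doc))  # True: content changed, so the vector is stale
```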

In [5]:
@retry(wait=wait_random_exponential(min=1, max=20), stop=stop_after_attempt(3))
def generate_embeddings(text: str):
    '''
    Generate embeddings from string of text using the deployed Azure OpenAI API embeddings model.
    This will be used to vectorize document data and incoming user messages for a similarity search with
    the vector index.
    '''
    response = ai_client.embeddings.create(input=text, model=EMBEDDINGS_DEPLOYMENT_NAME)
    embeddings = response.data[0].embedding
    time.sleep(0.5) # rest period to avoid rate limiting on AOAI
    return embeddings

In [None]:
# demonstrate embeddings generation using a test string
test = "hello, world"
print(generate_embeddings(test))

### Vectorize and update all product documents in the Cosmic Works database

In [7]:
# Create the vector embedding policy
vector_embedding_policy = {
    "vectorEmbeddings": [
        {
            "path": "/contentVector",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 1536
        }
    ]
}

# Create the indexing policy
indexing_policy = {
    "indexingMode": "consistent",  
    "automatic": True, 
    "includedPaths": [
        {
            "path": "/*" 
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/contentVector/*"
        }
    ],
    "vectorIndexes": [
        {
            "path": "/contentVector",
            "type": "diskANN"
        }
    ]
}

product_v_container = db.create_container_if_not_exists(
    id="product_v",
    partition_key=PartitionKey(path="/categoryId"),
    indexing_policy=indexing_policy,
    vector_embedding_policy=vector_embedding_policy
)
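
Since the vector policy cannot be changed after the container exists, it can be worth checking the two policy dictionaries against each other before calling `create_container_if_not_exists`. The sketch below assumes that every path listed under `vectorIndexes` is expected to have a matching entry under `vectorEmbeddings`; the helper name is hypothetical, and the dictionaries are trimmed copies of the ones defined above.

```python
from typing import Dict, List

def unmatched_vector_index_paths(embedding_policy: Dict, indexing_policy: Dict) -> List[str]:
    """Return vector index paths with no matching entry in the embedding policy (hypothetical helper)."""
    embedded_paths = {entry["path"] for entry in embedding_policy.get("vectorEmbeddings", [])}
    return [
        index["path"]
        for index in indexing_policy.get("vectorIndexes", [])
        if index["path"] not in embedded_paths
    ]

# Trimmed copies of the policies used in this notebook.
embedding_policy = {
    "vectorEmbeddings": [
        {"path": "/contentVector", "dataType": "float32", "distanceFunction": "cosine", "dimensions": 1536}
    ]
}
index_policy = {"vectorIndexes": [{"path": "/contentVector", "type": "diskANN"}]}

print(unmatched_vector_index_paths(embedding_policy, index_policy))  # [] means the paths line up
```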

In [None]:
# Create vector embeddings for all products in the database
product_container: ContainerProxy = db.create_container_if_not_exists(
           id="product",
           partition_key={"paths": ["/categoryId"], "kind": "Hash"}
       )

T = TypeVar('T', bound=BaseModel)
# Create a generic helper function to query items in a container.
# This function re-uses the TypeVar and BaseModel from the Read a document example.
def query_items(container, query, model: Type[T]) -> List[T]:
    items = container.query_items(query=query, enable_cross_partition_query=True)
    return [model(**item) for item in items]

# retrieve all products via a query
retrieved_products = query_items(product_container,"SELECT * FROM prod", Product)
print(f"Retrieved {len(retrieved_products)} products from the database.")

print("Starting the embedding of each product, this will take 3-5 minutes...")
# Populate contentVector field for each product in the product_v container that has vector indexing enabled
for product in retrieved_products:
    product.content_vector = generate_embeddings(product.model_dump_json(by_alias=True))    
    product_v_container.upsert_item(product.model_dump(by_alias=True))

print("Embedding complete and product_v container items updated.")
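
The loop above sends one embeddings request per product. The embeddings endpoint also accepts a list of input strings, so grouping products into batches is one possible way to reduce the number of requests. The sketch below shows only the grouping step; the `batched` helper and the batch size are assumptions, not part of the lab, and the embeddings call itself is left as a comment.

```python
from typing import Iterator, List, Sequence, TypeVar

Item = TypeVar("Item")

def batched(items: Sequence[Item], batch_size: int) -> Iterator[List[Item]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

texts = [f"product {n}" for n in range(5)]
for batch in batched(texts, batch_size=2):
    # Assumption: a single embeddings request for the whole batch would go here.
    print(len(batch), batch)
```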

## Use vector search in Azure Cosmos DB for NoSQL

Now that each document has an associated vector embedding and the vector index has been created on the `product_v` container, we can use the vector search capabilities of Azure Cosmos DB for NoSQL.

In [9]:
def vector_search(
        container: ContainerProxy, 
        prompt: str,         
        vector_field_name:str="contentVector", 
        num_results:int=5):
    query_embedding = generate_embeddings(prompt)    
    items = container.query_items(
        query=f"""SELECT TOP @num_results itm.id, VectorDistance(itm.{vector_field_name}, @embedding) AS SimilarityScore 
                FROM itm
                ORDER BY VectorDistance(itm.{vector_field_name}, @embedding)
                """,
        parameters = [
            { "name": "@num_results", "value": num_results },
            { "name": "@embedding", "value": query_embedding }            
        ],
        enable_cross_partition_query=True
        )
    return items
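
To build intuition for what the query ranks by, here is a conceptual sketch that scores a few toy vectors against a query vector with cosine similarity and keeps the best matches. It is not how the service executes the query (the `diskANN` index avoids scanning every item); the helper names and toy data are made up for illustration, and the vectors are assumed to be non-zero.

```python
import math
from typing import Dict, List, Sequence, Tuple

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(items: List[Dict], query: Sequence[float], num_results: int = 5) -> List[Tuple[str, float]]:
    """Score every item against the query and return the most similar ones first."""
    scored = [(item["id"], cosine_similarity(item["contentVector"], query)) for item in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:num_results]

toy_items = [
    {"id": "a", "contentVector": [1.0, 0.0]},
    {"id": "b", "contentVector": [0.0, 1.0]},
    {"id": "c", "contentVector": [1.0, 1.0]},
]
for item_id, score in brute_force_search(toy_items, [1.0, 0.0], num_results=2):
    print(item_id, round(score, 3))
# a 1.0
# c 0.707
```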

In [None]:
prompt = "What bikes do you have?"
results = vector_search(product_v_container, prompt)
for result in results:
    print(result)

In [None]:
prompt = "What do you have that is yellow?"
results = vector_search(product_v_container, prompt, num_results=4)
for result in results:
    print(result)   

## Use vector search results in a RAG pattern with Chat GPT-3.5

In [12]:
# Define a generic function to query an item by its ID
def query_item_by_id(container, id, model: Type[T]) -> T:
    query = "SELECT * FROM itm WHERE itm.id = @id"
    parameters = [
        {"name": "@id", "value": id}
    ]    
    item = list(container.query_items(
        query=query,
        parameters=parameters,
        enable_cross_partition_query=True
    ))[0]
    return model(**item)

In [13]:
# A system prompt describes the responsibilities, instructions, and persona of the AI.
system_prompt = """
Your name is "Willie". You are an AI assistant for the Cosmic Works bike store. You help people find product information for bikes and accessories. Your demeanor is friendly and playful, with lots of energy.
Do not include citations or citation numbers in your responses. Do not include emojis.
You are designed to answer questions about the products that Cosmic Works sells.

Only answer questions related to the information provided in the list of products below that are represented
in JSON format.

If you are asked a question that is not in the list, respond with "I don't know."

List of products:
"""

In [14]:
def rag_with_vector_search(
        container: ContainerProxy, 
        prompt: str,         
        vector_field_name:str="contentVector", 
        num_results:int=5):
    """
    Apply the RAG pattern: run a vector search for the incoming question, append the
    matching products to the system prompt, and return the LLM's response.
    """
    # perform the vector search and build product list
    results = vector_search(container, prompt, vector_field_name, num_results)
    product_list = ""
    for result in results:
        # retrieve the product details
        product = query_item_by_id(container, result["id"], Product)               
        # remove the contentVector field from the product details, this isn't needed for the context
        product.content_vector = None        
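        # convert the Pydantic model to a dict first; json.dumps cannot encode the model itself
        # and would otherwise fall back to its str() representation, which is not JSON
        product_list += json.dumps(product.model_dump(by_alias=True), indent=4, default=str) + "\n\n"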

    # generate prompt for the LLM with vector results
    formatted_prompt = system_prompt + product_list

    # prepare the LLM request
    messages = [
        {"role": "system", "content": formatted_prompt},
        {"role": "user", "content": prompt}
    ]

    completion = ai_client.chat.completions.create(messages=messages, model=COMPLETIONS_DEPLOYMENT_NAME)
    return completion.choices[0].message.content
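
The prompt assembly in `rag_with_vector_search` can be tried out without calling Azure Cosmos DB or Azure OpenAI. The sketch below mirrors that step using plain dictionaries in place of `Product` models; the `build_rag_messages` helper and the sample product are hypothetical and exist only to show the shape of the messages sent to the chat model.

```python
import json
from typing import Dict, List

def build_rag_messages(
        system_prompt: str,
        products: List[Dict],
        question: str,
        vector_field: str = "contentVector") -> List[Dict[str, str]]:
    """Append each product (minus its vector) to the system prompt and pair it with the user question."""
    product_list = ""
    for product in products:
        # The embedding vector adds no useful context for the LLM, so leave it out.
        context = {key: value for key, value in product.items() if key != vector_field}
        product_list += json.dumps(context, indent=4, default=str) + "\n\n"
    return [
        {"role": "system", "content": system_prompt + product_list},
        {"role": "user", "content": question},
    ]

messages = build_rag_messages(
    "List of products:\n",
    [{"id": "1", "name": "Yellow Helmet", "contentVector": [0.1, 0.2]}],
    "What do you have that is yellow?",
)
print(messages[0]["content"])
```

Printing the system message this way is a quick check that the product context is valid JSON and that no vectors leak into the prompt.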

In [None]:
print(rag_with_vector_search(product_v_container, "What bikes do you have?"))

In [None]:
print(rag_with_vector_search(product_v_container, "What are the names and skus of yellow products?"))