# First Azure Cosmos DB for NoSQL application

In [5]:
import os
from azure.cosmos import CosmosClient, DatabaseProxy, ContainerProxy
from pydantic import BaseModel
from typing import Type, TypeVar, List
from pprint import pprint
from dotenv import load_dotenv
from models import Product

## Create a database

Store the Azure Cosmos DB account connection string in a `.env` file in the root of the project; you will need to create this file. The `.env` file should contain the following value (replace the placeholder with your own connection string):

COSMOS_DB_CONNECTION_STRING="cosmos__db__connection_string"

>**Note**: If you are running using the **local emulator**, append the following value to the connection string: `&retrywrites=false&tlsallowinvalidcertificates=true`.

To create a NoSQL database in Azure Cosmos DB, first instantiate a `CosmosClient` object, then call its `create_database_if_not_exists` method. This method creates a database with the specified name if it does not exist; otherwise it returns the existing database, so no exception is raised when the database is already present.

In [None]:
load_dotenv()
CONNECTION_STRING = os.environ.get("COSMOS_DB_CONNECTION_STRING")

# Initialize the Azure Cosmos DB client
client = CosmosClient.from_connection_string(CONNECTION_STRING)

# Create or load the cosmic_works_pv database
database_name = "cosmic_works_pv"
db = client.create_database_if_not_exists(id=database_name)
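
If the `.env` file is missing or the value is malformed, the client constructor is where the failure surfaces. The following is a minimal sketch of a pre-flight check, assuming the connection string uses the `AccountEndpoint=...;AccountKey=...;` format. The helper names `parse_connection_string` and `validate_connection_string` are hypothetical and not part of the SDK, and the example values are placeholders.

```python
def parse_connection_string(connection_string: str) -> dict:
    """Split 'Key=Value;Key=Value;' pairs into a dictionary."""
    settings = {}
    for part in connection_string.split(";"):
        if not part.strip():
            continue  # skip the empty piece after a trailing semicolon
        # Split on the first '=' only: account keys are base64 and may end in '='
        key, _, value = part.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def validate_connection_string(connection_string) -> dict:
    """Raise a readable error if the setting is missing or incomplete."""
    if not connection_string:
        raise ValueError("COSMOS_DB_CONNECTION_STRING is not set; check the .env file")
    settings = parse_connection_string(connection_string)
    missing = [k for k in ("AccountEndpoint", "AccountKey") if not settings.get(k)]
    if missing:
        raise ValueError(f"Connection string is missing: {', '.join(missing)}")
    return settings


# Placeholder values for illustration only, not a real account
example = "AccountEndpoint=https://example.documents.azure.com:443/;AccountKey=ZmFrZS1rZXk=;"
print(validate_connection_string(example)["AccountEndpoint"])
```

Calling `validate_connection_string(CONNECTION_STRING)` before constructing the client would turn a missing `.env` entry into an explicit error message.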

## Create a container

The `create_container_if_not_exists` method creates a container in the database if it does not already exist, or retrieves it if it does. In this case, the `product` container is created to store product information.

When creating a container, the partition key is required. Partition keys in Azure Cosmos DB are critical for ensuring scalable and efficient performance. They function as logical sharding mechanisms, distributing data across multiple partitions to balance the load and optimize query performance. The partition key is specified as a JSON path within the item being stored, prefixed with a `/`; here, `/categoryId` refers to the `categoryId` property of each product. Choosing an effective partition key affects the throughput, latency, and overall efficiency of database operations. Learn more about [partitioning in Azure Cosmos DB](https://learn.microsoft.com/azure/cosmos-db/partitioning-overview).

In [3]:
container: ContainerProxy = db.create_container_if_not_exists(
    id="product",
    partition_key={"paths": ["/categoryId"], "kind": "Hash"}
)
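
To make the partition key path concrete, the sketch below resolves `/categoryId` against a few abbreviated items and groups them by the resulting value. It is a conceptual illustration only: the helper `partition_key_value` and the shortened items are hypothetical, and the service's actual hashing and placement of logical partitions onto physical partitions is not modeled here.

```python
from collections import defaultdict


def partition_key_value(item: dict, path: str):
    """Resolve a partition key path such as '/categoryId' against an item."""
    value = item
    for segment in path.strip("/").split("/"):
        value = value[segment]
    return value


# Abbreviated items for illustration; only the fields needed here
items = [
    {"id": "1", "categoryId": "saddles", "name": "LL Road Seat/Saddle"},
    {"id": "2", "categoryId": "touring", "name": "Touring-2000 Blue, 60"},
    {"id": "3", "categoryId": "saddles", "name": "HL Road Seat/Saddle"},
]

# Items that share a partition key value belong to the same logical partition
logical_partitions = defaultdict(list)
for item in items:
    logical_partitions[partition_key_value(item, "/categoryId")].append(item["id"])

print(dict(logical_partitions))
# → {'saddles': ['1', '3'], 'touring': ['2']}
```

A query that filters on `categoryId` can be served from a single logical partition, which is one reason the choice of key matters.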

## Create or Update a document (Upsert)

Documents in Azure Cosmos DB for NoSQL are represented as JSON objects. In this lab, the Pydantic library is used to create a model for the document. This model is then used to create a document in the database using built-in serialization methods. Find the models in the `models` folder. Notice that the class property definitions include aliases, which override the serialized property names. This is useful when the property names in the model do not match the property names desired in the database.
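
The lab's actual model lives in the `models` folder and is not reproduced here. As a rough sketch of how aliases behave, assuming Pydantic v2, a model along the following lines would accept snake_case field names in Python and serialize camelCase property names. The class name `ProductSketch` and its exact field definitions are assumptions for illustration, not the lab's `Product` class.

```python
from pydantic import BaseModel, ConfigDict, Field


class ProductSketch(BaseModel):
    # Allow construction by field name (category_id) as well as by alias (categoryId)
    model_config = ConfigDict(populate_by_name=True)

    id: str
    category_id: str = Field(alias="categoryId")
    category_name: str = Field(alias="categoryName")
    sku: str
    name: str
    description: str
    price: float


item = ProductSketch(
    id="1",
    category_id="c1",
    category_name="Components, Saddles",
    sku="SE-R581",
    name="LL Road Seat/Saddle",
    description="Example item",
    price=27.12,
)

# by_alias=True emits the alias names, which become the JSON property names
document = item.model_dump(by_alias=True)
print(sorted(document))

# A document keyed by alias can be loaded back into the model
round_tripped = ProductSketch(**document)
print(round_tripped.category_id)
```

With this shape, the serialized `categoryId` property lines up with the `/categoryId` partition key path used when the container was created.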

One method of creating a document is the `create_item` method. This method takes a single document and inserts it into the container; if the item already exists in the container, an exception is thrown. Alternatively, the `upsert_item` method inserts the document if it does not exist and updates it if it does.

In [None]:
product = Product(
    id="2BA4A26C-A8DB-4645-BEB9-F7D42F50262E",
    category_id="56400CF3-446D-4C3F-B9B2-68286DA3BB99",
    category_name="Bikes, Mountain Bikes",
    sku="BK-M18S-42",
    name="Mountain-100 Silver, 42",
    description='The product called "Mountain-500 Silver, 42"',
    price=742.42,
)

# Upsert the product into the container by converting it to a dictionary using the alias names where present.
container.upsert_item(product.model_dump(by_alias=True))

print(f"Upserted product with ID: {product.id}")
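
When overwriting an existing document is not wanted, `create_item` can be used instead and the conflict handled explicitly. The sketch below is one way to do that; the helper name `create_if_absent` is hypothetical. To stay free of SDK imports it inspects the `status_code` attribute carried by the SDK's HTTP errors (409 for a conflict); with the SDK installed, catching `azure.cosmos.exceptions.CosmosResourceExistsError` directly is the more idiomatic choice.

```python
def create_if_absent(container, body: dict) -> bool:
    """Insert `body`; return False instead of raising when the item already exists."""
    try:
        container.create_item(body=body)
        return True
    except Exception as exc:
        # HTTP errors raised by the SDK carry a status_code; 409 means Conflict
        if getattr(exc, "status_code", None) == 409:
            return False
        raise  # anything else is unexpected and should surface
```

For example, `create_if_absent(container, product.model_dump(by_alias=True))` would return `False` on a second run rather than replacing the stored document.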

## Read a document

To read a document from the database, use the `read_item` method. This method takes the document id and the partition key as arguments and returns the document. If the document does not exist, an exception is thrown. The `query_items` method can also be used to retrieve documents from the database. This method takes a query string as an argument and returns an iterable of the documents that match the query.

In this case, the `query_items` method is used so that the document can be retrieved by its id alone, without also having to provide the partition key.

In [None]:
# Create a generic helper function to retrieve an item from a container by its id value
T = TypeVar('T', bound=BaseModel)
def query_item_by_id(container, id, model: Type[T]) -> T:
    query = "SELECT * FROM itm WHERE itm.id = @id"
    parameters = [
        {"name": "@id", "value": id}
    ]
    items = list(container.query_items(
        query=query,
        parameters=parameters,
        enable_cross_partition_query=True
    ))
    # Fail with a clear message rather than an IndexError when nothing matches
    if not items:
        raise LookupError(f"No item found with id: {id}")
    return model(**items[0])
   
# Retrieve the product from the container by its id and cast it to the Product model
retrieved_product = query_item_by_id(container, product.id, Product)

# Print the retrieved product
print("\nCast Product from document retrieved from Azure Cosmos DB:")
print(retrieved_product)
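
When the partition key value is known, a point read with `read_item` avoids the cross-partition query and is typically the cheaper option. The following is a minimal sketch; the helper name `read_item_by_id` is hypothetical, and `model` is assumed to be any callable that accepts the document's properties as keyword arguments, such as the lab's `Product` class.

```python
from typing import Callable, TypeVar

M = TypeVar("M")


def read_item_by_id(container, item_id: str, partition_key: str, model: Callable[..., M]) -> M:
    """Point read: fetch one item by id and partition key, then build the model."""
    item = container.read_item(item=item_id, partition_key=partition_key)
    return model(**item)
```

In this lab the call would look like `read_item_by_id(container, product.id, product.category_id, Product)`, since the container is partitioned on `/categoryId`.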

## Delete a document

The `delete_item` method is used to delete a single document from the container. This method takes the item (its `id` or the document itself) and the `partition_key` as arguments and deletes the document. If the document does not exist, an exception is thrown.

In [None]:
container.delete_item(item=retrieved_product.id, partition_key=retrieved_product.category_id)
print(f"Deleted the product with ID: {retrieved_product.id}")
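
Because `delete_item` raises when the document is missing, re-running the cell above fails on the second attempt. One way to make deletion repeatable is sketched below; the helper name `delete_if_exists` is hypothetical. It checks the `status_code` attribute of the raised error (404 for not found) so the sketch needs no SDK imports; catching `azure.cosmos.exceptions.CosmosResourceNotFoundError` is the more direct alternative when the SDK is available.

```python
def delete_if_exists(container, item_id: str, partition_key: str) -> bool:
    """Delete an item; return False instead of raising when it is already gone."""
    try:
        container.delete_item(item=item_id, partition_key=partition_key)
        return True
    except Exception as exc:
        # HTTP errors raised by the SDK carry a status_code; 404 means Not Found
        if getattr(exc, "status_code", None) == 404:
            return False
        raise  # anything else is unexpected and should surface
```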

## Query for multiple documents

The `query_items` method is used to query for multiple documents in the database. This method takes a query string to perform a [SQL-like query](https://learn.microsoft.com/azure/cosmos-db/nosql/tutorial-query) on the documents in the container, retrieving all documents that match the query.

In [None]:
# Insert multiple documents
products = [
    Product(
        id="2BA4A26C-A8DB-4645-BEB9-F7D42F50262E",
        category_id="56400CF3-446D-4C3F-B9B2-68286DA3BB99",
        category_name="Bikes, Mountain Bikes",
        sku="BK-M18S-42",
        name="Mountain-100 Silver, 42",
        description='The product called "Mountain-500 Silver, 42"',
        price=742.42,
    ),
    Product(
        id="027D0B9A-F9D9-4C96-8213-C8546C4AAE71",
        category_id="26C74104-40BC-4541-8EF5-9892F7F03D72",
        category_name="Components, Saddles",
        sku="SE-R581",
        name="LL Road Seat/Saddle",
        description='The product called "LL Road Seat/Saddle"',
        price=27.12,
    ),
    Product(
        id="4E4B38CB-0D82-43E5-89AF-20270CD28A04",
        category_id="75BF1ACB-168D-469C-9AA3-1FD26BB4EA4C",
        category_name="Bikes, Touring Bikes",
        sku="BK-T44U-60",
        name="Touring-2000 Blue, 60",
        description='The product called "Touring-2000 Blue, 60"',
        price=1214.85,
    ),
    Product(
        id="5B5E90B8-FEA2-4D6C-B728-EC586656FA6D",
        category_id="75BF1ACB-168D-469C-9AA3-1FD26BB4EA4C",
        category_name="Bikes, Touring Bikes",
        sku="BK-T79Y-60",
        name="Touring-1000 Yellow, 60",
        description='The product called "Touring-1000 Yellow, 60"',
        price=2384.07,
    ),
    Product(
        id="7BAA49C9-21B5-4EEF-9F6B-BCD6DA7C2239",
        category_id="26C74104-40BC-4541-8EF5-9892F7F03D72",
        category_name="Components, Saddles",
        sku="SE-R995",
        name="HL Road Seat/Saddle",
        description='The product called "HL Road Seat/Saddle"',
        price=52.64,
    ),
]
for product in products:
    container.upsert_item(product.model_dump(by_alias=True))
    print(f"Upserted product with ID: {product.id}")

# Create a generic helper function to query items in a container.
# This function re-uses the TypeVar and BaseModel from the Read a document example.
def query_items(container, query, model: Type[T]) -> List[T]:
    items = container.query_items(query=query, enable_cross_partition_query=True)
    return [model(**item) for item in items]

# retrieve all products via a query
retrieved_products = query_items(container,"SELECT * FROM prod", Product)
print(f"Retrieved: {len(retrieved_products)} products")


In [None]:
# Print all documents that have a category name of "Components, Saddles"
for result in query_items(container, "SELECT * FROM prod WHERE prod.categoryName='Components, Saddles'", Product):    
    pprint(result)
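
The query above embeds the category name directly in the query text. When the value comes from user input or a variable, binding it as a parameter (as the earlier id lookup does) avoids quoting problems. The sketch below builds the keyword arguments for `container.query_items`; the helper name `category_query_kwargs` is hypothetical.

```python
def category_query_kwargs(category_name: str) -> dict:
    """Build keyword arguments for container.query_items with a bound parameter."""
    return {
        "query": "SELECT * FROM prod WHERE prod.categoryName = @categoryName",
        "parameters": [{"name": "@categoryName", "value": category_name}],
        "enable_cross_partition_query": True,
    }


kwargs = category_query_kwargs("Components, Saddles")
print(kwargs["parameters"])
# With a live container: items = container.query_items(**kwargs)
```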

## Clean up resources

The following cell deletes the database and the container created in this lab by calling the `delete_database` method on the client object. This method takes the name of the database to delete as an argument; deleting a database also deletes the containers it holds. To delete only the container, use the `delete_container` method on the database object, which takes the name of the container to delete as an argument.

In [10]:
# db.delete_container("product")
client.delete_database("cosmic_works_pv")