3.2 Create Search Index¶

1. Create Search Index Script¶

Let's copy over the create-search-index.py script into our application source folder.

Make sure you are in the root directory of the repo.

Then run this command:

1	`cp src.sample/api/create-search-index.py src/api/.`

2. Understand Index Creation¶

Now, let's take a look at what this does.

Click to expand and view the Python Script to create the search index

src/api/create-search-index.py
    import os
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import ConnectionType
    from azure.identity import DefaultAzureCredential
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.indexes import SearchIndexClient
    from config import get_logger

    # initialize logging object
    logger = get_logger(__name__)

    # create a project client using environment variables loaded from the .env file
    project = AIProjectClient.from_connection_string(
        conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
    )

    # create a vector embeddings client that will be used to generate vector embeddings
    embeddings = project.inference.get_embeddings_client()

    # use the project client to get the default search connection
    search_connection = project.connections.get_default(
        connection_type=ConnectionType.AZURE_AI_SEARCH, include_credentials=True
    )

    # Create a search index client using the search connection
    # This client will be used to create and delete search indexes
    index_client = SearchIndexClient(
        endpoint=search_connection.endpoint_url, credential=AzureKeyCredential(key=search_connection.key)
    )

First the script sets up a search index_client:

Creates an Azure AI Project Client instance (configured with connection string)
Retrieves an embeddings inference client from the AI project (maps to that model)
Retrieves a search_connection object from the AI project instance
Creates an index_client search index client using the search connection (key, endpoint)

First it defines the index based on a vector derived from product data fields.

It maps product name to a title property
It maps product description to a content property
It uses HNSW algorithm (cosine distance) for similarity
It prioritizes "content" for semantic ranking

It then creates the index from CSV and populates it using the index_client.

It defines an index using the specified name and embeddings model
It loads CSV and generates vector embeddings for each description
It uploads each vectorized document into the pre-defined search index

3. Run Index Creation Script¶

To get the index created in Azure AI Search,

Change to the src/api folder
1
cd src/api
Run the script:
1
python create-search-index.py

4. Verify Search Index¶

Then verify that the index was created successfully:

Visit the Azure Portal and look up your Resource Group
Visit the Azure AI Search resource page from that RG
Click on "Search Explorer" from the resource overview page
Click "Search" - verify that you see indexed products