3.2 Create Search Index

1. Create Search Index Script

Let's copy over the create-search-index.py script into our application source folder.

cp src.sample/api/create-search-index.py  src/api/.

2. Understand Index Creation

Now, let's take a look at what this does.

src/api/create-search-index.py
    import os
    from azure.ai.projects import AIProjectClient
    from azure.ai.projects.models import ConnectionType
    from azure.identity import DefaultAzureCredential
    from azure.core.credentials import AzureKeyCredential
    from azure.search.documents import SearchClient
    from azure.search.documents.indexes import SearchIndexClient
    from config import get_logger

    # initialize logging object
    logger = get_logger(__name__)

    # create a project client using environment variables loaded from the .env file
    project = AIProjectClient.from_connection_string(
        conn_str=os.environ["AIPROJECT_CONNECTION_STRING"], credential=DefaultAzureCredential()
    )

    # create a vector embeddings client that will be used to generate vector embeddings
    embeddings = project.inference.get_embeddings_client()

    # use the project client to get the default search connection
    search_connection = project.connections.get_default(
        connection_type=ConnectionType.AZURE_AI_SEARCH, include_credentials=True
    )

    # Create a search index client using the search connection
    # This client will be used to create and delete search indexes
    index_client = SearchIndexClient(
        endpoint=search_connection.endpoint_url, credential=AzureKeyCredential(key=search_connection.key)
    )

First, the script sets up a search index client (index_client):

  1. Creates an Azure AI Project client instance, configured with the connection string from the environment
  2. Retrieves an embeddings inference client from the AI project, used to generate vector embeddings
  3. Retrieves a search_connection object from the AI project instance
  4. Creates the index_client search index client from the search connection (endpoint, key)
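
The project client is built from the AIPROJECT_CONNECTION_STRING environment variable, so a missing value surfaces only as a bare KeyError on that line. A minimal sketch of a pre-flight check is shown here; `require_env` is a hypothetical helper introduced for illustration and is not part of the script or the Azure SDK.

```python
import os


def require_env(name: str) -> str:
    """Return an environment variable's value, or fail with a readable message.

    Hypothetical helper for illustration only; not part of the script.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; check that your .env file has been loaded."
        )
    return value
```

Calling `require_env("AIPROJECT_CONNECTION_STRING")` before constructing the project client would turn a missing setting into an error that names the variable to fix.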

Next, it defines the index, which is built around a vector field derived from the product data:

  1. It maps product name to a title property
  2. It maps product description to a content property
  3. It uses HNSW algorithm (cosine distance) for similarity
  4. It prioritizes "content" for semantic ranking

Finally, it creates the index with the index_client and populates it from a CSV file of product data:

  1. It defines an index using the specified name and embeddings model
  2. It loads CSV and generates vector embeddings for each description
  3. It uploads each vectorized document into the pre-defined search index
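
The load-and-vectorize step can be sketched with the standard library alone. Everything specific here is an assumption for illustration: the CSV column names (`id`, `name`, `description`), the `contentVector` field name, the two made-up sample rows, and `fake_embed`, which is a stand-in for the real embeddings client and does not produce meaningful vectors.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List


def fake_embed(text: str) -> List[float]:
    """Stand-in for the embeddings client; NOT a real embedding."""
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def rows_to_documents(
    rows: Iterable[Dict[str, str]],
    embed: Callable[[str], List[float]],
) -> List[dict]:
    """Turn CSV rows into search documents, one embedding per description."""
    documents = []
    for row in rows:
        documents.append(
            {
                "id": str(row["id"]),
                "title": row["name"],            # product name -> title
                "content": row["description"],   # product description -> content
                "contentVector": embed(row["description"]),
            }
        )
    return documents


# Made-up sample rows, used only so the sketch runs without external files.
SAMPLE_CSV = """id,name,description
1,Trail Tent,Lightweight two-person tent
2,Camp Stove,Compact single-burner stove
"""

documents = rows_to_documents(csv.DictReader(io.StringIO(SAMPLE_CSV)), fake_embed)
```

In the real script the embedding would come from the embeddings client retrieved earlier, and a list of dictionaries of this shape is what the azure-search-documents `SearchClient.upload_documents` method accepts.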

3. Run Index Creation Script

To get the index created in Azure AI Search, run the script described above.

python create-search-index.py

4. Verify Search Index

Then verify that the index was created successfully:

  1. Visit the Azure Portal and look up your Resource Group
  2. Visit the Azure AI Search resource page from that resource group
  3. Click on "Search Explorer" from the resource overview page
  4. Click "Search" - verify that you see indexed products