2. Model Catalog

The Azure AI Foundry model catalog is the starting point for model selection. It currently has 1800+ frontier, industry, and open-source models that can be filtered by collection, industry, deployment option, inference task, and license. You can also use the built-in search to find models by name or other criteria. Let's explore this.

Start by opening a new private browser window in guest mode and navigating to the model catalog page in Azure AI Foundry. You should see this:

FIGURE: example screenshot. Note the model count (e.g., 1819 models).

Selection

2.1 Filter By Inference Task

The first step is to see if the catalog has any models that will fit your specific needs. Typically, this will involve knowing the inference task you want to perform, and filtering the catalog to see matching options. Inference tasks can fall under various categories like:

natural language processing (e.g., text generation, question answering)
computer vision (e.g., image classification, image segmentation)
audio (e.g., text-to-speech, audio generation)
multimodal (e.g., visual question answering, document question answering)

Filter the catalog by a specific inference task to see matching models

Filter by Text generation → see: 375+ models
Filter by Embeddings → see: 11+ models
Filter by Chat completion → see: 62+ models
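The per-task counts above can be thought of as a simple tally over catalog metadata. The sketch below illustrates that idea on a tiny in-memory list; the model names, task labels, and the `count_by_task` helper are all hypothetical and are not part of any Azure AI Foundry API.

```python
from collections import Counter

# Hypothetical catalog entries; names and task labels are illustrative only.
CATALOG = [
    {"name": "example-llm-a", "tasks": ["text-generation", "chat-completion"]},
    {"name": "example-llm-b", "tasks": ["text-generation"]},
    {"name": "example-embedder", "tasks": ["embeddings"]},
    {"name": "example-vision", "tasks": ["image-classification"]},
]


def count_by_task(catalog):
    """Count how many models list each inference task."""
    counts = Counter()
    for model in catalog:
        # set() guards against a task being listed twice on one model.
        counts.update(set(model["tasks"]))
    return counts


task_counts = count_by_task(CATALOG)
for task in ("text-generation", "embeddings", "chat-completion"):
    print(f"Filter by {task} -> see: {task_counts[task]} models")
```

Note that a model supporting several tasks is counted once under each of them, so per-task counts can add up to more than the catalog total.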

2.2 Filter By Deployment Type

Now let's look at the first filter (text generation), which gives us 375+ matching results. How can we narrow this down further? One way is to filter by deployment option.

Managed compute - provides a managed online endpoint (API) on a provisioned VM.
Serverless API - provides pay-as-you-go billing through a models-as-a-service (MaaS) approach.

The serverless API option can be more cost-effective and does not consume your model quota while still providing enterprise security and compliance guarantees. Let's try this out:

Filter the catalog by an inference task and deployment type to see matching models

First, Filter by Text generation → see: 375+ models
Then, Filter By Serverless API deployment → see: 3 models (manageable subset)
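Stacking filters amounts to narrowing the same list in successive steps. The following is a minimal sketch of that idea, assuming each model carries a task label and a set of deployment options; the `Model` class, the sample entries, and both helper functions are hypothetical stand-ins for what the catalog UI does for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    task: str
    deployments: frozenset  # deployment options this model supports


# Hypothetical entries; real catalog names and options will differ.
CATALOG = [
    Model("example-llm-a", "text-generation", frozenset({"managed-compute"})),
    Model("example-llm-b", "text-generation",
          frozenset({"managed-compute", "serverless-api"})),
    Model("example-llm-c", "text-generation", frozenset({"serverless-api"})),
    Model("example-embedder", "embeddings", frozenset({"serverless-api"})),
]


def filter_by_task(models, task):
    """Keep only models whose inference task matches."""
    return [m for m in models if m.task == task]


def filter_by_deployment(models, option):
    """Keep only models that offer the given deployment option."""
    return [m for m in models if option in m.deployments]


# Step 1: narrow by inference task. Step 2: narrow the result by deployment.
step1 = filter_by_task(CATALOG, "text-generation")
step2 = filter_by_deployment(step1, "serverless-api")

print(len(step1), "models after task filter")
print(len(step2), "models after deployment filter:", [m.name for m in step2])
```

Each step only ever removes candidates, which is why a second filter turns a long list into a manageable subset.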

2.3 Filter By Collection Type

Another way to filter models is by collection. At a high level, there are 3 key collections:

Curated by Azure AI - frontier models that have been scanned for vulnerabilities.
Hugging Face - open-source model variants from the community.
Benchmark Results - models that we can compare benchmarks on.

You can also select a specific model provider in the collections filter, to see only models from that provider. This is a particularly useful filter to use if you want to prioritize using an open-source model, or want to pick models that you can compare benchmarks on. Let's try it.

Filter the catalog by inference task and benchmark results collection to see matching models

First, Filter by Text generation → see: 375+ models
Then, Filter By Benchmark Results collection → see: 22 models (that you can compare)
OR Filter by Hugging Face → see: 322 models (that are open-source)

2.4 Filter By Industry Domain

Last but not least, there is now a specialized Industry filter that lets you select models curated and tailored for vertical domains such as Health and Life Sciences or Financial Services. Because these models are industry-specific, this filter can be more effective as the first step in discovery. Let's try it.

Filter the catalog by an industry to see matching models

Filter by Financial Services → see: 10 models including Saifr → Clear results
First, Filter by Health & Life Sciences Industry → see: 20 models
Then, Filter by Embeddings Inference Task → see: 2 models
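One way to see why a narrow filter is a good first step: if the catalog combines filters with a logical AND (an assumption here, not something stated above), the final result is a set intersection and does not depend on filter order, but the intermediate shortlist does. The sketch below uses two hypothetical sets of model names and a hypothetical `narrow` helper to show this.

```python
# Hypothetical sets of model names matching each filter on its own.
HEALTH = {"example-med-embed-1", "example-med-embed-2", "example-med-llm"}
EMBEDDINGS = {
    "example-med-embed-1",
    "example-med-embed-2",
    "example-general-embed-1",
    "example-general-embed-2",
}


def narrow(first, second):
    """Apply two AND-ed filters in order.

    Returns the final matches and the size of the shortlist
    you would be looking at after the first filter alone.
    """
    after_first = set(first)
    return after_first & set(second), len(after_first)


industry_first, shortlist_a = narrow(HEALTH, EMBEDDINGS)
task_first, shortlist_b = narrow(EMBEDDINGS, HEALTH)

print(sorted(industry_first))          # final matches, industry filter first
print(industry_first == task_first)    # same final matches either way
print(shortlist_a, shortlist_b)        # intermediate shortlist sizes differ
```

In this toy data the industry filter leaves a shorter intermediate list than the task filter, which mirrors the point that the more specific filter is the more useful starting place.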

2.5 Filter By Fine-Tuning Task

Model selection is typically followed by model customization - using prompt engineering, retrieval augmented generation, or fine-tuning - so that the model's responses meet your application's quality and safety criteria. Fine-tuning works by performing additional training on an existing pre-trained model, using a relevant new dataset, to enhance performance or add new skills.

Currently only a subset of models in the catalog can be fine-tuned, and these may have added constraints like regional availability for fine-tuning. Let's see how this works.

Filter the catalog for a model that can be fine-tuned for text generation

Filter by Text generation for INFERENCE → see: 375+ models → Clear results
Filter by Text generation for FINE-TUNING → see: 14 models
Then, Filter by Serverless API Deployment → see: 3 models (Llama-2)

2.6 Search By Keyword

Sometimes, the predefined filters are not sufficient to reduce the model subset to a manageable level for manual evaluation. This can be for various reasons:

You want to see if the catalog has a specific model name.
The model inference task may not be a standard option.
You want to see if there are models with a specific capability.

Example 1 (Taxonomy mismatch) - search by category name

First, filter by the Embeddings inference task → see: 10 models (no Hugging Face)
Now, search for "Sentence Similarity" (HF taxonomy) → see: 7 open-source models

Example 2 (Known entity) - search by name

Search for "smol" → see: 1 model = flagship SLM from Hugging Face
Search for "unsloth" → see: 2 models = from specific community creator

Example 3 (Other keywords) - search by capability

Search for "sql" → see: 2 models = create sql queries using natural language
Search for "biomed" → see: models = focus on biomedical applications & data
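Keyword search complements the filters because it can match free text rather than fixed categories. The sketch below shows one plausible interpretation, a case-insensitive substring match over model names and descriptions. How the catalog search actually matches is not described here, so treat the `search` helper and the sample entries as hypothetical.

```python
# Hypothetical catalog entries; names and descriptions are illustrative only.
CATALOG = [
    {"name": "example-text2query",
     "description": "Generates SQL queries from natural language"},
    {"name": "example-sql-coder",
     "description": "Code model for database work"},
    {"name": "example-biomed-bert",
     "description": "Encoder trained on biomedical literature"},
    {"name": "example-chat",
     "description": "General-purpose chat assistant"},
]


def search(catalog, keyword):
    """Return names of models whose name or description contains the keyword.

    Matching is case-insensitive and substring-based (an assumption).
    """
    needle = keyword.casefold()
    return [
        model["name"]
        for model in catalog
        if needle in model["name"].casefold()
        or needle in model["description"].casefold()
    ]


for term in ("sql", "biomed", "smol"):
    hits = search(CATALOG, term)
    print(f'Search for "{term}" -> see: {len(hits)} models {hits}')
```

A keyword can hit on the description even when the name gives no hint, which is how a capability search can surface models that a name search would miss.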