
2. Model Catalog

The Azure AI Foundry model catalog is the starting point for model selection. It currently has 1800+ frontier, industry, and open-source models that can be filtered by collection, industry, deployment option, inference task, and license. You can also use the built-in search to find models by name or other criteria. Let's explore this.

Start by opening a new private browser window in guest mode and navigating to the Azure AI Model catalog page in Azure AI Foundry. You should see this:

FIGURE: Example screenshot of the model catalog. Note the model count (e.g., 1819 models).


2.1 Filter By Inference Task

The first step is to see if the catalog has any models that will fit your specific needs. Typically, this will involve knowing the inference task you want to perform, and filtering the catalog to see matching options. Inference tasks can fall under various categories like:

  • natural language processing (e.g., text generation, question answering)
  • computer vision (e.g., image classification, image segmentation)
  • audio (e.g., text-to-speech, audio generation)
  • multimodal (e.g., visual question answering, document question answering)

Filter the catalog by a specific inference task to see matching models:

  1. Filter by Text generation → see: 375+ models
  2. Filter by Embeddings → see: 11+ models
  3. Filter by Chat completion → see: 62+ models
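
As a rough illustration of what the inference task filter does, here is a minimal Python sketch over an in-memory list. Everything in it is an assumption made for illustration: the model names, the task labels, and the `count_by_task` and `filter_by_task` helpers are hypothetical and are not part of any Azure AI Foundry API or real catalog data.

```python
from collections import Counter

# Hypothetical catalog entries: (model name, supported inference tasks).
# These names and labels are made up; they are not real catalog data.
CATALOG = [
    ("example-chat-a", ["chat-completion", "text-generation"]),
    ("example-embed-a", ["embeddings"]),
    ("example-gen-a", ["text-generation"]),
    ("example-vision-a", ["image-classification"]),
]


def count_by_task(catalog):
    """Count how many models advertise each inference task."""
    counts = Counter()
    for _name, tasks in catalog:
        counts.update(tasks)
    return counts


def filter_by_task(catalog, task):
    """Return the names of models that support the given inference task."""
    return [name for name, tasks in catalog if task in tasks]


if __name__ == "__main__":
    print(count_by_task(CATALOG))
    print(filter_by_task(CATALOG, "text-generation"))
```

A model can list more than one task, which is why the per-task counts in a catalog like this can add up to more than the total number of models.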

2.2 Filter By Deployment Type

Now let's look at the first filter (text generation) - this gives us 375+ results that match. How can we filter this down further? One way is to filter by deployment options.

  • Managed compute - provides a managed online endpoint (API) in a provisioned VM.
  • Serverless API - provides pay-as-you-go billing and a models-as-a-service (MaaS) approach.

The serverless API option can be more cost-effective and does not consume your model quota while still providing enterprise security and compliance guarantees. Let's try this out:

Filter the catalog by an inference task and deployment type to see matching models:

  1. First, filter by Text generation → see: 375+ models
  2. Then, filter by Serverless API deployment → see: 3 models (manageable subset)
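
The narrowing effect of stacking two filters can be sketched in Python as follows. This is only a sketch under stated assumptions: `ModelEntry`, `entry`, `narrow`, and the sample entries are hypothetical names invented here, and the catalog UI may implement filtering differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """Hypothetical catalog record; field names are assumptions."""
    name: str
    tasks: frozenset
    deployments: frozenset


def entry(name, tasks, deployments):
    """Convenience constructor that converts lists to frozensets."""
    return ModelEntry(name, frozenset(tasks), frozenset(deployments))


# Made-up sample data, not real catalog contents.
CATALOG = [
    entry("example-gen-a", ["text-generation"], ["managed-compute"]),
    entry("example-gen-b", ["text-generation"], ["managed-compute", "serverless-api"]),
    entry("example-embed-a", ["embeddings"], ["serverless-api"]),
]


def narrow(catalog, task=None, deployment=None):
    """Apply each provided filter in turn; None means 'do not filter on this'."""
    results = list(catalog)
    if task is not None:
        results = [m for m in results if task in m.tasks]
    if deployment is not None:
        results = [m for m in results if deployment in m.deployments]
    return results


if __name__ == "__main__":
    step1 = narrow(CATALOG, task="text-generation")
    step2 = narrow(CATALOG, task="text-generation", deployment="serverless-api")
    print(len(step1), [m.name for m in step2])
```

Each added filter can only keep or shrink the result set, which is why combining filters is a quick way to reach a subset small enough to evaluate by hand.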

2.3 Filter By Collection Type

Another way to filter models is by collection. At a high level, there are 3 key collections:

You can also select a specific model provider in the collections filter, to see only models from that provider. This is a particularly useful filter to use if you want to prioritize using an open-source model, or want to pick models that you can compare benchmarks on. Let's try it.

Filter the catalog by inference task and the Benchmark Results collection to see matching models:

  1. First, filter by Text generation → see: 375+ models
  2. Then, filter by the Benchmark Results collection → see: 22 models (that I can compare)
  3. OR filter by Hugging Face → see: 322 models (that are open-source)

2.4 Filter By Industry Domain

Last but not least, we now have a specialized filter for Industry, allowing you to select models that have been specifically curated and tailored for use in vertical domains like Health and Life Sciences, Financial Services etc. Because these are industry-specific, they can be more effective as the first filter for discovery. Let's try it.

Filter the catalog by an industry to see matching models:

  1. Filter by Financial Services → see: 10 models including Saifr → Clear results
  2. First, filter by Health & Life Sciences Industry → see: 20 models
  3. Then, filter by Embeddings Inference Task → see: 2 models

2.5 Filter By Fine-Tuning Task

Model selection is typically followed by model customization - using prompt engineering, retrieval augmented generation, or fine-tuning - to improve the model's responses so they meet your application's quality and safety criteria. Fine-tuning works by performing additional training on an existing pre-trained model using a relevant new dataset to enhance performance or add new skills.

Currently only a subset of models in the catalog can be fine-tuned, and these may have added constraints like regional availability for fine-tuning. Let's see how this works.

Filter the catalog for a fine-tuning model for text generation:

  1. Filter by Text generation for INFERENCE → see: 375+ models → Clear results
  2. Filter by Text generation for FINE-TUNING → see: 14 models
  3. Then, filter by Serverless API Deployment → see: 3 models (Llama-2)

2.6 Search By Keyword

Sometimes, the predefined filters are not sufficient to reduce the model subset to a manageable level for manual evaluation. This can be for various reasons:

  1. You want to see if the catalog has a specific model name.
  2. The model inference task may not be a standard option.
  3. You want to see if there are models with a specific capability.

Example 1 (Taxonomy mismatch) - search by category name

  1. First, look for the Embeddings inference task → see: 10 models (no Hugging Face)
  2. Now, search for "Sentence Similarity" (HF taxonomy) → see: 7 open-source models

Example 2 (Known entity) - search by name

  1. Search for "smol" → see: 1 model = flagship SLM from Hugging Face
  2. Search for "unsloth" → see: 2 models = from specific community creator

Example 3 (Other keywords) - search by capability

  1. Search for "sql" → see: 2 models = create SQL queries using natural language
  2. Search for "biomed" → see: models = focus on biomedical applications & data
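
To show why keyword search can surface models that a predefined filter misses, here is a minimal Python sketch of a case-insensitive substring search across a model's name, description, and tags. The `normalize` and `search` helpers, the field names, and the sample entries are all hypothetical; how the catalog's own search matches keywords is not described here, so treat this purely as an assumption-laden illustration.

```python
def normalize(text):
    """Lowercase and treat hyphens/underscores as spaces so that
    'Sentence Similarity' matches a 'sentence-similarity' tag."""
    return text.lower().replace("-", " ").replace("_", " ")


def search(catalog, keyword):
    """Return names of models whose name, description, or tags contain the keyword."""
    needle = normalize(keyword)
    hits = []
    for model in catalog:
        haystack = normalize(
            " ".join([model["name"], model["description"], *model["tags"]])
        )
        if needle in haystack:
            hits.append(model["name"])
    return hits


# Made-up sample data, not real catalog contents.
CATALOG = [
    {
        "name": "example-minilm",
        "description": "Maps sentences to vectors",
        "tags": ["sentence-similarity"],
    },
    {
        "name": "example-text2sql",
        "description": "Creates SQL queries from natural language",
        "tags": ["text-generation"],
    },
    {
        "name": "example-embed-a",
        "description": "General-purpose embedding model",
        "tags": ["embeddings"],
    },
]

if __name__ == "__main__":
    print(search(CATALOG, "Sentence Similarity"))
    print(search(CATALOG, "sql"))
```

In this sketch, the first model would not appear under an "embeddings" tag filter because it is labeled with a different taxonomy, yet a keyword search for "Sentence Similarity" still finds it, mirroring Example 1 above.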