Resources¶

1. API Reference:¶

API Endpoint	Description
/info	Returns the information about the model deployed under the endpoint.
/embeddings	Creates an embedding vector representing the input text.
/completions	Creates a completion for the provided prompt and parameters (think request-response)
/chat/completions	Creates a model response for the given chat conversation. (think multi-turn with history)
/images/embeddings	Creates an embedding vector representing the input image and text pair.

2. Documentation¶

The inference API is currently supported with SDKs for Python, JavaScript, C# and REST API calls.

Azure AI Model Inference API - Main documentation page for the API

3. Code Samples¶

Samples can be used with Serverless API and Managed Compute model deployments. All samples use Python and depend on the azure-ai-inference library, with the async samples also requiring the aiohttp library.

17 Samples using sync client with chat completions endpoint
4 Samples using sync client with embeddings endpoint
6 Samples using async client with chat completions endpoint
1 Samples using async client with embeddings endpoint