## Resources

### 1. API Reference
| API Endpoint | Description |
| --- | --- |
| /info | Returns information about the model deployed under the endpoint. |
| /embeddings | Creates an embedding vector representing the input text. |
| /completions | Creates a completion for the provided prompt and parameters (think request-response). |
| /chat/completions | Creates a model response for the given chat conversation (think multi-turn with history). |
| /images/embeddings | Creates an embedding vector representing the input image and text pair. |
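For orientation, here is a minimal sketch of how the `azure-ai-inference` Python client maps onto three of these endpoints. The endpoint URL and key are placeholders, and not every deployment exposes every route (for example, /info availability depends on the model provider), so treat this as an illustration rather than a complete reference.

```python
from azure.ai.inference import ChatCompletionsClient, EmbeddingsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-endpoint>.inference.ai.azure.com"  # placeholder endpoint URL
credential = AzureKeyCredential("<your-api-key>")            # placeholder key

# /info and /chat/completions are reached through the chat completions client.
chat_client = ChatCompletionsClient(endpoint=endpoint, credential=credential)
print(chat_client.get_model_info())  # GET /info

response = chat_client.complete(     # POST /chat/completions
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="How many feet are in a mile?"),
    ]
)
print(response.choices[0].message.content)

# /embeddings is reached through the embeddings client.
embeddings_client = EmbeddingsClient(endpoint=endpoint, credential=credential)
embeddings = embeddings_client.embed(input=["first phrase", "second phrase"])  # POST /embeddings
print(len(embeddings.data[0].embedding))
```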
### 2. Documentation

The inference API currently has SDKs for Python, JavaScript, and C#, and it can also be called directly over REST.
- Azure AI Model Inference API - Main documentation page for the API
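For the REST route, a plain HTTP request is enough. The sketch below uses only the Python standard library; the authentication header and `api-version` query parameter shown here are assumptions and should be checked against the API reference for your deployment type.

```python
import json
import urllib.request

endpoint = "https://<your-endpoint>.inference.ai.azure.com"  # placeholder endpoint URL
api_key = "<your-api-key>"                                   # placeholder key

payload = json.dumps({
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "How many feet are in a mile?"},
    ]
}).encode("utf-8")

request = urllib.request.Request(
    url=f"{endpoint}/chat/completions?api-version=2024-05-01-preview",  # api-version is an assumption
    data=payload,
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # some deployments expect an 'api-key' header instead
    },
    method="POST",
)

with urllib.request.urlopen(request) as response:
    result = json.loads(response.read())
    print(result["choices"][0]["message"]["content"])
```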
### 3. Code Samples

Samples can be used with both Serverless API and Managed Compute model deployments. All samples use Python and depend on the `azure-ai-inference` library; the async samples also require the `aiohttp` library.
- 17 samples using the sync client with the chat completions endpoint
- 4 samples using the sync client with the embeddings endpoint
- 6 samples using the async client with the chat completions endpoint
- 1 sample using the async client with the embeddings endpoint
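As a sketch of the async variant, the example below makes the same kind of chat completions call with the client from `azure.ai.inference.aio`, which relies on `aiohttp` for its HTTP transport; the endpoint URL and key are again placeholders.

```python
import asyncio

from azure.ai.inference.aio import ChatCompletionsClient
from azure.ai.inference.models import UserMessage
from azure.core.credentials import AzureKeyCredential


async def main() -> None:
    # Using the client as an async context manager closes the underlying aiohttp session.
    async with ChatCompletionsClient(
        endpoint="https://<your-endpoint>.inference.ai.azure.com",  # placeholder endpoint URL
        credential=AzureKeyCredential("<your-api-key>"),            # placeholder key
    ) as client:
        response = await client.complete(
            messages=[UserMessage(content="Write a one-line haiku about the sea.")]
        )
        print(response.choices[0].message.content)


asyncio.run(main())
```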