Azure AI Services — Images & Video Overview
An AI-102 Mental Map
🖼️ Images
1. Azure AI Vision Overview
Aspect | Details |
---|---|
Service Name | Azure AI Vision (Computer Vision API) |
Primary Use | General-purpose image and video analysis using pretrained models |
Capabilities | OCR, image description, tags, object detection, spatial analysis, face detection |
Auth/Region | Key + region-specific endpoint, e.g. https://<region>.api.cognitive.microsoft.com/ |
SDK | azure-cognitiveservices-vision-computervision |
Common SDK Client | ComputerVisionClient |
Common REST Endpoint | POST /vision/v3.2/analyze |
Inputs | Image URL or stream; query parameters: visualFeatures, details, language |
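As a rough illustration of how the endpoint, key, and query parameters in the table fit together, the sketch below builds (without sending) an analyze request using only the Python standard library. The helper name `build_analyze_request`, the `westus` region, and the example image URL are hypothetical; the `Ocp-Apim-Subscription-Key` header and the `{"url": ...}` JSON body are the usual conventions for key-based calls with an image URL, stated here as assumptions rather than taken from the table.

```python
import json
import urllib.parse
import urllib.request


def build_analyze_request(endpoint, key, image_url, visual_features, language="en"):
    """Build (but do not send) a POST /vision/v3.2/analyze request."""
    query = urllib.parse.urlencode({
        "visualFeatures": ",".join(visual_features),  # comma-separated list
        "language": language,
    })
    url = f"{endpoint.rstrip('/')}/vision/v3.2/analyze?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,    # resource key (assumed header name)
        "Content-Type": "application/json",  # body carries an image URL
    }
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical values; substitute your own region, key, and image.
request = build_analyze_request(
    "https://westus.api.cognitive.microsoft.com/",
    "<your-key>",
    "https://example.com/cat.jpg",
    ["Tags", "Description"],
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Keeping request construction separate from sending makes the URL and headers easy to inspect before any quota is spent.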
1.1 Image Analysis (Tags, Description, Objects, etc.)
Feature | Image Analysis |
---|---|
Use Case | Extract high-level information about an image (e.g. what's in it, objects, categories) |
visualFeatures | Description, Tags, Objects, Categories, Brands, Adult, etc. |
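To make the response side of Image Analysis more concrete, here is a minimal sketch that pulls the caption, confident tags, and detected objects out of a response body. The JSON below is invented for illustration (the values are not real output); its field layout follows what the v3.2 analyze response is understood to look like when Tags, Description, and Objects are requested. The helper `summarize_analysis` and its `min_confidence` threshold are hypothetical.

```python
import json

# Illustrative response body: values are made up, layout assumed from v3.2.
SAMPLE_RESPONSE = """
{
  "tags": [{"name": "dog", "confidence": 0.98},
           {"name": "grass", "confidence": 0.61}],
  "description": {"tags": ["dog", "outdoor"],
                  "captions": [{"text": "a dog on grass", "confidence": 0.87}]},
  "objects": [{"rectangle": {"x": 10, "y": 20, "w": 200, "h": 150},
               "object": "dog", "confidence": 0.91}]
}
"""


def summarize_analysis(result, min_confidence=0.7):
    """Reduce an analyze response to a caption, confident tags, and objects."""
    captions = result.get("description", {}).get("captions", [])
    return {
        "caption": captions[0]["text"] if captions else None,
        "tags": [t["name"] for t in result.get("tags", [])
                 if t["confidence"] >= min_confidence],
        "objects": [(o["object"], o["rectangle"])
                    for o in result.get("objects", [])],
    }


summary = summarize_analysis(json.loads(SAMPLE_RESPONSE))
print(summary["caption"], summary["tags"])
```

Using `.get()` with defaults keeps the helper tolerant of responses where a feature was not requested and its key is absent.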
1.2 OCR (Optical Character Recognition)
Feature | OCR (Read API) |
---|---|
Use Case | Extract text from images (e.g., scanned documents, screenshots, photos) |
API Variant | Read API (async model recommended) |
REST Flow | POST /vision/v3.2/read/analyze → GET /vision/v3.2/read/analyzeResults/{operationId} |
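The two-step REST flow (submit, then poll) can be sketched as follows. This assumes the POST returns an `Operation-Location` header whose last path segment is the `{operationId}`, and that the GET reports a `status` of `notStarted`, `running`, `succeeded`, or `failed` with text under `analyzeResult.readResults[].lines[]`. The function names are hypothetical, and `fetch_status` is a caller-supplied callable that performs the GET, so the polling logic can be exercised without a network.

```python
import time


def operation_id_from_location(operation_location):
    """Extract {operationId} from the Operation-Location header value."""
    return operation_location.rstrip("/").rsplit("/", 1)[-1]


def poll_read_result(fetch_status, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until the Read operation succeeds or fails.

    fetch_status: caller-supplied function that performs
    GET /vision/v3.2/read/analyzeResults/{operationId} and returns parsed JSON.
    """
    for _ in range(max_attempts):
        result = fetch_status()
        if result["status"] in ("succeeded", "failed"):
            return result
        sleep(interval)  # still notStarted or running; wait and retry
    raise TimeoutError("Read operation did not finish in time")


def extract_lines(result):
    """Flatten recognized text lines across all pages of a finished result."""
    return [line["text"]
            for page in result.get("analyzeResult", {}).get("readResults", [])
            for line in page.get("lines", [])]


# Simulated status sequence standing in for real GET calls.
fake_responses = iter([
    {"status": "notStarted"},
    {"status": "running"},
    {"status": "succeeded",
     "analyzeResult": {"readResults": [
         {"page": 1, "lines": [{"text": "Hello"}, {"text": "World"}]}]}},
])
final = poll_read_result(lambda: next(fake_responses), sleep=lambda s: None)
print(extract_lines(final))
```

Capping the number of attempts avoids polling forever if an operation stalls.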
1.3 Face Detection (Face API)
Feature | Face Detection |
---|---|
Use Case | Detect faces and their attributes (e.g. head pose; age and emotion appear in older material, but Microsoft has since retired those attributes) |
Service Note | Separate endpoint: https://<region>.api.cognitive.microsoft.com/face/v1.0 |
Auth | Same key + endpoint model, but Face API is a distinct service |
SDK | azure-cognitiveservices-vision-face |
Core SDK Method | .detect_with_url() |
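A minimal sketch of a Face detection call over REST, built on the endpoint shown above. It assumes the `detect` operation lives under `/face/v1.0/detect`, accepts `returnFaceId` and `returnFaceAttributes` query parameters, and returns a JSON array whose items carry a `faceRectangle` with `top`, `left`, `width`, and `height`. The helper names and the example values are hypothetical; only `headPose` is requested here.

```python
import json
import urllib.parse
import urllib.request


def build_face_detect_request(endpoint, key, image_url, attributes=("headPose",)):
    """Build (but do not send) a POST /face/v1.0/detect request."""
    query = urllib.parse.urlencode({
        "returnFaceId": "false",                     # no IDs needed for plain detection
        "returnFaceAttributes": ",".join(attributes),
    })
    url = f"{endpoint.rstrip('/')}/face/v1.0/detect?{query}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,   # Face resource key (assumed header name)
        "Content-Type": "application/json",
    }
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def face_boxes(faces):
    """Return (left, top, width, height) for each detected face."""
    return [(f["faceRectangle"]["left"], f["faceRectangle"]["top"],
             f["faceRectangle"]["width"], f["faceRectangle"]["height"])
            for f in faces]


# Hypothetical region, key, and image URL.
face_request = build_face_detect_request(
    "https://westus.api.cognitive.microsoft.com", "<face-key>",
    "https://example.com/face.jpg")
```

With the SDK, the equivalent call is the `.detect_with_url()` method listed in the table.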
2. Custom Vision (Prediction)
Aspect | Details |
---|---|
Service Name | Azure AI Custom Vision (Prediction) |
Primary Use | Image classification and object detection using your own trained models |
Auth/Region | Prediction resource endpoint with the project ID in the path; prediction key in the Prediction-Key header |
SDK | azure-cognitiveservices-vision-customvision |
Core SDK Methods | .classify_image(), .classify_image_url(), .detect_image() |
REST Endpoint | POST /customvision/v3.0/Prediction/{projectId}/classify/iterations/{iterationName}/image |
Key Inputs | Binary image stream or image URL |
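The prediction call can be sketched with the standard library as below. The `Prediction-Key` header and the `application/octet-stream` content type for a binary image are the usual conventions for this endpoint; the response is assumed to contain a `predictions` array of `{tagName, probability}` items. The helper names, the `example.cognitiveservices.azure.com` host, and the project and iteration values are hypothetical.

```python
import urllib.parse
import urllib.request


def build_classify_request(endpoint, prediction_key, project_id,
                           iteration_name, image_bytes):
    """Build (but do not send) a classify request for a binary image."""
    path = "/customvision/v3.0/Prediction/{}/classify/iterations/{}/image".format(
        urllib.parse.quote(project_id, safe=""),
        urllib.parse.quote(iteration_name, safe=""),
    )
    headers = {
        "Prediction-Key": prediction_key,            # prediction key, not training key
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return urllib.request.Request(endpoint.rstrip("/") + path,
                                  data=image_bytes, headers=headers, method="POST")


def top_prediction(response):
    """Return (tagName, probability) of the most likely tag, or None."""
    predictions = response.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["probability"])
    return best["tagName"], best["probability"]


# Hypothetical endpoint, IDs, and payload; response values are invented.
classify_request = build_classify_request(
    "https://example.cognitiveservices.azure.com/", "<prediction-key>",
    "proj-1", "Iteration1", b"\x89PNG...")
sample = {"predictions": [{"tagName": "cat", "probability": 0.12},
                          {"tagName": "dog", "probability": 0.88}]}
print(top_prediction(sample))
```

The SDK methods in the table wrap the same call; `.classify_image()` takes the stream and `.classify_image_url()` takes a URL instead.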
📹 Video
3. Video Indexer
Aspect | Details |
---|---|
Service Name | Azure Video Indexer |
Primary Use | Deep video insights – face detection, transcript, scene segmentation, OCR, labels |
Auth Quirk | Requires session-based access token (not ARM) |
Steps Overview | 1. Get token → 2. Upload video → 3. Get insights |
SDK | None – REST only |
Upload Sample | POST /{location}/Accounts/{accountId}/Videos?accessToken={token}&name=demo&videoUrl=https://... |
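The three steps (get token, upload video, get insights) map onto three URLs, sketched below. This is a sketch under stated assumptions: the `https://api.videoindexer.ai` host, the `/Auth/.../AccessToken` path for the token step, and the `/Videos/{videoId}/Index` path for insights are taken from the classic (non-ARM) API as commonly documented, not from the table above. The helper names and example values are hypothetical.

```python
import urllib.parse

API_ROOT = "https://api.videoindexer.ai"  # assumed public API host


def token_url(location, account_id, allow_edit=True):
    """Step 1: GET this URL (with the API subscription key header) for a token."""
    query = urllib.parse.urlencode({"allowEdit": str(allow_edit).lower()})
    return f"{API_ROOT}/Auth/{location}/Accounts/{account_id}/AccessToken?{query}"


def upload_url(location, account_id, token, name, video_url):
    """Step 2: POST to this URL to upload a video by reference."""
    query = urllib.parse.urlencode(
        {"accessToken": token, "name": name, "videoUrl": video_url})
    return f"{API_ROOT}/{location}/Accounts/{account_id}/Videos?{query}"


def insights_url(location, account_id, video_id, token):
    """Step 3: GET this URL to fetch the index (insights) for a video."""
    query = urllib.parse.urlencode({"accessToken": token})
    return f"{API_ROOT}/{location}/Accounts/{account_id}/Videos/{video_id}/Index?{query}"


# Hypothetical location, account, token, and video values.
print(token_url("trial", "acc-1"))
print(upload_url("trial", "acc-1", "tok", "demo", "https://example.com/demo.mp4"))
print(insights_url("trial", "acc-1", "vid-9", "tok"))
```

Note that `urlencode` percent-encodes the `videoUrl` value, which matters because it is itself a URL carried inside a query string.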