Azure RBAC Catalog

Jan 6, 2026

What is Azure RBAC?

Azure Role-Based Access Control (Azure RBAC) is an authorization system built on Azure Resource Manager that provides fine-grained access management of Azure resources. It enables you to manage who has access to Azure resources, what they can do with those resources, and what areas they have access to.

With Azure RBAC, you can segregate duties within your team and grant only the amount of access to users that they need to perform their jobs. Instead of giving everybody unrestricted permissions in your Azure subscription or resources, you can allow only specific actions at a particular scope. Azure RBAC includes over 800 built-in roles, or you can create your own custom roles tailored to your organization's needs.
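
As a rough illustration of "specific actions": an Azure role definition lists permitted operations in Actions and carves out exceptions in NotActions, and both lists may contain * wildcards. The sketch below evaluates a single operation against such lists. The function name is_allowed and the two sample permission sets are illustrative simplifications rather than complete built-in role definitions, and scope is not modeled.

```python
from fnmatch import fnmatchcase


def is_allowed(operation, actions, not_actions=()):
    """Return True if `operation` matches some Actions pattern and no NotActions pattern.

    Assumption: matching is case-insensitive and `*` matches any run of
    characters, including `/`. fnmatch also gives `?` and `[` a special
    meaning, which operation strings are assumed not to contain.
    """
    op = operation.lower()

    def matches(patterns):
        return any(fnmatchcase(op, pattern.lower()) for pattern in patterns)

    return matches(actions) and not matches(not_actions)


# Simplified, hypothetical permission sets for illustration only.
reader_like = {"actions": ["*/read"], "not_actions": []}
contributor_like = {
    "actions": ["*"],
    "not_actions": ["Microsoft.Authorization/*/Write", "Microsoft.Authorization/*/Delete"],
}

can_read = is_allowed("Microsoft.Storage/storageAccounts/read", **reader_like)
can_assign = is_allowed("Microsoft.Authorization/roleAssignments/write", **contributor_like)
```

Here can_read is True and can_assign is False, because the write operation on role assignments is excluded by the NotActions pattern.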

For more information, see "What is Azure role-based access control (Azure RBAC)?" in the official Microsoft documentation.

About This Catalog

This catalog is a browsing and monitoring tool for Azure built-in RBAC roles. Use it to browse roles, explore their permissions, track changes over time, find least-privilege roles for a given set of required operations, and get AI-powered role recommendations.

Key Features

Role Catalog: Browse all 800+ Azure built-in roles with full permission details.
Operation Explorer: Search 20,000+ resource provider operations.
Change Tracking: Monitor when Microsoft adds, modifies, or deprecates roles.
AI Role Recommender: Describe what you need in natural language and get least-privilege role suggestions.
Diff Viewer: See exactly what changed between role versions.
8 AI Recommendation Modes: From fast keyword matching to LLM-powered semantic understanding.
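
The change tracking and diff features can be pictured as a set comparison between two snapshots of the same role. The sketch below is a minimal version under that assumption; it is not the catalog's implementation. The field names follow the permission properties of Azure role definitions (actions, notActions, dataActions, notDataActions), while diff_role and the sample snapshots are hypothetical.

```python
PERMISSION_FIELDS = ("actions", "notActions", "dataActions", "notDataActions")


def diff_role(old, new):
    """Compare two snapshots of one role and report added and removed entries per field."""
    changes = {}
    for field in PERMISSION_FIELDS:
        before = set(old.get(field, []))
        after = set(new.get(field, []))
        added = sorted(after - before)
        removed = sorted(before - after)
        if added or removed:
            changes[field] = {"added": added, "removed": removed}
    return changes


# Hypothetical snapshots of the same role at two points in time.
old_version = {"actions": ["Microsoft.Storage/storageAccounts/read"]}
new_version = {
    "actions": [
        "Microsoft.Storage/storageAccounts/read",
        "Microsoft.Storage/storageAccounts/listKeys/action",
    ]
}
changes = diff_role(old_version, new_version)
```

An empty result means the permission lists are identical; fields with no changes are omitted.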

AI Recommendation Modes

The AI Role Recommender supports eight modes, each with a different speed/accuracy trade-off:

TF-IDF: Enhanced TF-IDF + BM25 keyword matching (CPU only)
Semantic: Pure sentence embedding similarity using sentence-transformers
ColBERT: Token-level late interaction for precise matching via ColBERT index
Cross-Encoder: Bi-encoder retrieval followed by neural reranking
LLM: Fine-tuned Qwen2.5-0.5B-Instruct model for direct inference via Ollama
RAG: Retrieval-Augmented Generation with LLM reranking
HyDE: Hypothetical document generation + semantic search
Hybrid: Multi-stage pipeline (TF-IDF → Embeddings → LLM)
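
To make the fastest mode concrete, here is a minimal sketch of plain Okapi BM25 scoring in pure Python. It is not the catalog's "enhanced" variant, whose details are not given here. The IDF form, the default k1 and b values, the whitespace tokenizer, and the three toy documents are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (non-negative IDF variant)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / (tf + length_norm)
        scores.append(score)
    return scores


# Toy documents standing in for per-role search text.
docs = [
    "storage blob data reader read blobs containers",
    "virtual machine contributor virtualmachines poweroff action",
    "reader read all resources",
]
scores = bm25_scores("read blob storage", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

For this query the first document scores highest because it contains all three query terms, and the second scores zero because it contains none of them.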

The Qwen2.5-0.5B-Instruct model used by the LLM mode was fine-tuned with Unsloth, which makes LoRA training efficient enough for consumer hardware. The model takes natural language queries such as "I need to read blob storage" and outputs structured JSON containing role recommendations and confidence scores.
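
The exact JSON schema of the model output is not specified here, so the following is only a sketch of how a caller might consume it. The keys recommendations, role, and confidence, the min_confidence threshold, and the sample payload are all hypothetical.

```python
import json


def parse_recommendations(raw, min_confidence=0.5):
    """Parse model output and return (role, confidence) pairs, highest confidence first.

    Assumed shape: {"recommendations": [{"role": str, "confidence": float}, ...]}
    """
    data = json.loads(raw)
    picks = [
        (item["role"], float(item["confidence"]))
        for item in data["recommendations"]
        if float(item["confidence"]) >= min_confidence
    ]
    return sorted(picks, key=lambda pick: pick[1], reverse=True)


# Hypothetical payload, written by hand for illustration (not real model output).
raw_output = """
{"recommendations": [
    {"role": "Reader", "confidence": 0.31},
    {"role": "Storage Blob Data Reader", "confidence": 0.92}
]}
"""
top_picks = parse_recommendations(raw_output)
```

Filtering on a confidence threshold is one simple way to avoid surfacing weak suggestions; a real consumer would also need to handle malformed JSON from the model.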

Knowledge Base

Each role is converted into a searchable document_text combining:

Role name & description: From Azure's Role Definition API
Action keywords: Tokenized from expanded permissions (e.g., Microsoft.Compute/virtualMachines/powerOff/action → virtualmachines poweroff action)
Curated patterns: Human-written query examples (e.g., "read blob storage")
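
The three ingredients above can be combined roughly as follows. This sketch reproduces the tokenization example from the list by dropping the resource provider namespace and lowercasing the remaining path segments; whether the real pipeline does exactly this is an assumption. The function names and the sample role values are hypothetical.

```python
def action_keywords(action):
    """Turn an operation string into space-separated lowercase keywords.

    Assumption: the first path segment (the provider namespace, such as
    Microsoft.Compute) is dropped, as in the example from the list.
    """
    segments = action.split("/")[1:]
    return " ".join(segment.lower() for segment in segments)


def build_document_text(name, description, actions, patterns):
    """Join name, description, action keywords, and curated patterns into one string."""
    parts = [name, description]
    parts.extend(action_keywords(action) for action in actions)
    parts.extend(patterns)
    return " ".join(part for part in parts if part)


keywords = action_keywords("Microsoft.Compute/virtualMachines/powerOff/action")

# Hypothetical role data for illustration.
document_text = build_document_text(
    name="Example VM Operator",
    description="Can power off virtual machines.",
    actions=["Microsoft.Compute/virtualMachines/powerOff/action"],
    patterns=["stop a vm"],
)
```

Here keywords is "virtualmachines poweroff action", matching the example in the list.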

Figure: Knowledge Base Pipeline

The diagram shows how data flows from Azure APIs and curated patterns through the tokenization pipeline to produce the searchable document text used by the AI recommendation engines.

References

TF-IDF/BM25: Robertson & Zaragoza, The Probabilistic Relevance Framework: BM25 and Beyond (2009)
Sentence-BERT: Reimers & Gurevych, Sentence Embeddings using Siamese BERT-Networks (2019)
ColBERT: Khattab & Zaharia, ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT (2020)
Cross-Encoder: Humeau et al., Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring (2019)
Qwen: Bai et al., Qwen Technical Report (2023)
RAG: Lewis et al., Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020)
HyDE: Gao et al., Precise Zero-Shot Dense Retrieval without Relevance Labels (2022)
