
For SDE Administrators

Tony Wildish edited this page Oct 21, 2024 · 1 revision

Introduction

SDE Administrators (currently Steven and Tony) have to take care of a number of things:

  • Project management
  • User management
  • Managing SDE Administrators

Project management

This consists of setting up project workspaces and assigning users to them. There are currently four workspace templates available:

  • Unrestricted Workspace
  • Airlock Import Review Workspace
  • Base Workspace
  • Base Workspace with ADF Connection

The Unrestricted Workspace has no firewall preventing it from accessing the internet. This is only for demonstration purposes, and should not be used for real data.

N.B. We will eventually remove this workspace, or otherwise limit access to it.

The Airlock Import Review Workspace is, as its name implies, used only for reviewing data import/export requests from all other workspaces. Nobody needs to access this workspace to carry out their work, not even the Airlock Manager who approves or rejects import/export requests. The workspace resources are handled entirely automatically in response to the import/export request workflows triggered in other workspaces.

Only one Airlock Import Review Workspace is needed per SDE. Users never access this workspace directly, nor do Airlock Managers; all access is managed transparently via the airlock mechanism.

The Base Workspace is the one most projects will need. This has all the functionality of the TRE accessible to it, and the only way to infiltrate or exfiltrate data is to use the airlock request mechanism.

The Base Workspace with ADF Connection is derived from the Base Workspace, but has additional components that allow it to connect to the Azure Data Factory. This workspace was created for us as part of the work to integrate with the Data Core, allowing the Data Core to push data into the workspace directly.

Customised workspace templates can be built, should we need them, for other functionality too.

To create a workspace, go to the SDE home page, select Workspaces from the menu on the left, and click Create new. The options are identical for the workspaces described so far, as shown below.

1001 workspace options

Fill in the name and description. The Shared Storage Quota is optional, but storage is cheap, and having shared storage makes multi-user projects much easier to manage, so it's a good idea to add 100-500 GB of storage, or more, depending on the nature of the project. Without shared storage, the only way to persist data from a VM is to export it via the airlock, which can involve a lot of overhead; with shared storage, researchers can create and destroy VMs at will without losing their data.

Leave the App Service Plan SKU, Address space size, and Workspace Authentication Type options at their defaults, as well as the checkboxes below them. We haven't yet experimented with all these options to understand their significance.

Then click Submit, and sit back while the workspace is created. This can take several minutes, or longer, if the TRE is busy elsewhere.

Installing Guacamole

Every workspace is likely to need Guacamole, the remote desktop service, to allow access to VMs. It's also essential for exporting data from the workspace via the airlock mechanism.

To create the service, click Create new in the workspace Overview, and select Apache Guacamole - Virtual Desktop Service. Fill in the form, which comes down to choosing whether or not to disable copy/paste between the desktop and the workspace. Allowing copy/paste between the two environments is a breach of security, since it permits unmonitored transfer of data to/from the workspace; for some use-cases, however, this may be acceptable. That has to be decided case by case for each project. If in doubt, disable it.

For projects that are handling sensitive data, however, you will need to check the boxes that disable copy and paste. Note that copy/paste will still be allowed between resources within the virtual desktop, but not between the virtual desktop and your laptop.

1010 Guacamole

Enabling the airlock mechanism

Once a workspace exists, it needs to be coupled to the Airlock Review workspace so it can use it to import or export data. This needs a few bits of information:

  • The ID of the Airlock review workspace
  • The ID of the Guacamole service in the Airlock review workspace
  • The ID of the Guacamole service in this workspace
  • The names of the review-VM templates

All of this could be derived on the fly at the time an airlock request is made, and we will eventually automate it to that point, but until then, this is the procedure to follow. To get the ID of the Airlock review workspace, and of its Guacamole service, just click on the small 'i' icon on the tile for each object. E.g., here you can see that the Airlock workspace ID is 4bae9185-9151-47f7-bc9c-b3c109fba005.

1002 get airlock review workspace ID

Then, go into your workspace, and on the Overview tab, click Update at the top-left of the page. The Update button allows you to change some of the parameters of your workspace on the fly; the one you want is the Configure Review VMs checkbox, which opens a new submenu for you to fill in:

1004 updating workspace for airlock

The Import Review Workspace ID is the Airlock workspace ID. The Import Review Workspace Service ID is the ID of the Guacamole service in that workspace. The Export Review Workspace Service ID is the ID of the Guacamole service in the workspace you're updating.

You also need the template names of the import and export VMs. These are tre-service-guacamole-import-reviewvm and tre-service-guacamole-export-reviewvm, respectively.

Once the form is filled in, click Submit, and wait for the wheels to turn.

Tip: I keep a file with the UIDs and template names on my machine, so I can copy/paste them when I create a new workspace. If you click away from the form within the browser while entering information, it will be lost, so you'll want to keep the form open until you're done.

This is a very clunky procedure, and needs repeating if either the workspace Guacamole service or anything about the Airlock Review workspace is replaced. We plan to streamline this eventually.

User management

Once a workspace is created, the TRE Admin has to allocate users to roles in that workspace. The three roles available at present are:

  • Workspace Owner - the project manager, with full rights on that workspace, but only on the workspace, not the SDE
  • Workspace Researcher - someone who uses the workspace, and can create user-level resources, but cannot configure the workspace itself, or create shared workspace resources
  • Airlock Manager - someone who approves or denies data import/export requests. This should be someone from the PM team, and must not be a project member, otherwise they could create and approve their own import/export requests.

The users in the project first need to have their accounts registered in the PMP SDE tenancy in Azure. Once their accounts are created in Azure, they can be allocated roles. This is done by assigning them a role in the App registration for their workspace resource group. Log in to the Azure portal and select App registrations from the home page. If it isn't there, type App registrations in the search bar.

2000 app registrations

In the App Registrations window, search for the app for your workspace. All SDE workspaces follow the format {SDE_NAME}-ws-{last-four-digits-of-workspace-id}. You get the workspace ID from the SDE portal, either by clicking on the blue 'i' in the circle on the tile for your workspace on the SDE portal home page (as for workspace services, above), or from the Details tab on the workspace landing page. In this example, you can see my workspace ID ends in 5da9, so the app registration I want is sde002-ws-5da9.

2002 workspace id from details tab

2001 select app registration

Select your application, then select the Managed application in local directory link.

2003 managed application

Then click on the Assign users and groups tile, then + Add user/group, at the top of the screen.

Click None selected under Users and groups, search for the user or users you want to assign, and select them all. Click the Select button at the bottom.

Click None selected under Select a role, then choose one role, and select it. You can't assign a user to multiple roles at the same time.

2004 select role

Once the user(s) or group(s) and role have been selected, you can Assign them.

In addition to the workspace-specific role, all users will need to be granted the TRE User role on the API application, which in our case is called sde002 API. The exact same procedure does the trick.
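Clicking through the portal does not scale well when onboarding many users. The same app-role assignment can also be scripted with the azuread Terraform provider; the sketch below uses an illustrative user and workspace app, and assumes the role value for a researcher is WorkspaceResearcher (check the App roles blade of the registration for the real values):

data "azuread_user" "researcher" {
  user_principal_name = "researcher@example.org" # illustrative UPN
}

# Service principal behind the workspace app registration, e.g. sde002-ws-5da9
data "azuread_service_principal" "workspace" {
  display_name = "sde002-ws-5da9"
}

resource "azuread_app_role_assignment" "researcher" {
  # "WorkspaceResearcher" is an assumed role value; confirm in the app manifest
  app_role_id         = data.azuread_service_principal.workspace.app_role_ids["WorkspaceResearcher"]
  principal_object_id = data.azuread_user.researcher.object_id
  resource_object_id  = data.azuread_service_principal.workspace.object_id
}

The same pattern works for granting the TRE User role on the sde002 API application.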

That's all there is to it. If the user has trouble seeing the right view in the SDE portal after that, they should refresh their credentials (log out/in again, try in a private browser window, wait a few hours, the usual stuff).

Note that the Workspace Owner role is not a superset of the Workspace Researcher, so owners should be given the researcher role too.

To disable a user from accessing a workspace, simply remove their registration for a given workspace application. To disable them from accessing any part of the SDE, remove them from the sde002 API application.

Managing SDE Administrators

To grant someone the SDE Administrator role, use the same procedure as for adding a user to a workspace, but add them to the "{SDE_NAME} API" application instead - so "sde002 API" for the Alpha release, for example. Give the user the TRE Administrators role to grant them full rights on the SDE.

Logging and Monitoring

Setting Up Azure Monitor Using Terraform

The Terraform module located in the azure-monitor directory of the AzureTRE repository provides a template for deploying Azure Monitor components. This typically includes resources such as Log Analytics Workspaces, Application Insights, and relevant configurations to support monitoring across Azure resources.

Key Components of the Azure Monitor Terraform Module

  1. Log Analytics Workspace:
    • The Terraform module creates a Log Analytics Workspace, which acts as a central repository for logs and metrics collected by Azure Monitor.
    • Example code snippet to define a Log Analytics Workspace:
resource "azurerm_log_analytics_workspace" "workspace" {
  name                       = "log-${var.tre_id}-ws-${local.short_workspace_id}"
  resource_group_name        = var.resource_group_name
  location                   = var.location
  retention_in_days          = 30
  sku                        = "PerGB2018"
  tags                       = var.tre_workspace_tags
  internet_ingestion_enabled = var.enable_local_debugging ? true : false

  lifecycle { ignore_changes = [tags] }
}
  2. Application Insights:
    • Application Insights is configured to monitor the performance and usage of applications. The integration with Log Analytics Workspace is set up to route telemetry data.
    • Example code snippet to define an Application Insights resource:
resource "azurerm_application_insights" "workspace" {
  name                                = local.app_insights_name
  location                            = var.location
  resource_group_name                 = var.resource_group_name
  workspace_id                        = azurerm_log_analytics_workspace.workspace.id
  application_type                    = "web"
  internet_ingestion_enabled          = var.enable_local_debugging ? true : false
  force_customer_storage_for_profiler = true
  tags                                = var.tre_workspace_tags

  lifecycle { ignore_changes = [tags] }
}

Given the issues presented in this issue, the azapi provider is utilised instead:

resource "azapi_resource" "appinsights" {
  type      = "Microsoft.Insights/components@2020-02-02"
  name      = local.app_insights_name
  parent_id = var.resource_group_id
  location  = var.location
  tags      = var.tre_workspace_tags

  body = jsonencode({
    kind = "web"
    properties = {
      Application_Type                = "web"
      Flow_Type                       = "Bluefield"
      Request_Source                  = "rest"
      IngestionMode                   = "LogAnalytics"
      WorkspaceResourceId             = azurerm_log_analytics_workspace.workspace.id
      ForceCustomerStorageForProfiler = true
      publicNetworkAccessForIngestion = var.enable_local_debugging ? "Enabled" : "Disabled"
    }
  })

  response_export_values = [
    "id",
    "properties.ConnectionString",
  ]

  lifecycle { ignore_changes = [tags] }
}
  3. Diagnostics Settings:
    • Diagnostics settings are configured on various Azure resources to send logs and metrics to the Log Analytics Workspace.
    • Example code snippet for configuring diagnostics:
resource "azurerm_monitor_diagnostic_setting" "example" {
  name                       = "example-diagnostic-setting"
  target_resource_id         = azurerm_virtual_machine.example.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.example.id

  # "enabled_log" replaces the "log" block deprecated in recent azurerm releases
  enabled_log {
    category = "Administrative"
  }
}

Integrating Azure Monitor with Workspaces

At the workspace level there is also an individual azure-monitor module per workspace. This integration ensures that monitoring is tailored to the specific workspace environment, capturing logs, metrics, and other telemetry data from the workspace resources only. The module itself has the following configuration:

module "azure_monitor" {
  source                                   = "./azure-monitor"
  tre_id                                   = var.tre_id
  location                                 = var.location
  resource_group_name                      = azurerm_resource_group.ws.name
  resource_group_id                        = azurerm_resource_group.ws.id
  tre_resource_id                          = var.tre_resource_id
  tre_workspace_tags                       = local.tre_workspace_tags
  workspace_subnet_id                      = module.network.services_subnet_id
  azure_monitor_dns_zone_id                = module.network.azure_monitor_dns_zone_id
  azure_monitor_oms_opinsights_dns_zone_id = module.network.azure_monitor_oms_opinsights_dns_zone_id
  azure_monitor_ods_opinsights_dns_zone_id = module.network.azure_monitor_ods_opinsights_dns_zone_id
  azure_monitor_agentsvc_dns_zone_id       = module.network.azure_monitor_agentsvc_dns_zone_id
  blob_core_dns_zone_id                    = module.network.blobcore_zone_id
  enable_local_debugging                   = var.enable_local_debugging
  depends_on = [
    module.network,
    module.airlock
  ]
}

The module has a similar layout to the core azure-monitor module, and deploys an Azure Monitor Private Link Scope (AMPLS), along with a private endpoint linking the AMPLS to the workspace subnet. In core, the AMPLS is linked to the shared subnet instead, to collect logs from the shared resources.
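As a rough sketch of what that AMPLS wiring looks like (simplified, with illustrative resource names; the real module also consumes the private DNS zone IDs passed in via the module inputs):

resource "azurerm_monitor_private_link_scope" "workspace" {
  name                = "ampls-${var.tre_id}-ws-${local.short_workspace_id}"
  resource_group_name = var.resource_group_name
}

# Attach the Log Analytics Workspace to the scope
resource "azurerm_monitor_private_link_scoped_service" "law" {
  name                = "ampls-law-link"
  resource_group_name = var.resource_group_name
  scope_name          = azurerm_monitor_private_link_scope.workspace.name
  linked_resource_id  = azurerm_log_analytics_workspace.workspace.id
}

# Private endpoint placing the AMPLS on the workspace services subnet
resource "azurerm_private_endpoint" "ampls" {
  name                = "pe-ampls-${var.tre_id}-ws-${local.short_workspace_id}"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.workspace_subnet_id

  private_service_connection {
    name                           = "psc-ampls"
    private_connection_resource_id = azurerm_monitor_private_link_scope.workspace.id
    is_manual_connection           = false
    subresource_names              = ["azuremonitor"]
  }
}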

Extending Azure Monitor with Data Collection Rules and Private Link

In addition to deploying Azure Monitor, Log Analytics Workspaces, and Application Insights, it's crucial to configure Data Collection Rules (DCRs) for fine-grained control over the data you collect from your resources. This section explains how to create a DCR, associate it with a VM, and ensure secure communication via a Data Collection Endpoint (DCE) linked to the Private Link Scope.

Data Collection Rules (DCRs)

Data Collection Rules allow you to define precisely what data should be collected from your resources, such as virtual machines, and where that data should be sent. You can specify logs and metrics to be collected and routed to your Log Analytics Workspace.

Creating a Data Collection Rule Using Terraform

Below is an example Terraform configuration that creates a Data Collection Rule targeting specific performance counters and logs from a VM.

resource "azurerm_monitor_data_collection_rule" "example" {
  name                = "example-dcr"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  data_sources {
    performance_counter {
      name                          = "example-perf-counters"
      streams                       = ["Microsoft-Perf"]
      sampling_frequency_in_seconds = 60
      counter_specifiers = [
        "\\Processor(_Total)\\% Processor Time",
        "\\Memory\\Available MBytes"
      ]
    }

    # Collect Error, Warning and Information events from the Security log
    windows_event_log {
      name           = "example-security-events"
      streams        = ["Microsoft-Event"]
      x_path_queries = ["Security!*[System[(Level=2 or Level=3 or Level=4)]]"]
    }
  }

  destinations {
    log_analytics {
      name                  = "loganalyticsdestination"
      workspace_resource_id = azurerm_log_analytics_workspace.example.id
    }
  }

  # Every DCR needs at least one data_flow mapping streams to destinations
  data_flow {
    streams      = ["Microsoft-Perf", "Microsoft-Event"]
    destinations = ["loganalyticsdestination"]
  }
}

This rule collects processor time and available memory metrics, as well as security event logs, and sends them to a Log Analytics Workspace.

Associating the DCR with a Virtual Machine

Once you've created the DCR, you can associate it with a specific VM so that the VM adheres to the data collection policies defined in the rule.

Associating the DCR to a VM

Use the following Terraform configuration to associate the DCR with a VM:

resource "azurerm_monitor_data_collection_rule_association" "example" {
  name                    = "example-dcr-association"
  data_collection_rule_id = azurerm_monitor_data_collection_rule.example.id
  target_resource_id      = azurerm_virtual_machine.example.id
}

This configuration links the previously defined DCR to a specific virtual machine, ensuring that the VM collects and sends the specified data to Azure Monitor.

Creating a Data Collection Endpoint and Adding It to a Private Link Scope

To secure data collection and transmission, you can create a Data Collection Endpoint (DCE) and associate it with the Private Link Scope. This ensures that the data is transmitted over a private, secure network rather than the public internet.

Defining a Data Collection Endpoint

Below is an example of creating a DCE using Terraform:

resource "azurerm_monitor_data_collection_endpoint" "example" {
  name                = "example-dce"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  # Disable the public endpoint so data can only flow over Private Link
  public_network_access_enabled = false
}

This configuration sets up a DCE that only allows data collection over a private link, ensuring secure data transmission.

Adding the DCE to a Private Link Scope

Finally, you can add the DCE to your Private Link Scope, enabling it to function within the scope of your secure, private network.

resource "azurerm_monitor_private_link_scoped_service" "example" {
  name                = "example-dce-link"
  resource_group_name = azurerm_resource_group.example.name
  scope_name          = azurerm_monitor_private_link_scope.example.name
  linked_resource_id  = azurerm_monitor_data_collection_endpoint.example.id
}

This step associates the DCE with the Private Link Scope, ensuring that the DCE is part of the secure, private network environment you've established.