For SDE Administrators
SDE Administrators (currently Steven and Tony) have to take care of a number of things:
- Project management
- User management
- Managing SDE Administrators
This consists of setting up project workspaces and assigning users to them. There are currently four workspace templates available:
- Unrestricted Workspace
- Airlock Import Review Workspace
- Base Workspace
- Base Workspace with ADF Connection
The Unrestricted Workspace has no firewall preventing it from accessing the internet. This is only for demonstration purposes, and should not be used for real data.
N.B. We will eventually remove this workspace, or otherwise limit access to it.
The Airlock Import Review Workspace is, as its name implies, used only for reviewing data import/export requests from all other workspaces. Nobody needs to access this workspace to carry out their work, not even the Airlock Manager who approves or rejects import/export requests. The workspace resources are handled entirely automatically in response to the import/export request workflows triggered in other workspaces.
Only one Airlock Import Review Workspace is needed per SDE. Users never access this workspace directly, nor do Airlock Managers; all access is managed transparently via the airlock mechanism.
The Base Workspace is the one most projects will need. This has all the functionality of the TRE accessible to it, and the only way to infiltrate or exfiltrate data is to use the airlock request mechanism.
The Base Workspace with ADF Connection is derived from the Base Workspace, but has additional components that allow it to connect to the Azure Data Factory. This workspace was created for us as part of the work to integrate with the Data Core, allowing it to push data into the workspace directly.
Customised workspace templates can be built, should we need them, for other functionality too.
To create a workspace, go to the SDE home page, select Workspaces from the menu on the left, and click Create new. The options are identical for the workspaces described so far, as shown below.
Fill in the name and description. The Shared Storage Quota is optional, but storage is cheap, and shared storage makes multi-user projects much easier to manage, so it's a good idea to add 100 - 500 GB, or more, depending on the nature of the project. Without shared storage, the only way to persist data from a VM is to export it via the airlock, which adds a lot of overhead; with it, researchers can create and destroy VMs at will without losing their data.
Leave the App Service Plan SKU, Address space size, and Workspace Authentication Type options at their defaults, as well as the checkboxes below them. We haven't yet experimented with all these options to understand their significance.
Then click Submit, and sit back while the workspace is created. This can take several minutes, or longer, if the TRE is busy elsewhere.
Every workspace is likely to need Guacamole, the remote desktop service, to allow access to VMs. It's also essential for exporting data from the workspace via the airlock mechanism.
To create the service, click Create new in the workspace Overview, and select Apache Guacamole - Virtual Desktop Service. Fill in the form, which comes down to choosing whether or not to disable Copy/Paste between the desktop and the workspace. Allowing copy/paste between the two environments weakens security, since it permits unmonitored transfer of data to and from the workspace. For some use cases this may be acceptable, but that has to be decided case by case for each project. If in doubt, disable it.
For projects that are handling sensitive data, however, you will need to check the boxes that disable copy and paste. Note that copy/paste will still be allowed between resources within the virtual desktop, but not between the virtual desktop and your laptop.
Once a workspace exists, it needs to be coupled to the Airlock Review workspace so it can use it to import or export data. This needs a few bits of information:
- The ID of the Airlock review workspace
- The ID of the Guacamole service in the Airlock review workspace
- The ID of the Guacamole service in this workspace
- The names of the review-VM templates
All of this could be derived on the fly when an airlock request is made, and we will eventually automate it to that point; until then, this is the procedure to follow. To get the IDs of the Airlock review workspace and its Guacamole service, just click the small 'i' icon on the tile for each object. E.g., here you can see that the Airlock workspace ID is 4bae9185-9151-47f7-bc9c-b3c109fba005.
Then, go into your workspace and, on the Overview tab, click Update at the top-left of the page. The Update button lets you change some of the workspace's parameters on the fly; the one you want is the Configure Review VMs checkbox, which opens a new submenu for you to fill in:
The Import Review Workspace ID is the Airlock workspace ID. The Import Review Workspace Service ID is the ID of the Guacamole service in that workspace. The Export Review Workspace Service ID is the ID of the Guacamole service in the workspace you're updating.
You also need the template names of the import and export VMs. These are tre-service-guacamole-import-reviewvm and tre-service-guacamole-export-reviewvm, respectively.
Once the form is filled in, click Submit, and wait for the wheels to turn.
Tip: I keep a file with the IDs and template names on my machine, so I can copy/paste them when I create a new workspace. If you click away from the form in the browser while entering information, it will be lost, so keep the form open until you're done.
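Such a notes file might look like the sketch below; every ID is an illustrative placeholder (the template names are the real ones given above), and the filename is just a suggestion:

```shell
#!/usr/bin/env bash
# Write a small reference file of airlock coupling values.
# All IDs below are illustrative placeholders, not real workspace IDs.
cat > airlock-ids.txt <<'EOF'
import_review_workspace_id=00000000-0000-0000-0000-000000000000
import_review_workspace_service_id=00000000-0000-0000-0000-000000000000
export_review_workspace_service_id=00000000-0000-0000-0000-000000000000
import_review_vm_template=tre-service-guacamole-import-reviewvm
export_review_vm_template=tre-service-guacamole-export-reviewvm
EOF

# Quick sanity check: both review-VM template names are present.
grep -c 'reviewvm' airlock-ids.txt   # 2
```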
This is a very clunky procedure, and it needs repeating if either the workspace's Guacamole service or anything about the Airlock Import Review Workspace is replaced. We plan to streamline this eventually.
Once a workspace is created, the TRE Admin has to allocate users to roles in that workspace. The three roles available at present are:
- Workspace Owner - the project manager, with full rights on that workspace, but only on the workspace, not the SDE
- Workspace Researcher - someone who uses the workspace, and can create user-level resources, but cannot configure the workspace itself, or create shared workspace resources
- Airlock Manager - someone who approves or denies data import/export requests. This should be someone from the PM team and must not be a project member; otherwise they could create and approve their own import/export requests.
The users in the project first need to have their accounts registered in the PMP SDE tenancy in Azure. Once their accounts are created in Azure, they can be allocated roles. This is done by assigning them a role in the App registration for their workspace resource group. Log in to the Azure portal and select App registrations from the home page. If it isn't there, type App registration in the search bar.
In the App Registrations window, search for the app for your workspace. All SDE workspaces follow the format {SDE_NAME}-ws-{last-four-digits-of-workspace-id}. You can get the workspace ID from the SDE portal, either by clicking the blue 'i' in the circle on the tile for your workspace on the home page (as for workspace services, above), or from the Details tab of the workspace landing page. In this example, you can see my workspace ID ends in 5da9, so the app registration I want is sde002-ws-5da9.
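The naming rule is easy to script if you prefer the command line; a minimal sketch, where the SDE name and workspace ID are illustrative placeholders:

```shell
#!/usr/bin/env bash
# Derive the app registration name from the SDE name and the workspace ID.
# Both values below are illustrative placeholders.
sde_name="sde002"
workspace_id="11111111-2222-3333-4444-123456785da9"

# The app registration name is {SDE_NAME}-ws-{last four characters of the ID}.
app_name="${sde_name}-ws-${workspace_id: -4}"
echo "${app_name}"   # sde002-ws-5da9
```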
Select your application, then select the Managed application in local directory link.
Then click on the Assign users and groups tile, then + Add user/group, at the top of the screen.
Click None selected under Users and groups, search for the user or users you want to assign, and select them all. Click the Select button at the bottom.
Click None selected under Select a role, then choose one role, and select it. You can't assign a user to multiple roles at the same time.
Once the user(s) or group(s) and role have been selected, you can Assign them.
In addition to the workspace-specific role, all users will need to be granted the TRE User role on the API application, which in our case is called sde002 API. The exact same procedure does the trick.
That's all there is to it. If the user has trouble seeing the right view in the SDE portal after that, they should refresh their credentials (log out/in again, try in a private browser window, wait a few hours, the usual stuff).
Note that the Workspace Owner role is not a superset of the Workspace Researcher, so owners should be given the researcher role too.
To remove a user's access to a workspace, simply remove their role assignment from that workspace's application. To remove their access to the SDE entirely, remove them from the sde002 API application.
To grant someone the SDE Administrator role, use the same procedure as for adding a user to a workspace, but instead add them to the "{SDE_NAME} API" application - so "sde002 API" for the Alpha release, for example. Give the user the TRE Administrators role to grant them full rights on the SDE.
The Terraform module located in the azure-monitor directory of the AzureTRE repository provides a template for deploying Azure Monitor components. This typically includes resources such as Log Analytics Workspaces, Application Insights, and the relevant configuration to support monitoring across Azure resources.
- Log Analytics Workspace:
- The Terraform module creates a Log Analytics Workspace, which acts as a central repository for logs and metrics collected by Azure Monitor.
- Example code snippet to define a Log Analytics Workspace:
resource "azurerm_log_analytics_workspace" "workspace" {
  name                       = "log-${var.tre_id}-ws-${local.short_workspace_id}"
  resource_group_name        = var.resource_group_name
  location                   = var.location
  retention_in_days          = 30
  sku                        = "PerGB2018"
  tags                       = var.tre_workspace_tags
  internet_ingestion_enabled = var.enable_local_debugging ? true : false

  lifecycle { ignore_changes = [tags] }
}
- Application Insights:
- Application Insights is configured to monitor the performance and usage of applications. The integration with Log Analytics Workspace is set up to route telemetry data.
- Example code snippet to define an Application Insights resource:
resource "azurerm_application_insights" "workspace" {
  name                                = local.app_insights_name
  location                            = var.location
  resource_group_name                 = var.resource_group_name
  workspace_id                        = azurerm_log_analytics_workspace.workspace.id
  application_type                    = "web"
  internet_ingestion_enabled          = var.enable_local_debugging ? true : false
  force_customer_storage_for_profiler = true
  tags                                = var.tre_workspace_tags

  lifecycle { ignore_changes = [tags] }
}
Due to known issues with the azurerm provider's Application Insights resource, azapi is used instead:
resource "azapi_resource" "appinsights" {
  type      = "Microsoft.Insights/components@2020-02-02"
  name      = local.app_insights_name
  parent_id = var.resource_group_id
  location  = var.location
  tags      = var.tre_workspace_tags

  body = jsonencode({
    kind = "web"
    properties = {
      Application_Type                = "web"
      Flow_Type                       = "Bluefield"
      Request_Source                  = "rest"
      IngestionMode                   = "LogAnalytics"
      WorkspaceResourceId             = azurerm_log_analytics_workspace.workspace.id
      ForceCustomerStorageForProfiler = true
      publicNetworkAccessForIngestion = var.enable_local_debugging ? "Enabled" : "Disabled"
    }
  })

  response_export_values = [
    "id",
    "properties.ConnectionString",
  ]

  lifecycle { ignore_changes = [tags] }
}
- Diagnostics Settings:
- Diagnostics settings are configured on various Azure resources to send logs and metrics to the Log Analytics Workspace.
- Example code snippet for configuring diagnostics:
resource "azurerm_monitor_diagnostic_setting" "example" {
  name                       = "example-diagnostic-setting"
  target_resource_id         = azurerm_virtual_machine.example.id
  log_analytics_workspace_id = azurerm_log_analytics_workspace.example.id

  # Recent azurerm provider versions use enabled_log in place of the
  # deprecated log { enabled = ... } block.
  enabled_log {
    category = "Administrative"
  }
}
At the workspace level there is also an individual azure-monitor module per workspace. This ensures that monitoring is tailored to the specific workspace environment, capturing logs, metrics, and other telemetry data from that workspace's resources only. The module has the following configuration:
module "azure_monitor" {
  source = "./azure-monitor"

  tre_id                                   = var.tre_id
  location                                 = var.location
  resource_group_name                      = azurerm_resource_group.ws.name
  resource_group_id                        = azurerm_resource_group.ws.id
  tre_resource_id                          = var.tre_resource_id
  tre_workspace_tags                       = local.tre_workspace_tags
  workspace_subnet_id                      = module.network.services_subnet_id
  azure_monitor_dns_zone_id                = module.network.azure_monitor_dns_zone_id
  azure_monitor_oms_opinsights_dns_zone_id = module.network.azure_monitor_oms_opinsights_dns_zone_id
  azure_monitor_ods_opinsights_dns_zone_id = module.network.azure_monitor_ods_opinsights_dns_zone_id
  azure_monitor_agentsvc_dns_zone_id       = module.network.azure_monitor_agentsvc_dns_zone_id
  blob_core_dns_zone_id                    = module.network.blobcore_zone_id
  enable_local_debugging                   = var.enable_local_debugging

  depends_on = [
    module.network,
    module.airlock
  ]
}
The module has a similar layout to the core azure-monitor module, and deploys the Azure Monitor Private Link Scope (AMPLS), along with a private endpoint linking the AMPLS to the workspace subnet. In core, the AMPLS is linked to the shared subnet instead, to collect logs from the shared resources.
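A hedged sketch of what that AMPLS-plus-private-endpoint deployment might look like; the resource names and naming pattern below are illustrative assumptions, not the repository's actual code:

```hcl
# Illustrative sketch: an AMPLS for the workspace, plus a private endpoint
# placing it on the workspace services subnet so agent traffic stays private.
resource "azurerm_monitor_private_link_scope" "workspace" {
  name                = "ampls-${var.tre_id}-ws-${local.short_workspace_id}"
  resource_group_name = var.resource_group_name
}

resource "azurerm_private_endpoint" "azure_monitor" {
  name                = "pe-ampls-${var.tre_id}-ws-${local.short_workspace_id}"
  location            = var.location
  resource_group_name = var.resource_group_name
  subnet_id           = var.workspace_subnet_id

  private_service_connection {
    name                           = "psc-ampls"
    private_connection_resource_id = azurerm_monitor_private_link_scope.workspace.id
    is_manual_connection           = false
    subresource_names              = ["azuremonitor"]
  }
}
```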
In addition to deploying Azure Monitor, Log Analytics Workspaces, and Application Insights, it's crucial to configure Data Collection Rules (DCRs) for fine-grained control over the data you collect from your resources. This section explains how to create a DCR, associate it with a VM, and ensure secure communication via a Data Collection Endpoint (DCE) linked to the Private Link Scope.
Data Collection Rules allow you to define precisely what data should be collected from your resources, such as virtual machines, and where that data should be sent. You can specify logs and metrics to be collected and routed to your Log Analytics Workspace.
Below is an example Terraform configuration that creates a Data Collection Rule targeting specific performance counters and logs from a VM.
resource "azurerm_monitor_data_collection_rule" "example" {
  name                = "example-dcr"
  location            = azurerm_resource_group.example.location
  resource_group_name = azurerm_resource_group.example.name

  destinations {
    log_analytics {
      name                  = "loganalyticsdestination"
      workspace_resource_id = azurerm_log_analytics_workspace.example.id
    }
  }

  # A data_flow block is required to route each stream to a destination.
  data_flow {
    streams      = ["Microsoft-Perf", "Microsoft-Event"]
    destinations = ["loganalyticsdestination"]
  }

  data_sources {
    performance_counter {
      name                          = "perf-counters"
      streams                       = ["Microsoft-Perf"]
      sampling_frequency_in_seconds = 60
      counter_specifiers = [
        "\\Processor(_Total)\\% Processor Time",
        "\\Memory\\Available MBytes"
      ]
    }

    windows_event_log {
      name    = "security-events"
      streams = ["Microsoft-Event"]
      # Critical, Error, and Warning events from the Security log.
      x_path_queries = ["Security!*[System[(Level=1 or Level=2 or Level=3)]]"]
    }
  }
}
This rule collects processor time and available memory metrics, as well as security event logs, and sends them to a Log Analytics Workspace.
Once you've created the DCR, you can associate it with a specific VM so that the VM adheres to the data collection policies defined in the rule.
Use the following Terraform configuration to associate the DCR with a VM:
resource "azurerm_monitor_data_collection_rule_association" "example" {
  name                    = "example-dcr-association"
  data_collection_rule_id = azurerm_monitor_data_collection_rule.example.id
  target_resource_id      = azurerm_virtual_machine.example.id
}
This configuration links the previously defined DCR to a specific virtual machine, ensuring that the VM collects and sends the specified data to Azure Monitor.
To secure data collection and transmission, you can create a Data Collection Endpoint (DCE) and associate it with the Private Link Scope. This ensures that the data is transmitted over a private, secure network rather than the public internet.
Below is an example of creating a DCE using Terraform:
resource "azurerm_monitor_data_collection_endpoint" "example" {
  name                          = "example-dce"
  location                      = azurerm_resource_group.example.location
  resource_group_name           = azurerm_resource_group.example.name
  public_network_access_enabled = false
}
This configuration sets up a DCE that only allows data collection over a private link, ensuring secure data transmission.
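For a VM to actually send its agent traffic through the DCE, the endpoint can also be associated with the machine. A hedged sketch, reusing the example names above; to my understanding, the azurerm provider requires endpoint associations to use this exact name:

```hcl
resource "azurerm_monitor_data_collection_rule_association" "dce_example" {
  # Associates the DCE (rather than a rule) with the VM; the provider
  # requires the fixed name "configurationAccessEndpoint" for this form.
  name                        = "configurationAccessEndpoint"
  target_resource_id          = azurerm_virtual_machine.example.id
  data_collection_endpoint_id = azurerm_monitor_data_collection_endpoint.example.id
}
```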
Finally, you can add the DCE to your Private Link Scope, enabling it to function within the scope of your secure, private network.
resource "azurerm_monitor_private_link_scoped_service" "example" {
  name                = "example-dce-link"
  resource_group_name = azurerm_resource_group.example.name
  scope_name          = azurerm_monitor_private_link_scope.example.name
  linked_resource_id  = azurerm_monitor_data_collection_endpoint.example.id
}
This step associates the DCE with the Private Link Scope, ensuring that the DCE is part of the secure, private network environment you've established.