Teams is the Last Workload to Support Deidentification of Personal Data
On March 16, Microsoft published message center notification MC244599 to announce that Teams usage reports would support the same obfuscation of personally identifiable information (PII) as the usage reports for other workloads. On April 9, Microsoft said that the roll-out of the feature was complete. This is Microsoft roadmap item 70774.
The text in MC244599 and roadmap item 70774 might lead you to think that this is a Teams feature. It’s not. As evident in this December 2020 post, workloads like Exchange Online and SharePoint Online could disguise user-identifiable information like email addresses and display names as well as SharePoint site names in the Microsoft 365 admin center reports. This is a case of Teams catching up. What’s odd about Teams only now obscuring its usage data is that the Microsoft Graph was able to obfuscate the raw Teams usage data then (see the example in the previous post).
Obscuring Personal Data
The setting to control the display of obfuscated user, group, and site data is in the Org-wide Reports section (Figure 1).
After setting the switch, the usage reports for workloads available in the Microsoft 365 admin center contain obfuscated user data (Figure 2).
The setting also covers the usage reports available in the Teams admin center.
The Graph Reports API is to Blame
The setting to control the anonymization of personally identifiable data applies to all reports generated by the Microsoft Graph Reports API, which is the basis for the usage reports in the Microsoft 365 admin center. Deciding to obscure usage data can force an admin to flip the setting back and forth to access some information. For instance, the admin center has a report for Microsoft browser usage (Chrome, Brave, and Firefox are studiously ignored). The report is useful to find people who still use the legacy Edge browser, which Microsoft removed in its April 2021 Windows update. But if you look at the report to find the names of people to contact to ask them to switch to a supported browser, you’ll see deidentified strings like C58FABF670363F68A787078886FCB1A1.
The same issue exists in reports like active users or groups activity, where the data is all but useless if you don’t know which users are active (and which aren’t) and which groups are in use (and which are not). In all cases, an admin can fix the problem quickly by resetting the switch, but it does show how unintended consequences often flow from an action.
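Resetting the switch doesn’t have to mean a trip to the admin center. As a minimal sketch in Python, and assuming the setting is exposed through the Graph adminReportSettings resource (a displayConcealedNames property at the /admin/reportSettings endpoint), the request to flip it could be built like this. The function name and the token handling are hypothetical, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: the tenant-wide report setting is exposed at this Graph endpoint
# as an adminReportSettings resource with a displayConcealedNames property.
REPORT_SETTINGS_URL = "https://graph.microsoft.com/v1.0/admin/reportSettings"


def build_concealment_request(access_token: str, conceal: bool) -> urllib.request.Request:
    """Build (but do not send) a PATCH request that turns concealment on or off.

    access_token is a hypothetical, already-acquired bearer token for an app
    or admin allowed to update the tenant's report settings.
    """
    body = json.dumps({"displayConcealedNames": conceal}).encode("utf-8")
    return urllib.request.Request(
        REPORT_SETTINGS_URL,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Example: prepare a request that makes real names visible again.
request = build_concealment_request("placeholder-token", conceal=False)
```

Passing the request to urllib.request.urlopen would send it. Remember that the change applies to every consumer of the Reports API in the tenant, so a script that flips the switch to get real names should flip it back afterwards.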
ISV and Your Own Reports as Well
Microsoft hypes the Graph Reports API to ISVs and customers as an easy way to integrate Microsoft 365 usage reporting into existing reporting solutions. This is true, but the downside is that the same switch used to control user anonymization in the Microsoft 365 admin center usage reports affects any other use of the API in a tenant.
For example, we have a PowerShell script that collects information about user activity from a range of Microsoft 365 workloads to present a per-user synopsis of how people interact with the service. The script uses the Reports API to fetch usage data from each workload and combines the data for each user to create the report. If the tenant switches on data obfuscation, the usage report fetched by the script is anonymized and returns data like this:
Report Refresh Date : 2021-04-13
User Principal Name : 47A3F2B66A3C6BF31F1C629D02B43A24
Display Name : 24589499045E94C4FF5C4A681A467937
Is Deleted : False
Deleted Date :
Last Activity Date : 2021-02-20
Send Count : 76
Receive Count : 123
Read Count : 0
Assigned Products : MICROSOFT 365 E5 DEVELOPER (WITHOUT WINDOWS AND AUDIO CONFERENCING)
Report Period : 90
Although the user’s privacy is protected, from an organizational perspective the value of the report is negated.
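A custom report can at least notice when it has been handed anonymized data instead of silently producing a useless result. The following Python sketch checks the CSV text returned for a usage report. It assumes that deidentified values are always 32-character uppercase hexadecimal strings, as in the samples shown in this article (Microsoft doesn’t document this as a contract), and the function names and the trimmed sample data are hypothetical.

```python
import csv
import io
import re

# Assumption: concealed identifiers are 32 uppercase hex characters, matching
# the samples in this article. This is an observed pattern, not a documented one.
CONCEALED_VALUE = re.compile(r"[0-9A-F]{32}")


def looks_concealed(value: str) -> bool:
    """Return True if a single value matches the deidentified pattern."""
    return CONCEALED_VALUE.fullmatch(value.strip()) is not None


def report_is_concealed(csv_text: str, column: str = "User Principal Name") -> bool:
    """Return True if every non-empty value in the column looks deidentified."""
    # Strip a leading byte order mark in case the CSV text starts with one.
    rows = csv.DictReader(io.StringIO(csv_text.lstrip("\ufeff")))
    values = [row[column] for row in rows if row.get(column)]
    return bool(values) and all(looks_concealed(v) for v in values)


# Trimmed sample built from the output shown above (not a full report).
SAMPLE = (
    "Report Refresh Date,User Principal Name,Display Name,Send Count\n"
    "2021-04-13,47A3F2B66A3C6BF31F1C629D02B43A24,"
    "24589499045E94C4FF5C4A681A467937,76\n"
)

if report_is_concealed(SAMPLE):
    print("Report data is anonymized; per-user output will not be meaningful.")
```

A script could run this check before merging data from different workloads and stop with a clear message rather than generate a report full of hashed identifiers.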
Understand What Obfuscation Means
It’s easy to understand why Microsoft builds the ability to anonymize user data in reports into the admin center. Several user-assignable roles (like Reports Reader) can access the reports, so it’s good to have a way to protect user privacy, even if it’s only surface-deep. What’s less understandable is the impact the switch has on custom reporting. It just seems a little crude to have a binary switch that controls all output.