Apply Office 365 Sensitivity Labels at Scale to At-Rest Data
Office 365 sensitivity labels are used to mark messages and Office documents with visual indicators of the importance or sensitivity of an item. Optionally, a sensitivity label can also invoke rights-management based encryption to protect labeled items.
Manual application of sensitivity labels is a good way to protect new messages and documents but does nothing to deal with the mass of documents and messages that already exist inside Office 365. To address the issue, Microsoft is running a preview program for auto-labeling Word, Excel, and PowerPoint files stored in SharePoint Online sites and OneDrive for Business accounts (Exchange Online will come later). The solution is intended to allow Office 365 tenants to protect existing content at scale without needing anyone to review large quantities of documents.
Microsoft’s Information Protection team recently hosted a Teams Live Event (Figure 1) to discuss what they’re working on in the preview and what they hope to make generally available later this year. The event recording (in Stream) is available to all (via anonymous join) and, if you’ve never used this technology, is a good example of how Teams can host product briefings. Further information is in the Yammer Information Protection community (also open).
Auto-Label Policies for Sensitivity Labels
Auto-label policies are configured with rules that look for at-rest (closed) documents containing Office 365 sensitive data types (such as bank account details). For example, a policy might include rules to detect documents with two or more instances of passport or personal identity card numbers and protect matching documents with a sensitivity label called Personal Information. Apart from the hundred-plus standard sensitive data types defined by Microsoft, auto-label policies also support custom sensitive data types defined by the tenant, meaning that you could scan for documents relating to a sensitive project and auto-label those files.
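The rule in the example above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Office 365 detects sensitive data: the two regular expressions are made-up stand-ins for real sensitive data types, the names Rule, count_instances, and match_label are hypothetical, and the sketch assumes that the "two or more instances" threshold is counted across both data types combined.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

# Toy patterns standing in for sensitive data types (assumptions for illustration;
# real sensitive data types are defined by Microsoft or by the tenant).
PATTERNS = {
    "passport_number": re.compile(r"\b[A-Z]{2}\d{7}\b"),
    "identity_card_number": re.compile(r"\b\d{3}-\d{3}-\d{3}\b"),
}


@dataclass
class Rule:
    """Hypothetical model of one auto-label policy rule."""
    data_types: Tuple[str, ...]  # which sensitive data types to look for
    min_count: int               # instances needed before the rule matches
    label: str                   # sensitivity label to apply on a match


def count_instances(text: str, data_types: Tuple[str, ...]) -> int:
    """Count matches of all listed data types in the text (combined total)."""
    return sum(len(PATTERNS[name].findall(text)) for name in data_types)


def match_label(text: str, rule: Rule) -> Optional[str]:
    """Return the rule's label if the text meets the threshold, else None."""
    if count_instances(text, rule.data_types) >= rule.min_count:
        return rule.label
    return None


rule = Rule(("passport_number", "identity_card_number"), 2, "Personal Information")
print(match_label("Passports: AB1234567 and CD7654321", rule))  # Personal Information
print(match_label("Passport AB1234567 only", rule))             # None
```

A document with one passport number and one identity card number also matches under the combined-count assumption; a per-type threshold would need a small change to count_instances.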
The source documents are examined by a background process capable of applying sensitivity labels to up to 25,000 documents per tenant daily (so it might take some time to process all content in the target sites). The process scans files to find instances of sensitive data types that match the rules set in auto-label policies. When a match is detected, the process applies the sensitivity label unless a user has already applied a sensitivity label (explicit assignment always beats auto-assignment). Labels applied automatically remain with documents if they are moved out of a site or account processed by an auto-label policy.
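Two points from the paragraph above lend themselves to a quick sketch: the precedence of user-applied labels over automatic ones, and the effect of the daily processing figure on how long a backlog takes to clear. The function names here are hypothetical, and the estimate assumes the process labels at the full quoted rate every day, which is a best case.

```python
import math
from typing import Optional

DAILY_LABEL_LIMIT = 25_000  # per-tenant daily figure quoted in the article


def resolve_label(user_label: Optional[str], auto_label: Optional[str]) -> Optional[str]:
    """Explicit assignment beats auto-assignment: keep a user's label if one exists."""
    if user_label is not None:
        return user_label
    return auto_label


def days_to_label(matching_documents: int, daily_limit: int = DAILY_LABEL_LIMIT) -> int:
    """Best-case number of days needed to label a backlog at the daily limit."""
    return math.ceil(matching_documents / daily_limit)


print(resolve_label("Confidential", "Personal Information"))  # Confidential
print(resolve_label(None, "Personal Information"))            # Personal Information
print(days_to_label(120_000))                                 # 5
```

In other words, a tenant with 120,000 matching documents should expect the background process to need at least five days to label them all.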
The preview allows tenants to have up to ten auto-label policies, and each policy can cover up to ten sites or accounts (these limits might be raised when auto-label policies reach general availability). Even then, the load that scanning files imposes on the service makes it likely that some limit on the number of scanned sites will remain. This will force tenants to select the sites where sensitive data needing protection is most likely to be found. Office 365 E5 or Microsoft 365 E5 licenses are needed for all accounts that contribute files to the sites scanned by auto-label policies.
Auto-label policies have a test mode, which lets you discover which files in a target location match the rules set in a policy. The idea is to allow administrators to tune policy rules by seeing the effect of changes to rule conditions. Another interesting feature is the content explorer, which lists the files containing sensitive data types in the scanned sites. Again, the idea is that admins can use this information to fine-tune auto-label policy settings.
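When planning which sites to cover, it can help to check a draft plan against the preview limits before creating anything in the tenant. The following is a minimal sketch under that assumption: the plan is just a dictionary of hypothetical policy names mapped to site or account URLs, check_plan is an invented helper, and nothing here talks to Office 365.

```python
from typing import Dict, List

MAX_POLICIES = 10              # preview limit on auto-label policies per tenant
MAX_LOCATIONS_PER_POLICY = 10  # preview limit on sites or accounts per policy


def check_plan(plan: Dict[str, List[str]]) -> List[str]:
    """Return a list of problems found in a draft plan (empty if it fits the limits)."""
    problems = []
    if len(plan) > MAX_POLICIES:
        problems.append(
            f"{len(plan)} policies planned; the preview allows {MAX_POLICIES}"
        )
    for name, locations in plan.items():
        if len(locations) > MAX_LOCATIONS_PER_POLICY:
            problems.append(
                f"policy '{name}' covers {len(locations)} locations; "
                f"the preview allows {MAX_LOCATIONS_PER_POLICY}"
            )
    return problems


# Hypothetical plan: one policy within the limit, one over it.
plan = {
    "Personal Information": ["https://contoso.example/sites/hr"],
    "Project Files": [f"https://contoso.example/sites/team{i}" for i in range(11)],
}
for problem in check_plan(plan):
    print(problem)
```

If the limits change at general availability, only the two constants need updating.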
If the preview progresses as expected, auto-label policies should be generally available in the Microsoft 365 compliance portal in the March-April 2020 timeframe. If you want to join the preview, you can register your tenant details here.
We have a complete chapter about protecting Office 365 content in the Office 365 for IT Pros eBook. We’ve been tracking the progress of sensitivity labels since their first introduction, so our coverage is pretty good.