Microsoft 365’s New Retention Assistant Processes Multiple Workloads

Aiming for Workload-Agnostic Retention

Both Exchange and SharePoint on-premises servers offered retention processing. Exchange’s Messaging Records Management (MRM 2.0) persists today in the form of mailbox retention policies. Microsoft 365 retention policies, first introduced in 2017, replaced SharePoint’s equivalent (here’s an example of SharePoint retention in use) as part of a strategy to implement workload-agnostic data governance. In other words, the aim was to deploy retention and other capabilities in a way that works across all Microsoft 365 workloads.

The initial implementation of retention policies (and classification labels, as Microsoft then called retention labels) supported Exchange Online (including public folders), SharePoint Online, OneDrive for Business, and Skype for Business Online.

Compliance Records and Retention

Teams was the first Office 365 application to store compliance records in Exchange Online mailboxes. Compliance records are mail items created to capture details of chats and channel conversations, including messages sent by people who don’t have Exchange Online mailboxes, such as on-premises users, guest accounts, and external (federated) users (the substrate stores these messages in special cloud-only mailboxes). Yammer adopted the same mechanism in 2020, but only for networks configured in Microsoft 365 native mode. The Planner team announced that it would follow suit but has not yet delivered this capability.

If you check the number of compliance items in your mailbox, you might find that you have more than you imagine. Most of my chatting happens in other tenants, but even so, I have quite a few compliance records for Teams chats (and fewer for Yammer conversations). Exchange Online doesn’t charge the storage consumed by these messages against user mailbox quotas.

# Fetch statistics for the hidden (non-IPM) folders, then report the two folders that hold compliance records
$Folders = Get-ExoMailboxFolderStatistics -Identity Tony.Redmond@office365itpros.com -IncludeOldestAndNewestItems -FolderScope NonIPM
$Folders | Where-Object { $_.Name -in "TeamsMessagesData", "Yammer" } | Format-Table Name, ItemsInFolder, FolderSize, NewestItemReceivedDate -AutoSize

Name              ItemsInFolder FolderSize                     NewestItemReceivedDate
----              ------------- ----------                     ----------------------
Yammer                      330 4.144 MB (4,344,857 bytes)     23/08/2021 08:23:50
TeamsMessagesData         14815 1.547 GB (1,660,565,924 bytes) 23/08/2021 17:22:38

As with any other Exchange Online content, Microsoft Search includes the compliance records in its indexes, which makes the records available for eDiscovery. Communication compliance policies also process compliance records to detect policy violations such as threatening or offensive behavior.

In addition to their role in eDiscovery, compliance records are the basis for retention processing for their originating workloads. Instead of having to create separate jobs to apply retention policies against the workload repositories, a single processing job runs against the compliance records and synchronizes policy actions, such as removing items, back to the workloads.

Originally, the Exchange Managed Folder Assistant (MFA) processed the Teams compliance records. This was a pragmatic step because MFA could process the compliance records along with other mailbox content, which avoided the need to create a special retention assistant for Teams. When MFA processed Teams content, it synchronized the results back to Teams, which applied the deletions to its Cosmos DB-based message store. Teams clients then picked up the changes from the message store and removed items in local caches.

The implementation worked well for Teams channel conversations and chat messages, but a better solution was needed to deal with the growth of workloads supporting retention policies and the need to process special cases like conversations in Teams private channels. To solve the problem, Microsoft implemented a specific background assistant to handle retention processing for all Microsoft 365 workloads except Exchange Online (Exchange is different because it supports both the older mailbox retention policies and Microsoft 365 retention policies).

The Retention Assistant and the Microsoft 365 Substrate

Today, the retention assistant evaluates retention policies against OneDrive for Business, SharePoint Online, Yammer, and Teams using the “digital twins” of workload items stored in the Microsoft 365 substrate. Digital twins are copies of workload content that the substrate holds in Exchange Online to make it easier for shared services like Search and artificial intelligence services to process items. It’s much easier for a service to process items in a single place than to deal with multiple repositories, each requiring a different interface (API).

In some cases, the digital twins are not complete copies, but they’re sufficient to allow shared services to operate efficiently and effectively. If you want to hear more about how the substrate works, consider signing up for TEC 2021 to hear Microsoft CTO for Modern Workplace Transformation Jeffrey Snover discuss the topic.

Substrate items coming within the scope of retention policies are subject to retention periods and actions. As the assistant processes items, it checks if the item’s retention period has lapsed, and if this is the case, invokes the retention action. This could involve additional processing, like putting the item into a manual disposition cycle, or it could be a simple delete action. The assistant interacts with the underlying workloads to ensure compliance with retention policy actions. For instance, if the assistant determines that the retention period for a document stored in a SharePoint Online document library has expired, it instructs SharePoint Online to move the document into the site recycle bin.
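
The evaluation step described above can be pictured with a short PowerShell sketch. This is an illustration only and not Microsoft's implementation: the function name Get-RetentionDecision, the item and policy shapes, and the property names (Created, RetentionDays, Action) are all hypothetical, and the sketch assumes a retention period counted in days from an item's creation date.

```powershell
# Hypothetical sketch of the "check the period, then act" logic; not Microsoft's code.
function Get-RetentionDecision {
    param(
        [Parameter(Mandatory)] [pscustomobject] $Item,    # assumed shape: Name, Created
        [Parameter(Mandatory)] [pscustomobject] $Policy,  # assumed shape: RetentionDays, Action
        [datetime] $Now = (Get-Date)
    )
    # Assumption: the retention period starts when the item is created.
    $expires = $Item.Created.AddDays($Policy.RetentionDays)
    if ($Now -lt $expires) {
        $decision = 'Retain'         # period has not lapsed, so leave the item alone
    } else {
        $decision = $Policy.Action   # period has lapsed, so invoke the policy's action
    }
    [pscustomobject]@{ Item = $Item.Name; Expires = $expires; Decision = $decision }
}

# Made-up sample data: a one-year policy whose action is a simple delete.
$policy = [pscustomobject]@{ RetentionDays = 365; Action = 'Delete' }
$now    = [datetime]'2021-08-23'
$items  = @(
    [pscustomobject]@{ Name = 'Old chat'; Created = [datetime]'2020-01-15' }
    [pscustomobject]@{ Name = 'New chat'; Created = [datetime]'2021-06-01' }
)

$results = $items | ForEach-Object { Get-RetentionDecision -Item $_ -Policy $policy -Now $now }
$results | Format-Table -AutoSize
```

With this sample data, "Old chat" comes back with a Delete decision and "New chat" with Retain. In the real service, the action could instead start a disposition review, and the assistant would then ask the owning workload (for example, SharePoint Online) to carry out the result.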

Black Box Processing

Although it’s possible to track the processing done by MFA for mailbox items, Microsoft has not yet made an equivalent capability available for the retention assistant. This means that administrators who want to validate the effectiveness of retention processing need to make manual checks of content in repositories like SharePoint Online sites to ensure that items which retention policies should remove are no longer present. As evident in this useful flowchart, understanding how retention policies process items can be a complex business, especially when locations like sites or teams come within the scope of multiple policies.

Making sense of complex retention policies is why the retention assistant exists. I’m sure it does its job very well. It would just be nice to be able to validate and understand exactly what actions the assistant takes for different locations. Is that too much to ask?


