Lack of APIs and Integration of Data Across Apps Creates Challenges
Updated: February 24, 2022
I’m bemused by the solutions people proposed when asked about Teams backup. It’s reasonable to consider backing up any IT service and, given the recent growth in Teams usage, I have seen several requests from people looking to understand what they can do to secure Teams data. The reasons why you might want to back up a cloud service include:
- Securing critical data against loss caused by an external attack (including ransomware and other malware).
- Stopping rogue administrators removing or altering information.
- Moving copies of data to external repositories to ensure people can work if the cloud service is unavailable for an extended period.
These are classic reasons long cited in the on-premises world. However, in the cloud, things are different because you typically don’t have the same level of access to data that you enjoy on-premises.
No Microsoft Backups for Office 365 Data
Apart from SharePoint Online, Microsoft doesn’t back up its Office 365 services. Microsoft relies on its technology (like native data protection for Exchange Online) to protect data, so if you want a backup, you must use a third-party service. There are many services available and generally there are no problems backing up or restoring mailbox and document data. Backup products for Exchange Online and SharePoint Online have roots in on-premises technology and the methods to move data in and out of mailboxes and sites are well understood. APIs, albeit never designed for cloud backups, are available, and everything works. Well, everything works until sensitivity labels and encrypted content are introduced into the mix, but that’s another discussion.
Teams is a Cloud App
Teams is a different matter. Unlike Exchange and SharePoint, Teams is a product of the cloud. It does not exist on-premises and no one ever developed backup interfaces for Teams. But more importantly, Teams is built on top of multiple Office 365 and Azure services. The data in these services is interconnected and dependent. Restoring a mailbox is simple compared to the reconstruction of a team, complete with all its channels, tabs, conversations, meetings, and so on.
Claims of Teams Backup Vendors
Some vendors claim their products cover Teams backup. Most ISVs base their claim to cover Teams on copying the Teams compliance records stored in group and personal mailboxes in Exchange Online. Although it is possible to copy Teams compliance records like any other Exchange mailbox data, this is not a backup. It doesn’t even come close for two reasons:
- Teams compliance records are designed to capture communications for eDiscovery and compliance use. They are not the actual data and the compliance records are not true copies of the original because they lack certain elements of Teams messages, such as reactions.
- No API exists to restore Teams compliance messages into a Teams channel conversation or personal chat. You could read the compliance records and use Graph API calls to write new messages into channel conversations and chats, but this is not a true restore because the newly-written items would be dated differently to the original and lack all the data not copied to compliance records.
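To make the second point concrete, here is a minimal sketch of what a "restore" by re-posting looks like. It assumes a normal (non-migration) post to the Graph channel messages endpoint, a hypothetical compliance record shape (the `sender`, `sent`, and `body_html` keys are invented for illustration), placeholder team and channel IDs, and an access token obtained elsewhere. The code only builds the request and never sends it.

```python
import json
import urllib.request
from html import escape

GRAPH = "https://graph.microsoft.com/v1.0"


def build_repost_request(team_id, channel_id, record, token):
    """Build (but do not send) a request that re-posts one compliance record.

    `record` is a hypothetical dict with 'sender', 'sent', and 'body_html' keys.
    """
    # The original sender and timestamp can only be carried as visible text:
    # the service assigns the author and creation time when the message is posted.
    header = "<p><em>Originally posted by {} at {}</em></p>".format(
        escape(record["sender"]), escape(record["sent"])
    )
    payload = {
        "body": {"contentType": "html", "content": header + record["body_html"]}
    }
    url = f"{GRAPH}/teams/{team_id}/channels/{channel_id}/messages"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = {
        "sender": "Example User",
        "sent": "2021-11-02T09:15:00Z",
        "body_html": "<p>Example message text.</p>",
    }
    req = build_repost_request("TEAM-ID", "CHANNEL-ID", sample, token="PLACEHOLDER")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

The payload has nowhere to put the original date or anything the compliance record never captured (such as reactions), which is why the result is a new message that quotes the old one rather than a restored item.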
Any backup vendor who insists that they deliver Teams coverage through Exchange Online exhibits a woeful ignorance of Teams technology. If a vendor doesn’t understand the strengths and weaknesses of their product, you shouldn’t use them.
The second (less common) approach is to use the beta Teams migration API to back up Teams data. I covered how BitTitan uses the API for cross-tenant migration in a Petri.com article in August 2019 (AvePoint and Quadrotech use the same API for their tenant-to-tenant migration products). Not much has happened to develop the API since, and the same problems exist. One glaring issue is the inability to handle Teams personal chats. This issue is addressed in the Teams Export Graph API, but that API has challenges of its own, notably the cost/consumption models that Microsoft wants to use to charge for API use.
The commentary about Teams backup found on vendor sites often lacks technical depth and understanding, such as the discussion from which I took Figure 1. I’m not sure if the various vendors cited agree on the assessment of their capabilities! To be fair to the author and the vendors, capabilities change over time and it’s wise to check what the current status is when discussing the problem of backing up Teams with ISVs.
Teams Backup Fails to Deal with all Teams Data
Both approaches fail to take the wide spectrum of Teams interconnected data into account. Backing up one piece of information secures that data, but that data might be useless if other connected data is not copied and available.
Table 1 lists some of the connected data used by Teams. It’s not a definitive list and other data might be needed (like OneNote) to create a comprehensive backup of Teams in a Microsoft 365 tenant. The purpose of the list is to illustrate the wide array of user and system data consumed by Teams. If you want to back up Teams, you need to understand what data is used with Teams in your tenant. Once you know that, you can figure out how to solve the backup problem.
In some cases, a workaround might compensate for the lack of a backup API. For instance, you could download every video from Stream and copy the video to a backup site. You could use the Graph API to copy Plans, and so on.
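As an illustration of the Planner workaround, the sketch below reads the plans owned by a team's group and the tasks in each plan through the Graph Planner endpoints, then writes them to a JSON file that an ordinary file backup could pick up. It is a rough outline, not a product: the helper names are my own, the group ID and access token are assumed to come from elsewhere, and the `fetch` parameter is injected so the logic can be exercised without calling Graph.

```python
import json
import urllib.request

GRAPH = "https://graph.microsoft.com/v1.0"


def graph_get(url, token):
    """Fetch one page of Graph results as a dict (needs a valid access token)."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def read_all(url, fetch):
    """Collect the 'value' items from every page, following @odata.nextLink."""
    items = []
    while url:
        page = fetch(url)
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")
    return items


def export_plans(group_id, fetch):
    """Return the plans owned by a group (team), each with its tasks attached."""
    plans = read_all(f"{GRAPH}/groups/{group_id}/planner/plans", fetch)
    for plan in plans:
        plan["tasks"] = read_all(f"{GRAPH}/planner/plans/{plan['id']}/tasks", fetch)
    return plans


def save_export(plans, path):
    """Write the export to a JSON file that a file-level backup can pick up."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(plans, handle, indent=2)


# Possible usage, given a real group ID and token:
# plans = export_plans(group_id, lambda url: graph_get(url, token))
# save_export(plans, "planner-export.json")
```

This copies plan and task properties only. Recreating plans from such a file would produce new objects rather than the originals, so it is a copy for reference more than a restorable backup.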
|Teams data|Location|Backup Situation|
|---|---|---|
|Personal and group chat messages|Azure Cosmos DB|No Teams backup API available. The Microsoft 365 substrate captures imperfect copies of personal chats and stores them as mail items in the mailboxes of chat participants. See note about compliance records below.|
|Regular channel conversations|Azure Cosmos DB|No Teams backup API available. The Microsoft 365 substrate captures imperfect copies of regular channel conversations and stores them as mail items in the group mailboxes of the team owning the channel. See note about compliance records below.|
|Private channel conversations|Azure Cosmos DB|No Teams backup API available. The Microsoft 365 substrate captures imperfect copies of private channel conversations and stores them as mail items in the individual mailboxes of the members of the channel. See note about compliance records below.|
|Shared channel conversations|Azure Cosmos DB|No Teams backup API available. The Microsoft 365 substrate captures imperfect copies of shared channel conversations and stores them as mail items in the special cloud-only mailbox owned by the channel. This mailbox is invisible to normal administrative processes. See note about compliance records below.|
|Loop components in Teams chat|OneDrive for Business|The .fluid files created for Loop components stored in OneDrive for Business accounts can be backed up. However, they can’t be restored into Teams. Compliance records are captured for Loop components, but these records store no content and are merely a pointer to the relevant file in OneDrive.|
|GIFs used in Teams messages|Teams CDN|No backup API available.|
|Documents shared in personal and group chats|OneDrive for Business|Backed up with OneDrive for Business.|
|Documents shared in Teams channels (Files)|Document libraries and folders in SharePoint Online sites|Backed up with SharePoint Online.|
|Private channels|Separate SharePoint Online site per private channel|Backed up with SharePoint Online (if the backup product processes the sites used by private channels).|
|Shared channels|Separate SharePoint Online site per shared channel|Backed up with SharePoint Online (if the backup product processes the sites used by shared channels).|
|Email sent to Teams channels via connector|Azure Cosmos DB and SharePoint Online|Backed up with SharePoint Online (messages posted to channels are not backed up).|
|Messages posted to channels via Office connectors|Azure Cosmos DB|No backup API available.|
|Teams calendar|User and group mailboxes (Exchange Online)|Backed up with Exchange Online data. Personal meetings are in the calendar folder of user mailboxes while channel meetings are in the group calendar of the team owning the channel. Meetings scheduled in a shared channel are in a calendar folder of the cloud-only mailbox used by the channel.|
|Teams meeting recordings|Stream/OneDrive|No Stream backup API available. Recordings of meetings stored in OneDrive for Business and SharePoint can be backed up along with other OneDrive and SharePoint data.|
|Teams meeting insights|Exchange Online|The attendance reports for meetings, registration reports for Teams webinars, and meeting transcripts are stored in Exchange Online mailboxes of meeting organizers. These artefacts can be backed up along with other Exchange Online data if the backup fetches the data from their locations in folders in the non-IPM part of the mailbox. The information stored in OneDrive for Business to allow spoken text and webinar details to be searched might be backed up along with other OneDrive information but restoring the data to the right place might be problematic.|
|Teams Wiki|SharePoint Online|Should be backed up with other SharePoint data.|
|Teams compliance records|Exchange Online mailboxes|Backed up with contents of Exchange Online user and group mailboxes (if the backup product processes the TeamsMessageData folder in the non-IPM part of the mailboxes). Note that these are imperfect copies of Teams messages which remain in Azure Cosmos DB. The Microsoft 365 substrate captures compliance records for all Teams messages, including those sent by guest and hybrid users, and messages from external (federated) chats. The substrate uses cloud-only mailboxes to hold compliance records for guest, hybrid, and federated messages.|
|Planner (Tasks in Teams)|Azure|No Planner backup API available. Compliance records for Tasks created by Planner (and To Do) are held in Exchange Online for eDiscovery and compliance purposes, but the actual Planner data remains in Azure.|
|Teams audit data|Office 365 audit log|Can be extracted with the Search-UnifiedAuditLog cmdlet (PowerShell).|
|First party apps like Approvals, Viva Insights, etc.|Various|Depends on the repository (like Dataverse). Not obviously accessible to backup products.|
|Third-party apps|Teams app store and third-party repositories|Responsibility of third-party apps.|
|Teams membership and group object|Azure Active Directory|Can be backed up by reading information from Azure AD (membership of Teams private channels is not in Azure AD).|
|Teams policies and settings|Azure|Some data can be backed up by reading policies and settings with PowerShell and saving the configuration settings in a file accessible for backup.|
|Teams usage data|Microsoft Graph|Can be read from the Teams Reports Graph API, but needs to be saved in a form accessible to backups.|
|Whiteboard used in meetings|Microsoft Whiteboard service|No backup API available. This issue is addressed when Microsoft moves the storage of whiteboard content to OneDrive for Business (MC253185).|
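One way to act on Table 1 is to turn it into a checklist and compare it with what a backup product says it covers. The sketch below is only a starting point: it transcribes a handful of rows from the table by hand, the class and function names are my own, and the product in the example is hypothetical. Extend the inventory to match the data that Teams uses in your tenant.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TeamsData:
    name: str
    location: str
    covered_by: Optional[str]  # workload backup that picks this data up, or None


# A subset of Table 1, transcribed by hand; extend it to match your tenant.
INVENTORY = [
    TeamsData("Personal and group chat messages", "Azure Cosmos DB", None),
    TeamsData("Regular channel conversations", "Azure Cosmos DB", None),
    TeamsData("Documents shared in chats", "OneDrive for Business", "OneDrive for Business"),
    TeamsData("Documents shared in channels", "SharePoint Online", "SharePoint Online"),
    TeamsData("Teams calendar", "Exchange Online", "Exchange Online"),
    # Compliance records are imperfect copies, not the messages themselves.
    TeamsData("Teams compliance records", "Exchange Online", "Exchange Online"),
    TeamsData("Planner (Tasks in Teams)", "Azure", None),
    TeamsData("Whiteboard used in meetings", "Microsoft Whiteboard service", None),
]


def coverage_gaps(inventory, product_workloads):
    """Return the names of data types that a backup product would miss."""
    return [
        item.name
        for item in inventory
        if item.covered_by is None or item.covered_by not in product_workloads
    ]


if __name__ == "__main__":
    # Hypothetical product that backs up Exchange Online and SharePoint Online only.
    for name in coverage_gaps(INVENTORY, {"Exchange Online", "SharePoint Online"}):
        print("Not covered:", name)
```

Even a product that covers every workload leaves the rows marked `None` unprotected, which is the gap to raise with a vendor.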
The problem with workarounds is that they often lack automation and the ability to scale. How many videos does a tenant store in Stream? How many are generated daily? How many plans are created and how many tasks are added, changed, or removed daily? And whiteboards?
Restore an Even Bigger Issue
Vendors of Teams backup solutions want to sell products. Their access to your data is limited by the available APIs, so no great mystery exists as to why a comprehensive backup for Teams is so difficult to achieve. And once you have some backup data, consider that restoring Teams is even more problematic.
For more information about Office 365 backups, read Chapter 4 of the Office 365 for IT Pros eBook. Our approach can be summarized as “understand what you need to back up and why before you commit to an external backup service.”