Written By Peter Rising
I’m often asked about eDiscovery in Microsoft 365, and it’s a technology that is easily misunderstood. What does it do, and why? What are the different versions of eDiscovery available in Microsoft 365 and how do you ensure that you are correctly licensed to use them? This article focuses on Advanced eDiscovery, but before we get to that we need to understand eDiscovery as a whole and the different flavors available.
Note: most of what I cover in this article can be found on Microsoft Docs. However, you do have to jump around a fair bit to see all the relevant and associated content, so this post attempts to describe the key features of Advanced eDiscovery in a single, hopefully easy-to-digest article, with a sprinkling of my thoughts throughout.
Now, I’ll be candid with you and admit that I have found Advanced eDiscovery to be a complex subject, and it really isn’t easy to follow in many ways. At the time of writing this article I feel I’ve developed a solid grasp of the technology, but every time I examine Advanced eDiscovery further and test it more vigorously I learn something else and understand it better. So, if you read this and feel that I’ve been inaccurate in any way, please come and talk to me about it, and I will go back and update and correct this piece to improve its accuracy.
So, what is eDiscovery?
eDiscovery within Microsoft 365 is a means for compliance administrators to search for, identify, place on hold, and export information which may be required for legal cases or internal compliance investigations within an organization. eDiscovery tools can search content in Exchange Online, SharePoint Online, OneDrive, Teams, Microsoft 365 Groups, and Yammer (only if Yammer is configured in Microsoft 365 mode). Microsoft currently provide two types of eDiscovery: Core eDiscovery and Advanced eDiscovery. Which of these tools is available to you depends on the Microsoft 365 licensing that you have in your tenant.
Content search
Content search allows organizations to search for data without having to create a formal eDiscovery case. Content search is used to find email, documents, and conversations in Microsoft 365 services such as Exchange Online, SharePoint Online, OneDrive, and Teams. When you perform a content search from the Microsoft 365 Compliance center, you build your search criteria with keyword queries and conditions to narrow the results. You can search Microsoft 365 locations as well as third-party data imported into Microsoft 365 from other services such as Google and Facebook, and then export the results of your searches for detailed analysis. It is important to point out that Content search does not search the external sources themselves, only data imported from those sources into Microsoft 365. To run a content search, you must be a member of the eDiscovery Manager role group (which is a Compliance Management role group as opposed to a role group from the Microsoft 365 admin center).
There are four ways to run a content search: New search, Guided search, Search by ID list, and PowerShell. New search lets you manually compose search criteria. Guided search gives less confident administrators a wizard to help them construct their searches. Search by ID list lets you search for mailbox content using a list of Exchange IDs supplied in a CSV file generated by a previous standard or guided content search. Guidance on running content search using PowerShell may be found here.
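To make the idea of keyword queries and conditions a little more concrete, here is a minimal Python sketch that assembles a Keyword Query Language (KQL) string of the kind you would paste into a content search. The helper name build_kql, the sample keywords, and the contoso.com address are all my own illustrative assumptions. The from: and sent properties and the uppercase AND/OR operators are standard KQL, but treat this as a sketch and not as anything Microsoft ships.

```python
from datetime import date


def build_kql(keywords, sender=None, sent_from=None, sent_to=None):
    """Combine keywords and optional conditions into one KQL string.

    Hypothetical helper for illustration only.
    """
    # Quote multi-word phrases so they are matched as a phrase.
    terms = [f'"{k}"' if " " in k else k for k in keywords]
    clauses = []
    if terms:
        # Any one of the keywords may match.
        clauses.append("(" + " OR ".join(terms) + ")")
    if sender:
        clauses.append(f"from:{sender}")
    if sent_from:
        clauses.append(f"sent>={sent_from.isoformat()}")
    if sent_to:
        clauses.append(f"sent<={sent_to.isoformat()}")
    # Every condition must hold, which narrows the results.
    return " AND ".join(clauses)


query = build_kql(
    ["merger", "project falcon"],
    sender="pat@contoso.com",
    sent_from=date(2021, 1, 1),
    sent_to=date(2021, 3, 31),
)
print(query)
```

Each extra condition is joined with AND, so adding conditions can only shrink the result set, which is exactly how conditions behave when you narrow a search in the Compliance center.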
Figure 1 shows the options to run a content search from the Microsoft 365 Compliance center.
Content search is also used to search for data in cloud-only mailboxes used for hybrid and guest users.
For awareness of the licensing requirements to run a content search, please refer to this article.
Detailed information on content search is here.
Core eDiscovery
Whilst content search is a standalone feature, it is also incorporated into Core eDiscovery. In addition to content search, Core eDiscovery also provides the ability to create cases for investigations, place content on hold, and export the search results. Creating an eDiscovery case may be required if your organization is asked to comply with a litigation process and evidence needs to be collected and presented to an opposing legal counsel. Cases may also be needed for other reasons, many of which do not involve lawyers and consist of internal investigations.
A Core eDiscovery case is configured from the Microsoft 365 Compliance center by compliance administrators who are members of the eDiscovery Manager role group. A new case contains four tabs: Home, Holds, Searches, and Exports.
The Home tab (Figure 2) shows the name of the eDiscovery case, the date the case was created, the status of the case, and the case description.
The Holds tab is where you create new holds relevant to your case, although creating a hold is not a mandatory step. You may choose users, groups, teams, or sites to place on hold for your case. This means that the content you place on hold will be protected from permanent deletion for the duration of the hold.
Note: Retention policies may also be in place in your organization but will only be valid for the designated retention period. Should you have a retention policy which has conflicting settings with an eDiscovery case hold, the hold will take precedence.
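The precedence rule in the note above can be expressed as a tiny decision function. This is a deliberately simplified Python model of my own (the function name and parameters are hypothetical, and real retention policies have more options than a single end date), but it captures the point that an active hold protects content regardless of the retention period.

```python
from datetime import date
from typing import Optional


def is_preserved(today: date, retention_ends: Optional[date], on_hold: bool) -> bool:
    """Return True if content is still protected from permanent deletion."""
    if on_hold:
        # An active eDiscovery case hold takes precedence over retention settings.
        return True
    # Without a hold, content is only protected while the retention period runs.
    return retention_ends is not None and today <= retention_ends


# Retention expired at the end of 2021, but the case hold keeps the content.
print(is_preserved(date(2022, 6, 1), date(2021, 12, 31), on_hold=True))
```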
Figure 3 shows the Holds tab within a Core eDiscovery case in the Microsoft 365 Compliance center.
For awareness of the licensing requirements to run a Core eDiscovery case, please refer to this article.
Advanced eDiscovery
Now onto the main event. At first glance, Advanced eDiscovery may appear to simply contain all the features described so far in this article plus some extras, all wrapped in a shiny different UI, but this is very much not the case. What you see in Advanced eDiscovery today is the result of the acquisition of Equivio by Microsoft in 2015. Equivio developed machine learning-powered compliance solutions which were brought into Advanced eDiscovery and have been enhanced ever since.
With Advanced eDiscovery, you still create an eDiscovery case, and you may still apply holds, carry out searches, and export reports and results. In addition to this, there is extended functionality. You may configure global analytics settings to enable Attorney-client privilege detection in all Advanced eDiscovery cases, and when you create an Advanced eDiscovery case you will see many more tabs to access and configure the settings and parameters for your case. These tabs are shown in Figure 4.
I’ve deliberately selected the Settings tab in Figure 4, as you will be taken directly to this tab when creating your Advanced eDiscovery case, provided you select the recommended option of “Yes, I want to add members or configure the analytics settings.”
The other option you will see when creating the case is “No, just go to the home page. I’ll use the default case settings for now.” Choosing this option will take you into the Overview tab for the case. I recommend adding members or configuring the analytics at this stage, purely as it makes sense to me to configure the settings shown in Figure 4 prior to adding data sources, setting up collections, or the other stages of the case.
Let’s examine each tab of an Advanced eDiscovery case. Due to the way that Advanced eDiscovery processes data within cases, we will examine some of these tabs out of sequence.
Settings
Although the Settings tab is the last tab on the right, it is important to examine this as your first step to ensure that the core settings, permissions, and search and analytics features are appropriately configured for your case.
Case Information shows the settings you entered when you created the case. From here, you may change the case name, case number, and description. You may also close the case, delete the case, or copy the support information for the case from here.
Access & permissions allows you to view and modify which users and role groups have access to this case.
Search & analytics allows you to set thresholds for the searches that you complete within your case. For example, the Near duplicates and email threading setting detects similarities between documents and emails. The accuracy percentage that you set here affects how long your search will take. Choosing to enable Themes from here analyzes the text within documents for writing patterns. In addition, the Ignore text option allows you to set words and phrases to be ignored within near duplicates, themes, and email threads. Finally, you can choose to enable Optical Character Recognition (OCR) for the case. This allows the searches in your case to find text within images and convert it to real text in the results. OCR also lets you set an accuracy percentage and, as with the Near duplicates and email threading option, the higher you set the accuracy percentage, the longer your searches are likely to take.
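If the idea of a similarity threshold feels abstract, the following Python sketch may help. To be clear, this is not the algorithm Advanced eDiscovery uses. It is a toy stand-in built on the standard library's difflib module, and the function name and the 0.65 default are my own illustrative choices. It simply shows how a percentage threshold decides whether two pieces of text count as near duplicates.

```python
from difflib import SequenceMatcher


def is_near_duplicate(a: str, b: str, threshold: float = 0.65) -> bool:
    """Toy check: treat two texts as near duplicates above a similarity threshold."""
    # ratio() returns a similarity score between 0.0 and 1.0.
    similarity = SequenceMatcher(None, a, b).ratio()
    return similarity >= threshold


print(is_near_duplicate("Quarterly report draft v1", "Quarterly report draft v2"))
print(is_near_duplicate("Quarterly report draft v1", "Lunch menu for Friday"))
```

Raising the threshold means fewer documents are grouped together as near duplicates, while lowering it groups more loosely related documents.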
Overview
In the Overview tab, you view dashboard-style statistics for your Advanced eDiscovery case. There are several widgets in this tab where you gain quick insights into the case, including custodian numbers and any custodians on hold (custodians are explained under Data sources), a communication summary of notices sent to custodians, a recent job status list, and a section detailing any errors.
Data sources
The Data sources tab is where you start to build out your Advanced eDiscovery case. Here you configure custodians for your case. A custodian is a person of interest to the eDiscovery case: essentially, a custodian is one of your users within Microsoft 365. When you add custodians to your case, you also need to select the custodial locations that are to be included for each custodian. There are two custodial locations: Exchange and OneDrive. However, it is important to know that Exchange mailboxes also include an imperfect copy of Teams compliance data which is copied from the Microsoft 365 substrate. This includes copies of 1:1 Teams conversations in user mailboxes and copies of channel conversations in group mailboxes.
You can select additional locations to apply to custodians in your case if there is content relevant to the case that custodians have worked on within other Microsoft 365 locations. These include additional Exchange mailboxes, SharePoint sites, Teams, and Yammer (if in Microsoft 365 mode).
The final step when choosing your custodians is to decide whether to apply a hold to all content associated with the custodians you have selected. When a custodian is placed on hold, all content in all locations associated with the custodian is preserved until the hold is removed, or the custodian is released from the case. See how to view hold statistics here.
Placing a hold on custodial data sources is selected by default so if you do not wish to place custodial content on hold, you need to explicitly deselect this option for each custodian.
Also within the Data sources tab is a Data locations section. Here you add non-custodial data sources (data not associated with the custodians) to the case, such as Exchange mailboxes and SharePoint sites. You can also place these non-custodial locations on hold. Once you have your custodian and data source selections configured for the case, you move to the Collections tab.
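One way to keep the Data sources concepts straight is to picture them as a small data model. The Python sketch below is purely my own illustration (the class names, field names, and sample users are hypothetical and do not correspond to any Microsoft API). It reflects two points from the text above: custodians come with Exchange and OneDrive locations, and the custodian hold is on by default unless you explicitly turn it off.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Custodian:
    user: str
    # The two custodial locations described above.
    locations: List[str] = field(default_factory=lambda: ["Exchange", "OneDrive"])
    # Optional extra locations, e.g. a SharePoint site or a team.
    additional_locations: List[str] = field(default_factory=list)
    # Mirrors the default: the hold applies unless explicitly deselected.
    on_hold: bool = True


@dataclass
class NonCustodialSource:
    name: str
    on_hold: bool  # no default assumed here; decide per source


def sources_on_hold(custodians, non_custodial):
    """List everything in the case that is currently preserved by a hold."""
    held = [c.user for c in custodians if c.on_hold]
    held += [s.name for s in non_custodial if s.on_hold]
    return held


custodians = [
    Custodian("alex@contoso.com"),
    Custodian("sam@contoso.com", on_hold=False),  # hold explicitly deselected
]
sites = [NonCustodialSource("Legal SharePoint site", on_hold=True)]
print(sources_on_hold(custodians, sites))
```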
Collections
Collections, previously known as Searches, brings together custodial and non-custodial data sources, along with additional locations, and lets you set conditions which are effectively the search parameters for your Advanced eDiscovery case (Figure 5).
Here you collect data for your case, which now includes the ability to preview estimates, and options for quickly collecting content for a review set (review sets will be discussed in the next section).
When setting up a collection for your case, you choose the custodial data sources to include in your collection. These are the custodians that you set up in the Data sources tab in the previous step. You add either all or some of the custodians and choose which of their locations to make available to the collection process. If you don’t select all the custodians at this time, other custodians can be added later.
Next you choose which of the non-custodial data sources (if any) to include in the collection. These will have been defined in the Data sources tab earlier under Data locations. As with the custodians, you can add some or all of the non-custodial data sources to the collection and go back to add more at a later point if required.
Next, you choose additional data source locations to search if required. These are locations not associated with the custodians set up earlier and can include Exchange mailboxes, SharePoint sites, and Exchange public folders.
Now you optionally add keywords and conditions to be applied to the custodial and non-custodial data sources identified in the earlier steps. This helps to narrow the search results returned for your collection.
Finally, you have the option to save your collection as a draft (this is one of the newer features). When you choose this option, your collection is saved for further review and amendment, and you are able to view an estimated set of collection results and a preview to validate the size and scope of your collection prior to committing your collection to a new or existing review set.
When you are ready to commit your collection to gather items and add them to a review set, you configure this as per the options illustrated in Figure 6.
There are some additional settings to consider here before committing the collection. If you wish to view any results for Teams and Yammer messages in conversational format you need to select the option to Collect contextual Teams and Yammer messages around your search results. This is one of the most valuable features within Advanced eDiscovery in my opinion, as the ability to view results in conversations makes it far simpler to review and make sense of the information presented. You may also choose to Collect cloud attachments from items found in your search results and Collect all versions of SharePoint items (doing this can significantly increase the volume of items added to your review set).
Jobs
When your new collection is committed, you monitor its progress from the Jobs tab. Depending on the settings of your collection, it could take several minutes or several hours to complete. During the process, you may keep track of all the relevant stages, including how long it takes to add data to a review set, and the status of conversation reconstruction. The status from each job line item will transition from Created, to In progress, to finally Successful.
However, should there be any issues, you may also see some differing statuses in your jobs such as Creation failed, Partially successful, or Failed. Further guidance on remediation if you encounter unsuccessful status messages within jobs may be found here.
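The job statuses described above lend themselves to a simple lookup. The Python sketch below is my own illustration: the status names come from the Jobs tab as described, while the enum, the groupings, and the next_step function are hypothetical. It shows one way to decide whether to keep waiting, investigate a problem, or move on to the review set.

```python
from enum import Enum


class JobStatus(Enum):
    CREATED = "Created"
    IN_PROGRESS = "In progress"
    SUCCESSFUL = "Successful"
    CREATION_FAILED = "Creation failed"
    PARTIALLY_SUCCESSFUL = "Partially successful"
    FAILED = "Failed"


# Jobs that have not finished yet.
PENDING = {JobStatus.CREATED, JobStatus.IN_PROGRESS}
# Jobs that finished (or never started) with something to look into.
PROBLEM = {
    JobStatus.CREATION_FAILED,
    JobStatus.PARTIALLY_SUCCESSFUL,
    JobStatus.FAILED,
}


def next_step(statuses):
    """Suggest what to do given the statuses of all jobs for a collection."""
    statuses = list(statuses)
    if any(s in PROBLEM for s in statuses):
        return "investigate"
    if not statuses or any(s in PENDING for s in statuses):
        return "wait"
    return "review"


print(next_step([JobStatus.SUCCESSFUL, JobStatus.IN_PROGRESS]))
```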
Review sets
When you commit your collection to a review set in an Advanced eDiscovery case, you see your review set shown as ready almost immediately in the Review sets tab. However, this does not mean that the review set is fully populated and ready for inspection, and you should be guided by the progress shown in the Jobs tab to determine when best to start examining your review set. I recommend waiting for the jobs relating to the review set to be complete before you start examining your review set results.
When all jobs are completed, open the review set to view the results. The results are displayed line by line as shown in Figure 7 and are sorted into columns where you see the Subject/Title, Status, Date, Sender/Author, File class, and much more. Conversational results are expanded by selecting the down arrow to the left of each item.
From the Action menu, you have the option to download or export the results, add to another review set, convert selected redacted files to PDF (area redactions are set when previewing individual items in review set results), and Retry conversation PDF view.
By selecting the Manage review set option, you may run analytics for the review set, run a summary report, save and view saved load sets of your review set, manage the tags in your case, and load data into your review set from sources other than Microsoft 365. Non-Microsoft 365 data must be uploaded to the review set using AzCopy. Further guidance on this process may be found here.
You can switch the results view from Group Teams or Yammer conversations to either Show all documents or Group family attachments (which shows emails and their associated attachments grouped together).
When you select an item in your review set results, you will see a preview to the right of your selection where you may toggle between Metadata, Source, Plain Text, and Annotate views of the item. When in annotate view, you may select text, choose color or pencil options to markup the item, choose area redaction to redact selected content prior to export or download, and you may also remove drawings from a current page or all drawings from the document.
You can download an original or PDF version of each individual review set result where compatible.
Finally, you can switch from the individual results view to search profile view which is useful as it provides a graphical view of the results which may be filtered and saved as a query.
If you have configured your Advanced eDiscovery case in a logical and effective manner, the results you will see in your review sets will be of great value in managing your cases.
Note: In my experience of working with review sets in Advanced eDiscovery cases, I have occasionally found that conversation-based results are not immediately accessible in the review sets, despite the jobs showing as complete and the review set showing as ready. In such instances, I have found that patience is the key and checking the review set again the following day will yield the expected results.
Communications
The Communications tab allows you to manage and automate legal hold notifications, escalations, and reminders all in one place. This process allows an issuing officer (or the case owner) to send a notification to custodians of data in cases to inform them of the requirement to preserve information relating to the case. The notification may be customized and will require the recipient to acknowledge the notification. Should an acknowledgement not be received from the custodian, a reminder can be triggered, followed by an escalation which would be delivered to the custodian’s line manager. The process of communication is as follows:
Issuance notice > Reissuance notice > Release notice > Reminders and Escalations
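The reminder and escalation logic described above boils down to a short decision function. This Python sketch is my own simplification: the function name and the max_reminders parameter are hypothetical stand-ins for whatever you configure in the notification workflow, and the sketch ignores timing intervals entirely.

```python
def next_notice(acknowledged: bool, reminders_sent: int, max_reminders: int) -> str:
    """Decide which follow-up (if any) a custodian should receive next."""
    if acknowledged:
        # The custodian has acknowledged the hold notice; nothing more to send.
        return "none"
    if reminders_sent < max_reminders:
        # Still within the reminder allowance, so nudge the custodian again.
        return "reminder"
    # Reminders are exhausted, so notify the custodian's line manager.
    return "escalation"


print(next_notice(acknowledged=False, reminders_sent=2, max_reminders=2))
```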
Hold
From the Hold tab, you will see any existing custodial holds which have been applied to your Advanced eDiscovery case. You can edit or delete holds from here.
You can create new non-custodial holds from this tab.
Processing
The Processing tab provides Advanced Custodian Indexing and allows you to process any errors with file identification, and the expansion of embedded documents and attachments. For more information on working with and remediating processing errors in Advanced eDiscovery cases please refer to this Microsoft article.
Exports
The Exports tab allows you to manage the export of data from your review sets. When you select the Export option from within your review set results, an export job is created and runs to completion in the Jobs tab. The export will then be visible in the Exports tab, where you may choose to download any files and folders which have been generated by the export job. For more information on exports please see here.
For awareness of the licensing requirements to run an Advanced eDiscovery case, please refer to this article.
Advanced eDiscovery is the most comprehensive version of eDiscovery, and it should be when you consider the steep price rise for Microsoft 365 customers to step up from E3 to E5 (though other add-on licensing also provides Advanced eDiscovery). The ability to preview collections before you refine and commit them is a good step when you consider the time it takes to run some collections. Review sets are extremely powerful and provide many useful options to compliance administrators who need to carry out legal and other types of investigations. I’m particularly impressed by the ability to reconstruct conversations, and the annotation and redaction features also continue to impress me whenever I test them out.
I do have one minor nit-pick in that I feel the tabs in cases are ordered illogically based on the Advanced eDiscovery workflow process. The other more major nit-pick I have (as mentioned earlier) is that in testing, I have found that the full results of a review set can take time to be available, even when the review set job shows as fully completed. This could cause some confusion. In fairness however, when testing this for this article it worked immediately.
Microsoft are continuing to refine and develop this feature though, and if you look at the Microsoft 365 roadmap for Advanced eDiscovery, there are some good things on the way, including support for Teams reactions, Advanced eDiscovery Graph APIs, and expanded support to search and export items in the SharePoint and OneDrive for Business recycle bin.
Advanced eDiscovery in Microsoft 365 gets an overall thumbs up from me, and I’ll be watching how it continues to develop and improve with interest.