Microsoft 365 eDiscovery (formerly, Office 365 eDiscovery) includes several useful features for eDiscovery and compliance matters such as email archiving, Legal Holds, and eDiscovery Search capabilities.
We run a fair number of searches for ESI (electronically stored information) in Microsoft 365 eDiscovery and can attest that it has helpful search features. However, even though Microsoft has built out the eDiscovery Compliance Center and continues to improve its functionality, Microsoft 365 eDiscovery search has limitations that users must be aware of, and it often must be augmented with other eDiscovery software and workflow tweaks.
Below are four Microsoft 365 eDiscovery Search limitations to be aware of and some workarounds we use.
Four Key Limitations of Microsoft 365 eDiscovery Search Capabilities:
Not all files are indexed for searching
If you have ever searched emails in Microsoft 365 Core eDiscovery you probably noticed results often contain many unindexed items as shown in the example below. (See further below for feature differences between Microsoft 365 Core eDiscovery and Advanced eDiscovery).
It is important to understand what these unindexed items are and how to handle them to make sure you are not ignoring any potentially important items in your searches and data exports.
Documents fall into the unindexed category if they meet one of the following conditions:
- Unrecognized or unsupported file type: Microsoft 365 recognizes and indexes only 58 file types, and non-Microsoft file types are not recognized. Microsoft's documentation provides a full list of indexed file types.
- Image files
- Email messages that have an attached file without a valid handler (an app capable of opening the file), such as image files; this is the most common cause of a file being partially indexed
- Too many files attached to an email message
- A file attached to an email message is too large
- Oversized spreadsheets
- Password-protected files
- Indexing errors
When exporting data from Microsoft 365, it is often advisable to include unindexed items in your export so they can be processed and searched in a more advanced eDiscovery tool. These unindexed items are often quite large, can amount to several gigabytes of data, and admittedly may not contain relevant data. The struggle is that you won’t know whether there is any relevant data in the unindexed items until you export them for review. (Proportionality and issues of reasonable accessibility of ESI often impact the decision to export unindexed data, but these considerations are beyond the scope of this article.)
Once loaded into more advanced eDiscovery software, these items will be OCRed (made searchable through optical character recognition) so you can search them and cull anything that does not hit on your search terms. This will leave you with the potentially responsive ESI and documents that Microsoft 365 was unable to index.
NOTE: Certain Microsoft licenses (E5) have Microsoft 365 Advanced eDiscovery which has more features than Core eDiscovery. However, Advanced eDiscovery also has limitations that still may necessitate the use of more robust eDiscovery tools (e.g. only native production formats, inability to process third-party data, and keyword search limitations described below). Below are the main differences between Core and Advanced eDiscovery:
Keyword searches have limitations
Most eDiscovery projects, compliance matters, and internal investigations are heavily dependent on keywords and the good ol’ eDiscovery search query to cull large datasets. Matters can have a few search terms or have lists of hundreds of terms – trust me, the list of terms can be extensive.
One major limitation of Microsoft 365 eDiscovery search queries is that you can search only 50 keywords (or keyword queries) in a single search. This can be challenging and time-consuming if you have more than 50 keywords or search queries, which is often the case.
Additionally, when searching several keywords in a single search, the search results do not show which keywords triggered the inclusion of a specific item. Microsoft 365 highlights keywords found in the preview of a document, which is helpful; however, if the term is found in an indexed metadata field, you need to dig deeper to figure out why an item was included in the search results. Often, the only way to figure out which keyword was responsible for a hit is to set up multiple single-keyword searches.
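To illustrate the two workarounds above, here is a minimal Python sketch that splits a long term list into OR-joined queries of at most 50 terms, and also produces one query per term for hit attribution. The function names and the `MAX_TERMS_PER_SEARCH` constant are hypothetical helpers for illustration, not part of any Microsoft tool, and the limit of 50 is simply the figure cited in this article; confirm the current limit for your environment before relying on it.

```python
from typing import Iterable, List

# Assumption: the per-search limit cited in this article; verify for your tenant.
MAX_TERMS_PER_SEARCH = 50


def quote_term(term: str) -> str:
    """Wrap multi-word phrases in double quotes so they are searched as phrases."""
    term = term.strip()
    if " " in term and not (term.startswith('"') and term.endswith('"')):
        return f'"{term}"'
    return term


def batch_queries(terms: Iterable[str],
                  batch_size: int = MAX_TERMS_PER_SEARCH) -> List[str]:
    """Split a term list into OR-joined queries of at most batch_size terms."""
    cleaned = [quote_term(t) for t in terms if t.strip()]
    return [
        " OR ".join(cleaned[i:i + batch_size])
        for i in range(0, len(cleaned), batch_size)
    ]


def single_term_queries(terms: Iterable[str]) -> List[str]:
    """One query per term, so each hit count can be tied to a single keyword."""
    return [quote_term(t) for t in terms if t.strip()]
```

A list of 120 terms would yield three batched queries (50, 50, and 20 terms), each of which could be pasted into a separate search. The single-term variant trades more searches for clear attribution of hits to keywords.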
Another limitation of eDiscovery Search in Microsoft 365 involves wildcard searches. (Wildcard searches use a character like an asterisk “*” to take the place of one or more characters so you can expand a search. For example, searching for “Dav*” would return both “David” and “Dave” and any other words starting with “Dav”).
In Microsoft 365 eDiscovery, you may use only prefix wildcard searches, and the prefix must include at least three characters; for example, cat* or set*. Suffix searches (*cat), infix searches (c*t), substring searches (*cat*), and prefixes shorter than three characters (ca*) are not supported. If your search terms include one of these unsupported wildcard formats, you are out of luck.
Depending on your search requirements, these limitations may complicate things. In some instances, it may make more sense to do a full mailbox collection and process the data in a tool with more advanced search capabilities. For this reason, before using Microsoft 365 eDiscovery search features, we analyze each matter individually to determine the most time- and cost-effective workflow and the eDiscovery platform best able to meet the needs of the project.
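Before submitting a long term list, it can help to flag the terms that use an unsupported wildcard format. The following is a minimal Python sketch of such a check, based only on the rules described above; `classify_wildcard` is a hypothetical helper written for this illustration, not a Microsoft API.

```python
def classify_wildcard(term: str, min_prefix: int = 3) -> str:
    """Classify a search term by the wildcard rules described in this article."""
    term = term.strip()
    if "*" not in term:
        return "no wildcard"
    if term.startswith("*") and term.endswith("*") and len(term) > 1:
        return "substring (unsupported)"
    if term.startswith("*"):
        return "suffix (unsupported)"
    if not term.endswith("*") or term.count("*") > 1:
        return "infix (unsupported)"
    # At this point the term has exactly one asterisk, at the very end.
    prefix = term[:-1]
    if len(prefix) < min_prefix:
        return "prefix too short (unsupported)"
    return "prefix (supported)"


if __name__ == "__main__":
    for example in ["cat*", "*cat", "c*t", "*cat*", "ca*", "cat"]:
        print(f"{example:8} {classify_wildcard(example)}")
```

Terms flagged as unsupported are candidates for rewriting (for example, listing the specific word variants) or for searching in another tool after export.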
Double collection due to lack of ability to exclude prior exported data/searches
Another Microsoft 365 eDiscovery limitation is the inability to use saved searches as a condition in a new search. Such a function would be quite helpful when multiple data collections and exports are required, which is a common occurrence: as legal and compliance matters proceed, additional information is learned and additional custodians are identified, and additional data collections must be run for them.
For example, if you previously exported data that hit on the term ‘red’ and then later wanted to export data that hit on the term ‘blue’, it would be ideal and cost-effective to only export items that hit on the term ‘blue’ that were not previously exported. If Microsoft 365 provided a way to create static saved searches or if exported documents were tracked in a Microsoft 365 field, you could add the saved search or export field as an exclusionary search condition to ensure you are not double collecting data.
One way to potentially limit recollecting data in Microsoft 365 from your previous exports is to use the “AND NOT” Boolean connector to exclude previously exported data. Using the example above, you would export hit results for ‘blue AND NOT red’ so you don’t recollect documents that hit on both blue and red that were previously exported. Note that this only works if you have the same date, location, and other base criteria as your previous exports.
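As a rough illustration of the “AND NOT” approach, the Python sketch below builds an exclusionary query string from a list of new terms and a list of previously exported terms. The helper names are hypothetical, and the sketch assumes the date range, locations, and other base criteria are identical to those of the earlier exports, as noted above.

```python
from typing import Sequence


def or_group(terms: Sequence[str]) -> str:
    """Join terms with OR, adding parentheses when there is more than one."""
    if not terms:
        raise ValueError("at least one term is required")
    joined = " OR ".join(terms)
    return f"({joined})" if len(terms) > 1 else terms[0]


def exclude_previous(new_terms: Sequence[str],
                     exported_terms: Sequence[str]) -> str:
    """Build a query for new_terms that excludes hits on already exported terms."""
    new_part = or_group(new_terms)
    if not exported_terms:
        return new_part
    return f"{new_part} AND NOT {or_group(exported_terms)}"


if __name__ == "__main__":
    print(exclude_previous(["blue"], ["red"]))
    print(exclude_previous(["blue", "teal"], ["red", "green"]))
```

For the example above, `exclude_previous(["blue"], ["red"])` produces `blue AND NOT red`. Always test the generated query on a small scope first, since operator handling should be confirmed in your own environment.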
This way of excluding previously exported data can get complex if you have several search terms or different base search criteria for each term. Depending on the complexity of your searches, it is likely best to export your data without using ‘AND NOT’ Boolean operators and let deduplication in more advanced eDiscovery tools cull the data previously exported from Microsoft 365.
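To give a sense of what deduplication across exports involves, here is a simplified Python sketch that compares SHA-256 hashes of files in two export folders and returns only the files not seen before. This is an illustration under stated assumptions, not how any particular eDiscovery platform works: it assumes each export has been extracted to individual files on disk, and the function names and folder layout are hypothetical. Commercial tools typically deduplicate email using hashes of selected message fields rather than whole-file hashes.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def new_items(previous_dir: Path, latest_dir: Path) -> List[Path]:
    """List files in latest_dir whose content was not in previous_dir.

    Duplicates within latest_dir are also reported only once.
    """
    seen = {file_digest(p) for p in previous_dir.rglob("*") if p.is_file()}
    fresh: List[Path] = []
    for path in sorted(latest_dir.rglob("*")):
        if not path.is_file():
            continue
        digest = file_digest(path)
        if digest not in seen:
            seen.add(digest)
            fresh.append(path)
    return fresh
```

Calling `new_items(Path("export_red"), Path("export_blue"))` (hypothetical folder names) would return only the files from the second export that are byte-for-byte different from everything in the first.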
Searching takes time to execute (so plan accordingly)
One often overlooked and valuable feature of a search engine is quick search execution. When you analyze search term hits or need to run complex searches, running multiple searches and reviewing results quickly becomes increasingly important. Due to batch processing, searches in Microsoft 365 eDiscovery can be slow.
Compared to other eDiscovery tools, Microsoft 365 eDiscovery Search can take much longer and can complicate matters because users may preview only the first 1,000 hit results. Additionally, if you are searching multiple locations or mailboxes in Microsoft 365, you need to search for and select each location or mailbox individually; this process can be very tedious. If you are facing time constraints, this might also warrant a more expansive export from Microsoft 365 eDiscovery so that searches can be run in other software.
We run many searches in the Microsoft 365 eDiscovery environment for our customers, especially for compliance departments and in connection with large-scale document reviews. If you want to learn more about how we pair Microsoft 365 eDiscovery with other eDiscovery tools to improve ESI collection and search efficiency, let us know.