5 Search Tricks to Increase Document Review Efficiency

August 28, 2017

Five tricks to help you review documents effectively alongside artificial intelligence and machine learning.

Despite advances in artificial intelligence and machine learning in e-discovery software, proper use of search functionality is still an important tool for efficient document reviews. However, searching can be tricky and, if not phrased right, may turn up less than ideal results.  Why is that, and what can be done to improve search results?  Here are five important search concepts and tricks to help return better targeted search results.

 

Noise words

When building a search index, e-discovery software uses “noise words” to improve search performance. Noise words are words that are so common that they are deemed unimportant for the purposes of searching (for example, words like and, if and it). Most e-discovery software skips noise words when it indexes documents.

 

Because e-discovery software ignores noise words, searching for a phrase that contains one can be a problem. For example, if you search for the phrase “IT department”, your results may include any document that has two consecutive words with the second word being department.
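To make the problem concrete, here is a minimal Python sketch of a phrase search in which any noise word in the query matches any single word in the document. The noise word list, the `tokenize` helper and the `phrase_matches` function are all hypothetical and written for illustration only; real review platforms build their indexes differently, but the effect on results is similar.

```python
import re

# Hypothetical noise word list; real platforms ship their own defaults.
NOISE_WORDS = {"and", "if", "it", "the", "of"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_matches(phrase, document, noise_words=NOISE_WORDS):
    """Return True if the phrase occurs in the document, treating every
    noise word in the phrase as a placeholder for any single word."""
    query = tokenize(phrase)
    words = tokenize(document)
    for start in range(len(words) - len(query) + 1):
        window = words[start:start + len(query)]
        if all(q in noise_words or q == w for q, w in zip(query, window)):
            return True
    return False

docs = [
    "Please contact the IT department about the outage.",
    "The accounting department approved the invoice.",
]

# Both documents are returned, even though only the first mentions IT.
hits = [phrase_matches("IT department", d) for d in docs]
print(hits)
```

Passing an empty noise word set (`noise_words=set()`) makes the second document drop out, which is the behavior most reviewers expect from the search.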

 

There are a few things you can do to avoid this issue. You can use proximity searches and other key terms to filter out junk search results. For example, if you notice that IT department is always followed by the word computers, you can use proximity searching to narrow down your search results (department w/2 computers). You can also search for other key words that are closely associated with your term to identify documents.
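The idea behind a proximity search such as department w/2 computers can be sketched in a few lines of Python. The `within` function below is a hypothetical helper, not any platform's implementation, and it assumes that w/N means the two terms appear within N words of each other in either order.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def within(term_a, term_b, n, document):
    """Return True if term_a and term_b occur within n words of each
    other, in either order (an assumed reading of the w/N operator)."""
    words = tokenize(document)
    pos_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    pos_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    return any(abs(a - b) <= n for a in pos_a for b in pos_b)

relevant = "The IT department computers were replaced last year."
junk = "The accounting department approved the purchase of new office computers."

print(within("department", "computers", 2, relevant))  # terms are adjacent
print(within("department", "computers", 2, junk))      # terms are far apart
```

Only the first document satisfies the proximity condition, so the junk hit caused by the ignored noise word is filtered out.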

 

If proximity searching and expanded keyword searches do not help, another way to address noise word issues is to delete the noise word from your software’s default noise word list and rebuild the index. This is a good option if you anticipate using the noise word frequently in your searches or if proximity searching does not return accurate results. (Note: you must re-index your documents after adjusting the noise word list, and your searches might take longer to run once those words are indexed.)
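As a rough illustration of why the re-index is required, the Python sketch below builds a toy inverted index twice: once with a default noise word list and once after removing "it" from that list. The list, the `build_index` function and the sample documents are hypothetical, and the sketch assumes a case-insensitive index, so "IT" and "it" are treated as the same word.

```python
import re
from collections import defaultdict

# Hypothetical default list; check your platform for its actual defaults.
DEFAULT_NOISE_WORDS = {"and", "if", "it", "the"}

def build_index(documents, noise_words):
    """Map each word that is not a noise word to the ids of the
    documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            if word not in noise_words:
                index[word].add(doc_id)
    return index

documents = {
    1: "Please contact the IT department.",
    2: "The accounting department approved it.",
}

# With the default list, "it" never makes it into the index.
index = build_index(documents, DEFAULT_NOISE_WORDS)
print("it" in index)

# After removing "it" from the list, the index must be rebuilt
# before the word becomes searchable.
custom_noise_words = DEFAULT_NOISE_WORDS - {"it"}
reindexed = build_index(documents, custom_noise_words)
print(sorted(reindexed["it"]))
```

The rebuilt index now holds an entry for a very common word, which is one reason searches can slow down after noise words are removed.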

 

Here is a list of default noise words used by a few popular document review platforms:

 

Image of e-discovery noise words

 

(For further information about noise words, you can check out those used by Relativity and dtSearch).

 

Understanding syntax and operators

Most e-discovery software offers several types of search index. Common indexes are keyword, dtSearch, and Lucene. It is important to know which index you are using so that you can optimize your search results. A big mistake people make is using search syntax that the selected index does not recognize. Take a look at the following chart to help you build your search:

 

Image of ediscovery search syntax and operators

 

Order of operations


If you typed in dog AND cat OR bird, in what order do you think your search will be performed? Many assume that because dog AND cat appear first, the software searches for those words first.  That is not the case.

 

It is important to understand order of operations to obtain accurate search results. Criteria inside logic groups or parentheses are evaluated first, before the other search conditions.

 

In a long string of search terms without any indication of order, OR conditions are always performed first. So, in the case of the search dog AND cat OR bird, the search will run cat OR bird first and then apply AND dog (i.e. dog AND (cat OR bird)). Using parentheses and logic groups in complex searches lets you specify the order in which you want your search to be run.
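The difference between the two readings is easy to see with Python sets. The four-document corpus and the `having` helper below are hypothetical and exist only to show how the grouping changes the result.

```python
# Hypothetical mini-corpus: document id -> words the document contains.
docs = {
    1: {"dog", "cat"},
    2: {"dog", "bird"},
    3: {"bird"},
    4: {"dog"},
}

def having(term):
    """Return the ids of the documents that contain the term."""
    return {doc_id for doc_id, words in docs.items() if term in words}

# OR evaluated first: dog AND (cat OR bird)
or_first = having("dog") & (having("cat") | having("bird"))

# AND evaluated first: (dog AND cat) OR bird
and_first = (having("dog") & having("cat")) | having("bird")

print(sorted(or_first))   # [1, 2]
print(sorted(and_first))  # [1, 2, 3]
```

Document 3 mentions only a bird, so it is returned only when the AND is grouped first. Writing the parentheses out removes any doubt about which result you will get.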

 

Using RegEx to search

RegEx is a search type that uses patterns instead of terms or phrases to search documents. This is especially helpful when searching for things like social security numbers, phone numbers, Bates numbers, zip codes, URLs, email addresses, and dates. Here is a chart of common use cases and the appropriate RegEx syntax for each.

 

Image of regex search terms
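For readers who want to try pattern searching outside a review platform, here is a small sketch using Python's `re` module. These patterns are illustrative assumptions and are not taken from the chart; the Bates format in particular (two to five capital letters followed by six to eight digits) is hypothetical, and the RegEx flavor in your review platform may use different syntax.

```python
import re

# Illustrative patterns only; adjust them to the formats in your data.
PATTERNS = {
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "phone": r"\b\d{3}[-.]\d{3}[-.]\d{4}\b",
    "zip": r"\b\d{5}(?:-\d{4})?\b",
    "bates": r"\b[A-Z]{2,5}\d{6,8}\b",  # hypothetical Bates format
}

def find_all(text):
    """Return every match for each named pattern in the text."""
    return {name: re.findall(pattern, text) for name, pattern in PATTERNS.items()}

sample = "SSN 123-45-6789, call 555-867-5309, ZIP 60601-1234, see ABC0001234."
results = find_all(sample)
for name, matches in results.items():
    print(name, matches)
```

Each pattern picks out exactly one item from the sample sentence. The `\b` word boundaries keep a pattern from matching inside a longer run of digits or letters.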

 

Understanding Fields and Searches

When creating custom fields within your document review platform, it is easy to overlook that the type of field you add affects how searches are performed. This is an important consideration during case setup because, in most review platforms, a field type cannot be changed once the field is created. You are stuck with it unless you take several time-consuming steps to transfer the data into a newly created field. Understanding how you will want to search within each field will help you determine which type of field to use. Use this chart to help you decide:

Image of ediscovery software field types and operators

 

Chad Main

(c) Percipient, LLC – not a law firm and
not licensed to practice law in any jurisdiction.