
E-Discovery & Legal Ethics – What to Do About It Part 6: Data Searches

This is the sixth in a series examining electronic discovery concepts addressed in recent legal ethics opinions, such as State Bar of California Opinion CAL 2015-193, relating to technology and an attorney’s duty of competence.


In short, the California opinion instructs that attorneys handling matters with e-discovery components must do one of the following: 1) acquire adequate learning and skill before handling cases involving e-discovery; 2) engage co-counsel or technical consultants familiar with e-discovery; or 3) decline the representation.


The opinion states that lawyers with cases involving e-discovery must understand certain e-discovery tasks or bring in co-counsel or e-discovery consultants to assist. The tasks, which are discussed in this series of articles, are:



This article addresses the sixth task, understanding proper data search techniques.



There Are Multiple Search Types

A core feature of e-discovery software is the ability to search for documents. Keyword search is the most obvious type of search functionality, but there are other types of search techniques:

  • Keyword
  • Boolean
  • Fuzzy and Phonic
  • Stemming
  • Synonym
  • Concept
  • Computer Assisted Review (CAR) / Technology Assisted Review / Predictive Coding
  • Non-Keyword Based Search Types


Each data search technique is briefly discussed below.



Keyword

We are all familiar with keyword searches; we use them every time we use a search engine like Google. Using keywords input by the user, search engines return a list of webpages that contain or are related to the keywords. Keyword searching in e-discovery works the same way: search terms are entered into a search function, which returns a list of the searched documents that contain a search term.
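A keyword search can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name `keyword_search`, the document ids, and the sample texts are hypothetical, and commercial e-discovery platforms rely on prebuilt indexes rather than scanning every document.

```python
import re

def keyword_search(documents, terms):
    """Return the ids of documents that contain at least one search term."""
    wanted = {term.lower() for term in terms}
    hits = []
    for doc_id, text in documents.items():
        # Split the text into lowercase words so matching ignores case.
        words = set(re.findall(r"[a-z0-9']+", text.lower()))
        if wanted & words:
            hits.append(doc_id)
    return hits

# Hypothetical sample documents.
docs = {
    "DOC-001": "The merger agreement was signed on Friday.",
    "DOC-002": "Lunch order for the team.",
    "DOC-003": "Please review the draft agreement before the call.",
}

print(keyword_search(docs, ["agreement"]))  # ['DOC-001', 'DOC-003']
```

Note that this sketch matches whole words only, so a search for "agree" would not return documents containing "agreement". That limitation is one reason the other search types exist.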



Boolean

Boolean searches are an advanced method of keyword searching. They permit word and phrase combinations using operators such as AND, OR and NOT (known as Boolean operators) to limit or broaden searches.


For instance, searching for “basketball AND court” will return documents containing both terms. Using “court NOT tennis” will return all documents containing the word “court,” except those that also contain the term “tennis.” A search using “apples OR oranges” will return documents containing either term.
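The three operators map naturally onto set operations. The sketch below is a hedged illustration, not how any particular review platform works: the helper `has`, the sample documents, and their ids are all hypothetical.

```python
import re

def tokens(text):
    """Lowercase word set for one document."""
    return set(re.findall(r"[a-z]+", text.lower()))

# Hypothetical sample documents.
docs = {
    "A": "The basketball court was resurfaced.",
    "B": "The tennis court is closed.",
    "C": "The court ruled on the motion.",
    "D": "We bought apples and oranges.",
}
index = {doc_id: tokens(text) for doc_id, text in docs.items()}

def has(term):
    """Ids of all documents containing the term."""
    return {doc_id for doc_id, words in index.items() if term in words}

both = has("basketball") & has("court")    # basketball AND court
not_tennis = has("court") - has("tennis")  # court NOT tennis
either = has("apples") | has("oranges")    # apples OR oranges

print(sorted(both))        # ['A']
print(sorted(not_tennis))  # ['A', 'C']
print(sorted(either))      # ['D']
```

AND is an intersection, NOT is a set difference, and OR is a union, which is why AND and NOT narrow a result set while OR broadens it.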


There are other Boolean search operators and the University of California at Berkeley has a good Boolean search cheatsheet.


Most lawyers learn Boolean search techniques in law school when they begin to use legal research tools such as Westlaw or Lexis/Nexis.


Fuzzy and Phonic

Fuzzy search is a technique that finds words based on spelling similarities. It is a helpful tool for finding relevant documents that contain misspellings. For instance, using a fuzzy search, e-discovery software would return documents containing either “apple” or “appple”. Similarly, phonic searches return words that sound alike. For instance, a search for “John” would also return documents containing “Jon”.
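Both ideas can be approximated with the Python standard library. The sketch below assumes a similarity ratio from `difflib.SequenceMatcher` for fuzzy matching and the classic Soundex code for sound-alike matching; these are illustrative choices, and actual e-discovery tools may use different algorithms. The 0.85 threshold, the function names, and the sample documents are hypothetical.

```python
import re
from difflib import SequenceMatcher

# Soundex digit for each consonant; vowels, h, w and y have no code.
SOUNDEX_CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        SOUNDEX_CODES[ch] = digit

def soundex(word):
    """Four-character Soundex code, e.g. a letter followed by three digits."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = SOUNDEX_CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]

def is_fuzzy_match(term, word, threshold=0.85):
    """True when the two spellings are nearly identical."""
    return SequenceMatcher(None, term.lower(), word.lower()).ratio() >= threshold

def sounds_like(term, word):
    """True when both words share a Soundex code."""
    return soundex(term) == soundex(word)

def search(term, documents, matcher):
    """Ids of documents with at least one word accepted by the matcher."""
    hits = []
    for doc_id, text in documents.items():
        words = re.findall(r"[A-Za-z]+", text)
        if any(matcher(term, word) for word in words):
            hits.append(doc_id)
    return hits

# Hypothetical sample documents.
docs = {
    "DOC-001": "She ate an appple at her desk.",
    "DOC-002": "Jon forwarded the invoice.",
    "DOC-003": "The orange shipment arrived.",
}

print(search("apple", docs, is_fuzzy_match))  # ['DOC-001']
print(search("John", docs, sounds_like))      # ['DOC-002']
```

Lowering the threshold catches more misspellings but also returns more false hits, so the setting is a trade-off between recall and review burden.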




Stemming

As aptly explained by the EDRM search glossary, a “stemming” search “returns matches for all variations of the root word of the initial query word. For example, if the query word was sing, then if a search used stemming the search results would match singing, sang, sung, song, and songs as well as sing.”
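The glossary example can be mimicked with a toy stemmer. This is a minimal sketch under stated assumptions: the suffix list and the `IRREGULAR` lookup table are hypothetical and hand-written for this one example, whereas real tools use much larger rule sets or dictionaries. Plain suffix stripping handles “singing” and “songs”; forms such as “sang” and “sung” need the lookup table.

```python
import re

SUFFIXES = ("ing", "ed", "es", "s")

# Hypothetical lookup table for forms that suffix stripping cannot reduce.
IRREGULAR = {"sang": "sing", "sung": "sing", "song": "sing"}

def stem(word):
    """Reduce a word to a root by stripping one suffix, then checking the table."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three letters so short words like "sing" are untouched.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    return IRREGULAR.get(word, word)

def stemming_search(term, documents):
    """Ids of documents containing any word with the same root as the term."""
    root = stem(term)
    hits = []
    for doc_id, text in documents.items():
        words = re.findall(r"[a-z]+", text.lower())
        if any(stem(word) == root for word in words):
            hits.append(doc_id)
    return hits

# Hypothetical sample documents.
docs = {
    "DOC-001": "The choir sang at the reception.",
    "DOC-002": "He kept singing during the deposition.",
    "DOC-003": "Signing the contract took an hour.",
}

print(stemming_search("sing", docs))  # ['DOC-001', 'DOC-002']
```

Which word forms a given product groups together depends on its own stemming rules, so it is worth testing a few variations before relying on a stemmed search.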



Synonym and Concept

A synonym search, as its name implies, uses a thesaurus and returns documents containing either the search term or a synonym. A concept search likewise expands the search beyond the chosen keywords, returning documents relevant to the concept underlying the search terms. For instance, searching for “football” might also return documents about “soccer.”
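A synonym search amounts to expanding the query before matching. The sketch below assumes a tiny hand-built thesaurus; `THESAURUS`, its entries, and the sample documents are hypothetical. It does not attempt concept search, which typically relies on statistical models of word co-occurrence rather than a fixed word list.

```python
import re

# Hypothetical thesaurus; real tools ship or import much larger ones.
THESAURUS = {
    "contract": {"agreement", "deal"},
    "car": {"automobile", "vehicle"},
}

def expand(term):
    """The search term plus any synonyms listed for it."""
    term = term.lower()
    return {term} | THESAURUS.get(term, set())

def synonym_search(term, documents):
    """Ids of documents containing the term or one of its synonyms."""
    wanted = expand(term)
    hits = []
    for doc_id, text in documents.items():
        words = set(re.findall(r"[a-z]+", text.lower()))
        if wanted & words:
            hits.append(doc_id)
    return hits

# Hypothetical sample documents.
docs = {
    "DOC-001": "The agreement was countersigned yesterday.",
    "DOC-002": "Attached is the signed contract.",
    "DOC-003": "Parking validation is available.",
}

print(synonym_search("contract", docs))  # ['DOC-001', 'DOC-002']
```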


CAR / TAR / Predictive Coding

TAR, technology assisted review (sometimes referred to as Computer Assisted Review (CAR) or predictive coding), utilizes computer algorithms to identify electronic documents that are similar to other documents.


According to the Grossman-Cormack Glossary of Technology Assisted Review, TAR is “[a] process for prioritizing or coding a collection of electronic documents using a computerized system that harnesses human judgments of one or more Subject Matter Expert(s) on a smaller set of documents and then extrapolates those judgments to the remaining Document Population. . . .”


Using TAR in legal document reviews is like using the “Because You Watched” feature in Netflix. In Netflix, if you enter star ratings for movies you watched based on whether you like them or not, Netflix will learn your tastes and suggest movies for you to watch. For instance, if you consistently give high ratings to comedy movies, it will suggest other comedies for you to watch.


Similarly, when predictive coding is used in a document review, reviewers mark documents as relevant or not relevant. The e-discovery software then analyzes the contents of the marked documents and applies what it learns to the rest of the documents in the database. The software determines how likely each document is to be relevant and marks it accordingly (or gives each document a relevance score), so that documents more likely to be relevant may be reviewed first and those less likely to be relevant may be reviewed later (or not at all).
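The train-then-score workflow can be illustrated with a toy classifier. The sketch below assumes a multinomial naive Bayes model with add-one smoothing, chosen only because it fits in a few lines; it is not the algorithm of any particular TAR product, and the seed set, document ids, and function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

def train(labeled):
    """Learn word counts from (text, is_relevant) pairs coded by a reviewer.

    Assumes the seed set has at least one relevant and one non-relevant document.
    """
    counts = {True: Counter(), False: Counter()}
    n_docs = {True: 0, False: 0}
    for text, label in labeled:
        counts[label].update(tokenize(text))
        n_docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, n_docs, vocab

def relevance_score(model, text):
    """Estimated probability (0 to 1) that the document is relevant."""
    counts, n_docs, vocab = model
    total_docs = n_docs[True] + n_docs[False]
    log_prob = {}
    for label in (True, False):
        total_words = sum(counts[label].values())
        lp = math.log(n_docs[label] / total_docs)  # class prior
        for word in tokenize(text):
            if word in vocab:  # ignore words never seen in the seed set
                # Add-one smoothing avoids zero probabilities.
                lp += math.log((counts[label][word] + 1) / (total_words + len(vocab)))
        log_prob[label] = lp
    return 1 / (1 + math.exp(log_prob[False] - log_prob[True]))

# Hypothetical seed set coded by a subject matter expert.
seed = [
    ("pricing agreement with competitor discussed at dinner", True),
    ("competitor pricing call scheduled", True),
    ("office lunch menu for friday", False),
    ("parking garage closed friday", False),
]
model = train(seed)

# Hypothetical unreviewed documents, ranked most likely relevant first.
unreviewed = {
    "U1": "follow up on competitor pricing",
    "U2": "friday lunch in the parking garage",
}
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, unreviewed[d]), reverse=True)
print(ranked)  # ['U1', 'U2']
```

In practice the seed set is far larger, and the reviewers' coding of newly ranked documents is usually fed back into the model over several rounds.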


Non-Keyword Based Search Types


There are also search techniques that are not based on keywords. For instance, e-discovery software searches may return documents based on date ranges, document types or email sending or receiving domains.
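Searches of this kind filter on document metadata rather than on text. A minimal sketch, assuming each document carries a type, a date, and a sender address; the `Document` fields, the function `metadata_search`, and the sample records are hypothetical and do not reflect any specific platform's schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Document:
    doc_id: str
    doc_type: str      # e.g. "email", "spreadsheet"
    sent: date
    sender: str = ""   # empty for non-email documents

def metadata_search(documents, start=None, end=None, doc_type=None, sender_domain=None):
    """Ids of documents passing every filter that was supplied."""
    hits = []
    for doc in documents:
        if start and doc.sent < start:
            continue
        if end and doc.sent > end:
            continue
        if doc_type and doc.doc_type != doc_type:
            continue
        if sender_domain:
            # Compare the part of the address after the last "@".
            domain = doc.sender.rsplit("@", 1)[-1].lower()
            if domain != sender_domain.lower():
                continue
        hits.append(doc.doc_id)
    return hits

# Hypothetical sample records.
collection = [
    Document("DOC-001", "email", date(2015, 3, 2), "ceo@acme.example"),
    Document("DOC-002", "email", date(2015, 9, 14), "sales@vendor.example"),
    Document("DOC-003", "spreadsheet", date(2015, 3, 20)),
]

print(metadata_search(collection, start=date(2015, 3, 1), end=date(2015, 3, 31)))
# ['DOC-001', 'DOC-003']
print(metadata_search(collection, doc_type="email", sender_domain="acme.example"))
# ['DOC-001']
```

Metadata filters like these are often combined with keyword or Boolean searches to narrow a collection before review.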


Becoming Familiar With Search Types Is Only the First Step

Being familiar with different search types is only part of using data search techniques. To actually perform data searches, a person must also understand how to use e-discovery software to do the searches. This is especially true if computer assisted review is used. Obviously, not every lawyer is familiar with e-discovery software (or even comfortable with technology at all), but that does not prevent counsel from providing competent representation in a legal matter with e-discovery. As advised in the California e-discovery ethics opinion, an attorney may associate with qualified co-counsel or an e-discovery vendor or consultant who may help with the data searches.


Helpful Resources

To learn more about data search techniques, both EDRM and the Sedona Conference offer helpful resources on e-discovery searches.


Posted on November 2, 2015 in E-Discovery, Electronically Stored Information (ESI), Ethics and Rules of Professional Conduct, Legal Technology, Predictive Coding, Search

About the Author

Chad Main is an attorney and the founder of Percipient. Prior to founding Percipient, Chad worked as a litigator in Los Angeles and Chicago. He is a member of the Seventh Circuit Electronic Discovery Pilot Program Committee and may be reached at cmain@percipient.co.