
Vertical, Horizontal and Global Deduplication Explained

“Deduplication” or “Deduping” is the process of comparing computer files in a data-set and removing or segregating duplicates. Two significant benefits of using e-discovery software are deduplication capabilities and identification of “near duplicates.” (Near duplicate documents are those that are closely related, such as contract drafts with textual differences, or a document in different formats.) Deduping a document collection reduces the number of documents to review. E-discovery software generally dedupes document collections by comparing the hash values of the files: files with identical content produce identical hash values and are treated as duplicates.
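As a rough illustration of hash-based deduping, the Python sketch below keeps the first file seen for each hash value and sets the rest aside. It is only a sketch under stated assumptions: SHA-256 is chosen here for illustration (the hash algorithm varies by tool), and the file names, contents, and the dedupe function are hypothetical.

```python
import hashlib


def file_hash(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's bytes (algorithm is an assumption)."""
    return hashlib.sha256(content).hexdigest()


def dedupe(files: dict) -> tuple:
    """Keep the first file seen for each hash; list later copies as duplicates."""
    seen = set()
    unique = {}
    duplicates = []
    for name, content in files.items():
        digest = file_hash(content)
        if digest in seen:
            duplicates.append(name)   # segregate the duplicate
        else:
            seen.add(digest)
            unique[name] = content    # first copy goes to review
    return unique, duplicates


# Hypothetical three-file collection: two identical files and one near duplicate.
files = {
    "contract_final.txt": b"The parties agree to the terms below.",
    "copy_of_contract_final.txt": b"The parties agree to the terms below.",
    "contract_draft.txt": b"The parties agree to the terms set out below.",
}

unique, duplicates = dedupe(files)
print(sorted(unique))
print(duplicates)
```

Note that the draft survives deduping: its text differs slightly, so its hash value differs too. That is why near duplicates have to be identified by a separate comparison rather than by hash matching.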

Vertical deduplication occurs when duplicates are removed only within the documents collected from each individual data custodian, so a copy of the same file held by a different custodian is kept. This is also sometimes called custodian deduplication.

Horizontal, or global, deduplication occurs when the whole data-set is analyzed across all custodians and duplicates are removed, regardless of which custodian held them.
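The difference between the two scopes can be sketched in a few lines of Python. This is an illustrative example, not how any particular e-discovery product works: the custodian names, file names, and both functions are hypothetical, and SHA-256 is assumed as the hash.

```python
import hashlib
from collections import defaultdict


def digest(content: bytes) -> str:
    """Hash a file's bytes (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(content).hexdigest()


def vertical_dedupe(docs):
    """Keep one copy of each hash per custodian."""
    seen = defaultdict(set)  # custodian -> hashes already kept for that custodian
    kept = []
    for custodian, name, content in docs:
        h = digest(content)
        if h not in seen[custodian]:
            seen[custodian].add(h)
            kept.append((custodian, name))
    return kept


def global_dedupe(docs):
    """Keep one copy of each hash across the whole data-set."""
    seen = set()
    kept = []
    for custodian, name, content in docs:
        h = digest(content)
        if h not in seen:
            seen.add(h)
            kept.append((custodian, name))
    return kept


# Hypothetical collection: (custodian, file name, file content)
docs = [
    ("alice", "memo.txt", b"Q3 budget memo"),
    ("alice", "memo_copy.txt", b"Q3 budget memo"),
    ("bob", "fwd_memo.txt", b"Q3 budget memo"),
    ("bob", "notes.txt", b"Meeting notes"),
]

print(vertical_dedupe(docs))
print(global_dedupe(docs))
```

In this sample, vertical deduping leaves three documents because Bob's copy of the memo is kept alongside Alice's, while global deduping leaves two because the memo is kept only once for the whole collection.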

For more on deduplication, please visit the deduplication page in the EDRM glossary.


Posted on June 2, 2015 in E-Discovery, Electronically Stored Information (ESI), Software

About the Author

Chad Main is an attorney and the founder of Percipient. Prior to founding Percipient, Chad worked as a litigator in Los Angeles and Chicago. He is a member of the Seventh Circuit Electronic Discovery Pilot Program Committee and may be reached at cmain@percipient.co.