Stop sending garbage to outside counsel (you’ll pay for it)

Corporate legal departments are often tempted to hand discovery projects to outside service providers or law firm partners, either because they hope to offload risk and liability, or because they simply don’t think they have the bandwidth, expertise or patience to handle those data-intensive matters.

But with the emergence of user-friendly, low-overhead technology, that approach can be short-sighted and extremely expensive.

When a discovery request, subpoena or other “search project” comes down the pike, it is not uncommon for in-house legal teams, who are almost always staffed more leanly than their outside counterparts, to indiscriminately collect large amounts of data without regard to the relevance of that information to the request. Then that collection, which generally contains a high volume of “junk” data such as duplicates, system files, and extraneous email, will be handed over to an outside partner. Up to 90 percent of a typical data collection will be made up of information that is not relevant to the discovery request at issue.

The cost associated with processing, searching and, above all, reviewing that non-relevant data can be enormous and, in fact, is often the greatest expense associated with discovery. This is where law firms and vendors feast, because they often charge by the hour (the former) or by an obscure, charge-what-thou-wilt rate (the latter). And, to be sure, neither is incentivized to reduce data volumes.

A matter of proportionality 

Recent amendments to the Federal Rules of Civil Procedure emphasize proportional discovery outcomes, meaning, essentially, that costs should align with benefits. While the principle of proportionality generally speaks to how outside counsel approaches obtaining discovery from opposing parties, it should also be a mandate to in-house counsel. The motto should be: It is unacceptable to corporate executives and stakeholders to spend more on discovery than absolutely necessary, considering the amount in controversy and the importance of the issues — and it should be incumbent upon in-house legal teams to reduce costs where they can.

Identifying and eliminating irrelevant data from collections before involving vendors and outside counsel is crucial to mitigating costs and acting proportionally. Legal departments should seek to secure tools that allow them to independently perform preliminary culling and early case assessment, but that enable outside partners to advise where necessary. Platforms that, at a minimum, group documents and information by similarities, such as file type and email domain, and allow filtering by date range, enable legal teams to cull a document collection in bulk — potentially saving hundreds of thousands of dollars in high-volume matters.
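As a rough illustration of that kind of bulk culling, the sketch below groups a collection by file type and then filters by date range. The Doc record, its field names and the sample files are hypothetical, invented for this example; a real platform would expose its own metadata fields.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass
class Doc:
    """Hypothetical minimal record for one collected file."""
    name: str
    file_type: str   # lowercase extension, e.g. "msg", "docx", "dll"
    modified: date   # sent or last-modified date


def counts_by_type(docs):
    """Group the collection by file type and count each group."""
    return Counter(d.file_type for d in docs)


def cull(docs, start, end, excluded_types):
    """Keep documents dated within [start, end] whose type is not excluded."""
    return [
        d for d in docs
        if start <= d.modified <= end and d.file_type not in excluded_types
    ]


# Made-up sample collection, for illustration only.
collection = [
    Doc("budget.xlsx", "xlsx", date(2015, 3, 2)),
    Doc("kernel32.dll", "dll", date(2015, 3, 2)),
    Doc("memo.docx", "docx", date(2009, 6, 30)),
    Doc("re_contract.msg", "msg", date(2015, 8, 14)),
]

print(counts_by_type(collection))
kept = cull(collection, date(2014, 1, 1), date(2015, 12, 31), {"dll", "exe"})
print([d.name for d in kept])
```

Reviewing the per-type counts first makes it easier to spot whole categories, such as system files, that can be dropped in one pass.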

For example, assume that a law firm reviews 100 documents per hour, and bills $500 for that hour of time. Now assume that a gigabyte contains about 3,000 documents. A common hypothetical scenario might involve a corporate client passing a 50-gigabyte collection, or roughly 150,000 documents, to the law firm to analyze and review (we’ve seen it). At $5 per document, a typical fee, the client would incur $750,000 in fees for that entire collection to be reviewed.

But what if 80 percent of that material is garbage? Culling it from the collection before involving outside counsel would reduce that bill by $600,000. Again, we’ve seen it.
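The arithmetic can be checked with a short calculation. This is only a sketch that multiplies out the inputs stated above (50 gigabytes, 3,000 documents per gigabyte, 100 documents per hour, $500 per hour, 80 percent junk); the function and parameter names are our own.

```python
def review_cost(gigabytes, docs_per_gb, docs_per_hour, hourly_rate, junk_percent=0):
    """Estimated review fees after culling junk_percent of the documents."""
    total_docs = gigabytes * docs_per_gb
    # Integer math keeps the document count exact.
    docs_to_review = total_docs * (100 - junk_percent) // 100
    hours = docs_to_review / docs_per_hour
    return hours * hourly_rate


full = review_cost(50, 3_000, 100, 500)
culled = review_cost(50, 3_000, 100, 500, junk_percent=80)

print(f"Full review:   ${full:,.0f}")           # $750,000
print(f"After culling: ${culled:,.0f}")         # $150,000
print(f"Savings:       ${full - culled:,.0f}")  # $600,000
```

Changing any single input, such as the review rate or the junk percentage, shows how sensitive the final bill is to data volume.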

Ways to save…

For corporate legal teams, realizing those cost savings depends on safe, smart and efficient culling methods. Two are provided here.

Often, domain names can tell enough about an email to enable a determination as to whether it is irrelevant or otherwise inappropriate for inclusion in your production. These domains include those used by your outside attorneys, which are likely to be subject to attorney-client privilege, as well as those domains that indicate commercial or spam email.

These emails can be excluded by searching email metadata fields, including To, From, CC, and BCC, for the specific email domains using a wildcard pattern: * followed by the domain. Note that if you do not exclude email sent to and from outside attorneys from the collection altogether, it may be prudent to analyze it separately.

There are also many freely accessible websites that include directories of domain names used by popular commercial sites; these should be referenced so that irrelevant commercial messages can be identified and excluded from your dataset. With some advanced software, in-house teams can generate a report listing all the domains found within the dataset and select the ones to exclude from that list.
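For teams without such software, the domain report and exclusion step might be approximated along these lines. This is a minimal sketch, assuming each email is available as a dictionary of header fields; the sample messages and the .example domains are invented, and the function names are ours rather than any product's API.

```python
from collections import Counter
from email.utils import getaddresses

ADDRESS_FIELDS = ("From", "To", "CC", "BCC")


def domains_in(message):
    """Return the set of lowercase domains found in the address fields."""
    values = [message[f] for f in ADDRESS_FIELDS if message.get(f)]
    domains = set()
    for _name, addr in getaddresses(values):
        if "@" in addr:
            domains.add(addr.rsplit("@", 1)[1].lower())
    return domains


def domain_report(messages):
    """Count how many messages mention each domain."""
    report = Counter()
    for message in messages:
        report.update(domains_in(message))
    return report


def split_by_domain(messages, excluded_domains):
    """Separate messages that touch an excluded domain from the rest."""
    kept, set_aside = [], []
    for message in messages:
        if domains_in(message) & excluded_domains:
            set_aside.append(message)
        else:
            kept.append(message)
    return kept, set_aside


# Made-up sample data, for illustration only.
messages = [
    {"From": "Alice <alice@acme.example>", "To": "bob@lawfirm.example"},
    {"From": "deals@shop.example", "To": "alice@acme.example"},
    {"From": "alice@acme.example", "To": "carol@acme.example",
     "CC": "dave@partner.example"},
]

report = domain_report(messages)
for domain, count in report.most_common():
    print(domain, count)

kept, set_aside = split_by_domain(messages, {"lawfirm.example", "shop.example"})
print(len(kept), "kept;", len(set_aside), "set aside")
```

Returning the set-aside messages rather than discarding them leaves room to analyze outside-counsel email separately, as suggested above.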

Deduplication is another essential method for reducing downstream discovery costs. Most e-discovery platforms remove exact duplicates, but don’t identify near-duplicates with as much consistency. Many tools treat copies that differ only in minor ways, such as the date on which an email was received, as distinct documents, so multiple “near-dupes” remain in the collection even though they are essentially the same document.

To eliminate near-dupes, use the body text of a Word document or, with email, a combined string of “from, to, cc, bcc, subject” to generate an MD5 hash value. Documents that match should be considered near-dupes and excluded from further review. If this approach is taken, be sure to verify the results by sampling the near-dupes with a well-defined sampling technique.
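The email variant of that approach could look something like the sketch below, which hashes the combined “from, to, cc, bcc, subject” string with MD5 and pulls a random sample of the flagged near-dupes for manual verification. The lowercasing and whitespace normalization, the field separator, the dictionary layout and the simple random sample are all assumptions added for illustration; they are not prescribed by the method described above.

```python
import hashlib
import random

EMAIL_FIELDS = ("from", "to", "cc", "bcc", "subject")


def normalize(text):
    """Lowercase and collapse whitespace (an assumption of this sketch)."""
    return " ".join(text.lower().split())


def email_key(email):
    """MD5 hash of the combined from/to/cc/bcc/subject string."""
    combined = "|".join(normalize(email.get(f, "")) for f in EMAIL_FIELDS)
    return hashlib.md5(combined.encode("utf-8")).hexdigest()


def split_near_dupes(emails):
    """Keep the first email per hash; flag later matches as near-dupes."""
    seen = set()
    kept, dupes = [], []
    for email in emails:
        key = email_key(email)
        if key in seen:
            dupes.append(email)
        else:
            seen.add(key)
            kept.append(email)
    return kept, dupes


def sample_for_review(dupes, k, seed=0):
    """Draw a reproducible simple random sample of flagged near-dupes."""
    return random.Random(seed).sample(dupes, min(k, len(dupes)))


# Made-up sample data: the first two differ only in received date and case.
emails = [
    {"from": "alice@acme.example", "to": "bob@acme.example",
     "subject": "Q3 forecast", "received": "2015-03-02"},
    {"from": "Alice@acme.example", "to": "bob@acme.example",
     "subject": "Q3 forecast", "received": "2015-03-03"},
    {"from": "carol@acme.example", "to": "bob@acme.example",
     "subject": "Lunch", "received": "2015-03-02"},
]

kept, dupes = split_near_dupes(emails)
print(len(kept), "kept;", len(dupes), "flagged as near-dupes")
print(sample_for_review(dupes, k=1))
```

Because the message body is not part of the key, two different emails with identical headers and subject lines would be flagged as matches, which is one reason the sampling step matters.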

Seek out flat-fee pricing

Corporate legal teams are business minded, and value cost predictability over all else.

When selecting tools that empower legal teams to process, analyze, cull and review data in-house, it may be wise to seek out tools and providers that offer flat-fee pricing, which is becoming increasingly common within some small but growing sectors of the legal technology market.

With flat-fee arrangements, the technology provider bundles the costs of tasks such as processing, analyzing, culling and reviewing with others, such as the cost to use the software, into one predictable fee that remains constant irrespective of volume.

An increasingly common form of all-inclusive, flat-fee pricing charges not by data consumed, but by matter or case — where the client pays one bulk sum for “end-to-end” discovery services. When vetting a technology or vendor, it is important to ensure that there are no hidden charges within the purported “flat fee,” such as surcharges for user licenses, data processing, or productions.   

In certain instances, it is necessary to involve outside counsel or vendors earlier in the case, such as when a sensitive matter or investigation requires a forensic collection. But data subject to most run-of-the-mill requests can be dramatically reduced with the appropriate tools and methods prior to involving outside partners. Corporate legal departments that don’t seize that opportunity are flushing money down the drain.