Far too often, discovery is an opaque process, characterized by black-box technologies, black-box processes, and inexplicable costs. In too many instances, discovery becomes the antithesis of the liberal transfer of information it was designed to be.
Yet when it comes to ensuring an accurate process—and a fair shake at discovering the facts of a matter—transparency is key. This is particularly true when it comes to discovery software. During the discovery process, the software platform used can have a significant, but often under-appreciated, impact on the outcome.
Of course, many practitioners are aware that their discovery technology can impact factors such as cost, speed of review, and usability. But the technology can also impact your very ability to find the information you need. Here, three factors play a major role:
- your platform’s indexing;
- its exception reports; and
- its ability to handle embedded files.
Yet it’s not just the abilities and limitations of your technology that you need to keep in mind. It’s your opponent's as well.
First, Know Yourself—And Your eDiscovery Platform
As we’ve mentioned in past posts, there is often a whole universe of information that your documents contain but which your platform may not reflect. Some of the most widely used discovery software fails to index a surprisingly large number of common words and characters. The most common discovery platform ignores, by default, all single digits, many special characters, and 112 words. Those words and symbols aren’t indexed and, thus, they simply don’t exist in the platform’s eyes. As a result, that platform, and many like it, can’t find the word “e-discovery.” For the same reason, many discovery tools can’t find the phrase “I did it”—at least not without some serious tweaking on the backend—since each word is treated as "noise" and left unindexed by the platform.
The importance of the index, and of its abilities and limitations, stems from one of the main ironies of document search—that you’re almost never searching the documents themselves. As Craig Ball reminded us recently, “any time you approach electronic search and electronic discovery, you're never searching the actual electronic evidence.” Rather, you're searching the tool's interpretation of that evidence. Ball is a trial lawyer and eDiscovery expert who has served as a special master on some of the most challenging cases in the U.S. Last month, he hosted a webinar with Logikcull on eDiscovery search accuracy, where he explained the importance of a discovery tool’s approach to indexing:
We are almost always searching an extraction of textual material in some levels of metadata from the items that make up the actual evidence. Now, that may seem like a distinction without much difference, but in fact when you take a look at the processes behind that extraction, you will often find that the index is a compromise. It will not only lack certain information that in many cases will prove to contain responsive information (or at least some responsive information), but there will also be compromises made in information that is deliberately excluded from the index in order to make the index perform with greater efficiency.
Ultimately, it is the index that matters most.
All eDiscovery searches ultimately get directed at the index and typically the index alone. So if it is not in the index, there's nothing you can do in your query. No amount of special characters, parentheses or quotation marks will serve to identify a word that simply did not make it into the index for one reason or another.
Without understanding how your software works, what it indexes and what it does not, you can’t fully understand what can be found within it—and what it might be overlooking.
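To make the point concrete, here is a minimal sketch of how a noise-word list can make ordinary text unsearchable. It is a toy model, not any vendor's actual indexer: the `NOISE_WORDS` set, the rule that drops single characters, and the function names are all assumptions chosen for illustration.

```python
import re

# Hypothetical noise list; real platforms ship their own (often longer) defaults.
NOISE_WORDS = {"i", "did", "it", "a", "an", "the", "of", "to"}


def index_tokens(text):
    """Return the set of tokens a toy index would keep for `text`."""
    # Split on anything that is not a letter or digit, so "-" and "%" vanish.
    raw = re.findall(r"[a-z0-9]+", text.lower())
    kept = set()
    for tok in raw:
        if tok in NOISE_WORDS:
            continue  # noise words are never indexed
        if len(tok) == 1:
            continue  # single digits and single letters are dropped too
        kept.add(tok)
    return kept


def phrase_is_searchable(phrase, document):
    """A phrase can only match if every one of its tokens survived indexing."""
    wanted = re.findall(r"[a-z0-9]+", phrase.lower())
    indexed = index_tokens(document)
    return bool(wanted) and all(tok in indexed for tok in wanted)


doc = "I did it. The e-discovery vendor was paid 20% up front."
for phrase in ("I did it", "e-discovery", "vendor"):
    print(phrase, "->", phrase_is_searchable(phrase, doc))
```

In this toy model, only "vendor" is reported as searchable. "I did it" consists entirely of noise words, and "e-discovery" loses both its hyphen and its single-letter prefix before it ever reaches the index.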
Similarly, how your platform treats exception reports can have a sizable impact on the accuracy of your discovery process. Exception reports disclose the processing errors encountered when preparing data for review, things like unreadable or corrupt files, password-protected documents, and more.
A platform that makes exceptions clear and transparent makes it easy to identify and resolve any errors. By contrast, a platform that forces you to seek out exception reports, or buries errors under cryptic, overly technical labels, is doing you a disservice: it makes it harder to tell which documents you can actually review and which you cannot. And if a document is important enough to be password protected, wouldn’t you want to know that?
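As a rough illustration of what a readable exception report might look like, the sketch below groups processing errors under plain-language labels. The error codes, the labels, and the `summarize_exceptions` function are hypothetical and do not correspond to any particular platform's output.

```python
from collections import defaultdict

# Hypothetical mapping from terse processing codes to plain-language labels.
PLAIN_LABELS = {
    "ERR_PWD": "Password protected: needs a password before review",
    "ERR_CORRUPT": "Corrupt or unreadable: request a replacement copy",
    "ERR_UNSUPPORTED": "Unsupported file type: may need native review",
}


def summarize_exceptions(exceptions):
    """Group (filename, code) pairs under readable labels."""
    report = defaultdict(list)
    for filename, code in exceptions:
        # Unknown codes are surfaced rather than silently dropped.
        label = PLAIN_LABELS.get(code, f"Unrecognized error ({code})")
        report[label].append(filename)
    return dict(report)


sample = [
    ("budget.xlsx", "ERR_PWD"),
    ("scan_004.pdf", "ERR_CORRUPT"),
    ("forecast.xlsx", "ERR_PWD"),
]
for label, files in summarize_exceptions(sample).items():
    print(f"{label} ({len(files)}): {', '.join(files)}")
```

Here, a reviewer sees at a glance that two spreadsheets are locked behind passwords and one scan needs to be re-collected, instead of having to decode raw error codes.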
Then there are embedded files. These are “hidden” files that can be found in other documents, such as the database embedded in the pie chart in a presentation file. Embedded files can be present in any discovery project, but not every platform has the ability to handle them. Some do not extract embedded files at all. Others are inconsistent with them. For your search to be accurate and comprehensive, you’ll want to make sure that embedded files are accessible.
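For a sense of where such files hide, consider modern Office formats. A .pptx, .docx, or .xlsx file is a ZIP package, and embedded objects are commonly stored in an `embeddings/` folder inside it. The sketch below lists those parts using Python's standard `zipfile` module. The `list_embedded_parts` function and the stand-in archive are assumptions for illustration; the archive is not a valid presentation, and a real processing pipeline would also need to handle older binary formats, email attachments, and nested containers.

```python
import io
import zipfile


def list_embedded_parts(ooxml_bytes):
    """Return names of parts stored under an 'embeddings/' folder."""
    with zipfile.ZipFile(io.BytesIO(ooxml_bytes)) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if "/embeddings/" in name and not name.endswith("/")
        )


# Build a tiny stand-in for a presentation whose chart embeds a workbook.
# (Illustrative only: this archive is not a complete, openable .pptx file.)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("ppt/slides/slide1.xml", "<slide/>")
    archive.writestr("ppt/embeddings/Microsoft_Excel_Sheet1.xlsx", b"placeholder")

print(list_embedded_parts(buffer.getvalue()))
```

If a platform never opens that folder, the workbook behind the chart is never extracted, never indexed, and never searched.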
Of course, understanding these factors is important when selecting and using your own discovery software. But you also need to be aware of the abilities and limitations of your opponent’s software.
Know What Your Opponent’s Technology Is Capable of as Well
The discovery process is never one-sided. The capabilities of your opponent’s discovery software can affect the accuracy of your process as much as your own. For example, if you are requesting documents that hit on a specific search phrase or keyword, will the producing party’s review software be able to handle those searches successfully? It might not if the query is something like:
"20%" AND ("payment" OR "amount" OR "check" OR "pay")
This is, as Ball explains in the webinar, a query he encountered in his own work. It was negotiated between the parties after significant back and forth. Only later did they realize that the search platform in use would not recognize the percentage sign, meaning that many responsive documents could have been missed had the error not been identified.
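A problem like this can be caught before a query is agreed upon. The sketch below is one possible pre-flight check: it flags any quoted search term containing characters that the producing party's index does not keep. The `unsearchable_characters` function is hypothetical, and the default assumption that only letters and digits are indexed is a placeholder; the real character set would have to come from the tool's documentation or from the other side.

```python
import re


def unsearchable_characters(query, indexed_chars=r"a-z0-9"):
    """Map each quoted term to the characters an index would drop from it."""
    # Normalize curly quotes so every term is delimited the same way.
    normalized = query.replace("\u201c", '"').replace("\u201d", '"')
    problems = {}
    for term in re.findall(r'"([^"]+)"', normalized):
        # Anything outside the indexed set (other than whitespace) is lost.
        dropped = re.findall(rf"[^{indexed_chars}\s]", term.lower())
        if dropped:
            problems[term] = dropped
    return problems


query = '"20%" AND ("payment" OR "amount" OR "check" OR "pay")'
print(unsearchable_characters(query))
```

For the query above, the check flags only the "20%" term, because of its percent sign. Passing a wider `indexed_chars` set, such as `r"a-z0-9%"`, models a tool that does index the symbol.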
One of the top pieces of advice he gives, Ball says, is to assess an opponent’s search tool.
For the most part, there's no great shame in using a tool with certain limitations. The shame comes from not knowing what those limitations are and failing to disclose them to the other side so that all parties, and the court when necessary, can set their expectations of what search can do to mirror the reality.
In order to find potentially relevant evidence, there needs to be a transparent process on both sides, though Ball recognizes that parties can sometimes resist providing insight into their tool.
The need to understand your opponent’s tool highlights another trend in discovery technology: both sides relying on the same tool. Now, we don’t mean the same dated, legacy software that was once the only alternative to paper-based discovery. We mean modern tools that make the discovery process quicker, easier, and more transparent. Indeed, one of the fringe benefits of looking into an opponent’s tools is finding ones that may better suit your needs.
When both sides take advantage of these tools, they eliminate the need to fight over file formats, load files, and exception reports. They can have confidence in their knowledge of the platform’s capabilities, because it’s their platform too. And the actual sharing of documents can be almost instantaneous, reducing friction points in a process that is far too often filled with needless friction and delay.
When that is not the case, though, legal professionals who want to ensure a thorough, accurate, transparent process will have some research to do, both on the capabilities of their own platform and on their opponent’s.
This post was authored by Casey C. Sullivan, who leads education and awareness efforts at Logikcull. You can reach him at casey.sullivan@logikcull.com or on Twitter at @caseycsull.