Six eDiscovery search tips (for those without the help of ChatGPT)
With all the recent focus on ChatGPT, there is great anticipation over the bold progress expected from its application across the legal industry, particularly when it comes to eDiscovery search. This milestone reminds me of when Apple’s Siri was first launched in October of 2011. It, too, offered significant efficiency gains ahead of its time. To this day, it also still produces some unexpected and sometimes humorous results.
That said, it will be a long time before ChatGPT or other large language models make strategic searching and critical thinking unnecessary across database systems. So, for those of us on the frontlines trying to find just the right evidence for an upcoming deposition, here are a few search tips to keep in mind:
Tip 1: Are you using a field to force a sort order on your search result list? A forced sort overrides relevance ranking, so it will likely prevent the records most relevant to your search from bubbling up to the top.
Tip 2: Do your search results include documents that don’t seem to contain the terms within your search? If so, it is likely that the terms are within the associated metadata for the records but not actually within the document text itself. Try limiting the search to the actual Body Text only.
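To see why a metadata hit can look like a false positive, here is a minimal Python sketch. The records, the field names (body_text, custodian, file_path), and the search helper are all hypothetical and stand in for whatever your platform actually exposes; the point is only the difference between searching every field and searching body text alone.

```python
# Toy records with hypothetical field names; not any real platform's schema.
records = [
    {"id": 1, "body_text": "Please review the merger agreement today.",
     "custodian": "Bob Smith", "file_path": "/exports/contracts/a.docx"},
    {"id": 2, "body_text": "Lunch on Friday?",
     "custodian": "Ann Lee", "file_path": "/exports/merger/lunch.msg"},
]

def search(records, term, fields):
    """Return ids of records where the term appears in any of the given fields."""
    term = term.lower()
    return [r["id"] for r in records
            if any(term in str(r[f]).lower() for f in fields)]

# Searching every field pulls in record 2, whose only hit is in the folder path.
all_fields = search(records, "merger", ["body_text", "custodian", "file_path"])

# Limiting the search to body text keeps only the document that actually says "merger".
body_only = search(records, "merger", ["body_text"])

print(all_fields, body_only)
```

Record 2 is the kind of result that makes a reviewer ask why the document came back at all: the term lives in the path, not in the text.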
Tip 3: Don’t go too wild with your wild cards. Avoid pitfalls with your search terms by limiting the use of wild cards on two- and three-character terms (e.g., Ed* or Bob*).
Try using other limiting language instead (e.g., (Bob or Bobby) w/2 Smith).
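A rough Python sketch of the difference follows. The vocabulary list is made up, and the within helper is only an approximation of a w/2 proximity operator (here treated as unordered, within two words); real platforms define wildcard and proximity behavior in their own way, so treat this as an illustration rather than a description of any particular tool.

```python
import re

# Made-up list of indexed terms, purely for illustration.
vocabulary = ["ed", "edit", "edition", "education", "edge", "edward",
              "bob", "bobby", "bobbin", "bobsled"]

def expand_wildcard(pattern, vocabulary):
    """Return the indexed terms that a trailing-* pattern would match."""
    prefix = pattern.rstrip("*").lower()
    return [t for t in vocabulary if t.startswith(prefix)]

def within(text, first_terms, second_term, distance):
    """Approximate (A or B) w/N C: any first term within N words of the second."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    firsts = [i for i, w in enumerate(words) if w in first_terms]
    seconds = [i for i, w in enumerate(words) if w == second_term]
    return any(abs(i - j) <= distance for i in firsts for j in seconds)

# Short wildcards sweep in unrelated words.
ed_hits = expand_wildcard("Ed*", vocabulary)
bob_hits = expand_wildcard("Bob*", vocabulary)

# Explicit name variants plus proximity stay on target.
hit = within("Bobby J. Smith signed the lease", {"bob", "bobby"}, "smith", 2)
miss = within("The bobsled team met Mr. Smith", {"bob", "bobby"}, "smith", 2)

print(ed_hits, bob_hits, hit, miss)
```

In this toy index, Ed* picks up "edit", "edition", "education", and "edge" along with "edward", while the explicit (Bob or Bobby) w/2 Smith style of search matches the person and ignores the bobsled.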
Tip 4: Are you getting over-inclusive results due to header or footer language? Speak with your eDiscovery professional about how to capture and field this information separately without including it in the document index. This will also result in improved predictive coding results.
Tip 5: Still having over-inclusive results? Check what fields are searchable and what weight of importance is applied to each field (again, your eDiscovery professional can help with this). Some common issues to look out for would include a) having the folder path to the native as part of a searchable field; b) including problematic terms within a Custodian or Source field (e.g., “Billing Dept.”); or c) having organizational metadata that is searchable as opposed to just filterable options (e.g., a field called Deponent with values like “Bob Smith Depo”).
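The effect of field weighting can be sketched in a few lines of Python. The field names, the weights, and the score function below are hypothetical and far simpler than a real relevance engine; they only show how a searchable folder path or custodian-style field can push an otherwise irrelevant record into your results.

```python
import re

def score(record, term, weights):
    """Sum weighted counts of the term across the fields listed in weights."""
    term = term.lower()
    total = 0.0
    for field, weight in weights.items():
        words = re.findall(r"[a-z0-9]+", str(record.get(field, "")).lower())
        total += weight * words.count(term)
    return total

# Hypothetical records: the first discusses billing, the second only sits in a billing folder.
relevant = {"subject": "Billing dispute",
            "body_text": "The billing records show a billing error",
            "file_path": "/Billing Dept/2020/a.msg"}
noise = {"subject": "Holiday party",
         "body_text": "Bring snacks",
         "file_path": "/Billing Dept/2020/b.msg"}

# Assumed weights: the folder path is searchable in one setup and excluded in the other.
path_searchable = {"subject": 2.0, "body_text": 1.0, "file_path": 1.0}
path_excluded = {"subject": 2.0, "body_text": 1.0, "file_path": 0.0}

noise_with_path = score(noise, "billing", path_searchable)
noise_without_path = score(noise, "billing", path_excluded)
relevant_without_path = score(relevant, "billing", path_excluded)

print(noise_with_path, noise_without_path, relevant_without_path)
```

With the path searchable, the holiday party email scores as a hit for "billing"; with the path weighted to zero, it drops out while the genuinely relevant record keeps its score.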
Tip 6: Still need help with over-inclusiveness? Try having your eDiscovery professional modify your Stop Word List to include problematic terms. A few strategic modifications there can make a big difference. Keep in mind that you may also have stop words for other languages.
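Here is a small Python sketch of what a stop word change does at indexing time. The default stop word set, the footer text, and the index_terms helper are all assumptions made for illustration; your platform's actual list and tokenizer will differ.

```python
import re

# Assumed default stop words; real lists are longer and platform-specific.
DEFAULT_STOP_WORDS = {"the", "and", "of", "to", "a", "in"}

def index_terms(text, stop_words):
    """Return the words that would be indexed once stop words are dropped."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in stop_words]

# Hypothetical footer language that appears on every email in the collection.
footer = "This email is confidential and privileged"

# Add the problematic footer terms to the stop word list.
custom_stop_words = DEFAULT_STOP_WORDS | {"confidential", "privileged"}

before = index_terms(footer, DEFAULT_STOP_WORDS)
after = index_terms(footer, custom_stop_words)

print(before)
print(after)
```

The trade-off is that a stop word is no longer searchable anywhere in the collection, so a term like "privileged" belongs on the list only if you will never need to search for it.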
Whatever eDiscovery tool you use, be sure to take advantage of any other helpful AI it includes. Using features like Predictive Coding (whatever flavour you choose), Review-in-Context, Predictive Search or ‘find-more-like-me’, Predictive Filtering, and the like will be an efficiency gain. And don’t forget the benefits of visualization tools like Hypergraph, Visualizer, and Chat and Excel viewers.
Heidi Amaniera is a Director in LegalTech Professional Services at OpenText with world-wide leadership responsibilities over off-cloud and public cloud implementation, enablement, managed services, and consulting for Axcelerate. Heidi’s background includes management of eDiscovery services within both the vendor and law firm environments. She also spent over 15 years as a seasoned litigation paralegal specializing in Intellectual Property. Originally published here.