SharePoint's flexibility requires careful use of e-discovery protocols

By Mark Gerow, Kevin Moore, and Matthew Kesner

As discoverable content continues to grow at an accelerating rate, much of it has found its way into Microsoft SharePoint. SharePoint's collaboration tools, its flexible management of documents and other data, and its tight integration with Microsoft's Office suite have spurred its wide adoption in the legal industry. But SharePoint's flexibility also presents challenges, both for organisations that need to manage data contained in SharePoint sites, and for law firms tasked with executing sound electronic data discovery (EDD) methods.

For users, the central question is how to store sufficient content to meet the needs of the business, while minimising the amount of discoverable content. For EDD, the central question is how to efficiently extract and analyse all relevant content.

When it comes to EDD, electronic information management (EIM) and legal records management (LRM) are tightly intertwined. The guiding principle of EIM is to manage and retain content to meet the needs of the business or applicable legal requirements. The guiding principle of LRM is to know where every relevant document and piece of data relating to a client matter is housed, so it may be retained as long as necessary, and destroyed or transferred back to the client when appropriate or required.

To successfully support either function in SharePoint, content must be appropriately tagged with metadata, as is done in traditional legal document management systems. The challenge with SharePoint is in its flexibility — each user can devise a unique scheme for tagging content with metadata, making it difficult to treat all content consistently.

Many users do not realise that SharePoint has tools that enforce consistent metadata tagging and records management. Content Types enable the consistent application of metadata and validation across multiple sites. The Records Center, which was enhanced in SharePoint 2010, handles records management. Together, these two features can form the basis for a firm-wide content tagging and records management policy. The dual goals: retain enough of SharePoint's flexibility to keep it an attractive platform for collaboration, while providing sufficient consistency to effectively manage content throughout its entire lifecycle.

Lifecycle content management requires addressing the question of data disposal. There are significant efficiency and risk-management benefits to systematically winnowing data. Organisations can become more efficient in day-to-day operations. Less data also means lower costs of collection, processing, and review. Another benefit: potential litigants can more quickly get a sense of the strength of a defense, minimising litigation fees and costs.

Unlike a typical document management system, which permanently deletes documents and associated metadata when requested by authorised users, SharePoint moves that data to a recycle bin. This helps the user/administrator recover content if a deletion was in error. But it also means that content the user assumes was destroyed may be collectible and discoverable for 60 or more days after deletion — unless that content was permanently deleted by a SharePoint administrator.

Another risk is user-maintained personal sites (My Sites) that can contain documents and other data. Absent clear governance policies, users may accumulate gigabytes of content without any formal plan for its destruction. Consider establishing day-to-day policies, and records management and enforcement rules and protocols, to purge stale content. Automated age-based disposition rules, and storage limits, may be appropriate, accompanied by periodic warnings to users. Users can be "highly encouraged" (word choice and tone depending on firm culture) to move items requiring long-term retention to SharePoint areas that are exempt from retention policies, such as a practice group or department library.
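An age-based disposition rule with advance warnings can be sketched in a few lines. The example below is illustrative only: the function name `classify_item`, the 365-day retention window, and the 30-day warning period are hypothetical values chosen for the sketch, not SharePoint settings or recommendations. It works on plain dates and does not call SharePoint.

```python
from datetime import date

def classify_item(last_modified, today, retention_days=365, warning_days=30):
    """Return 'purge', 'warn', or 'keep' for one item.

    retention_days and warning_days are hypothetical policy values;
    a real policy would come from the firm's governance rules.
    """
    age = (today - last_modified).days
    if age >= retention_days:
        return "purge"   # past the retention window
    if age >= retention_days - warning_days:
        return "warn"    # close enough to the window to notify the user
    return "keep"

# Hypothetical My Site items: (name, last modified date).
items = [
    ("old-draft.docx", date(2011, 11, 1)),
    ("aging-memo.docx", date(2011, 12, 20)),
    ("recent-notes.docx", date(2012, 6, 1)),
]
today = date(2012, 12, 1)
report = {name: classify_item(modified, today) for name, modified in items}
```

Items flagged "warn" would be the ones users are encouraged to move to an exempt library before the purge date.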

Another challenge with SharePoint is its flexibility with metadata. Each practice group, department, or user can potentially assign different and inconsistent metadata to files. Given the cooperation of the source organisation, it may be possible to create a "map" of metadata synonyms across various libraries and lists. For example, one SharePoint document library may contain the field "Client Number" while another uses the field name "Client #."

Some situations may require building a map of the metadata field names, so you may need to obtain the list of metadata fields for thousands of lists or libraries across an organisation's SharePoint implementation. Fortunately, SharePoint provides an application programming interface that supports this task. You'll likely need to program custom reports so that reviewers can identify all synonyms. 
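Once field names have been collected, the grouping step behind such a report might look like the following sketch. It assumes the inventory has already been exported as (library, field name) pairs; the function names and the normalisation rules (treating "#" and "No." as "number") are hypothetical, and a real map would still need human review of the candidates.

```python
import re
from collections import defaultdict

def normalize_field_name(name):
    """Reduce a field name to a comparison key (hypothetical rules)."""
    key = name.lower().replace("#", " number ")
    key = re.sub(r"\bno\b\.?", " number ", key)   # "No." -> "number"
    key = re.sub(r"[^a-z0-9]+", " ", key)         # drop punctuation
    return " ".join(key.split())                  # collapse whitespace

def build_synonym_map(inventory):
    """Group (library, field) pairs whose names normalise to the same key.

    Only keys with more than one distinct spelling are returned,
    since those are the candidates a reviewer needs to confirm.
    """
    groups = defaultdict(set)
    for library, field in inventory:
        groups[normalize_field_name(field)].add((field, library))
    return {
        key: sorted(members)
        for key, members in groups.items()
        if len({field for field, _ in members}) > 1
    }

# Hypothetical inventory of library/field pairs.
inventory = [
    ("Litigation Docs", "Client Number"),
    ("Corporate Docs", "Client #"),
    ("Corporate Docs", "Matter"),
    ("IP Docs", "Matter"),
]
synonyms = build_synonym_map(inventory)
```

Here "Client Number" and "Client #" land in the same group, while "Matter" is left out because it is spelled the same way in both libraries.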

As with a DMS, SharePoint stores documents and their metadata separately. If files are copied out of SharePoint by dragging them from "Explorer View" into a folder on a local or networked hard drive, the metadata will be left behind. To extract a complete set of content, including versions and metadata, one of two methods may be used: 1) obtain a complete copy of the SharePoint content database(s) and import it into another SharePoint instance maintained by the firm performing the discovery; or 2) extract the documents and metadata to the file system using a third-party or custom tool that retains the metadata and maps it to the associated documents.
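For the second method, one simple way to keep metadata mapped to exported files is a manifest with one row per file version. The sketch below is an assumption about how a custom tool might do this, not a description of any particular product: the record layout (`path`, `version`, `metadata`) and the function name `write_manifest` are hypothetical, and the records are supplied by hand rather than read from SharePoint.

```python
import csv
import io

def write_manifest(records, out):
    """Write one CSV row per exported file version.

    Each record is a dict with 'path', 'version', and a 'metadata' dict.
    Assumes no metadata field is itself named 'path' or 'version'.
    """
    # Use the union of all field names so no column is silently dropped.
    fields = sorted({name for rec in records for name in rec["metadata"]})
    writer = csv.DictWriter(out, fieldnames=["path", "version"] + fields, restval="")
    writer.writeheader()
    for rec in records:
        row = {"path": rec["path"], "version": rec["version"]}
        row.update(rec["metadata"])
        writer.writerow(row)
    return fields

# Hypothetical export records from two libraries with different field names.
records = [
    {"path": "export/lit/brief.docx", "version": "2.0",
     "metadata": {"Client Number": "1001", "Author": "A. Smith"}},
    {"path": "export/corp/minutes.docx", "version": "1.0",
     "metadata": {"Client #": "1001"}},
]
buffer = io.StringIO()
columns = write_manifest(records, buffer)
```

Fields that a library does not use are left blank in that row, which keeps inconsistent field names visible to reviewers instead of hiding them.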

Assuming access to the original or full copy of a SharePoint repository, the built-in SharePoint search may be used to discover data — if content is tagged consistently with metadata. Once a search index is created that includes the document text as well as metadata, a variety of full-text or keyword searches based on specific metadata fields can be used to find the data.

While search-based discovery can be powerful, SharePoint hides duplicates in search results by default, so a result set may omit copies that matter to a collection. It is best not to rely on SharePoint search as the sole EDD tool, but to use it in conjunction with other software once likely avenues for detailed analysis are identified. In addition to SharePoint's native search, many third-party tools can help perform collection, preservation, and analysis of SharePoint content. When selecting software, here are basic questions to ask:

» Does it require that an agent be installed on the SharePoint servers? If yes, the party performing EDD will need administrative privileges (or support).

» Does it rely on native SharePoint search, a proprietary search, or both? SharePoint's native search was not designed with e-discovery in mind.

» What metadata can the tool use and collect? The ideal tool should be able to process all types of metadata that SharePoint can attach to documents and other content.

» What export formats are supported: EDRM XML, CSV, XLS, PDF, or proprietary? Support for standard review formats means there will be a wider selection of additional review tools to augment the built-in capabilities.

» Does an export require a secondary SharePoint server for review? For large collections, it may be costly and resource-intensive to build a second SharePoint farm.

To tame your SharePoint beast, consider implementing a lifecycle governance policy — as well as a firmwide metadata standard that ensures all information stored in SharePoint has a basic set of data useable for records management, and, when appropriate, content destruction. This will help your firm shepherd data through its lifecycle and ultimate demise, reducing the risk that the firm will be forced to react if those documents and data become subject to discovery. For collection and review of SharePoint content, understand its architectural characteristics, so you can successfully extract and analyse content in a technologically sound and legally defensible manner. Once that baseline knowledge and skill sets are in place, a combination of third-party tools and custom programming may be used to reliably plumb the depths of any SharePoint repository.

Matthew Kesner (CIO), Kevin Moore (IT director), and Mark Gerow (director of applications and business) are with Mountain View, California, law firm Fenwick & West. Reprinted with permission from the December 2012 edition of Law Technology News magazine, © 2012 ALM Media Properties, LLC. All rights reserved. Further duplication without permission is prohibited.