Ruling the World of Information Management and Electronic Discovery

E-Discovery Insights – Clearwell Systems, Inc.
by Kurt Leafstrand on November 17th, 2010
If you’re anything like Dr. Evil, Tears for Fears, or Napoleon, ruling the
world is at or near the top of your to-do list, and part of ruling the
world is having as omniscient a knowledge as possible of what’s going
on, in order to better control it. Ruling the world has also long been
the dream of many software vendors, who want to own and
understand all the information in an enterprise in order to, um,
provide maximum value to their customers… oh, and also to lock them
in to a single underlying platform that allows them to control as much
of the organization’s information management decisions as possible.
In some cases, these dual interests are aligned. However, in e-discovery, it’s not so clear. Over
the last couple of years, many vendors have pushed a notion of “index everything” or so-called
“proactive” e-discovery, in which you have instant access to all the information in your
enterprise, in real-time, from which to drive your e-discovery process. But is this feasible? Or
even desirable?
The Myth of the Silver Bullet
It can be tempting for IT to turn to an enterprise search solution that can index all data sources
– laptops, desktops, file servers, SharePoint servers, databases, email archives, content
management systems – and enable e-discovery across the entire enterprise in an instant. The
reality is that while such a solution may work for enterprise search in small and medium-sized
companies with a finite scope of data, the level of complexity in scale and defensibility of
operations makes this simply not an achievable approach for e-discovery at most large
enterprises. As Anne Kershaw and Joe Howie of the Electronic Discovery Institute noted in their
just-published Judges’ Guide to Cost-Effective E-Discovery:
“There is no single silver bullet that solves all problems associated with escalating discovery
costs and delays. As noted above, the single most effective cost reduction method is the
focused collection of records most likely to contain relevant information. Some argue that
e‐discovery is best accomplished by taking large amounts of data from clients and then applying
keyword or other searches or filters. While, in some rare cases, this method might be the only
option, it is also apt to be the most expensive. In fact, keyword searching against large volumes
of data to find relevant information is a challenging, costly, and imperfect process. A much
better approach is to ask key client contacts to help you locate core relevant information and
then, by reading that information, determine other sources of relevant information.”
What are the specific reasons why a targeted collection approach is superior? From our
conversations with clients as we have been developing our solution to this problem over the
last couple of years, three major drawbacks to the index-everything approach stand out.
1. Impact to Existing IT Environment
While the collect-and-preserve approach employed by Clearwell is widely accepted for e-
discovery, index-everything and preserve-in-place solutions have recently emerged, originating
from other enterprise applications such as knowledge management and enterprise search.
These approaches from other domains have significant disadvantages when applied to
e-discovery, including an impact on existing IT infrastructure and processes that results in
increased cost and complexity. For instance, the scope of e-discovery can exceed the amount of
information being indexed by knowledge management or enterprise search applications.
According to Forrester, the majority of enterprise search implementations range in size from
the hundreds of thousands to tens of millions of records, not the billions of documents that are
potentially discoverable during litigation. Consequently, index-everything solutions must index
a much larger volume of data across a broader range of applications and data stores than would
typically be necessary for enterprise search.
Indexing such a large amount of data has implications for the entire IT environment. These
solutions either crawl data repositories over the network or employ agents on local desktops
and laptops to find new and modified files. IT organizations using these solutions report
experiencing disruptions including:
• Requiring read access and permissions to numerous line-of-business applications and storage
systems where data resides
• Significant increases to disk I/O for enterprise applications, network file shares, and client
machines
• Increased network consumption as large amounts of data are read over the network
• Increased consumption of local hard drive space on employee desktops and laptops for
search indexes and redundant copies of preserved files
• Scheduling resource-intensive indexing tasks during off-peak hours, impacting the ability of IT
departments to complete backups during shrinking backup windows
Taken together, these issues add cost and complexity to the deployment of index-everything
and preserve-in-place solutions. This often results in organizations not fully deploying the
solution after purchasing licenses and spending months or years trying to integrate with their
existing systems.
2. Risk of Missing Critical Data
Another key concern of organizations seeking to meet e-discovery requests is the ability to find
all relevant files and documents for a case. Missing even a few important documents may result
in multimillion-dollar fines and sanctions. UBS and Morgan Stanley paid $29.2 million and
$12.5 million, respectively, for losing key files during litigation. It is therefore critically
important that e-discovery solutions be able to index and search not only common file
types but also a range of less common but equally important files, such as those within nested
container files, encrypted files, and TIFF images containing text. Solutions that originate from
applications outside the e-discovery domain often skip these files because 100% accuracy is not
required for other applications such as enterprise search. Across organizations with billions of
documents, there may be hundreds of thousands of potentially relevant files which are in the
dark and unknown to legal teams because they are not indexed.
Index corruption is another commonly reported issue with index-everything solutions that
results in incomplete search results. Search indexes are susceptible to data corruption just like
any other computer file, but the large size of indexes containing billions of records increases the
probability of errors. In fact, this is a common problem with most archive solutions and other
solutions that manage billions of records. A corrupt search index will result in incomplete
results or, in the worst-case scenario, the inability to conduct searches until the index is
repaired. In some situations, data must be re-indexed to rebuild a corrupt search index, which is
time-consuming due to the slow speed of some solutions.
The net result is that in-place solutions increase the likelihood of missing critical data, exposing
the organization to considerable legal and financial risk.
3. Time Delays and Uncertainty in Searches
When embarking on a project to make all enterprise data searchable for e-discovery, an
important consideration is indexing speed in relation to total outstanding data and projected
data growth. Organizations deploying such a solution typically have a large amount of existing
data that needs to be indexed, and this index must be continually updated as data is modified
and new data is created. Many companies report that although vendors claim high processing
rates, these high rates erode over time as companies index greater amounts of their existing
data, increasing the size of search indexes. Beyond an application’s ability to index data, there
are exogenous factors affecting indexing performance including network speed, disk I/O, and
latency. Along with index size and the number of search indexes, these factors can also affect
search query performance, resulting in searches that take hours or days to return results.
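The relationship between indexing speed, outstanding data, and data growth can be made concrete with a little arithmetic. The Python sketch below is only an illustration under simplifying assumptions: the function name days_to_catch_up and all figures are hypothetical, and both the indexing rate and the rate of new data are treated as constant, which is optimistic given that real indexing rates tend to erode as indexes grow.

```python
import math


def days_to_catch_up(backlog_docs: float,
                     index_rate_per_day: float,
                     new_docs_per_day: float) -> float:
    """Estimate the days needed to index an existing backlog.

    Assumes a positive backlog and constant rates (a simplification).
    Returns math.inf when new data arrives at least as fast as it can
    be indexed, because the backlog then never shrinks.
    """
    net_rate = index_rate_per_day - new_docs_per_day
    if net_rate <= 0:
        return math.inf
    return backlog_docs / net_rate


# Hypothetical figures, chosen only for illustration:
# 2 billion existing documents, 10 million indexed per day,
# 4 million new or modified documents per day.
print(round(days_to_catch_up(2e9, 10e6, 4e6), 1))

# If the effective indexing rate erodes to 4 million per day,
# the index never catches up.
print(days_to_catch_up(2e9, 4e6, 4e6))
```

With these made-up numbers the backlog takes roughly 333 days to clear, and if the effective indexing rate falls to the rate at which new data arrives, the backlog is never cleared at all.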
Another issue facing organizations deploying index-everything solutions is that end users may
be creating and modifying documents faster than the solution can index them. As a result,
there is a widening gap between the state of data in the wild and the solution’s picture of that
data, leading to incomplete search results. Equally troubling, search results may include files
that were moved after the search engine indexed them, and so they appear in the results but
cannot be viewed, retrieved, or preserved. End users clicking on the link to an item may receive
an error similar to the “404 Error: File Not Found” that everyone has experienced when
browsing the web. This presents a significant defensibility problem in e-discovery, and IT teams
often end up tracking down these missing files one-by-one to ensure they are preserved. The
result is that organizations may be exposed to unnecessary legal risk while IT teams have the
additional burden of manually tracking down hundreds of files for each legal matter.
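The stale-result problem described above can be illustrated with a short check that separates search hits still present on disk from those that have since moved. This Python sketch is an assumption-laden illustration, not any vendor's implementation: the function name split_stale_hits is hypothetical, and it assumes that search hits are plain file paths reachable from the machine running the check.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List, Tuple


def split_stale_hits(hit_paths: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Partition indexed paths into (still present, missing on disk)."""
    present: List[str] = []
    missing: List[str] = []
    for path in hit_paths:
        if Path(path).is_file():
            present.append(path)
        else:
            missing.append(path)
    return present, missing


# Demonstration: a temporary directory stands in for a file share.
with tempfile.TemporaryDirectory() as share:
    kept = Path(share) / "contract.doc"
    kept.write_text("draft")
    # This path was "indexed" earlier, but the file no longer lives here.
    moved = Path(share) / "memo.doc"
    present, missing = split_stale_hits([str(kept), str(moved)])
    print(len(present), "retrievable;", len(missing), "need manual follow-up")
```

Every path in the second list is a file that someone would have to track down by hand before it could be preserved.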
A Better Approach to Collection and Preservation
Recognizing the challenges of collection and preservation, Clearwell has developed a targeted
approach that enables organizations to defensibly collect and preserve data without increasing
the work of IT or exposing the organization to risk. Targeted collection provides an easy way for
IT or Legal teams to collect from all critical data sources and securely manage collected data in
a preservation store for the duration of a case. Unlike index-everything and preserve-in-place
approaches, Clearwell is up and running quickly, delivering value in hours or days without the
cost and complexity of lengthy multi-month deployment timelines. In addition, Clearwell’s
targeted collect-and-preserve approach has a number of benefits over in-place approaches:
Minimal impact to IT infrastructure: Clearwell only collects potentially relevant data from
custodians involved in a case or investigation, targeting resources at the most important data
instead of wasting resources on indexing all data across the entire organization. As a result,
targeted collection has less impact on existing applications and storage systems, does not
cause significant increases to disk I/O or network consumption, and does not require agents to
be installed on client machines or servers.
Finds all critical data: Purpose-built to support the complex and difficult-to-read file types
required by e-discovery, Clearwell can index and search all critical content such as nested
container files, encrypted files, images containing text, and hidden content.
Up-to-date collection: Clearwell collects all relevant data for e-discovery by targeting
information that is related to custodians in the case. Because this approach is not limited by
legacy indexing approaches, Clearwell is able to collect data that has been recently modified or
Maintains existing workflow: With Clearwell, end users are able to continue using their existing
workflows and business processes without interruption. Using targeted collection, Clearwell
can collect data in the background without altering data where it resides. When users create or
modify files in the normal course of business, Clearwell incrementally collects new data
Reduces risk: Targeted collection significantly reduces the risk of spoliation by retaining data in
a secure preservation store, providing a defensible process that maintains chain of custody. As
a result, data cannot be tampered with by end users or accidentally lost on laptops, desktops, or
other data repositories not under the control of IT.
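One common way to make a claim of untampered preservation checkable is to record a cryptographic hash of each file at collection time and compare it later. The Python sketch below shows that general technique only; it does not describe how Clearwell's preservation store works, and the function names (sha256_of, build_manifest, changed_files) and the in-memory manifest format are assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(store: Path) -> Dict[str, str]:
    """Map each file's path (relative to the store) to its hash."""
    return {
        str(p.relative_to(store)): sha256_of(p)
        for p in sorted(store.rglob("*"))
        if p.is_file()
    }


def changed_files(store: Path, manifest: Dict[str, str]) -> List[str]:
    """List manifest entries that are now missing or have different content."""
    current = build_manifest(store)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)


# Demonstration: a temporary directory stands in for a preservation store.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "email.msg").write_bytes(b"original content")
    manifest = build_manifest(store)      # recorded at collection time
    (store / "email.msg").write_bytes(b"edited content")
    print(changed_files(store, manifest))  # prints ['email.msg']
```

A manifest like this would itself need to be stored somewhere end users cannot modify it; otherwise the comparison proves little.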
Collecting and preserving evidence are critical steps in the e-discovery process. Solutions that
promote indexing everything as the optimal solution for your e-discovery problems might be
conceptually promising, but create new challenges for IT and increase risk in practice. As a
result, organizations are seeking a solution that enables them to respond effectively to e-
discovery without causing major disruptions or exposing the organization to additional risk.
Clearwell’s targeted approach solves the challenges of collection and preservation by making it
easy to collect data from all critical data sources and preserve data defensibly, without
incurring greater risk or disrupting the organization’s business processes.