Archive for the ‘information intelligence’ Category
By utalley on November 17th, 2011
eDiscovery: Incremental, Single-Instance Collections vs. Data Re-Use
I recently read an article in LTN covering the Guidance product announcement – “Guidance Adds Data Re-use Feature to EnCase eDiscovery”. We applaud their addition of this feature since at StoredIQ we’ve had this capability for many years and believe it is a fundamental component for conducting thorough, legally sound collections. Our term for ‘Data Re-use’ is ‘Incremental, Single-Instance Collection’. What does this mean? In instances where the same files are relevant for multiple cases, StoredIQ will copy and place on legal hold only a single instance of that file. If that file is required for multiple matters, each matter will utilize that single copy, saving storage space as well as the time and bandwidth required to collect the data. And, with incremental collections, only files that are new or have been modified since the last collection will be collected for preservation, further streamlining the collection process. Only when all matters for a given file are concluded, and the obligation for legal hold is removed, will the file be available for disposition from the repository.
Possibly because StoredIQ has had this capability for quite some time, we’ve taken for granted that this is a standard feature of any good eDiscovery technology that has a collections component. The LTN article raised our awareness that this is something we should talk about more.
Maybe more newsworthy than the addition of this feature is the fact that Guidance has not had this capability until now. It should make their customers wonder how many case collections have been jeopardized by the inability to search the preservation location used by previous and concurrent ongoing cases.
I recently read an article in LTN by Evan Koblentz covering the Guidance product announcement: “Guidance Adds Data Re-use Feature to EnCase eDiscovery”. After some discussion at StoredIQ, we’re actually pretty excited about the coverage. It sheds light on a capability that we’ve had for years and probably don’t talk about enough. In fact, the article also highlights several competitors that still don’t have it. The StoredIQ term for ‘Data Re-use’ is ‘Incremental, Single-Instance Collection’, but semantics aside, we believe it’s a fundamental component of thorough, legally sound eDiscovery collections.
What does this mean to eDiscovery customers? The first time a file is relevant to a case, we’ll take a forensically sound copy and place it on a retention server for preservation with a litigation hold tag specific to the given matter, without altering the metadata and without interrupting end users. If it’s an ongoing case, we’ll perform incremental collections, meaning that we’ll only take another copy if that file has changed (or if new relevant files are created). When another case crops up and the same file is once again relevant, StoredIQ is aware that the file is already on retention. Instead of spending the time, bandwidth, and storage space to collect another copy, StoredIQ just places an additional hold tag on the file. If your company is in a highly litigious industry or has a number of serial litigants, you can imagine the savings this can add up to over time. Only when all matters for a given file are concluded, and the obligation for legal hold is removed, will the file be available for disposition from the repository.
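The hold-tag mechanics described above can be sketched in a few lines of Python. This is only an illustration, not StoredIQ's implementation: the `RetentionStore` class, its method names, and the use of a SHA-256 content hash to recognize a file that is already preserved are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PreservedFile:
    content: bytes
    holds: set = field(default_factory=set)  # matters with an active hold


class RetentionStore:
    """Hypothetical single-instance preservation store keyed by content hash."""

    def __init__(self):
        self.files = {}  # SHA-256 hex digest -> PreservedFile

    def collect(self, content: bytes, matter: str) -> bool:
        """Preserve content for a matter; return True only if a new copy was stored."""
        digest = hashlib.sha256(content).hexdigest()
        entry = self.files.get(digest)
        copied = entry is None
        if copied:
            # First time this exact content is seen: store one copy.
            entry = self.files[digest] = PreservedFile(content)
        # New or already preserved, the matter gets its own hold tag.
        entry.holds.add(matter)
        return copied

    def release(self, matter: str) -> list:
        """Drop a matter's hold tags; return digests that now carry no hold at all."""
        eligible = []
        for digest, entry in self.files.items():
            entry.holds.discard(matter)
            if not entry.holds:
                eligible.append(digest)
        return eligible
```

In this sketch, collecting an unchanged file for a second matter only adds a tag, a modified file hashes differently and is therefore copied again (the incremental case), and a file becomes eligible for disposition only after every matter holding it has been released.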
Possibly because StoredIQ has had this capability for quite some time, we’ve taken for granted that incremental, single-instance collection is a standard feature of any intelligent eDiscovery technology that has a collections component. More importantly, it is a feature that eDiscovery customers should consider closely. Note that the article also mentions that this feature enables users to “search collection sets from previous litigation”. That statement alone makes me wonder how many case collections have been jeopardized by the inability to search and produce data from the preservation location used by previous and concurrent ongoing cases.
On a broader scale, Koblentz states in the LTN article, “Data reuse is a growing trend in the e-discovery industry.” We at StoredIQ actually see ‘data reuse’, to use the same term, as a trend that goes well beyond eDiscovery. The data that your legal team needs to identify and collect for a legal matter is the same data that your records management team needs to classify, your IT team needs to store and manage, and your compliance officers need to govern. At the end of the day, your corporate data is being ‘re-used’ by multiple departments, not just by the legal team for multiple matters.
What companies need is the ability to identify, classify, manage, and act on their data assets – to provide value across the entire organization. That’s something you won’t get from Guidance, or any point solution eDiscovery product. At StoredIQ, we’re focused on delivering powerful information governance products that can provide the comprehensive data insight and control that corporate counsel, compliance managers, and records managers need to make the best and most informed decisions, while meeting the stringent requirements that IT departments demand.
TOPICS: eDiscovery, information governance, information intelligence, information management
By utalley on August 1st, 2011
Using Data Mapping and Assessment to Minimize eDiscovery Cost and Risk
Last week Dennis Kiker contributed an interesting article to Law Technology News entitled “How To Manage ESI To Rein In Runaway Costs”. At the heart of the problem is that we’re a country of corporate data hoarders. We keep data past its expiration, we don’t have a good system in place for categorizing and managing it, and we are overwhelmed when a legal request necessitates identifying and collecting data relevant to a case. Dennis states:
Despite the high cost of its painstaking preservation and storage, much of this data will never be relevant to any legal case. Indeed, according to a 2009 survey by Framingham, Mass.-based IDC, 60 to 80 percent of the information retained by corporations in America has no value from a business or legal perspective.
Legal departments have historically focused on the ‘right side’ of the Electronic Discovery Reference Model (EDRM): the analysis and review stages. However, if the data collected into the review platform is unnecessary, insufficient, spoiled, or irrelevant, the organization’s legal cost and risk increase significantly.
Kiker goes on to say… the best approach for many companies is to get serious about cleaning up their information environments. By “taking out the trash” in a major way, companies stand to make big cuts in their annual data-storage bills, which can also run into the six figures. This also enables them to more quickly and more accurately identify potentially relevant information for the attorneys to sift through during a review process, potentially lowering their legal bills.
Legal teams are increasingly realizing the business value and ROI from strengthening their company’s ‘left-side’ EDRM capabilities and understand that sound information governance practices result in highly targeted and effective eDiscovery.
The article points out that shrinking the overall stack of data is a good start to minimizing eDiscovery costs, but companies also need to find all the relevant information contained in their data. Kiker says:
Data mapping offers a way to solve this problem. The basic idea is to create a master index that spells out exactly where content is stored. Surprisingly, many companies have never taken this critical information management step.
In fact, reflecting on the Carmel Valley eDiscovery Conference, Barry Murphy commented on his blog: “Get specific. Know where data lives and do the data maps. It’s impossible to preserve data if you don’t know where it is.”
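To make the “master index” idea concrete, here is a minimal sketch of a data map built over a single directory tree. It is an assumption-laden toy, not a description of any vendor’s product: the `build_data_map` function is hypothetical, and grouping by file extension stands in for the much richer classification a real data map would record.

```python
import os
from collections import defaultdict


def build_data_map(root: str) -> dict:
    """Map each file extension to the paths and total bytes found under root."""
    data_map = defaultdict(lambda: {"paths": [], "bytes": 0})
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            # Normalize the extension; files without one are grouped together.
            ext = os.path.splitext(name)[1].lower() or "(none)"
            data_map[ext]["paths"].append(path)
            data_map[ext]["bytes"] += os.path.getsize(path)
    return dict(data_map)
```

Even an index this simple answers the two questions the quotes above raise: where a given kind of content lives, and how much of it there is.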
At StoredIQ we couldn’t agree more. To prove it, during the month of August, StoredIQ is extending a promotional offer for our data assessment and mapping service. The first 10 qualified companies will pay only $10,000, a savings of $5,000 off list price.
StoredIQ Data Assessment Services provide unprecedented visibility into unstructured data across the enterprise. This service quickly gives organizations a critical understanding of their business content so they can make more informed decisions about the management, retention, and disposition of their data.
To learn more about this offer and to take the first step toward managing your escalating ESI-related costs and risk – contact us today!
TOPICS: data assessment, eDiscovery, information governance, information intelligence, information management, litigation readiness, records management
By pmyers on June 16th, 2011
Fighting for Last Year’s Prize
Phil Myers, CEO, StoredIQ
Predictive coding patent dispute a ‘pebble in the storm’ for firms seeking to reduce the cost and complexity of eDiscovery.
We’ve been watching as the furor grows around Recommind’s recent claims to own the patents that will make computer-expedited review (AKA predictive coding) a proprietary process. The implications may be onerous to be sure. With Recommind out enforcing this patent against any and all review tools that use some form of it, vendors could be wrapped up in legal disputes for years to come. The irony of this is that the very vendors who built their businesses to drive down the cost and complexity of eDiscovery may now be on a path to help it spiral out of control because of their own desire to control the market.
At StoredIQ, we’re focusing our energies on offering our customers a different perspective that will have a more meaningful impact on their future. While most of the focus in the press has been around the validity of Recommind’s patent and even the competitive aspects of whether or not Recommind can or should enforce a claim like this, we think that the pragmatic view of this dispute is that it is ‘fighting over last year’s prize’. That’s because in reality the market has already moved past review automation and is now starting to focus on the bigger problem of what happens before you ever hit review and employ predictive coding: the problem of over-collection. In fact, studies have shown that a majority of the time spent in review is wasted looking at documents that should never have been collected in the first place.
Understanding large data pools well enough to extract and collect relevant subsets for both reactive eDiscovery and proactive Information Governance is the single biggest cost reduction exercise any enterprise can focus on.
Gartner says eDiscovery costs an average of $18,750 per gigabyte. To drive down these costs, the industry needs innovative solutions that can quickly identify, analyze and collect information that’s relevant. What is lost in most of this discussion is that a predictive coding patent is a nice innovation for review but it does little to solve the primary problem of information collection, governance and management … by far the biggest cost reduction opportunity in eDiscovery.
For those who have tried to apply these computer-expedited review technologies earlier in the cycle as collection solutions, all kinds of problems have emerged … from defensibility to chain of custody questions to trust that the ‘black box’ algorithms have found everything. The idea of merely collecting ‘like’ documents that defy standard categorization from a large data pool, based on a small sample, is widely viewed for what it is … a shortcut that ‘guesses’ that the sample collected is representative of a much larger pool of data ‘in the wild’.
The good news is that there is already a better way to solve this problem. Solutions are available that can proactively and efficiently index data ahead of time and store the intelligence about the data for discovery. Hardened over the last decade, they are now proven to scale and have automated change management built in. At StoredIQ, we can provide a high-speed, high-precision discovery solution that works across petabytes of data. Contrary to the belief of many at the time we started down this path, we’ve found that this approach is the fastest, most reliable and lowest cost means of producing defensible data for legal. Now that we are in production with hundreds of installations, the results speak for themselves.
Take the recent Gulf oil spill matter, for example. The customers using our approach have completed their data collection and validated a legally defensible dataset for review by both government regulators and other litigants. In one case, the results reduced the amount of data by 100:1 over a very large data pool. This was all done prior to review, and the indexing, identification, collection and processing took less than a month to complete. The cost savings were significant: 95% compared to service provider collection, plus a mind-numbing amount in avoided review costs. The litigants using the old school approach of collecting everything and paring it down in review with tools like predictive coding? Well, they’re still processing, and their costs are not something anyone wants to talk about.
This patent dispute is hot right now but the focus will soon shift. As Anne Kershaw and Joseph Howie found in their October 2010 study, ‘Crash or Soar’, there are advantages to predictive coding when compared to linear review. But the real advantage came when the technology was deployed after the data had been culled, because then the data truly was uniform. The point of their study and that of many others is that the capability can serve a valuable purpose in the right environment. But without a foundation of good information management in front of it, the predictions are suspect at best. As is the value of Recommind’s patent.
Leave your comments. We welcome your thoughts and suggestions.
TOPICS: eDiscovery, information governance, information intelligence, information management