A friendly reminder that all of the opinions expressed here are completely my own and not those of my employer.
I am late this year putting together my thoughts on trends for 2014 in ECM. To be frank, many of the trends in ECM seem obvious, with much already having been written about them.
- Everyone is moving to the cloud, and this is no longer trend-worthy news.
- There will be a few acquisitions, especially among the mid-tier players, to round out capture, workflow, and mobile capabilities.
- IPOs of a few key players will be frequently discussed but deferred until 2015.
- Dropbox will relaunch its business offering AGAIN. Look for them to acquire overlapping tools to gain this foothold.
I struggled to find something more substantive to cover until I found myself in a lively discussion on the topic of “Big Content.”
Commercialization of value extraction from the enormous amounts of unstructured data being generated today is the next major focus for advancement in the ECM industry. It goes beyond improving transactional throughput and accuracy and into understanding. It may be ironic, since I am one of the “Big Men On Content,” but I do not like the term “Big Content.” The term perpetuates outdated stereotypes and division at a time when technologies should be coming together.
To understand what is meant by this term I did what anyone else would do: I went to bigcontent.com. Surely the genius who had the foresight to grab the URL could tell me whether it really is separate from big data. I was disappointed. A marketing firm jumped on the term and is using it in a completely different context. Good for them, but it is perhaps a missed opportunity from an ECM perspective.
Big Content, as it relates to ECM, seems to be the content management industry’s attempt to ride the coattails of Big Data marketing: a never-ending quest to be appreciated as much as the more popular sibling, structured data.
So what do we mean by Big Content? Is it a subset, a superset, or something altogether different from what we are now calling Big Data? EMC’s Dave Dietrich wrote this piece on Big Data misconceptions and takes the position that unstructured data that rises to Big dimensions is a subset.
To be “big” in this context, Dietrich argues, the data in question must have great volume, yes, but also both variety and velocity. Certainly some unstructured data has these characteristics. He goes so far as to say most Big Data problems are grounded in the unstructured, citing last year’s IDC Digital Universe study.
Gartner’s Darin Stewart seems to agree that Big Content is a subset of big data. He goes on to posit that the slow uptake of interest in the unstructured aspect stems from IT’s lack of “comfort” in dealing with documents as opposed to databases. He touches on what I feel is the crux of the issue, but I don’t think it has anything at all to do with comfort itself. I think it is the utter lack of an integrated tooling approach across the industry.
All silos begin as words.
If you make it a separate category, you may one day have tools that let you do meaningful things, but without a common analytical approach it will perpetuate the integration burden the structured and unstructured worlds deal with today. Deriving value from structured data will always be easier, and as separate solutions, structured data applications will continue to hold the attention of buyers.
The very things long-term ECM proponents hope to achieve by trumpeting Big Content will continue the technological isolation and lack of innovation the industry has struggled with for a decade. What is needed is a coordinated drive to raise expectations of the emerging structured analytical tools to demand search, content analytics, sentiment analysis, and more. Some offerings, particularly those tailored to social media analytics, have begun this work, but we must continue to push beyond 140 characters to more valuable and information-rich content.
We do not need a category for Big Content tooling, marketing, and expertise. We already have one. It is called Big Data. And it is very nice.
Investment in this is happening whether you call it Big Content or not. IBM’s billion-dollar investment in Watson is the best example. From a user perspective, Watson does not distinguish between structured and unstructured sources when a question is presented. Likewise, as we begin to think about other analytical frameworks from a user’s point of view, the difference in the structure of the data sources should matter less over time and eventually disappear altogether.
One might ask, isn’t this just the same content and semantic analytics that we have been talking about for years? The answer is “sort of.” You will be hard-pressed to find any of the lofty promises of those initiatives fulfilled. It is my contention that this tooling needs to scale and be formally folded into the analytical tool set of big data. The correlated structured data provides context for the extracted unstructured content. At some level this is happening, but as an industry I think we derail this momentum when we attempt to create differentiation in categories.
This convergence of structured and unstructured analysis will be hampered if we spend undue mental and marketing energy today perpetuating a separation of the disciplines simply to defend the value of our current expertise.
The trend has begun. We need to help it along or get out of the way.