Opinions and discussion on content management and document management by two of the biggest guys in the business. *Measured by combined weight

XML Repositories – If You Can’t Beat ‘Em – Open Source ‘Em

Author

Follow Marko

Disclaimer

The opinions shared here are those of the individual contributors and not those of their employers or of Big Men On Content as a whole.

Recently I read that Oracle contributed their XQilla XQuery Engine to the Apache open source community. The engine and the Oracle Berkeley DB XML repository are well known projects in the XML repository space. Despite the humanitarian rhetoric (and active contribution) that surrounds announcements like these, a large company seldom has any motivation for open sourcing its developments other than money.

I’ve written a bit over the last few months on XML content repositories in reference to EMC acquisition activities. The topic is of particular interest because it has one quality that is lacking in almost everything I read about ECM. It’s different. Using XQuery (essentially XML) as a foundational technology in building a content management solution is not unique or even that new. Astoria was the first I looked at several years ago, but in recent years the advancement of DITA and the prevalence of XML technology in general have brought this product space into its own. Mark Logic and EMC’s acquisition, X-Hive, are two of the best examples. While these are truly platforms, other vendors have built vertical solutions using the technology around procedural content that lends itself to more rigid structural models. There are over 50 XQuery implementations listed by the W3C.

A New Way of Doing Business

What all XML repositories have in common is the fundamental adoption of XQuery over SQL as the query language for content discovery and meta-data management. This mechanism goes beyond simple support for an API on top of a relational database architecture, integrating XQuery into the very core of the system. To me this is possibly the greatest advancement in content and information access since the release of SQL-86. Even if I am exaggerating a bit – assuming XQuery is a promising paradigm for data access (particularly for content), why would Oracle give their implementation away? You don’t see them open sourcing Stellent UCM.

Oracle came to prominence in the database market, not because they owned the SQL standard, or even because they implemented it. Their extensions to the standard were simply better. Scalability, Oracle Forms, stored procedure implementations and proprietary language extensions made it a coder’s favorite. It’s admittedly a wild oversimplification of the issue, but I still believe XQuery and systems built on this technology are a threat to the dominance of SQL as the programmer’s choice for data qualification, particularly where SOA and/or content management systems are involved.
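
For readers who have not seen what querying the content itself looks like, here is a minimal sketch in Python. It is not XQuery and it is not any vendor's API: it uses the limited XPath subset in Python's standard xml.etree.ElementTree module as a stand-in, and the document, element names and function names are all made up for illustration. The point is only that the query addresses the structure of the document directly instead of rows that something else had to extract from it first.

```python
import xml.etree.ElementTree as ET

# Hypothetical DITA-like document; the element and attribute names
# are illustrative assumptions, not a real schema.
DOC = """
<manual>
  <topic id="t1" audience="admin">
    <title>Installing the server</title>
    <step>Unpack the archive</step>
    <step>Run the installer</step>
  </topic>
  <topic id="t2" audience="user">
    <title>Logging in</title>
    <step>Open the login page</step>
  </topic>
</manual>
"""


def titles_for_audience(xml_text, audience):
    """Return the titles of all topics aimed at the given audience."""
    root = ET.fromstring(xml_text)
    # Roughly the XQuery:
    #   for $t in //topic[@audience = $audience] return $t/title/text()
    topics = root.findall(f".//topic[@audience='{audience}']")
    return [t.findtext("title") for t in topics]


def step_counts(xml_text):
    """Map each topic id to the number of steps it contains."""
    root = ET.fromstring(xml_text)
    return {t.get("id"): len(t.findall("step")) for t in root.iter("topic")}


if __name__ == "__main__":
    print(titles_for_audience(DOC, "admin"))
    print(step_counts(DOC))
```

In a relational design the same questions would usually mean shredding the document into topic and step tables up front, or storing it as a blob and indexing a handful of extracted fields. A native XML repository is meant to skip that step and run the query against the content as it was authored.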

Predatory Open Sourcing

Companies, particularly large publicly traded ones, don’t open source their software assets unless there is value to be gained. This value must exceed whatever the market potential is for proprietary development. So, how do they derive this value? Here are some of my thoughts (cynical though they may be).

  1. Free Research and Development – Many coders have an overly naive belief that this is the real value of open sourcing. People contributing to a common goal for the betterment of all. Well – your employer rarely thinks this way. They want something for free and are happy to let others do the heavy lifting while they collect maintenance payments and keep the risk managers at their corporate clients happy.
  2. Divestiture of Unrelated Technology – There are times when you have the right idea in the wrong place. Not every company has the resources, both mental and monetary, to fully realize the potential of some ideas, and so it will throw them out for the masses. Unfortunately there are still old school CEOs and CTOs out there who think every idea needs to be hidden under a rock until they themselves can profit from it. These dinosaurs thankfully are slowly being crushed into fuel for the rest of us.
  3. Pure Marketing – It’s been said that there is no such thing as bad publicity, and while I think the captain of the Exxon Valdez would disagree, getting your name out there and associated with good solid innovation is a great thing even if there is not a direct correlation to license revenue.
  4. Disruption of a Competitor – In some sense, the previous points are really summed up in this one. Lowering my costs, refocusing on my core and getting my name out there are means to the same end. Competitive advantage. If I am having trouble with a competitor, the best way to throw him off is to devalue his product. It is VERY HARD to compete with free.

So why should Oracle release their XML repository and XQuery technology into the wild? One reason leaps to mind – aggravate Mark Logic and company. Folding XQilla or Berkeley capability into the core of Stellent UCM would seem a reasonable product strategy. Why not there? Perhaps there is more XQuery in the future for content management at Oracle, but as pure SQL-dependent APIs fall out of fashion, will the glory that is the open database platform be relegated to platform-independent service calls? One could only hope.

All is Fair in Love and Software

It has to be noted that Oracle is just as much a target of predatory open sourcing as they are a perpetrator. IBM’s investment in open source Oracle clone EnterpriseDB rounds out Big Blue’s open source portfolio and is the most recent example of IBM trying whatever it can to unseat Oracle from its lofty perch. I guess making DB2 less of a pain is harder than giving away a PL/SQL emulator.

Happily Ever After

What then is to become of the XML repositories that started me on this rant? In the end the market will be the judge, but I would encourage you to read up on the technology and familiarize yourself with the possibilities. Take a look at MarkMail, a handy example of the power of XQuery applied to mailing list archives that date back to the dawn of recorded internet history (1994). Simply put, the industry needs better options than traditional DBMS-indexed file systems for accessing the legendary 80% of corporate data locked away in towers of unstructured content. Open source or not, XQuery-driven repositories may hold the key.
