Opinions and discussion on content management and document management by two of the biggest guys in the business. *Measured by combined weight

Lost Knowledge – DM Partner’s Essence

Disclaimer

The opinions shared here represent those of the contributor themselves and not those of their employers nor that of Big Men On Content as a whole.

I had the incredible fortune last month to find myself in the hearts of two very different ancient civilizations. I hinted in a previous article at my trip to the Acropolis of Athens, the very birthplace of Western Thought. It was a planned trip with all the expected turns and a few surprises. My second journey found me climbing the 3rd largest pyramid in the world at Teotihuacan, outside Mexico City. It was an unplanned trip that, even with some quick research, would shake the foundations of my global knowledge.

I had planned to write about Western Thought and its ties to content management today, but as I walked in the footprints left so many years ago, I realized that the current path often leaves behind many things, only for them to be relearned later. It is with this in mind that I’d like to go back in the history of content management and shine a light on some technologies that have disappeared.

In Search of Meaning
In 2000 I was introduced to a very interesting company out of Belgium called DM Partners. DM Partners was a small linguistics company backed by capital from Lernout & Hauspie, at the time a giant in the linguistics field. Their lead product, Scout, was a multilingual search engine, but what really caught my attention was Essence.

Essence itself was originally a component of Scout. Its role was to take the multilingual search results and summarize them into short paragraphs to allow review of the content. The company recognized that machine translation was not 100% accurate and saw a summary as a way to let a searcher quickly decide whether further translation of the document was needed. But Essence was not a second-fiddle product.

Getting to the Essence
Essence in itself was a very powerful tool. It would take any amount of content and summarize it into a short paragraph of a length of the user’s choosing. They even had an on-line demo available. You would simply give it the content, in the case of the demo a URL, and then tell it how many sentences you wanted. It would then summarize the page into a single paragraph of, in some cases, very long sentences. I used the product almost daily to summarize all of the press releases, mostly “Barney” releases, being pushed out at the time.

Better still, with the real product you could give it a collection of content and it would read through it and give you a summary. It would spit out keywords, and based on the keywords it would summarize the collection for you. Imagine being able to give an entire thread of emails to a system and have it tell you the basic meaning of it. That technology would be great today.
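To make the keyword step concrete, here is a minimal sketch in Python. It is not how Essence did it (I never saw the internals); it simply assumes that a word mentioned in many documents of a collection is a good keyword candidate. The `STOP_WORDS` list and the `collection_keywords` function are hypothetical names made up for this illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would use a fuller, per-language one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "for", "on", "by", "was", "are", "be", "this", "that", "with"}

def tokenize(text):
    """Lowercase the text and keep only words that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def collection_keywords(documents, top_n=5):
    """Rank words by how many documents mention them, then by total count."""
    doc_freq = Counter()   # number of documents containing the word
    total = Counter()      # number of occurrences across the whole collection
    for doc in documents:
        words = tokenize(doc)
        total.update(words)
        doc_freq.update(set(words))
    ranked = sorted(total, key=lambda w: (-doc_freq[w], -total[w], w))
    return ranked[:top_n]

emails = [
    "The budget review is on Friday.",
    "Please send the budget numbers before the review.",
    "Budget approved, thanks.",
]
print(collection_keywords(emails, top_n=2))
```

For this small thread the two top keywords come out as "budget" and "review", which is about what a human skimming the emails would say they were about.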

Under the Covers
So how did this work? Honestly, it’s beyond me; the development team made me feel like a computer newbie. Most were from the field of computational linguistics. Basically, they focused on taking words and developing mathematical computations for them. (A great knowledge to have when the Terminators take over 🙂)

Different rankings were given based on the number of occurrences of a word and each individual word’s importance. Add to this the recognition of parts of speech, especially tenses and grammatical gender, and you had a very powerful tool. AND IT WORKED.
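For the curious, the ranking idea described above can be sketched in a few lines of Python. This is only my rough approximation, not the Essence algorithm: it scores each sentence by the occurrence counts of its words multiplied by an optional importance weight, and it leaves out the part-of-speech recognition entirely. The names `summarize`, `importance` and `STOP_WORDS` are assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; these words are ignored when scoring.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
              "by", "was", "for", "on"}

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]

def summarize(text, num_sentences=2, importance=None):
    """Return the highest-scoring sentences, kept in their original order.

    importance maps a word to a weight (default 1.0), standing in for the
    per-word importance mentioned above.
    """
    importance = importance or {}
    sentences = split_sentences(text)
    words_per_sentence = [re.findall(r"[a-z']+", s.lower()) for s in sentences]
    counts = Counter(w for words in words_per_sentence
                     for w in words if w not in STOP_WORDS)

    def score(words):
        # Sum of (occurrence count x importance weight) over the scored words.
        return sum(counts[w] * importance.get(w, 1.0)
                   for w in words if w not in STOP_WORDS)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-score(words_per_sentence[i]), i))
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)

text = ("Search engines index documents. Cats sleep a lot. "
        "Search engines rank documents by relevance. The weather was fine.")
print(summarize(text, num_sentences=2))
```

Because the score is a plain sum over the words in a sentence, a sketch like this tends to favor longer sentences, which may be one reason the demo summaries sometimes came back as very long sentences.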

Where Are They Now?
Unfortunately, DM Partners was a victim of the fallout from Lernout & Hauspie. Being venture backed by L&H, DM Partners found itself being sold when L&H imploded. L&H’s collapse was the largest commercial collapse prior to Enron. After the acquisition, neither Scout nor Essence was heard of again.

But all may not be lost. It looks like a little company called Attensity may have a very similar offering. Imagine that: reviewing a series of emails may not take hours after all.

Categorised in: Content Management, Lost Knowledge, Technology, Text Analytics, Translation