Opinions and discussion on content management by two of the biggest guys in the business. *Measured by combined weight

The ROI on ECM – Estimating Storage


The opinions shared here represent those of the contributor themselves and not those of their employers nor that of Big Men On Content as a whole.

Lee and I were talking recently about one of the most common questions we hear from clients implementing or managing a content management system (CMS): how do you calculate the return on the system?  Companies end up spending hundreds of thousands to millions of dollars on a CMS with no idea how to track the value it brings to their organization.


There is no single, simple formula to calculate ROI for a CMS.  Instead, what we have discovered is that there is a series of calculations one can make to understand the costs and savings of the system.  These numbers can then be used to calculate the return on the implementation.  To introduce these calculations we decided to start a series of posts, with this being the inaugural issue.  Since this is the first one, we’ll start with a simple yet regularly asked question: how much storage do I need?

Calculating storage is a rather simple task.  All you really need to do is think about what is actually being stored in the CMS.  A storage estimate looks at objects, versions, and renditions.

Part of the series Calculating the Return on Investment with ECM.


The basic formula is:

    objects * versions * (object size + (renditions * rendition percentage * object size))
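As a minimal sketch, the basic formula can be written as a small Python function. The function name, parameter names, and the numbers in the usage line are our own illustrative assumptions, not figures from a real system.

```python
def estimate_storage(objects, versions, object_size, renditions, rendition_pct):
    """Estimate storage using the basic formula.

    The result is in whatever unit object_size is given in (KB, MB, ...).
    """
    # Combined size of all renditions of a single version of an object
    rendition_size = renditions * rendition_pct * object_size
    return objects * versions * (object_size + rendition_size)


# Hypothetical inputs: 1,000 objects, 3 versions each, 250 KB average size,
# one rendition at half the size of the original
total_kb = estimate_storage(
    objects=1_000, versions=3, object_size=250, renditions=1, rendition_pct=0.5
)
print(f"{total_kb:,.0f} KB")
```

Keeping the result in the same unit as the object size avoids hiding a conversion inside the formula.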

The first component is very simple: the number of objects that will be stored by the system.  The more accurate this number the better, and what is really needed is a count of the individual objects.  With Office documents this is rather easy, as each one is a single file and counts are usually easy to estimate.  It gets a bit more difficult with HTML or XML documents, as both can be made up of several content objects.  This means that you will need to estimate how many images each HTML page uses, or how many files make up each XML document.


While we’re talking about objects, the next number needed is the object size, or the average size of the objects being stored in the system.  Depending on the type of content in the system this will be measured in either kilobytes (KB) or megabytes (MB); whichever you choose, make sure the same unit is used throughout the calculation.  As a rough guide, HTML or XML files usually average around 50 KB, Word documents around 250 KB, and PowerPoint files around 2 MB.  Each organization does things its own way, though, so if you use a lot of images in your Word documents you could easily find your average size in the megabyte range.


While the basic formula looks at only one average file size, it can be made more accurate by treating each file type as an individual collection of objects.  For example:

   PowerPoint objects * versions * (object size + (renditions * rendition percentage * object size)) +

   Word objects * versions * (object size + (renditions * rendition percentage * object size))
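One way to sketch the per-file-type version in Python is to describe each file type separately and add up the results. The `ContentType` class is hypothetical, and the object counts, version counts, and rendition figures are made up for illustration; only the 2 MB and 250 KB average sizes come from the rough guide above.

```python
from dataclasses import dataclass


@dataclass
class ContentType:
    """One file type, treated as its own collection of objects (hypothetical helper)."""
    name: str
    objects: int
    versions: int
    object_size_kb: float
    renditions: int
    rendition_pct: float

    def storage_kb(self) -> float:
        # Basic formula applied to this file type only
        rendition_kb = self.renditions * self.rendition_pct * self.object_size_kb
        return self.objects * self.versions * (self.object_size_kb + rendition_kb)


# Illustrative counts; PowerPoint at 2 MB is expressed as 2,000 KB so the units match
content_types = [
    ContentType("PowerPoint", objects=2_000, versions=3, object_size_kb=2_000,
                renditions=1, rendition_pct=0.5),
    ContentType("Word", objects=10_000, versions=3, object_size_kb=250,
                renditions=1, rendition_pct=0.5),
]

for ct in content_types:
    print(f"{ct.name}: {ct.storage_kb():,.0f} KB")

total_kb = sum(ct.storage_kb() for ct in content_types)
print(f"Total: {total_kb:,.0f} KB")
```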


Versions are the number of copies of the object that will be stored in the system.  Typically this ties to object state, such as work in progress (WIP), staging, or production copies.  It is easy to say that a web system, for example, will have only three versions of an object (WIP, staging, and production), but often there will be several WIP copies stored in the system.  An object can also be sent back from staging, and this would add another version to the system.


The version count also needs to reflect that documents continue to be edited over time, producing what are sometimes called editions.  This needs to be considered together with the object count, because over time changes to an existing object can either start a new version tree or simply continue the existing tree.


Looking at the formula you will also see that versions multiplies not only against the object size but also against the renditions.  This is because renditions are often made for each version or state of the document.  If this is not the case, the calculation can be split by state so that each state uses its own version and rendition counts:

   objects * WIP versions * (object size + (renditions * rendition percentage * object size)) +

   objects * staging versions * (object size + (renditions * rendition percentage * object size)) +

   objects * production versions * (object size + (renditions * rendition percentage * object size))


Renditions are the number of other formats in which the object will be stored in the system.  For example, a Word document may also be stored as a PDF file to save download time or to ensure the document is not edited.  System renditions, like a text rendition for full-text engines, also need to be considered.  Typically an object may have only one or two additional renditions, though when looking at images you can easily have four or five formats in total: original, low resolution image, high resolution image, and thumbnail.


Rendition percentage is a factor used to calculate the average size of the renditions relative to the original object.  I like this approach as it’s easy to estimate.  For example, a PDF is typically 50% of the size of an Office document and a text file is typically 10 to 15%.  Of course, if you know the combined size of an object’s renditions, that total can be used in place of renditions, rendition percentage, and object size, for example:

   objects * versions * (object size + total rendition size)


Furthermore the basic formula can be enhanced by looking at each rendition separately.  For example:

   objects * versions * (object size + (PDF rendition percentage * object size) + (text rendition percentage * object size))
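A minimal Python sketch of this per-rendition variant follows, assuming each rendition is given its own percentage of the original size. The function name, the dictionary keys, and the input numbers are illustrative assumptions; the 50% and 10% figures follow the rules of thumb mentioned above.

```python
def estimate_storage(objects, versions, object_size, rendition_pcts):
    """Estimate storage when each rendition has its own size percentage.

    rendition_pcts maps a rendition name to its size as a fraction of the
    original object, e.g. {"pdf": 0.50, "text": 0.10}.
    """
    # Add up each rendition's size instead of using one average percentage
    rendition_size = sum(pct * object_size for pct in rendition_pcts.values())
    return objects * versions * (object_size + rendition_size)


# Hypothetical library: 5,000 objects, 2 versions each, 400 KB average size
total_kb = estimate_storage(5_000, 2, 400, {"pdf": 0.50, "text": 0.10})
print(f"{total_kb:,.0f} KB")
```

Passing the renditions as a mapping makes it easy to add or drop a format without touching the formula itself.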


Sample Case

Let’s say we’ve been asked to estimate the storage for a press release library.  The system currently has 10,000 Word documents with an average file size of 500 KB.  In addition to the original document, two renditions are needed, each estimated at a quarter of the original’s size.  Each object will have four versions on average.  Here’s how we do this:

   objects * versions * (object size + (renditions * rendition percentage * object size))

   10,000 * 4 * (500 KB + (2 * .25 * 500 KB))

   = 30,000,000 KB, or 30 GB
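The arithmetic can be checked with a few lines of Python. Note that reading 30,000,000 KB as 30 GB assumes decimal units (1 GB = 1,000,000 KB); if your storage is reported in binary units (1 GiB = 1,048,576 KiB), the same figure comes out a little lower. The variable names are our own.

```python
# Inputs from the sample case
objects = 10_000
versions = 4
object_size_kb = 500
renditions = 2
rendition_pct = 0.25

total_kb = objects * versions * (
    object_size_kb + renditions * rendition_pct * object_size_kb
)

# Decimal units: 1 GB = 1000 * 1000 KB
gb_decimal = total_kb / 1000**2
# Binary units: 1 GiB = 1024 * 1024 KiB
gib_binary = total_kb / 1024**2

print(f"{total_kb:,.0f} KB = {gb_decimal:.1f} GB (decimal) or {gib_binary:.1f} GiB (binary)")
```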


After doing some additional research, we find that an HTML copy is needed for the web, and that it is only needed for staging and production.  The system also needs a PDF that can be sent in an email when the document is in production.  As stated earlier, a PDF is typically 50% of the original object size, and the HTML copy is treated as a text rendition at 15%.  Of the four versions, two are WIP, one is staging, and one is production.

   objects * WIP versions * object size +

   objects * staging versions * (object size + (text rendition percentage * object size)) +

   objects * production versions * (object size + (text rendition percentage * object size) + (PDF rendition percentage * object size))

   10,000 * 2 * 500 KB +

   10,000 * 1 * (500 KB + (.15 * 500 KB)) +

   10,000 * 1 * (500 KB + (.15 * 500 KB) + (.50 * 500 KB))


   = 10,000,000 KB +

   5,750,000 KB +

   8,250,000 KB


   = 24,000,000 KB, or 24 GB
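The refined sample case can be sketched in Python by listing, for each state, how many versions it holds and which renditions it needs. The constant names and the layout of the `STATES` table are our own assumptions; the numbers are the ones used in the sample case.

```python
OBJECTS = 10_000
OBJECT_SIZE_KB = 500

# Rendition size as a fraction of the original object
RENDITION_PCT = {"html": 0.15, "pdf": 0.50}

# state -> (number of versions, renditions stored for each version)
STATES = {
    "wip": (2, []),
    "staging": (1, ["html"]),
    "production": (1, ["html", "pdf"]),
}


def state_storage_kb(versions, renditions):
    """Storage for one state: original plus only the renditions that state needs."""
    rendition_kb = sum(RENDITION_PCT[name] * OBJECT_SIZE_KB for name in renditions)
    return OBJECTS * versions * (OBJECT_SIZE_KB + rendition_kb)


per_state = {
    state: state_storage_kb(versions, renditions)
    for state, (versions, renditions) in STATES.items()
}
total_kb = sum(per_state.values())

for state, kb in per_state.items():
    print(f"{state}: {kb:,.0f} KB")
print(f"Total: {total_kb:,.0f} KB")
```

Splitting the estimate by state shows where the saving comes from: the WIP copies no longer carry renditions they never use.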


Categorised in: EMC, ROI on Enterprise Content Management
