Mon 14 Nov 2005
As I was saying in an earlier post (Database vs, Data Warehouse), your Application database is not your Data Warehouse for the simple reason that your application database was never designed to answer queries. Your app is a jumble of tables interlinked to define the entities in your business and their relationships (usually depicted in an Entity-Relationship Diagram). You just have to look at an Entity-Relationship Diagram to know that its not an easy job trying to get some queries and answers out of it. (You can see an example in the article linked to at the end of this post).
To design a good Data Warehouse that actually answers user queries effectively you need to do what is known as Dimensional Modeling.
In a DW, the data is kept in a standard, intuitive framework or structure that facilitates high performance access. This usually includes a Fact Table containing aggregatable facts (such as sum of sales amount) linked to one or more smaller Dimension Tables containing descriptive information (like time, region, product, etc. that you may want to analyze the facts by) in a star like structure called a Star Schema. This is exactly how your application database is not structured.
The process of identifying the facts and dimensions for a business scenario and designing the corresponding Data Warehouse is called Dimensional Modeling. Ralph Kimball’s The Data Warehouse Toolkit: The Complete Guide to Dimensional Modelingis an excellent text that is frequently referred to by amateurs and professionals alike. It is about how to actually design and build a repository that will deliver real value to real people. (If you still don’t have this book yet, please stop wasting your time and go get your copy now).
And while you go get that book, do read this article, an original classic by Ralph Kimball (from way back in August 1997) called the Dimensional Modeling Manifesto.
This is the article that sets the stage (and the manifesto) for the ideas developed by Ralph in his book. And like good wine, its relevance has only increased with time.
Read it, and then go get that book.