MySQL makes an obvious and affordable choice for data warehousing. Its traditional weaknesses (from OLTP system point of view) are actually its strengths when it comes to Data warehousing and OLAP.

If you look at the requirements from a DW, you’d see that unlike OLTP systems, transactions are not required, and 60×24x7 uptimes are not really needed. On the contrary the system should be able to handle queries most efficiently and should provide capabilities for high-speed bulk loading in batch jobs. This is where MySQL shines over competition.

MySQL’s non-transactional MyISAM db engine is one of the fastest around when it comes to querying large amounts of data. You don’t get transaction support, and thats alright because you don’t need transactions (and the huge overheads they bring). And its bulk insert features (LOAD DATA INFILE…) are extemely fast as well!

In case you didn’t know it, MyISAM tables can be merged into larger tables for querying. So if your data is split into 4 tables (1 for each quarter of year), you can dfine a merge table on top of them so that your users can directly run SQL on single merge table instead of running 4 queries on 4 tables. Sounds similar to Oracle’s partitioning? Well, it gives you the same benefits free of cost.

MySQL is simple to install and maintain, can be easily setup and optimized for reporting, and is open source licensed under GPL.

A good design for a reporting system based on MySQL is to keep the fact tables as MyISAM (fastest disk based tables) and keep the smaller dimension tables in memory. You’ll be surprised to see how your reports fly. We’ll consider such a design in our comprehensive analytics project this month.

Keep watching…