The craft of analytics has suffered from the severe hype surrounding commercial data-mining tools. Check the websites of most commercial data-mining software vendors to confirm this for yourself. Buying that million-dollar tool will not help if you do not know the craft. Unfortunately, even the ‘consultants’ of such companies mistake the tool for the art. Often enough, all that these consultants can do is regurgitate the same meaningless crap their marketing guy drilled into their heads.

But the playing field is changing. As in other fields of human endeavour, the tools are getting better and cheaper, thus removing the first barrier to entry: Access to professional tools. (more…)

So how is a data warehouse different from your regular database? After all, both are databases, and both have some tables containing data. If you look deeper, you’d find that both have indexes, keys, views, and the regular jing-bang. So is that ‘data warehouse’ really different from the tables in your application? And if the two aren’t really different, maybe you can just run your queries and reports directly from your application databases!

Well, to be fair, that may be just what you are doing right now, running some EOD (end-of-day) reports as complex SQL queries and shipping them off to those who need them. And this scheme might just be serving you fine right now. Nothing wrong with that if it works for you.

But before you start patting yourself on the back for having avoided a data warehouse altogether, do spend a moment to understand the differences, and to appreciate the pros and cons of either approach. (more…)

Over the next month, we will do some cutting-edge analytics using entirely free open-source software (so that you don’t have to approach your boss for that million-dollar analytics budget).

Yes, you guessed it right. This blog is for people who, like me, are tired of all the hype created around analytics by commercial BI vendors and the exorbitant prices they charge for bad software products. This blog is for people and organizations who want to discover the world of analytics and use advanced data-mining for actualizing their highest potential, but haven’t been able to do it so far. (more…)

When you are bootstrapping your way towards developing your dream software product, you have to be extra careful with your first few customers. You have to make sure they get all the benefits you have to offer. And you have to make sure that your upcoming product meets their custom requirements. After all, they are paying you for it, and are helping you realize the vision, right?

Ummm… Not so simple really.
(more…)

Unfortunately, most of the websites on analytics and data-mining are chock full of hype and have very little content that you can actually learn from. Still, there are some sites that insiders regularly refer to. Very few, but still. :-)

Here is a great course on data mining created by Dr. Piatetsky-Shapiro and three computer science faculty members from Connecticut College, working in conjunction with an instructional designer.

This course is organized as 19 modules of 75 minutes and provides a thorough overview of the field. The detailed course outline is available here.

If you know of some other good free courses about data-mining/BI/analytics, please let me know and I’ll put them up here.

Data mining has been defined as

The nontrivial extraction of implicit, previously unknown, and potentially useful information from data

and

The science of extracting useful information from large data sets or databases.

Well, in simpler terms, data-mining is what you do when you are unable to know your customers as you would if you were living and working in a small community. (more…)

If you are doing business intelligence/analytics/data-mining today and are satisfied with your results, this blog may not be for you.

This blog is about doing the best of that stuff in an accessible (here) and affordable (free) way. It’s about Open Source Analytics.
