There are far too many conflicting and confusing definitions of Data Mart and Data Warehouse floating around. The long-running debate between Ralph Kimball and Bill Inmon, the two Titans of Data Warehousing, only adds to the confusion.

In this post, we’ll try to get some sanity around the concepts, without getting drawn (hopefully) into the crossfire.


A good resource on Data Mining using the R environment is “Data Mining - Desktop Survival Guide” by Graham Williams. It’s a well-researched and frequently updated free online book. Graham does a good job of going through the data mining process and provides detailed descriptions of the commonly used algorithms.

Examples are in R (all with R source code, of course), with some good examples of graphical visualization as well.

Click here for Data Mining - Desktop Survival Guide

The R environment is part of DecisionStudio Professional and can also be downloaded separately from http://r-project.org.

In due course, I will spend some time on the R environment (available as part of DecisionStudio Professional, and separately downloadable from http://r-project.org). R provides an excellent alternative to commercial products for modeling, statistical analysis, and graphics. An open source implementation of the S language, which was originally developed at AT&T Bell Labs, the R environment is fast becoming the standard for cutting-edge number crunching.

On Friday we released DecisionStudio Professional, a comprehensive and free desktop BI platform that gives you all the tools needed for analytics in a single package licensed under the GNU General Public License (GPL).

DecisionStudio Professional (DSP) is an advanced graphical data mining, reporting, modeling, and analysis environment built on top of best-of-breed open source projects. Some of these include:
Optimized MySQL database as data warehouse platform
SQL Workbench (MySQL Query Browser and DBDesigner) for Data Analysts
R environment for statistical analysis and modeling
iReport Reporting GUI and JasperReport reporting library
Python with Boa Constructor IDE for application and GUI development

DecisionStudio Professional is the only end-to-end open source analytics platform that provides comprehensive capabilities to each role. Data Analysts get to store, process, and publish data on a standard MySQL platform; Reporting Analysts will like iReport and the integration with Office tools; and Modelers will love the excellent R environment. It also includes Python along with a drag-and-drop GUI building environment for analytics Application Developers.

You can find out more about DecisionStudio Professional at decisionstudio.com, and can download your copy at Sourceforge.net. Click here to download the product brochure (PDF).

Go ahead, it’s completely free and will always stay so. ;-)

Analytics and Business Intelligence is really about the conversion of raw data into optimal and actionable decisions to create tangible business value. Otherwise, what’s the point?

In the last post (OLAP Reporting on Open Source Software - I) we spoke about Mondrian, an open-source OLAP server.

In this post we will set up OLAP reporting for a hypothetical retailer called FoodMart that sells various grocery products in a chain of stores across the US, Canada, and Mexico.

OLAP (On-Line Analytical Processing) reporting systems provide what is commonly known as “slice-and-dice” functionality to non-technical end users. Users can build reports and charts on the fly to answer ad-hoc questions. Another commonly used name is “drill-down reporting” on “OLAP cubes”. In essence this is similar to Excel pivot table reports, except that OLAP systems can do the same thing on massive amounts of data.

OLAP reporting revolves around two simple concepts: Dimensions and Measures.
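To make the two concepts a little more concrete, here is a minimal Python sketch of a slice-and-dice roll-up. The fact rows, the column names (`country`, `product`, `year`, `sales`), and the `rollup` helper are all made up for illustration; a real OLAP server such as Mondrian does this against a database, not an in-memory list.

```python
from collections import defaultdict

# Hypothetical fact rows: country, product, and year are dimensions;
# sales is the single measure.
FACTS = [
    {"country": "USA", "product": "Dairy", "year": 1997, "sales": 120.0},
    {"country": "USA", "product": "Produce", "year": 1997, "sales": 80.0},
    {"country": "Canada", "product": "Dairy", "year": 1997, "sales": 45.0},
    {"country": "Mexico", "product": "Produce", "year": 1998, "sales": 60.0},
    {"country": "USA", "product": "Dairy", "year": 1998, "sales": 150.0},
]


def rollup(facts, dimensions, measure, where=None):
    """Sum `measure` grouped by `dimensions`, after an optional filter (the "slice")."""
    where = where or {}
    totals = defaultdict(float)
    for row in facts:
        # Keep only rows that match every fixed dimension value in `where`.
        if all(row[name] == value for name, value in where.items()):
            key = tuple(row[d] for d in dimensions)
            totals[key] += row[measure]
    return dict(totals)


# "Dice": total sales by country.
by_country = rollup(FACTS, ["country"], "sales")

# "Slice" on product = Dairy, then drill down from country to country and year.
dairy_by_country_year = rollup(
    FACTS, ["country", "year"], "sales", where={"product": "Dairy"}
)
```

Dimensions are what you group and filter by; the measure is what gets aggregated. Drilling down is just adding another dimension to the grouping key.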

Happy New Year, and back to work.

The Data Warehouse is the foundation of any analytics initiative. You take data from various data sources in the organization, clean and pre-process it to fit business needs, and then load it into the data warehouse for everyone to use. This process is called ETL, which stands for ‘Extract, Transform, and Load’.
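The three steps can be sketched in a few lines of Python. This is only an illustration under some assumptions: the CSV extract, the `orders` table, and the cleaning rules are invented, and an in-memory SQLite database stands in for the warehouse so that the example runs without a database server.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would come from an operational system.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,BOB,
3,carol,7.25
"""


def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: tidy customer names and drop rows with a missing amount."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # assumed business rule: skip orders without an amount
        clean.append(
            (int(row["order_id"]), row["customer"].strip().title(), float(amount))
        )
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # SQLite stands in for the warehouse here
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

Keeping the three steps as separate functions makes it easy to swap the source or the target later, for example pointing `load` at a MySQL connection instead of SQLite.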

One month, 3 hours, and 10 minutes since I last posted.

Down. But not out.

Tomorrow IS another day. Catch you then.

Merry Christmas folks. Hope you have been having a great time.
:-)

Prompted by a reader comment, this post is about that elusive difference between Analytics and regular IT. Or is there really a difference?
