<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Open Source Analytics &#187; Open Source Analytics</title>
	<atom:link href="http://opensourceanalytics.com/category/open-source-analytics/feed/" rel="self" type="application/rss+xml" />
	<link>http://opensourceanalytics.com</link>
	<description>Comprehensive Analytics on Open Source Software.</description>
	<lastBuildDate>Tue, 25 Sep 2007 15:12:42 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.3</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>KETL ETL tool Training Document</title>
		<link>http://opensourceanalytics.com/2006/07/24/ketl-etl-tool-training-document/</link>
		<comments>http://opensourceanalytics.com/2006/07/24/ketl-etl-tool-training-document/#comments</comments>
		<pubDate>Mon, 24 Jul 2006 10:42:15 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/?p=51</guid>
		<description><![CDATA[KETL is an open source ETL tool by Kinetic Networks that is gaining mindshare of late.  It is currently downloadable as part of the Bizgres BI project, but can be set up for other databases with a little tweaking.
KETL is different from Kettle, another open source ETL tool.  You can read more about the similar [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.kineticnetworks.com/opensrc.html">KETL</a> is an open source ETL tool by Kinetic Networks that is gaining mindshare of late.  It is currently downloadable as part of the <a href="http://www.bizgres.org/home.php">Bizgres BI project</a>, but can be set up for other databases with a little tweaking.</p>
<p>KETL is different from <a href="http://www.kettle.be/">Kettle</a>, another open source ETL tool.  <a href="http://www.nicholasgoodman.com/bt/blog/2005/12/20/ketl-kettle/">You can read more about the similar names here at Nicholas Goodman&#8217;s blog.</a>  While Kettle is GUI oriented, KETL is scripted and probably more robust.</p>
<p><a href="http://www.kineticnetworks.com/KETL/KETL_Open_Source_Training.pdf">Read the KETL training doc to learn more about its architecture and usage.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/07/24/ketl-etl-tool-training-document/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Sales Data Mart &#8211; Dimensional Model for Retail</title>
		<link>http://opensourceanalytics.com/2006/04/28/sales-data-mart-dimensional-model-for-retail/</link>
		<comments>http://opensourceanalytics.com/2006/04/28/sales-data-mart-dimensional-model-for-retail/#comments</comments>
		<pubDate>Fri, 28 Apr 2006 06:12:08 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[DecisionStudio-Professional]]></category>
		<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/?p=48</guid>
		<description><![CDATA[If you have followed some of the earlier posts, you would remember that a data mart is created as a star schema through a process known as dimensional modeling.  In this post we will create a dimensional model for Sales data mart at a hypothetical retailer.  
Note:  To go through the examples [...]]]></description>
			<content:encoded><![CDATA[<p>If you have followed some of the earlier posts, you would remember that a data mart is created as a star schema through a process known as dimensional modeling.  In this post we will create a dimensional model for Sales data mart at a hypothetical retailer.  <span id="more-48"></span></p>
<p><strong>Note:</strong>  To go through the examples here, you would need MySQL database and DB Designer for database modeling.  You can either install <a href="http://decisionstudio.com/product" target="_blank"><strong>DecisionStudio Professional</strong></a> (<a href="https://sourceforge.net/projects/ds-professional" target="_blank">download from sourceforge</a>), which has both MySQL and DB Designer along with many other analytics goodies, or else you can install them individually from <a href="http://mysql.com" target="_blank">MySQL website</a> and <a href="http://fabforce.net/downloads.php" target="_blank">DB Designer website</a>.  You would also need to <a href="https://sourceforge.net/projects/ds-professional" target="_blank">download the sample foodmart database</a> available along with DecisionStudio Professional.</p>
<p>Now let&#8217;s assume you are an IT person at FoodMart (a hypothetical retailer) who has decided to build a sales data mart as the first step in rolling out comprehensive analytics.  In discussions with the sales department you have figured out that the <strong>number of units sold, dollar amount of sales, and the number of unique customers </strong>in a segment are the main metrics they look at.  Digging deeper, you find that the sales team is likely to want <strong>analysis by product, product category/class, brand, store location (city, state, region, country, &#8230;), customer demographics, and also by individual promotions and promotion categories</strong>.  It may not be explicitly mentioned, but the metrics would also be analyzed by time (day, week, month, quarter, &#8230;).</p>
<p>Now that you have figured out the business metrics to be measured, this gives you the facts you would need in the data mart <strong>&#8216;fact table&#8217; </strong>for calculating them.  Similarly you have figured out the potential segments for analysis, and that gives you the <strong>&#8216;dimensions&#8217; </strong>for analysis.  The &#8216;fact table&#8217; linked to the &#8216;dimension tables&#8217; makes up the <strong>&#8216;star-schema&#8217;</strong> (because of the star-like structure), also known as the data mart.</p>
<p>With this information in place, we have the high level <strong>Dimensional Model for Sales</strong>.<br />
<a href="http://opensourceanalytics.com/wordpress/wp-content/FoodMartDimensionalModelSales.PNG" target="_blank"><img src='http://opensourceanalytics.com/wordpress/wp-content/thumb-FoodMartDimensionalModelSales.PNG' alt='Dimensional Model for Sales Cube at FoodMart' /></a></p>
<p>Sales_Fact_1998 is the main fact table that has sales information by store/location, product, time, customer, and promotion.  Correspondingly there are 5 dimension tables joined to the fact table through foreign keys in the star-schema.</p>
<p>The dimension tables in turn have detailed data that can now be used for <strong>defining ad-hoc analysis segments</strong>.  For example, we can put demographic filters on the customer dimension (say age&lt;30, married, college-educated), choose specific product class(es) in the product table (say Dairy Products), specify a limited time period, and then get our metrics calculated for the ad-hoc segment.</p>
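<p>As a rough illustration of such an ad-hoc segment, here is a minimal Python sketch that uses the standard <code>sqlite3</code> module in place of MySQL.  The tables, columns, and sample rows (<code>customer</code>, <code>product</code>, <code>sales_fact</code>) are simplified, hypothetical stand-ins and not the actual FoodMart schema; the point is only to show the three metrics being computed by joining the fact table to filtered dimension tables.</p>

```python
import sqlite3


def build_demo_mart():
    """Create a tiny in-memory star schema: one fact table, two dimensions.

    All table names, columns, and rows are hypothetical demo data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, age INTEGER, marital_status TEXT);
        CREATE TABLE product (product_id INTEGER PRIMARY KEY, product_class TEXT);
        CREATE TABLE sales_fact (customer_id INTEGER, product_id INTEGER,
                                 unit_sales INTEGER, store_sales REAL);
        INSERT INTO customer VALUES (1, 25, 'M'), (2, 45, 'M'), (3, 28, 'S');
        INSERT INTO product VALUES (10, 'Dairy Products'), (20, 'Snacks');
        INSERT INTO sales_fact VALUES
            (1, 10, 2, 5.0), (1, 20, 1, 1.5), (2, 10, 4, 10.0),
            (3, 10, 3, 7.5), (1, 10, 1, 2.5);
    """)
    return conn


def segment_metrics(conn, max_age, marital_status, product_class):
    """Units sold, dollar sales, and unique customers for one ad-hoc segment."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(f.unit_sales), 0),
               COALESCE(SUM(f.store_sales), 0.0),
               COUNT(DISTINCT f.customer_id)
        FROM sales_fact f
        JOIN customer c ON c.customer_id = f.customer_id
        JOIN product p ON p.product_id = f.product_id
        WHERE c.age BETWEEN 0 AND ?
          AND c.marital_status = ?
          AND p.product_class = ?
        """,
        (max_age, marital_status, product_class),
    ).fetchone()
    return {"units": row[0], "sales": row[1], "customers": row[2]}


if __name__ == "__main__":
    mart = build_demo_mart()
    # Married customers aged 29 or younger buying Dairy Products.
    print(segment_metrics(mart, 29, "M", "Dairy Products"))
```

<p>The filters live entirely on the dimension tables, so changing the segment means changing only the <code>WHERE</code> clause; the fact table and the joins stay the same.</p>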
<p>The image below shows the detailed information available in the dimension tables for defining ad-hoc segments.<br />
<a href="http://opensourceanalytics.com/wordpress/wp-content/FoodMartSalesCubeData.PNG" target="_blank"><img src='http://opensourceanalytics.com/wordpress/wp-content/thumb-FoodMartSalesCubeData.PNG' alt='Dimensional Data for analysis in Sales Cube' /></a></p>
<p>You can <a href="http://opensourceanalytics.com/wordpress/wp-content/FoodMartDatabaseModel.xml" title='Sales Star Schema, and FoodMart database schema' target="_blank">download the data model here</a>, and then open the saved model using DB Designer in DecisionStudio Professional (Start -> Program Files -> DecisionStudio Professional -> Data Analyst -> DB Designer Workbench).  You can see other tables in the FoodMart database by scrolling around on the canvas (scroller in top-right corner).  </p>
<p>Do note that our dimensional model for sales covers only a small relevant set of tables from the entire FoodMart database.  You can load the entire downloaded FoodMart data into MySQL <a href="http://decisionstudio.com/wiki/doku.php?id=restoring_foodmart_data" target="_blank">as outlined here</a>, and can query on the data using Query Browser (Start -> Program Files -> DecisionStudio Professional -> Data Analyst -> MySQL Query Browser).  </p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/04/28/sales-data-mart-dimensional-model-for-retail/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		</item>
		<item>
		<title>Open Source BI Trend Will Grow &#8211; Here&#8217;s Why</title>
		<link>http://opensourceanalytics.com/2006/03/20/open-source-bi-trend-will-grow-heres-why/</link>
		<comments>http://opensourceanalytics.com/2006/03/20/open-source-bi-trend-will-grow-heres-why/#comments</comments>
		<pubDate>Mon, 20 Mar 2006 11:38:44 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2006/03/20/open-source-bi-trend-will-grow-heres-why/</guid>
		<description><![CDATA[In this recent article called &#8220;The Open Source BI Trend Will Grow &#8211; Here&#8217;s Why&#8221; on the DM Direct Newsletter, Rick Mortensen of MARVELIt explains why open source BI is gaining traction and will continue to grow.

He rightly points out that the commercial BI vendors have added layers and layers of &#8216;advanced features&#8217; that add [...]]]></description>
			<content:encoded><![CDATA[<p>In this recent article called <a href="http://www.dmreview.com/article_sub.cfm?articleId=1050215">&#8220;The Open Source BI Trend Will Grow &#8211; Here&#8217;s Why&#8221;</a> on the DM Direct Newsletter, Rick Mortensen of MARVELIt explains why open source BI is gaining traction and will continue to grow.<br />
<span id="more-36"></span><br />
He rightly points out that the commercial BI vendors have added layers and layers of &#8216;advanced features&#8217; that add to the complexity and cost, making their software unsuitable for small and medium-size businesses (SMBs).  Targeted only at enterprise users, commercial software does not provide sufficient ROI for SMBs, which do not need most of the features but would still have to pay for them in the license cost.  </p>
<p>Open source BI helps SMBs by <strong>reducing the total cost of ownership </strong>by eliminating license costs.  It provides <strong>greater flexibility</strong> as you can pick and use only what you want.  This in turn helps keep implementations short and manageable.</p>
<p>Rick correctly points out that <strong>simplicity </strong>and a focus on the essentials are key to promoting adoption of open source BI.  Adding unnecessary features and functions adds to the complexity and implementation costs.  It is important for open source BI to keep this in mind.</p>
<p>Thanks Shane, for mailing me this link.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/03/20/open-source-bi-trend-will-grow-heres-why/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Econometrics on R Environment</title>
		<link>http://opensourceanalytics.com/2006/03/07/econometrics-on-r-environment/</link>
		<comments>http://opensourceanalytics.com/2006/03/07/econometrics-on-r-environment/#comments</comments>
		<pubDate>Mon, 06 Mar 2006 20:56:15 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[DecisionStudio-Professional]]></category>
		<category><![CDATA[Modeling]]></category>
		<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2006/03/07/econometrics-on-r-environment/</guid>
		<description><![CDATA[I will, in due course, spend some time on the R Environment (available as part of DecisionStudio Professional, and separately downloadable from http://r-project.org).  R provides an excellent alternative to commercial products for modeling, statistical analysis, and graphics.  An open source implementation of the S language originally designed at AT&#038;T Bell Labs, the R environment is fast becoming the standard for cutting [...]]]></description>
			<content:encoded><![CDATA[<p>I will, in due course, spend some time on the R Environment (available as part of <a href="http://decisionstudio.com/product">DecisionStudio Professional</a>, and separately downloadable from <a href="http://r-project.org">http://r-project.org</a>).  R provides an excellent alternative to commercial products for modeling, statistical analysis, and graphics.  An open source implementation of the S language originally designed at AT&#038;T Bell Labs, the R environment is fast becoming the standard for cutting edge number crunching.<br />
<span id="more-32"></span><br />
Bioinformatics is one area where R dominates, and a lot of academic papers these days come with R based/compatible implementations.  (So you don&#8217;t have to wait for your commercial BI vendor to first discover that new technique, make up its mind about it, and then finally get a bunch of disinterested code-monkeys to write a new library for it, luckily just in time for the release after next.)</p>
<p>While R is used a lot for statistical research, its adoption for econometrics has been comparatively slower, possibly due to terminology differences.</p>
<p>An excellent text called <a href="http://cran.r-project.org/doc/contrib/Farnsworth-EconometricsInR.pdf">&#8220;Econometrics in R&#8221; by Grant V. Farnsworth (PDF)</a> provides an effective hands-on intro to the most common things you would find yourself doing, including time series, regressions, plotting, etc.  Have a look.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/03/07/econometrics-on-r-environment/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>DecisionStudio Professional &#8211; Desktop BI Platform</title>
		<link>http://opensourceanalytics.com/2006/02/28/decisionstudio-professional-desktop-bi-platform/</link>
		<comments>http://opensourceanalytics.com/2006/02/28/decisionstudio-professional-desktop-bi-platform/#comments</comments>
		<pubDate>Tue, 28 Feb 2006 17:39:34 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[BI, Data Mining, Analytics]]></category>
		<category><![CDATA[Data Warehousing]]></category>
		<category><![CDATA[DecisionStudio-Professional]]></category>
		<category><![CDATA[Modeling]]></category>
		<category><![CDATA[On Your Own]]></category>
		<category><![CDATA[Open Source Analytics]]></category>
		<category><![CDATA[Reporting]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2006/02/28/decisionstudio-professional-desktop-bi-platform/</guid>
		<description><![CDATA[On Friday we released DecisionStudio Professional &#8211; a comprehensive and free desktop BI Platform that gives you all the tools needed for analytics in a single package licensed under the GNU General Public License (GPL).
DecisionStudio Professional (DSP) is an advanced graphical data mining, reporting, modeling, and analysis environment built on top of the best-of-breed open source projects. [...]]]></description>
			<content:encoded><![CDATA[<p>On Friday we released <a href="http://decisionstudio.com/product" target="_blank"><strong>DecisionStudio Professional</strong></a> &#8211; a comprehensive and free <strong>desktop BI Platform </strong>that gives you all the tools needed for analytics in a single package licensed under the <strong>GNU General Public License (GPL).</strong></p>
<p>DecisionStudio Professional (DSP) is an advanced <strong>graphical data mining, reporting, modeling, and analysis environment </strong>built on top of the best-of-breed open source projects.  Some of these include:<br />
      &#8212;  <strong>Optimized MySQL database </strong>as data warehouse platform<br />
      &#8212;  <strong>SQL Workbench</strong> (MySQL Query Browser and DBDesigner) for Data Analysts<br />
      &#8212;  <strong>R environment </strong>for statistical analysis and modeling<br />
      &#8212;  <strong>iReport </strong>Reporting GUI and <strong>JasperReport </strong>reporting library<br />
      &#8212;  <strong>Python </strong>with <strong>Boa Constructor IDE </strong>for application and GUI development</p>
<p>DecisionStudio Professional is the only <strong>end-to-end open source analytics platform </strong>that provides comprehensive capabilities to each role.  Data Analysts get to store, process, and publish data on a standard MySQL platform; Reporting Analysts would like iReport and the integration with Office tools; and Modelers would love the excellent R Environment.  It also includes Python along with a drag-n-drop GUI building environment for analytics Application Developers.</p>
<p>You can <a href="http://decisionstudio.com/product" target="_blank"><strong>find out more about DecisionStudio Professional at decisionstudio.com</strong></a>, and can <a href="https://sourceforge.net/projects/ds-professional" target="_blank"><strong>download your copy at Sourceforge.net</strong></a>.   <a href="http://decisionstudio.com/site/wp-content/decisionstudio-professional.pdf" target="_blank">Click here to download the product brochure (PDF).</a> </p>
<p>Go ahead, it&#8217;s completely free and will always stay so.  <img src='http://opensourceanalytics.com/wordpress/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/02/28/decisionstudio-professional-desktop-bi-platform/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>The Analytics Value Chain</title>
		<link>http://opensourceanalytics.com/2006/02/24/the-analytics-value-chain/</link>
		<comments>http://opensourceanalytics.com/2006/02/24/the-analytics-value-chain/#comments</comments>
		<pubDate>Fri, 24 Feb 2006 12:11:01 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[BI, Data Mining, Analytics]]></category>
		<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2006/02/24/the-analytics-value-chain/</guid>
		<description><![CDATA[Analytics and Business Intelligence is really about the conversion of raw data into optimal and actionable decisions to create tangible business value.  Otherwise, what&#8217;s the point?

Starting from disciplines like machine learning and statistics, Analytics today has taken on a meaning and a life of its own.  While the earlier efforts were largely technique [...]]]></description>
			<content:encoded><![CDATA[<p>Analytics and Business Intelligence is really about the conversion of raw data into optimal and actionable decisions to create tangible business value.  Otherwise, what&#8217;s the point?<br />
<span id="more-29"></span><br />
Starting from disciplines like machine learning and statistics, Analytics today has taken on a meaning and a life of its own.  While the earlier efforts were largely technique oriented, BI today is far more business oriented, both for better productivity and to justify the immense costs involved (if you are using commercial tools).  As a result the processes that make up the analytics effort have undergone dramatic changes.</p>
<p>An Analytics group of yesteryear was usually made up of statisticians (and technologists) that would huddle together in their corners, engaged in a black magic which was ill understood by the business.  Instead of breaking down information silos, Analytics shops were themselves becoming silos that worked on their own agendas and often had ill-concealed disdain for &#8216;business&#8217;.  This has changed over time and an effective analytics group cannot afford to be disconnected from the business.</p>
<p>While the processes have matured in Analytics today, unfortunately the commercial tools have not, and still continue to make some invalid assumptions.  Foremost among these is the assumption that all of Analytics is usually done by a single person (or a bunch of similar know-it-all-magicians).</p>
<p>This is not true.  </p>
<p>If Analytics delivers Business Value (as everyone seems to be loudly proclaiming these days), there must be a value chain!  And there must be some distinct roles and activities that lead to the creation of this value (in the form of better decisions that ultimately make more money for the business)!!  </p>
<p><img src="http://opensourceanalytics.com/wordpress/wp-content/AVC.JPG" alt="The Analytics Value Chain" /></p>
<p>The Analytics Value Chain depicted above essentially shows the distinct roles that would exist in an Analytics group.  There could be more roles, or a single individual might be handling multiple roles, but these are needed to ensure end-to-end delivery of comprehensive analytics.  If you have been doing analytics, the diagram should be self-explanatory, otherwise I&#8217;ll elaborate upon it in the next post.</p>
<p>Did the pre-sales guy from that analytics vendor tell you that?  Or did he push a product that can do it all but no one can fully comprehend and use?  </p>
<p>For sanity&#8217;s sake, always keep the value chain in mind.  If nothing else, it would mean you&#8217;d be clearer about what those twenty analysts are doing staring at those screens all day long.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/02/24/the-analytics-value-chain/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>OLAP Reporting on Open Source Software &#8211; II</title>
		<link>http://opensourceanalytics.com/2006/02/10/olap-reporting-on-open-source-software-ii/</link>
		<comments>http://opensourceanalytics.com/2006/02/10/olap-reporting-on-open-source-software-ii/#comments</comments>
		<pubDate>Fri, 10 Feb 2006 11:59:09 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Open Source Analytics]]></category>
		<category><![CDATA[Reporting]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/?p=27</guid>
		<description><![CDATA[In the last post (OLAP Reporting on Open Source Software &#8211; I) we spoke about Mondrian, an open-source OLAP server.  
In this post we would be setting up OLAP reporting for a hypothetical retailer called FoodMart that sells various grocery products in a chain of stores across US, Canada, and Mexico.

Assuming that you are [...]]]></description>
			<content:encoded><![CDATA[<p>In the last post (<a href="http://opensourceanalytics.com/2006/02/07/olap-reporting-on-open-source-software-i/">OLAP Reporting on Open Source Software &#8211; I</a>) we spoke about Mondrian, an open-source OLAP server.  </p>
<p>In this post we would be setting up OLAP reporting for a hypothetical retailer called FoodMart that sells various grocery products in a chain of stores across US, Canada, and Mexico.<br />
<span id="more-27"></span><br />
Assuming that you are running MS Windows on your machine, we would need:</p>
<ol>
<li>JDK 1.4.2 or above downloadable from <a href="http://java.sun.com/j2se/1.5.0/download.jsp">Sun&#8217;s java download page</a> (I used j2sdk1.4.2)</li>
<li>Microsoft Access to act as the data store (drop a comment if you do not have MS-Access and I can provide detailed instructions on setting it up on another database such as MySQL)</li>
<li>Tomcat 5 from <a href="http://tomcat.apache.org/download-55.cgi">apache.org</a>.  Go to the bottom and download Windows Binary Distribution (I am using jakarta-tomcat-5.0.28.exe).</li>
<li>Mondrian along with FoodMart data from <a href="http://prdownloads.sourceforge.net/mondrian/mondrian-2.0.1.zip?download">sourceforge.net</a></li>
</ol>
<p>So now that you have downloaded all the software required, we can go about setting things up.</p>
<ol>
<li>Install <strong>JDK</strong> (say in c:\j2sdk1.4.2).  This may require you to reboot your machine a couple of times.  Do not install Tomcat yet.</li>
<li>Right click on My Computer, select Properties -> Advanced -> Environment Variables.  Create a new environment variable <strong>JAVA_HOME</strong> and point it to your JDK installation (c:\j2sdk1.4.2).</li>
<li>Now install <strong>Tomcat </strong>(say in C:\Program Files\Apache Software Foundation\Tomcat 5.0).  It should get installed as a service.</li>
<li>Go to Control Panel -> Administrative Tools -> Services to check if the Tomcat5 service is running.  If not, <strong>start the service</strong>.</li>
<li>Unzip mondrian-2.0.1.zip somewhere (say C:\Mondrian).  Copy the <strong>mondrian.war</strong> file from the C:\Mondrian\mondrian-2.0.1\lib folder and place it in the <strong>webapps </strong>folder of your Tomcat installation (C:\Program Files\Apache Software Foundation\Tomcat 5.0\webapps).</li>
<li>Within the unzipped Mondrian (in the demo/access folder), you will find an MS Access database file called MondrianFoodMart.mdb.  This is where the data will be picked up from once we create an ODBC DSN.</li>
<li>Go to Control Panel -> Administrative Tools -> Data Sources (ODBC) -> System DSN.  Create a new ODBC DSN called <strong>MondrianFoodMart</strong> pointing to the MondrianFoodMart.mdb above.  Make sure you get this step correct (drop a comment if you need help).</li>
</ol>
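<p>If something does not work, a small script can help narrow down which step went wrong.  The Python sketch below is only an illustration and not part of Mondrian or Tomcat: the function name <code>preflight</code> and its parameters are hypothetical, and it assumes the folder layout used in the steps above (mondrian.war copied into Tomcat&#8217;s webapps folder, and MondrianFoodMart.mdb under demo/access in the unzipped mondrian-2.0.1 folder).  It does not check the ODBC DSN or the Tomcat service.</p>

```python
import os


def preflight(env, tomcat_home, mondrian_home):
    """Return a list of problems found with the file-based parts of the setup.

    env           -- mapping of environment variables (for example os.environ)
    tomcat_home   -- the Tomcat installation folder
    mondrian_home -- the unzipped mondrian-2.0.1 folder
    """
    problems = []

    # The JAVA_HOME step: the variable must exist and point to a real folder.
    java_home = env.get("JAVA_HOME")
    if not java_home:
        problems.append("JAVA_HOME is not set")
    elif not os.path.isdir(java_home):
        problems.append("JAVA_HOME does not point to a directory: " + java_home)

    # The deployment step: mondrian.war must be in the webapps folder.
    war = os.path.join(tomcat_home, "webapps", "mondrian.war")
    if not os.path.isfile(war):
        problems.append("mondrian.war not found in " + os.path.dirname(war))

    # The data step: the Access file that the ODBC DSN will point to.
    mdb = os.path.join(mondrian_home, "demo", "access", "MondrianFoodMart.mdb")
    if not os.path.isfile(mdb):
        problems.append("MondrianFoodMart.mdb not found at " + mdb)

    return problems


if __name__ == "__main__":
    # Paths follow the examples used in the steps above; adjust to your machine.
    issues = preflight(
        os.environ,
        r"C:\Program Files\Apache Software Foundation\Tomcat 5.0",
        r"C:\Mondrian\mondrian-2.0.1",
    )
    print("\n".join(issues) if issues else "File-based checks passed")
```

<p>An empty result only means the files are where the steps expect them; the DSN and the Tomcat service still need to be verified by hand.</p>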
<p>Okay, now time to go see the brand new reports by pointing your browser to <a href="http://localhost:8080/mondrian/">http://localhost:8080/mondrian/</a>.  If everything has gone right, you should see a page with a couple of links there.  Click the first link called &#8220;JPivot pivot tables&#8221; under the heading &#8220;Mondrian Examples&#8221;.  You will see a nice little report showing Unit Sales, Store Cost, and Store Sales for our FoodMart.</p>
<p><img src="http://opensourceanalytics.com/wordpress/wp-content/JPivot_01.jpg" alt="JPivot Pivot Table" /></p>
<p>The user-interface is quite intuitive, so play around by clicking on various buttons.  Do not forget to try out the top-left button (OLAP Navigator) that allows you to define your own reports by selecting measures, rows, columns and filters.</p>
<p>Have a go.  And do let me know.  <img src='http://opensourceanalytics.com/wordpress/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/02/10/olap-reporting-on-open-source-software-ii/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>OLAP Reporting on Open Source Software &#8211; I</title>
		<link>http://opensourceanalytics.com/2006/02/07/olap-reporting-on-open-source-software-i/</link>
		<comments>http://opensourceanalytics.com/2006/02/07/olap-reporting-on-open-source-software-i/#comments</comments>
		<pubDate>Tue, 07 Feb 2006 08:00:18 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Open Source Analytics]]></category>
		<category><![CDATA[Reporting]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/?p=26</guid>
		<description><![CDATA[OLAP (On-Line Analytical Processing) reporting systems provide what is commonly known as &#8220;slice-and-dice&#8221; functionality to non-technical end users.  Users are able to see ad-hoc reports and charts to answer ad-hoc questions they may want answered.  Another commonly used name is &#8220;drill-down reporting&#8221; on &#8220;OLAP cubes&#8221;.  In essence this is similar to the [...]]]></description>
			<content:encoded><![CDATA[<p><strong>OLAP </strong>(On-Line Analytical Processing) reporting systems provide what is commonly known as &#8220;<strong>slice-and-dice</strong>&#8221; functionality to non-technical end users.  Users are able to see ad-hoc reports and charts to answer ad-hoc questions they may want answered.  Another commonly used name is &#8220;<strong>drill-down reporting</strong>&#8221; on &#8220;OLAP cubes&#8221;.  In essence this is similar to the Excel based Pivot reports, only that OLAP systems can do the same thing on massive amounts of data.</p>
<p><strong>The OLAP Reporting revolves around two simple concepts:  Dimensions, and Measures.<br />
</strong><span id="more-26"></span><br />
A <strong>Measure </strong>is any <strong>business metric </strong>that you want to calculate and report on.  It can range from simple ones like Dollar Sales (total sales revenue), Unit Sales (number of units of a particular product sold), Profits, Current Inventory Level, etc. to more complicated computed fields like Average Cost, Average Profit per Unit, % Margin, etc.  </p>
<p>It should be noted that a Measure can have different values depending upon the context &#8211;  whether you are looking at All Locations or a particular City, or whether you are looking at annual figures or weekly figures, or daily figures, whether you are looking at a particular demographic segment, etc.  A measure&#8217;s value depends upon the specific subset of the entire database that you are looking at (hence the term &#8220;slice-and-dice&#8221; commonly used with &#8220;cubes&#8221;).  The specific context for the calculation of Measures is provided by Dimensions.</p>
<p><strong>Dimensions </strong>help subset the database to get to the specific information you are interested in.  You may want to subset by Geography, Time, Product/Service Offering, Customer Demographics, etc.  For example, I may want to analyze Unit Sales (Measure) for a particular product (specified on Product Dimension) in New York (specified on Geography Dimension) for the month of Jan 2006 (Time Dimension).  I could then drill-down on Time Dimension to get the daily sales for the product.</p>
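<p>To make the idea concrete, here is a minimal Python sketch of a Measure being computed in the context set by Dimensions.  It is plain Python rather than an OLAP engine, and the sample records, the field names, and the <code>unit_sales</code> helper are all made up for illustration.</p>

```python
from collections import defaultdict

# Hypothetical sales records: (product, city, month, day, unit_sales)
FIELDS = ("product", "city", "month", "day", "unit_sales")
SALES = [
    ("Milk", "New York", "2006-01", "2006-01-02", 5),
    ("Milk", "New York", "2006-01", "2006-01-03", 7),
    ("Milk", "Boston", "2006-01", "2006-01-02", 4),
    ("Bread", "New York", "2006-01", "2006-01-02", 3),
    ("Milk", "New York", "2006-02", "2006-02-01", 6),
]


def unit_sales(records, group_by, **filters):
    """Sum the Unit Sales measure per value of `group_by`.

    `filters` fixes values on other dimensions, e.g. product="Milk".
    """
    totals = defaultdict(int)
    for record in records:
        row = dict(zip(FIELDS, record))
        if all(row[dim] == value for dim, value in filters.items()):
            totals[row[group_by]] += row["unit_sales"]
    return dict(totals)


if __name__ == "__main__":
    # Monthly Unit Sales of Milk in New York.
    print(unit_sales(SALES, "month", product="Milk", city="New York"))
    # Drill down on the Time dimension: daily figures for Jan 2006.
    print(unit_sales(SALES, "day", product="Milk", city="New York", month="2006-01"))
```

<p>The same measure gives different numbers depending on which dimension values are fixed and which dimension is used for grouping; drilling down is simply grouping by a finer level of the same dimension.</p>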
<p>OLAP tools provide this functionality through a user friendly GUI.  The commercial OLAP systems are extremely expensive, but that would change with the advent of open-source OLAP tools.</p>
<p>An excellent open-source OLAP server is <a href="http://mondrian.sourceforge.net">Mondrian</a> that allows you to do OLAP cube reporting through a JSP web-based interface called <a href="http://jpivot.sourceforge.net">JPivot</a>.</p>
<p>The two together provide an excellent reporting system that can be quickly deployed on an existing datawarehouse to deliver impressive reports to end users.</p>
<p>In the <a href="http://opensourceanalytics.com/2006/02/10/olap-reporting-on-open-source-software-ii/">next post </a>we will set up a quick reporting system using Mondrian, JPivot, Tomcat and MySQL/MS-Access.</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2006/02/07/olap-reporting-on-open-source-software-i/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Consumer Finance Analytics</title>
		<link>http://opensourceanalytics.com/2005/11/25/consumer-finance-analytics/</link>
		<comments>http://opensourceanalytics.com/2005/11/25/consumer-finance-analytics/#comments</comments>
		<pubDate>Fri, 25 Nov 2005 05:25:28 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2005/11/25/consumer-finance-analytics/</guid>
		<description><![CDATA[Sorry guys to have been AWOL all this while.  Some hectic travelling followed by some ill health and a lot of overdue work has been keeping me away.  Having said that, I still should have posted at least a line to inform the regular visitors.  By the way, it might be a [...]]]></description>
			<content:encoded><![CDATA[<p>Sorry, guys, for having been AWOL all this while.  Some hectic travelling followed by some ill health and a lot of overdue work has been keeping me away.  Having said that, I still should have posted at least a line to inform the regular visitors.  By the way, it might be a good idea to grab the <a href="http://opensourceanalytics.com/feed/">RSS feed </a>for this blog, so that you are informed whenever it is updated.</p>
<p>Now back to work.</p>
<p>A concern some of you have raised about the Consumer Finance data model we discussed (see <a href="http://opensourceanalytics.com/2005/10/31/open-source-analytics-business-intelligence-in-a-month-using-free-software/">Open Source Analytics in a month</a> and subsequent posts) is that it appears far too simple to deliver any analytical value to the business.  Wouldn&#8217;t we need the behavioral, payment, response, clickstream, usage data, blah-this, blah-that, and blah-other in order to deliver any value?  Isn&#8217;t a three-table (Loan, Customer, and Marketing) demo too simplistic to be of any real use?</p>
<p>This post answers that concern.  Get out of the hype-victim mode and start thinking.  If you look closely enough, there is plenty you can do with just this much data, and this post explores exactly that.  <span id="more-16"></span></p>
<p>Let us look at various categories of analytics we could do with this data.</p>
<p>1.  Operational Sales Reports and MIS</p>
<ul>
<li>Daily, Weekly, Monthly Loan Bookings</li>
<li>Disbursal Reports and Portfolio Growth</li>
<li>Analysis of Number of Loans and Loan Amount by Loan Product, Interest Rate Slabs, Customer Demographics</li>
<li>Asset Reports</li>
</ul>
<p>2.  Credit and Collections Reports</p>
<ul>
<li>Exposure Analysis by Customer Segments and Loan Products</li>
<li>Portfolio Growth Analysis by Segments and Loan Products</li>
<li>Delinquency Analysis and Tracking</li>
<li>Roll-rate analysis (rates of improvement/worsening for delinquent loans)</li>
</ul>
<p>3.  Predictive Modeling and Customer Segmentation</p>
<ul>
<li>Credit Scoring</li>
<li>Marketing Promo Analysis</li>
<li>Promo and Campaign Optimization</li>
</ul>
<p>I put this list together in 10 minutes.  There&#8217;s a lot more that can be done, but this is just to give you a sense of what is possible even with seemingly limited data.  You just need to look closer and put yourself in the other person&#8217;s shoes.</p>
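<p>To make a couple of the items above concrete, here is a minimal sketch of two of the reports: monthly loan bookings, and principal outstanding by delinquency bucket.  It uses Python with its built-in sqlite3 module as a stand-in database.  The sample rows are made up, and the column names AmountFinanced and DisbursalDate, as well as the reading of Bucket as the number of missed installments, are assumptions for illustration only.</p>

```python
import sqlite3


def build_sample_db():
    """Return an in-memory database holding a few made-up loans."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Loan_Table (
            LoanID         INTEGER PRIMARY KEY,
            Product        TEXT,
            AmountFinanced REAL,
            DisbursalDate  TEXT,     -- ISO date such as 2005-10-03
            POS            REAL,     -- principal outstanding
            Bucket         INTEGER   -- assumed: missed installments
        );
        INSERT INTO Loan_Table VALUES
            (1, 'Auto',     10000, '2005-10-03',  9000, 0),
            (2, 'Personal',  2000, '2005-10-17',  1500, 1),
            (3, 'Auto',     12000, '2005-11-05', 12000, 0),
            (4, 'Personal',  3000, '2005-11-20',  2800, 2);
        """
    )
    return conn


def monthly_bookings(conn):
    """Number of loans and amount financed per disbursal month."""
    return conn.execute(
        """
        SELECT substr(DisbursalDate, 1, 7), COUNT(*), SUM(AmountFinanced)
        FROM Loan_Table
        GROUP BY substr(DisbursalDate, 1, 7)
        ORDER BY 1
        """
    ).fetchall()


def pos_by_bucket(conn):
    """Number of loans and principal outstanding per delinquency bucket."""
    return conn.execute(
        """
        SELECT Bucket, COUNT(*), SUM(POS)
        FROM Loan_Table
        GROUP BY Bucket
        ORDER BY Bucket
        """
    ).fetchall()


if __name__ == "__main__":
    db = build_sample_db()
    for row in monthly_bookings(db):
        print("bookings", row)
    for row in pos_by_bucket(db):
        print("bucket", row)
```

<p>Both reports are a single GROUP BY over the loan table, which is the point: most of the operational and collections reports listed above need nothing more than the loan-level fields.</p>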
<p>Okay, so now we have done away with your top two excuses for not doing analytics yet.</p>
<ol>
<li>Analytics is not accessible/affordable:  That&#8217;s history now.  This blog addresses that problem of yours.</li>
<li>You do not have enough data:  Whatever you have is good enough to begin with.  If you are not convinced, tell me what data you have and I&#8217;ll tell you 10 things you could do with it, for free!!</li>
</ol>
<p>Any more excuses?  Cool, we can now go ahead and start doing things.</p>
<p>Keep watching&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2005/11/25/consumer-finance-analytics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Consumer Finance Data Model &#8211; II</title>
		<link>http://opensourceanalytics.com/2005/11/10/consumer-finance-data-model-ii/</link>
		<comments>http://opensourceanalytics.com/2005/11/10/consumer-finance-data-model-ii/#comments</comments>
		<pubDate>Thu, 10 Nov 2005 05:46:48 +0000</pubDate>
		<dc:creator>Nishith</dc:creator>
				<category><![CDATA[Open Source Analytics]]></category>

		<guid isPermaLink="false">http://opensourceanalytics.com/2005/11/10/consumer-finance-data-model-ii/</guid>
		<description><![CDATA[For those who joined the party late, this blog is about doing advanced analytics with free open source tools.  And this month we are developing a completely free analytical solution for a hypothetical consumer finance company.  You can get more details in the Open Source Analytics Category.
Today we shall design a Data Warehouse [...]]]></description>
			<content:encoded><![CDATA[<p>For those who joined the party late, this blog is about doing advanced analytics with free open source tools.  And this month we are developing a completely free analytical solution for a hypothetical consumer finance company.  You can get more details in the <a href="http://opensourceanalytics.com/category/open-source-analytics/">Open Source Analytics Category</a>.</p>
<p>Today we shall design a Data Warehouse for the data. </p>
<p>In <a href="http://opensourceanalytics.com/2005/11/06/the-consumer-finance-data-model/">Consumer Finance Data Model &#8211; I</a>, we described what AFS (our fictitious consumer finance organization) does, and the main pieces of information it deals with.  To summarize, those pieces of data are reproduced below:  <span id="more-14"></span>  </p>
<p><strong>Loan Level (specified for each loan):</strong></p>
<blockquote><p>LoanID, Product (Auto/Personal/etc.), Amount Financed, Asset Cost, Loan Duration, Interest Rate, EMI Amount, Disbursal Date, and No of Installments.  Also, POS (Principal Outstanding), DelPOS, Bucket, and Late Fee.</p></blockquote>
<p><strong>Customer Level (specified for each customer):</strong></p>
<blockquote><p>CustomerID, Name, Address, email, Age, Sex, Marital Status, Educational Qualification, Income, Years in Current Job, etc.</p></blockquote>
<p><strong>Marketing Information:</strong></p>
<blockquote><p>CustomerID, PromoID (uniquely identifies the promotion or marketing offer), PromoDate, Product, and ConversionFlag (True if customer took up the offer within one month, False otherwise).</p></blockquote>
<p>Listing the data elements in this manner gives us a direct insight into what the basic tables in the DW would be, and also their structure.  We now know that we need at least three tables, namely:  Loan_Table, Customer_Table, and Promo_Table.  And you guessed it right, we will be doing this on MySQL, the most widely used open source database.</p>
<p>Homework:<br />
1)  Download and install <a href="http://dev.mysql.com/downloads/mysql/4.1.html">MySQL Server</a> and <a href="http://dev.mysql.com/downloads/query-browser/1.1.html">MySQL Query Browser</a> from the MySQL website.<br />
2)  Open MySQL Query Browser (connected to the just installed server) and create the three tables above.  Query Browser is an excellent GUI tool and it shouldn&#8217;t be tough to find your way around.</p>
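<p>If you would rather start from a script than from the GUI, here is a rough sketch of the three tables.  It uses Python with its built-in sqlite3 module as a stand-in for MySQL, so the same CREATE TABLE statements would need their types adapted before running on a MySQL server.  The column types, the shortened column names, and the CustomerID column in Loan_Table (added to link a loan to its customer) are assumptions, not part of the lists above.</p>

```python
import sqlite3

# Column names follow the data elements listed in the post.
# Types, and the CustomerID link in Loan_Table, are assumptions.
SCHEMA = """
CREATE TABLE Customer_Table (
    CustomerID      INTEGER PRIMARY KEY,
    Name            TEXT,
    Address         TEXT,
    Email           TEXT,
    Age             INTEGER,
    Sex             TEXT,
    MaritalStatus   TEXT,
    Education       TEXT,
    Income          REAL,
    YearsInJob      REAL
);
CREATE TABLE Loan_Table (
    LoanID          INTEGER PRIMARY KEY,
    CustomerID      INTEGER REFERENCES Customer_Table (CustomerID),
    Product         TEXT,
    AmountFinanced  REAL,
    AssetCost       REAL,
    LoanDuration    INTEGER,
    InterestRate    REAL,
    EMIAmount       REAL,
    DisbursalDate   TEXT,
    NoOfInstallments INTEGER,
    POS             REAL,
    DelPOS          REAL,
    Bucket          INTEGER,
    LateFee         REAL
);
CREATE TABLE Promo_Table (
    CustomerID      INTEGER REFERENCES Customer_Table (CustomerID),
    PromoID         INTEGER,
    PromoDate       TEXT,
    Product         TEXT,
    ConversionFlag  INTEGER,   -- 1 for True, 0 for False
    PRIMARY KEY (CustomerID, PromoID)
);
"""


def create_tables(conn):
    """Create the three tables and return their names in sorted order."""
    conn.executescript(SCHEMA)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(create_tables(sqlite3.connect(":memory:")))
```

<p>Keeping CustomerID in both Loan_Table and Promo_Table is what lets you join loans and promotions back to customer demographics later on.</p>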
<p>If you get stuck or need more specific help, please drop a comment below and I&#8217;ll help you out.  </p>
<p>Keep walking&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://opensourceanalytics.com/2005/11/10/consumer-finance-data-model-ii/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
