August 31, 2011 11:57 AM
Getting to the real value of data through Agile BI
While not directly related to Agile Development Methodologies, agility in BI shares several goals with them:
- A desire to increase the pace of quantitative analysis, like the increased pace of iterative development in agile development.
- A desire to make it easy to introduce new data sources, explore that data, relate it to existing sources, and elevate useful analyses and models to production use, similar to the rapid build-test-deploy cycles in agile methodologies.
Driving this desire is the recognition of the strategic value of data. If we think about a simple predictive modelling problem, the introduction of more training data typically improves our ability to make useful predictions. The rising tide of more data makes all algorithms more effective, while refining the algorithms themselves usually yields only incremental improvements. Sometimes those incremental improvements can be very valuable, for example those pursued on Wall Street, but in many cases adding data to your world view does more for your strategic objectives than tweaking algorithms ad infinitum.
The new class of emerging data platforms, such as Hadoop, Cassandra and Riak, provides the kind of data model flexibility that many a BI manager dreams about. They allow analysts to introduce new data easily, play with it and adjust the model as their data needs evolve. Traditional BI systems require the data model and relationships to be defined up front, which constrains an organisation’s ability to make use of additional data by making it nearly impossible to introduce new data or change the existing data model. These constraints deliver a single version of the truth, which is why we put up with them, but it is a very limited view of the truth.
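The flexibility described above can be illustrated with a minimal sketch in plain Python. It is not tied to Hadoop, Cassandra or Riak; the record fields, the two sources and the `merge_by_key` helper are all hypothetical, chosen only to show how a new source with unanticipated fields can be related to existing data without declaring a schema first.

```python
from collections import defaultdict

# Hypothetical existing data: sales records with a small, fixed set of fields.
sales = [
    {"customer": "c1", "amount": 120.0},
    {"customer": "c2", "amount": 80.0},
    {"customer": "c1", "amount": 40.0},
]

# Hypothetical new source, added later, carrying fields the original
# model never anticipated (and not the same fields on every record).
web_visits = [
    {"customer": "c1", "pages": 12, "referrer": "search"},
    {"customer": "c3", "pages": 3},
]


def merge_by_key(key, *sources):
    """Combine records from any number of sources into one view per key value.

    No schema is declared up front: whatever fields a record carries are
    kept, and numeric fields that appear more than once are summed.
    """
    merged = defaultdict(dict)
    for source in sources:
        for record in source:
            view = merged[record[key]]
            for field, value in record.items():
                if field == key:
                    continue
                if isinstance(value, (int, float)) and field in view:
                    view[field] += value
                else:
                    view[field] = value
    return dict(merged)


# Relating the new source to the existing one needs no model change.
customer_view = merge_by_key("customer", sales, web_visits)
```

In this sketch, adding a third source means passing one more list to `merge_by_key`. A schema-first warehouse would instead need the new columns and relationships defined before any of that data could be loaded.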
New patterns are emerging for positioning these platforms next to an organisation’s existing, highly structured Enterprise Data Warehouse. Both open source projects and the titans of data warehousing are introducing connectors between existing environments and these new unstructured and semi-structured stores, along with methods of rationalising data across, and executing analytics over, both traditional and emerging sources.
These technologies and design patterns hold the promise of making it easier to introduce more data to an organisation’s world view. This makes existing data more valuable, driving improved ROI from the whole IT stack and enabling new and better outcomes for the organisations nimble enough to take advantage.
Now that’s agile BI.
By John Akred, Data Management & Emerging Data Platforms R&D Lead