Challenges of the Data Platform
The data platforms of yesteryear faced a familiar set of challenges, which most of us who have worked with data in any way have experienced at some point. Those platforms were nothing like the modern data platforms of today.
Excel was the preferred “BI tool” (and in some organizations it may still be one of them), and the challenges, while expressed in various forms, really all came down to:
- Reporting on data
- Speed of reports – generation and performance
- Consistency of data
- Combining of data
These are simplified statements, and as you all know there was a lot more to it. Reporting on data usually meant querying a warehouse with a SQL report or, in the SAP world, writing BEx queries. In the BW world it was further complicated by layers of summarization over the physical data, which were needed so that queries could run and present data. That in turn meant longer ETL windows, and if a data set did not exist in the “Cubes”, it had to be configured all the way through those layers and the data loaded. All of this led to the frequently heard statements of “BW sucks” and “reporting requirements going into a black hole”.
Real-time data was a pipe dream for most analysts. If there was a need to view real-time data, that meant writing a script or ABAP report on the operational system, which was a time-consuming process in itself. Such reports sometimes numbered in the hundreds, or even more than a thousand, depending on the size and business complexity of the organization.
These challenges typically led users to mash up data in Excel, which had its own set of challenges, such as ‘version of truth’, but was a better alternative than waiting a few months to get a report back from IT.
These challenges certainly made us creative. We created table spaces, created “Cubes” by customer or sometimes by quarter for large data sets and volumes, built a lot of aggregates that were loaded with the ETL process, and pre-populated queries so that performance would be fast the next day. Remember those days?
However, with this creativity the bar seemed to keep getting higher. As each challenge was taken care of, that became the new normal, and the data needs (along with the multiple V’s – Volume, Velocity, Variety, etc.) grew.
This resulted in a whole new set of data requirements: data now needed to be visual, drillable and combinable, and unstructured data needed to be analysed, all in the hope of finding that nugget which would answer a business question or solve a business challenge. And just like with the traditional data platforms, there were new sets of challenges and complexities: numerous tools, the specialized skills they required, the complexity of the tools themselves, and data volume growth that demanded yet more creativity to keep reports performing. Data governance and management took on a whole new meaning and importance. Organizations wanted the right data, at the right time, at the right place and in the right format for the right person. “Information instead of data” and “insights versus data reports” became the words of the day. Now we could not only look at what had happened, but also look forward and make informed assumptions about what might happen.
Fast forward to today, and data-related challenges have morphed into the same set of challenges, but on steroids:
- Insights on all data assets – structured or unstructured
- Speed of insights – analyse data as and when needed, with sub-second performance
- Real time – transactional or strategic data analysis
- Data management – even more critical
There is no denying the value of the data assets available to us, and market and consumer expectations and behaviours are clearly setting the pace and standard for insights. The data lake concept is often treated as synonymous with Hadoop. But if we really think about it, a data lake is what data analysts were looking for all along, and what the various tools and data platforms, with our added creativity, had been trying to achieve.
The rest of this blog series will explore in detail how the Modern Data Platform has evolved to address these growing challenges.
We will also explore the best-practice development and implementation techniques that have emerged to support successful deployment of such platforms.
You can also expect to hear real-world, cross-industry, practical examples of the Modern Data Platform in action. To make sure you don’t miss any of the blogs in this series, sign up to receive all of them! You can also register to attend The Modern Data Platform Webinar on Wednesday, March 15th at 2 PM EST, 1 PM CST, 11 AM PST to learn more!