Author – Asad Mahmood, Principal Solution Architect, itelligence UK.
I have been working with SAP Data Warehouse Cloud since its inception and procrastinating over my first post. With the festive season upon us, an imminent Christmas break and DWC going GA last week, there is no better time!
DWC is an exciting product from SAP. In my view, DWC will become a very capable standalone Data Warehousing solution over the coming months. Its current form lends itself to a Hybrid solution, which can interoperate with an existing BW Data Warehousing system to provide users with flexibility in data modelling and deploying new solutions using the rich visualisation capabilities included as part of this release. I also feel that it fills a particular void that SAP Analytics Cloud customers are experiencing when it comes to accessing, transforming and modelling data sources. It’s particularly helpful to have ABAP, HANA and OData supported as part of the remote data sources. This is where data is kept in the remote system through the use of Smart Data Access Technology.
For the purposes of this blog, I will be using authentic government crime statistics as part of the ensuing illustrations. The complete crime statistics are available for individual Police Forces and operations such as Stop and Search. For the purposes of this blog, I will be using a small and random selection of data. This is not intended to create a misleading picture of crime or performance statistics relating to any individual Police Force or passage of time. I figured that this would be a more interesting dataset than the usual cache of bike or jersey sales in the US – as exhilarating as that can be of course.
My first dataset comes from “GOV.UK” and provides Crime Data by month and across individual Police Forces. I have restricted my data to Thames Valley Police and it relates to the period July 2019 to October 2019. Upon downloading this dataset, I have created a new Space in DWC. Spaces are a great feature, which allow IT to centrally create and manage an area for individual projects. This can be allocated appropriate resources and monitored over the life of the project. Examples may include Promotional Planning, Resource Allocation and New Product Development. They tend to be transient analysis. In our situation, we have created a Space for the Crime Statistics analysis:
You can also assign users to this Space and limit it to use specified connections. These connections can refer to remote sources such as your Enterprise Data Warehouse and consequentially, allow users to combine this data with other datasets that they may want to include in this analysis. It is also possible to use an IDE to connect to the underlying HANA instance and create/populate custom tables. This is supported through the “Open SQL Schema” functionality of DWC.
To begin my analysis, I will bring through my Crime data and perform the necessary transformation and modelling. These activities are undertaken in the “Data Builder”. This offers various Graphical and SQL options. We will use the CSV option, which supports the Pipe delimiter amongst many other delimiters. I also had to alter some data types – in preparation for the creation of Measures and Attributes. This data is now loaded into DWC and represented as a Table. This allowed me to provide user-friendly names for columns and measures before I “Deploy” this to DWC. Until this point, the development is localised to the user.
Note that there are options to connect to remote data sources such as an Enterprise Data Warehouse e.g. BW/4HANA, SAP HANA, etc. This uses the Data Provisioning Agent, as is the case with SAP HANA.
I can now embellish this further by using the “Graphical View”. This is where data can be modelled prior to its consumption. More on this later.
Let’s now switch to “Story Builder” and see what we’ve concocted so far…
This is ostensibly SAP Analytics Cloud embedded within DWC and is affected by the same Live Connection limitations:
The above shows us the breakdown of the crime count by the Crime Type and the graphic below shows us the crime breakdown in July 2019. The orange bar represents the crimes that are currently under investigation. For instance, we can see that from the 5,542 Violence and Sexual Offences, 1,515 are currently under investigation. This is achieved through the Restricted Measures functionality within the Story Builder.
This measure was created by using the “Restricted Measure” calculation type in Story Builder. Additional Measures can be created within Stories by following a similar approach.
This is evidently helpful as end users can easily create measures to identify key insights within their datasets. However, these measures are confined to a single Story and hence, have to be repeated across individual Stories and thereby run the risk of inconsistency. For measures, that are likely to be consumed by multiple users, it is possible to create such measures in the DWC Model. Let’s take a closer look at this capability. In this scenario, I will create a measure, which identifies Violence and Sexual Offences. This measure can then be used across any Stories and in conjunction with any other measures/dimensions.
Firstly, if you are conversant with SQL, you may choose to use the SQL View and achieve this through a CASE statement over your Table as shown here:
Alternatively, there is also a Function Builder available in the Graphical View.
I must confess that I was unable to find a CASE or IF function amongst the currently supported 151 functions. However, a CASE Function is supported – as are other ANSI Compliant SQL functions. Once scripted, you are able to proceed with saving and deploying the Model.
Let’s dip back into the “Story Builder” for an analysis of the crime data using this new Measure.
The above reveals the proportion of Violence and Sexual Offences there have been across these months and similarly, the below dissects these crimes by the current outcome.
I am reliably informed by my colleagues at SAP, that DWC now provides a full version of SAC which can connect into DWC and as part of this support features such as geospatial analysis. Here is an example of the type of output that could be produced from a similar dataset:
To finish, it can often be helpful to add additional context into analysis such as this. DWC models can be extended to allow users to supplement this data or other organisational data with contextual datasets to help users understand a given situation and ultimately, drive informed decisions. Keeping with this scenario, we may be interested in understanding the Stop and Search activities and how that has affected the above situation.
Helpfully, GOV.UK also provides data on Stop and Search operations. This is split by month and Police Force. Once downloaded, this can be dropped on to our SQL Graphical View and subsequently combined, transformed and modelled as part of the Data Builder functionality. Here is an example of this:
These additional Measures and Attributes can now be analysed across the existing dataset and shared with other users.
In summary, this scenario demonstrates how data can be sourced, combined and transformed before it is analysed and acted up on.