Written by Asad Mahmood, Head of Database and Technology Consulting, itelligence UK
I stopped at one of the prominent SAP HANA stands before making my way to what continues to be a fine lunch at SAP SAPPHIRE, and had a very interesting discussion and demonstration from the helpful guys on the stand. I wanted to share with you the extent of support you can expect from SPS5 around unstructured data and how I envisage this helping businesses.
As discussed in my previous blog, SPS5 brings with it a spectacular array of capabilities which will help businesses solve their data challenges. Having worked in Data Warehousing and Business Intelligence for many years, I find that the perennial problem continuing to spook businesses is the growing volume of unstructured data. A simple search on your preferred search engine will provide a measure of this. Here is the first result my search returned:
- 80 percent of business is conducted on unstructured information
- 85 percent of all data stored is held in an unstructured format
- Unstructured data doubles every three months
- 7 million web pages are added every day
We often talk about unlocking value from data and delivering information into the hands of business users, but the reality is that this has overwhelmingly focussed on structured data. In the shadow of this success looms an untapped source of value and competitiveness locked away in its unstructured counterpart. Of course we have made strides through the use of APIs offered by social media networks, text processing capabilities during Extract, Transform and Load (ETL) operations, and working with partners such as Netbase in the Social Intelligence space. Even so, we have been stretched when trying to bring to unstructured data the sophisticated analytical capabilities we enjoy over structured data.
So what’s changed, I hear you ask? I have already mentioned the embedding of Text Analysis processing in the SAP HANA platform, along with the announcement of Extended Services (XS). You may already be aware that we have Binary Large Object (BLOB) support in the platform. When these capabilities are considered collectively, we have everything needed to solve our perennial challenge except a suitable business interface that allows the most important part of the process to take place: arriving at an actionable insight. Well, we have it now!
SPS5 will ship with what has been described as an HTML5 “Framework” based upon the new XS engine in SAP HANA, which can quickly be adapted to provide a web based user interface. The interface allows a user to enter a search term and, as the term is being typed, continuously offers suggestions based upon the data that resides in SAP HANA, much like the Bing or Google search functionality. Both the search terms entered by a user and the suggestions offered while the search field is being filled in are drawn from the SAP HANA contents, and this is where the sophistication of the BLOB support and Text Processing must be considered.
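To make the type-ahead behaviour concrete, here is a minimal sketch in plain Python. It is not the SAP HANA or XS API, and the function names and sample terms are my own assumptions for illustration; it only shows the general idea of offering suggestions from terms that already exist in the data as the user types.

```python
from bisect import bisect_left

def build_term_list(values):
    """Return a sorted, de-duplicated list of lower-cased terms (hypothetical helper)."""
    return sorted({v.lower() for v in values})

def suggest(terms, prefix, limit=5):
    """Return up to `limit` stored terms that start with `prefix`."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # Binary search to the first term that could match the prefix.
    start = bisect_left(terms, prefix)
    out = []
    for term in terms[start:]:
        if not term.startswith(prefix) or len(out) == limit:
            break
        out.append(term)
    return out

# Illustrative sample data only; in practice the terms would come from the database.
terms = build_term_list(["London", "Londonderry", "Leeds", "Manchester", "Lisbon"])
print(suggest(terms, "lo"))
```

Because the term list is kept sorted, each keystroke costs one binary search plus a short scan, which is why suggestions can be refreshed continuously while the user is still typing.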
The collective power of these components allows the SAP HANA engine to present nouns (names of places, people, job titles, etc.) along with the text contents of a document, whether that is a Microsoft Word document, a PDF, etc. The contents of such documents are included in a Full Text index which resides in memory. This very simple but exceptionally powerful search capability therefore allows a business user to search through both structured and unstructured data stored in SAP HANA through an interface akin to any familiar web based search engine. The search results are returned along with corresponding analytical facets, which are customisable and resemble the type of visualisation one would experience in SAP BusinessObjects Explorer. The user can then select the appropriate search result, drill into the detail and, as part of this, fetch and open any corresponding documents. The associative capabilities built into this framework also allow a user to view similar search results, again akin to the experience one would expect from a conventional search engine.
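The combination of a full text index and analytical facets can be illustrated with a small sketch, again in plain Python rather than anything SAP HANA specific. The record layout, field names and helper functions are assumptions made for the example: each record has free text (title and body) plus a structured attribute (`type`), the text goes into an in-memory inverted index, and the facet is simply a count of matching records per attribute value.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Split text into lower-cased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(records):
    """Map each token to the set of record ids whose text contains it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for tok in tokenize(rec["title"] + " " + rec["body"]):
            index[tok].add(rec_id)
    return index

def search(index, records, query, facet_field):
    """Return ids matching every query token, plus counts per facet value."""
    tokens = tokenize(query)
    if not tokens:
        return [], Counter()
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    facets = Counter(records[r][facet_field] for r in hits)
    return sorted(hits), facets

# Illustrative records only: free text plus one structured attribute.
records = {
    1: {"title": "Supplier contract", "body": "Renewal terms for the London office", "type": "PDF"},
    2: {"title": "Customer complaint", "body": "Late delivery to London", "type": "Email"},
    3: {"title": "Supplier review", "body": "Annual review of delivery performance", "type": "PDF"},
}
index = build_index(records)
hits, facets = search(index, records, "london", "type")
print(hits, dict(facets))
```

Selecting a facet value would then amount to filtering the hit list by that attribute, which is the drill-down step described above.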
Potential use cases range from the need to simply and seamlessly search across both structured and unstructured data for information relating to a given term, through to sophisticated brand management, which may require detailed analysis of social media expressions, documents, and the patterns and contexts of the various pieces of text. I am interested to learn about the evolution and maturity of this application, and specifically any potential integration with the rest of the SAP BusinessObjects suite, but it is a great illustration of how SPS5 will come together to solve real business challenges nonetheless.
For the techies amongst us, the framework is currently designed to interface with an SAP HANA Attribute View, but I expect that this will evolve to include Analytic Views, allowing the search criteria and results to correspond to measures. The measures are currently limited to the instance count of a given attribute based on the search definition.