SAP HANA Real Time Sentiment Analysis and Text Mining app
You want it, you got it!
Finally, we are publishing all the HANA apps that we, the B1 Solution Architects, developed to demonstrate how far you can go using SAP HANA!
If you are seeing this application for the first time, you didn’t go to one of the Business One Innovation Summits 2015 (Bangkok, Barcelona or Miami). Shame on you; you have to be there next year.
This application implements 2 interesting HANA functionalities:
1 – SAP HANA Text Analysis
This functionality allows us to handle “Unstructured Data” (in our case, tweets). It classifies the terms of a given input into several categories (people, companies, locations, etc.). It is also possible to extract the overall meaning of an unstructured text and classify it as a Positive or Negative Sentiment.
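To make the idea a bit more concrete, here is a minimal sketch of how the output of text analysis could be tallied into an overall sentiment. It is not code from the app: the sample rows are made up, and the column names (TA_TOKEN, TA_TYPE) and sentiment type names are assumptions modelled on the result tables HANA generates, so check them against your own system.

```javascript
// Sketch only: sampleRows is invented data, and the TA_TYPE values matched
// below are assumed names for sentiment annotations.
const sampleRows = [
  { TA_TOKEN: 'Steve Jobs', TA_TYPE: 'PERSON' },
  { TA_TOKEN: 'Apple', TA_TYPE: 'ORGANIZATION/COMMERCIAL' },
  { TA_TOKEN: 'love', TA_TYPE: 'StrongPositiveSentiment' },
  { TA_TOKEN: 'slow', TA_TYPE: 'WeakNegativeSentiment' },
  { TA_TOKEN: 'great', TA_TYPE: 'WeakPositiveSentiment' },
];

// Count positive and negative annotations and derive an overall label.
function summarizeSentiment(rows) {
  let positive = 0;
  let negative = 0;
  for (const row of rows) {
    if (/PositiveSentiment$/.test(row.TA_TYPE)) positive += 1;
    else if (/NegativeSentiment$/.test(row.TA_TYPE)) negative += 1;
  }
  let label = 'Neutral';
  if (positive > negative) label = 'Positive';
  if (negative > positive) label = 'Negative';
  return { positive, negative, label };
}

console.log(summarizeSentiment(sampleRows));
```

Rows that are plain entities (people, companies) are simply ignored by the tally; only the sentiment annotations decide the label.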
2 – SAP HANA Text Mining (Available on SPS09 only)
This feature works with a concept called bag of words, acting not only at the document level (a single tweet) but across a whole set of documents (an entire table of tweets, for example).
With those features we can rank documents by their relevance to a given input, e.g. give me the tweets that are most relevant to the term “Steve Jobs”, or suggest terms that are related to it (“apple”).
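If the bag of words idea is new to you, the sketch below shows the general principle in plain JavaScript: each text becomes a map of word counts, and documents are ranked by cosine similarity to the query. This is only an illustration of the concept, not how HANA implements Text Mining; the function names and the sample tweets are invented for the example.

```javascript
// Illustrative bag-of-words ranking; NOT HANA's actual algorithm.

// Turn a text into a Map of word -> number of occurrences.
function bagOfWords(text) {
  const bag = new Map();
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  for (const word of words) bag.set(word, (bag.get(word) || 0) + 1);
  return bag;
}

// Cosine similarity between two bags (0 = nothing in common).
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [word, count] of a) {
    normA += count * count;
    if (b.has(word)) dot += count * b.get(word);
  }
  for (const count of b.values()) normB += count * count;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Score every document against the query and sort best match first.
function rankByRelevance(query, documents) {
  const queryBag = bagOfWords(query);
  return documents
    .map((text) => ({ text, score: cosine(queryBag, bagOfWords(text)) }))
    .sort((x, y) => y.score - x.score);
}

const tweets = [
  'Steve Jobs introduced the first iPhone in 2007',
  'Traffic is terrible this morning',
  'Remembering Steve Jobs and the early days of Apple',
];
const ranked = rankByRelevance('Steve Jobs', tweets);
console.log(ranked);
```

The tweet that shares no words with the query gets a score of 0 and ends up last. Working on the whole table at once also lets a real engine suggest related terms, which this sketch leaves out.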
Let’s see the app running in this conceptual demonstration:
App Installation (SPS08 or higher required)
As Sally just explained in the video, this solution is composed of 2 applications: Structurer One (the HANA app) and the Tweets Retriever.
You can find all installation files and a detailed presentation on my Dropbox.
Or clone the repository on my GitHub
1 – Structurer One (The HANA App)
You just have to import the Delivery Unit (.tgz file) into your HANA system to have the whole app structure set up. If you don’t know how to work with DUs, use this simple example of how to import one as a guide.
After the import, in the development perspective of HANA Studio, check the software repositories and execute all the SQL commands listed in the file Summit15 > SQL > CreateIndexes.sql
Those commands will activate the Text Analysis features of the app.
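For reference, statements that activate Text Analysis usually take the form of a full-text index created with TEXT ANALYSIS ON. The sketch below builds such a statement as a string. It is not the content of CreateIndexes.sql: the index, schema, table and column names are hypothetical, and the configuration name is assumed to be the standard voice-of-customer one used for sentiment extraction.

```javascript
// Sketch only: every name passed to buildTextAnalysisIndex below is a
// hypothetical placeholder, not taken from the app's CreateIndexes.sql.

// Quote an SQL identifier, doubling any embedded double quotes.
function quoteId(name) {
  return '"' + String(name).replace(/"/g, '""') + '"';
}

// Build a CREATE FULLTEXT INDEX statement with text analysis switched on.
function buildTextAnalysisIndex({ index, schema, table, column, configuration }) {
  return (
    'CREATE FULLTEXT INDEX ' + quoteId(index) +
    ' ON ' + quoteId(schema) + '.' + quoteId(table) +
    '(' + quoteId(column) + ')' +
    " CONFIGURATION '" + String(configuration).replace(/'/g, "''") + "'" +
    ' TEXT ANALYSIS ON'
  );
}

const statement = buildTextAnalysisIndex({
  index: 'TWEETS_TA',           // hypothetical index name
  schema: 'SUMMIT15',           // hypothetical schema
  table: 'TWEETS',              // hypothetical table
  column: 'TEXT',               // hypothetical column holding the tweet text
  configuration: 'EXTRACTION_CORE_VOICEOFCUSTOMER', // assumed configuration
});
console.log(statement);
```

Once an index like this exists, HANA writes the analysis results to a generated table named after the index ($TA_ followed by the index name), which the app can then query.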
The app will be at:
2 – Tweets Retriever
For the app to work completely, we need data. Of course you can input it manually, but the whole idea is to perform Real Time Analysis.
The retriever runs on Node.js, which you can install on any OS, so you can run the tweet retriever from your laptop, for example.
It is a small script that listens to the Twitter API and, every time a new tweet comes up, stores it on a HANA server.
To run it:
- Download and install Node.js
- From a terminal (cmd, shell, etc.), download the script dependencies by running these 3 commands:
- npm install util
- npm install hdb
- npm install twitter
- Register a new application on Twitter Developers to get your own Twitter keys
- Open the node scripts (brands.js, sapb1.js and twitterSummit.js) and fill in your Twitter keys and HANA server information.
Run the Tweets Retriever scripts with the commands:
- node brands.js
- node twitterSummit.js
- node sapb1.js
And it should work like this:
This app is, of course, for demonstration purposes. It is 100% open source and can be enhanced or modified as needed. It was developed from a didactic perspective, so that you can have a comprehensive experience and understand each step.
One point to emphasize is the amount of free third-party resources that I used here, to show you that, once working on HANA, we are free to work with countless libraries and resources.
- The UI is 100% built with Twitter Bootstrap
- Dashboards are made with Morris.JS
- Maps are on jVectorMap
- Rest calls with jQuery
- And others…
SAP embraces and supports (a lot) the open source community. You can see several examples on the SAP GitHub repository. Here we are using the HANA Node driver, which is one example of it, and I bet you will hear more and more about this in the near future.
Let me know what you think and don’t forget to follow me on Twitter: @Ralphive