SAP HANA Self Learning as Never Before (Part 1) – the first lesson for startups to learn HANA

By admin Last updated Feb 15, 2022

As a technology advisor to the startups in the SAP Startup Focus Program, I have the opportunity to work with innovative startups from many different areas. SAP Startup Focus is a 12-month global program for startups with big data, predictive analytics and/or real-time data decision solutions. We make SAP HANA available to the startup community and help eligible startups accelerate the development of their solutions. We also help startups with validated HANA solutions gain market traction.

I’m from the second phase of the program, which we call Development Accelerator. In this phase we help startups build a Minimum Viable Product (MVP) within a one-year period of free technical support. As a hands-on member of the team, my job is to help startups solve all kinds of technical problems and to advise on architecture design. I also own the technical direction and the creation of technical content for startup education. My team has run many prototyping workshops (one-day classroom trainings) for startup engineers, and we have trained many startups all over the world.

I first studied the existing educational content. Thanks to the SAP product and development teams for creating the amazing SAP HANA Interactive Education (SHINE), which is a solid start for developers new to SAP HANA. Click the link to learn more about SHINE.

The reason we cannot simply reuse SHINE is the diversity of the startups: many of them are not from the enterprise world, and it is hard for them to understand the data model of the SAP EPM system. Besides, startups want something fun, so I decided to create something interesting and closer to the startup mindset. We have used this content to train many startups, and the feedback was that they enjoyed it very much, so I decided to share it with you through a series of blogs. This is the first blog, in which I will cover the overview.

OK, let’s get started. I evaluated many open datasets, including Twitter data (which you may have seen in another blog of mine), LinkedIn data and some others. Eventually the CrunchBase data stood out because it is so close to what I wanted. For those who don’t know CrunchBase yet, the simple description is that it is a dataset about startups, investors, competitors, fundings and acquisitions, so you can imagine it is very close to the daily life of startups.

Data Model

CrunchBase is a free database of technology companies and startups, operated by TechCrunch. It comprises around 500,000 data points profiling companies, people, investors, fundings and acquisitions. Below is the number of data points for each entity type in CrunchBase:

CrunchBase itself doesn’t compare companies, and it offers no option to aggregate, calculate or even discover the relationships between the various datasets. By loading the data into an in-memory database like SAP HANA and using its data modeling tools or embedded analysis algorithms, some very interesting questions like the ones below can be answered in real time:

What kinds of companies have more opportunities to receive investment or be acquired?
Who are the likely competitors of a company?
What is the location distribution of companies that have received investments in more than 3 rounds?
What is the shortest or the average time to IPO?
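To make the third question concrete, here is a minimal sketch in plain Python of the aggregation behind it. It is not HANA code, and the sample records, field layout and the function name location_distribution are all hypothetical, made up for illustration rather than taken from the CrunchBase schema.

```python
from collections import Counter

# Hypothetical sample data: company name -> headquarters city.
companies = {
    "acme": "San Francisco",
    "globex": "Berlin",
    "initech": "San Francisco",
}

# Hypothetical funding records: (company name, round code).
funding_rounds = [
    ("acme", "seed"), ("acme", "a"), ("acme", "b"), ("acme", "c"),
    ("globex", "seed"), ("globex", "a"),
    ("initech", "seed"), ("initech", "a"), ("initech", "b"),
    ("initech", "c"), ("initech", "d"),
]


def location_distribution(companies, funding_rounds, rounds_threshold=3):
    """Count companies per city, keeping only companies that have
    more than `rounds_threshold` funding rounds."""
    rounds_per_company = Counter(name for name, _ in funding_rounds)
    return Counter(
        companies[name]
        for name, count in rounds_per_company.items()
        if count > rounds_threshold and name in companies
    )


distribution = location_distribution(companies, funding_rounds)
print(distribution)
```

In HANA the same grouping and filtering would run inside the database, which is what makes real-time answers possible on the full dataset.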

The diagram below shows the entity relationships. Each company can have zero to many funding rounds, acquisitions, IPOs, people who work or have worked for the company, competitors and offices. The financial organizations are usually venture capitalists.
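As a rough illustration of these relationships, the entities could be sketched as Python dataclasses. This is only an assumption about how the pieces fit together: the class and field names are hypothetical and do not reflect the actual CrunchBase or HANA table definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FundingRound:
    round_code: str            # e.g. "seed", "a", "b" (hypothetical codes)
    raised_amount_usd: float
    # Names of the financial organizations (usually venture capitalists).
    investors: List[str] = field(default_factory=list)


@dataclass
class Company:
    name: str
    # Every relationship is "zero to many", so each defaults to an empty list.
    funding_rounds: List[FundingRound] = field(default_factory=list)
    acquisitions: List[str] = field(default_factory=list)
    ipos: List[str] = field(default_factory=list)
    people: List[str] = field(default_factory=list)
    competitors: List[str] = field(default_factory=list)
    offices: List[str] = field(default_factory=list)

    def total_raised(self) -> float:
        """Sum of the amounts raised across all funding rounds."""
        return sum(r.raised_amount_usd for r in self.funding_rounds)


acme = Company(
    name="acme",
    funding_rounds=[
        FundingRound("seed", 500_000.0, ["Alpha Ventures"]),
        FundingRound("a", 2_000_000.0, ["Alpha Ventures", "Beta Capital"]),
    ],
    offices=["San Francisco"],
)
print(acme.total_raised())
```

A company with no funding, acquisitions or IPOs is still valid here, which matches the "zero to many" cardinality described above.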

There are many ways to use the data to find insights into the startup and investor community. But don’t forget that our mission here is to use it to demonstrate HANA capabilities. Here are some examples:

Modeling: Investment history model to aggregate all the funding records of each financial organization
SQLScript Procedures: Define proprietary algorithms, such as calculating startup ranks based on fundings or analyzing the competition landscape
Text Analysis: Extract sentiment results from company-related information
Predictive Analysis: Investor clustering
Geospatial Analysis: Funding and acquisition location distributions
Visualization: Use SAPUI5 for Mobile and CVOM charts to show funding and acquisition records
XS Engine (OData & XSJS): Declare OData services or XSJS services to expose data to the UI layer
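To give a feel for the first item, the aggregation behind an investment history model can be sketched in plain Python. This is not a HANA model, only the logic such a model would express, and the record layout, organization names and the function name investment_history are hypothetical.

```python
from collections import defaultdict

# Hypothetical funding records:
# (financial organization, company, raised amount in USD).
funding_records = [
    ("Alpha Ventures", "acme", 2_000_000),
    ("Alpha Ventures", "initech", 5_000_000),
    ("Beta Capital", "acme", 1_500_000),
]


def investment_history(records):
    """Aggregate all funding records per financial organization."""
    history = defaultdict(
        lambda: {"deals": 0, "total_usd": 0, "companies": set()}
    )
    for organization, company, amount in records:
        entry = history[organization]
        entry["deals"] += 1
        entry["total_usd"] += amount
        entry["companies"].add(company)
    return dict(history)


history = investment_history(funding_records)
for organization, summary in history.items():
    print(organization, summary["deals"], summary["total_usd"])
```

The other items in the list build on the same data: ranking, clustering and geospatial analysis all start from aggregates like this one.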