Monday, 5 November 2018

Setting up a HANA Express Python Machine Learning API Demo VM

This is a step-by-step guide to setting up a HANA Express (HXE) VM for demonstrating the Python Machine Learning API with a functioning demo application.

The goal is to eventually provide an automated script to perform all the required actions. Currently some steps require user authentication and can’t be performed automatically.

As of the date of this post, these instructions are specific to HXE version 2.00.033.00.20180925.2 as reported by the HXE Download Manager for platform = Linux / x86-64, image = Virtual Machine, package = Server + applications. The host machine was a MacBook Pro with 16GB of RAM running VMware Fusion 8.5.10 (7527438). Other hypervisors and host platforms may work, but your mileage may vary. Please adjust the following as needed for your situation.

This post does not describe a similar process for installing HXE from the binary installer or the dockerized version. However, you may find much of the following useful if you choose those methods.


Pick the downloader for your platform. macOS isn’t supported directly, so we’ll use the platform-independent Download Manager.

This will download the file HXEDownloadManager.jar. To run the downloader you must have Java installed on your local machine.

Open a terminal window and change to your Downloads directory.

cd ~/Downloads

java -jar HXEDownloadManager.jar

This will run the HXE Download Manager.

Double check that the Version is 2.00.033.00.20180925.2 and that you’ve selected the Server + apps and Clients for Linux options.

This can take some time depending on your internet connection speed.

Once the download is completed, you’ll have a file called hxexsa.ova and one called clients_linux_x86_64.tgz in your Downloads.

Run VMware Fusion or your preferred hypervisor app and import the ova file.

On my machine this took about 20 minutes. Don’t be tempted to give your VM less than 12GB of memory.

When the VM starts up, confirm the keyboard configuration and change the time zone if desired.

Consult the Getting_Started_HANAexpress_VM.pdf file that was downloaded for details of setting up your machine’s hosts file to override the name resolution for the hostname hxehost.
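As a rough sketch of what that hosts-file change looks like on a Mac or Linux host, the two helpers below build a hosts entry and check whether one already exists. The function names and the IP address in the usage line are made up for illustration; use the address your VM actually reports.

```shell
# Prints one hosts-file line mapping an IP address ($1) to a hostname ($2).
hosts_line() {
  printf '%s\t%s\n' "$1" "$2"
}

# Succeeds if the hosts file ($1) already has an entry for hostname ($2)
# on a line that does not start with a comment marker.
has_host() {
  awk -v h="$2" '
    $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
    END { exit !found }
  ' "$1"
}
```

A possible use, with a placeholder address: has_host /etc/hosts hxehost || hosts_line 192.168.56.101 hxehost | sudo tee -a /etc/hosts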

WAIT! Resist the urge to log in right away. Give the system about 15 minutes to settle before continuing. Set a timer for 15 minutes and go get a cup of coffee.

…you waited, right? OK, let’s continue.

Log in as the hxeadm user with the password HXEHana1 (the default). You will be prompted to enter the current password (HXEHana1) again and then your new password twice.

Do not forget this password! It will become the password for the OS hxeadm user as well as the XSA_ADMIN and XSA_DEV users. See the Getting_Started_HANAexpress_VM.pdf file for details.

You will then be prompted for the password of the db SYSTEM user in both the SYSTEMDB and the tenant HXE db.

If your system needs a proxy setting to reach the internet, configure it now.

The system will now do a bunch of installation/configuration/adjustment/tuning. This can take at least ??? minutes and you should wait for this to complete before continuing.

Last chance before continuing.

zypper ar…

From this point we’ll ssh into our server with the built-in Mac ssh client. On Windows you’ll want to use PuTTY and PuTTYgen to get things set up. I’m also setting up passwordless ssh into the server. There are several ways to do this so I won’t go into detail. For a quick setup, use ssh-keygen and ssh-copy-id.

ssh-keygen
ssh-copy-id -i ~/.ssh/id_rsa.pub hxeadm@hxehost

Open a terminal or ssh client window and ssh to the server.

Check that the xs api endpoint is set.

If it isn’t, use the following.

xs api https://hxehost:39030/ --cacert=/hana/shared/HXE/xs/controller_data/controller/ssl-pub/router/default.root.crt.pem

If the XSA_ADMIN user isn’t logged in, use this command to log in with the password set above.

xs login -u XSA_ADMIN

By default the XS Advanced application runtime has a LOT of things running. Since we’re very tight on memory, we can turn off nearly everything except the critical apps. Run these commands.

xs target -o HANAExpress -s SAP ; xs a | grep STARTED | grep -v hrtt-service | grep -v di-runner | grep -v di-core | grep -v deploy-service | cut -d ' ' -f 1 | while read -r line ; do echo "Stopping $line"; xs stop $line ; done

Note that once the above commands have finished, you won’t be able to access the xsa-cockpit, hana-cockpit, or webide or any other xsa utility.

Run the following to see what’s still started.

xs a | grep STARTED

You should only see 4 apps.
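To make the filtering in the stop command easier to follow, here is a small sketch of the same idea as a function. It assumes that the app listing puts the app name in the first column and the word STARTED somewhere on the line; the function and variable names are mine. Unlike the chain of grep -v calls, it matches the kept app names exactly.

```shell
# Names of the apps the post keeps running.
KEEP_APPS='hrtt-service di-runner di-core deploy-service'

# Reads an app listing on stdin and prints the name (first column) of
# every STARTED app that is not listed in KEEP_APPS.
apps_to_stop() {
  awk -v keep="$KEEP_APPS" '
    BEGIN { n = split(keep, names, " "); for (i = 1; i <= n; i++) skip[names[i]] = 1 }
    /STARTED/ && !($1 in skip) { print $1 }
  '
}
```

It could be used in place of the grep pipeline like this: xs a | apps_to_stop | while read -r app ; do xs stop "$app" ; done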

The hxeadm user is by default set up to be able to sudo into the root user.

Become the root user by starting a new /bin/bash shell.

sudo /bin/bash

HANA Express doesn’t come with any repositories configured by default. We will need to add compilation tools and other programs to accomplish our tasks so we need to be able to install software. Configure the repos with the following commands as the root user.

zypper ar http://download.opensuse.org/distribution/leap/42.2/repo/oss/ oss
zypper ar http://download.opensuse.org/distribution/leap/42.2/repo/non-oss/ non-oss
zypper ar http://download.opensuse.org/update/leap/42.2/oss/ update-oss
zypper ar http://download.opensuse.org/update/leap/42.2/non-oss/ update-non-oss

Refresh the repo catalogs with this command as root.

zypper -n --gpg-auto-import-keys refresh
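The four repo URLs differ only in the Leap version and the channel, so if you need to adjust them for your situation it may help to generate the commands from one version string. This is only a sketch; the function name is made up, and it prints the commands rather than running them so you can review them first.

```shell
# Prints the four "zypper ar" commands from the post for a given
# openSUSE Leap version ($1), for example 42.2.
leap_repo_commands() {
  version="$1"
  base="http://download.opensuse.org"
  echo "zypper ar $base/distribution/leap/$version/repo/oss/ oss"
  echo "zypper ar $base/distribution/leap/$version/repo/non-oss/ non-oss"
  echo "zypper ar $base/update/leap/$version/oss/ update-oss"
  echo "zypper ar $base/update/leap/$version/non-oss/ update-non-oss"
}
```

Running leap_repo_commands 42.2 prints the same four commands shown above.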

We will need to build the version of python we will use from source code so we need the standard linux build tools. Get them with the following.

zypper -n --gpg-auto-import-keys install --no-recommends --auto-agree-with-licenses --force-resolution --type pattern devel_basis

We also need a bunch of other tools, libraries and programs. Install them with the following.

zypper -n --gpg-auto-import-keys install --no-recommends --auto-agree-with-licenses --force-resolution tk-devel tcl-devel libffi-devel openssl-devel readline-devel sqlite3-devel ncurses-devel xz-devel zlib-devel wget git-core nodejs npm lynx jq libzip2 libzip inotify-tools

That’s all we need to do as root for now. Drop back to the hxeadm user with exit.

exit

As mentioned at the top of this (now growing long) blog post, I intend to make all of this into an installation script, but it’s also good to have an explanation of the individual steps in case you run into issues. The script and accompanying readme are in the following github repo. We will clone it here as some of the following steps rely on it. If you want to look at my work in progress, open the hxe_python_ml.sh file.

git clone https://github.com/alundesap/hxe_python_ml.git

Now change into the new git repo folder hxe_python_ml.

cd hxe_python_ml

The demo application itself is a different github repo. We’ll clone it here.

git clone https://github.com/alundesap/mta_python_ml.git

We will be getting version 3.6.5 of the python source from python.org.

wget https://www.python.org/ftp/python/3.6.5/Python-3.6.5.tgz

We need the Python Machine Learning API libraries from the client libraries that you downloaded with the HXE Download Manager. Return to your host system and expand the clients_linux_x86_64.tgz file into its component files. On my Mac I use the tar command. On a Windows machine you’ll have to untar and unzip it.

tar xzvf clients_linux_x86_64.tgz

Transfer the hana_ml-1.0.3.tar.gz file to your vm. On my mac I use scp, but on windows you may have to use WinSCP or similar.

scp hana_ml-1.0.3.tar.gz hxehost:/usr/sap/HXE/HDB90/hxe_python_ml/

Download the XS_PYTHON00_1-70003433.ZIP file and then copy it into your vm as before.

scp XS_PYTHON00_1-70003433.ZIP hxehost:/usr/sap/HXE/HDB90/hxe_python_ml/

Return to the terminal or putty session where you are logged in as the hxeadm user.

Make sure you are in the hxe_python_ml folder.

Your folder should now have these files in it.

Next we will expand the python source, configure, build and install it into a local folder called python_3_6_5.

tar xzvf Python-3.6.5.tgz
mkdir python_3_6_5
cd Python-3.6.5
./configure --prefix=/usr/sap/HXE/HDB90/hxe_python_ml/python_3_6_5/ --exec-prefix=/usr/sap/HXE/HDB90/hxe_python_ml/python_3_6_5/ && make -j4 && make altinstall

This will take some time to build. Plan on about 3-4 minutes.

You should end up with pip-9.0.3 and setuptools-39.0.1.

Move up one directory and into the bin folder of the python we just built.

cd ../python_3_6_5/bin

Since we just built a fully functional version of python, it would be nice to use it for localized testing of our project code. A nice way to do that is to create some symbolic links and set up our environment to use it. From the bin folder we just changed into, create the following links.

ln -s easy_install-3.6 easy_install
ln -s pip3.6 pip
ln -s pydoc3.6 pydoc
ln -s python3.6 python
ln -s pyvenv-3.6 pyvenv
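If you end up rebuilding python more than once, the links can be created by a loop that is safe to run again. This is a sketch with a made-up function name; it handles the same five files as the commands above and skips any that are missing or already linked.

```shell
# Creates unversioned symlinks (pip -> pip3.6, and so on) inside the
# given bin directory ($1). Existing names are left untouched.
link_unversioned() {
  bin_dir="$1"
  for versioned in easy_install-3.6 pip3.6 pydoc3.6 python3.6 pyvenv-3.6; do
    plain="${versioned%3.6}"   # strip the version suffix
    plain="${plain%-}"         # strip a trailing dash, if there is one
    if [ -e "$bin_dir/$versioned" ] && [ ! -e "$bin_dir/$plain" ]; then
      ln -s "$versioned" "$bin_dir/$plain"
    fi
  done
}
```

From inside the bin folder it would be called as link_unversioned . to link the files in the current directory.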

Now we can install the python runtime into xs.

xs create-runtime -p /usr/sap/HXE/HDB90/hxe_python_ml/python_3_6_5/

Check that it installed properly.

xs runtimes | grep python

Let’s go up two levels.

cd ../..

When we cloned the hxe_python_ml repo, we got another handy file for setting up the environment for us to test python locally. Run the set_python_env.sh script like this.

Note: There is a space between the dot and set_python_env.sh. This sources the script so that its changes affect the current shell.

. set_python_env.sh

A side effect of this script is that it places you in the mta_python_ml/python folder.

Test that python and pip are the versions we expect.

python --version

Should return Python 3.6.5

pip --version

Should return:

pip 9.0.3 from /usr/sap/HXE/HDB90/hxe_python_ml/python_3_6_5/lib/python3.6/site-packages (python 3.6)

We now have a functioning python that is the version that we built. It’s common for other versions of python to be present on a given system, and it’s easy to get them mixed up, so beware.
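Because mixing up python versions is so easy, a quick guard in your own scripts can catch it early. The helper below is a sketch with a made-up name; it only checks that a tool's version output starts with the text you expect.

```shell
# Succeeds if the actual version output ($2) starts with the expected
# text ($1), for example "Python 3.6.5" or "pip 9.0.3".
version_matches() {
  case "$2" in
    "$1"*) return 0 ;;
    *)     return 1 ;;
  esac
}
```

For example: version_matches "Python 3.6.5" "$(python --version 2>&1)" || echo "Wrong python on the PATH"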

Let’s install the libraries that we copied in earlier. Go up two folders.

cd ../..

Uncompress the SAP Python libraries.

unzip XS_PYTHON00_1-70003433.ZIP -d sap_dependencies

We use the sap_dependencies folder name so that it matches the SAP documentation.

Go back into the mta_python_ml/python folder.

cd mta_python_ml/python

In order to facilitate local testing of our demo app, let’s install the python libraries locally.

pip install -r requirements.txt --find-links ../../sap_dependencies --find-links ../../hana_ml-1.0.3.tar.gz

When we go to deploy our python demo application we’ll need the python dependencies in a special folder called vendor that the buildpack will detect and utilize.

mkdir -p vendor
pip download -d vendor -r requirements.txt --find-links ../../sap_dependencies --find-links ../../hana_ml-1.0.3.tar.gz hana_ml

Since the HANA WebIDE doesn’t support python as a development language yet and we’ve also turned it off, it would be nice to have a way to play with the python machine learning api. There is a python project called jupyter that provides this functionality.

pip install jupyter

Here are a few other python ML packages that you might like to explore (optional).

pip install scikit-learn
pip install mxnet
pip install tensorflow
pip install python-mnist

Create a default jupyter configuration with the first command, then replace it with the configuration files from the repo.

jupyter notebook --generate-config
cp ~/../HDB90/hxe_python_ml/jupyter_config/jupyter_notebook_config.py ~/.jupyter/jupyter_notebook_config.py
cp ~/../HDB90/hxe_python_ml/jupyter_config/jupyter_notebook_config.json ~/.jupyter/jupyter_notebook_config.json

Run the jupyter notebook server.

jupyter notebook

Now you can browse to the Jupyter notebook at http://hxehost:8080/. The password is Plak8484.

Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

There is a README.md file that contains the steps for building and deploying the demo application.

Currently this is a work-in-progress and may change somewhat so I’ll put the important steps here.

Move up one folder level to the top of the demo project.

cd ..

The python machine learning library is a facade for the Predictive Analytics Library (PAL), and PAL requires a DB tenant with the scriptserver daemon running. The default development space in HXE is mapped to the SYSTEMDB, which doesn’t allow the scriptserver daemon to run in it. The HXE tenant does allow the scriptserver to run, so we’ll enable it there.

Note, you’ll need to replace Plak8484 with the password you specified above.

hdbsql -i 90 -n localhost:39013 -u SYSTEM -p Plak8484 -d SYSTEMDB "ALTER DATABASE HXE ADD 'scriptserver'"

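To confirm that the scriptserver was added, run the following query while connected to SYSTEMDB.

SELECT * FROM SYS_DATABASES.M_SERVICES

You can also open an interactive session on the HXE tenant with the following.

hdbsql -i 90 -n localhost:39015 -u SYSTEM -p Plak8484 -d HXE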

We need a space that we can map to the HXE tenant DB so that deploys into that space will be mapped to that tenant. Create the ml space and add users to it.

xs create-space ml -o HANAExpress
xs set-space-role XSA_ADMIN HANAExpress ml SpaceManager
xs set-space-role XSA_ADMIN HANAExpress ml SpaceDeveloper
xs set-space-role XSA_DEV HANAExpress ml SpaceManager
xs set-space-role XSA_DEV HANAExpress ml SpaceDeveloper

Now we need to restart the xsa-cockpit app and enable XSA in the HXE DB tenant.

xs t -s SAP
xs start xsa-cockpit

Once it is running, find its URL.

xs app xsa-cockpit --urls

Browse to it at https://hxehost:51039 (yours may be different) and log in with XSA_ADMIN and the password you specified above.

Click Tenant Databases and the little wand icon to the right of the HXE db line.

Set the SYSTEM user password to the same as the password you used above.

Wait for the status to become Enabled for the HXE db, then click on the little map icon.

Select HANAExpress as the Organization and ml as the Space. Click Save.

Now when we push or deploy something into the ml space, it will get created in the HXE database tenant (which has the scriptserver running in it).

We won’t need the xsa-cockpit anymore so we can stop it.

xs stop xsa-cockpit

Let’s go now to our newly minted ml space and build our demo app.

xs t -s ml

Make sure we’re in the project folder.

cd /usr/sap/HXE/HDB90/hxe_python_ml/mta_python_ml

First we create a service instance to our HANA DB (in the ml space).

xs create-service hana hdi-shared python-ml-hdi

And one for the UAA.

xs create-service xsuaa default python-ml-uaa

Now we build our HDI container manually.

xs push python-ml.db -k 1024M -m 256M -p db --no-start --no-route
xs bind-service python-ml.db python-ml-hdi
xs restart python-ml.db --wait-indefinitely ; sleep 15 ; xs stop python-ml.db

Then the python module. Make a note of the port that the python module was assigned.

xs push python-ml.python -k 1024M -m 256M -n python -p python --no-start
xs bind-service python-ml.python python-ml-hdi
xs bind-service python-ml.python python-ml-uaa
xs start python-ml.python
xs app python-ml.python --urls

And finally the web module. Note: Adjust the port of the destination to match that of the python module above.

xs push python-ml.web -k 1024M -m 256M -n web -p web --no-start
xs bind-service python-ml.web python-ml-uaa
xs set-env python-ml.web destinations '[{"forwardAuthToken":true, "name":"python_be", "url":"https://hxehost:51029"}]'
xs start python-ml.web
xs app python-ml.web --urls
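Since the python module's port will probably differ on your system, it can help to build the destinations value from the URL instead of editing it by hand. The function name below is made up, and the usage line assumes that xs app python-ml.python --urls prints only the URL.

```shell
# Prints the destinations JSON used by the web module, pointing the
# python_be destination at the given backend URL ($1).
destinations_json() {
  printf '[{"forwardAuthToken":true, "name":"python_be", "url":"%s"}]' "$1"
}
```

A possible use: xs set-env python-ml.web destinations "$(destinations_json "$(xs app python-ml.python --urls)")"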

One manual step we have to do is to grant our HDI container user the global role AFL__SYS_AFL_AFLPAL_EXECUTE so that it can execute PAL functions. At some point I will figure out a way to grant this in the application, but for now we have to do it manually.

Here’s a dirty way to get at the user name. Don’t use the hdi_user.

xs env python-ml.python | grep user

"user" : "391FCB5A79B343949DECFD0F9014B082_F4UOB7FMN51QC8ELIJ95YXJAZ_RT"

Modify the following to use your environment’s user name and your SYSTEM password.

hdbsql -i 90 -n localhost:39015 -u SYSTEM -p Plak8484 -d HXE "grant AFL__SYS_AFL_AFLPAL_EXECUTE to 391FCB5A79B343949DECFD0F9014B082_F4UOB7FMN51QC8ELIJ95YXJAZ_RT"
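Copying that long user name by hand is error prone, so here is a sketch that pulls it out of the environment listing and builds the grant statement. It assumes the listing contains a line shaped like the "user" line shown above; both function names are mine. Anchoring on "user" at the start of the line is what keeps it from picking up hdi_user.

```shell
# Reads environment output on stdin and prints the value of the first
# "user" entry, ignoring keys such as "hdi_user".
runtime_user() {
  sed -n 's/^[[:space:]]*"user"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Prints the grant statement from the post for the given user name ($1).
grant_pal_sql() {
  echo "grant AFL__SYS_AFL_AFLPAL_EXECUTE to $1"
}
```

For example: grant_pal_sql "$(xs env python-ml.python | runtime_user)" prints the statement to pass to hdbsql.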

Normally we’d bundle our project up into an mtar file and deploy it, but these steps accomplish the same thing for this example.

Now, after all of that, we have an XS Advanced app that utilizes the python machine learning API.

Browse to the url of the web module. https://hxehost:51030 (Note yours will likely be different.)

This example uses the Support Vector Machine, Support Vector Classifier to train against and predict handwritten digits. Start by drawing a “1” in the box at the left and then clicking the “1” button. Do this about 10 times and the prediction function will start. You’ve just taught it to recognize a “1”. Continue adding other digits and see when it starts correctly predicting them.

Each digit is represented by 784 integers, one for each pixel. So it’s easy to create quite a bit of data pretty fast. In theory each pixel could be a different feature for analysis.
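The figure 784 is consistent with a 28 by 28 pixel grid (28 * 28 = 784), which is the size used by the MNIST digit set. The grid size and the row-by-row ordering are assumptions here, not something taken from the demo code. Under those assumptions, a pixel's position maps to its place in the list like this.

```shell
# Maps a zero-based (row, col) position on a grid of the given width to
# its index in the flattened, row-by-row list of pixel values.
# Arguments: $1 = row, $2 = col, $3 = grid width.
pixel_index() {
  echo $(( $1 * $3 + $2 ))
}
```

With a width of 28, the top-left pixel is index 0 and the bottom-right pixel (row 27, col 27) is index 783, the last of the 784 values.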

Well this has been a long journey and we didn’t even look at the code itself! I’ve found that all the elements to pull this together were scattered in various places so hopefully this helps someone get a functional system working so that they can get to the good machine learning stuff.