/hana-ml-py-codejam

Material (learning content and exercises) for SAP CodeJams on getting started with Machine Learning using SAP HANA Cloud and Python.

CodeJam - Getting Started with Machine Learning using SAP HANA and Python

Description

This repository contains the material for the CodeJam on Getting Started with Machine Learning using SAP HANA and Python.

In this CodeJam you will learn how the machine learning process unfolds when using a Python client for SAP HANA.

Overview

Overview sessions

These recorded sessions are optional. Watch them ahead of the event to get an overview, or after the event to recap.

  1. Build your Machine Learning Scenario for your SAP HANA Cloud application from Python - Devtoberfest'22
  2. Accelerate your Machine Learning efforts - benefit from SAP HANA Cloud AutoML - SAP Community Call

Requirements

The requirements to follow the exercises in this repository, including hardware and software, are detailed in the prerequisites file.

Material organization

The material consists of a series of exercises. Each exercise is a notebook to be executed.

Following the exercises

During the CodeJam you will complete the exercises one at a time. Each exercise includes discussion points, marked as "🤓 Let's discuss", which the instructor leads with the entire CodeJam class.

If you finish an exercise early, please resist the temptation to continue with the next one. Instead, explore what you've just done and see if you can find out more about the subject that was covered. That way we all stay on track together and can benefit from some reflection via the questions (and answers).

The exercises

Here's an overview of the exercises in this CodeJam.

Make certain that you have successfully completed all the prerequisites.

Setup:

  1. Set up SAP Business Application Studio and a dev space, or
  2. Set up GitHub and a codespace

Machine Learning:

  1. Check your setup
  2. Basics of HANA DataFrames
  3. Exploratory Data Analysis, or EDA
  4. Training an ML model using Classification
  5. Training an ML model using Train/Test split
  6. Preprocessing - Exclude High Cardinality
  7. Preprocessing - Missing Values
  8. Preprocessing - Feature Engineering
  9. Model tuning
  10. Auto ML
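
To give a feel for three of the ideas named in the list above (excluding high-cardinality columns, handling missing values, and a train/test split), here is a minimal sketch in plain Python. It is not the code from the exercises, which use the Python client against SAP HANA; the toy records, the column names, and the helper functions `drop_high_cardinality`, `impute_mean`, and `train_test_split` are all hypothetical and only illustrate the concepts on a small in-memory list.

```python
import random

# Hypothetical toy records (not the CodeJam dataset).
# "name" is unique per row, "age" is missing in every fourth row.
rows = [
    {
        "name": f"person-{i}",
        "age": None if i % 4 == 0 else 20 + i,
        "segment": "A" if i % 2 else "B",
        "label": i % 2,
    }
    for i in range(1, 21)
]


def drop_high_cardinality(rows, max_ratio=0.5):
    """Drop text columns whose share of distinct values exceeds max_ratio."""
    n = len(rows)

    def is_high(col):
        values = [r[col] for r in rows if r[col] is not None]
        is_text = all(isinstance(v, str) for v in values)
        return is_text and len(set(values)) / n > max_ratio

    kept = [c for c in rows[0] if not is_high(c)]
    return [{c: r[c] for c in kept} for r in rows]


def impute_mean(rows, col):
    """Replace missing (None) values in a numeric column with the column mean."""
    present = [r[col] for r in rows if r[col] is not None]
    mean = sum(present) / len(present)
    return [{**r, col: mean if r[col] is None else r[col]} for r in rows]


def train_test_split(rows, test_ratio=0.25, seed=42):
    """Shuffle reproducibly, then cut the rows into a training and a test part."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


cleaned = impute_mean(drop_high_cardinality(rows), "age")
train, test = train_test_split(cleaned)
print(len(train), "training rows,", len(test), "test rows")
```

The unique `name` column is dropped because it cannot generalize to unseen rows, the gaps in `age` are filled so that every row stays usable, and the model would then be fitted on `train` and evaluated only on `test`.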

Known Issues

No known issues as of now.

Feedback

If you can spare a couple of minutes at the end of the session, please help me improve for next time by giving me some feedback.

Simply use this Give Feedback link to create a special "feedback" issue, and follow the instructions in there.

How to obtain support

Create an issue in this repository if you find a bug or have questions about the content.

For additional support, ask a question in SAP Community using the SAP HANA tag.

Further connections and information

Here are a few pointers to resources for further connections and information:

Contributing

If you wish to contribute code or offer fixes or improvements, please send a pull request. Due to legal reasons, contributors will be asked to accept a DCO when they create their first pull request to this project. This happens in an automated fashion during the submission process. SAP uses the standard DCO text of the Linux Foundation.

License

Copyright (c) 2023 SAP SE or an SAP affiliate company. All rights reserved. This project is licensed under the Apache Software License, version 2.0 except as noted otherwise in the LICENSE file.