Welcome to Driverless AI Courses

These are instructor-led, hands-on courses on the H2O.ai Driverless AI product.

  • AUDIENCE: Everyone
  • LEARNING PATH LEVEL: Beginner, Intermediate, Expert
  • 11 Tasks


Learning Path

Learning Path Description:

  • The learning path is essentially flat: all courses share a single prerequisite, Course 0: “Introduction to Driverless AI”.
  • Course numbers indicate grouping, not sequence. For example, you can take Course 5 immediately after completing Course 0.
  • A, B, and C offerings within a number are independent and can be taken in any order.
  • The only exception to the rules above: Course 2C is a prerequisite for Course 4C and for the Special Topics Courses 3A and 3B.

Universal Prerequisite

  • Course 0: Introduction to Driverless AI

    This is a hands-on introduction to the enhanced Automatic Machine Learning (AutoML) functionality provided by Driverless AI. We introduce you to data visualization for predictive models, automated feature engineering and model optimization using an adaptive evolutionary approach, automatic report generation, machine learning interpretability, and one-click model deployment, all using H2O.ai Driverless AI.

Refresher

Data Science Modelers

  • Course 2A: Machine Learning Interpretability in Driverless AI

    In this hands-on training session, we guide you through a deep dive into Driverless AI’s Machine Learning Interpretability functionality and features. We discuss the use of surrogate models such as LIME (local interpretable model-agnostic explanations) and surrogate Decision Trees, as well as LOCO (leave-one-covariate-out) in conjunction with Random Forests. We investigate the game-theoretic roots of Shapley values and explain how they can be used, on both original features and engineered features, to understand the impact of variables on the final predictions. We demonstrate practical uses of these metrics, such as creating reason codes, understanding impact at an individual row level through ICE (individual conditional expectation), disparate impact analysis, and sensitivity analysis. We conclude by showing how this functionality can be deployed to score in real time or at scale on new data.

  • Course 2B: Exploring Expert Settings in Driverless AI

    In the “Introduction to Driverless AI” course, you learned how to use Driverless AI to build predictive models, adjusting the modeling recipe using the Accuracy, Time, and Interpretability knobs. In this hands-on training session, we look under the hood of Driverless AI and explore in detail various options for controlling the system, software, feature engineering, models, and recipes used to power Driverless AI. We demonstrate the effect of various options live, by comparing the results of experiments built with and without those options. In particular, we concentrate on expert settings that impact feature engineering, model selection and creation, and reproducibility. (Note: Expert Settings for NLP and Time Series are discussed in their respective courses.)

  • Course 2C: Customizing Driverless AI with Recipes

    In this hands-on training session, we show you how you can extend the functionality of the Driverless AI platform by incorporating additional Python code as recipes. Recipes allow us to take advantage of the Driverless AI platform infrastructure to solve a vast multitude of problems customized to each specific use case. We review resources for finding recipes and show how to import directly from URL, or download, edit, and upload the recipe. We demonstrate practical uses of data recipes, transformer recipes, model recipes, and scorer recipes. We show you how to customize these recipes for your own particular use cases.
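
As a small illustration of the game-theoretic idea behind the Shapley values mentioned in Course 2A, the sketch below computes exact Shapley values for a toy cooperative game by averaging each player's marginal contribution over all orderings. The feature names and the payoff function `toy_value` are hypothetical, and this is the textbook definition rather than the algorithm Driverless AI uses internally.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical payoff: two additive effects plus one interaction."""
    score = 0.0
    if "income" in coalition:
        score += 10.0
    if "age" in coalition:
        score += 4.0
    if "income" in coalition and "age" in coalition:
        score += 6.0  # interaction term, shared equally by the two features
    return score


features = ["income", "age", "zip"]
phi = shapley_values(features, toy_value)
print(phi)
```

The values sum to the payoff of the full coalition, and a feature that never changes the payoff (here `zip`) receives zero. Enumerating all orderings grows factorially with the number of features, so practical tools rely on approximations or model-specific shortcuts.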

Special Topics

  • Course 3A: Time Series in Driverless AI

    In the “Introduction to Driverless AI” course, you learned how to use Driverless AI to build regression and classification predictive models for independent and identically distributed (IID) data. In this hands-on training session, we introduce the time series capabilities of Driverless AI. We demonstrate the default nonparametric (gradient boosting) approach for building time series predictions, along with discussions of gaps in training and testing, transformers used in time series models, and expert settings specific to Time Series. Beyond expert settings, we show you how to implement, via recipes, other time series approaches such as ARIMA or Facebook Prophet models. We finish by discussing diagnostics and MLI for time series models, including Shapley values for individual time series predictions.

  • Course 3B: Natural Language Processing in Driverless AI

    In the “Introduction to Driverless AI” course, you learned how to use Driverless AI to build regression and classification predictive models for numeric data. In this hands-on training session, we show that Driverless AI works “out of the box” for NLP. However, under the hood there is much more that can be done to improve NLP models. We explore Expert Settings and show you how to enable word- and character-based embeddings using TensorFlow. We discuss the difference between TF-IDF and Word2Vec-type approaches, and compare their respective advantages and disadvantages. We explore MLI for NLP: how to understand what our NLP models are predicting. We finish by surveying additional model and transformer recipes that may be useful in NLP use cases, and demonstrate their use.
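
To make the TF-IDF idea from Course 3B concrete, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the unsmoothed textbook weighting (term frequency times log of inverse document frequency); the function name `tf_idf` and the sample documents are hypothetical, and Driverless AI's own text transformers may use a different variant.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document using unsmoothed TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [
    "the model is fast",
    "the model is accurate",
    "the report is long",
]
weights = tf_idf(docs)
for doc, w in zip(docs, weights):
    print(doc, "->", w)
```

Terms that occur in every document get a weight of zero, while rarer terms are weighted more heavily. Unlike Word2Vec-type embeddings, this representation treats each term independently and carries no notion of similarity between words.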

Data Science Coders

  • Course 4A: Using the Python Client in Driverless AI

    This is a hands-on training session where we show you how to programmatically call the Driverless AI API from Python. This course is for data scientists who prefer coding in Python to using menus and options via the user interface (UI). We exclusively employ Python to replicate many of the steps in the “Introduction to Driverless AI” course, while showing how the UI can simultaneously be used for monitoring Driverless AI if preferred. We also demonstrate how calling Driverless AI from Python allows the user to create custom analyses and processes that are otherwise not available in the UI.

  • Course 4B: Using the R Client in Driverless AI

    This is a hands-on training session where we show you how to programmatically call the Driverless AI API from R. This course is for data scientists who prefer coding in R to using menus and options via the user interface (UI). We exclusively employ R to replicate many of the steps in the “Introduction to Driverless AI” course, while showing how the UI can simultaneously be used for monitoring Driverless AI if preferred. We also demonstrate how calling Driverless AI from R allows the user to create custom analyses and processes that are otherwise not available in the UI.

  • Course 4C: Extending Driverless AI by Writing Recipes

    In this hands-on training session, we extend what you learned in “Customizing Driverless AI with Recipes” to teach you how to write your own recipes from scratch. This will allow you to incorporate Python code, including proprietary or self-written libraries and packages, into Driverless AI yourself.

Data Engineers

  • Course 5: Deployment with Driverless AI

    In the “Introduction to Driverless AI” course, we showed you how to deploy a model using a one-click deployment to a REST server. In this hands-on training session, we open up the world of scoring pipeline deployment to you. We compare advantages and disadvantages of Python and MOJO scoring approaches. We download MOJOs and show how they can be scored (1) in Java, (2) from Python, and (3) from R. We also create and download a Python scoring pipeline, and compare its capabilities with the MOJO called from Python. Finally, we discuss additional ways of deploying Driverless AI models, including real-time scoring, deployment as a Spark, Hive, or database UDF (user-defined function), and other options. We also point to additional resources for deployment patterns.