Machine Learning Foundations Course

This course is designed to be a hands-on complement to the AI Fundamentals course offered by H2O.ai.  While a data science background is not required, successful learners should have some familiarity with Python and R.

Prerequisites

  1. This course assumes you have some foundational AI knowledge.
  2. This course assumes some basic familiarity with statistics.

Series Description

Machine learning (ML) is one of the most active areas of artificial intelligence. Through machine learning algorithms, computers can learn new things without being explicitly programmed. Statistical methods can make sense of the large amounts of data available and turn them into new insights. These ML algorithms can help machines classify the things we hear, the images we see, and the videos we watch. Machine learning algorithms can also help discover new health remedies, generate art or write songs, and answer the questions we ask. Over the course of the series, we will review the types of machine learning (supervised, unsupervised, and reinforcement) as well as how to boost predictions through ensembling or AutoML tools. We will also get hands-on with popular ML methods for different problem types and with data on a massive scale.

About the Speakers

Chemere Davis

Chemere is a passionate data science leader and educator with strong technical skills. She is actively involved in data science community outreach and volunteer opportunities. An experienced financial services data scientist, Chemere also leads many of our Customer Support engagements with financial services customers. Connect with her on LinkedIn.

Parul Pandey

Parul is a Data Science Evangelist at H2O.ai and a Kaggle Kernels Grandmaster. She comes from an Engineering background and combines Data Science, evangelism, and community in her work. Her emphasis is on spreading information about H2O and Driverless AI to as many people as possible through meetups and writeups. Parul was one of LinkedIn’s Top Voices in the Software Development category in 2019. She can be reached on LinkedIn and Twitter.

Sanyam Bhutani

Sanyam Bhutani is a Machine Learning Engineer and AI Content Creator at H2O.ai. He is also an active Kaggler and an AI blogger on Medium and Hackernoon, with over 1 million views overall. Sanyam also hosts the Chai Time Data Science Podcast, where he interviews top practitioners, researchers, and Kagglers. You can follow him on Twitter or subscribe to his podcast.

Jo-Fai Chow

Jo-fai (or Joe) has multiple roles (data scientist / evangelist / community manager) at H2O.ai. Since joining H2O.ai in 2016, Joe has delivered H2O talks and workshops in 40+ cities around Europe, the US, and Asia. He is also the co-organiser of H2O’s EMEA meetup groups, including London Artificial Intelligence & Deep Learning, one of the biggest data science communities in the world with more than 11,000 members. After years of non-stop #AroundTheWorldWithH2Oai action, he is now best known as the H2O #360Selfie guy.

LinkedIn

Kaggle

GitHub

Twitter

Elena Boiarskaia

As a Senior Solutions Engineer, Elena is passionate about helping H2O customers solve advanced data science problems while maximizing business value. With a background in Math and Economics, Elena loves to explore diverse applications of machine learning, earning her PhD from the University of Illinois with a dissertation focusing on predicting health outcomes using accelerometers. Previously, Elena worked with a variety of big data use cases on Spark while at Databricks, as well as building machine learning models to identify manipulative activity in the US markets as the Lead Data Scientist at the Financial Industry Regulatory Authority (FINRA). LinkedIn

Benjamin Cox

Ben Cox is a Director of Product Marketing at H2O.ai and leads Responsible AI research and thought leadership. Ben has led teams of data scientists and machine learning engineers at Ernst & Young, Nike, and NTT Data. Ben holds an MBA from the University of Chicago Booth School of Business with concentrations in Business Analytics, Economics, and Econometrics & Statistics, and a Bachelor of Science in Economics from the College of Charleston.

Dan Darnell

Dan is an experienced product marketer with over twenty years of experience in leading technology companies. For the past nine years, he has been working on AI platforms and applications, including senior marketing roles at DataRobot, ParallelM, Talend, and Baynote. Before that, Dan was focused on analytics and optimization technologies at Adchemy, Interwoven, Oracle, and Siebel Systems. He holds an MBA from Carnegie Mellon University and a Bachelor's in engineering from The University of Colorado at Boulder.

Ingrid Burton

Ingrid Burton is CMO at H2O.ai, the open source leader in AI and machine learning. She is helping companies determine their AI transformation, and helps them on their AI journey. Prior to H2O.ai she was CMO at Hortonworks and helped position the company for growth. At SAP she co-created the Cloud strategy, led SAP HANA and Analytics marketing, and drove developer outreach. She also served as CMO at Silver Spring Networks and Plantronics after spending almost 20 years at Sun Microsystems, where she was head of Sun marketing, led Java marketing to build out a thriving Java developer community, championed and led open source initiatives, and drove various product and strategic initiatives. A developer early in her career, Ingrid holds a BA in Math with a concentration in Computer Science from San Jose State University.

David Engler 

David Engler is a Senior Data Scientist and the Director of Customer Success at H2O. He has 15 years of experience leading data science teams in healthcare research and analytics and has over 20 publications in medical analytics as a primary author. He most recently built and led the analytics team for healthcare strategy at the University of Utah hospitals and clinics. David obtained his Ph.D. in Biostatistics from Harvard University.

Rafael Coss

Rafael Coss is a Community Maker at H2O.ai. He works on nourishing, supporting, and growing the community (online, meetups, tutorials, ...). He also works closely with prospects and customers, helping them on their AI journeys. Lastly, he works on technical marketing (messaging, content, partners).

Prior to joining H2O.ai, he was a technical marketing and community Director and a developer advocate at Hortonworks. He was also the DataWorks Summit Program Co-Chair for 3 years. Prior to Hortonworks, he was a Senior Solution Architect and Manager of IBM's WW Big Data Enablement team. At IBM he was responsible for the technical product enablement for BigInsights and Streams. Previously, he held several other positions at IBM, where he worked on tools, XML db, federated db, and Object-Relational db.

Twitter Linkedin

  • Module 0 - Start With The Business Problem, Again

    Contains 3 Component(s) Includes Multiple Live Events. The next is on 08/27/2020 at 7:00 AM (PDT)

    You know you need data to use a machine learning algorithm but one of the most critical aspects of machine learning success and eventual implementation is having a thorough understanding of the problems you want the data to solve using AI. In this module, we will review frameworks that will take you from data curation, to an AI solution, to value identification and quantification.

    Learning Outcomes
    • Construct a framework canvas for ML problems
    • Use that framework for a specific use case
    • Identify inputs for ML model
    • Determine quantifiable outputs

  • Module 1 - Data Wrangling & Data Curation

    Contains 4 Component(s) Includes Multiple Live Events. The next is on 09/01/2020 at 7:00 AM (PDT)

    At the very foundation of ML is data. Getting the right information for the specific problem you are trying to solve can make or break your machine learning success. Learn what it takes to get an accurate set of information geared toward the business problem you are trying to solve. In this session, we will use tools like Python, R, and H2O-3.

    Learning Outcomes
    • Identify and categorize data by type
    • Perform summary statistics on data/features
    • Create appropriate data visualizations on each feature
    • Construct a model table with inputs and target/label
    • Apply a sampling strategy
    • Split data for modeling
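
    The last two outcomes above (sampling and splitting) can be illustrated with a stratified train/test split. This is a minimal sketch in plain Python, not the course's own material or the H2O-3 API; the function name `stratified_split`, the `churned` column, and the toy rows are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, test_fraction=0.2, seed=42):
    """Split rows into (train, test), keeping label proportions similar in both."""
    rng = random.Random(seed)

    # Group rows by their label so each class is sampled separately.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for label_rows in by_label.values():
        shuffled = label_rows[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test


# Hypothetical toy data: 20 "yes" rows and 80 "no" rows.
rows = [{"id": i, "churned": "yes" if i < 20 else "no"} for i in range(100)]
train, test = stratified_split(rows, "churned")
print(len(train), len(test))
```

    Sampling within each label group keeps a rare class from disappearing from the test set, which a purely random split can do on small or imbalanced data.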

  • Module 2 - Feature Engineering in Machine Learning

    Contains 7 Component(s) Includes Multiple Live Events. The next is on 09/08/2020 at 7:00 AM (PDT)

    Feature engineering will level up your machine learning algorithm. In many cases, feature engineering can be as important as, or sometimes more important than, the actual machine learning algorithm you use. In this module, you will be introduced to various feature engineering techniques and feature selection strategies.

    Learning Outcomes
    • Perform normalization on numeric features
    • Bucket numeric features
    • Transform categorical data based on type
    • Impute missing values based on a specific method
    • Identify outliers
    • Apply a log transformation to specific features
    • Perform grouping operations on both numeric and categorical features
    • Split raw data into new features
    • Extract additional information from time data
    • Turn transactional data into IID data
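
    Several of the outcomes above are small, self-contained transformations. The sketch below shows one possible plain-Python version of four of them: min-max normalization, a log transformation, bucketing, and extracting fields from a timestamp. The helper names and sample values are hypothetical, and the course itself may use different tools or definitions (for example, z-score normalization instead of min-max).

```python
import math
from datetime import datetime


def min_max_normalize(values):
    """Rescale values to the range [0, 1]; constant columns map to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def log_transform(values):
    """Apply log(1 + x), which is defined at zero and compresses large values."""
    return [math.log1p(v) for v in values]


def bucket(value, edges):
    """Return the index of the first edge the value falls below, else len(edges)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def time_features(timestamp):
    """Extract hour, weekday (Monday = 0), and month from an ISO 8601 string."""
    dt = datetime.fromisoformat(timestamp)
    return {"hour": dt.hour, "weekday": dt.weekday(), "month": dt.month}


incomes = [20000, 50000, 80000]           # hypothetical numeric feature
normalized = min_max_normalize(incomes)
logged = log_transform(incomes)
age_bucket = bucket(35, [18, 30, 50])     # hypothetical age bands
parts = time_features("2020-09-08T07:00:00")
print(normalized, age_bucket, parts)
```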

  • Module 3 - Machine Learning Deep Dive

    Contains 8 Component(s) Includes Multiple Live Events. The next is on 09/22/2020 at 7:00 AM (PDT)

    Machines learn in different ways, and there are several strategies to apply to a use case: supervised, unsupervised, semi-supervised, and reinforcement learning. Training a machine with basic rules, letting it discover patterns independently, or mixing the two covers many of the problems an organization can address. This module will be a mixture of deep dives into machine learning types and their limitations, and hands-on development of machine learning solutions.

    Learning Outcomes
    • Select the appropriate machine learning task for a real-world application
    • Use a dataset to fit a new model  
    • Build a machine learning model based on the business application
    • Use cross-validation for a supervised learning model
    • Assess model performance using error metrics appropriate to each ML task.
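
    To make the cross-validation and error-metric outcomes concrete, here is a minimal sketch of k-fold cross-validation scored with RMSE. It assumes a deliberately trivial "model" (predict the training mean) so the mechanics stay visible; the function names and target values are hypothetical, and a real workflow would normally shuffle the rows first and use a library implementation.

```python
import math


def k_fold_indices(n_rows, k):
    """Yield (train_idx, valid_idx) pairs for k contiguous, non-overlapping folds."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n_rows) if i < start or i >= start + size]
        yield train, valid
        start += size


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def cross_validate_mean_baseline(y, k=5):
    """Score a predict-the-training-mean baseline on each held-out fold."""
    scores = []
    for train_idx, valid_idx in k_fold_indices(len(y), k):
        mean_prediction = sum(y[i] for i in train_idx) / len(train_idx)
        actual = [y[i] for i in valid_idx]
        scores.append(rmse(actual, [mean_prediction] * len(actual)))
    return scores


y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 7.0, 9.0]  # hypothetical target values
scores = cross_validate_mean_baseline(y, k=5)
print(sum(scores) / len(scores))
```

    Each row is held out exactly once, so the averaged score estimates error on unseen data rather than on the rows the model was fit to.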

  • Module 4 - Hands-On Deep Learning

    Contains 4 Component(s) Includes Multiple Live Events. The next is on 10/06/2020 at 7:00 AM (PDT)

    Deep learning (DL) is a driving force for many of the artificial intelligence applications changing the world today in areas like image recognition and self-driving cars. In this hands-on module you will apply neural networks to practical examples.

    Learning Outcomes
    • Select the appropriate deep learning task for a real-world application
    • Use a dataset to fit a new model  
    • Build a deep learning model based on business application
    • Assess model performance using error metrics appropriate to the DL task.

  • Module 5 - Machine Learning for Big Data

    Contains 4 Component(s) Includes Multiple Live Events. The next is on 10/13/2020 at 7:00 AM (PDT)

    It’s no secret that the power of artificial intelligence starts with data. The amount of information created grows each day, and that growth brings challenges when applying AI at scale. In this module, we will use big data technologies like H2O-3, Spark, and Sparkling Water to build simple machine learning models.

    Learning Outcomes
    • Initialize a Sparkling Water cluster
    • Build a classification model using a Spark integration 
    • Evaluate a Sparkling Water model’s metrics

  • Module 6 - Responsible AI with H2O-3 and Driverless AI

    Contains 4 Component(s) Includes Multiple Live Events. The next is on 10/20/2020 at 7:00 AM (PDT)

    In this hands-on session, we will lead you through practical applications of Machine Learning Interpretability methods to explain a model’s predictions. We will discuss methods such as building surrogate models and using interpretability techniques like K-LIME and variable and feature importance for a machine learning model. We will also demonstrate the use of explainability techniques like partial dependence plots and Shapley values to provide exact contributions of a feature to a prediction. Additionally, we will examine fairness in a model through disparate impact analysis and use sensitivity analysis to debug our model and probe it for security and fairness.

    Learning Outcomes
    • Build an explainable surrogate model 
    • Apply & interpret the K-LIME method for a ML model 
    • Apply & interpret the Variable/Feature Importance for a ML model
    • Apply & interpret a Decision Tree Surrogate Model for a ML model
    • Apply & interpret the Partial Dependence & ICE Plots for a ML model
    • Generate Shapley Values for a ML model
    • Examine a model for bias using Disparate Impact Analysis
    • Run Sensitivity/What-if Analysis for a ML model
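
    One common quantity in disparate impact analysis is the adverse impact ratio: the favorable-outcome rate for a protected group divided by the rate for a reference group, often compared against the "four-fifths" (0.8) rule of thumb. The sketch below is a plain-Python illustration under those assumptions, not the Driverless AI implementation; the function names, group labels, and predictions are hypothetical.

```python
def favorable_rate(predictions, groups, group):
    """Share of favorable (1) predictions among rows belonging to `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def adverse_impact_ratio(predictions, groups, protected, reference):
    """Favorable rate of the protected group relative to the reference group."""
    protected_rate = favorable_rate(predictions, groups, protected)
    reference_rate = favorable_rate(predictions, groups, reference)
    return protected_rate / reference_rate


# Hypothetical binary model decisions (1 = favorable) and group membership.
predictions = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

ratio = adverse_impact_ratio(predictions, groups, protected="b", reference="a")
flagged = ratio < 0.8  # four-fifths rule of thumb, used here as an assumption
print(round(ratio, 3), flagged)
```

    A ratio below the chosen threshold is a prompt for closer investigation rather than proof of bias on its own.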

  • Module 7 - ML Ops with H2O.ai

    Contains 4 Component(s) Includes Multiple Live Events. The next is on 10/27/2020 at 7:00 AM (PDT)

    Model operations is a critical component in creating value from machine learning models. After all, models are just experiments until they are deployed and used in business applications, to generate insights, or to make decisions. In this course, you will see how to use Model Ops to deploy, manage, and govern models in production environments.

    Learning Outcomes
    • Describe how operators on the IT team can work together with data scientists to deploy models in production
    • Determine how to monitor models running in production and what to look for
    • Identify when to retrain models and how to put new versions of models into production without interrupting downstream services

  • Module 8 - Machine Learning Application Integration

    Contains 3 Component(s) Includes Multiple Live Events. The next is on 11/03/2020 at 7:00 AM (PST)

    Defining your problem, collecting your data, choosing your ML algorithm, training your model, and finally deploying your model are some of the major steps in your ML solution. However, the AI transformation is not going to occur until your applications and business processes are consuming the results of the ML model, in a context where the business can take action.

    Learning Outcomes

    • How can citizen data scientists train ML models quickly with AutoML?
    • What are the various options to consume the ML models in applications?
    • How can you quickly build a decision-making application that leverages ML models?

  • Module 9 - Machine Learning Foundations Capstone

    Contains 3 Component(s) Includes Multiple Live Events. The next is on 11/10/2020 at 7:00 AM (PST)

    It’s time to put everything you’ve learned together to build an end-to-end machine learning solution.

    Learning Outcomes
    • Successfully build an end-to-end machine learning solution

Are you interested in the ML Foundations course? If so, we will email you once this course becomes available.