Module 1 - Data Wrangling & Data Curation

Data is the foundation of ML. Getting the right information for the specific problem you are trying to solve can make or break your machine learning success. Learn what it takes to build an accurate data set geared toward the business problem you are trying to solve. In this session we will use tools like Python, R and H2O-3.

Learning Outcomes
  • Identify and categorize data by type
  • Perform summary statistics on data/features
  • Create appropriate data visualizations on each feature
  • Construct a model table with inputs and target/label
  • Apply a sampling strategy
  • Split data for modeling
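
As a rough illustration of several of these outcomes (typing features, summary statistics, a model table, and a stratified train/test split), here is a minimal sketch using only the Python standard library. It is not taken from the session materials: the toy records, the column names, and the helper functions `column_kind`, `summarize`, and `stratified_split` are all hypothetical. Visualization is left out, and in practice a library such as pandas or H2O-3 would likely do this work.

```python
import random
import statistics
from collections import Counter, defaultdict

# Hypothetical toy records; column names and values are made up for illustration.
rows = [
    {"age": 34, "income": 52000.0, "region": "north", "churned": 0},
    {"age": 45, "income": 61000.0, "region": "south", "churned": 1},
    {"age": 23, "income": 38000.0, "region": "north", "churned": 0},
    {"age": 51, "income": 75000.0, "region": "east", "churned": 1},
    {"age": 38, "income": 58000.0, "region": "south", "churned": 0},
    {"age": 29, "income": 41000.0, "region": "east", "churned": 1},
    {"age": 60, "income": 83000.0, "region": "north", "churned": 0},
    {"age": 42, "income": 67000.0, "region": "south", "churned": 1},
]
TARGET = "churned"  # assumed label column


def column_kind(values):
    """Label a column 'numeric' or 'categorical' based on its Python types."""
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "numeric"
    return "categorical"


def summarize(records):
    """Return per-column summary statistics keyed by column name."""
    summary = {}
    for name in records[0]:
        values = [r[name] for r in records]
        if column_kind(values) == "numeric":
            summary[name] = {
                "kind": "numeric",
                "min": min(values),
                "max": max(values),
                "mean": statistics.mean(values),
                "stdev": statistics.stdev(values),
            }
        else:
            summary[name] = {"kind": "categorical", "counts": dict(Counter(values))}
    return summary


def stratified_split(records, target, test_fraction=0.25, seed=42):
    """Split records so each target class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for r in records:
        by_class[r[target]].append(r)
    train, test = [], []
    for group in by_class.values():
        group = group[:]  # copy so the caller's data is not reordered
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


summary = summarize(rows)
train, test = stratified_split(rows, TARGET)

# Model table: inputs (X) and target/label (y) for the training portion.
X_train = [{k: v for k, v in r.items() if k != TARGET} for r in train]
y_train = [r[TARGET] for r in train]

for name, stats in summary.items():
    print(name, stats)
print("train rows:", len(train), "test rows:", len(test))
```

Note that the 0/1 label is typed as numeric by this simple check even though it is conceptually categorical, which is the kind of distinction the first learning outcome asks you to make by hand.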


Session 1 (Slides & Replay): Getting the Right Data Set For Modeling: Data Exploration & Munging
Click on View to access the replay and the slides
Slides and Replay of our second ML Foundations Course session Module 1.
Module 1 Data Wrangling Session 1 Quiz
10 Questions  |  2 attempts  |  8/10 points to pass
To be successful with this quiz, please review and follow the directions in the "Getting the Right Dataset for Modeling" document on the Handouts tab for this Module.
  • CD

    The Quiz for Module 1 has been posted!  Please be sure to review the Getting the Right Dataset For Modeling document on the Handouts tab for exercises specific to this quiz.

  • SC

    The video link is not related to Getting the Right Data Set For Modeling: Data Exploration & Munging. I think the link is wrong.

  • CD

    Thank you, Serkan, for alerting us that the replay link is incorrect. We are working to get the correct link posted.

  • RS

    You are right: it seems to come from module 3, session 2 of AI Foundation or something similar... Wrong link!

  • Himon Garg

    Where can I find Session 1: Getting the Right Data Set For Modeling: Data Exploration & Munging? Please help!

  • RS

    Slides are here:

    Video of the lesson is here:

  • RC

    Post Module 1 lab questions here.

  • RS

    A question already posed after the previous course (AI Foundation) but never answered: how much Python is needed? Is basic to intermediate sufficient? Do we need to provide test/dev platforms of any kind on our own PC, or will we rely on H2O resources through the browser?

  • RM

    Only the basics are needed.