Enrolment options
Machine learning (ML) helps us find patterns in data, which we then use to make predictions about new data points. To get those predictions right, we must construct the data set and transform the data correctly. This course covers these two key steps and shows how training/serving considerations play into them. A basic understanding of machine learning is recommended before starting this course.
Who is this course for? This self-paced course is aimed at learners who have a basic understanding of ML and ML problem framing and who would like to go a step further and learn how to construct a data set and transform data correctly.
What will you learn? By the end of this course you will be able to: recognise the relative impact of data quality and size on algorithms; set informed and realistic expectations for the time needed to transform the data; explain a typical process for data collection and transformation within the overall ML workflow; collect raw data and construct a data set; sample and split the data set, with considerations for imbalanced data; and transform numerical and categorical data.
Topics overview: Collecting data, sampling and splitting, transforming categorical and numerical data
How much time you need to invest: This course should take approximately three hours to complete
Prerequisites: Learners should have a basic understanding of ML and should have completed the Introduction to ML Problem Framing course
Course certificate: A course certificate will be generated on successful completion of the final quiz
Course developers: This course was developed by Google Developers. Other vendors offer similar courses, but the Google Developers course was selected for the purposes of this exercise.
Course contact: For technical issues or questions, please contact fairforward@giz.de
Course License: CC BY