Soledad Galli, PhD

Instructor



Sole is a lead data scientist, instructor, and developer of open-source software. She created and maintains Feature-engine, a Python library for feature engineering that lets you impute missing data, encode categorical variables, and transform, create, and select features. Sole is also the author of the "Python Feature Engineering Cookbook", published by Packt.

Course description


Welcome to the most comprehensive course on feature engineering available online. In this course, you will learn about variable imputation, variable encoding, feature transformation, discretization, and how to create new features from your data.


Specifically, you will learn:

  • How to impute missing data
  • How to encode categorical variables
  • How to transform numerical variables and change their distribution
  • How to perform discretization
  • How to remove outliers
  • How to extract features from date and time
  • How to create new features from existing ones
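As a small taste of the date and time topic above, here is a minimal sketch that uses only the Python standard library. The timestamps and the `datetime_features` helper are hypothetical and made up for illustration. They are not taken from the course material, which may well use other libraries for this step.

```python
from datetime import datetime

# Hypothetical timestamps in ISO 8601 format.
timestamps = ["2023-01-15 08:30:00", "2023-07-04 22:05:00"]


def datetime_features(text):
    """Derive simple calendar and clock features from one timestamp string."""
    moment = datetime.fromisoformat(text)
    return {
        "year": moment.year,
        "month": moment.month,
        "day_of_week": moment.weekday(),  # Monday is 0, Sunday is 6
        "hour": moment.hour,
        "is_weekend": int(moment.weekday() >= 5),
    }


features = [datetime_features(t) for t in timestamps]
```

Each timestamp becomes a handful of numeric columns that a model can use directly, which is the general idea behind extracting features from date and time.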


While most online courses teach only the very basics of feature engineering, like imputing variables with the mean or transforming categorical variables with one-hot encoding, this course covers those basics and much, much more.


In this course, you will first learn the most popular and widely used feature engineering techniques, like mean and median imputation, one-hot encoding, logarithmic transformation, and discretization. Then, you will discover more advanced methods that capture information while encoding or transforming your variables to improve the performance of machine learning models.
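To make the basic techniques named above concrete, here is a library-free sketch of what each one computes. The toy data and all function names are hypothetical, chosen only for illustration. In practice you would more likely reach for pandas, scikit-learn, or Feature-engine than hand-written helpers like these.

```python
import math
from statistics import median

# Hypothetical toy data; None marks a missing value.
ages = [22, None, 35, 58, None, 41]
cities = ["Paris", "Lima", "Paris", "Oslo", "Lima", "Paris"]
incomes = [1000.0, 10000.0, 100.0]


def impute_median(values):
    """Replace missing entries with the median of the observed ones."""
    fill = median(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Map each category to a 0/1 indicator column, in sorted category order."""
    categories = sorted(set(values))
    return categories, [[int(v == c) for c in categories] for v in values]


def log_transform(values):
    """Apply the natural logarithm; only defined for strictly positive values."""
    return [math.log(v) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    # The maximum falls on the upper edge, so clamp it into the last bin.
    return [min(int((v - low) / width), n_bins - 1) for v in values]


ages_imputed = impute_median(ages)
categories, city_dummies = one_hot(cities)
log_incomes = log_transform(incomes)
age_bins = equal_width_bins(ages_imputed, n_bins=3)
```

With this toy data, the two missing ages are filled with the median of the observed ages, each city becomes a row of three indicators, and the imputed ages fall into three equal-width bins. This sketch computes everything on a single dataset for brevity. In a real project, values such as the median or the bin edges would normally be learned from the training set and then reused on new data.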


The methods that you will learn have been described in scientific articles, are used in data science competitions, and are commonly applied in organizations. What’s more, they can be easily implemented with Python's open-source libraries!


Throughout the lectures, you’ll find detailed explanations of each technique and a discussion of its advantages, limitations, and underlying assumptions, followed by the best programming practices for implementing it in Python.


By the end of the course, you will be able to decide which feature engineering technique you need based on the variable characteristics and the models you wish to train. And you will also be well placed to test various transformation methods and let your models decide which ones work best.


This comprehensive feature engineering course contains over 100 lectures spread across approximately 10 hours of video, and ALL topics include hands-on Python code examples that you can use for reference, practice, and reuse in your own projects.

Course Curriculum


  Introduction
  Variable types
  Variable characteristics
  Missing data imputation
  Multivariate imputation
  Categorical variable encoding
  Variable transformation
  Discretization
  Outliers
  Feature scaling
  Engineering mixed variables
  Datetime variables
  Assembling feature engineering pipelines
  Final section | Next steps