Models in Pipelines

Pipelines are useful for model building

Data transformation is a key step in any ML problem. Done by hand, it can quickly become cumbersome and error-prone. Scikit-learn provides the Pipeline object to solve this problem.

Let’s say in a given problem, we would like to do the following:

  • Impute missing values using the mean
  • Expand the features with quadratic (degree-2) polynomial terms
  • Fit a linear regression


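from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# SimpleImputer replaces the Imputer class, which was removed in scikit-learn 0.22
model = make_pipeline(SimpleImputer(strategy='mean'),
                      PolynomialFeatures(degree=2),
                      LinearRegression())

The pipeline object behaves like any other scikit-learn estimator and supports the usual fit and predict steps.

model.fit(X, y)
pred = model.predict(X)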
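
To try the three steps end to end, here is a minimal sketch on synthetic data. The data, the 10% missing rate, and the variable names are assumptions made for illustration; they are not part of the original problem. The sketch also repeats the same steps by hand, which shows the bookkeeping the pipeline takes over.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration): y is quadratic in x, plus noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)

# Blank out roughly 10% of the feature values so the imputer has work to do.
X_missing = X.copy()
X_missing[rng.random(200) < 0.1, 0] = np.nan

# Pipeline version: one object, one fit call.
model = make_pipeline(SimpleImputer(strategy='mean'),
                      PolynomialFeatures(degree=2),
                      LinearRegression())
model.fit(X_missing, y)
pred = model.predict(X_missing)

# The same three steps done by hand, for comparison.
imputer = SimpleImputer(strategy='mean')
poly = PolynomialFeatures(degree=2)
reg = LinearRegression()
X_imp = imputer.fit_transform(X_missing)
X_poly = poly.fit_transform(X_imp)
reg.fit(X_poly, y)
pred_manual = reg.predict(X_poly)

# Both routes should give the same predictions.
print(np.allclose(pred, pred_manual))
```

With the manual version, every intermediate array has to be passed along correctly, and the same transforms have to be re-applied in the same order at prediction time. The pipeline does that chaining internally.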

We could apply grid search (GridSearchCV) for hyperparameter selection on this pipeline object to get the best-performing model.
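
A grid search over the pipeline might look like the sketch below. The synthetic data, the parameter grid, the 5-fold setting, and the scoring choice are all assumptions for illustration. The one detail taken from scikit-learn itself is the naming rule: make_pipeline names each step after its lowercased class name, and step parameters are addressed as step name, two underscores, parameter name.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic data (assumed for illustration), with about 10% missing values.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 0] ** 2 + rng.normal(scale=0.1, size=200)
X[rng.random(200) < 0.1, 0] = np.nan

pipe = make_pipeline(SimpleImputer(), PolynomialFeatures(), LinearRegression())

# Hypothetical grid: which imputation strategy and polynomial degree to try.
param_grid = {
    'simpleimputer__strategy': ['mean', 'median'],
    'polynomialfeatures__degree': [1, 2, 3],
}

# Each candidate is refit on every training fold, so the imputer and the
# polynomial expansion never see the held-out fold during fitting.
search = GridSearchCV(pipe, param_grid, cv=5, scoring='neg_mean_squared_error')
search.fit(X, y)

print(search.best_params_)
best_model = search.best_estimator_  # pipeline refit on all data with the best settings
```

Searching over the whole pipeline, rather than preprocessing once and tuning only the regressor, keeps the preprocessing inside the cross-validation loop and avoids leaking information from the validation folds.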