Explain how Gradient Boosting works. What are some of its pros and cons?

☞ Gradient Boosting is an ensemble machine learning technique used for both regression and classification problems. It builds the model in a stage-wise fashion, like other boosting methods, but it generalizes them by allowing optimization of an arbitrary differentiable loss function.

  • Each new tree is trained to predict the residuals (more generally, the negative gradient of the loss) of the current ensemble’s predictions
  • learning rate (shrinkage): scales the contribution of each new tree (i.e., the values in its leaves), which slows learning and helps prevent overfitting — see the sketch below
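
The loop below is a minimal from-scratch sketch of these two ideas for squared-error regression, assuming scikit-learn’s DecisionTreeRegressor is available; the function names and defaults (`fit_gbm`, `n_trees=100`, `learning_rate=0.1`) are illustrative, not a standard API.

```python
# Minimal gradient-boosting sketch for regression with squared-error loss.
# Assumes scikit-learn; names and defaults are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbm(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    y = np.asarray(y, dtype=float)
    f0 = y.mean()                                  # initial prediction: the mean of the targets
    pred = np.full(y.shape, f0)
    trees = []
    for _ in range(n_trees):
        residuals = y - pred                       # negative gradient of the squared-error loss
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)                     # each new tree predicts the residuals
        pred += learning_rate * tree.predict(X)    # learning rate scales (shrinks) the new tree
        trees.append(tree)
    return f0, trees, learning_rate

def predict_gbm(model, X):
    f0, trees, lr = model
    return f0 + sum(lr * tree.predict(X) for tree in trees)
```

A smaller learning rate generally needs more trees; the two are tuned together.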

Ensemble

  • Combines the outputs of multiple models into a single prediction (e.g., by weighted averaging or majority vote) — a small sketch follows
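
As a side note independent of boosting itself, combining outputs can be as simple as a (weighted) average or a majority vote; the numbers below are made up for illustration.

```python
# Sketch: combining several models' outputs into one decision.
import numpy as np
from collections import Counter

# Regression-style ensemble: (weighted) average of the predictions.
predictions = np.array([2.9, 3.1, 3.4])               # outputs of three hypothetical models
weights = np.array([0.5, 0.3, 0.2])
combined = np.average(predictions, weights=weights)    # weighted mean -> 3.06

# Classification-style ensemble: majority vote over predicted labels.
votes = ["cat", "dog", "cat"]
majority = Counter(votes).most_common(1)[0][0]         # -> "cat"
```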

Regression/Classification

  • Regression: predicting a continuous value (mean)
  • Classification: predicting a discrete category (mode)

Boosting

  • A strategy where each new model (in an ensemble) tries to improve on the errors of the previous ones
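
In symbols, the stage-wise update that implements this is F_m(x) = F_{m-1}(x) + ν · h_m(x), where h_m is the new weak learner fit to the errors of the current ensemble F_{m-1} (for gradient boosting, to the negative gradient of the loss) and ν is the learning rate.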

Arbitrary differentiable loss function

  • The loss function can have a flexible definition as long as it is differentiable
  • Gradient Boosting can be customized to optimize (minimize) different types of errors depending on what’s important for the specific problem you’re trying to solve.
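
A small sketch of what “fit to the errors” means once the loss is allowed to vary: the new tree is fit to the negative gradient of the chosen loss at the current predictions. Squared error gives back the ordinary residual; absolute error gives only its sign. The helper name `negative_gradient` is made up for illustration.

```python
# Sketch: the quantity each new tree is fit to is the negative gradient of the
# chosen loss, evaluated at the current predictions.
import numpy as np

def negative_gradient(y, pred, loss="squared_error"):
    if loss == "squared_error":      # L = 0.5 * (y - pred)^2
        return y - pred              # ordinary residual
    if loss == "absolute_error":     # L = |y - pred|
        return np.sign(y - pred)     # only the sign of the residual matters
    raise ValueError("any differentiable loss works; add its gradient here")
```

In recent scikit-learn versions this choice is exposed as the `loss` parameter of GradientBoostingRegressor (e.g. "squared_error", "absolute_error", "huber").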

Walk through this.



(QX) How is AdaBoost different from Gradient Boosting?

https://youtu.be/3CC4N4z3GJc?si=5yIur9zmlB74uFPe
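
The video above covers this; as a rough summary, AdaBoost re-weights the training samples after each (usually stump-sized) learner and weights each learner’s vote, while Gradient Boosting fits each new tree to the negative gradient of a differentiable loss and scales it by a learning rate. A quick side-by-side sketch on toy data (dataset and hyperparameters are arbitrary):

```python
# Sketch: fitting AdaBoost and Gradient Boosting on the same toy dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# AdaBoost: re-weights misclassified samples; each learner gets its own vote weight.
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Gradient Boosting: fits each tree to the negative gradient, scaled by a learning rate.
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 random_state=0).fit(X_train, y_train)

print("AdaBoost accuracy:", ada.score(X_test, y_test))
print("Gradient Boosting accuracy:", gbm.score(X_test, y_test))
```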