In SKLearn PLSRegression, several attributes can be accessed after a model is fitted:
- weights
- loadings
- scores

All of the above come in separate X and Y versions (e.g. `x_weights_` and `y_weights_`).
I intuitively understand why x_scores and y_scores should have a linear relationship: the covariance between them is what the algorithm is trying to maximize.
However, despite reading multiple resources, I find that some articles use the terms loadings and weights interchangeably, even though I know they are different. I think the loadings are the "direction" vectors that describe where each component is "pointing". But what are the weights?
TL;DR: What’s the difference between weights and loadings in SKLearn PLSRegression?
I read up on this a bit more for a project I'm working on, and I have some links to share that may be helpful. The "weights" in a PLS model translate E_a (the deflated X matrix at step a) to a column t_a of the scores matrix T. Deflation occurs after each step of the algorithm: the variance accounted for by the new component is subtracted from the X matrix. Loadings, on the other hand, translate T back to X.
This is a fantastic reference and goes into much more detail:
I also read through the plsr vignette several times. It's in R, but the concepts should translate: https://cran.r-project.org/web/packages/pls/vignettes/pls-manual.pdf
According to this resource, the weights are required to “maintain orthogonal scores.” There are some nice visualizations starting on slide 35.