What is the “partial” in partial least squares methods?

In partial least squares regression (PLSR) or partial least squares structural equation modelling (PLS-SEM), what does the term “partial” refer to?

Answer

I would like to answer this question largely from a historical perspective, which is quite interesting. Herman Wold, who invented the partial least squares (PLS) approach, did not start using the term PLS (or even mentioning the term “partial”) right away. During the initial period (1966-1969), he referred to the approach as NILES, an abbreviation taken from the title of his initial paper on the topic, Nonlinear Estimation by Iterative Least Squares Procedures, published in 1966.

As we can see, the procedures that would later be called partial were at first referred to as iterative, a name that emphasizes the iterative way in which weights and latent variables (LVs) are estimated. The “least squares” term comes from using ordinary least squares (OLS) regression to estimate other unknown parameters of a model (Wold, 1980). It seems that the term “partial” has its roots in the NILES procedures, which implemented “the idea of split the parameters of a model into subsets so they can be estimated in parts” (Sanchez, 2013, p. 216; emphasis mine).

The term PLS was first used in the paper Nonlinear iterative partial least squares (NIPALS) estimation procedures, whose publication marks the next period of PLS history: the NIPALS modeling period. The 1970s and 1980s became the soft modeling period, when, influenced by Karl Jöreskog’s LISREL approach to SEM, Wold transformed the NIPALS approach into soft modeling, which essentially formed the core of the modern PLS approach (the term PLS became mainstream at the end of the 1970s). The 1990s, the next period in PLS history, which Sanchez (2013) calls the “gap” period, were marked largely by a decline in its use. Fortunately, starting in the 2000s (the consolidation period), PLS returned as a very popular approach to SEM analysis, especially in the social sciences.

UPDATE (in response to amoeba’s comment):

  • Perhaps Sanchez’s wording is not ideal in the phrase that I’ve cited. I think that “estimated in parts” applies to latent blocks of variables. Wold (1980) describes the concept in detail.
  • You’re right that NIPALS was originally developed for PCA. The confusion stems from the fact that both linear and nonlinear PLS approaches exist. I think that Rosipal (2011) explains the differences very well (at least, it is the best explanation that I’ve seen so far).

UPDATE 2 (further clarification):

In response to the concerns expressed in amoeba’s answer, I’d like to clarify some things. It seems to me that we need to distinguish between the use of the word “partial” in NIPALS and in PLS. That creates two separate questions: 1) the meaning of “partial” in NIPALS and 2) the meaning of “partial” in PLS (that’s the original question by Phil2014). While I’m not sure about the former, I can offer further clarification about the latter.

According to Wold, Sjöström and Eriksson (2001),

The “partial” in PLS indicates that this is a partial regression, since …

In other words, “partial” stems from the fact that the data decomposition performed by the NIPALS algorithm for PLS may not include all components. I suspect that the same reason applies to NIPALS in general, if it’s possible to use the algorithm on “partial” data. That would explain the “P” in NIPALS.

As for the word “nonlinear” in the NIPALS name (not to be confused with nonlinear PLS, which is a nonlinear variant of the PLS approach!), I think that it refers not to the algorithm itself but to the nonlinear models that can be analyzed using the linear regression-based NIPALS.

UPDATE 3 (Herman Wold’s explanation):

While Herman Wold’s 1969 paper seems to be the earliest paper on NIPALS, I have managed to find another of the earliest papers on this topic: Wold (1974), where the “father” of PLS presents his rationale for using the word “partial” in the NIPALS name (p. 71):

3.1.4. NIPALS estimation: Iterative OLS. If one or more variables of the model are latent, the predictor relations involve not only unknown parameters, but also unknown variables, with the result that the estimation problem becomes nonlinear. As indicated in 3.1 (iii), NIPALS solves this problem by an iterative procedure, say with steps s = 1, 2, … Each step s involves a finite number of OLS regressions, one for each predictor relation of the model. Each such regression gives proxy estimates for a sub-set of the unknown parameters and latent variables (hence the name partial least squares), and these proxy estimates are used in the next step of the procedure to calculate new proxy estimates.
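
To make the idea in Wold’s passage more concrete, here is a minimal Python sketch of the alternating scheme for the simplest case, a single principal component (the PCA setting that amoeba’s comment refers to). It is my own illustration under stated assumptions, not Wold’s path-model algorithm: the function name, the starting vector, the tolerance and the toy data are all hypothetical choices. Each pass runs two no-intercept OLS fits, and each fit updates only one subset of the unknowns (loadings or scores) while the other is held fixed.

```python
import numpy as np

def nipals_first_component(X, tol=1e-10, max_iter=500):
    """Sketch: first principal component of X via alternating OLS fits.

    Hypothetical helper written for illustration only.
    """
    X = X - X.mean(axis=0)  # centre the columns
    # Assumed starting point: the column with the largest variance.
    t = X[:, np.argmax(X.var(axis=0))].copy()
    for _ in range(max_iter):
        # Partial step 1: scores t held fixed; regress every column of X
        # on t (no intercept) to get proxy loadings p, then normalise.
        p = X.T @ t / (t @ t)
        p /= np.linalg.norm(p)
        # Partial step 2: loadings p held fixed; regress every row of X
        # on p (no intercept) to get new proxy scores.
        t_new = X @ p / (p @ p)
        converged = np.linalg.norm(t_new - t) < tol * np.linalg.norm(t_new)
        t = t_new
        if converged:
            break
    return t, p

# Toy data (assumed): four columns with clearly different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
t, p = nipals_first_component(X)
```

Neither regression alone solves the problem, because scores and loadings are both unknown; the proxy estimates from one fit feed the next, which is the sense of “partial” in the quoted passage.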

References

Rosipal, R. (2011). Nonlinear partial least squares: An overview. In Lodhi H. and Yamanishi Y. (Eds.), Chemoinformatics and Advanced Machine Learning Perspectives: Complex Computational Methods and Collaborative Techniques, pp. 169-189. ACCM, IGI Global. Retrieved from http://aiolos.um.savba.sk/~roman/Papers/npls_book11.pdf

Sanchez, G. (2013). PLS path modeling with R. Berkeley, CA: Trowchez Editions. Retrieved from http://gastonsanchez.com/PLS_Path_Modeling_with_R.pdf

Wold, H. (1974). Causal flows with latent variables: Partings of the ways in the light of NIPALS modelling. European Economic Review, 5, 67-86. North Holland Publishing.

Wold, H. (1980). Model construction and evaluation when theoretical knowledge is scarce: Theory and applications of partial least squares. In J. Kmenta and J. B. Ramsey (Eds.), Evaluation of econometric models, pp. 47-74. New York: Academic Press. Retrieved from http://www.nber.org/chapters/c11693

Wold, S., Sjöström, M., & Eriksson, L. (2001). PLS-regression: A basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58, 109-130. doi:10.1016/S0169-7439(01)00155-1 Retrieved from http://www.libpls.net/publication/PLS_basic_2001.pdf

Attribution
Source: Link, Question Author: Alph, Answer Author: Aleksandr Blekh
