Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR data
Corresponding Author
Matthieu Lesnoff
CIRAD, UMR SELMET, Montpellier, France
SELMET, Univ Montpellier, CIRAD, INRA, Montpellier SupAgro, Montpellier, France
ChemHouse Research Group, Montpellier, France
Correspondence
Matthieu Lesnoff, Selmet Joint Research Unit (Tropical and Mediterranean Animal Production Systems), CIRAD, TA C-112/A—Campus international de Baillarguet—34398 Montpellier Cedex 5, France.
Email: [email protected]
Maxime Metz
ITAP, Montpellier SupAgro, Irstea, Univ Montpellier, Montpellier, France
ChemHouse Research Group, Montpellier, France
Jean-Michel Roger
ITAP, Montpellier SupAgro, Irstea, Univ Montpellier, Montpellier, France
ChemHouse Research Group, Montpellier, France
Abstract
In multivariate calibration, locally weighted partial least squares regression (LWPLSR) is an efficient prediction method when heterogeneity in the data generates nonlinear relations (curvatures and clustering) between the response and the explanatory variables. This is frequent in agronomic data sets that gather materials of different natures or origins. LWPLSR is a particular case of weighted PLSR (WPLSR; ie, a statistical weight different from the standard 1/n is given to each of the n calibration observations for calculating the PLS scores/loadings and the predictions). In LWPLSR, the weights depend on the dissimilarity (which has to be defined and calculated) between each calibration observation and the new observation to predict. This article compares two strategies of LWPLSR: (a) “LW”: the usual strategy where, for each new observation to predict, a WPLSR is applied to the n calibration observations (ie, the entire calibration set) vs (b) “KNN-LW”: the k nearest neighbors of the observation to predict are first selected in the training set, and WPLSR is applied only to this selected KNN set. On three illustrative agronomic data sets (quantitative and discrimination predictions), both strategies outperformed the standard PLSR. LW and KNN-LW had close prediction performances, but KNN-LW was much faster in computation time. The KNN-LW strategy is therefore recommended for large data sets. The article also presents a new algorithm for WPLSR, based on the “improved kernel #1” algorithm, which is competitive with, and in general faster than, the already published weighted PLS nonlinear iterative partial least squares (NIPALS) algorithm.
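The KNN-LW strategy described above can be illustrated with a minimal Python sketch. It is not the article's implementation. Several choices are assumptions made only for illustration: the dissimilarity is the Euclidean distance, the weights follow a simple exponential decay with a hypothetical bandwidth `h`, and the weighted PLS1 step is a plain NIPALS-style loop on rows scaled by the square root of the weights rather than the kernel-based algorithm the article proposes. The function names `wpls1_fit` and `knn_lw_predict` are hypothetical.

```python
import numpy as np

def wpls1_fit(X, y, w, ncomp):
    """Weighted PLS1 sketch: weighted centering, then PLS1 on sqrt(w)-scaled rows."""
    w = w / w.sum()                      # normalize weights to sum to 1
    xmean, ymean = w @ X, w @ y          # weighted means
    s = np.sqrt(w)
    Xs = (X - xmean) * s[:, None]        # row scaling carries the weights
    ys = (y - ymean) * s
    W, P, c = [], [], []
    for _ in range(ncomp):
        wa = Xs.T @ ys                   # direction of maximal weighted covariance
        norm = np.linalg.norm(wa)
        if norm < 1e-12:                 # nothing left to explain
            break
        wa = wa / norm
        t = Xs @ wa                      # scores
        tt = t @ t
        pa = Xs.T @ t / tt               # X loadings
        ca = (ys @ t) / tt               # y loading
        Xs = Xs - np.outer(t, pa)        # deflate X
        ys = ys - ca * t                 # deflate y
        W.append(wa); P.append(pa); c.append(ca)
    if not W:
        return xmean, ymean, np.zeros(X.shape[1])
    W, P, c = np.array(W).T, np.array(P).T, np.array(c)
    beta = W @ np.linalg.solve(P.T @ W, c)   # regression coefficients
    return xmean, ymean, beta

def knn_lw_predict(Xtrain, ytrain, xnew, k, ncomp, h=2.0):
    """Predict one observation: select k neighbors, weight them, fit a weighted PLS1."""
    d = np.linalg.norm(Xtrain - xnew, axis=1)        # assumed dissimilarity
    idx = np.argsort(d)[:k]                          # k nearest neighbors
    dk = d[idx]
    wk = np.exp(-dk / (h * (np.median(dk) + 1e-12)))  # assumed weight function
    xmean, ymean, beta = wpls1_fit(Xtrain[idx], ytrain[idx], wk, ncomp)
    return float(ymean + (xnew - xmean) @ beta)
```

Under the same assumptions, calling `knn_lw_predict` with `k` equal to the number of calibration observations corresponds to the "LW" strategy, since the weighted model is then fitted on the entire calibration set. The computational saving of KNN-LW comes from fitting each local model on k rows instead of n.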




