Similar literature
 Found 20 similar articles; search took 46 ms
1.
Common methods of building linear calibration models are principal component regression (PCR), partial least squares (PLS), and least squares (LS). Recently, the method of cyclic subspace regression (CSR) has been presented and shown to provide PCR, PLS, LS and other related intermediate regressions with one algorithm. When forming a linear model with spectral data for quantitative analysis, prediction results can be adversely affected by responses that do not conform well to the linear model proposed. Wavelength selection can be used to eliminate wavelengths where such problem responses occur. It has recently been reported that CSR regression vectors can be formed by summing weighted eigenvectors where weights are determined from the hat matrix, singular values, and eigenvectors characterizing the sample space. Investigation of these weights shows that wavelength selection based on loading vectors can be misleading. Specifically, by using CSR it is shown that a small weight for an eigenvector can annihilate a large peak in a loading vector. In this study, correlograms are used with CSR regression vectors and eigenvector weights as wavelength-selection criteria. It is demonstrated that even though a model generated by LS for a wavelength subset produces substantially reduced prediction errors relative to PCR and PLS, CSR weight plots show that the LS model overfits and should not be used. Simulated situations containing spectral regions with excess noise or nonlinear responses are examined to study the effectiveness of wavelength selection based on the previously listed criteria. Near infrared spectra of gasoline samples with several known properties are also studied.
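
The statement that regression vectors can be formed by summing weighted eigenvectors can be illustrated for the two best-known special cases, PCR and LS. This is only a sketch: the symbols X (calibration spectra), y (property values), and the singular value decomposition notation are assumptions introduced here, and the CSR-specific weights involving the hat matrix are not given in the abstract and are not reproduced.

```latex
% Singular value decomposition of the calibration matrix (notation assumed)
\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\mathsf{T}}
           = \sum_{i=1}^{r} \sigma_i \,\mathbf{u}_i \mathbf{v}_i^{\mathsf{T}},
\qquad r = \operatorname{rank}(\mathbf{X})

% PCR regression vector with k components: a weighted sum of eigenvectors v_i
\hat{\mathbf{b}}_{k} = \sum_{i=1}^{k}
    \frac{\mathbf{u}_i^{\mathsf{T}}\mathbf{y}}{\sigma_i}\,\mathbf{v}_i ,
\qquad 1 \le k \le r

% k = r recovers the (minimum-norm) least squares solution
\hat{\mathbf{b}}_{\mathrm{LS}} = \hat{\mathbf{b}}_{r}
```

In this form a small weight u_i^T y / σ_i suppresses the contribution of v_i no matter how large a peak that eigenvector contains, which is consistent with the abstract's warning about selecting wavelengths from loading vectors alone.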

2.
In multivariate calibration methods like partial least squares (PLS), especially when the spectral data consist of measurements at hundreds or even thousands of analytical channels, it is widely accepted that a well-performed variable selection carried out before the multivariate regression model is built can improve the predictive ability of the model. In the present paper, the idea of variable selection is extended. Unlike in traditional variable selection methods, where the deleted variables and the variables included in the regression model are essentially weighted with discrete values 0 and 1, respectively, the strategy adopted in this paper is to weight the variables with continuous non-negative values. A recently proposed global optimization method, the particle swarm optimization (PSO) algorithm, is used to search for the weights of variables optimizing the training of a calibration set and the prediction of an independent validation set. Since variable selection is just a special case of variable weighting, the latter is expected to be more rational and flexible. Variable weighting would reduce the negative influence of wavelengths with undesirable qualities while retaining the useful information carried by them. Variable weighting would also prevent the possible spoiling of the multi-channel advantage of the model by variable selection, which would happen when the number of selected wavelengths is small. Two real data sets are investigated and the results of variable-weighted PLS and those of PLS are compared to demonstrate the advantages of the proposed method.
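
The relationship between variable selection (weights of 0 or 1) and variable weighting (continuous non-negative weights) can be sketched as a column scaling applied to the spectra before any regression model is fitted. The function name `weight_variables` and the example numbers are hypothetical; the PSO search for the weights and the PLS fit described in the abstract are not shown.

```python
def weight_variables(spectra, weights):
    """Scale each wavelength channel (column) by a non-negative weight.

    spectra: list of rows, one row of channel readings per sample.
    weights: one non-negative weight per channel (hypothetical values;
             the abstract obtains them by particle swarm optimization).
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    for row in spectra:
        if len(row) != len(weights):
            raise ValueError("each spectrum needs one weight per channel")
    return [[x * w for x, w in zip(row, weights)] for row in spectra]


spectra = [[2.0, 4.0, 6.0], [1.0, 3.0, 5.0]]

# Classical variable selection: the second channel is deleted (weight 0).
selected = weight_variables(spectra, [1.0, 0.0, 1.0])

# Variable weighting: the second channel is kept but down-weighted.
weighted = weight_variables(spectra, [1.0, 0.25, 0.5])
```

Because selection is the special case in which every weight is 0 or 1, the same preprocessing step covers both strategies.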

3.
Six popular approaches to building «NIR spectrum–property» calibration models are compared in this work on the basis of gasoline spectral data. These approaches are: multiple linear regression (MLR), principal component regression (PCR), linear partial least squares regression (PLS), polynomial partial least squares regression (Poly-PLS), spline partial least squares regression (Spline-PLS) and artificial neural networks (ANN). The best preprocessing technique is found for each method. Optimal calibration parameters (number of principal components, ANN structure, etc.) are also found. The accuracy, computational complexity and ease of application of the different methods are compared using the prediction of six important gasoline properties (density and fractional composition) as an example. Calibration errors for the different approaches are reported. An advantage of the neural network approach to the «NIR spectrum–gasoline property» problem is illustrated. An effective model for predicting gasoline properties from NIR data is built.

4.
Prediction of sample properties using spectroscopic data with multivariate calibration is often enhanced by wavelength selection. This paper reports on a built-in wavelength selection method in which the estimated regression vector contains zero to near-zero coefficients for undesirable wavelengths. The method is based on Tikhonov regularization with the model 1-norm (TR1) and is applied to simulated and near-infrared (NIR) spectral data. Models are also formed from wavelength subsets determined by the standard method of stepwise regression (SWR). Harmonious (bias/variance tradeoff) and parsimonious considerations are compared with and without wavelength selection for principal component regression (PCR), ridge regression (RR), partial least squares (PLS), and multiple linear regression (MLR). Results show that TR1 models generally contain large baseline regions of near-zero coefficients, thereby essentially achieving built-in wavelength selection. For example, wavelengths with spectral interferences and/or poor signal-to-noise ratios obtain near-zero regression coefficients. Results often improve with TR1 models, compared to full wavelength PCR, RR, and PLS models. The SWR subset results are similar to those for the TR1 models using the NIR data and worse with the simulated spectral situations. In general, wavelength selection improves prediction accuracy at the cost of a potential increase in variance, and parsimony remains nearly equivalent to that of full wavelength models. New insights gained from the reported studies provide useful guidelines on when to use full wavelengths or use wavelength selection methods. Specifically, when a small number of large wavelength effects (good sensitivity and selectivity) exist, subset selection by SWR (with caution) and TR1 do well. With a small to moderate number of large to moderate sized wavelength effects, TR1 is better. Lastly, when a large number of small effects are present, full wavelengths with the methods of PCR, RR, or PLS are best.
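
A common way to write Tikhonov regularization with a 1-norm penalty on the model is given below as a sketch. The symbols X, y, b, and λ are assumptions introduced here, and the exact scaling of the residual term used in the paper is not stated in the abstract.

```latex
% TR1-type objective (notation assumed): least squares fit plus a 1-norm
% penalty on the regression vector b, with tuning parameter lambda >= 0
\hat{\mathbf{b}} = \arg\min_{\mathbf{b}}
    \left\lVert \mathbf{X}\mathbf{b} - \mathbf{y} \right\rVert_2^2
    + \lambda \left\lVert \mathbf{b} \right\rVert_1 ,
\qquad
\left\lVert \mathbf{b} \right\rVert_1 = \sum_{j} \lvert b_j \rvert
```

The 1-norm penalty tends to drive individual coefficients to exactly zero or near zero as λ grows, which matches the built-in wavelength selection behavior reported in the abstract.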

5.
A class of multivariate calibration methods called augmented classical least squares (ACLS) has been proposed which combines an explicit linear additive model with the predictive power of inverse models, such as principal component regression (PCR) and partial least squares (PLS). Because of its use of the explicit linear additive model, ACLS provides an interesting framework to incorporate different sources of prior information, such as measured pure component spectra, in the model. In this study, the predictive power of ACLS models incorporating different amounts of prior information has been compared to that of PCR and PLS using two examples, a designed experiment and one with biological samples. In both cases, the ACLS models showed predictive power comparable to PLS under idealized validation conditions. When a different interferent structure was present in the validation samples, the predictive power of the inverse models (PCR and PLS) dramatically decreased, with an increase in root-mean-squared error of prediction by a factor of 3.5 for the first example and a factor of 2 in the second example. The incorporation of prior information in the ACLS framework was found to considerably reduce or even completely remove these dramatic effects, especially when the pure component contributions for the interferents were taken into account.

6.
There are many chemometric applications, such as spectroscopy, where the objective is to explain a scalar response from a functional variable (the spectrum) whose observations are functions of wavelengths rather than vectors. In this paper, PLS regression is considered for estimating the linear model when the predictor is a functional random variable. Due to the infinite dimension of the space to which the predictor observations belong, they are usually approximated by curves/functions within a finite dimensional space spanned by a basis of functions. We show that PLS regression with a functional predictor is equivalent to finite multivariate PLS regression using expansion basis coefficients as the predictor, in the sense that, at each step of the PLS iteration, the same prediction is obtained. In addition, from the linear model estimated using the basis coefficients, we derive the expression of the PLS estimate of the regression coefficient function from the model with a functional predictor. The results provided by this functional PLS approach are compared with those given by functional PCR and discrete PLS and PCR using different sets of simulated and spectrometric data.

7.
Simultaneous determination of Mn, Zn, Co and Cd was studied by two methods, classical partial least squares (PLS) and kernel partial least squares (KPLS), with 2-(5-bromo-2-pyridylazo)-5-diethylaminophenol (5-Br-PADAP) and cetylpyridinium bromide (CPB). Two programs, SPGRPLS and SPGRKPLS, were designed to perform the calculations. Eight error functions were calculated for deducing the number of factors. Data reductions were performed using principal component analysis. The KPLS method was applied for the rapid determination from a data matrix with many wavelengths and fewer samples. Experimental results showed both methods to be successful even where there was severe overlap of spectra.

8.
The potential of the cross-section (CS) approach in combination with the partial least squares (PLS) and principal component regression (PCR) was assessed in the resolution of a complex pesticide mixture showing twelve overlapped components in High Performance Liquid Chromatography with Diode Array Detection (HPLC-DAD). Careful selection of the CS through the three-dimensional (3D) (A, lambda, t) data matrix gave two-dimensional (2D) signals with the best sensitivity for the determination of each pesticide. In all cases, the application of the PLS method demonstrated a better quantitative prediction ability than that of the PCR method. The CS-PLS approach is a powerful analytical tool. Ten pesticides were well-resolved, while for the other two pesticides of the mixture prediction ability was poor, and they could not be determined, probably due to their low net analytical signal. The CS-PLS model was evaluated by predicting the concentrations of independent test set samples. Finally, the proposed model was successfully applied for the determination of these pesticides in groundwater.

9.
Chemometric approaches, such as classical least squares (CLS), principal component regression (PCR), partial least squares (PLS) and iterative target transformation factor analysis (ITTFA), were applied to the simultaneous determination of mixtures of lead, copper, vanadium, cadmium and nickel by differential pulse polarography (DPP). The conventional and first-derivative polarograms of the mixtures were used to perform the optimization of the calibration procedure by chemometric models. The proposed method was applied satisfactorily to the determination of a set of synthetic metal mixtures in Britton–Robinson buffer (pH 2.87) and potassium thiocyanate, and acceptable results were obtained. The results obtained by the application of the different chemometric approaches are discussed and compared. It was found that factor analysis methods generally give better results than CLS, and that applying the derivative technique offered no significant advantage in this polarographic work, except for ITTFA.

10.
An analytical technique based on kernel matrix representation is demonstrated to provide further chemically meaningful insight into partial least squares (PLS) regression models. The kernel matrix condenses essential information about scores derived from PLS or principal component analysis (PCA). Thus, it becomes possible to establish the proper interpretation of the scores. A PLS model for the total nitrogen (TN) content in multiple Thai fish sauces is built with a set of near-infrared (NIR) transmittance spectra of the fish sauce samples. The kernel analysis of the scores effectively reveals that the variation of the spectral feature induced by the change in protein content is substantially associated with the total water content and the protein hydration. Kernel analysis is also carried out on a set of time-dependent infrared (IR) spectra representing transient evaporation of ethanol from a binary mixture solution of ethanol and oleic acid. A PLS model to predict the elapsed time is built with the IR spectra and the kernel matrix is derived from the scores. The detailed analysis of the kernel matrix provides penetrating insight into the interaction between the ethanol and the oleic acid.

11.
Yao S, Lu J, Dong M, Chen K, Li J, Li J. Applied Spectroscopy, 2011, 65(10): 1197-1201
Laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) analysis has been applied for the quantitative analysis of the ash content of coal in this paper. The multivariate analysis method was employed to extract coal ash content information from LIBS spectra rather than from the concentrations of the main ash-forming elements. In order to construct a rigorous partial least squares regression model and reduce the calculation time, different spectral range data were used to construct partial least squares regression models, and then the performances of these models were compared in terms of the correlation coefficients of calibration and validation and the root mean square errors of calibration and cross-validation. Afterwards, the prediction accuracy, reproducibility, and the limit of detection of the partial least squares regression model were validated with independent laser-induced breakdown spectroscopy measurements of four unknown samples. The results show good agreement between the ash content provided by a thermogravimetric analyzer and the LIBS measurements coupled to the PLS regression model for the unknown samples. The feasibility of extracting coal ash content from LIBS spectra is demonstrated. It is also confirmed that this technique has good potential for quantitative analysis of the ash content of coal.

12.
The strongly overlapping infrared absorption features of atherosclerotic and normal rabbit aorta samples as governed by their water, lipid, and protein content render the direct evaluation of molecular characteristics obtained from infrared (IR) spectroscopic measurements challenging for classification. We have successfully applied multivariate data analysis and classification techniques based on partial least squares regression (PLS), linear discriminant analysis (LDA), and principal component regression (PCR) to IR spectroscopic data obtained by using a recently developed infrared attenuated total reflectance (IR-ATR) catheter prototype for future in vivo diagnostic applications. Training data were collected ex vivo from atherosclerotic and normal rabbit aorta samples. The successful classification results on atherosclerotic and normal aorta samples utilizing the developed data evaluation routines reveal the potential of spectroscopy combined with multivariate classification strategies for the identification of normal and atherosclerotic aorta tissue for in vitro and, in the future, in vivo applications.

13.
An updating procedure is described for improving the robustness of multivariate calibration models based on near-infrared spectroscopy. Employing a single blank sample containing no analyte, repeated spectra are acquired during the instrumental warm-up period. These spectra are used to capture the instrumental profile on the analysis day in a way that can be used to update a previously computed calibration model. By augmenting the original spectra of the calibration samples with a group of spectra collected from the blank sample, an updated model can be computed that incorporates any instrumental drift that has occurred. This protocol is evaluated in the context of an analysis of physiological levels of glucose in a simulated biological matrix designed to mimic blood plasma. Employing data of calibration and prediction samples acquired over approximately six months, procedures are studied for implementing the algorithm in conjunction with calibration models based on partial least squares (PLS) regression. Over the range of 1-20 mM glucose, the final algorithm achieves a standard error of prediction (SEP) of 0.79 mM when the augmented PLS model is applied to data collected 176 days after the collection of the calibration spectra. Without updating, the original PLS model produces a seriously degraded SEP of 13.4 mM.
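
The standard error of prediction quoted above can be sketched as a root-mean-square difference between predicted and reference concentrations. This is an assumption about the definition: the abstract does not give the formula, and some authors subtract the mean bias or divide by n - 1 instead. The function name and the example values are hypothetical.

```python
import math


def standard_error_of_prediction(reference, predicted):
    """Root-mean-square difference between predicted and reference values.

    Assumed definition (no bias correction, divisor n); the source
    abstract does not state which variant of SEP it reports.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical glucose values in mM, for illustration only.
reference_mM = [1.0, 2.0, 3.0]
predicted_mM = [1.0, 2.0, 5.0]
sep = standard_error_of_prediction(reference_mM, predicted_mM)
```

The result carries the same units as the property being predicted, which is why the abstract can compare 0.79 mM and 13.4 mM directly against the 1-20 mM calibration range.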

14.
Two novel methods are described for direct quantitative analysis of NMR free induction decay (FID) signals. The methods use adaptations of the generalized rank annihilation method (GRAM) and the direct exponential curve resolution algorithm (DECRA). With FID-GRAM, the Hankel matrix of the sample signal is compared with that of a reference mixture to obtain quantitative data about the components. With FID-DECRA, a single-sample FID matrix is split into two matrices, allowing quantitative recovery of decay constants and the individual signals in the FID. Inaccurate results were obtained with FID-GRAM when there were differences between the frequency or transverse relaxation time of signals for the reference and test samples. This problem does not arise with FID-DECRA, because comparison with a reference signal is unnecessary. Application of FID-DECRA to 19F NMR data, which contained overlapping signals from three components, gave concentrations comparable to those derived from partial least squares (PLS) analysis of the Fourier transformed spectra. However, the main advantage of FID-DECRA was that accurate (<5% error) and precise (2.3% RSD) results were obtained using only one calibration sample, whereas with PLS, a training set of 10 standard mixtures was used to give comparable accuracy and precision.
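
The Hankel matrix mentioned above, and one plausible way of splitting it into two matrices, can be sketched as follows. The helper names, the one-column shift used for the split, and the synthetic single-exponential signal are assumptions for illustration; the abstract does not specify how the paper performs the split, and the subsequent eigenvalue step is not shown.

```python
def hankel_matrix(signal, n_rows):
    """Build a Hankel matrix H with H[i][j] = signal[i + j]."""
    n_cols = len(signal) - n_rows + 1
    if n_rows < 1 or n_cols < 1:
        raise ValueError("n_rows must be between 1 and len(signal)")
    return [[signal[i + j] for j in range(n_cols)] for i in range(n_rows)]


def split_shifted(matrix):
    """Split a matrix into two copies offset by one column (assumed split)."""
    first = [row[:-1] for row in matrix]
    second = [row[1:] for row in matrix]
    return first, second


# Synthetic decay s[t] = 2 * 0.5**t (hypothetical, real-valued for simplicity).
decay_per_point = 0.5
signal = [2.0 * decay_per_point ** t for t in range(6)]

H = hankel_matrix(signal, 3)   # 3 rows, 4 columns
A, B = split_shifted(H)        # two 3 x 3 matrices

# For a single exponential, B equals A scaled by the per-point decay factor.
ratio_ok = all(
    b == decay_per_point * a
    for row_a, row_b in zip(A, B)
    for a, b in zip(row_a, row_b)
)
```

For a pure exponential the second matrix is the first one multiplied by the per-point decay factor, which is the property that lets decay constants be recovered from a single sample without a reference signal.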

15.
This paper investigates the use of Fourier transform infrared (FTIR) attenuated total reflectance (ATR) spectroscopy as a fast and simple way for direct determination of nitrate concentration in soil pastes, which would assist precision fertilizer placement and reduce nitrate pollution. Eight types of soils are investigated, with nitrate concentrations ranging from 0 to 1000 ppm-N. The spectral region around the nitrate band (1300-1550 cm⁻¹) is analyzed by (1) principal component regression (PCR), (2) partial least squares (PLS), and (3) cross-correlation with reference libraries that include spectra of pure ions and/or soils. The main obstacle to accurate nitrate measurement appears to be an interfering band present in calcareous soils. This band, which may be due to carbonate, is located around 1450 cm⁻¹ and overlaps with the nitrate band centered around 1370 cm⁻¹. For non-calcareous soils, and in particular for light sandy agricultural soils, PLS and cross-correlation with a reference library containing only spectra of ions in water give similar results (about 8 ppm-N on dry soil basis), while PCR leads to slightly poorer results. When calcareous soils are included in the analysis, the prediction errors are about twice as large. In this case, the best results are obtained using PLS, followed by PCR, while cross-correlation with reference libraries leads to poorer results.

16.
The need for automated quality surveillance of liquid hydrocarbon fuels has driven the development of rapid fuel property modeling from spectroscopic sensor data. The correlation of near-infrared (NIR) and Raman spectroscopic data with jet and diesel fuel properties can be improved by the deliberate selection of continuous wavelength sub-ranges. An automatic wavelength selection strategy would allow for the unsupervised construction of partial least squares (PLS) regression models of increased predictive utility when supervised model construction and maintenance is not feasible. Changeable size moving window partial least squares (CSMWPLS) is one of the most thorough operations suited for this task. Unfortunately, the necessarily large number of PLS model constructions required by an automated version of this procedure limits the evaluation of the predictive ability of the resulting models through full cross-validation results. Presented here is a novel restricted version of the CSMWPLS algorithm in which the initial spectral range selection is accomplished through multiple interval PLS (iPLS) analyses, where analysis windows for the refinement step no longer move, and size changes are limited to a series of symmetric attenuations. It is shown that the proposed algorithm can provide significant PLS model improvements during the course of a fully automated analysis of jet and diesel fuel spectra in less time than an automated CSMWPLS algorithm.

17.
Partial least squares (PLS) regression is commonly used for multivariate calibration of instruments. Because of the need to know the quality of the prediction in a specific unknown sample and the lack of theory, an ‘empirically found formula’ to express the uncertainty is utilized in The Unscrambler II software, the de facto standard in computer software for PLS. In this critique the formula is examined theoretically and by simulation. It is concluded that this formula underestimates the root mean squared error of prediction in most practical applications of PLS. A change of the formula is planned in the next version of The Unscrambler. In the meantime, users of The Unscrambler ver 5.5 or lower should multiply the reported deviation by a factor of at least , to get a reasonable estimate of the prediction error.

18.
The use of multiple calibration sets in partial least squares (PLS) regression was proposed to improve the quantitative determination of NH₃ over wide concentration ranges from open-path Fourier transform infrared (OP/FT-IR) spectra. The spectra were measured near animal farms, where the path-integrated concentration of NH₃ can fluctuate from nearly zero to as high as approximately 1000 ppm-m. PLS regression with a single calibration set did not cover such a large concentration range effectively, and the quantitative accuracy was degraded due to the nonlinear relationship between concentration and absorbance for spectra measured at low resolution (1 cm⁻¹ and poorer). In PLS regression with multiple calibration sets, each calibration set covers a part of the entire concentration range, which significantly decreases the serious nonlinearity problem in PLS regression occurring when only a single calibration set is used. The relative error was reduced from approximately 6% to below 2%, and the best results were obtained with four calibration sets, each covering one quarter of the entire concentration range. It was also found that it was possible to build the multiple calibration sets easily and efficiently without extra measurements.

19.
Two new approaches to multivariate calibration are described that, for the first time, allow information on measurement uncertainties to be included in the calibration process in a statistically meaningful way. The new methods, referred to as maximum likelihood principal components regression (MLPCR) and maximum likelihood latent root regression (MLLRR), are based on principles of maximum likelihood parameter estimation. MLPCR and MLLRR are generalizations of principal components regression (PCR), which has been widely used in chemistry, and latent root regression (LRR), which has been virtually ignored in this field. Both of the new methods are based on decomposition of the calibration data matrix by maximum likelihood principal component analysis (MLPCA), which has been recently described (Wentzell, P. D.; et al. J. Chemom., in press). By using estimates of the measurement error variance, MLPCR and MLLRR are able to extract the optimum amount of information from each measurement and, thereby, exhibit superior performance over conventional multivariate calibration methods such as PCR and partial least-squares regression (PLS) when there is a nonuniform error structure. The new techniques reduce to PCR and LRR when assumptions of uniform noise are valid. Comparisons of MLPCR, MLLRR, PCR, and PLS are carried out using simulated and experimental data sets consisting of three-component mixtures. In all cases of nonuniform errors examined, the predictive ability of the maximum likelihood methods is superior to that of PCR and PLS, with PLS performing somewhat better than PCR. MLLRR generally performed better than MLPCR, but in most cases the improvement was marginal. The differences between PCR and MLPCR are elucidated by examining the multivariate sensitivity of the two methods.

20.
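To make effective use of vibration signals for fault diagnosis, a bearing fault diagnosis model based on neighborhood-adaptive locality preserving projection is proposed. First, empirical mode decomposition (EMD) is used to decompose the bearing vibration signal into several stationary intrinsic mode functions (IMFs), and an autoregressive (AR) model is built for each IMF component to construct the original feature subset. Then, the neighborhood-adaptive locality preserving projection algorithm is applied to reduce the dimensionality of the original feature subset, yielding its low-dimensional feature vectors and the projection matrix. With the low-dimensional feature vectors as input and a least squares support vector machine (LS-SVM) as the classifier, the optimal reduced dimension and the corresponding optimal projection matrix are determined by studying the relationship between the fault recognition rate and the dimension of the low-dimensional feature space. Finally, dimensionality reduction is carried out using the optimal reduced dimension, and the resulting low-dimensional feature vectors are fed to the LS-SVM classifier to identify the working state and fault type of the bearing. Experimental results show that the model improves the accuracy of bearing fault diagnosis.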
