Similar literature
20 similar articles found.
1.
Unambiguous recovery of profiles is a distinctive advantage of Parallel Factor Analysis (PARAFAC) as a trilinear model and has made it a promising exploratory tool for data analysis. Linear dependency in profiles destroys trilinearity and increases ambiguity in the curve resolution of three-way data sets. PARAFAC uniqueness deteriorates totally or partially in data sets with linearly dependent loadings. A reliable method for the determination and direct visualization of feasible bands in the PARAFAC model can help not only in fully characterizing uniqueness conditions but also in investigating the effects of constraints on the PARAFAC feasible solutions. The purpose of this paper is twofold. First, the calculation of rotational ambiguity in the PARAFAC model is extended to three-component systems. The principle behind the algorithm is described in detail and tested on simulated and real data sets. Completely general and thoroughly investigated results are presented for the three-component cases. Second, the effects of selective regions in the profiles on the resolution of systems that suffer from rank deficiency due to rank overlap are emphasized. For two-way data sets, the effect of the selectivity constraint on the unique recovery of profiles has previously been investigated and applied. However, to our knowledge, this report is the first to systematically investigate the effect of selective windows in the profiles on the unique resolution of three-way data sets.

2.
The implementation of maximum likelihood parallel factor analysis (MLPARAFAC) in conjunction with the direct exponential curve resolution algorithm (DECRA) is described. DECRA takes advantage of the intrinsic exponential structure of some bilinear data sets to produce trilinear data by a simple shifting scheme, but this manipulation generates an error structure that is not optimally handled by traditional three-way chemometrics methods such as TLD and PARAFAC. In this work, the effects of these violations are studied using simulated and experimental data in conjunction with the well-established TLD and PARAFAC. The results obtained by both methods are compared with the results obtained by MLPARAFAC, which is a method designed to optimally accommodate a variety of measurement error structures. The impact on the estimates of different parameters linked to the data sets and the DECRA method is investigated using simulated data. The results indicate that PARAFAC produces estimates of much poorer quality than TLD and MLPARAFAC. Also, it was found that the quality of the TLD estimates was comparable to or only marginally poorer than that of the MLPARAFAC estimates. A number of commonly used algorithms were also compared to MLPARAFAC using two sets of published experimental data from kinetic studies. The MLPARAFAC estimates of rate constants were more precise than those of the other methods examined.
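
The shifting scheme that DECRA relies on can be illustrated with a minimal sketch. It assumes noiseless, evenly sampled first-order decays; the function name `decra_rates` and the simulated values are hypothetical and are not taken from the paper. Two row-shifted copies of the same bilinear matrix differ only by a diagonal factor exp(-k·Δt) per component, so the rate constants follow from a small eigenvalue problem. The error-structure issue the abstract raises (which motivates MLPARAFAC) is not addressed here.

```python
import numpy as np

def decra_rates(X, n_components, dt):
    """Estimate first-order decay rates from an evenly sampled bilinear matrix.

    X has shape (n_times, n_channels); rows are spectra at successive times.
    Illustrative sketch only: no noise handling or weighting.
    """
    X1, X2 = X[:-1], X[1:]  # two slabs shifted by one time step
    U, s, Vt = np.linalg.svd(X1, full_matrices=False)
    U, s, Vt = U[:, :n_components], s[:n_components], Vt[:n_components]
    # Projecting X2 onto the rank-F subspace of X1 gives a small matrix
    # whose eigenvalues are exp(-k * dt), one per component.
    M = (U.T @ X2 @ Vt.T) / s[:, None]
    eigvals = np.linalg.eigvals(M)
    return np.sort(-np.log(np.real(eigvals)) / dt)

# Hypothetical two-component mixture with made-up rates and spectra.
t = 0.1 * np.arange(50)
C = np.exp(-np.outer(t, [0.5, 2.0]))           # decay profiles
S = np.array([[1.0, 0.5, 0.0, 0.2],
              [0.0, 0.3, 1.0, 0.6]])            # pure spectra
print(decra_rates(C @ S, n_components=2, dt=0.1))
```

With real data, measurement noise makes the shifted slabs correlated, which is the violation the abstract studies.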

3.
PARAFAC is one of the most widely used algorithms for trilinear decomposition. The uniqueness properties of the PARAFAC model are very attractive regardless of whether one is interested in curve resolution or not. The fact that PARAFAC provides one unique solution simplifies interpretation of the model. But in three-way data arrays, uniqueness can be expected only when k_A + k_B + k_C ≥ 2F + 2, where F is the number of components and the k's are the Kruskal ranks of loadings A to C. Since second-order instruments produce data of varying complexity depending upon the nature of the analytical techniques being combined, with some three-way data it is possible for patterns generated by the underlying sources of variation to have sufficiently independent effects in two modes, yet nonetheless be proportional in a third mode. For example, in three-way data for spectrophotometric titrations of weak acids or bases (pH-wavelength-sample), a rank deficiency may occur in two modes, that is, closure rank deficiency in the pH mode and proportionality rank deficiency in the sample direction, because each analyte will have acidic and basic forms that are linear combinations in the sample mode. The goal of the present paper is to overcome the non-uniqueness problem in the second-order calibration of monoprotic acid mixtures. The solution contains two steps. First, each pH-absorbance matrix is pretreated by subtracting the first spectrum from each spectrum in the data matrix. This pretreated data matrix is called the variation matrix. Second, by stacking the variation matrices, a three-way trilinear variation data array is obtained without the proportional linear dependency problem, and it can be resolved uniquely by PARAFAC. It is shown that, although unique results are not guaranteed by Kruskal's condition for the original three-way data, this condition is fulfilled for the pretreated three-way data. Hence, the variation array may be uniquely decomposed by the PARAFAC algorithm. Studies on simulated as well as real data arrays reveal the applicability of the proposed method to this kind of problem in the second-order calibration of monoprotic acids. Copyright © 2008 John Wiley & Sons, Ltd.
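
The uniqueness condition and the pretreatment step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the Kruskal rank is computed by brute force (practical only for a handful of components), and `variation_matrix` assumes each row of the pH-absorbance matrix is one spectrum. Kruskal's inequality is a sufficient condition for uniqueness.

```python
import numpy as np
from itertools import combinations

def kruskal_rank(A, tol=1e-10):
    """Largest k such that every set of k columns of A is linearly independent."""
    n_cols = A.shape[1]
    k = 0
    for size in range(1, n_cols + 1):
        subsets = combinations(range(n_cols), size)
        if all(np.linalg.matrix_rank(A[:, list(cols)], tol=tol) == size
               for cols in subsets):
            k = size
        else:
            break
    return k

def kruskal_condition(A, B, C):
    """True if k_A + k_B + k_C >= 2F + 2 for loading matrices with F columns."""
    F = A.shape[1]
    return kruskal_rank(A) + kruskal_rank(B) + kruskal_rank(C) >= 2 * F + 2

def variation_matrix(D):
    """Subtract the first spectrum (row 0) from every spectrum in D."""
    return D - D[0]
```

For example, two proportional columns in one mode give a Kruskal rank of 1 for that mode, so a two-component model with full-rank loadings in the other two modes reaches only 1 + 2 + 2 = 5 < 6 and the condition fails.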

4.
One of the main problems that limit the use of model-free analysis methods for the resolution of multivariate data is that there is usually rotational ambiguity in the result. While methods for the complete definition of rotational ambiguity for two- and three-component systems have been published recently, the comprehensive and general resolution of rotational ambiguity for four-component systems has eluded chemists for several decades. We have developed an extension of self-modelling curve resolution for a mixture of four components. The performance of the method was verified by applying it to resolve simulated and real data sets.

5.
Diffusion-ordered spectroscopy (DOSY) NMR is based on a pulsed-field gradient spin-echo NMR experiment, in which components experience diffusion. Consequently, the signal of each component decays at a different rate, determined by its diffusion coefficient, as the gradient strength increases, producing a bilinear NMR data set of a mixture. By calculating the diffusion coefficient for each component, it is possible to obtain a two-dimensional NMR spectrum: one dimension is for the conventional chemical shift and the other for the diffusion coefficient. The most interesting point is that this two-dimensional NMR allows non-invasive “chromatography” to obtain the pure spectrum for each component, providing a possible alternative to LC-NMR, which is more expensive and time-consuming. Potential applications of DOSY NMR include identification of the components and impurities in complex mixtures, such as body fluids or reaction mixtures, and in technical or commercial products, e.g. those comprising polymers or surfactants.

Data processing is the most important step in interpreting DOSY NMR. Single channel methods and multivariate methods have been proposed for the data processing, but all of them have difficulties when applied to real-world cases. The big challenge appears when dealing with more complex samples, e.g. components with small differences in diffusion coefficients or with severe overlap in the chemical shift dimension. Two single channel methods, SPLMOD and continuous diffusion coefficient (CONTIN), and two multivariate methods, direct exponential curve resolution algorithm (DECRA) and multivariate curve resolution (MCR), are critically evaluated using simulated and real DOSY data sets. The assessments in this paper indicate that DOSY data processing can be improved by applying iterative principal component analysis (IPCA) followed by MCR-alternating least squares (MCR-ALS).


6.
In the present contribution, a new combination of multivariate curve resolution-correlation optimized warping (MCR-COW) with trilinear parallel factor analysis (PARAFAC) is developed to exploit the second-order advantage in complex chromatographic measurements. In MCR-COW, the complexity of the chromatographic data is reduced by arranging the data in a column-wise augmented matrix, analyzing it with the bilinear MCR model, and aligning the resolved elution profiles using COW in a component-wise manner. The aligned chromatographic data are then decomposed using the trilinear PARAFAC model in order to extract pure chromatographic and spectroscopic information. The performance of this strategy is evaluated using simulated and real high-performance liquid chromatography-diode array detection (HPLC-DAD) datasets. The results showed that MCR-COW can efficiently correct elution time shifts of target compounds that are completely overlapped by coeluted interferences in complex chromatographic data. In addition, the PARAFAC analysis of aligned chromatographic data has the advantage of unique decomposition of overlapped chromatographic peaks to identify and quantify the target compounds in the presence of interferences. Finally, to confirm the reliability of the proposed strategy, the performance of MCR-COW-PARAFAC is compared with the frequently used methods PARAFAC, COW-PARAFAC, multivariate curve resolution-alternating least squares (MCR-ALS), and MCR-COW-MCR. In most cases, MCR-COW-PARAFAC showed an improvement in terms of lack of fit (LOF), relative error (RE) and spectral correlation coefficients in comparison to the PARAFAC, COW-PARAFAC, MCR-ALS and MCR-COW-MCR results.
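
Two of the ingredients named above, column-wise augmentation and lack of fit, can be sketched briefly. The function names are hypothetical, and the LOF expression is the percentage form commonly used in the MCR literature (root of residual sum of squares over data sum of squares); the paper may define it slightly differently. The COW alignment step is not shown.

```python
import numpy as np

def augment_columnwise(runs):
    """Stack chromatographic runs on top of each other.

    Each run has shape (n_times_i, n_wavelengths); all runs share the
    spectral (column) mode, so the result has sum(n_times_i) rows.
    """
    return np.vstack(runs)

def lack_of_fit(D, C, S):
    """Percent lack of fit of the bilinear model D ~ C @ S.T."""
    residual = D - C @ S.T
    return 100.0 * np.sqrt(np.sum(residual**2) / np.sum(D**2))
```

A LOF of 0 means the bilinear model reproduces the data exactly; a LOF of 100 means the model explains none of the signal.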

7.
The results obtained by soft-modeling multivariate curve resolution methods are often not unique and are questionable because of rotational ambiguity. This means that a range of feasible solutions fit the experimental data equally well and fulfill the constraints. According to the chemometric literature, surveying useful constraints for the reduction of rotational ambiguity is a major challenge for chemometricians. It is worthwhile to study the effects of applying constraints on the reduction of rotational ambiguity, since this can help in choosing useful constraints to impose in multivariate curve resolution methods when analyzing data sets. In this work, we have investigated the effect of the equality constraint on decreasing rotational ambiguity. For the calculation of all feasible solutions corresponding to a known spectrum, a novel systematic grid search method based on Species-based Particle Swarm Optimization is proposed for a three-component system.
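
Rotational ambiguity as described above can be shown in a few lines: any invertible matrix T turns one bilinear solution (C, S) into another that reproduces the data identically. The sketch below is a generic illustration with hypothetical function names; it is not the grid search or particle swarm method of the paper, and it only checks non-negativity as an example constraint.

```python
import numpy as np

def rotate_solution(C, S, T):
    """Return (C @ T, S @ inv(T).T), which gives the same product C @ S.T."""
    C_new = C @ T
    S_new = S @ np.linalg.inv(T).T
    return C_new, S_new

def is_feasible(C, S, tol=1e-12):
    """True if both concentration and spectral profiles are non-negative."""
    return bool(np.all(C >= -tol) and np.all(S >= -tol))
```

Every T for which `is_feasible` still holds yields an equally valid solution under non-negativity alone; additional constraints, such as fixing one spectrum to a known profile, shrink the set of admissible T.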

8.
This paper introduces some chemometric methods, i.e., self-modeling curve resolution (SMCR), multivariate curve resolution-alternating least squares (MCR-ALS) and parallel factor analysis (PARAFAC and PARAFAC2), which are used to evaluate in vitro dissolution testing data recorded with a UV-vis spectrophotometer on meloxicam-mannitol binary systems. These systems were chosen because their relative simplicity makes them suitable for the validation process, illustrating the effectiveness of the developed and applied chemometric method. The paper also illustrates the failure of PARAFAC methods previously used for pharmaceutical data evaluation, and we suggest applying the feasible band form given by SMCR as a more general procedure. Steps to improve the dissolution behavior of drugs have become among the most interesting aspects of pharmaceutical technology, and our results show that a larger particle size of meloxicam is advantageous for dissolution. Instead of the use of only one characteristic wavelength, appropriate chemometric methods can furnish more information from dissolution testing data, i.e., the individual dissolution rate profiles and the individual spectra for all the components can be obtained without resorting to any separation techniques such as HPLC.

9.
A multivariate curve resolution-particle swarm optimization (MCR-PSO) algorithm is proposed to extract pure chromatographic and spectroscopic information from multi-component hyphenated chromatographic signals. This new MCR method is based on rotation of mathematically unique PCA solutions into chemically meaningful MCR solutions. To obtain a proper rotation matrix, an objective function based on non-fulfillment of constraints is defined and optimized using the particle swarm optimization (PSO) algorithm. Initial values of the rotation matrix are calculated using local rank analysis and the heuristic evolving latent projection (HELP) method. The ability of MCR-PSO to resolve chromatographic data is evaluated using simulated gas chromatography–mass spectrometry (GC–MS) and high-performance liquid chromatography–diode array detection (HPLC–DAD) data. To present a comprehensive study, different numbers of components and various levels of noise are considered under the constraints of non-negativity, unimodality and spectral normalization. Calculation of the extent of rotational ambiguity in MCR solutions for different chromatographic systems using the MCR-BANDS method showed that MCR-PSO solutions, like the true solutions, always lie within the range of feasible solutions. In addition, the performance of MCR-PSO is compared with two other popular MCR methods, multivariate curve resolution-objective function minimization (MCR-FMIN) and multivariate curve resolution-alternating least squares (MCR-ALS). The results showed that MCR-PSO solutions are similar to or, in some cases, better than those of the other MCR methods in terms of statistical parameters. Finally, MCR-PSO is successfully applied to the resolution of real GC–MS data. It should be pointed out that, in addition to multivariate resolution of hyphenated chromatographic signals, the MCR-PSO algorithm can be straightforwardly applied to other types of separation, spectroscopic and electrochemical data.

10.
11.
One of the difficulties frequently encountered when studying acid–base equilibria with NMR spectroscopy is the labile behaviour of the measured signal, which hinders the application of bilinear multivariate data analysis methods. In this work, a mathematical transformation is proposed for the conversion of labile NMR signals to inert signals, which makes possible the application of multivariate data analysis methods based on bilinear data models. The procedure has been applied to the analysis of NMR data corresponding to the acid–base equilibria of the nucleotides dCMP and dGMP. Both hard-modelling (EQUISPEC) and soft-modelling (MCR-ALS) approaches have been applied for the analysis and resolution of the transformed bilinear NMR data matrices.

12.
In this paper, augmentation has been applied to data matrices that originate from hyphenated methods sharing the same mode of detection but using different separation methods, HPLC-DAD and MEKC-DAD. A novel method, wavelength shift eigenstructure tracking (WET), has been proposed for aligning the wavelength scales of the two detectors. WET proves to be suitable for the detection as well as the correction of wavelength shift between the two detectors. After correction of the wavelength scale, data obtained on both systems have been augmented and submitted to iterative target transformation factor analysis. Augmented curve resolution provides significantly better estimates of the chromatographic and electrophoretic profiles and spectra than non-augmented curve resolution applied to the HPLC and MEKC data separately. It is particularly useful when the pure fraction of a chromatographic peak is less than 0.10. Finally, the relative weight of MEKC versus HPLC in augmentation may be increased using intensity and noise normalisation. However, since noise normalisation and its accompanying decrease in signal-to-noise ratio lead to a loss of information, and since intensity normalisation may cause a failure of the augmented curve resolution algorithm, the benefits and drawbacks of normalisation should be weighed on a case-by-case basis.

13.
Analytical Letters, 2012, 45(8): 933-948
This overview summarizes the application and impact of chemometrics on the extraction and interpretation of analytical data with the use of curve resolution methods from about 2005 onward. The development and usage of well-known and novel chemometric methods are described, and approximately 85 papers are referenced. Many suggested improvements to some well-known methods, for example, multivariate curve resolution, are noted, as is the growing body of software for such methods. Also, these high-dimensional resolution methods have found significant application and, arguably, have opened up a new perspective in calibration, that is, extraction of otherwise unobtainable analytical information from strongly overlapping profiles in the presence of interferences. Recent literature suggests that chemometric methods in analytical chemistry provide indispensable tools for multivariate data processing and for the extraction of hidden information that would otherwise be difficult to obtain.

14.
As genome-sequencing projects rapidly increase the database of protein sequences, the gap between known sequences and known structures continues to grow exponentially, increasing the demand to accelerate structure determination methods. Residual dipolar couplings (RDCs) are an attractive source of experimental restraints for NMR structure determination, particularly rapid, high-throughput methods, because they yield both local and long-range orientational information and can be easily measured and assigned once the backbone resonances of a protein have been assigned. While very extensive RDC data sets have been used to determine the structure of ubiquitin, it is unclear to what extent such methods will generalize to larger proteins with less complete data sets. Here we incorporate experimental RDC restraints into Rosetta, an ab initio structure prediction method, and demonstrate that the combined algorithm provides a general method for de novo determination of a variety of protein folds from RDC data. Backbone structures for multiple proteins up to approximately 125 residues in length and spanning a range of topological complexities are rapidly and reproducibly generated using data sets that are insufficient in isolation to uniquely determine the protein fold de novo, although ambiguities and errors are observed for proteins with symmetry about an axis of the alignment tensor. The models generated are not high-resolution structures completely defined by experimental data but are sufficiently accurate to accelerate traditional high-resolution NMR structure determination and provide structure-based functional insights.

15.
A multiobjective evolutionary algorithm (MOEA) is described for evolving multiple structure-activity relationships (SARs). The SARs are encoded in easy-to-interpret reduced graph queries which describe features that are preferentially present in active compounds compared to inactives. The MOEA addresses a limitation associated with many machine learning methods, namely the inherent tradeoff between recall and precision, which is usually handled by combining the two objectives into a single measure with a consequent loss of control. By simultaneously optimizing recall and precision, the MOEA generates a family of SARs that lie on the precision-recall (PR) curve. The user is then able to select a query with an appropriate balance between the two objectives: for example, a low recall-high precision query may be preferred when establishing the SAR, whereas a high recall-low precision query may be more appropriate in a virtual screening context. Each query on the PR curve aims at capturing the structure-activity information in a single representation, and each can be considered an alternative (equally valid) solution. We then investigate combining individual queries into teams with the aim of capturing multiple SARs that may exist in a data set, as is commonly seen, for example, in high-throughput screening data sets. Team formation is carried out iteratively as a postprocessing step following the evolution of the individual queries. The inclusion of uniqueness as a third objective within the MOEA provides an effective way of ensuring the queries are complementary in the active compounds they describe. Substantial improvements in both recall and precision are seen for some data sets. Furthermore, the resulting queries provide more detailed structure-activity information than is present in a single query.

16.
Two basic reasons are proposed for the tremendous success and future promise of mass spectrometry: (1) the unusually high volume of data obtainable from unusually small samples and (2) the success in converting these data into structural and quantitative information. The ion abundance dimension of mass spectrometric data is remarkable in its pico-to-attogram sensitivity and >10^6 dynamic range, and the mass scale dimension is uniquely high in the number of resolution increments for larger molecule ionization and high resolution. Additional dimensions of data arise from chromatographic coupling to mass spectrometry and tandem mass spectrometry, as well as from alternative ionization and ion reaction methods. Converting these data into chemical information is equally important. Past progress in these areas has been cyclical; for the immediate future a greater research emphasis is urged to convert data to information through better understanding of the relevant chemistry and better utilization of modern computer methods.

17.
A general mean field theory is presented for the construction of equilibrium coarse-grained models. Inverse methods that reconstruct microscopic models from low resolution experimental data can be derived as particular implementations of this theory. The theory also applies to the opposite problem of reduction, where relevant information is extracted from available equilibrium ensemble data. Additionally, a complementary approach is presented, and problems of representability in coarse-grained modeling are analyzed using information theoretic arguments. These problems are central to the construction of coarse-grained representations of complex systems, and commonly used coarse-graining methods and variational principles for coarse-graining are derived as particular cases of the general theory.

18.
Multivariate curve resolution techniques are powerful tools to extract from sequences of spectra of a chemical reaction system the number of independent chemical components, their associated spectra, and the concentration profiles in time. Usually, these solutions are not unique because of the so‐called rotational ambiguity. In the present work, we reduce the non‐uniqueness by enforcing the consistency of the computed concentration profiles with a given kinetic model. Traditionally, the kinetic modeling is realized in a separate step, which follows the multivariate curve resolution procedure. In contrast to this, we consider a hybrid approach that combines the model‐free curve resolution technique with the model‐based kinetic modeling in an overall optimization. For a two‐component model problem, the range of possible solutions is analyzed, and its reduction to a single, unique solution by means of the hybrid kinetic modeling is shown. The algorithm reduces the rotational ambiguity and improves the quality of the kinetic fitting. Numerical results are also presented for a multi‐component catalytic reaction system that obeys the Michaelis–Menten kinetics. Copyright © 2012 John Wiley & Sons, Ltd.

19.
Preprocessing of raw near-infrared (NIR) spectral data is indispensable in multivariate calibration when the measured spectra are subject to significant noise, baseline effects and other undesirable factors. However, due to the lack of sufficient prior information and incomplete knowledge of the raw data, NIR spectral preprocessing in multivariate calibration is still a matter of trial and error. Selecting a proper method depends largely on both the nature of the data and the expertise and experience of the practitioners. This might limit the application of multivariate calibration in many fields where researchers are not very familiar with the characteristics of the many preprocessing methods unique to chemometrics and have difficulty selecting the most suitable ones. Another problem is that many preprocessing methods, when used alone, might degrade the data in certain respects or lose some useful information while improving certain qualities of the data. In order to tackle these problems, this paper proposes a new concept of data preprocessing, the ensemble preprocessing method, in which partial least squares (PLS) models built on differently preprocessed data are combined by Monte Carlo cross validation (MCCV) stacked regression. Little or no prior information about the data and little or no expertise are required. Moreover, fusion of the complementary information obtained by different preprocessing methods often leads to a more stable and accurate calibration model. The investigation of two real data sets has demonstrated the advantages of the proposed method.

20.
Analytical Letters, 2012, 45(7): 1089-1106
This review is focused on the impact of chemometrics for resolving data sets collected from investigations of the interactions of small molecules with biopolymers. These samples have been analyzed with various instrumental techniques, such as fluorescence, ultraviolet–visible spectroscopy, and voltammetry. The impact of two powerful and demonstrably useful multivariate methods for the resolution of complex data, multivariate curve resolution–alternating least squares (MCR–ALS) and parallel factor analysis (PARAFAC), is highlighted through analysis of applications involving the interactions of small molecules with the biopolymers serum albumin and deoxyribonucleic acid. The outcomes illustrated that significant information extracted by the chemometric methods was unattainable by simple, univariate data analysis. In addition, although the techniques used to collect data were confined to ultraviolet–visible spectroscopy, fluorescence spectroscopy, circular dichroism, and voltammetry, data profiles produced by other techniques may also be processed. Topics considered include binding sites and modes, cooperative and competitive small molecule binding, kinetics and thermodynamics of ligand binding, and the folding and unfolding of biopolymers. The applications of the MCR–ALS and PARAFAC methods reviewed were primarily published between 2008 and 2013.
