Similar literature
 20 similar documents found (search time: 592 ms)
1.
A new user-friendly graphical interface and a command-line MATLAB program for evaluating the extent of rotation ambiguities associated with Multivariate Curve Resolution solutions are presented. Different examples of application are shown, including the simultaneous analysis of multiple data sets and the implementation of local rank and trilinearity constraints, basic tools to reduce and eliminate rotation ambiguities. The program allows an easy check of the extent of rotation ambiguity remaining in Multivariate Curve Resolution solutions for a particular system, and it also allows the effect of the applied constraints to be checked. In this way, the conditions and limitations for achieving optimal solutions in Multivariate Curve Resolution are easily assessed.

2.
Kim BC, Youn CH, Ahn JM, Gu MB. Analytical Chemistry, 2005, 77(24): 8020-8026
In this study, we describe a straightforward strategy to develop whole cell-based biosensors using fusions of the bacterial bioluminescence genes and the promoters of chemically responsive genes within Escherichia coli, in which the chemical target-responsive genes were screened using gene expression data obtained from DNA microarray analysis. Paraquat was used as a model chemical to trigger gene expression changes in E. coli and to demonstrate the DNA microarray-assisted development of whole cell-based biosensors. Gene expression data from the DNA microarray were obtained by time course analysis (10, 30, and 60 min) after exposure to paraquat. After clustering the gene expression data obtained by time course analysis, a group of genes highly expressed across all time courses could be classified. Within this group, three genes that were highly expressed at all time points were selected, and the promoters of these genes were used as fusion partners with the reporter genes, lux CDABE, to construct whole cell-based biosensors. The constructed biosensors recognized the presence of the model inducer, paraquat, and of structural analogues of paraquat with high specificity, and the results were reconfirmed by DNA microarray experiments for those structural analogues. This strategy of developing whole cell-based biosensors assisted by DNA microarray information should be generally useful for constructing chemical-specific or stress-specific biosensors in a high-throughput manner.

3.
4.
Software-based feature extraction from DNA microarray images still requires human intervention on various levels. Manual adjustment of grid and metagrid parameters, precise alignment of superimposed grid templates and gene spots, or simply identification of large-scale artifacts have to be performed beforehand to reliably analyze DNA signals and correctly quantify their expression values. Ideally, a Web-based system whose input is confined to a single microarray image, and whose output is a data table containing measurements for all gene spots, would directly transform raw image data into abstracted gene expression tables. Sophisticated algorithms with advanced iterative correction procedures can overcome the challenges inherent in image processing. Herein, an integrated software system is introduced with a Java-based interface on the client side that allows for decentralized access and furthermore enables the scientist to instantly employ the most up-to-date software version at any given time. This software tool is extended from PixClust as used in Extractiff and incorporates Java Web Start deployment technology. Ultimately, this setup is destined for high-throughput pipelines in genome-wide medical diagnostics labs or microarray core facilities aimed at providing a fully automated service to their users.

5.
Analysis of Variance (ANOVA) separates the effects of different factors in a dataset. Typical examples for gene microarray data are the factors time and treatment. This separation can improve the interpretability of the results. However, the main effects and interactions calculated in ANOVA can be heavily influenced by outliers, large numbers of non-expressed genes with noise, and the heavy-tailedness of the distribution of expression values. Robust methods are less affected by these and will improve the analysis. In this paper, several methods to perform robust nonparametric ANOVA are applied to a large multi-treatment time series dataset. The results are compared with those obtained with parametric ANOVA using Procrustes analysis. A further comparison is made by Gene Ontology (GO) enrichment analysis of groups of genes identified as significant by inspection of the interaction terms in ANOVA. It is shown that there are significant differences in the estimates of main effects and gene–treatment interactions. ROC curves show an improved representation of current biological knowledge for one particular robust form of ANOVA, which uses rank-transformed data with the median as the location parameter.

6.
Time-course microarray experiments harvest samples at several time points. To reveal the dynamic gene expression changes over time, we need to identify the significant genes and detect the patterns of gene expression, which may introduce directional errors. Guo et al. (Biometrics 66(2):485–492, 2010) introduced a mixed directional false discovery rate (mdFDR) controlled procedure, which controls the sum of the expected proportions of Type I and Type III errors among all rejections. In this paper, we develop weighted p value procedures for mdFDR control and give some sufficient conditions that ensure (asymptotic) mdFDR control. Some weights and their estimators are shown to satisfy the sufficient conditions. The proposed weighted p value procedures are compared with the existing method by extensive simulations. Based on the proposed weighted p value procedure, we provide multiple CIs that control the false coverage-statement rate (FCR). We use the proposed methods to analyze the time-course microarray data studied in Lobenhofer et al. (Mol Endocrinol 16:1215–1229, 2002). Most of our findings are the same as those obtained by the existing method. In addition, we identify some other important genes, such as CDKN3 and NQO1.
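
The abstract does not spell out the weighted procedure, so the following is only a minimal sketch of the generic weighted Benjamini-Hochberg step-up rule that weighted p value methods usually build on. It does not implement the directional (Type III) part of mdFDR control, and the function name `weighted_bh`, its parameters, and the example numbers are all hypothetical.

```python
def weighted_bh(pvalues, weights, alpha=0.05):
    """Indices rejected by a weighted Benjamini-Hochberg step-up rule.

    Assumption: weights are positive; they are rescaled to average one,
    and each p value is divided by its rescaled weight before the usual
    BH comparison. This is a generic sketch, not the paper's procedure.
    """
    m = len(pvalues)
    if m == 0:
        return []
    if len(weights) != m or any(w <= 0 for w in weights):
        raise ValueError("need one positive weight per p value")
    mean_w = sum(weights) / m
    # Weighted p values: a large weight makes a hypothesis easier to reject.
    q = [p * mean_w / w for p, w in zip(pvalues, weights)]
    order = sorted(range(m), key=lambda i: q[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: keep the largest rank that meets its threshold.
        if q[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p values for four genes; the third gene gets a larger weight.
print(weighted_bh([0.001, 0.02, 0.04, 0.5], [1, 1, 1, 1]))
print(weighted_bh([0.001, 0.02, 0.04, 0.5], [0.5, 0.5, 2.5, 0.5]))
```

With equal weights this reduces to the ordinary Benjamini-Hochberg procedure; with unequal weights the rejection set can change, which is the effect the weights are meant to have.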

7.
An important application of microarray data in functional genomics is to classify samples according to their gene expression profiles, for example cancer versus normal samples, or different types or subtypes of cancer. One of the major tasks with gene expression data is to find co-regulated gene groups whose collective expression is strongly associated with sample categories. In this regard, a gene clustering algorithm is proposed to group genes from microarray data. It directly incorporates the information of sample categories in the grouping process to find groups of co-regulated genes with strong association to the sample categories, yielding a supervised gene clustering algorithm. The average expression of the genes in each cluster acts as its representative. Some significant representatives are taken to form the reduced feature set used to build the classifiers for cancer classification. Mutual information is used to compute both gene-gene redundancy and gene-class relevance. The performance of the proposed method, along with a comparison with existing methods, is studied on six cancer microarray data sets using the predictive accuracy of the naive Bayes classifier, the K-nearest neighbor rule, and the support vector machine. An important finding is that the proposed algorithm is effective for identifying biologically significant gene clusters with excellent predictive capability.
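
As a rough illustration of the relevance and redundancy measure mentioned above, the sketch below computes the mutual information between two discrete sequences. It assumes the expression values have already been discretised (for example into low/high), which the abstract does not state; the function name and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


# Hypothetical discretised expression of one gene and the sample classes.
gene = [0, 0, 1, 1]
labels = ["normal", "normal", "cancer", "cancer"]
print(mutual_information(gene, labels))        # gene-class relevance
print(mutual_information(gene, [0, 1, 0, 1]))  # redundancy with another gene
```

The same function serves both purposes: applied to a gene and the class labels it measures relevance, and applied to two genes it measures redundancy.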

8.
Reverse engineering problems concerning the reconstruction and identification of gene regulatory networks from gene expression data are central issues in computational molecular biology and have become the focus of much research in the last few years. An approach has been proposed for inferring the complex causal relationships among genes from microarray experimental data, which is based on a novel neural fuzzy recurrent network. The method derives information on the gene interactions in a highly interpretable form (fuzzy rules) and takes into account the dynamical aspects of gene regulation through its recurrent structure. To determine the efficiency of the proposed approach, microarray data from two experiments relating to Saccharomyces cerevisiae and Escherichia coli have been used, and experiments concerning gene expression time course prediction have been conducted. The interactions that have been retrieved among a set of genes known to be highly regulated during the yeast cell cycle are validated by previous biological studies. The method surpasses other computational techniques that have attempted genetic network reconstruction by recovering significantly more biologically valid relationships among genes.

9.
The identification of genes and pathways involved in biological processes is a central problem in systems biology. Recent microarray technologies and other high-throughput experiments provide information which sheds light on this problem. In this article, the authors propose a new computational method to detect active pathways, or identify differentially expressed pathways, via integration of gene expression and interactomic data in a sophisticated and efficient manner. Specifically, by using the signal-to-noise ratio to measure the level of differential expression of networks, the problem is formulated as a mixed integer linear programming (MILP) problem. The results on yeast and human data demonstrate that the proposed method is more accurate and robust than existing approaches.

10.
The presence of rotation ambiguities and unique solutions in Multivariate Curve Resolution (MCR) chemometric methods is discussed in detail. Using recently proposed graphical approaches to display the bands and areas of feasible solutions in a subspace of reduced dimensions, the results obtained by different MCR methods are compared. These results show that, in the presence of rotation ambiguities and under a particular set of constraints, the solutions obtained by the different MCR methods can differ from one another and also from the true solution, depending on the initial estimates and on the applied algorithm. In the absence of rotational ambiguities, all MCR methods should give the same unique solution, which should be equal to the true one. Many of the MCR methods proposed in the literature, such as MCR-ALS, RFA, MCR-FMIN, or MCR-BANDS, are confirmed to give a valid solution within the band or area of feasible solutions. In contrast, according to the results of this study, the minimum volume simplex analysis (MVSA) method in its present implementation can give unfeasible solutions when resolving bilinear data systems with more than two components, because it only applies non-negativity constraints to concentration profiles and not to spectral profiles.
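
For readers unfamiliar with the term, rotation ambiguity can be stated compactly for the bilinear model. The notation below (data matrix D, concentration profiles C, spectra S, residuals E, invertible matrix T) is the conventional one and is assumed here rather than taken from the abstract.

```latex
\[
\mathbf{D} \;=\; \mathbf{C}\,\mathbf{S}^{\mathsf{T}} + \mathbf{E}
\;=\; \bigl(\mathbf{C}\,\mathbf{T}\bigr)\bigl(\mathbf{T}^{-1}\mathbf{S}^{\mathsf{T}}\bigr) + \mathbf{E}
\]
```

Any invertible T for which both C T and T^{-1} S^T still satisfy the applied constraints (for example non-negativity) yields an equally good fit, and the set of such solutions forms the band or area of feasible solutions.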

11.
A data anomaly was observed that affected the uniformity and reproducibility of fluorescent signal across DNA microarrays. Results from experimental sets designed to identify potential causes (from microarray production to array scanning) indicated that the anomaly was linked to a batch process; further work allowed us to localize the effect to the posthybridization array stringency washes. Ozone levels were monitored and highly correlated with the batch effect. Controlled exposures of microarrays to ozone confirmed this factor as the root cause, and we present data that show susceptibility of a class of cyanine dyes (e.g., Cy5, Alexa 647) to ozone levels as low as 5-10 ppb for periods as short as 10-30 s. Other cyanine dyes (e.g., Cy3, Alexa 555) were not significantly affected until higher ozone levels (> 100 ppb). To address this environmental effect, laboratory ozone levels should be kept below 2 ppb (e.g., with filters in HVAC) to achieve high quality microarray data.

12.
Stochastic dynamic modeling of short gene expression time-series data   (total citations: 1; self-citations: 0; citations by others: 1)
In this paper, the expectation maximization (EM) algorithm is applied to model the gene regulatory network from gene time-series data. The gene regulatory network is viewed as a stochastic dynamic model, which consists of the noisy gene measurements from the microarray and a first-order autoregressive (AR) stochastic dynamic process for gene regulation. By using the EM algorithm, both the model parameters and the actual values of the gene expression levels can be identified simultaneously. Moreover, the algorithm can deal efficiently with sparse parameter identification and noisy data. It is also shown that the EM algorithm can handle microarray gene expression data with a large number of variables but a small number of observations. Stochastic dynamic models for four real-world gene expression data sets are constructed to demonstrate the advantages of the introduced algorithm. Several indices are proposed to evaluate the models of the inferred gene regulatory networks, and the relevant biological properties are discussed.
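
One common way to write the kind of model described above is a linear state-space form. The symbols (state x, measurement y, regulation matrix A, noise terms w and v) and the Gaussian noise assumption are illustrative choices, not notation taken from the paper.

```latex
\[
\begin{aligned}
\mathbf{x}(k+1) &= \mathbf{A}\,\mathbf{x}(k) + \mathbf{w}(k), &\quad \mathbf{w}(k) &\sim \mathcal{N}(\mathbf{0}, \mathbf{Q}),\\
\mathbf{y}(k)   &= \mathbf{x}(k) + \mathbf{v}(k),             &\quad \mathbf{v}(k) &\sim \mathcal{N}(\mathbf{0}, \mathbf{R}).
\end{aligned}
\]
```

Here x(k) holds the true expression levels at time point k, A encodes the first-order autoregressive regulation among genes, and y(k) is the noisy microarray measurement. Under these assumptions an EM scheme would alternate between estimating the hidden x(k) and updating A, Q, and R.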

13.
Many genes related to the circadian rhythm, especially those involved in phase shifts induced by different environmental stimuli, still remain enigmatic. In this study, the authors monitored the expression of rat genes under multiple phase-resetting stimuli, and developed a technique to extract from microarray data the candidate genes for the stimulus-induced changes in circadian rhythm. First, the spectra of the gene expression time series were estimated by fast Fourier transform. Then two fitting methods, the random period fitting method and the conditional curve fitting method, both using the estimated periods as initial values, were applied to the control and the stimulated expression data to estimate the periods and the phases. Finally, by comparing the two sets of periods and phases, the period change and the phase shift caused by the stimuli were estimated, and the candidate genes related to the master clock were extracted by mapping the period change and the phase shift onto a two-dimensional space, a period-phase map (PPM). As an indirect validation of the genes selected by this method, the significant enrichment of the extracted gene clusters on the PPM was further evaluated in terms of biological function. As a result, gene clusters related to photoreceptors and neural regulation emerged on the PPM, implying relationships in the stimulus response of the master clock, which resides in the brain at the intersection of the optic nerves. Thus, the present approach is a feasible means to explore the oscillatory genes related to stimulus responses.
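
The first step above, reading an initial period estimate off the spectrum, can be sketched as follows. For brevity the code evaluates a direct discrete Fourier transform rather than a fast one (the spectrum is the same, only slower to compute); the function name, the sampling interval, and the synthetic series are hypothetical.

```python
import cmath
import math


def dominant_period(series, dt=1.0):
    """Period of the strongest non-zero frequency in an evenly sampled series.

    Assumption: samples are evenly spaced by dt. The mean is removed so the
    zero-frequency term does not dominate the spectrum.
    """
    n = len(series)
    if n < 4:
        raise ValueError("series too short to estimate a period")
    mean = sum(series) / n
    centered = [v - mean for v in series]
    best_k, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        # Direct DFT coefficient at frequency index k.
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n * dt / best_k


# Synthetic 24 h rhythm sampled every 4 h over 48 h (hypothetical data).
samples = [math.cos(2 * math.pi * (4 * t) / 24) for t in range(12)]
print(dominant_period(samples, dt=4.0))
```

Because the frequency grid is coarse for short series, such an estimate is only a starting value, which is consistent with the abstract's use of subsequent curve fitting to refine periods and phases.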

14.
15.
Multivariate Curve Resolution (MCR) aims to blindly recover the concentration profiles and the source spectra without any prior supervised calibration step. It is well known that imposing additional constraints, such as positivity and closure, may improve the quality of the solution. When a physico-chemical model of the process is known, it can also be introduced to constrain the solution further. In this paper, we apply MCR to Ion Mobility Spectra. Since instrumental models suggest that peaks are Gaussian in shape, with a width depending on the instrument resolution, we assume that each source is a linear superposition of Gaussian peaks of fixed spread. We also show that this model is able to fit wider peaks that depart from a pure Gaussian shape. Instead of introducing non-linear Gaussian peak fitting, we use a very dense model and rely on a least squares solver with L1-norm regularization to obtain a sparse solution. This is accomplished via the Least Absolute Shrinkage and Selection Operator (LASSO). The results provide well-resolved concentration profiles and spectra, improving on the basic MCR solution.
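
To make the L1-regularised least squares step concrete, here is a minimal coordinate-descent sketch for the objective 0.5 * ||y - X b||^2 + lam * ||b||_1. It is a generic LASSO solver, not the solver used in the paper; in the setting described above the columns of X would be the dense dictionary of fixed-width Gaussian peaks, which is omitted here, and all names are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1.

    X is a list of rows, y a list of targets. Returns the coefficient list.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j]
                * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            )
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# With an identity design the solution is the soft-thresholded target,
# so small entries are set exactly to zero (the sparsity effect).
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(lasso_cd(identity, [3.0, 0.5, -2.0], lam=1.0))
```

The exact zeros produced by the soft-thresholding step are what allow a very dense peak dictionary to be used while still ending up with only a few active peaks.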

16.
Stroke and cerebral haemorrhage are the second leading cause of death in the world after ischaemic heart disease. In this work, a dataset containing medical, physiological and environmental tests for stroke was used to evaluate the efficacy of machine learning, and a Magnetic Resonance Imaging (MRI) dataset for cerebral haemorrhage was used to evaluate deep learning and a hybrid technique combining deep learning and machine learning. In the first dataset (medical records), two features, namely, diabetes and obesity, were created on the basis of the values of the corresponding features. The t-Distributed Stochastic Neighbour Embedding algorithm was applied to represent the high-dimensional dataset in a low-dimensional data space. Meanwhile, the Recursive Feature Elimination (RFE) algorithm was applied to rank the features according to their priority and their correlation with the target feature and to remove the unimportant features. The features were fed into various classification algorithms, namely, Support Vector Machine (SVM), K Nearest Neighbours (KNN), Decision Tree, Random Forest, and Multilayer Perceptron. All algorithms achieved superior results. The Random Forest algorithm achieved the best performance amongst the algorithms; it reached an overall accuracy of 99%. This algorithm classified stroke cases with Precision, Recall and F1 score of 98%, 100% and 99%, respectively. In the second dataset, the MRI image dataset was evaluated by using the AlexNet model and the AlexNet + SVM hybrid technique. The hybrid AlexNet + SVM model performed better than the AlexNet model; it reached accuracy, sensitivity, specificity and Area Under the Curve (AUC) of 99.9%, 100%, 99.80% and 99.86%, respectively.
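
The reported Precision, Recall and F1 figures can be checked against one another with the standard definitions. The confusion-matrix counts below are hypothetical, chosen only to reproduce a precision of 98% and a recall of 100%; the abstract does not give the underlying counts.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 98 true positives, 2 false positives, 0 false negatives.
precision, recall, f1 = precision_recall_f1(tp=98, fp=2, fn=0)
print(round(precision, 2), round(recall, 2), round(f1, 2))
```

A precision of 0.98 with a recall of 1.00 gives an F1 of about 0.99, so the three reported values are mutually consistent.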

17.
With the rapid accumulation of functional relationships between biological molecules, knowledge-based networks have been constructed and stored in many databases. These networks provide curated and comprehensive information on functional linkages among genes and proteins, whereas their activities are highly related to specific phenotypes and conditions. To evaluate a knowledge-based network in a specific condition, the consistency between its structure and condition-specific gene expression profiling data is an important criterion. In this study, the authors propose a Gaussian graphical model to evaluate the documented regulatory networks by the consistency between network architectures and time course gene expression profiles. They derive a dynamic Bayesian network model to evaluate gene regulatory networks in both simulated and true time course microarray data. The regulatory networks are evaluated by matching network structure with gene expression to obtain a consistency measurement. To demonstrate the effectiveness of the authors' method, they identify significant regulatory networks in response to the time course of circadian rhythm. The knowledge-based networks are screened and ranked by their structural consistency with dynamic gene expression profiling.
Inspec keywords: Bayes methods, biology computing, circadian rhythms, Gaussian processes, genetics, genomics, graphs, molecular biophysics, proteins
Other keywords: Gaussian graphical model, responsive regulatory networks, time course high-throughput data, biological molecules, dynamic gene expression profiling, circadian rhythm, consistency measurement, matching network structure, simulated time course microarray data, true time course microarray data, dynamic Bayesian network model, time course gene expression profiles, network architectures, documented regulatory networks, specific gene expression profiling data, phenotypes, proteins, functional linkages, databases, knowledge-based networks

18.
Reverse engineering of gene regulatory networks   (total citations: 1; self-citations: 0; citations by others: 1)
Systems biology is a multi-disciplinary approach to the study of the interactions of various cellular mechanisms and cellular components. Owing to the development of new technologies that simultaneously measure the expression of genetic information, systems biological studies involving gene interactions are increasingly prominent. In this regard, reconstructing gene regulatory networks (GRNs) forms the basis for the dynamical analysis of gene interactions and related effects on cellular control pathways. Various approaches to inferring GRNs from gene expression profiles and biological information, including machine learning approaches, are reviewed, with a brief introduction to DNA microarray experiments as typical tools for measuring levels of messenger ribonucleic acid (mRNA) expression. In particular, the inference methods are classified according to the required input information, and the main idea of each method is elucidated by comparing its advantages and disadvantages with respect to the other methods. In addition, recent developments in this field are introduced, and the challenges and opportunities for future research are discussed.

19.
Inferring gene regulatory networks (GRNs) from microarray expression data is an important but challenging issue in systems biology. In this study, the authors propose a Bayesian information criterion (BIC)-guided sparse regression approach for GRN reconstruction. This approach can adaptively model GRNs by optimising the l1-norm regularisation of sparse regression based on a modified version of BIC. The regularisation strategy keeps the inferred GRNs as sparse as natural ones, while the modified BIC allows prior knowledge on expression regulation to be incorporated and thus avoids the usual overestimation of expression regulators. In particular, the proposed method provides a clear interpretation of combinatorial regulation of gene expression by optimally extracting regulation coordination for a given target gene. Experimental results on both simulated data and real-world microarray data demonstrate the competent performance of the method in discovering regulatory relationships in GRN reconstruction.
Inspec keywords: genetics, Bayes methods, genomics, regression analysis, inference mechanisms, bioinformatics
Other keywords: adaptive modelling, gene regulatory network, Bayesian information criterion-guided sparse regression approach, GRN, microarray expression data, systems biology, GRN reconstruction, optimisation, l1-norm regularisation
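
The general shape of a BIC-guided sparse regression can be written as below. This uses the standard l1-penalised least squares problem and the standard Gaussian-error BIC; the abstract does not specify the authors' modified BIC, so the second formula is only the usual baseline, and all symbols are assumed notation.

```latex
\[
\hat{\boldsymbol{\beta}}(\lambda)
  = \arg\min_{\boldsymbol{\beta}}
    \tfrac{1}{2}\,\lVert \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \rVert_2^2
    + \lambda\,\lVert \boldsymbol{\beta} \rVert_1 ,
\qquad
\mathrm{BIC}(\lambda)
  = n \ln\!\frac{\lVert \mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}(\lambda) \rVert_2^2}{n}
    + k(\lambda)\,\ln n .
\]
```

Here y is the expression profile of the target gene over n samples, the columns of X are the candidate regulators, and k(λ) is the number of non-zero coefficients. The regularisation strength λ would then be chosen to minimise the criterion.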

20.
The self-organizing oscillator network (SOON) is a comparatively new clustering algorithm that does not require the knowledge of the number of clusters. The SOON is distance based, and its clustering behavior is different to density-based algorithms in a number of ways. This paper examines the effect of adjusting the control parameters of the SOON with four different datasets; the first is a (communications) modulation dataset representing one modulation scheme under a variety of noise conditions. This allows the assessment of the behavior of the algorithm with data varying between highly separable and nonseparable cases. The main thrust of this paper is to evaluate its efficacy in biological datasets. The second is taken from microarray experiments on the cell cycle of yeast, while the third and the fourth represent two microarray cancer datasets, i.e., the lymphoma and the liver cancer datasets. The paper demonstrates that the SOON is a viable tool to analyze these problems, and can add many useful insights to the biological data that may not always be available using other clustering methods.
