Similar articles
20 similar articles found (search time: 375 ms)
1.
The usefulness of representing an ensemble of NMR-derived protein structures by a single structure has been investigated. Two stereochemical properties have been used to assess how a single structure relates to the ensemble from which it was derived, namely the distribution of phi psi torsion angles and the distribution of chi 1 torsion angles. The results show that the minimized average structure derived from the ensemble (a total of 11 ensembles from the Brookhaven Protein Data Bank were analyzed) does not always correspond well with this ensemble, particularly for those ensembles generated with a smaller number of experimentally derived restraints per residue. An alternative method that selects the member of the ensemble which is closest to the "average" of the ensemble has been investigated (a total of 23 ensembles from the Brookhaven Protein Data Bank were analyzed). Although this method selected a structure that on the whole corresponded more closely to the ensemble than did the minimized average structure, this is still not a totally reliable means of selecting a single structure to represent the ensemble. This suggests that it is advisable to study the ensemble as a whole. A study has also been made of the practice of selecting the "best" rather than the most representative member of the ensemble. This too suggests that the ensemble should be studied as a whole. A study of the conformational space occupied by the ensemble also suggests the need to consider the ensemble as a whole, particularly for those ensembles generated with a smaller number of experimentally derived restraints per residue.
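The "closest to the average" selection described in this abstract can be illustrated with a minimal sketch. It assumes the models are already superposed and given as lists of (x, y, z) tuples; the function names and the toy coordinates are hypothetical, and the abstract does not specify the distance measure the authors used (plain coordinate RMSD is assumed here).

```python
import math

def mean_structure(models):
    """Average the coordinates atom by atom (models assumed pre-superposed)."""
    n = len(models)
    return [tuple(sum(m[i][k] for m in models) / n for k in range(3))
            for i in range(len(models[0]))]

def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate lists."""
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(sq / len(a))

def closest_to_mean(models):
    """Return the index of the model nearest the mean, plus all distances."""
    avg = mean_structure(models)
    dists = [rmsd(m, avg) for m in models]
    best = min(range(len(models)), key=dists.__getitem__)
    return best, dists

# Toy two-atom "ensemble" (made-up numbers, for illustration only).
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
]
best, dists = closest_to_mean(models)
```

Note that the mean structure itself is generally not a physically valid conformer, which is one reason a real ensemble member is selected instead.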

2.
Conformational ensembles are increasingly recognized as a useful representation to describe fundamental relationships between protein structure, dynamics and function. Here we present an ensemble of ubiquitin in solution that is created by sampling conformational space without experimental information using “Backrub” motions inspired by alternative conformations observed in sub-Angstrom resolution crystal structures. Backrub-generated structures are then selected to produce an ensemble that optimizes agreement with nuclear magnetic resonance (NMR) Residual Dipolar Couplings (RDCs). Using this ensemble, we probe two proposed relationships between properties of protein ensembles: (i) a link between native-state dynamics and the conformational heterogeneity observed in crystal structures, and (ii) a relation between dynamics of an individual protein and the conformational variability explored by its natural family. We show that the Backrub motional mechanism can simultaneously explore protein native-state dynamics measured by RDCs, encompass the conformational variability present in ubiquitin complex structures and facilitate sampling of conformational and sequence variability matching those occurring in the ubiquitin protein family. Our results thus support an overall relation between protein dynamics and conformational changes enabling sequence changes in evolution. More practically, the presented method can be applied to improve protein design predictions by accounting for intrinsic native-state dynamics.

3.
Considerable debate has focused on whether sampling of molecular dynamics trajectories restrained by crystallographic data can be used to develop realistic ensemble models for proteins in their natural, solution state. For the SARS-CoV-2 main protease, Mpro, we evaluated agreement between solution residual dipolar couplings (RDCs) and various recently reported multi-conformer and dynamic-ensemble crystallographic models. Although Phenix-derived ensemble models showed only small improvements in crystallographic Rfree, substantially improved RDC agreement over fits to a conventionally refined 1.2-Å X-ray structure was observed, in particular for residues with above average disorder in the ensemble. For a set of six lower resolution (1.55–2.19 Å) Mpro X-ray ensembles, obtained at temperatures ranging from 100 to 310 K, no significant improvement over conventional two-conformer representations was found. At the residue level, large differences in motions were observed among these ensembles, suggesting high uncertainties in the X-ray derived dynamics. Indeed, combining the six ensembles from the temperature series with the two 1.2-Å X-ray ensembles into a single 381-member “super ensemble” averaged these uncertainties and substantially improved agreement with RDCs. However, all ensembles showed excursions that were too large for the most dynamic fraction of residues. Our results suggest that further improvements to X-ray ensemble refinement are feasible, and that RDCs provide a sensitive benchmark in such endeavors. Remarkably, a weighted ensemble of 350 PDB Mpro X-ray structures provided slightly better cross-validated agreement with RDCs than any individual ensemble refinement, implying that differences in lattice confinement also limit the fit of RDCs to X-ray coordinates.

4.
The brain can represent behaviorally relevant information through the firing of individual neurons as well as the coordinated firing of ensembles of neurons. Neurons in the hippocampus and associated cortical regions participate in a variety of types of ensembles to support navigation. These ensemble types include single cell codes, population codes, time-compressed sequences, behavioral sequences, and engrams. We present the physiological basis and behavioral relevance of ensemble firing. We discuss how these traditional definitions of ensembles can constrain or expand potential analyses due to the underlying assumptions and abstractions made. We highlight how coding can change at the ensemble level while underlying single cell codes remain intact. Finally, we present how ensemble definitions could be broadened to better understand the full complexity of the brain.

5.
Laughton CA, Orozco M, Vranken W. Proteins 2009, 75(1): 206-216
NMR structures are typically deposited in databases such as the PDB in the form of an ensemble of structures. Generally, each of the models in such an ensemble satisfies the experimental data and is equally valid. No unique solution can be calculated because the experimental NMR data are insufficient, in part because they reflect the conformational variability and dynamical behavior of the molecule in solution. Even for relatively rigid molecules, the limited number of structures that are typically deposited cannot completely encompass the structural diversity allowed by the observed NMR data, but they can be chosen to try to maximize its representation. We describe here the adaptation and application of techniques more commonly used to examine large ensembles from molecular dynamics simulations to the analysis of NMR ensembles. We call the approach, which is based on principal component analysis, COCO ("Complementary Coordinates"). The COCO approach analyses the distribution of an NMR ensemble in conformational space, and generates a new ensemble that fills "gaps" in the distribution. The method is very rapid: analysis of a 25-member ensemble and generation of a new 25-member ensemble typically takes 1-2 min on a conventional workstation. Applied to the 545 structures in the RECOORD database, we find that COCO generates new ensembles that are as structurally diverse (both from each other and from the original ensemble) as are the structures within the original ensemble. The COCO approach does not explicitly take into account the NMR restraint data, yet in tests on selected structures from the RECOORD database, the COCO ensembles are frequently good matches to this data, and certainly are structures that can be rapidly refined against the restraints to yield high-quality, novel solutions. COCO should therefore be a useful aid in NMR structure refinement and in other situations where a richer representation of conformational variability is desired, for example in docking studies. COCO is freely accessible via the website www.ccpb.ac.uk/COCO.
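The idea of locating "gaps" along a principal component can be sketched as follows. This is a deliberately simplified illustration and not the COCO algorithm itself, whose details are not given in the abstract: it finds only the first principal component (by power iteration, an assumption made to stay within the standard library), looks for the widest gap between consecutive projections, and places one candidate point at its midpoint. All names and the toy data are hypothetical.

```python
import math

def first_pc(rows, iters=200):
    """Mean, first principal axis and projections of the rows (power iteration)."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # arbitrary starting direction
    for _ in range(iters):
        # Apply X^T X to v without forming the covariance matrix explicitly.
        scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return mean, v, scores

def fill_largest_gap(rows):
    """Return a new point at the midpoint of the widest gap along PC1."""
    mean, v, scores = first_pc(rows)
    ordered = sorted(scores)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, i = max(gaps)
    mid = 0.5 * (ordered[i] + ordered[i + 1])
    return [m + mid * a for m, a in zip(mean, v)]

# Toy 2-D "conformations": three clustered points and one outlier.
rows = [[0.0, 0.0], [1.0, 0.1], [2.0, -0.1], [6.0, 0.0]]
mean, v, scores = first_pc(rows)
candidate = fill_largest_gap(rows)
```

In a real application each row would hold the flattened Cartesian coordinates of one superposed model, and any generated point would still need refinement against the restraints, as the abstract notes.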

6.
Characterizing ensembles of intrinsically disordered proteins is experimentally challenging because of the ill-conditioned nature of ensemble determination with limited data and the intrinsic fast dynamics of the conformational ensemble. Amide I two-dimensional infrared (2D IR) spectroscopy has picosecond time resolution to freeze structural ensembles as needed for probing disordered-protein ensembles and conformational dynamics. Also, developments in amide I computational spectroscopy now allow a quantitative and direct prediction of amide I spectra based on conformational distributions drawn from molecular dynamics simulations, providing a route to ensemble refinement against experimental spectra. We applied a Bayesian ensemble refinement method to Ala–Ala–Ala against isotope-edited Fourier-transform infrared spectroscopy and 2D IR spectroscopy and tested potential factors affecting the quality of ensemble refinements. We found that isotope-edited 2D IR spectroscopy provides a stringent constraint on Ala–Ala–Ala conformations and returns consistent conformational ensembles with the dominant ppII conformer across varying prior distributions from many molecular dynamics force fields and water models. The dominant factor influencing ensemble refinements is the systematic frequency uncertainty from spectroscopic maps. However, the uncertainty of conformer populations can be significantly reduced by incorporating 2D IR spectra in addition to traditional Fourier-transform infrared spectra. Bayesian ensemble refinement against isotope-edited 2D IR spectroscopy thus provides a route to probe equilibrium-complex protein ensembles and potentially nonequilibrium conformational dynamics.

7.
Predictive performance is important to many applications of species distribution models (SDMs). The SDM ‘ensemble’ approach, which combines predictions across different modelling methods, is believed to improve predictive performance, and is used in many recent SDM studies. Here, we aim to compare the predictive performance of ensemble species distribution models to that of individual models, using a large presence–absence dataset of eucalypt tree species. To test model performance, we divided our dataset into calibration and evaluation folds using two spatial blocking strategies (checkerboard-pattern and latitudinal slicing). We calibrated and cross-validated all models within the calibration folds, using both repeated random division of data (a common approach) and spatial blocking. Ensembles were built using the software package ‘biomod2’, with standard (‘untuned’) settings. Boosted regression tree (BRT) models were also fitted to the same data, tuned according to published procedures. We then used evaluation folds to compare ensembles against both their component untuned individual models, and against the BRTs. We used area under the receiver-operating characteristic curve (AUC) and log-likelihood for assessing model performance. In all our tests, ensemble models performed well, but not consistently better than their component untuned individual models or tuned BRTs across all tests. Moreover, choosing untuned individual models with best cross-validation performance also yielded good external performance, with blocked cross-validation proving better suited for this choice, in this study, than repeated random cross-validation. The latitudinal slice test was only possible for four species; this showed some individual models, and particularly the tuned one, performing better than ensembles. This study shows no particular benefit to using ensembles over individual tuned models. 
It also suggests that further robust testing of performance is required for situations where models are used to predict to distant places or environments.
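The AUC metric used in this study to score presence-absence predictions can be computed directly from its rank interpretation: the probability that a randomly chosen presence site receives a higher score than a randomly chosen absence site. The sketch below assumes that definition with ties counted as one half; the function name and the example scores are made up for illustration and are not from the study.

```python
def auc(presence_scores, absence_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))

# Hypothetical model outputs at evaluation sites.
presence = [0.9, 0.8, 0.4]
absence = [0.7, 0.3]
score = auc(presence, absence)  # 5 of the 6 presence/absence pairs are ranked correctly
```

The pairwise loop is quadratic in the number of sites; for large evaluation folds a rank-sum formulation gives the same value more cheaply.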

8.
A method is introduced to represent an ensemble of conformers of a protein by a single structure in torsion angle space that lies closest to the averaged Cartesian coordinates while maintaining perfect covalent geometry and on average equal steric quality and an equally good fit to the experimental (e.g. NMR) data as the individual conformers of the ensemble. The single representative ‘regmean structure’ is obtained by simulated annealing in torsion angle space with the program CYANA using as input data the experimental restraints, restraints for the atom positions relative to the average Cartesian coordinates, and restraints for the torsion angles relative to the corresponding principal cluster average values of the ensemble. The method was applied to 11 proteins for which NMR structure ensembles are available, and compared to alternative, commonly used simple approaches for selecting a single representative structure, e.g. the structure from the ensemble that best fulfills the experimental and steric restraints, or the structure from the ensemble that has the lowest RMSD value to the average Cartesian coordinates. In all cases our method found a structure in torsion angle space that is significantly closer to the mean coordinates than the alternatives while maintaining the same quality as individual conformers. The method is thus suitable to generate representative single structure representations of protein structure ensembles in torsion angle space. 
Since in the case of NMR structure calculations with CYANA the single structure is calculated in the same way as the individual conformers except that weak positional and torsion angle restraints are added, we propose to represent new NMR structures by a ‘regmean bundle’ consisting of the single representative structure as the first conformer and all but one of the original individual conformers (the original conformer with the highest target function value is discarded in order to keep the number of conformers in the bundle constant). In this way, analyses that require a single structure can be carried out in the most meaningful way using the first model, while at the same time the additional information contained in the ensemble remains available.

9.
10.

Background

Molecular dynamics (MD) simulations are powerful tools to investigate the conformational dynamics of proteins, which are often a critical element of their function. Identification of functionally relevant conformations is generally done by clustering the large ensemble of structures that is generated. Recently, Self-Organising Maps (SOMs) were reported to perform more accurately and provide more consistent results than traditional clustering algorithms in various data mining problems. We present a novel strategy to analyse and compare conformational ensembles of protein domains using a two-level approach that combines SOMs and hierarchical clustering.

Results

The conformational dynamics of the α-spectrin SH3 protein domain and six single mutants were analysed by MD simulations. The Cα's Cartesian coordinates of conformations sampled in the essential space were used as input data vectors for SOM training, then complete linkage clustering was performed on the SOM prototype vectors. A specific protocol to optimize a SOM for structural ensembles was proposed: the optimal SOM was selected by means of a Taguchi experimental design plan applied to different data sets, and the optimal sampling rate of the MD trajectory was selected. The proposed two-level approach was applied to single trajectories of the SH3 domain independently as well as to groups of them at the same time. The results demonstrated the potential of this approach in the analysis of large ensembles of molecular structures: the possibility of producing a topological mapping of the conformational space in a simple 2D visualisation, as well as of effectively highlighting differences in the conformational dynamics directly related to biological functions.

Conclusions

The use of a two-level approach combining SOMs and hierarchical clustering for conformational analysis of structural ensembles of proteins was proposed. It can easily be extended to other study cases and to conformational ensembles from other sources.

11.
Mathematical modeling of complex gene expression programs is an emerging tool for understanding disease mechanisms. However, identification of large models sometimes requires training using qualitative, conflicting or even contradictory data sets. One strategy to address this challenge is to estimate experimentally constrained model ensembles using multiobjective optimization. In this study, we used Pareto Optimal Ensemble Techniques (POETs) to identify a family of proof-of-concept signal transduction models. POETs integrate Simulated Annealing (SA) with Pareto optimality to identify models near the optimal tradeoff surface between competing training objectives. We modeled a prototypical signaling network using mass-action kinetics within an ordinary differential equation (ODE) framework (64 ODEs in total). The true model was used to generate synthetic immunoblots from which the POET algorithm identified the 117 unknown model parameters. POET generated an ensemble of signaling models, which collectively exhibited population-like behavior. For example, scaled gene expression levels were approximately normally distributed over the ensemble following the addition of extracellular ligand. Also, the ensemble recovered robust and fragile features of the true model, despite significant parameter uncertainty. Taken together, these results suggest that experimentally constrained model ensembles could capture qualitatively important network features without exact parameter information.
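The notion of Pareto optimality that POETs rely on can be shown with a small sketch. It covers only the dominance test and the extraction of the non-dominated set, assuming every objective is an error to be minimized; the simulated annealing search and the ranking scheme of the actual POET algorithm are not reproduced, and the names and numbers below are hypothetical.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(points):
    """Keep the points that no other point dominates (objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Each tuple holds the training errors of one candidate model on two
# competing data sets (made-up values).
errors = [(1, 5), (2, 2), (3, 3), (5, 1), (4, 4)]
front = pareto_front(errors)
```

Here the models with errors (3, 3) and (4, 4) are dominated by (2, 2), while the remaining three represent different tradeoffs between the two objectives and would all be retained in the ensemble.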

12.
Most active biopolymers are dynamic structures; thus, ensembles of such molecules should be characterized by distributions of intra- or intermolecular distances and their fast fluctuations. A method of choice to determine intramolecular distances is based on Förster resonance energy transfer (FRET) measurements. Major advances in such measurements were achieved by single molecule FRET measurements. Here, we show that by global analysis of the decay of the emission of both the donor and the acceptor it is also possible to resolve two sub-populations in a mixture of two ensembles of biopolymers by time resolved FRET (trFRET) measurements at the ensemble level. We show that two individual intramolecular distance distributions can be determined and characterized in terms of their individual means, full width at half maximum (FWHM), and two corresponding diffusion coefficients which reflect the rates of fast ns fluctuations within each sub-population. An important advantage of the ensemble level trFRET measurements is the ability to use low molecular weight small-sized probes and to determine nanosecond fluctuations of the distance between the probes. The limits of the possible resolution were first tested by simulation and then by preparation of mixtures of two model peptides. The first labeled polypeptide was a relatively rigid Pro7 and the second polypeptide was a flexible molecule consisting of (Gly-Ser)7 repeats. The end-to-end distance distributions and the diffusion coefficients of each peptide were determined. Global analysis of trFRET measurements of a series of mixtures of polypeptides recovered two end-to-end distance distributions and associated intramolecular diffusion coefficients, which were very close to those determined from each of the pure samples. This study is a proof-of-concept study demonstrating the power of ensemble level trFRET based methods in resolution of subpopulations in ensembles of flexible macromolecules.

13.
14.
Noy E, Tabakman T, Goldblum A. Proteins 2007, 68(3): 702-711
We investigate the extent to which ensembles of flexible fragments (FF), generated by our loop conformational search method, include conformations that are near experimental and reflect conformational changes that these FFs undergo when binary protein-protein complexes are formed. Twenty-eight FFs, which are located in protein-protein interfaces and have different conformations in the bound structure (BS) and unbound structure (UbS), were extracted. The conformational space of these fragments in the BS and UbS was explored with our method, which is based on the iterative stochastic elimination (ISE) algorithm. Conformational search of BSs generated bound ensembles and conformational search of UbSs produced unbound ensembles. ISE samples conformations near experimental (less than 1.05 Å root mean square deviation, RMSD) for 51 out of the 56 examined fragments in the bound and unbound ensembles. In 14 out of the 28 unbound fragments, it also samples conformations within 1.05 Å from the BS in the unbound ensemble. Sampling the bound conformation in the unbound ensemble demonstrates the potential biological relevance of the predicted ensemble. The 10 lowest energy conformations are the best choice for docking experiments, compared with any other 10 conformations of the ensembles. We conclude that generating conformational ensembles for FFs with ISE is relevant to FF conformations in the UbS and BS. Forming ensembles of the isolated proteins with our method prior to docking represents more comprehensively their inherent flexibility and is expected to improve docking experiments compared with results obtained by docking only UbSs.

15.
Recent algorithmic advances and continual increase in computational power have made it possible to simulate protein folding and dynamics on the level of ensembles. Furthermore, analyzing protein structure by using ensemble representation is intrinsic to certain experimental techniques, such as nuclear magnetic resonance. This creates a problem of how to compare an ensemble of molecules with a given reference structure. Recently, we used distance-based root-mean-square deviation (dRMS) to compare the native structure of a protein with its unfolded-state ensemble. We showed that for small, mostly alpha-helical proteins, the mean unfolded-state Cα-Cα distance matrix is significantly more nativelike than the Cα-Cα matrices corresponding to the individual members of the unfolded ensemble. Here, we give a mathematical derivation that shows that, for any ensemble of structures, the dRMS deviation between the ensemble-averaged distance matrix and any given reference distance matrix is always less than or equal to the average dRMS deviation of the individual members of the ensemble from the same reference matrix. This holds regardless of the nature of the reference structure or the structural ensemble in question. In other words, averaging of distance matrices can only increase their level of similarity to a given reference matrix, relative to the individual matrices comprising the ensemble. Furthermore, we show that the above inequality holds in the case of Cartesian coordinate-based root-mean-square deviation as well. We discuss this in the context of our proposal that the average structure of the unfolded ensemble of small helical proteins is close to the native structure, and demonstrate that this finding goes beyond the above mathematical fact.
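The inequality stated in this abstract can be checked numerically. The sketch below is not the authors' derivation; it relies on the observation that dRMS is a scaled Euclidean norm of the difference between two distance matrices, so the result follows from the triangle inequality applied to the average. The random "structures" and all function names are hypothetical stand-ins for real coordinates.

```python
import math
import random

def pair_distances(coords):
    """Flattened upper triangle of the inter-atomic distance matrix."""
    n = len(coords)
    return [math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)]

def drms(d, ref):
    """Distance-based RMS deviation between two flattened distance matrices."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d, ref)) / len(ref))

def random_structure(rng, n_atoms):
    """Random points in a 10 x 10 x 10 box (placeholder for a real conformer)."""
    return [(rng.uniform(0, 10), rng.uniform(0, 10), rng.uniform(0, 10))
            for _ in range(n_atoms)]

rng = random.Random(0)
reference = pair_distances(random_structure(rng, 8))
ensemble = [pair_distances(random_structure(rng, 8)) for _ in range(20)]

# Element-wise average of the ensemble's distance matrices.
mean_matrix = [sum(col) / len(ensemble) for col in zip(*ensemble)]

drms_of_mean = drms(mean_matrix, reference)
mean_of_drms = sum(drms(d, reference) for d in ensemble) / len(ensemble)
```

Whatever seed or ensemble size is chosen, `drms_of_mean` never exceeds `mean_of_drms`, which is why the authors stress that their finding about unfolded ensembles has to go beyond this purely mathematical effect.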

16.
Biophysical Journal 2020, 118(11): 2703-2717
Molecular motors drive cytoskeletal rearrangements to change cell shape. Myosins are the motors that move, cross-link, and modify the actin cytoskeleton. The primary force generator in contractile actomyosin networks is nonmuscle myosin II (NMMII), a molecular motor that assembles into ensembles that bind, slide, and cross-link actin filaments (F-actin). The multivalence of NMMII ensembles and their multiple roles have confounded the resolution of crucial questions, including how the number of NMMII subunits affects dynamics and what affects the relative contribution of ensembles’ cross-linking versus motoring activities. Because biophysical measurements of ensembles are sparse, modeling of actomyosin networks has aided in discovering the complex behaviors of NMMII ensembles. Myosin ensembles have been modeled via several strategies with variable discretization or coarse graining and unbinding dynamics, and although general assumptions that simplify motor ensembles result in global contractile behaviors, it remains unclear which strategies most accurately depict cellular activity. Here, we used an agent-based platform, Cytosim, to implement several models of NMMII ensembles. Comparing the effects of bond type, we found that ensembles of catch-slip and catch motors were the best force generators and binders of filaments. Slip motor ensembles were capable of generating force but unbound frequently, resulting in slower contractile rates of contractile networks. Coarse graining of these ensemble types from two sets of 16 motors on opposite ends of a stiff rod to two binders, each representing 16 motors, reduced force generation, contractility, and the total connectivity of filament networks for all ensemble types. A parallel cluster model, previously used to describe ensemble dynamics via statistical mechanics, allowed better contractility with coarse graining, though connectivity was still markedly reduced for this ensemble type with coarse graining. 
Together, our results reveal substantial tradeoffs associated with the process of coarse graining NMMII ensembles and highlight the robustness of discretized catch-slip ensembles in modeling actomyosin networks.

17.
18.
A replica‐exchange Monte Carlo (REMC) ensemble docking approach has been developed that allows efficient exploration of protein–protein docking geometries. In addition to Monte Carlo steps in translation and orientation of binding partners, possible conformational changes upon binding are included based on Monte Carlo selection of protein conformations stored as ordered pregenerated conformational ensembles. The conformational ensembles of each binding partner protein were generated by three different approaches starting from the unbound partner protein structure with a range spanning a root mean square deviation of 1–2.5 Å with respect to the unbound structure. Because MC sampling is performed to select appropriate partner conformations on the fly, the approach is not limited by the number of conformations in the ensemble, in contrast to ensemble cross docking, where each conformer pair is docked separately. Although only a fraction of generated conformers was in closer agreement with the bound structure, the REMC ensemble docking approach achieved improved docking results compared to REMC docking with only the unbound partner structures or using docking energy minimization methods. The approach has significant potential for further improvement in combination with more realistic structural ensembles and better docking scoring functions. Proteins 2017; 85:924–937. © 2016 Wiley Periodicals, Inc.
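The exchange step at the heart of a replica-exchange Monte Carlo scheme can be sketched as below. This assumes the textbook temperature-based variant, where replicas differ in inverse temperature beta; the abstract does not say which parameter is varied between replicas in this docking approach, so the function and its arguments are illustrative only.

```python
import math
import random

def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replica states."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)

def attempt_swap(beta_i, beta_j, energy_i, energy_j, rng=random):
    """Return True if the two replicas should exchange their configurations."""
    return rng.random() < swap_probability(beta_i, beta_j, energy_i, energy_j)

# Cold replica (beta = 1.0) currently holds the worse energy: always swap.
p_favourable = swap_probability(1.0, 0.5, -5.0, -10.0)
# Cold replica already holds the better energy: swap only occasionally.
p_unfavourable = swap_probability(1.0, 0.5, -10.0, -5.0)
```

Occasional unfavourable swaps are what let a configuration trapped in a local minimum escape via the more permissive replicas.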

19.
The effect of cut-off distance used in molecular dynamics (MD) simulations on fluid properties was studied systematically in both canonical (NVT) and isothermal–isobaric (NPT) ensembles. Results show that the cut-off distance in the NVT ensemble plays little role in determining the equilibrium structure of fluid if the ensemble has a high density. However, pressures calculated in the same NVT ensembles strongly depend on the cut-off distance used. In the NPT ensemble, cut-off distance plays a key role in determining fluid equilibrium structure, density and self-diffusion coefficient. The dependence of the radial distribution function on the cut-off distance in NPT ensembles means that the WCA theory (a perturbation theory developed by Weeks, Chandler and Andersen) is not suitable for NPT ensembles, because its underlying assumption (that the effect of the attractive force in determining the liquid structure is negligible) is not valid. The dependence of fluid properties on the cut-off distance also indicates that using the WCA potential (the repulsive part of the intermolecular potential proposed in the WCA theory) to calculate fluid transport in heterogeneous systems could lead to significant errors or incorrect results.
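To make the two potentials discussed here concrete, the sketch below writes out a truncated Lennard-Jones potential and the WCA potential in reduced units. It assumes the abstract refers to the standard 12-6 Lennard-Jones form and a simple unshifted truncation; the study's actual force field, truncation scheme and parameter values are not stated, so the defaults below are placeholders.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Full 12-6 Lennard-Jones pair potential."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def truncated_lj(r, r_cut, epsilon=1.0, sigma=1.0):
    """Lennard-Jones potential set to zero beyond the cut-off (unshifted)."""
    return lennard_jones(r, epsilon, sigma) if r < r_cut else 0.0

def wca(r, epsilon=1.0, sigma=1.0):
    """WCA potential: the LJ repulsive branch, shifted up by epsilon."""
    r_min = 2.0 ** (1.0 / 6.0) * sigma  # location of the LJ minimum
    return lennard_jones(r, epsilon, sigma) + epsilon if r < r_min else 0.0

r_min = 2.0 ** (1.0 / 6.0)
well_depth = lennard_jones(r_min)      # the minimum, equal to -epsilon
beyond_cut = truncated_lj(3.0, 2.5)    # zero: interaction neglected past r_cut
contact = wca(1.0)                     # purely repulsive, equal to +epsilon at r = sigma
```

The WCA form can be seen as the extreme case of a short cut-off placed at the potential minimum, which removes all attraction; that is the approximation the abstract finds inadequate under NPT conditions.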

20.
Understanding the genetic regulatory network comprising genes, RNA, proteins and the network connections and dynamical control rules among them, is a major task of contemporary systems biology. I focus here on the use of the ensemble approach to find one or more well-defined ensembles of model networks whose statistical features match those of real cells and organisms. Such ensembles should help explain and predict features of real cells and organisms. More precisely, an ensemble of model networks is defined by constraints on the "wiring diagram" of regulatory interactions, and the "rules" governing the dynamical behavior of regulated components of the network. The ensemble consists of all networks consistent with those constraints. Here I discuss ensembles of random Boolean networks, scale-free Boolean networks, "medusa" Boolean networks, continuous variable networks, and others. For each ensemble, M statistical features, such as the size distribution of avalanches in gene activity changes unleashed by transiently altering the activity of a single gene, the distribution in distances between gene activities on different cell types, and others, are measured. This creates an M-dimensional space, where each ensemble corresponds to a cluster of points or distributions. Using current and future experimental techniques, such as gene arrays, these M properties are to be measured for real cells and organisms, again yielding a cluster of points or distributions in the M-dimensional space. The procedure then finds ensembles close to those of real cells and organisms, and hill climbs to attempt to match the observed M features. One thus obtains one or more ensembles that should predict and explain many features of the regulatory networks in cells and organisms.
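One member of the simplest ensemble mentioned here, a random Boolean network, can be simulated in a few lines. The sketch assumes the classic construction in which each of N genes reads K randomly chosen inputs through a random truth table and all genes update synchronously; the parameter values, seed and function names are illustrative choices, not taken from the article.

```python
import random

def make_network(n, k, rng):
    """Random wiring diagram and random Boolean rule for each of n genes."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every gene from the current state."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for i in gene_inputs:
            index = (index << 1) | state[i]  # pack input bits into a table index
        new_state.append(table[index])
    return tuple(new_state)

def attractor_length(state, inputs, tables):
    """Iterate until a state repeats; return the length of the cycle reached."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return t - seen[state]

rng = random.Random(1)
n_genes, k_inputs = 10, 2
inputs, tables = make_network(n_genes, k_inputs, rng)
start = tuple(rng.randint(0, 1) for _ in range(n_genes))
cycle = attractor_length(start, inputs, tables)
```

Repeating this over many sampled networks and start states gives a distribution of attractor lengths, which is the kind of ensemble-level statistic the article proposes comparing with measurements on real cells.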
