Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
Having multiple domains in proteins can lead to partial folding and increased aggregation. Folding cooperativity, the all-or-nothing folding of a protein, can reduce this aggregation propensity. In agreement with bulk experiments, a coarse-grained structure-based model of the three-domain protein, E. coli Adenylate kinase (AKE), folds cooperatively. Domain interfaces have previously been implicated in the cooperative folding of multi-domain proteins. To understand their role in AKE folding, we computationally create mutants with deleted inter-domain interfaces and simulate their folding. We find that inter-domain interfaces play a minor role in the folding cooperativity of AKE. On further analysis, we find that unlike other multi-domain proteins whose folding has been studied, the domains of AKE are not singly-linked. Two of its domains have two linkers to the third one, i.e., they are inserted into the third one. We use circular permutation to modify AKE chain-connectivity and convert inserted-domains into singly-linked domains. We find that domain insertion in AKE achieves the following: (1) It facilitates folding cooperativity even when domains have different stabilities. Insertion constrains the N- and C-termini of inserted domains and stabilizes their folded states. Therefore, domains that perform conformational transitions can be smaller with fewer stabilizing interactions. (2) Inter-domain interactions are not needed to promote folding cooperativity and can be tuned for function. In AKE, these interactions help promote conformational-dynamics-limited catalysis. Finally, using structural bioinformatics, we suggest that domain insertion may also facilitate the cooperative folding of other multi-domain proteins.

2.
Structural genomics projects require strategies for rapidly recognizing protein sequences appropriate for routine structure determination. For large proteins, this strategy includes the dissection of proteins into structural domains that form stable native structures. However, protein dissection essentially remains an empirical and often a tedious process. Here, we describe a simple strategy for rapidly identifying structural domains and assessing their structures. This approach combines the computational prediction of sequence regions corresponding to putative domains with an experimental assessment of their structures and stabilities by NMR and biochemical methods. We tested this approach with nine putative domains predicted from a set of 108 Thermus thermophilus HB8 sequences using PASS, a domain prediction program we previously reported. To facilitate the experimental assessment of the domain structures, we developed a generic 6-hour His-tag-based purification protocol, which enables the sample quality evaluation of a putative structural domain in a single day. As a result, we observed that half of the predicted structural domains were indeed natively folded, as judged by their HSQC spectra. Furthermore, two of the natively folded domains were novel, without related sequences classified in the Pfam and SMART databases, which is a significant result with regard to the ability of structural genomics projects to uniformly cover the protein fold space.

3.
An algorithm is presented for the fast and accurate definition of protein structural domains from coordinate data without prior knowledge of the number or type of domains. The algorithm explicitly locates domains that comprise one or two continuous segments of protein chain. Domains that include more than two segments are also located. The algorithm was applied to a nonredundant database of 230 protein structures and the results compared to domain definitions obtained from the literature, or by inspection of the coordinates on molecular graphics. For 70% of the proteins, the derived domains agree with the reference definitions, 18% show minor differences and only 12% (28 proteins) show very different definitions. Three screens were applied to identify the derived domains least likely to agree with the subjective definition set. These screens revealed a set of 173 proteins, 97% of which agree well with the subjective definitions. The algorithm represents a practical domain identification tool that can be run routinely on the entire structural database. Adjustment of parameters also allows smaller compact units to be identified in proteins.

4.
5.
MOTIVATION: Although many methods are available for the identification of structural domains from protein three-dimensional structures, accurate definition of protein domains and the curation of such data for a large number of proteins are often possible only after manual intervention. The availability of domain definitions for protein structural entries is useful for the sequence analysis of aligned domains, structure comparison, fold recognition procedures and understanding protein folding, domain stability and flexibility. RESULTS: We have improved our method of domain identification starting from the concept of clustering secondary structural elements, but with the intention of reducing the number of discontinuous segments in identified domains. The results of our modified and automatic approach have been compared with the domain definitions from other databases. On a test data set of 55 proteins, this method achieves high agreement (88%) in the number of domains with the crystallographers' definition and resources such as SCOP, CATH, DALI, 3Dee and PDP databases. This method also obtains a 98% overlap score with the other resources in the definition of domain boundaries of the 55 proteins. We have examined the domain arrangements of 4592 non-redundant protein chains using the improved method to include 5409 domains, leading to an update of the structural domain database. AVAILABILITY: The latest version of the domain database and online domain identification methods are available from http://www.ncbs.res.in/~faculty/mini/ddbase/ddbase.html Supplementary information: http://www.ncbs.res.in/~faculty/mini/ddbase/supplementary/supplementary.html

6.
We are interested in determining which amino acid pairs can be substituted for the disulfide (S-S) bonds in proteins without disrupting their native structures under physiological conditions. In this study, we focused on the intradomain S-S bonds in Ig fold domains and aimed to determine a simple rule for replacement of their S-S bonds. The cysteines of four different Ig fold domains were mutated randomly, and the amino acid pairs substituted for the S-S bonds were screened by a method utilizing a cellular quality control system. Among the 36 selected mutants, 31 were natively folded without S-S bonds, as judged from the cooperativity of thermal unfolding. In addition, the selected mutant llama heavy chain antibodies retained antigen-binding affinity. At least two of the pairs Ala:Ala, Ala:Val, Val:Ala, and Val:Val were found in the selected mutants for all four different Ig fold domains, and they were stably folded at 30 °C. This suggests that examination of these four pairs could be enough to obtain natively folded Ig fold domains without S-S bonds.

7.
L Wernisch, M Hunting, S J Wodak. Proteins, 1999, 35(3):338-352
A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain segments. The core algorithm uses the Kernighan-Lin graph heuristic to partition the protein into residue sets which display minimum interactions between them. These interactions are deduced from the weighted Voronoi diagram. The generated partitions are accepted or rejected on the basis of optimized criteria, representing basic expected physical properties of structural domains. The graph heuristic approach is shown to be very effective; it closely approximates the exact solution provided by a branch and bound algorithm for a number of test proteins. In addition, the overall performance of STRUDL is assessed on a set of 787 representative proteins from the Protein Data Bank by comparison to domain definitions in the CATH protein classification. The domains assigned by STRUDL agree with the CATH assignments in at least 81% of the tested proteins. This result is comparable to that obtained previously using PUU (Holm and Sander, Proteins 1994;9:256-268), the only other available algorithm designed to identify domains with any number of non-contiguous chain segments. A detailed discussion of the structures for which our assignments differ from those in CATH brings to light some clear inconsistencies between the concept of structural domains based on minimizing inter-domain interactions and that of delimiting structural motifs that represent acceptable folding topologies or architectures. Considering both concepts as complementary and combining them in a layered approach might be the way forward.

8.
Proteins are often classified in a binary fashion as either structured or disordered. However, this approach has several deficits. Firstly, protein folding is always conditional on the physicochemical environment. A protein which is structured in some circumstances will be disordered in others. Secondly, it hides a fundamental asymmetry in behavior. While all structured proteins can be unfolded through a change in environment, not all disordered proteins have the capacity for folding. Failure to accommodate these complexities confuses the definition of both protein structural domains and intrinsically disordered regions. We illustrate these points with an experimental study of a family of small binding domains, drawn from the RNA polymerase of mumps virus and its closest relatives. Assessed at face value, the domains fall on a structural continuum, with folded, partially folded, and near unstructured members. Yet the disorder present in the family is conditional, and these closely related polypeptides can access the same folded state under appropriate conditions. Any heuristic definition of the protein domain emphasizing conformational stability divides this domain family in two, in a way that makes no biological sense. Structural domains would be better defined by their ability to adopt a specific tertiary structure: a structure that may or may not be realized, dependent on the circumstances. This explicitly allows for the conditional nature of protein folding, and more clearly demarcates structural domains from intrinsically disordered regions that may function without folding.

9.
Domains are the main structural and functional units of larger proteins. They tend to be contiguous in primary structure and can fold and function independently. It has been observed that 10–20% of all encoded proteins contain duplicated domains and the average pairwise sequence identity between them is usually low. In the present study, we have analyzed the structural similarity between domain repeats of proteins with known structures available in the Protein Data Bank using structure-based inter-residue interaction measures such as the number of long-range contacts, surrounding hydrophobicity, and pairwise interaction energy. We used the RADAR program for detecting the repeats in a protein sequence, which were further validated using Pfam domain assignments. The sequence identity between the repeats in domains ranges from 20 to 40% and their secondary structural elements are well conserved. The number of long-range contacts, surrounding hydrophobicity calculations and pairwise interaction energy of the domain repeats clearly reveal the conservation of the 3-D structural environment in the repeats of domains. The proportions of mainchain–mainchain hydrogen bonds and hydrophobic interactions are also highly conserved between the repeats. The present study suggests that the computation of these structure-based parameters will give better clues about the tertiary environment of the repeats in domains. The folding rates of individual domains in the repeats predicted using the long-range order parameter indicate that the predicted folding rates correlate well with most of the experimentally observed folding rates for the analyzed independently folded domains.

10.
An algorithm for determining protein domain structure is proposed. Domain structures resulting from application of the algorithm have been obtained and compared with available data. The method is based on an entirely physical model of van der Waals interactions that reflects, as illustrated in this work, the distribution of electron density. Various levels of hierarchy in the protein spatial structure are discerned by analysis of the interaction energy between structural units of different scales. Thus the level of energy hierarchy plays the role of the sole parameter, and the method obviates the use of complicated geometrical criteria with numerous fitting parameters. The algorithm readily and accurately locates domains formed by continuous segments of the protein chain as well as those comprising non-sequential segments, and sets no limit on the number of segments in a domain. We have analyzed 309 protein structures. Among 277 structures for which our results could be compared with the domain definitions made in other works, 243 showed complete or partial coincidence, and only in 34 cases did the domain structures prove substantially different. The domains delineated with our approach may coincide with the reference definition at different levels of the globule hierarchy. Along with defining the domain structure, our approach allows one to consider the protein spatial structure in terms of the spatial distribution of the interaction energy in order to establish the correspondence between the hierarchy of energy distribution and the hierarchy of structural elements.

11.
Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. 
Altogether, these results argue that methods based on sequence similarity can be useful for dissecting large proteins into small autonomously folding domains, and such methods may provide an efficient support to structural genomics projects.
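
The plateau-detection idea in this abstract (plot the number of similar sequences at each residue, then look for rectangular elevations) can be sketched roughly as follows. This is only an illustration, not the authors' program: the function names `coverage_profile` and `find_plateaus`, the representation of BLAST hits as (start, end) ranges on the query, and the two thresholds `min_count` and `min_len` are all assumptions introduced here.

```python
from typing import List, Tuple


def coverage_profile(query_len: int, hits: List[Tuple[int, int]]) -> List[int]:
    """Count, for each query residue, how many hits cover it.

    Each hit is a hypothetical (start, end) range on the query,
    1-based and inclusive, as one might extract from BLAST output.
    """
    profile = [0] * query_len
    for start, end in hits:
        # Clamp the hit to the query and convert to 0-based indices.
        for i in range(max(start, 1) - 1, min(end, query_len)):
            profile[i] += 1
    return profile


def find_plateaus(profile: List[int], min_count: int, min_len: int) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs where the profile stays
    at or above min_count for at least min_len consecutive residues.

    Both thresholds are illustrative assumptions, not published values.
    """
    segments = []
    run_start = None
    # A sentinel below the threshold closes a run that reaches the last residue.
    for i, count in enumerate(profile + [min_count - 1]):
        if count >= min_count and run_start is None:
            run_start = i
        elif count < min_count and run_start is not None:
            if i - run_start >= min_len:
                segments.append((run_start + 1, i))
            run_start = None
    return segments


if __name__ == "__main__":
    hits = [(1, 10), (2, 10), (1, 9), (15, 20)]
    profile = coverage_profile(20, hits)
    print(find_plateaus(profile, min_count=2, min_len=5))
```

A simple threshold is used here in place of true rectangle fitting; a closer imitation of the described method would also require the elevation to have steep edges on both sides.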

12.
With a growing number of structures available in the Brookhaven Protein Data Bank, automatic methods for domain identification are required for the construction of databases. Domains are considered to be clusters of secondary structure elements. Thus, helices and strands are first clustered using intersecondary structural distances between C alpha positions, and dendrograms based on this distance measure are used to identify domains. Individual domains are recognized by a disjoint factor, which enables the automatic identification and classification into disjoint, interacting, and conjoint domains. Application to a database of 83 protein families and 18 unique structures shows that the approach provides an effective delineation of boundaries and identifies those proteins that can be considered as a single domain. A quantitative estimate of the interaction between domains has been proposed. The database of protein domains is a useful tool for understanding protein folding, for recognizing protein folds, and for understanding structure-activity relationships.
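
The clustering step described in this abstract can be illustrated with a small agglomerative sketch. It is not the published method: the mean C alpha to C alpha distance used in `sse_distance`, the single-linkage rule, and the distance `cutoff` at which the dendrogram is cut are assumptions made for the example, and the disjoint factor mentioned in the abstract is not reproduced.

```python
import math
from itertools import combinations


def sse_distance(a, b):
    """Mean distance between the C-alpha coordinates of two secondary
    structure elements, each given as a list of (x, y, z) tuples."""
    return sum(math.dist(p, q) for p in a for q in b) / (len(a) * len(b))


def cluster_sses(sses, cutoff):
    """Single-linkage agglomerative clustering of secondary structure elements.

    Merging stops once the closest pair of clusters is farther apart than
    `cutoff` (an assumed threshold, in the same units as the coordinates).
    Returns a sorted list of clusters, each a sorted list of element indices.
    """
    clusters = [[i] for i in range(len(sses))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest pair of elements.
        return min(sse_distance(sses[i], sses[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[a], clusters[b]) > cutoff:
            break
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(clusters)


if __name__ == "__main__":
    # Four toy elements: two near the origin, two about 50 units away.
    toy = [
        [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
        [(0.0, 5.0, 0.0), (1.5, 5.0, 0.0)],
        [(50.0, 0.0, 0.0), (51.5, 0.0, 0.0)],
        [(50.0, 5.0, 0.0), (51.5, 5.0, 0.0)],
    ]
    print(cluster_sses(toy, cutoff=10.0))
```

Each resulting cluster plays the role of a candidate domain; recording the linkage value at every merge instead of stopping at a cutoff would give the dendrogram itself.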

13.
Most protein domains are found in multi-domain proteins, yet most studies of protein folding have concentrated on small, single-domain proteins or on isolated domains from larger proteins. Spectrin domains are small (106 amino acid residues), independently folding domains consisting of three long alpha-helices. They are found in multi-domain proteins with a number of spectrin domains in tandem array. Structural studies have shown that in these arrays the last helix of one domain forms a continuous helix with the first helix of the following domain. It has been demonstrated that a number of spectrin domains are stabilised by their neighbours. Here we investigate the molecular basis for cooperativity between adjacent spectrin domains 16 and 17 from chicken brain alpha-spectrin (R16 and R17). We show that whereas the proteins unfold as a single cooperative unit at 25 °C, cooperativity is lost at higher temperatures and in the presence of stabilising salts. Mutations in the linker region also cause the cooperativity to be lost. However, the cooperativity does not rely on specific interactions in the linker region alone. Most mutations in the R17 domain cause a decrease in cooperativity, whereas proteins with mutations in the R16 domain still fold cooperatively. We propose a mechanism for this behaviour.

14.
The Trk receptors and their neurotrophin ligands control development and maintenance of the nervous system. The crystal structures of the ligand binding domain of TrkA, TrkB, and TrkC were solved and refined to high resolution. The domains adopt an immunoglobulin-like fold, but crystallized in all three instances as dimers with the N-terminal strand of each molecule replaced by the same strand of a symmetry-related mate. Models of the correctly folded domains could be constructed by changing the position of a single residue, and the resulting model of the binding domain of TrkA is essentially identical with the bound structure as observed in a complex with nerve growth factor. An analysis of the existing mutagenesis data for TrkA and TrkC in light of these structures reveals the structural reasons for the specificity among the Trk receptors, and explains the underpinnings of the multi-functional ligands that have been reported. The overall structure of all three domains belongs to the I-set of immunoglobulin-like domains, but shows several unusual features, such as an exposed disulfide bridge linking two neighboring strands in the same beta-sheet. For all three domains, the residues that deviate from the standard fingerprint pattern common to the I-set family fall in the region of the ligand binding site observed in the complex. Therefore, identification of these deviations in the sequences of other immunoglobulin-like domain-containing receptors may help to identify their ligand binding site even in the absence of structural or mutagenesis data.

15.
Proteins with homologous amino acid sequences have similar folds and it has been assumed that an unknown three-dimensional structure can be obtained from a known homologous structure by substituting new side-chains into the polypeptide chain backbone, followed by relatively small adjustment of the model. To examine this approach of structure prediction and, more generally, to isolate the characteristics of native proteins, we constructed two incorrectly folded protein models. Sea-worm hemerythrin and the variable domain of mouse immunoglobulin κ-chain, two proteins with no sequence homology, were chosen for study; the former is composed of a bundle of four alpha-helices and the latter consists of two 4-stranded beta-sheets. Using an automatic computer procedure, hemerythrin side-chains were substituted into the immunoglobulin domain and vice versa. The structures were energy-minimized with the program CHARMM and the resulting structures compared with the correctly folded forms. It was found that the incorrect side-chains can be incorporated readily into both types of structures (alpha-helices, beta-sheets) with only small structural adjustments. After constrained energy-minimization, which led to an average atomic co-ordinate shift of no more than 0.7 to 0.9 Å, the incorrectly folded models arrived at potential energy values comparable to those of the correct structures. Detailed analysis of the energy results shows that the incorrect structures have less stabilizing electrostatic, van der Waals' and hydrogen-bonding interactions. The difference is particularly pronounced when the electrostatic and van der Waals' energy terms are calculated by modified equations that include an approximate representation of solvent effects. The incorrectly folded structures also have a significantly larger solvent-accessible surface and a greater fraction of non-polar side-chain atoms exposed to solvent. Examination of their interior shows that the packing of side-chains at the secondary structure interfaces, although corresponding to sterically allowed conformations, deviates from the characteristics found in normal proteins. The analysis of incorrectly folded structures has made it clear that the absence of bad non-bonded contacts, though necessary, is not sufficient to demonstrate the validity of model-built structures and that modeling of homologous structures has to be accompanied by a thorough quantitative evaluation of the results. Further, certain features that characterize native proteins are made evident by their absence in misfolded models.

16.
Identifying glycoconjugate-binding domains. Building on the past.   Total citations: 1, self-citations: 0, citations by others: 1
G D Holt. Glycobiology, 1991, 1(4):329-336
The molecular details of how glycoconjugate-binding proteins interact with their ligands have been revealed by a variety of techniques. For example, proteases, chemical-modifying reagents and antibodies have served as effective probes of lectin functional domains. Protein crystallography has provided insight into how lectins are structured, and aided in determining which amino acids in these proteins are positioned appropriately for bond formation with glycoconjugates. In addition, the characterization and sequencing of naturally occurring, non-functional lectin variants have led to the identification of amino acids which play critical roles in a lectin's glycoconjugate-binding domain. Similarly, studies of lectin mutants produced by site-directed mutagenesis, and of synthetic peptides that mimic lectin binding properties, have demonstrated the importance of particular amino acids for glycoconjugate binding. An alternate approach to understanding lectin functional domains has been to compare the primary sequences of these proteins to reveal common sequence elements which allow them to be organized into families. For example, the discovery of amino acid homologies dispersed over long segments of the primary sequences of several lectins has suggested that many of these proteins have a related three-dimensional organization. In addition, the identification of more highly focused regions of sequence homology has indicated that many structures within the lectin glycoconjugate-binding domains themselves may be conserved. Scanning protein data banks for sequences homologous to known lectins has led to the identification of several previously unrecognized lectins, and aided in determining what portions of these proteins function in their glycoconjugate-binding domains. (ABSTRACT TRUNCATED AT 250 WORDS)

17.
Domains are the building blocks of proteins and play a crucial role in protein–protein interactions. Here, we propose a new approach for the analysis and prediction of domain–domain interfaces. Our method, which relies on the representation of domains as residue-interacting networks, finds an optimal decomposition of domain structures into modules. The resulting modules comprise highly cooperative residues, which exhibit few connections with other modules. We found that non-overlapping binding sites in a domain, involved in different domain–domain interactions, are generally contained in different modules. This observation indicates that our modular decomposition is able to separate protein domains into regions with specialized functions. Our results show that modules with high modularity values identify binding site regions, demonstrating the predictive character of modularity. Furthermore, the combination of modularity with other characteristics, such as sequence conservation or surface patches, was found to improve our predictions. In an attempt to give a physical interpretation to the modular architecture of domains, we analyzed in detail six examples of protein domains with available experimental binding data. The modular configuration of the TEM1-β-lactamase binding site illustrates the energetic independence of hotspots located in different modules and the cooperativity of those sited within the same modules. The energetic and structural cooperativity between intramodular residues is also clearly shown in the example of the chymotrypsin inhibitor, where non–binding site residues have a synergistic effect on binding. Interestingly, the binding site of the T cell receptor β chain variable domain 2.1 is contained in one module, which includes structurally distant hot regions displaying positive cooperativity. These findings support the idea that modules possess certain functional and energetic independence. 
A modular organization of binding sites confers robustness and flexibility to the performance of the functional activity, and facilitates the evolution of protein interactions.
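
To make the notion of a modularity value concrete, the sketch below scores a given split of a residue-interaction network into modules. The abstract does not state which modularity definition the authors optimize, so the standard Newman-Girvan form, Q = sum over modules of (e_c / m) - (d_c / 2m)^2, is assumed here, where m is the number of contacts, e_c the contacts inside module c, and d_c the summed degree of its residues. The function name `modularity` and the edge-list input are likewise illustrative.

```python
def modularity(edges, module_of):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges     : list of (u, v) residue contacts, each contact listed once
    module_of : dict mapping every residue to its module label

    This standard definition is an assumption; the abstract above does
    not specify the exact measure used.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    intra = {}   # contacts with both ends inside the same module
    degree = {}  # summed degree of the residues in each module
    for u, v in edges:
        cu, cv = module_of[u], module_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


if __name__ == "__main__":
    # Two tightly connected triangles of residues joined by a single contact.
    contacts = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(modularity(contacts, split))
```

Splitting the toy network along its single bridging contact scores higher than treating it as one module, which is the sense in which a decomposition with many intramodular and few intermodular connections is preferred.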

18.
Domains are the building blocks of proteins and play a crucial role in protein-protein interactions. Here, we propose a new approach for the analysis and prediction of domain-domain interfaces. Our method, which relies on the representation of domains as residue-interacting networks, finds an optimal decomposition of domain structures into modules. The resulting modules comprise highly cooperative residues, which exhibit few connections with other modules. We found that non-overlapping binding sites in a domain, involved in different domain-domain interactions, are generally contained in different modules. This observation indicates that our modular decomposition is able to separate protein domains into regions with specialized functions. Our results show that modules with high modularity values identify binding site regions, demonstrating the predictive character of modularity. Furthermore, the combination of modularity with other characteristics, such as sequence conservation or surface patches, was found to improve our predictions. In an attempt to give a physical interpretation to the modular architecture of domains, we analyzed in detail six examples of protein domains with available experimental binding data. The modular configuration of the TEM1-beta-lactamase binding site illustrates the energetic independence of hotspots located in different modules and the cooperativity of those sited within the same modules. The energetic and structural cooperativity between intramodular residues is also clearly shown in the example of the chymotrypsin inhibitor, where non-binding site residues have a synergistic effect on binding. Interestingly, the binding site of the T cell receptor beta chain variable domain 2.1 is contained in one module, which includes structurally distant hot regions displaying positive cooperativity. These findings support the idea that modules possess certain functional and energetic independence. 
A modular organization of binding sites confers robustness and flexibility to the performance of the functional activity, and facilitates the evolution of protein interactions.

19.
Experiments were designed to explore the tolerance of protein structure and folding to very large insertions of folded protein within a structural domain. Dihydrofolate reductase and beta-lactamase have been inserted in four different positions of phosphoglycerate kinase. The resultant chimeric proteins are all overexpressed, and the host as well as the inserted partners are functional. Although not explicitly designed, functional coupling between the two fused partners was observed in some of the chimeras. These results show that the tolerance of protein structures to very large structured insertions is more general than previously expected and support the idea that the natural sequence continuity of a structural domain is not required for the folding process. These results directly suggest a new experimental approach to screen, for example, for folded protein in randomized polypeptide sequences.

20.