Similar articles
 20 similar articles found (search time: 31 ms)
1.

Background

Despite the decades-long use of Bacillus atrophaeus var. globigii (BG) as a simulant for biological warfare (BW) agents, knowledge of its genome composition is limited. Furthermore, the ability to differentiate signatures of deliberate adaptation and selection from natural variation is lacking for most bacterial agents. We characterized a lineage of BG with a long history of use as a simulant for BW operations, focusing on classical bacteriological markers, metabolic profiling and whole-genome shotgun sequencing (WGS).

Results

Archival strains and two “present day” type strains were compared to simulant strains on different laboratory media. Several of the samples produced multiple colony morphotypes that differed from that of an archival isolate. To trace the microevolutionary history of these isolates, we obtained WGS data for several archival and present-day strains and morphotypes. Bacillus-wide phylogenetic analysis identified B. subtilis as the nearest neighbor to B. atrophaeus. The genome of B. atrophaeus is, on average, 86% identical to B. subtilis on the nucleotide level. WGS of variants revealed that several strains were mixed but highly related populations and uncovered a progressive accumulation of mutations among the “military” isolates. Metabolic profiling and microscopic examination of bacterial cultures revealed enhanced growth of “military” isolates on lactate-containing media, and showed that the “military” strains exhibited a hypersporulating phenotype.

Conclusions

Our analysis revealed the genomic and phenotypic signatures of strain adaptation and deliberate selection for traits that were desirable in a simulant organism. Together, these results demonstrate the power of whole-genome and modern systems-level approaches to characterize microbial lineages, to develop and validate forensic markers for strain discrimination, and to reveal signatures of deliberate adaptation.

2.

Background

Next-generation sequencing techniques, such as genotyping-by-sequencing (GBS), provide alternatives to single nucleotide polymorphism (SNP) arrays. The aim of this work was to evaluate the potential of GBS compared to SNP array genotyping for genomic selection in livestock populations.

Methods

The value of GBS was quantified by simulation analyses in which three parameters were varied: (i) genome-wide sequence read depth (x) per individual from 0.01x to 20x or using SNP array genotyping; (ii) number of genotyped markers from 3000 to 300 000; and (iii) size of training and prediction sets from 500 to 50 000 individuals. The latter was achieved by distributing the total available x of 1000x, 5000x, or 10 000x per genotyped locus among the varying number of individuals. With SNP arrays, genotypes were called from sequence data directly. With GBS, genotypes were called from sequence reads that varied between loci and individuals according to a Poisson distribution with mean equal to x. Simulated data were analyzed with ridge regression and the accuracy and bias of genomic predictions and response to selection were quantified under the different scenarios.
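The read-depth model described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: read counts per locus are drawn from a Poisson distribution with mean x, as stated in the abstract, while the diploid read sampling, the naive calling rule, and all function names are assumptions introduced here for illustration.

```python
import math
import random

def poisson(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_reads(true_genotype, depth, rng):
    """Return (ref_reads, alt_reads) for one locus of one diploid individual.

    true_genotype counts alternative alleles (0, 1 or 2). Each read is assumed
    to sample one of the two chromosomes with equal probability.
    """
    n_reads = poisson(rng, depth)
    alt = sum(rng.random() < true_genotype / 2 for _ in range(n_reads))
    return n_reads - alt, alt

def naive_call(ref_reads, alt_reads):
    """Hypothetical calling rule: None when no read covers the locus."""
    if ref_reads + alt_reads == 0:
        return None
    if alt_reads == 0:
        return 0
    if ref_reads == 0:
        return 2
    return 1

def missing_fraction(depth, n_loci, seed=0):
    """Fraction of loci that receive zero reads at the given mean depth."""
    rng = random.Random(seed)
    return sum(poisson(rng, depth) == 0 for _ in range(n_loci)) / n_loci

# Under the Poisson model, the expected share of uncovered loci is exp(-x).
for x in (0.01, 0.1, 1.0, 5.0):
    print(x, round(missing_fraction(x, 5000), 3), round(math.exp(-x), 3))
```

Under this model a heterozygote covered by a single read is always miscalled as a homozygote, which is one plausible reason why very low x would bias predictions.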

Results

Accuracies of genomic predictions using GBS data or SNP array data were comparable when large numbers of markers were used and x per individual was ~1x or higher. The bias of genomic predictions was very high at a very low x. When the total available x was distributed among the training individuals, the accuracy of prediction was maximized when a large number of individuals was used that had GBS data with low x for a large number of markers. Similarly, response to selection was maximized under the same conditions due to increasing both accuracy and selection intensity.

Conclusions

GBS offers great potential for developing genomic selection in livestock populations because it makes it possible to cover large fractions of the genome and to vary the sequence read depth per individual. Thus, the accuracy of predictions is improved by increasing the size of training populations and the intensity of selection is increased by genotyping a larger number of selection candidates.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0102-z) contains supplementary material, which is available to authorized users.

3.
4.

Background

The main goal of our study was to investigate the implementation, prospects, and limits of marker imputation for quantitative genetic studies contrasting map-independent and map-dependent algorithms. We used a diversity panel consisting of 372 European elite wheat (Triticum aestivum L.) varieties, which had been genotyped with SNP arrays, and performed intensive simulation studies.

Results

Our results clearly showed that imputation accuracy was substantially higher for map-dependent compared to map-independent methods. The accuracy of marker imputation depended strongly on the linkage disequilibrium between the markers in the reference panel and the markers to be imputed. For the decay of linkage disequilibrium present in European wheat, we concluded that around 45,000 markers are needed for low cost, low-density marker profiling. This will facilitate high imputation accuracy, also for rare alleles. Genomic selection and diversity studies profited only marginally from imputing missing values. In contrast, the power of association mapping increased substantially when missing values were imputed.

Conclusions

Imputing missing values is especially of interest for an economic implementation of association mapping in breeding populations.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1366-y) contains supplementary material, which is available to authorized users.

5.
6.

Background

Molecular marker-assisted breeding provides an efficient tool to develop improved crop varieties. A major challenge for the broad application of markers in marker-assisted selection is that the marker phenotypes must match plant phenotypes in a wide range of breeding germplasm. In this study, we used the legume crop species Lupinus angustifolius (lupin) to demonstrate the utility of whole genome sequencing and re-sequencing on the development of diagnostic markers for molecular plant breeding.

Results

Nine lupin cultivars released in Australia from 1973 to 2007 were subjected to whole genome re-sequencing. The re-sequencing data together with the reference genome sequence data were used in marker development, which revealed 180,596 to 795,735 SNP markers from pairwise comparisons among the cultivars. A total of 207,887 markers were anchored on the lupin genetic linkage map. Marker mining obtained an average of 387 SNP markers and 87 InDel markers for each of the 24 genome sequence assembly scaffolds bearing markers linked to 11 genes of agronomic interest. Using the R gene PhtjR conferring resistance to phomopsis stem blight disease as a test case, we discovered 17 candidate diagnostic markers by genotyping and selecting markers on a genetic linkage map. A further 243 candidate diagnostic markers were discovered by marker mining on a scaffold bearing non-diagnostic markers linked to the PhtjR gene. Nine of the ten tested candidate diagnostic markers were confirmed as truly diagnostic on a broad range of commercial cultivars. Markers developed using these strategies meet the requirements for broad application in molecular plant breeding.

Conclusions

We demonstrated that low-cost genome sequencing and re-sequencing data were sufficient and very effective in the development of diagnostic markers for marker-assisted selection. The strategies used in this study may be applied to any trait or plant species. Whole genome sequencing and re-sequencing provide a powerful tool to overcome current limitations in molecular plant breeding, which will enable plant breeders to precisely pyramid favourable genes to develop super crop varieties to meet future food demands.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1878-5) contains supplementary material, which is available to authorized users.

7.

Background

Canine hip dysplasia (CHD) is characterised by a malformation of the hip joint, leading to osteoarthritis and lameness. Current breeding schemes against CHD have resulted in measurable but moderate responses. The application of marker-assisted selection, incorporating specific markers associated with the disease, or genomic selection, incorporating genome-wide markers, has the potential to dramatically improve results of breeding schemes. Our aims were to identify regions associated with hip dysplasia or its related traits using genome and chromosome-wide analysis, study the linkage disequilibrium (LD) in these regions and provide plausible gene candidates. This study is focused on the UK Labrador Retriever population, which has a high prevalence of the disease and participates in a recording program led by the British Veterinary Association (BVA) and The Kennel Club (KC).

Results

Two genome-wide and several chromosome-wide QTLs affecting CHD and its related traits were identified, indicating regions related to hip dysplasia.

Conclusion

Consistent with previous studies, the genetic architecture of CHD appears to be based on many genes with small or moderate effect, suggesting that genomic selection rather than marker-assisted selection may be an appropriate strategy for reducing this disease.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-833) contains supplementary material, which is available to authorized users.

8.

Background

In contrast to currently used single nucleotide polymorphism (SNP) panels, the use of whole-genome sequence data is expected to enable the direct estimation of the effects of causal mutations on a given trait. This could lead to higher reliabilities of genomic predictions compared to those based on SNP genotypes. Also, at each generation of selection, recombination events between a SNP and a mutation can cause decay in reliability of genomic predictions based on markers rather than on the causal variants. Our objective was to investigate the use of imputed whole-genome sequence genotypes versus high-density SNP genotypes on (the persistency of) the reliability of genomic predictions using real cattle data.

Methods

Highly accurate phenotypes based on daughter performance and Illumina BovineHD Beadchip genotypes were available for 5503 Holstein Friesian bulls. The BovineHD genotypes (631,428 SNPs) of each bull were used to impute whole-genome sequence genotypes (12,590,056 SNPs) using the Beagle software. Imputation was done using a multi-breed reference panel of 429 sequenced individuals. Genomic estimated breeding values for three traits were predicted using a Bayesian stochastic search variable selection (BSSVS) model and a genome-enabled best linear unbiased prediction model (GBLUP). Reliabilities of predictions were based on 2087 validation bulls, while the other 3416 bulls were used for training.

Results

Prediction reliabilities ranged from 0.37 to 0.52. BSSVS performed better than GBLUP in all cases. Reliabilities of genomic predictions were slightly lower with imputed sequence data than with BovineHD chip data. Also, the reliabilities tended to be lower for both sequence data and BovineHD chip data when relationships between training animals were low. No increase in persistency of prediction reliability using imputed sequence data was observed.

Conclusions

Compared to BovineHD genotype data, using imputed sequence data for genomic prediction produced no advantage. To investigate the putative advantage of genomic prediction using (imputed) sequence data, a training set with a larger number of individuals that are distantly related to each other and genomic prediction models that incorporate biological information on the SNPs or that apply stricter SNP pre-selection should be considered.

Electronic supplementary material

The online version of this article (doi:10.1186/s12711-015-0149-x) contains supplementary material, which is available to authorized users.

9.
10.
11.

Background

Genomic selection (GS) promises to improve accuracy in estimating breeding values and genetic gain for quantitative traits compared to traditional breeding methods. Its reliance on high-throughput genome-wide markers and its statistical complexity, however, pose a serious challenge in data management, analysis, and sharing. A bioinformatics infrastructure for data storage and access, and a user-friendly web-based tool for analysis and for sharing output, are needed to make GS more practical for breeders.

Results

We have developed a web-based tool, called solGS, for predicting genomic estimated breeding values (GEBVs) of individuals, using a Ridge-Regression Best Linear Unbiased Predictor (RR-BLUP) model. It has an intuitive web-interface for selecting a training population for modeling and estimating genomic estimated breeding values of selection candidates. It estimates phenotypic correlation and heritability of traits and selection indices of individuals. Raw data is stored in a generic database schema, Chado Natural Diversity, co-developed by multiple database groups. Analysis output is graphically visualized and can be interactively explored online or downloaded in text format. An instance of its implementation can be accessed at the NEXTGEN Cassava breeding database, http://cassavabase.org/solgs.
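The RR-BLUP idea named above can be illustrated with a small ridge-regression sketch in plain Python. It is not solGS code: the function names, the toy marker matrix, and the choice to pass the ridge penalty `lam` directly are assumptions made here. In the usual mixed-model formulation the penalty corresponds to the ratio of residual variance to marker-effect variance, which this sketch does not estimate.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rr_blup(Z, y, lam):
    """Fit ridge-regression marker effects on a training population.

    Z holds marker genotypes (rows: individuals, columns: markers), y the
    phenotypes, lam the ridge penalty. Returns (mean, column means, effects).
    """
    n, p = len(Z), len(Z[0])
    mu = sum(y) / n
    col_means = [sum(Z[i][j] for i in range(n)) / n for j in range(p)]
    Zc = [[Z[i][j] - col_means[j] for j in range(p)] for i in range(n)]
    yc = [v - mu for v in y]
    # Normal equations of ridge regression: (Zc'Zc + lam I) u = Zc'yc
    lhs = [[sum(Zc[i][a] * Zc[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    rhs = [sum(Zc[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return mu, col_means, solve(lhs, rhs)

def gebv(markers, col_means, effects):
    """Genomic estimated breeding value of one candidate, as a deviation from the mean."""
    return sum((m - c) * e for m, c, e in zip(markers, col_means, effects))

# Toy training set: 5 individuals, 2 markers coded as allele counts 0/1/2.
Z = [[0, 1], [1, 0], [2, 1], [1, 2], [0, 0]]
y = [9.5, 11.0, 11.5, 10.0, 10.0]
mu, col_means, effects = rr_blup(Z, y, lam=1.0)
print(gebv([2, 0], col_means, effects))
```

Larger values of `lam` shrink all marker effects towards zero, which is what keeps the model estimable when markers outnumber phenotyped individuals.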

Conclusions

solGS enables breeders to store raw data and estimate GEBVs of individuals online, in an intuitive and interactive workflow. It can be adapted to any breeding program.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0398-7) contains supplementary material, which is available to authorized users.

12.
13.
14.

Background

Spounavirinae viruses have received increasing interest as tools for the control of harmful bacteria due to their relatively broad host range and strictly virulent phenotype.

Results

In this study, we collected and analyzed the complete genome sequences of 61 published phages, either ICTV-classified or candidate members of the Spounavirinae subfamily of the Myoviridae. A set of comparative analyses identified a distinct, recently proposed Bastille-like phage group within the Spounavirinae. More importantly, type 1 thymidylate synthase (TS1) and dihydrofolate reductase (DHFR) genes were shown to be unique to the members of the proposed Bastille-like phage group, and are suitable as molecular markers. We also show that the members of this group encode beta-lactamase and/or sporulation-related SpoIIIE homologs, which may call into question their suitability as biocontrol agents.

Conclusions

We confirm the creation of a new genus, the “Bastille-like group”, in Spounavirinae, and propose that the presence of TS1- and DHFR-encoding genes could serve as signatures for the new Bastille-like group. In addition, the presence of metallo-beta-lactamase and/or SpoIIIE homologs in all Bastille-like group phages calls into question their suitability for use in biocontrol.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1757-0) contains supplementary material, which is available to authorized users.

15.

Background

Fuelled by the advent and subsequent development of next generation sequencing technologies, metagenomics became a powerful tool for the analysis of microbial communities both scientifically and diagnostically. The biggest challenge is the extraction of relevant information from the huge sequence datasets generated for metagenomics studies. Although a plethora of tools are available, data analysis is still a bottleneck.

Results

To overcome the bottleneck of data analysis, we developed an automated computational workflow called RIEMS – Reliable Information Extraction from Metagenomic Sequence datasets. RIEMS assigns every individual read sequence within a dataset taxonomically by cascading different sequence analyses with decreasing stringency of the assignments using various software applications. After completion of the analyses, the results are summarised in a clearly structured result protocol organised taxonomically. The high accuracy and performance of RIEMS analyses were proven in comparison with other tools for metagenomics data analysis using simulated sequencing read datasets.
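The cascading strategy described above, in which each read is tried against successively less stringent analyses until it is assigned, can be sketched generically as follows. The abstract does not name the software RIEMS calls, so the two toy classifiers, the reference table, and every function name below are hypothetical stand-ins.

```python
def cascade_assign(reads, stages):
    """Assign each read at the most stringent stage that can classify it.

    stages is an ordered list of (stage_name, classify) pairs, most stringent
    first; classify(read) returns a taxon label or None. Reads assigned at one
    stage are not passed on to later, less stringent stages.
    """
    assignments = {}
    remaining = list(reads)
    for name, classify in stages:
        still_unassigned = []
        for read in remaining:
            taxon = classify(read)
            if taxon is None:
                still_unassigned.append(read)
            else:
                assignments[read] = (taxon, name)
        remaining = still_unassigned
    return assignments, remaining

# Hypothetical toy reference: sequence -> taxon.
REFERENCES = {"ACGTACGT": "Virus A", "TTGGCCAA": "Bacterium B"}

def exact_match(read):
    """Most stringent toy stage: the read must equal a reference sequence."""
    return REFERENCES.get(read)

def prefix_match(read, k=4):
    """Less stringent toy stage: the first k bases must match a reference."""
    for ref, taxon in REFERENCES.items():
        if read[:k] == ref[:k]:
            return taxon
    return None

reads = ["ACGTACGT", "TTGGAAAA", "GGGGGGGG"]
assigned, unassigned = cascade_assign(
    reads, [("exact", exact_match), ("prefix", prefix_match)]
)
print(assigned)
print(unassigned)
```

Recording the stage name alongside each taxon keeps the stringency of every assignment visible when the results are summarised.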

Conclusions

RIEMS has the potential to fill the gap that still exists with regard to data analysis for metagenomics studies. The usefulness and power of RIEMS for the analysis of genuine sequencing datasets was demonstrated with an early version of RIEMS in 2011 when it was used to detect the orthobunyavirus sequences leading to the discovery of Schmallenberg virus.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0503-6) contains supplementary material, which is available to authorized users.

16.

Background

Mate selection can be used as a framework to balance key technical, cost and logistical issues while implementing a breeding program at a tactical level. The resulting mating lists accommodate optimal contributions of parents to future generations, in conjunction with other factors such as progeny inbreeding, connection between herds, use of reproductive technologies, management of the genetic distribution of nominated traits, and management of allele/genotype frequencies for nominated QTL/markers.

Methods

This paper describes a mate selection algorithm that is widely used and presents an extension that makes it possible to apply constraints on certain matings, as dictated through a group mating permission matrix.
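One way to picture a group mating permission matrix is as a lookup that removes forbidden matings from the candidate set before any optimisation runs, rather than scoring and penalising them afterwards. The sketch below assumes that interpretation; the abstract does not give the algorithm's details, and the data layout, the group labels, and the function name are all hypothetical.

```python
from itertools import product

def permitted_matings(sires, dams, permission):
    """List (sire, dam) pairs whose groups are allowed to mate.

    sires and dams map animal id -> group label. permission maps
    (sire_group, dam_group) -> bool; missing entries count as forbidden.
    """
    return [
        (s, d)
        for (s, sg), (d, dg) in product(sires.items(), dams.items())
        if permission.get((sg, dg), False)
    ]

# Hypothetical example: sires from herdB may not be used on herdA dams.
sires = {"S1": "herdA", "S2": "herdB"}
dams = {"D1": "herdA", "D2": "herdB"}
permission = {
    ("herdA", "herdA"): True,
    ("herdA", "herdB"): True,
    ("herdB", "herdB"): True,
}
print(permitted_matings(sires, dams, permission))
```

Restricting the candidate list in this way means an optimiser never has to evaluate a mating that would be rejected anyway.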

Results

The full algorithm leads to simpler applications and, for the scenario tested, to a computing speed several hundred times faster than the previous strategy of penalising solutions that break constraints.

Conclusions

The much higher speed of the method presented here extends the use of mate selection and enables implementation in relatively large programs across breeding units.

17.
18.

Background

Pea (Pisum sativum L.), a major pulse crop grown for its protein-rich seeds, is an important component of agroecological cropping systems in diverse regions of the world. New breeding challenges imposed by global climate change and new regulations urge pea breeders to undertake more efficient methods of selection and better take advantage of the large genetic diversity present in the Pisum sativum genepool. Diversity studies conducted so far in pea used Simple Sequence Repeat (SSR) and Retrotransposon Based Insertion Polymorphism (RBIP) markers. Recently, SNP marker panels have been developed that will be useful for genetic diversity assessment and marker-assisted selection.

Results

A collection of diverse pea accessions, including landraces and cultivars of garden, field or fodder peas as well as wild peas, was characterised at the molecular level using newly developed SNP markers, as well as SSR markers and RBIP markers. The three types of markers were used to describe the structure of the collection and revealed different pictures of the genetic diversity within the collection. SSR showed the fastest rate of evolution and RBIP the slowest, pointing to their contrasting modes of evolution. SNP markers were then used to predict three phenotypes recorded for the collection: the date of flowering (BegFlo), the number of seeds per plant (Nseed) and thousand seed weight (TSW). Different statistical methods were tested, including the LASSO (Least Absolute Shrinkage and Selection Operator), PLS (Partial Least Squares), SPLS (Sparse Partial Least Squares), Bayes A, Bayes B and GBLUP (Genomic Best Linear Unbiased Prediction) methods, and the structure of the collection was taken into account in the prediction. Despite the limited number of markers used for prediction (331), TSW was reliably predicted.

Conclusion

The development of marker assisted selection has not reached its full potential in pea until now. This paper shows that the high-throughput SNP arrays that are being developed will most probably allow for a more efficient selection in this species.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1266-1) contains supplementary material, which is available to authorized users.

19.

Background

Human leukocyte antigen (HLA) is a group of genes that are extremely polymorphic among individuals and populations and have been associated with more than 100 different diseases and adverse drug effects. HLA typing is accordingly an important tool in clinical application, medical research, and population genetics. We have previously developed a phase-defined HLA gene sequencing method using MiSeq sequencing.

Results

Here we report a simple, high-throughput, and cost-effective sequencing method that includes normalized library preparation and adjustment of DNA molar concentration. We applied long-range PCR to amplify HLA-B for 96 samples followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. After sequencing, we observed low variation in read percentages (0.2% to 1.55%) among the 96 demultiplexed samples. On this basis, all the samples were amenable to haplotype phasing using our phase-defined sequencing method. In our study, a sequencing depth of 800x was necessary and sufficient to achieve full phasing of HLA-B alleles with reliable assignment of the allelic sequence to the 8 digit level.

Conclusions

Our HLA sequencing method, optimized for multiplexing 96 samples, is highly time-effective and cost-effective and is especially suitable for automated multi-sample library preparation and sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-645) contains supplementary material, which is available to authorized users.

20.

Background

Copy number variations (CNVs) are a main source of genomic structural variations underlying animal evolution and production traits. Here, with one pure-blooded Angus bull as reference, we describe a genome-wide analysis of CNVs based on comparative genomic hybridization arrays in 29 Chinese domesticated bulls and examine their effects on gene expression and cattle growth traits.

Results

We identified 486 copy number variable regions (CNVRs), covering 2.45% of the bovine genome, in 24 taurine cattle (Bos taurus), together with 161 in 2 yaks (Bos grunniens) and 163 in 3 buffaloes (Bubalus bubalis). In total, we discovered 605 integrated CNVRs, with more “loss” events than either “gain” or “both” events, and clearly clustered them into three cattle groups. Interestingly, we confirmed their uneven distribution across chromosomes and differences in mitochondrial DNA copy number (gain: taurine; loss: yak and buffalo). Furthermore, we confirmed that approximately 41.8% (253/605) and 70.6% (427/605) of the CNVRs span cattle genes and quantitative trait loci (QTLs), respectively. Finally, we confirmed 6 of 9 selected CNVRs by quantitative PCR, and further demonstrated that CNVR22 had a significantly negative effect on expression of the PLA2G2D gene, and that both CNVR22 and CNVR310 were associated with body measurements in Chinese cattle, suggesting their key effects on gene expression and cattle traits.

Conclusions

The results advance our understanding of CNV as an important genomic structural variation in taurine cattle, yak and buffalo. This study provides a highly valuable resource for research on the evolution and breeding of Chinese cattle.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-480) contains supplementary material, which is available to authorized users.
