- 1Laboratory of Microbiology, Institute of Biology, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- 2Radboud Institute for Molecular Life
Sciences, Centre for Molecular and Biomolecular Informatics, Radboud
University Medical Centre, Nijmegen, Netherlands
- 3Theoretical Biology and Bioinformatics, Utrecht University, Utrecht, Netherlands
- 4Laboratory of Microbiology, Ghent University, Ghent, Belgium
- 5Center of Technology - CT2, SAGE-COPPE, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
Cyanobacteria are major contributors to global biogeochemical cycles.
The genetic diversity among Cyanobacteria enables them to thrive across
many habitats, although only a few studies have analyzed the
association of phylogenomic clades with specific environmental niches. In
this study, we adopted an ecogenomics strategy to delineate the
ecological niche preferences of Cyanobacteria and integrate them with the
genomic taxonomy of these bacteria. First, an appropriate phylogenomic
framework was established using a set of genomic taxonomy signatures
(including a tree based on conserved gene sequences, genome-to-genome
distance, and average amino acid identity) to analyse ninety-nine
publicly available cyanobacterial genomes. Next, the relative abundances
of these genomes were determined throughout diverse global marine and
freshwater ecosystems, using metagenomic data sets. The
whole-genome-based taxonomy of the ninety-nine genomes allowed us to
identify 57 (of which 28 are new genera) and 87 (of which 32 are new
species) different cyanobacterial genera and species, respectively. The
ecogenomic analysis allowed the distinction of three major ecological
groups of Cyanobacteria (named as i. Low Temperature; ii. Low
Temperature Copiotroph; and iii. High Temperature Oligotroph) that were
coherently linked to the genomic taxonomy. This work establishes a new
taxonomic framework for Cyanobacteria in the light of genomic taxonomy
and ecogenomic approaches.
Introduction
Earth is home to nearly one trillion (10^12) microbial species that have evolved over ~4 billion years (Locey and Lennon, 2016).
Cyanobacteria emerged ~3 billion years ago, ushering in Earth's transition
from anoxygenic to oxygenic conditions through photosynthesis (Schirrmeister et al., 2011a).
Throughout their evolution, Cyanobacteria became one of the most
diverse and widely distributed Prokaryotes, occupying many niches within
terrestrial, planktonic, and benthic habitats. Over their long history
they diversified into a broadly heterogeneous group comprising unicellular and
multicellular, photosynthetic and non-photosynthetic (i.e.,
Melainabacteria) (Schirrmeister et al., 2011a; Di Rienzi et al., 2013; Soo et al., 2014), free-living, symbiotic, toxic and predatory organisms (Soo et al., 2015), with genome sizes ranging from 1 to 10 Mb (Shih et al., 2013). Here we consider the phylum Cyanobacteria as consisting only of oxygenic phototrophs.
Cyanobacteria (also known as the Cyanophyceae,
Cyanophyta, cyanoprokaryota, blue-green algae or blue-green bacteria)
share similar metabolic features with eukaryotic algae and have been
named according to the Botanical Code (Kauff and Büdel, 2010). The inclusion of Cyanobacteria in taxonomic schemes of Bacteria was only proposed in 1978 by Stanier et al. (1978), and through time the bacterial taxonomic names have come into conflict with the botanical nomenclature (Oren, 2004; Oren and Garrity, 2014).
More than two decades passed before a Note to General Consideration 5
(1999) was published for Cyanobacteria to be included under the rules of
the International Committee on Systematic Bacteriology
(ICSB)/International Committee on Systematics of Prokaryotes (ICSP) (Tindall, 1999; De Vos and Trüper, 2000; Labeda, 2000). Taxon nomenclature within this group has long been a topic of discussion, but there is currently no consensus (Hoffmann et al., 2005; Oren and Tindall, 2005; Oren et al., 2009; Oren and Ventura, 2017).
As a result, more than 50 genera of Cyanobacteria have been described
since 2000, and many of them remain unrecognized in the List of
Prokaryotic Names with Standing in Nomenclature, LPSN, http://www.bacterio.net (Parte, 2014) or in databases (e.g., NCBI).
The Cyanobacteria form a challenging group for
microbiologists. Their traditional taxonomy based on morphological traits
does not reflect the results of phylogenetic analyses (Rippka et al., 1979; Boone and Castenholz, 2001; Gugger and Hoffmann, 2004; Schirrmeister et al., 2011b; Hugenholtz et al., 2016).
Reliance on morphology has assembled unrelated Cyanobacteria into
polyphyletic species, genera, and higher taxonomic categories that
will require revision (Komárek et al., 2014).
This polyphyly is indicative of the taxonomic mislabeling of many
taxa. 16S rRNA gene sequences have been useful in charting and
characterizing microbial communities (Kozlov et al., 2016),
but this molecule lacks sensitivity to the evolutionary changes that occur
in ecological dynamics, where microbial diversity is organized by
physicochemical parameters (Choudoir et al., 2012; Becraft et al., 2015).
Hence, the processes that shape cyanobacterial communities over space
and time are less well known. A recent study proposed that there should be
170 genera of Cyanobacteria based on 16S rRNA sequences only (Kozlov et al., 2016). Farrant et al. (2016) delineated 121 Prochlorococcus and 15 Synechococcus ecologically significant taxonomic units (ESTUs) in the global ocean using single-copy petB sequences (encoding cytochrome b6) and environmental cues.
High Throughput Sequencing (HTS) has revolutionized the
practice of microbial systematics, providing an informative,
reproducible, and portable tool to delineate species, reconstruct their
evolutionary history, and infer ecogenomic features (Gevers et al., 2005; Konstantinidis and Tiedje, 2005a,b; Garrity and Oren, 2012; Gribaldo and Brochier-Armanet, 2012; Shih et al., 2013; Sutcliffe et al., 2013; Hugenholtz et al., 2016). This approach allows both cultured (Al-saari et al., 2015; Appolinario et al., 2016) and uncultured microorganisms (Iverson et al., 2012; Brown et al., 2015; Hugerth et al., 2015)
to be studied. The latter is especially important because the
cultivation of Cyanobacteria in the laboratory is another hurdle in the study
of this group of bacteria.
Recommendations that nomenclature should agree with and reflect genomic information were stated during the pre-genomic era (Wayne et al., 1987),
since nothing describes an organism better than its genome.
Sequence-based methods to delimit prokaryotic species have emerged to
define and improve cut-off criteria during the genomic era (Gevers et al., 2005; Konstantinidis and Tiedje, 2005a,b; Konstantinidis et al., 2006; Goris et al., 2007; Richter and Rossello-Mora, 2009; Auch et al., 2010a; Thompson et al., 2013a,b; Varghese et al., 2015),
demonstrating a greater discriminatory power. Inexorable advances in
methodologies will incorporate genomics into the taxonomy and
systematics of the prokaryotes, boosting the credibility of taxonomy in
the current post-genomic era (Coenye et al., 2005; Chun and Rainey, 2014). To date, while several groups have been analyzed through a genome-wide view (Gupta et al., 2015; Adeolu et al., 2016; Hahnke et al., 2016; Ahn et al., 2017; Amin et al., 2017; Waite et al., 2017),
many others, such as Cyanobacteria, have faced hurdles. However, a
genomic taxonomy approach has successfully been applied to elucidate the
taxonomic structure of the two cyanobacterial genera, Prochlorococcus and Synechococcus (Thompson et al., 2013a; Coutinho et al., 2016a,b).
Because genomic taxonomy postulates numeric, non-subjective cut-offs for
taxon delimitation, strains were considered to belong to the same species
when they share at least 98.8% 16S rRNA gene sequence similarity, 95%
AAI, and 70% GGD (Konstantinidis and Tiedje, 2005a; Thompson et al., 2013a,b), while species from the same genus form monophyletic branches (Yarza et al., 2008; Qin et al., 2014).
This is in agreement with the concept of species as a discrete,
monophyletic and genomically homogeneous population of organisms that
can be discriminated from other related populations by means of
diagnostic properties (Rossello-Mora and Amann, 2001; Stackenbrandt et al., 2002).
The availability of whole genomes opened the door to in-depth
knowledge of microbial diversity and ecology, where the entire genomic
pool may be applied to understanding the forces that govern community
structure. Ecogenomic analysis offers a reliable and
scalable approach to delineate species and genera in order to
reconstruct their evolution and to draw a global picture of possible
ecological determinants (Di Rienzi et al., 2013; Soo et al., 2014; Spang et al., 2015; Thompson et al., 2015; Anantharaman et al., 2016; Garrity, 2016; Hug et al., 2016; Hugenholtz et al., 2016). Our hypothesis is that a phylogenomic framework will reflect ecologic groups found in nature.
To test this hypothesis, we first established a
phylogenomic framework, using genomic signatures (i.e., a tree based on
conserved gene sequences, average amino acid identity, and
genome-to-genome distance), with the circumscription of species and
genera. We then classified the genomes in three major groups according
to their ecological traits as inferred through metagenomics and
environmental metadata. Finally, we correlated the three disclosed
ecogenomic groups (i. Low Temperature; ii. Low Temperature Copiotroph;
and iii. High Temperature Oligotroph) with the circumscribed species and
genera. We observed that the taxonomic delineation of species and
genera is coherent with the ecogenomic groups.
Materials and Methods
Genome Selection
Cyanobacterial genomes publicly available in January 2016
were retrieved from RefSeq (NCBI Reference Sequence Database), GenBank
and GEBA (Genomic Encyclopedia of Bacteria and Archaea) databases.
Genome completeness was assessed with CheckM (Parks et al., 2015),
and the genomes that were at least 90% complete and assembled in <
500 contigs were used for further analyses. Ninety-nine genomes were
selected based on these criteria, and they are listed in Table 1 (additional information in Table S2).
Annotation and Genomic Taxonomy
All genomes were annotated using Prokka version 1.11 (Seemann, 2014),
with default settings, in order to avoid any possible bias. Genomic
taxonomy of the ninety-nine cyanobacterial genomes was performed
according to Thompson et al. (2013a) and Coutinho et al. (2016a)
and is briefly described here. Average Amino acid Identity (AAI) and
Genome-to-Genome Distance (GGD) were calculated as described previously (Konstantinidis and Tiedje, 2005a; Auch et al., 2010a,b; Meier-Kolthoff et al., 2013). GGD was calculated using the Genome-to-Genome Distance Calculator tool, version 2.1, under recommended settings (Meier-Kolthoff et al., 2013; http://ggdc.dsmz.de/), whereas AAI values were calculated with GenTaxo as previously described (Coutinho et al., 2016a). The cut-offs for species delimitation were ≥95% AAI and ≥70% GGD; the cut-off for genus delimitation was ≥70% AAI.
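The cut-offs stated above can be read as a simple pairwise decision rule. The following Python sketch is only an illustration of that rule, not the GenTaxo or GGDC implementation; the function name `shared_rank`, the strain labels, and the AAI/GGD values are hypothetical.

```python
def shared_rank(aai, ggd, species_aai=95.0, species_ggd=70.0, genus_aai=70.0):
    """Return the lowest rank two genomes share under the stated cut-offs.

    aai and ggd are percentages for one pair of genomes.
    Species: AAI >= 95% and GGD >= 70%. Genus: AAI >= 70%.
    """
    if aai >= species_aai and ggd >= species_ggd:
        return "same species"
    if aai >= genus_aai:
        return "same genus"
    return "different genera"


# Hypothetical pairwise values: (AAI %, GGD %)
pairs = {
    ("strain_A", "strain_B"): (97.2, 81.0),
    ("strain_A", "strain_C"): (96.0, 55.0),
    ("strain_A", "strain_D"): (62.0, 20.0),
}
calls = {pair: shared_rank(aai, ggd) for pair, (aai, ggd) in pairs.items()}
for pair, call in calls.items():
    print(pair, call)
```

Note that a pair passing the AAI species cut-off but failing the GGD cut-off is treated here as congeneric only; how such borderline pairs were handled in practice is not stated in the text.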
Manhattan distances were calculated from the
percentage AAI values of every genome (genome-genome matrix) and were
used as the input for hierarchical clustering with the
hclust() function in R (R Development Core Team, 2011).
This distance indicates how far apart or close together the genomes are.
The heatmap was produced with the heatmap.2 function of the gplots package
in R, with the background color of each panel mapped to the percentage AAI
value.
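As a minimal sketch of the distance step, the Python snippet below computes Manhattan distances between the rows of a small AAI matrix. The analysis itself was run in R; the matrix values and the helper names `manhattan` and `distance_matrix` are hypothetical.

```python
def manhattan(u, v):
    """Sum of absolute differences between two AAI profiles."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(rows):
    """Pairwise Manhattan distances between all rows of a matrix."""
    n = len(rows)
    return [[manhattan(rows[i], rows[j]) for j in range(n)] for i in range(n)]


# Hypothetical symmetric AAI matrix (%); rows and columns are genomes G1..G4
aai = [
    [100.0, 96.0, 72.0, 60.0],
    [96.0, 100.0, 71.0, 61.0],
    [72.0, 71.0, 100.0, 63.0],
    [60.0, 61.0, 63.0, 100.0],
]
dist = distance_matrix(aai)
for row in dist:
    print(row)
```

Each genome is compared through its whole AAI profile against all other genomes, so two genomes are close when they resemble the rest of the data set in the same way, not only when their mutual AAI is high.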
Phylogenetic Analysis
To establish the phylogenetic structure of the phylum
Cyanobacteria, phylogenetic trees were constructed using the 16S rRNA
gene sequences and the concatenated alignments of a set of conserved
genes, most of which encode ribosomal proteins.
Ribosomal RNA Sequences
The small subunit ribosomal RNA (16S rRNA) sequences
from all cyanobacterial strains for which whole genome sequence data are
publicly available (with the exceptions noted below; N = 97), as well as 16S rRNA gene sequences from additional available type strains (N = 14), were analyzed. The sequences were retrieved from the ARB SILVA database (Pruesse et al., 2007; Quast et al., 2013). Whenever sequences were not available, they were retrieved directly from the genomes using the RNAmmer 1.2 Server (Lagesen et al., 2007). Sequences were aligned with MUSCLE v. 3.8 (Edgar, 2004), with default settings, and Gblocks 0.91b (Castresana, 2000; Talavera and Castresana, 2007) was used for alignment curation. Using MEGA 6 (Tamura et al., 2013),
best-fitting nucleic acid substitution models were calculated through
the MLModelTest feature. Models were ranked based on their Bayesian
Information Criterion (BIC) scores as described by Tamura et al. (2013).
The model with the lowest BIC score was selected and used for further
phylogenetic analysis. The phylogenetic inference was obtained using the
Maximum Likelihood method with the Kimura 2-parameter model and
Gamma-distributed rate variation (K2+G) as the nucleotide
substitution model, which was estimated from the data. Branch support
for the tree topology was assessed with 1,000 bootstrap replicates.
The 16S rRNA gene alignments were used to estimate the degree of genetic
distance between strains through the Tajima-Nei method (Tajima and Nei, 1984).
Gloeobacter violaceus PCC 7421 was set as the outgroup in both trees. Trees were visualized with FigTree, version 1.4.2 (Rambaut, 2015). Due to incomplete or partial sequences, Synechococcus sp. CB0101 was omitted from these analyses. Planktothrix mougeotii NIVA-CYA 405 as well as Planktothrix prolifica
NIVA-CYA 540 were not included in the phylogenetic analyses because 16S
rRNA sequences are not currently available for these strains (and not
retrievable from their genomes).
The type strains or type species of each taxon were
included in the 16S phylogenetic tree to confirm the phylogenetic
relatedness of the cyanobacterial genomes. Designations of type strain
or type species were not available for Chamaesiphon minutus PCC6605, Pleurocapsa sp. PCC7319, Rivularia sp. PCC7116, Synechocystis sp. PCC7509, Trichodesmium erythraeum IMS101, Xenococcus sp. PCC7305, cyanobacterium ESFC-1, and cyanobacterium JSC-12. Geitlerinema sp. PCC7105 is the reference strain for marine species of Geitlerinema, and PCC73106 is the reference strain for Gloeocapsa (Sarma, 2012).
Conserved Marker Genes
A tree was generated using 31 conserved gene sequences previously validated as phylogenetic markers for (cyano)bacteria (Wu and Eisen, 2008; recently used by Shih et al., 2013 and Komárek et al., 2014). The sequences of these proteins were mined using the AutoMated Phylogenomic infeRence Application (AMPHORA2) tool (Wu and Scott, 2012),
with default settings for the Bacteria option and a cut-off
value of 1e−10. Individual alignments were performed for each of the 31
gene sets through MUSCLE v. 3.8 with default settings (Edgar, 2004).
All alignments were then concatenated.
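The concatenation step can be sketched as follows in Python. This is an illustration under stated assumptions rather than the pipeline actually used: the function name `concatenate_alignments` and the toy sequences are hypothetical, and padding a missing marker with gap characters is an assumption, since the text does not say how absent genes were handled.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments into one supermatrix.

    alignments: list of dicts mapping taxon name -> aligned sequence.
    taxa: ordered list of all taxon names.
    A taxon missing from a gene alignment is padded with gaps
    (an assumption; the source does not describe this case).
    """
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Two toy marker-gene alignments; genome_2 lacks the second marker
gene_1 = {"genome_1": "MK-L", "genome_2": "MKAL"}
gene_2 = {"genome_1": "TTS"}
supermatrix = concatenate_alignments([gene_1, gene_2], ["genome_1", "genome_2"])
print(supermatrix)
```

Keeping every taxon at the same total length is what allows the concatenated alignment to be passed to a tree-building program as a single matrix.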
A Maximum Likelihood tree was constructed using RAxML v. 7 (Stamatakis, 2006)
and the Dayhoff+G likelihood model. One thousand bootstrap replications
were calculated to evaluate the relative support of the branches. Trees
were visualized with FigTree, version 1.4.2 (Rambaut, 2015).
Abundance of Cyanobacterial Genomes Across Aquatic Environments and Ecological Correlations
Marine and freshwater metagenomes were retrieved to
determine the abundance of the ninety-nine cyanobacterial genomes across the
Earth. A set of 191 marine metagenomes from the Tara Oceans project was
retrieved for analysis along with their associated metadata (Sunagawa et al., 2015).
Sample-associated environmental data were inferred across multiple
depths at the global scale of the Tara Oceans metagenomic sampling: (i) surface
water layer (5 m, s.d. = 0); and (ii) subsurface layer, including deep
chlorophyll maximum zone (71 m, s.d. = 41 m) and mesopelagic zone (600
m, s.d. = 220 m) (Sunagawa et al., 2015).
Eight freshwater metagenomes were retrieved for analysis from the
Caatinga biome microbial community project along with their associated
metadata (Lopes et al., 2016).
Metagenome reads were mapped to a database containing the ninety-nine analyzed cyanobacterial genomes with Bowtie2 (Langmead and Salzberg, 2012) using the --very-sensitive-local and -a options. The abundance of genomes across samples was calculated from the number of mapped reads as described by Iverson et al. (2012).
Metagenomes were compared based on the relative abundances of the
ninety-nine analyzed genomes within them using non-metric
multidimensional scaling (NMDS).
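To make the read-count step concrete, the Python sketch below turns mapped-read counts for one sample into relative abundances. It does not reproduce the procedure of Iverson et al. (2012); dividing by genome length before normalizing is an assumption made here for illustration, and the function name, genome labels, and counts are hypothetical.

```python
def relative_abundance(mapped_reads, genome_lengths):
    """Relative abundance of each genome within one metagenome.

    mapped_reads: dict genome -> number of reads mapped to it.
    genome_lengths: dict genome -> genome length in bp.
    Counts are divided by genome length (an assumption, so that larger
    genomes are not favored) and then scaled to sum to 1.
    """
    density = {g: mapped_reads[g] / genome_lengths[g] for g in mapped_reads}
    total = sum(density.values())
    if total == 0:
        return {g: 0.0 for g in density}
    return {g: d / total for g, d in density.items()}


# Hypothetical counts for one sample
reads = {"genome_A": 2000, "genome_B": 1000, "genome_C": 0}
lengths = {"genome_A": 2_000_000, "genome_B": 1_000_000, "genome_C": 3_000_000}
abundance = relative_abundance(reads, lengths)
print(abundance)
```

In this toy sample genome_A recruits twice as many reads as genome_B but is also twice as long, so the two end up with equal relative abundance.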
Spearman correlation coefficients (R, or Spearman's rho)
were calculated for the abundance of each genome and the levels of
measured environmental parameters across samples. Next, a dissimilarity
matrix of Manhattan distances was calculated based on the Spearman
correlation values of every genome. All correlations were used in this
analysis regardless of the corrected p-value, because non-significant
correlations are still ecologically informative: they indicate weak
associations between microorganisms and environmental parameters.
Finally, this dissimilarity matrix was used as input for hierarchical
clustering using the complete linkage method within the hclust()
function in R. The resulting dendrogram was visually inspected to define
groups (i.e., ecogenomic groups) of organisms with similar correlation
patterns which were named based on the main correlated feature.
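The three steps of this paragraph (Spearman correlation, Manhattan distances between correlation profiles, complete-linkage clustering) can be sketched in plain Python as below. The analysis itself used hclust() in R with visual inspection of the dendrogram; cutting the hierarchy at a fixed number of clusters is a stand-in for that inspection, and all data and function names here are hypothetical.

```python
def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def complete_linkage(dist, n_clusters):
    """Agglomerative clustering; cluster distance = largest pairwise distance."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical environmental parameters across five samples
environment = {"temperature": [5, 10, 15, 20, 25], "nitrate": [9, 7, 5, 3, 1]}
# Hypothetical genome abundances across the same samples
abundance = {
    "cold_1": [50, 40, 30, 20, 10],
    "cold_2": [45, 42, 25, 15, 5],
    "warm_1": [1, 5, 9, 20, 30],
    "warm_2": [2, 3, 10, 18, 40],
}
genomes = list(abundance)
# One correlation profile per genome, one rho per environmental parameter
profiles = [[spearman(abundance[g], env) for env in environment.values()]
            for g in genomes]
dist = [[sum(abs(a - b) for a, b in zip(p, q)) for q in profiles]
        for p in profiles]
groups = complete_linkage(dist, n_clusters=2)
print([[genomes[i] for i in group] for group in groups])
```

Genomes whose abundances rise and fall with the same environmental parameters receive near-identical correlation profiles and therefore merge early in the hierarchy, which is the property the ecogenomic groups rely on.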
The classification was reassessed by integrating the
results of genomic taxonomy, phylogenomic analysis and ecogenomic
signals through a careful comparison.
Results
Phylogenomic Framework Reconstruction
The tree based on conserved marker genes (Figure 1)
revealed a topology with generally well-defined nodes, with
bootstrap support values greater than 50% over 1,000
replicates. The phylogenomic tree (Figure 1) gave a higher resolution than the 16S rRNA phylogenetic analysis (Figure S1 and Table S1), in the sense that strains were better discriminated in the conserved marker genes tree (e.g., Parasynechococcus group, Figure 1 and Figure S1).
Species assignments were considered correct when organisms located
on the same phylogenetic branch as the corresponding type strain or
type species showed 16S rRNA sequence similarity higher than
98.8%, such as Crinalium epipsammum SAG22.89T (Figure S1) and Crinalium epipsammum PCC9333 (Figure S1 and Figure 1).
Figure 1. Phylogenomic tree of the Cyanobacteria phylum
with the proposed new names. Tree construction was performed using 100
genomes (ninety-nine used in this study plus the outgroup), based on a
set of conserved marker genes. The numbers at the nodes indicate
bootstrap values as percentages greater than 50%. Bootstrap tests were
conducted with 1,000 replicates. The unit of measure for the scale bars
is the number of nucleotide substitutions per site. The Gloeobacter violaceus
PCC 7421 sequence was designated as outgroup. Capital letters indicate
environmental source: F, freshwater; M, marine; P, peat bog (sphagnum);
S, soil; T, thermal; and §, other habitat. New names are highlighted in
red. Superscript T indicates type strain or type species. Ecogenomic
groups are depicted in different colors as indicated in the legend: Low
Temperature group; Low Temperature Copiotroph group; and High
Temperature Oligotroph group. Cases depicted in the Results section are
in bold.
Genomic Diversity of Cyanobacteria
In total, we found 57 branches corresponding to genera based on the AAI and GGD analyses (Figure 2).
The cut-offs for genus and species delimitation were ≥70% and ≥95% AAI
similarity, respectively. Thirty-three new genera and 87 species (of
which 28 are new species) were circumscribed. From a total of
ninety-nine genomes used in this study, 69 were previously classified to
the species level, whereas the remaining 30 had incomplete taxonomic
classification (i.e., only sp. or unclassified). In total, 13 genera
(from a total of 33) and 38 species (from a total of 69) were
taxonomically reclassified and/or re-named. Thus, we found that 71 of
all analyzed genomes required reassignment at one or more ranks to
reconcile existing taxonomic classifications with our new genomic
taxonomy (Figure 2 and Figure S1).
Figure 2. Heatmap displaying the AAI levels between
cyanobacterial genomes. The intraspecies limit is assumed as ≥95%,
whereas genera delimitation is assumed as ≥70% (dashed lines) AAI.
Clustering the genomes by AAI similarity was done using a hierarchical
clustering method in R (hclust), based on Manhattan distances. The AAI
values are associated with the respective thermal color scale located at
the bottom left corner of the figure. The proposed new genera and
species names were adopted in this figure.
In the next section, we
highlight four specific cases to exemplify cyanobacterial taxonomic
issues that were resolved through our genome-driven approach (see Figure
S2).
These cases illustrate how the use of genomic taxonomy in Cyanobacteria
provides relevant information (Data Sheet 1, Formal description of new
genera and species).
Case I. Oscillatoria group. Analysis of the five genomes of Oscillatoria distinguished four genera, based on the genomic signatures (i.e., GGD, AAI, 16S, and conserved marker genes tree):
(i) Oscillatoria acuminata PCC 6304 type strain formed a separate group;
(ii) Oscillatoria sp. PCC 10802 formed a separate divergent group, corresponding to a new genus named Somacatellium (S. hydroxylic PCC 10802T);
(iii) Oscillatoria nigroviridis strain PCC 7112 (most closely related to the Microcoleus vaginatus FGP-2 type strain) belongs to the genus Microcoleus (M. nigroviridis PCC 7112T); and
(iv) Oscillatoria strains PCC 6407 and PCC 6506 formed a new genus named Toxinema (T. oscillati PCC 6407T and T. oscillati PCC 6506).
Case II. Leptolyngbya group. The five Leptolyngbya strains were polyphyletic, forming different phylogenetic branches.
Thus, (i) Leptolyngbya boryana PCC 6306T type strain forms a separate group with cyanobacterium JSC-12, while the rest of the Leptolyngbya strains cluster apart;
(ii) strain PCC 7376 forms a new genus named Enugrolinea (E. bermudensis PCC 7376T);
(iii) strain PCC 7375 forms a new genus named Adonisia (A. splendidus PCC 7375T);
(iv) strain PCC 7104 forms a new genus named Allonema (A. longislandicus PCC 7104T); and
(v) strain PCC 6406 forms a new genus named Euryforis (E. eilemai PCC 6406T).
Case III. Arthrospira group. Examination of the four Arthrospira strains indicated that
(i) A. platensis C1 should be considered a new species, named A. sesilensis (A. sesilensis C1T);
(ii) strain PCC 8005 belongs to a new species, named A. nitrilium (A. nitrilium PCC 8005T); and
(iii) the type strain of Arthrospira platensis (PCC 7345) formed a tight cluster along with NIES-39 and Paraca.
Case IV. Synechococcus group. The nine Synechococcus strains split as follows:
(i) S. elongatus PCC 6301T type strain forms a separate group with S. elongatus PCC 7942;
(ii) strain PCC 6312 forms a new genus named Stenotopis (S. californii PCC 6312T);
(iii) strain PCC 7335 belongs to a new genus named Coccusdissimilis (C. mexicanus PCC 7335T);
(iv) strains JA23Ba213 and JA33Ab formed a new genus named Leptococcus (L. springii JA23Ba213T and L. yellowstonii JA33AbT);
(v) strain PCC 7336 formed a new genus named Eurycoccus (E. berkleyi PCC 7336T);
(vi) strain PCC 7502 belonged to a new genus named Leptovivax (L. bogii PCC 7502T); and
(vii) Synechococcus euryhalinus PCC 7002 represents a new genus named Enugrolinea (E. euryhalinus PCC 7002T).
-----
Members of the Low Temperature
group were characterized by positive correlations with the concentration
of nitrogen (N) and phosphorus (P) sources; weak positive correlations with
minimum generation time, silicate and depth; and by negative
correlations with temperature, microbial cell abundance, oxygen
availability, and salinity (Figures 3, 4).
Meanwhile, members of the Low Temperature Copiotroph group were
characterized by strong positive correlations with the concentration of
nitrogen (N) and phosphorus (P); positive correlations (stronger than those
presented by Low Temperature group) with minimum generation time,
silicate and depth; and by negative correlations (also stronger than
those presented by Low Temperature group) with temperature, microbial
cell abundance (in particular with autotroph cell density), oxygen
availability, and salinity (Figures 3, 4).
Finally, members of High Temperature Oligotroph group were
characterized by negative correlations with the concentration of
nitrogen and phosphorus and positive correlations with temperature and
autotroph cell abundance (Figures 3, 4).
As suggested by correlation analyses (Figures 4C,D),
NMDS revealed the Low Temperature Copiotroph group to be more abundant
in cold and eutrophic waters, while the High Temperature Oligotroph
group exhibited the opposite pattern and was more abundant in warm and
oligotrophic environments (Figures 4A,B).
In turn, the Low Temperature group was more abundant under conditions
intermediate between these two extremes, and in
samples with higher cell densities and NO2 concentrations.
We also investigated the abundance of the ecogenomic
groups in freshwater environments. Unfortunately, there is no currently
available large-scale dataset of freshwater metagenomes with associated
metadata comparable to the Tara Oceans dataset. To define freshwater
ecogenomic groups we chose to extrapolate the classification obtained
from the analyses of the marine dataset. In freshwater metagenomes, the
Low Temperature Copiotroph was the dominant group in all the analyzed
samples (Figure S4A).
NMDS of freshwater samples suggested that the Low Temperature group
displayed a preference for higher pH and higher DOC, nitrite and total nitrogen
concentrations, whereas the High Temperature Oligotroph group had a
preference for habitats with higher concentrations of POC, phosphorus,
ammonia and nitrate (Figures S4B,C).
Discussion
The use of HTS technologies and environmental surveys
has allowed studies that link the phylogenomics and ecogenomics of
Cyanobacteria. High-throughput genome sequencing technologies are causing a
revolution in microbial diversity studies. Recent studies have obtained
dozens of new metagenome-assembled genomes from complex environmental
samples (Brown et al., 2015; Hugerth et al., 2015; Almstrand et al., 2016; Haroon et al., 2016; Pinto et al., 2016).
The abundance of these genomes across different environments can now be
inferred from metagenomics, including their metabolic and ecological
potential. It is clear that a new system is required to allow for
precise taxonomic identification of these new genomes.
WGS as the Basic Unit for Cyanobacteria Genomic Taxonomy (CGT)
Comparative genomic studies allow for identification of
sequence groups with high genotypic similarity based on variation in
protein coding genes distributed across the genomes. Analyses of
environmental metagenomes and microbiomes have shown that microbial
communities consist of genotypic clusters of closely related organisms (Farrant et al., 2016).
These groups display cohesive environmental associations and dynamics
that differentiate them from other groups co-existing in the same
environment. In light of these new concepts, dissatisfaction is mounting with the
inability to define the microbial species itself. Evolutionary studies of
closely related bacteria show rapid and highly variable gene fluxes in
evolving microbial genomes, suggesting that extensive gene loss and
horizontal gene transfer leading to innovation are the dominant
evolutionary processes (Batut et al., 2014; Puigbò et al., 2014).
CGT will address the often-observed issue that even closely related
genomes show high gene content variation, which gives rise to phenotypic
variation. CGT is fully adapted to the genomics era, addressing
the needs of its users in microbial ecology and clinical microbiology,
in a new paradigm of open access (Beiko, 2015).
CGT will provide a predictive operational framework for reliable
automated and openly available identification and classification (Thompson et al., 2015).
Proposals for Cyanobacterial Taxonomy
A major gap exists, and grows each day, between the
formal taxonomy of Cyanobacteria and the forest of acronyms and numbers
in the different databases. Indeed, the nameless operational taxonomic
units (OTUs), strains, isolates and WGS sequences (Beiko, 2015; Kozlov et al., 2016)
form the great majority of data in private and public databases. There
is a need to re-examine the Cyanobacteria prokaryote species, taking
into account all recently developed concepts, e.g., the gene flow unit,
OTU, ESTU and Candidate taxonomic unit (CTU) in the context of a
pragmatic genome-based taxonomic scheme. The type species or strain can
be a culture, DNA or a WGS. The CGT system should maintain all of the
existing information, integrating it with new data on DNA, genomes,
isolates/strains, cultured and uncultured, “Candidatus” cases and
reconstructed genomes from metagenomes (Brown et al., 2015; Hugerth et al., 2015).
The international initiatives of GEBA are currently working on
determining the WGS of all type strains of known microbial species to
shorten this gap (more than eleven thousand genomes).
We strongly recommend that modern taxonomy
be based on WGS. The enormous single-gene sequence databases (e.g., for the 16S
rRNA gene) should always be compared with the available
genome-based phylogeny. Studies focusing on one specific taxon or group
cannot disregard the phylogenetic analysis of the whole major
taxon. Taking it into account avoids the inclusion of previously erroneous taxa in
the analysis. Furthermore, the eagerness to propose new names should be
reconsidered. Proposals of new taxa whose phylogenetic relationships
were not firmly established are frequently found (e.g., Rajaniemi et al., 2005).
Ecogenomics and the Delineation of the Ecological Niches of Cyanobacteria
Correlation analysis allowed us to characterize how the
abundance of the analyzed genomes is associated with environmental
parameters in both marine and freshwater habitats. These associations
shed light on ecological interactions taking place within aquatic
habitats that are responsible for delineating the ecological niches of
Cyanobacteria. Our results showed that taxonomic affiliation and niche
occupancy are coherently linked, i.e., closely related species of the
same genus often shared correlation patterns, and consequently were
assigned to the same ecogenomic group.
The identification of specific features responsible for
defining niche occupancy among these organisms depends on extensive
experimental data focusing on both physiological and morphological
features, which is outside of our scope. Nevertheless, we speculate that
some features are likely playing a role in this process:
(1) Transcriptional patterns: The way in which Cyanobacteria regulate gene
expression in response to changing environmental conditions is likely to
play a role in defining which habitats are better suited to the growth
of different species.
(2) Nutrient uptake and utilization: Throughout the aquatic environment a myriad of gradients of nutrient abundance are formed (Stocker and Seymour, 2012).
The cyanobacterial capacity for uptake and utilization of limiting
nutrients (e.g., P, N and Fe) is associated with their ecological niche
occupancy (Thompson et al., 2013a; Coutinho et al., 2016b; Farrant et al., 2016).
Considering that significant associations were detected between the
abundance of the analyzed genomes and the nutrient sources (phosphorus
and nitrogen), we assume that the diversity and efficiency of their
nutrient transporters play a major role in defining the cyanobacterial
affiliation to the proposed ecogenomic groups.
(3) Photosynthetic machinery and efficiency: Cyanobacteria are remarkably
diverse when considering their photosynthetic physiology. Species differ
with regard to their preferred light intensities and wavelengths, which
affect their photosynthetic efficiency (Moore et al., 1998; Ting et al., 2002). They can also be differentiated by their carboxysomes, the sub-cellular structures where carbon fixation takes place (Yeates et al., 2008).
To our knowledge, no study has consistently compared the photosynthetic
yields of all the strains analyzed here; therefore, we cannot determine
whether the proposed ecogenomic groups differ regarding this parameter.
Nevertheless, distinctions regarding their requirements for efficient
photosynthesis are likely linked to their patterns of niche occupancy.
Ecogenomics, Global Changes, and Cyanobacterial Communities
Over the past two centuries, human development has
affected aquatic ecosystems due to nutrient over-enrichment
(eutrophication), hydrologic alterations, global warming and ocean
acidification. Temperature is one of the most important factors
determining the taxonomic composition of marine microbial communities (Sunagawa et al., 2015).
Our data show that temperature is central to regulating the
composition and functioning of cyanobacterial communities. Global
warming can affect growth rates and bloom potentials of many taxa within
this phylum (Fu et al., 2007; Paerl and Huisman, 2008; Flombaum et al., 2013; Pittera et al., 2014). Niche-based models predict an increase in the absolute levels of organisms formerly classified as Prochlorococcus and Synechococcus due to global warming (Flombaum et al., 2013). Consequently, the functioning of the biogeochemical cycles in which these organisms are involved will also be affected (Fu et al., 2007).
Nevertheless, much less is known regarding how global warming could
affect communities of Cyanobacteria aside from these two groups of
organisms.
The ecogenomic groups identified and their associations
with environmental parameters shed light on the potential changes that
communities of Cyanobacteria will undergo following global climate
change. Our results indicate that an increase in temperature will lead
to decreases in the relative abundances of the Low Temperature and Low
Temperature Copiotroph groups, while that of the High Temperature Oligotroph
group will increase, especially for the species Eurycolium neptunis, E. ponticus, E. chisholmi, and E. nereus.
One major impact of this alteration is a possible effect on the degree
of nitrogen fixation mediated by Cyanobacteria, as none of the species
assigned to the High Temperature Oligotroph group are known to fix
nitrogen (Latysheva et al., 2012).
In fact, our data show that higher temperatures are associated with
lower relative abundances of nitrogen-fixing Cyanobacteria of the
genera Trichodesmium and Anabaena (Zehr, 2011).
Both beneficial and deleterious effects of ocean warming and
associated phenomena (e.g., acidification) on the rates of growth and N2 fixation have been reported (Hutchins et al., 2007; Shi et al., 2012; Fu et al., 2014), and recent laboratory and field experiments (Hong et al., 2017) showed that acidification inhibits growth and N2 fixation in T. erythraeum IMS101T
due to a decrease in cytosolic pH and the resulting biochemical cost of proton
pumping across membranes. Rising temperatures might shift cyanobacterial
community composition toward a state where diazotrophs are relatively
less abundant. Because nitrogen is often a limiting nutrient to marine
primary productivity (Tyrrell, 1999; Moore et al., 2013),
alterations in the oceanic levels of nitrogen fixation could affect not
only non-diazotrophic Cyanobacteria but also heterotrophic microbes as
well as the higher tropic levels that are sustained by microorganisms.
Furthermore, our findings suggest that changes in
temperature can affect the contributions of Cyanobacteria to the global
carbon pump (Flombaum et al., 2013; Biller et al., 2015).
For example, the five strongest positive correlations with temperature
within the High Temperature Oligotroph group involve high-light
adapted members of the Eurycolium genus (i.e., strains MIT9312T, MIT9301T, MIT9215, MIT9202T, and AS9601T), which display lower photosynthetic efficiency than their low-light adapted counterparts (Moore et al., 1998; Moore and Chisholm, 1999).
Our results suggest that the relative abundance of high-light adapted
strains would increase as temperatures rise. In turn,
these changes could affect the efficiency of carbon fixation in the
ocean, a change that could also be influenced by the alterations in
nitrogen fixation mentioned above.
Conclusions
The present study is a first attempt at
integrating taxonomy and ecogenomics, offering a compelling new
perspective for the development of Cyanobacteria studies. Our results
show that closely related genomes often share a niche and can be
assigned to the same ecogenomic group. End-users of Cyanobacteria
taxonomy may benefit from a more reproducible and portable taxonomic
scheme. Future studies are needed to expand the evolutionary and
physiological basis for the cyanobacterial niche occupancy, integrating
other important ecological variables such as phage susceptibility, light
utilization strategies, horizontal gene transfer, and inter-species
interactions.
Author Contributions
All authors contributed to the writing of the manuscript.
JW, FC, BD, JS, FT, and CT designed and planned the study. JW and FC
performed the bioinformatics analyses, analyzed the results, and
compiled the data. All authors approved the final version of the
manuscript.
Funding
This work was supported by the National Council for
Scientific and Technological Development (CNPq), Coordination for the
Improvement of Higher Education Personnel (CAPES), and Rio de Janeiro
Research Foundation (FAPERJ).