-
Cells of strain V4 were Gram-negative, staining red under the standard Gram-staining method (Supplemental Fig. S1a). Biochemical characterization showed that V4 utilized mannitol, inositol, melibiose and raffinose, but not sorbitol, ribol, phenylalanine, ornithine or lysine. Tests for methyl red, indole, urease, H2S and citrate utilization were negative, whereas the V-P test was positive. The motility test in semisolid agar was also positive (Supplemental Fig. S1b). A 100 μg/mL IAA standard tested negative in the indole assay (Supplemental Fig. S1c). SEM images showed that V4 cells were rod shaped, with average dimensions of 1.34−1.5 μm long and 0.32−0.39 μm wide. TEM observation of pure cultures of strain V4 grown in LB medium revealed flagella at both ends of the cell, up to 5 μm long (Fig. 1).
Figure 1.
Scanning electron micrographs (SEM) of strain V4 grown for 10 h on LB solid medium, acceleration voltage 3.0 kV: (a) 8.4 mm × 5.00 k, scale bar 10 μm; (b) 8.4 mm × 30.0 k, scale bar 1 μm. Transmission electron micrographs (TEM) of cells of strain V4 grown for 10 h on LB solid medium, showing flagella at both ends, acceleration voltage 80 kV: (c) × 1.2 k, scale bar 5 μm; (d) × 3.0 k, scale bar 2 μm.
Assembly and annotation of the V4 genome sequence
-
The assembled V4 genome consisted of one circular chromosome of 4,697,109 base pairs (bp) and two plasmids of 160,141 and 71,044 bp. The GC content of the complete genome was 56.88%. The chromosome was predicted to contain 4,189 protein-coding genes, 84 tRNA, 22 rRNA and 31 sRNA genes. It also harbored four gene islands and two prophages. The larger plasmid 1 carried 131 protein-coding genes and no noncoding RNA genes; one putative coding sequence encoded putative components of the Type IV secretion system (T4SS). The smaller plasmid 2 carried only 64 protein-coding genes (Table 1). Assembly quality was assessed with a GC depth map and a read alignment map. The Illumina reads from the original sequencing were aligned to the assembly to generate the GC depth map. The GC content showed a concentrated distribution, indicating the absence of contamination from other species (Supplemental Fig. S2a). To test whether the assembly was circular, the first and last 800 bp of the assembly were joined together and the Illumina reads were aligned to the joined sequence. Complete reads that cross the joining point indicate that the two ends of the assembly connect to form a loop. Most reads in the read alignment map spanned the junction between the end and the start of the assembly, indicating that no sequence was missing at the ends and that the assembly was circular (Supplemental Fig. S2b). In addition, the number of CDS, GC content and total length of our assembly were comparable to those of the E. rhapontici BY21311 complete genome (PRJNA773578) published in 2022[21]. The V4 and E. rhapontici BY21311 genomes contained 4,189 and 4,612 CDS, had GC contents of 56.68% and 54.12%, and had total lengths of 4,697,109 bp and 5.16 Mb, respectively, indicating that the completeness of the V4 genome was similar to that of another member of the genus Erwinia.
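The circularity check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the joined sequence is assumed to be the last 800 bp followed by the first 800 bp, and read alignments are assumed to be available as 0-based, half-open (start, end) intervals on that joined sequence from whatever read mapper is used.

```python
def build_junction(assembly: str, flank: int = 800) -> str:
    """Join the last `flank` bases of a contig to its first `flank` bases.

    Assumption: the junction is ordered end-of-contig + start-of-contig,
    so the joining point sits at index `flank` of the returned string.
    """
    if len(assembly) < 2 * flank:
        raise ValueError("contig is shorter than twice the flank length")
    return assembly[-flank:] + assembly[:flank]


def spans_junction(start: int, end: int, flank: int = 800,
                   min_overhang: int = 1) -> bool:
    """Return True if an alignment [start, end) covers at least
    `min_overhang` bases on each side of the joining point."""
    return start <= flank - min_overhang and end >= flank + min_overhang


def count_spanning(alignments, flank: int = 800, min_overhang: int = 1) -> int:
    """Count alignments (iterable of (start, end)) that cross the junction."""
    return sum(spans_junction(s, e, flank, min_overhang) for s, e in alignments)
```

A high proportion of junction-spanning reads is consistent with a circular molecule; a larger `min_overhang` makes the check stricter by ignoring reads that barely touch the joining point.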
Table 1. General characteristics of the strain V4 genome.
| Feature | Chromosome | Plasmid 1 | Plasmid 2 |
| --- | --- | --- | --- |
| Size (bp) | 4,697,109 | 160,141 | 71,044 |
| G+C content (%) | 56.68% | 53.46% | 54.25% |
| Total protein-coding genes | 4,189 | 131 | 64 |
| Total length (bp) | 4,080,951 | 119,871 | 55,104 |
| G+C content (%) | 57.94% | 54.65% | 56.67% |
| No. of rRNA operons | 22 | 0 | 0 |
| No. of tRNAs | 84 | 0 | 0 |
| No. of sRNAs | 31 | 0 | 0 |
| Gene islands | 4 | 0 | 0 |
| Prophage | 2 | 0 | 0 |
| TNSS | II, III, IV | IV | 0 |

A total of 4,189 putative coding sequences were annotated through diverse protein databases. Of these, 4,162, 3,372 and 2,865 coding sequences were annotated through the NR, Swiss-Prot and KEGG databases, respectively. Three types of secretion system (II, III and IV) were identified from the KEGG and NR annotations (Fig. 2a, b). GO enrichment analysis showed the distribution of gene functions across molecular function, biological process and cellular component (Fig. 2c). In addition, the functions of 3,357 coding sequences, representing 80.14% of all sequences, were categorized by comparison with the COGs. These sequences fell into 21 functional categories, as shown in Fig. 2d. Category R (general functions) was the largest, followed by E (amino acid transport and metabolism), G (carbohydrate transport and metabolism), S (function unknown), K (transcription) and P (inorganic ion transport and metabolism). AntiSMASH predicted that strain V4 contained six secondary metabolite biosynthetic gene clusters: one siderophore cluster, one hserlactone cluster, one thiopeptide cluster and three non-ribosomal peptide synthetase clusters (Supplemental Fig. S3). The circular chromosome and plasmids of V4 are shown in Fig. 3. GC skew, calculated as GC skew = (G − C)/(G + C) with a window size of 10 kb, was used to measure the relative amounts of G and C and to mark the start and end points of the circular chromosome.
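As an illustration of the GC skew formula given above, the following minimal Python sketch computes (G − C)/(G + C) over consecutive non-overlapping windows. The function name and the choice of non-overlapping windows are assumptions made for this example; the study's plotting tool may use a different windowing scheme. Windows containing no G or C are assigned a skew of 0.0 by convention here.

```python
def gc_skew(sequence: str, window: int = 10_000):
    """Yield (window_start, skew) for consecutive non-overlapping windows.

    skew = (G - C) / (G + C); a window with no G or C yields 0.0
    (a convention chosen for this sketch).
    """
    seq = sequence.upper()
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, skew


# Toy example with a 4-bp window instead of 10 kb
for start, skew in gc_skew("GGGCCCCGAAAA", window=4):
    print(start, skew)
```

On a real chromosome, the positions where the windowed skew changes sign are the candidate start and end points mentioned in the text.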
Figure 2.
The V4 chromosome genomic annotation information of gene function based on (a) KEGG, (b) NR, (c) GO and (d) COG databases.
Figure 3.
Circular representation of the (a) chromosome and (b) plasmids of strain V4. Circles (from outside to inside): forward-strand and reverse-strand genes (annotated by gene function based on the COG database), ncRNA (black, tRNA; red, rRNA), GC content (red, above the mean GC content; blue, below the mean GC content), and GC skew (purple, > 0; orange, < 0).
Taxonomic classification of V4
-
BLAST analysis of the 16S rRNA gene revealed that strain V4 had high sequence identity with E. rhapontici strain DQ-03 (99.58%), Erwinia sp. strain 20TX0058 (99.58%), Erwinia sp. CSQXZR5.2.3. (99.51%), Erwinia sp. IMCC25602 (99.58%) and E. aphidicola strain X 001 (99.58%). Next, a preliminary phylogenetic study using the cloned 16S rRNA gene sequence was undertaken to clarify the taxonomic classification of V4. The 16S rRNA gene sequences (length > 1,200 bp) of 34 Erwinia strains and one Herbaspirillum strain were obtained from NCBI. The 34 Erwinia strains belonged to 19 species: psidii, papayae, mallotivora, gerundensis, oleae, tracheiphila, typographi, toletana, iniecta, carotovora, billingiae, rhapontici, aphidicola, persicina, tasmaniensis, piriflorinigrans, amylovora, uzenensis and pyrifoliae. The maximum likelihood phylogenetic tree showed that strain V4 formed a monophyletic clade with members of the species E. aphidicola with 78.8% 16S rRNA gene sequence similarity (Fig. 4a). This clade clustered with members of the species E. persicina with 23.4% 16S rRNA gene sequence similarity to form a branch, and this branch clustered with members of the species E. rhapontici with 98.9% 16S rRNA gene sequence similarity.
Figure 4.
(a) Maximum-likelihood phylogenetic tree based on 16S rRNA gene sequences of strain V4 and strains of the genus Erwinia. Herbaspirillum seropedicae Z67T was used as the outgroup. Bootstrap values, calculated from 1,000 subsets, are shown on the tree branches. (b) Whole-genome-based phylogenomic tree of strain V4, available Erwinia strains and Herbaspirillum seropedicae Z67T. Support values are shown on the tree branches.
With the V4 genome complete, a genome-based phylogenetic analysis was carried out. The peptide sequences of 28 representative strains belonging to 14 Erwinia species, including type strains, and of H. seropedicae Z67T as the outgroup were obtained from NCBI (Supplemental Table S1). The phylogenetic analysis was carried out on homologous gene sequences through OrthoFinder. It also corroborated a close relationship between strain V4 and members of the species E. aphidicola (E. aphidicola JCM 21238T and E. aphidicola X001T), which formed a single clade with 91.3% similarity (Fig. 4b). This clade clustered with members of the species E. rhapontici and E. persicina with 70.2% similarity. These analyses allowed us to conclude that V4 is a strain of E. aphidicola.
Genes involved in plant growth promotion traits present in the genome of V4
-
Strain V4 showed plant growth-promoting ability in plate identification and pot inoculation experiments in a previous study[9]. The plate identification experiments demonstrated that V4 can produce IAA, ACC deaminase and siderophores, fix nitrogen and solubilize phosphorus. Assembly of the V4 genome sequence therefore provides an opportunity to identify the key genes associated with plant growth promotion traits and to compare their copy number variations (Supplemental Tables S2 & S3).
IAA biosynthesis and ACC deaminase
-
Indole-3-acetic acid (IAA) is a plant hormone involved in the regulation of plant growth and development. The V4 genome contained genes of the indole-3-pyruvic acid (IPA) pathway: one copy of ipdC, encoding indole pyruvate decarboxylase, the key rate-limiting enzyme that catalyzes the conversion of indole-3-pyruvic acid to indole-3-acetaldehyde, and two copies of dhaS, encoding aldehyde dehydrogenase, which catalyzes the dehydrogenation of indole-3-acetaldehyde to indole-3-acetic acid. ACC deaminase regulates ethylene production by utilizing exuded ACC, the immediate precursor of ethylene in higher plants. ACC deaminase is a member of the tryptophan synthase β subunit family of PLP-dependent enzymes; 1-aminocyclopropane-1-carboxylate (ACC) deaminase and cysteine desulfhydrase both belong to this family and share a high degree of homology. The ACC deaminase structural gene (acdS) was not found in the V4 genome; however, one copy of the cysteine desulfhydrase gene (dcyD), annotated as 1-aminocyclopropane-1-carboxylate deaminase in the COG database, was present.
Nitrogen and phosphorus acquisition
-
Nitrogenase is a complex metalloenzyme with conserved structure and biological characteristics that converts atmospheric nitrogen into nitrogenous compounds. Two copies of the nitrogen regulation system genes ntrB (nitrogen regulation protein NR(II)) and ntrC (nitrogen regulation protein NR(I)) were present in the V4 genome. However, the nif family of nitrogen-fixing genes encoding nitrogenase was not found. The main mechanism by which Gram-negative bacteria solubilize insoluble mineral phosphate complexes is the direct oxidation of glucose to gluconic acid, catalyzed by glucose dehydrogenase (GDH) with the cofactor pyrroloquinoline quinone (PQQ). One copy each of gcd, encoding GDH, and pqqE, involved in PQQ synthesis, was present in the V4 genome. One copy of pstA and two copies of pitA, related to the high-affinity phosphate transport (Pst) system and the low-affinity phosphate transport (Pit) system, respectively, were also present. Genes for phosphorus transport system-related binding proteins were also identified, with one copy each of phnD2, phnC, phnL and phnK. In addition, one copy each of appA and agp, associated with phytase synthesis, and a phoR-phoP phosphate regulation system, which regulates phytases to initiate the release of phosphate from phytate, were found in the V4 genome.
Siderophores production
-
Siderophore biosynthesis occurs via two pathways: the non-ribosomal peptide synthetase (NRPS) pathway and the NRPS-independent siderophore synthetase (NIS) pathway[27]. In the V4 genome, a siderophore biosynthesis gene cluster belonging to the NIS pathway was found through antiSMASH analysis. In this cluster, the core biosynthesis gene iucC and the additional biosynthesis genes ddc and alcA were responsible for siderophore production, three transport-related genes (mdfA, zunA and zunC) were involved in siderophore transport, and the other genes were hexR, pykA, lpxM and mepM. Genes for siderophore outer membrane receptor proteins (fhuA, fhuE, fepA and tonB) and ABC-type Fe2+/Fe3+-hydroxamate transport proteins (fepB, fhuB, fhuC, fhuD, fepC and fepD) were found in the NRPS pathway.
Others
-
Annotation of the V4 genome also identified candidate genes related to plant growth regulators, plant resistance, extracellular polysaccharide production and heavy metal resistance. One copy of hemA, encoding glutamyl-tRNA reductase involved in 5-aminolevulinic acid (5-ALA) biosynthesis, was found in the V4 genome. One copy of speE encoded spermidine synthase, which is associated with plant resistance. Two copies of galE were related to extracellular polysaccharide biosynthesis. The V4 genome carried one copy each of the copper-transporting ATPase gene copA, the zinc/cadmium/mercury/lead-transporting ATPase gene zntA, and mtnABCDKN, encoding metallothionein, which can bind metals. One copy each of gstB and gst3 and two copies of gstA encoded glutathione S-transferase, which catalyzes the binding of the sulfur group of glutathione. The cysC, cysD, cysH and cysK genes, each present in one copy, function in the sulfate assimilation pathway.
A schematic overview of the main plant growth-promoting traits in V4 is shown in Fig. 5. These traits included IAA biosynthesis, phosphorus acquisition, siderophore production and other plant growth-promoting traits, indicating that V4 could promote plant growth by producing soluble phosphate and plant growth regulators, promoting the uptake of iron ions, and improving plant resistance and heavy metal resistance.
Figure 5.
Schematic overview of the main plant growth-promoting traits in V4. The depicted pathways were predicted from the genomic data of V4. Details are available in Supplemental Table S2. The traits include IAA biosynthesis, phosphorus acquisition, siderophore production and others. Individual pathways are denoted by single-headed arrows; genes are shown in blue italics. Abbreviations: GDH (glucose dehydrogenase); PQQ (pyrroloquinoline quinone); 5-ALA (5-Aminolevulinic acid); ① low-affinity phosphate transport system, Pit; ② high-affinity phosphate transport system, Pst; ③ siderophore transporter; ④ siderophore outer membrane receptor proteins; ⑤ ABC-type Fe2+/Fe3+-hydroxamate transport protein.
Genome mining for V4 endophytic colonization
-
Analysis of the V4 whole genome sequence revealed functional genes potentially associated with colonization, together with their copy number variations, according to the KEGG database and COG function analysis (Supplemental Tables S4 & S5). In detail, genes for flagellar assembly and chemotaxis were connected with motility, and genes for pilus assembly were important for attachment to plant surfaces during host plant colonization.
Motility is an important characteristic of bacteria. V4 was well equipped with flagella to move actively towards plants. Its genome contained three regions of flagellar biosynthesis genes. The flagellar assembly genes flgABCDEFGH1IKLMN, flhABCD together with motAB, and fliACDEFGHIJKMNOPQRST were located in regions I, II and III, respectively. All of these genes were present in two copies, except for flgM, fliK and fliT (one copy each) and fliC (three copies). The flgABCDEFGH1IKLMN genes were involved in the complex basal body component. Hook elongation was controlled by flgE and fliK. The fliC gene was involved in filament assembly, the last step. The products of motAB and fliGMN were responsible for energizing the flagellar motors. Each flagellum is driven by a flagellar motor located at its base, which rotates and drives cell movement.
Chemotaxis enables microorganisms to move towards beneficial substances or away from harmful substances in their environment through flagellar motility. V4 had multiple clusters of chemotaxis genes, including cheA, cheB, cheY, cheW, cheV2, cheR, cheZ and mcpA. The cheABYWRZ and mcpA genes were each present in two copies, whereas cheV2 was present in one copy. The methyl-accepting chemotaxis protein encoded by mcpA, coupled to the sensor histidine kinase cheA via the cheW protein, was conserved as the chemotaxis signal transduction system. cheA then phosphorylated the response regulators cheB and cheY. CheB balanced the activity of the methyltransferase, cheY controlled the flagellar motor, and cheZ promoted cheY-P dephosphorylation and restored the ability of the bacterium to respond to external signals. The CheR protein added methyl groups to the methyl-accepting chemotaxis protein. Additional chemotaxis genes (tar, tsr, tas, tap, tcp, trg, ctpL, dppA, mglB, mocB and rbsB) encoding methyl-accepting chemotaxis proteins were involved in the chemotaxis signal transduction system and controlled flagellar motility directly through the motAB and fliGMN genes that control the flagellar motors. There was one copy each of tap, tcp, trg, aer, ctpL, mglB, mocB and rbsB, two copies each of tsr and dppA, four copies of tar and six copies of tas.
Pili are involved in adhesion to plant surfaces. V4 also had a large number of pilus biosynthesis genes, several in multiple copies: one copy each of afaC, cpxP, lpfB, pmfD, pmfC, spy, ppdD, hofB and yggR, two copies each of yhcA and smf-1, three copies each of yadV, yhcD, yfcS, vfcU and smfA, and four copies of htrE. MrkC, fimD and htrE were homologous genes. All of these genes were located on the chromosome. Plasmid 1 was an F plasmid containing tra genes for sex pilus production. Therefore, V4 could achieve conjugative DNA transfer and plant-bacteria interactions via the F plasmid through a type IV secretion system (T4SS).
Genome mining for signal transduction mechanisms
-
In general, intracellular signal transduction mechanisms, including two-component regulatory systems and quorum sensing, regulate biological processes in microorganisms. Bacterial signal transduction mechanisms mainly refer to two-component regulatory systems. Bacteria have many two-component signal-transduction systems (TCSs), which consist of a histidine protein kinase (HK) acting as a sensor receptor and a response regulator protein containing one or more DNA-binding effector domains that participate in transcriptional regulation to generate responses to environmental changes. In the V4 genome, 142 and 74 genes were annotated as functional genes involved in two-component systems and quorum sensing, respectively, based on the KEGG database. COG annotation assigned 186 genes to category T (signal transduction mechanisms). Combining the annotations from the two databases, 76 two-component system genes and 8 quorum sensing genes belonged to category T (Supplemental Table S6). The copy numbers of these genes are shown in Supplemental Table S7.
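Combining the KEGG and COG annotations amounts to intersecting gene sets. The sketch below shows one way this could be done; the function name, the data layout and the gene identifiers in the usage example are hypothetical placeholders and are not taken from the V4 annotation files.

```python
from typing import Dict, Set


def count_shared(kegg_pathways: Dict[str, Set[str]],
                 cog_category: Set[str]) -> Dict[str, int]:
    """For each KEGG pathway, count the genes that also fall in the
    given COG category (for example, category T)."""
    return {name: len(genes & cog_category)
            for name, genes in kegg_pathways.items()}


# Hypothetical gene identifiers, for illustration only
kegg = {
    "two-component system": {"g1", "g2", "g3"},
    "quorum sensing": {"g3", "g9"},
}
cog_t = {"g2", "g3", "g4"}
print(count_shared(kegg, cog_t))
```

Note that a gene assigned to both KEGG pathways is counted once under each pathway, so the per-pathway counts need not sum to the number of distinct shared genes.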
There were 15 TCSs in the V4 genome: cheA-cheB/cheY phosphotransfer signaling for flagellar chemotaxis; phoR-phoB, pmrB-pmrA, phoQ-phoP and kdpD-kdpE for phosphate regulation, iron (Fe3+) regulation, magnesium (Mg2+) regulation and potassium (K+) transport, respectively; arcB-arcA for regulation of anaerobic metabolism and biofilm formation; envZ-ompR and cpxA-cpxR for osmotic regulation; rcsC-rcsB and ntrB-ntrC for capsular polysaccharide synthesis and nitrogen regulation, respectively; bvgS-bvgA for the production of virulence factors; qseC-qseB for the expression of flagellar and virulence factor genes; rstB-rstA for multi-drug resistance; dcuS-dcuR for controlling gene expression in response to C4-dicarboxylates; and baeS-baeR for regulating gene expression. These two-component systems played a major role in regulating cell activities in V4.
Quorum sensing has been shown to be important for traits such as virulence, biofilm formation and swarming motility in bacteria, and is involved in communication with host plants. In the V4 genome, the eight quorum sensing genes were qseC, qseE, qseB, luxS, crp, glrR and kdpE (one copy each) and pdeR (three copies). qseC-qseB is a two-component regulatory system involved in the regulation of flagella and motility. luxS is the gene for synthesizing autoinducer 2 (AI-2), a small, bioactive, diffusible molecule that can mediate the expression of virulence genes in response to bacterial cell density. The crp gene encodes the cyclic adenosine monophosphate receptor protein; pro-quorum-sensing CRP agonists can inhibit bacterial virulence. KdpE, a KDP operon transcriptional regulatory protein, regulates potassium (K+) transport under stressful conditions and contributes to bacterial survival in the host.
Results of comparative genomic analysis
-
Based on the results of gene family analysis, we further compared the copy number variations of genes associated with plant growth-promoting traits, colonization and signal transduction mechanisms. Both V4 and the endophytic bacterium E. tasmaniensis ET1/99 had genes associated with IAA synthesis and P-solubilization, as well as the dcyD, ntrB and ntrC genes associated with ACC deaminase and nitrogen-fixation abilities. V4, the endophytic bacteria E. tasmaniensis ET1/99 and H. seropedicae Z67, and E. aphidicola 18B1 all had genes for siderophore production. Moreover, V4 had two copies each of the dhaS, pitA, fhuA and fhuD genes. Among other promotion mechanisms, V4 had more gene copies than the endophytic bacterium E. tasmaniensis ET1/99, such as one copy each of gstB and gst3 and two copies of gstA associated with heavy metal resistance, and two copies of galE associated with extracellular polysaccharide production. Compared with the plant pathogenic bacteria E. rhapontici BY21311 and E. persicina B64, V4 had the core siderophore biosynthesis gene iucC, the additional biosynthesis genes ddc and alcA, and the outer membrane receptor protein gene fepA. V4 possessed higher copy numbers of genes associated with flagellar assembly, bacterial chemotaxis, P pilus assembly and two-component systems than E. tasmaniensis ET1/99, H. seropedicae Z67, E. rhapontici BY21311 and E. persicina B64. In addition, both V4 and E. aphidicola 18B1 had higher gene copy numbers for flagellar assembly.
-
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. The V4 genomic sequence reported has been deposited in NCBI database (accession number, PRJNA855316).
-
About this article
Cite this article
Jia H, Yan Y, Ma J, Xia E, Ma R, et al. 2023. Complete genome sequence of a plant growth-promoting endophytic bacterium V4 isolated from tea (Camellia sinensis) leaf. Beverage Plant Research 3:24 doi: 10.48130/BPR-2023-0024
Complete genome sequence of a plant growth-promoting endophytic bacterium V4 isolated from tea (Camellia sinensis) leaf
- Received: 13 July 2023
- Revised: 06 September 2023
- Accepted: 07 September 2023
- Published online: 08 October 2023
Abstract: V4 is a Gram-negative, plant growth-promoting endophytic bacterium that promotes the growth of tea plants. V4 cells are rod shaped, with average dimensions of 1.34−1.5 × 0.32−0.39 μm and flagella at both ends. The complete genome contains one circular chromosome and two plasmids. The chromosome is 4,697,109 bp in size and contains 4,189 protein-coding genes, four gene islands and two prophages. Taxonomic classification suggested that V4 is a strain of Erwinia aphidicola. Genes involved in plant growth promotion traits were found in the V4 genome. Consistent with other plant growth-promoting endophytic bacteria, V4 contains key synthetic genes associated with IAA synthesis, P-solubilization and siderophores. Compared with plant pathogenic bacteria, V4 has additional siderophore biosynthesis genes, suggesting stronger survival ability and a stronger ability to interact with the host plant. In addition, V4 possesses a higher copy number of genes for flagellar assembly, bacterial chemotaxis and P-pilus assembly than the five other bacteria in the comparative genomic analysis, indicating stronger colonization and communication ability with host plants. Analysis of the complete genome sequence of the endophytic bacterium V4 provides novel insights into the relationship between endophytic bacteria and host plants, and suggests many candidate genes for post-genomic experiments.
-
Key words:
- Endophytic bacteria /
- Genome /
- Plant growth-promoting /
- Colonization