ARTICLE   Open Access    

Characterization and comparative analysis of the first mitochondrial genome of Michelia (Magnoliaceae)

  • # Authors contributed equally: Suyan Wang, Jing Qiu

  • The genus Michelia has various uses and is valuable in medicine, food, and agriculture. Many plastid genomes (plastomes) of Michelia have been released, but no mitochondrial genomes (mitogenomes) have been reported. In this study, using third-generation HiFi sequencing techniques, the Michelia figo (M. figo) mitogenome was de novo assembled into a circular chromosome spanning 773,377 bp with a total GC content of 46.83%. Sixty-six genes in total were annotated, including 41 protein-coding genes, 21 tRNA genes, and three rRNA genes. The mitogenome contains 1,514 dispersed repeats (> 30 bp), 39 tandem repeats, and 262 simple sequence repeats. Eighty-one fragments originating from the M. figo plastome were detected in its mitogenome, and three tRNA genes (trnD-GUC, trnW-CCA, and trnV-GAC) were completely transferred from the plastome to the mitogenome. Repeat and collinearity analyses of four Magnoliaceae mitogenomes reveal substantial structural variations, a relatively low degree of collinearity, and significant genetic diversity in this family. Phylogenetic analysis showed that two phylogenetic trees constructed separately from mitogenomes and plastomes accurately depict the phylogenetic relationships of M. figo. This study offers the first comprehensive comparative genomic and phylogenetic analysis of the M. figo mitogenome, facilitating the development of genetic markers, taxonomic classification, and resource exploration within the genus Michelia.
  • The Fabaceae family, the third largest family in angiosperms, contains about 24,480 species (WFO, https://wfoplantlist.org), and has historically been an important source of food crops[1−4]. Peas (formerly Pisum sativum L., renamed Lathyrus oleraceus; Pisum spp. will be used hereafter due to historical references to varietal names and subspecies that may not have been fully synonymized) are members of the Fabaceae family and are among the oldest domesticated food crops, with ongoing importance in feeding humans and livestock. Peas originated in Western Asia and the Mediterranean basin, where early finds from Egypt have been dated to ~4500 BCE and, further east, finds from Afghanistan to ~2000 BCE[5], and have since been extensively cultivated worldwide[6,7]. Because peas are rich in protein, dietary fiber, vitamins, and minerals, they have become an important part of people's diets globally[8−10].

    Domesticated peas are the result of long-term human selection and cultivation, and in comparison to wild peas, domesticated peas have undergone significant changes in morphology, growth habits, and yield[11−13]. From the long period of domestication starting in and around Mesopotamia, many diverse lineages of peas have been cultivated and translocated to other parts of the world[14,15]. The subspecies Pisum sativum subsp. sativum is the lineage from which most cultivars have been selected and is known for possessing large, round, or oval-shaped seeds[16,17]. In contrast, the subspecies P. sativum subsp. elatius is a cultivated pea which more closely resembles wild peas and is mainly found in grasslands and desert areas in Europe, Western Asia, and North Africa. Pisum fulvum is native to the Mediterranean basin and the Balkan Peninsula[15,18], and is resistant to pea rust caused by the fungal pathogen Uromyces pisi. Due to its resistance to pea rust, P. fulvum has been cross-bred with cultivated peas in the development of disease-resistant strains[19]. These examples demonstrate the diverse history of the domesticated pea and why further study of the pea pan-plastome could be employed for crop improvement. While studies based on the nuclear genome have been used to explore the domestication history of pea, these approaches do not account for certain factors, such as maternal inheritance. Maternal lineages, which are inherited through plastomes, play a critical role in understanding the full domestication process. A pan-plastome-based approach makes it possible to investigate the maternal genetic contributions and explore evolutionary patterns that nuclear genome studies may overlook. In addition, pan-plastome analysis enables researchers to systematically compare plastome diversity across wild and cultivated species, identifying specific regions of the plastome that contribute to desirable traits. These plastid traits can then be transferred to cultivated crops through introgression breeding or genetic engineering, leading to varieties with improved resistance to disease, environmental stress, and enhanced agricultural performance.

    Plastids are organelles present in plant cells and are the sites in which several vital biological processes take place, such as photosynthesis in chloroplasts[20−24]. Because the origin of plastids is the result of an ancient endosymbiotic event, extant plastids retain a genome (albeit much reduced) from the free-living ancestor[25]. With the advancement of high-throughput DNA sequencing technology, over 13,000 plastid genomes, or plastomes, had been published in public databases by the autumn of 2023[24]. Large-scale comparison of plastomic data at multiple taxonomic levels has shown that plastomic data can provide valuable insights into evolution, interspecies relationships, and population genetic structure. The plastome, in most cases, displays a conserved quadripartite circular genomic architecture with two inverted repeat (IR) regions and two single-copy (SC) regions, referred to as the large single-copy (LSC) and small single-copy (SSC) regions. However, some species have lost one copy of the inverted repeat regions, such as those in Erodium (Geraniaceae family)[26,27] and Medicago (Fabaceae)[28,29]. Compared to previous plastomic studies based on a limited number of plastomes, the construction of pan-plastomes attempts to describe all nucleotide variants present in a lineage through intensive sampling and comparisons. Such datasets can provide detailed insights into the maternal history of a species and help to better understand applied aspects such as domestication history or asymmetries in maternal inheritance, which can help guide future breeding programs. Such pan-plastomes have recently been constructed for several agriculturally important species. A recent study focused on the genus Gossypium[20], using plastome data at the population level to construct a robust map of plastome variation. It explored plastome diversity and population structure relationships within the genus while uncovering genetic variations and potential molecular marker loci in the plastome. In addition, 65 samples were combined to build the pan-plastome of Hemerocallis citrina[30], and 322 samples for the Prunus mume pan-plastome[31]. Before these recent efforts, similar pan-plastomes were also completed for Beta vulgaris[32] and Nelumbo nucifera[33]. However, despite the agricultural importance of peas, no such pan-plastome has been completed.

    In this study, 103 complete pea plastomes were assembled and combined with another 42 published plastomes to construct the pan-plastome. Using these data, the following analyses were conducted to better understand the evolution and domestication history of pea: (1) genome structural comparisons, (2) codon usage bias, (3) simple sequence repeat patterns, (4) phylogenetic analysis, and (5) nucleotide variation of plastomes in peas.

    One hundred and three complete pea plastomes were de novo assembled from public whole-genome sequencing data[34]. For data quality control, FastQC v0.11.5 (www.bioinformatics.babraham.ac.uk/projects/fastqc/) was utilized to assess the quality of the reads and ensure that the data were suitable for assembly. The clean reads were then mapped to a published pea plastome (MW308610) from the GenBank database (www.ncbi.nlm.nih.gov/genbank) as the reference using BWA v0.7.17[35] and SAMtools v1.9[36] to isolate plastome-specific reads from the resequencing data. These plastome-specific reads were then assembled de novo using SPAdes v3.15.2[37]. Genome annotation was conducted with the GeSeq online program (https://chlorobox.mpimp-golm.mpg.de/geseq.html). Finally, the OGDRAW v1.3.1[38] program was utilized to visualize the circular plastome maps with default settings. To better resolve the pan-plastome for peas, 42 complete published pea plastomes were also downloaded from NCBI and combined with the de novo data (Supplementary Table S1).

    To investigate the codon usage in the pan-plastome of pea, we utilized CodonW v.1.4.2 (http://codonw.sourceforge.net) to calculate the Relative Synonymous Codon Usage (RSCU) value of the protein-coding genes (PCGs) longer than 300 bp, excluding stop codons. The RSCU is a metric used to evaluate the relative frequency of usage among synonymous codons encoding the same amino acid. An RSCU value above 1 suggests that the codon is utilized more frequently than the average for a synonymous codon. Conversely, a value below 1 indicates a lower-than-average usage frequency. In addition, the Effective Number of Codons (ENC) and the G + C content at the third position of synonymous codons (GC3s) were also calculated in CodonW v.1.4.2. The ENC and GC3s values were utilized for generating the ENC-GC3s plot, with the expected ENC values (standard curve) calculated according to the formula $\mathrm{ENC} = 2 + \mathrm{GC3s} + 29 / [\mathrm{GC3s}^2 + (1 - \mathrm{GC3s})^2]$[39].
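    The standard curve of the ENC-GC3s plot can be reproduced directly from the formula given in this paragraph. The following is a minimal Python sketch rather than part of the CodonW workflow; the function names `expected_enc` and `delta_enc` and the example values are illustrative assumptions.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC on the standard curve for a given GC3s (a fraction in [0, 1])."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("GC3s must be given as a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def delta_enc(observed_enc: float, gc3s: float) -> float:
    """Difference between expected and observed ENC (ENCexpected - ENC)."""
    return expected_enc(gc3s) - observed_enc


# Hypothetical gene: GC3s = 0.25 and observed ENC = 40.0 (not values from this study).
print(round(expected_enc(0.25), 2))
print(round(delta_enc(40.0, 0.25), 2))
```

    A positive difference places a gene below the standard curve, which is the region usually interpreted as reflecting selection in addition to mutation pressure.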

    The MISA program[40] was utilized to detect simple sequence repeats (SSRs), setting the minimum threshold for repeat units at 10 for mono-motifs, 6 for di-motifs, and 5 for tri-, tetra-, penta-, and hexa-motif microsatellites, respectively.
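    The thresholds listed above can be illustrated with a short regular-expression scan. This is a simplified Python sketch, not the MISA implementation: the names `MIN_REPEATS`, `is_primitive`, and `find_ssrs` are hypothetical, only perfect repeats of A/C/G/T are considered, and compound SSRs are not handled.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(sequence: str):
    """Return (motif, repeat_count, 1-based start) for each perfect SSR found."""
    seq = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of unit_len bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((motif, len(match.group(0)) // unit_len, match.start() + 1))
                pos = match.end()
            else:
                # Skip runs such as 'AA' x 6, which are reported as mono-motifs instead.
                pos = match.start() + 1
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence: an A x 10 run, which meets the mono-motif threshold.
print(find_ssrs("CCG" + "A" * 10 + "GCC"))
```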

    The 145 complete pea plastomes were aligned using MAFFT v7.487[41]. Single nucleotide variant (SNV) sites were extracted from the whole-plastome alignment to derive an SNV-only dataset[42]. A total of 959 SNVs were analyzed using IQ-TREE v2.1[43] with a TVMe + ASC + R2 substitution model, determined by ModelTest-NG[44] based on BIC, and clade support was assessed with 1,000 bootstrap replicates. Vavilovia formosa (MK604478) was chosen as an outgroup. The principal component analysis (PCA) was conducted in TASSEL 5.0[45].
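    The idea of reducing a whole-plastome alignment to its variable sites can be sketched in a few lines of Python. This is not the tool used in the study; the function name `snv_columns` is hypothetical, and the rule of discarding any column that contains a gap or ambiguous base is an assumption made for the example.

```python
def snv_columns(alignment: dict):
    """Return the indices of SNV columns and the SNV-only sequences.

    `alignment` maps accession names to aligned sequences of equal length.
    A column is kept only if every base is A/C/G/T and at least two
    different bases occur (assumed filtering rule for this sketch).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")

    keep = []
    for i in range(length):
        column = {alignment[name][i].upper() for name in names}
        if column <= set("ACGT") and len(column) > 1:
            keep.append(i)

    reduced = {name: "".join(alignment[name][i] for i in keep) for name in names}
    return keep, reduced


# Toy alignment with hypothetical accession names.
toy = {"acc1": "ACGT-A", "acc2": "ACTTAA", "acc3": "ACGTAC"}
columns, snv_only = snv_columns(toy)
print(columns, snv_only)
```

    Because invariant sites are removed, a tree inferred from such a dataset needs an ascertainment bias correction, which is what the ASC term in the substitution model above provides.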

    DnaSP v6[46] was utilized to identify different haplotypes among the plastomes, with gaps and missing data excluded. Haplotype networks were constructed in Popart v1.7[47] using the median-joining algorithm. Haplotype diversity (Hd) for each group was calculated with DnaSP v6[46], and the population differentiation index (FST) between groups was computed from the plastomic SNVs using evolutionary distances based on the Tamura-Nei model.
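    Haplotype diversity is commonly computed with Nei's estimator, Hd = n/(n − 1) × (1 − Σp²), where n is the number of sampled sequences and p is the frequency of each haplotype. The Python sketch below assumes this estimator; the function name `haplotype_diversity` and the haplotype labels are illustrative and are not taken from the study.

```python
from collections import Counter


def haplotype_diversity(haplotypes) -> float:
    """Nei's haplotype diversity for a list of haplotype labels, one per accession."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    frequencies = [count / n for count in Counter(haplotypes).values()]
    return n / (n - 1) * (1.0 - sum(p * p for p in frequencies))


# Hypothetical group of four accessions carrying three haplotypes.
print(round(haplotype_diversity(["Hap1", "Hap1", "Hap2", "Hap3"]), 3))
```

    Under this estimator a group in which every accession carries a different haplotype has Hd = 1, and a group with a single shared haplotype has Hd = 0.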

    In this study, the pan-plastome structure of peas was elucidated (Fig. 1). The length of these plastomes ranged from 120,826 to 122,547 bp, and the overall GC content varied from 34.74% to 34.87%, with an average of 34.8% and minimal variation among the pea plastomes. In contrast to typical plastomes characterized by a quadripartite structure, the plastomes of peas contained a single IR copy.

    Figure 1.  Pea pan-plastome annotation map. Indicated by arrows, genes listed inside and outside the circle are transcribed clockwise and counterclockwise, respectively. Genes are color-coded by their functional classification. The GC content is displayed as black bars in the second inner circle. SNVs, InDels, block substitutions and mixed variants are represented with purple, green, red, and black lines, respectively. Single nucleotide variants (SNVs), block substitutions (BS, two or more consecutive nucleotide variants), nucleotide insertions or deletions (InDels), and mixed sites (which comprise two or more of the preceding three variants at a particular site) are the four categories into which variants are divided.

    A total of 110 unique genes were annotated (Supplementary Table S2), of which 76 were PCGs, 30 were transfer RNA (tRNA) genes, and four were ribosomal RNA (rRNA) genes. Genes containing a single intron include nine protein-coding genes (rpl16, rpl2, ndhB, ndhA, petB, petD, rpoC1, clpP, atpF) and six tRNA genes (trnK-UUU, trnV-UAC, trnL-UAA, trnA-UGC, trnI-GAU, trnG-UCC). Additionally, two protein-coding genes, ycf3 and rps12, were found to contain two introns.

    The codon usage frequency in pea plastome genes is shown in Fig. 2a. The analysis of codon usage in the pea plastome indicated significant biases for specific codons across various amino acids. Here a nearly average usage in some amino acids was observed, such as Alanine (Ala) and Valine (Val). For most amino acids, the usage of different synonymous codons was not evenly distributed. Regarding stop codons, a nearly even usage was found, with 37.0% for TAA, 32.2% for TAG and 30.8% for TGA.

    Figure 2.  (a) The overall codon usage frequency in 51 CDSs (length > 300 bp) from the pea pan-plastome. (b) The heatmap of RSCU values in 51 CDSs (length > 300 bp) from the pea pan-plastome. The x-axis represents different codons and the y-axis represents different CDSs. The tree at the top was constructed based on a Neighbor-Joining algorithm.

    The RSCU heatmap (Fig. 2b) showed different RSCU values for all codons in plastomic CDSs. In general, a usage bias for A/T in the third position of codons was found among CDSs in the pea pan-plastome. The RSCU values among these CDSs ranged from 0 to 4.8. The highest RSCU value (4.8) was found with the CGT codon in the cemA gene, where six synonymous codons exist for Arg but only CGT (4.8) and AGG (1.2) were used in this gene. This explained in large part the extreme RSCU value for CGT, resulting in an extreme codon usage bias in this amino acid.
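    The extreme value for CGT follows directly from the definition of RSCU: the observed count of a codon divided by the count expected if all synonymous codons were used equally. The Python sketch below illustrates this; the function name `rscu` is hypothetical, and the counts of four CGT and one AGG codons are assumed only because this 4:1 ratio reproduces the reported values of 4.8 and 1.2. The actual codon counts in cemA are not given here.

```python
from collections import Counter

# The six synonymous codons for arginine in the standard genetic code.
ARG_CODONS = ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]


def rscu(codon_counts, synonymous_codons):
    """RSCU of each synonymous codon: observed count / (family total / family size)."""
    total = sum(codon_counts.get(codon, 0) for codon in synonymous_codons)
    if total == 0:
        return {codon: 0.0 for codon in synonymous_codons}
    expected = total / len(synonymous_codons)
    return {codon: codon_counts.get(codon, 0) / expected for codon in synonymous_codons}


# Hypothetical counts with the 4:1 ratio implied by the reported RSCU values.
counts = Counter({"CGT": 4, "AGG": 1})
values = rscu(counts, ARG_CODONS)
print({codon: round(value, 2) for codon, value in values.items()})
```

    The RSCU values of a codon family always sum to the number of synonymous codons, so when only two of six arginine codons are used, their values must sum to 6.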

    In the ENC-GC3s plot (Fig. 3), 31 PCGs fell below the standard curve, while 20 PCGs fell above it. Around 12 PCGs lay near the curve, suggesting that these PCGs were under roughly average natural selection and mutation pressure. Overall, the plot indicated that codon usage preferences in the pea pan-plastome were mostly influenced by natural selection. Five genes showed a particularly strong influence of natural selection, with ΔENC (ENCexpected − ENC) values higher than 5: petB (ΔENC = 5.18), psbA (ΔENC = 8.96), rpl16 (ΔENC = 5.62), rps14 (ΔENC = 14.29), and rps18 (ΔENC = 6.46) (Supplementary Table S3).

    Figure 3.  The ENC-GC3s plot for the pea pan-plastome, with GC3s as the x-axis and ENC as the y-axis. The expected ENC values (standard curve) are calculated according to the formula $\mathrm{ENC} = 2 + \mathrm{GC3s} + 29 / [\mathrm{GC3s}^2 + (1 - \mathrm{GC3s})^2]$.

    For SSR detection (Fig. 4), mononucleotide, dinucleotide, and trinucleotide repeats were identified in the pea pan-plastome including A/T, AT/TA, and AAT/ATT. The majority of these SSRs were mononucleotides (A/T), accounting for over 90% of all identified repeats. Additionally, we observed that A/T and AT/TA repeats were present in all pea accessions, whereas only about half of the plastomes contained AAT/ATT repeats. It was also found that the number of A/T repeats exhibited the greatest diversity, while the number of AAT/ATT repeats showed convergence in all plastomes that possessed this repeat.

    Figure 4.  Simple sequence repeats (SSRs) in the pea pan-plastome. The x-axis represents different samples of pea and the y-axis represents the number of repeats in each sample. (a) The number of A/T repeats in the pea pan-plastome. (b) The number of AT/TA and AAT/ATT repeats in the pea pan-plastome.

    To better understand the phylogenetic relationships and evolutionary history of peas, a phylogenetic tree was reconstructed using maximum likelihood for 145 pea accessions utilizing the whole plastome sequences (Fig. 5a). The 145 pea accessions were grouped into seven clades with high confidence. These groups were named the 'PF group', 'PSeI-a group', 'PSeI-b group', 'PA group', 'PSeII group', 'PSeIII group', and the 'PS group'. The naming convention for these groups relates to the majority species names for accessions in each group, where P. fulvum makes up the 'PF group', P. sativum subsp. elatius the 'PSeI-a group', 'PSeI-b group', 'PSeII group', and 'PSeIII group', P. abyssinicum the 'PA group', and P. sativum the 'PS group'. From this phylogenetic tree, it was observed that the 'PSeI-a group' and the 'PSeI-b group' had a close phylogenetic relationship and nearly all accessions in these two groups (except accession DCG0709, which was P. sativum) were identified as P. sativum subsp. elatius. In addition to the P. sativum subsp. elatius found in PSeI, seven accessions from the PS group were identified as P. sativum subsp. elatius.

    The PCA results (Fig. 5b) also confirmed that domesticated varieties P. abyssinicum were closer to cultivated varieties PSeI and PSeII, while PSeIII was more closely clustered with cultivated varieties of P. sativum. A previous study has indicated that P. sativum subsp. sativum and P. abyssinicum were independently domesticated from different P. sativum subsp. elatius populations[34].

    The complete plastome sequences were utilized for haplotype analysis using TCS and median-joining network methods (Fig. 5c). A total of 76 haplotypes were identified in the analysis. The TCS network resolved a pattern similar to that of the other analyses, in that six genetic clusters were resolved, with genetic clusters PS and PSeIII being very closely related. The genetic cluster containing P. fulvum exhibited greater genetic distance from the other genetic clusters. The genetic clusters containing P. abyssinicum (PA) and P. sativum (PS) had lower levels of intracluster differentiation. In the TCS network, Hap30 and Hap31 formed distinct clusters from other haplotypes, such as Hap27, which may account for the genetic difference between the 'PSeI-a group' and 'PSeI-b group'. The network analysis results were consistent with the findings of the phylogenetic tree and the principal component analysis in this study.

    Figure 5.  (a) An ML tree resolved from 145 pea plastomes. (b) PCA analysis showing the first two components. (c) Haplotype network of pea plastomes. The size of each circle is proportional to the number of accessions with the same haplotype. (d) Genetic diversity and differentiation of six clades of peas. Pairwise FST between the corresponding genetic clusters is represented by the numbers above the lines joining two bubbles.

    Among the six genetic clusters, the highest haplotype diversity (Hd) was observed in PSeIII (Hd = 0.99, π = 0.22 × 10⁻³), followed by PSeII (Hd = 0.96, π = 0.43 × 10⁻³), PSeI (Hd = 0.96, π = 0.94 × 10⁻³), PF (Hd = 0.94, π = 0.6 × 10⁻⁴), PS (Hd = 0.88, π = 0.3 × 10⁻⁴), and PA (Hd = 0.70, π = 0.2 × 10⁻⁴). Genetic differentiation was evaluated between each genetic cluster by calculating FST values. As shown in Fig. 5d, except for the relatively lower population differentiation between PS and PSeIII (FST = 0.54), and between PSeI and PSeII (FST = 0.59), the FST values between the remaining clades ranged from 0.7 to 0.9. The highest population differentiation was observed between PF and PA (FST = 0.98). The FST values between PSeI and different genetic clusters were relatively low, including PSeI and PF (FST = 0.80), PSeI and PS (FST = 0.77), PSeI and PSeIII (FST = 0.72), PSeI and PSeII (FST = 0.59), and PSeI and PA (FST = 0.72).

    To further determine the nucleotide variations in the pea pan-plastome, 145 plastomes were aligned and nucleotide differences analyzed across the dataset. A total of 1,579 variations were identified from the dataset (Table 1), including 965 SNVs, 24 Block Substitutions, 426 InDels, and 160 mixed variations of these three types. Among the SNVs, transitions were more frequent than transversions, with 710 transitions and 247 transversions. In transitions, T to G and A to C had 148 and 139 occurrences, respectively, while in transversions, G to A and C to T had 91 and 77 occurrences, respectively.

    Table 1.  Nucleotide variation in the pan-plastome of peas.
    Variant   Total   SNV   Substitution   InDel   Mix (InDel, SNV)   Mix (InDel, SUB)
    Total     1,576   965   24             426     156                4
    CDS       734     445   6              176     103                4
    Intron    147     110   8              29      0                  0
    tRNA      20      15    1              4       0                  0
    rRNA      11      3     0              6       2                  0
    IGS       663     392   9              211     51                 0

    When variants were analyzed by their position relative to genes (Fig. 6), there were 731 variations in CDSs, accounting for 46.3% of the total variations, including 443 SNVs (60.6%), six block substitutions (0.83%), 175 InDels (23.94%), and 107 mixed variations (14.64%). There were 104 variants in introns, accounting for 6.59% of the total variations, including 78 SNVs (75%), seven block substitutions (6.73%), and 19 InDels (18.27%). Intergenic spacers (IGS) contained 660 variations, accounting for 41.8% of the total variations, including 394 SNVs (59.7%), nine block substitutions (1.36%), 207 InDels (31.36%), and 50 mixed variations (7.58%). The tRNA regions contained 63 variants, accounting for 3.99% of the total variations, including 47 SNVs (74.6%) and 14 InDels (22.2%). The highest number of variants were detected in the IGS regions, while the lowest were found in introns. Among CDSs, accD (183) had the highest number of variations. In introns, rpl16 (18) and ndhA (16) had the most variants. In the IGS regions, ndhD-trnI-CAU (73) and trnL-UAA-trnT-UGU (44) possessed the greatest number of variants.

    Figure 6.  Variant locations within the pea pan-plastome categorized by genic position (Introns, CDS, and IGS).

    Finally, examples of some genes with typical variants were provided to better illustrate the sequence differences between clades (Fig. 7). For example, the present analysis revealed that the ycf1 gene exhibited a high number of variant loci, which included unique single nucleotide variants (SNVs) specific to the P. abyssinicum clade. Additionally, a unique InDel variant belonging to P. abyssinicum was identified. Similar unique SNVs and InDels were also found in other genes, such as matK and rpoC2, distinguishing the P. fulvum clade from others. These unique SNVs and InDels could serve as DNA barcodes to distinguish different maternal lineages of peas.

    Figure 7.  Examples of variant sites.

    The present research combined 145 pea plastomes to construct a pan-plastome of peas. Compared to single plastomic studies, pan-plastome analyses across a species or genus provide a higher-resolution understanding of phylogenetic relationships and domestication history. Most plastomes in plants possess a quadripartite circular structure with two inverted repeat (IR) regions and two single-copy regions (LSC and SSC)[20−24]. However, the complete loss of one of the IR regions was observed in the pea plastome, a feature that is well known among species of the inverted repeat-lacking clade (IRLC) in Fabaceae. The loss of IRs has been documented in detail in other genera such as Erodium (Geraniaceae family)[26,27] and Medicago (Fabaceae family)[28,29]. This phenomenon, although not commonly observed, constitutes a significant event in the evolutionary trajectories of certain plant lineages[26]. Such large-scale changes in plastome architecture are likely driven in part by a combination of selective pressures and genetic drift[48]. In the pea pan-plastome, it was also found that, compared to some plants with IR regions, the length of the plastomes was much shorter, and the overall GC content was lower. This was attributed to the loss of one IR with high GC content.

    Repetitive sequences are an important part of the evolution of plastomes and can be used to reconstruct genealogical relationships. Mononucleotide SSRs are consistently abundant in plastomes, with many studies identifying them as the most common type of SSR[49−52]. Among these, while C/G-type SSRs may dominate in certain species[53,54], A/T types are more frequently observed in land plants. The present research was consistent with these previous conclusions, showing an A/T proportion exceeding 90% (Fig. 4). Due to their high rates of mutation, SSRs are widely used to study phylogenetic relationships and genetic variation[55,56]. Additionally, like other plants, pea plastome genes have a high frequency of A/Ts in the third codon position. This preference is related to the higher AT content common among most plant plastomes and Fabaceae plastomes in particular with their single IRs[57,58]. The AT-rich regions are often associated with easier unwinding of DNA during transcription and potentially more efficient and accurate translation processes[59]. The preference for A/T in third codon positions may also be influenced by tRNA availability, as the abundance of specific tRNAs that recognize these codons can enhance the efficiency of protein synthesis[60,61]. However, not all organisms exhibit this preference for A/T-ending codons. For instance, many bacteria have GC-rich genomes and thus show a preference for G/C-ending codons[62−64]. This variation in codon usage bias reflects the differences in genomic composition and the evolutionary pressures unique to different lineages.

    This study also comprehensively examined the variant loci of the pea pan-plastome. Among these variant sites, some could potentially serve as DNA barcode sites for specific lineages of peas, such as ycf1, rpoC2, and matK. Both ycf1 and matK have been widely used as DNA barcodes in many species[65−68], as they are hypervariable. Researchers now have a much deeper understanding of the crucial role plastomes have played in plant evolution[69−71]. By generating a comprehensive map of variant sites, future researchers can now more effectively trace differences in plastotypes to physiological and metabolic traits for use in breeding elite cultivars.

    The development of a pan-plastome for peas provides new insights into the maternal domestication history of this important food crop. Based on the phylogenetic analysis in this study, we observed a clear differentiation between wild and cultivated peas, with P. fulvum being the earliest diverging lineage, which was consistent with former research[34]. The ML tree (Fig. 5a) indicated that cultivated peas had undergone at least two independent domestications, namely from the PA and PS groups, which is also consistent with former research[34]. However, as the present study added several accessions over the previous study and plastomic data were utilized, several differences were also found[34], such as the resolution of two groups, referred to as the PSeI-a group and PSeI-b group, which branched between the PA group and PF group. Previous research based on nuclear data only[34] and with fewer accessions showed that the PA group and PF group were closely related in phylogeny, with no PSeI group appearing between them. One possible explanation is that the PSeI-a and PSeI-b lineages represent the capture and retention of a plastome from a now-extinct lineage, while backcrossing to modern cultivars has obscured this signal in the nuclear genomic datasets. However, procedural explanations such as incorrectly identified accessions might have also resulted in such patterns. In either case, the presence of these plastomes in the cultivated pea gene pool should be explored for possible associations with traits such as disease resistance and hybrid incompatibility. This finding underscores the complexity of the domestication process and highlights the role of hybridization and selection in shaping the genetic landscape of cultivated peas. As such, future studies integrating data from the nuclear genome, mitogenome, and plastome should provide deeper insights into the phylogeny and domestication of peas. This pan-plastome research, encompassing a variety of cultivated taxa, will also support the development of elite varieties in the future.

    This study newly assembled 103 complete pea plastomes. These plastomes were combined with 42 published pea plastomes to construct the first pan-plastome of peas. The length of pea plastomes ranged from 120,826 to 122,547 bp, with the GC content varying from 34.74% to 34.87%. The codon usage pattern in the pea pan-plastome displayed a strong bias for A/T in the third codon position. In addition, the codon usage of petB, psbA, rpl16, rps14, and rps18 was shown to be strongly influenced by natural selection. Three types of SSRs were detected in the pea pan-plastome, including A/T, AT/TA, and AAT/ATT. From phylogenetic analysis, seven well-supported clades were resolved from the pea pan-plastome. The genes ycf1, rpoC2, and matK were found to be suitable for DNA barcoding due to their hypervariability. The pea pan-plastome provides a valuable supportive resource for future breeding and selection research, considering the central role chloroplasts play in plant metabolism as well as the association of plastotype with important agronomic traits such as disease resistance and interspecific compatibility.

  • The authors confirm contribution to the paper as follows: study conception and design: Wang J; data collection: Kan J; analysis and interpretation of results: Kan J, Wang J; draft manuscript preparation: Kan J, Wang J, Nie L; project organization and supervision: Tiwari R, Wang M, Tembrock L. All authors reviewed the results and approved the final version of the manuscript.

  • The annotation files of newly assembled pea plastomes were uploaded to the Figshare website (https://figshare.com/, doi: 10.6084/m9.figshare.26390824).

  • This study was funded by the Guangdong Pearl River Talent Program (Grant No. 2021QN02N792) and the Shenzhen Fundamental Research Program (Grant No. JCYJ20220818103212025). This work was also funded by the Science Technology and Innovation Commission of Shenzhen Municipality (Grant No. RCYX20200714114538196) and the Innovation Program of Chinese Academy of Agricultural Sciences. We are also particularly grateful for the services of the High-Performance Computing Cluster in the Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences.

  • The authors declare that they have no conflict of interest.

  • Supplementary Table S1 The relative synonymous codon usage of amino acids in the mitogenome of Michelia figo, Magnolia biondii, Magnolia officinalis, and Liriodendron tulipifera.
    Supplementary Table S2 The frequency of codon usage in the mitogenome of Michelia figo, Magnolia biondii, Magnolia officinalis, and Liriodendron tulipifera.
    Supplementary Table S3 Dispersed repeat sequences identified in the Michelia figo mitogenome.
    Supplementary Table S4 Tandem repeat sequences identified in the Michelia figo mitogenome.
    Supplementary Table S5 The simple sequence repeats identified in the Michelia figo mitogenome.
    Supplementary Table S6 Dispersed repeat sequences identified in the Magnolia biondii mitogenome.
    Supplementary Table S7 Dispersed repeat sequences identified in the Magnolia officinalis mitogenome.
    Supplementary Table S8 Tandem repeat sequences identified in the Magnolia biondii mitogenome.
    Supplementary Table S9 Tandem repeat sequences identified in the Magnolia officinalis mitogenome.
    Supplementary Table S10 The simple sequence repeats identified in the Magnolia biondii mitogenome.
    Supplementary Table S11 The simple sequence repeats identified in the Liriodendron tulipifera mitogenome.
    Supplementary Table S12 The homologous DNA fragment between mitogenome and cpgenome of Michelia figo.
    Supplementary Table S13 The collinear blocks between mitogenomes of Michelia figo and Magnolia officinalis.
    Supplementary Table S14 The collinear blocks between mitogenomes of Michelia figo and Magnolia biondii.
    Supplementary Table S15 The collinear blocks between mitogenomes of Magnolia biondii and Liriodendron tulipifera.
    Supplementary Table S16 Genes used for phylogenetic analysis.
    Supplementary Table S17 The genomic information of the species used in this study.
    Supplementary Fig. S1 The map of genes containing introns. This diagram illustrates the distribution of cis- and trans-introns.
    Supplementary Fig. S2 Heat maps of PCG and intron contents among 15 mitogenomes. (a) Comparison of PCG contents among 15 mitogenomes. The gene numbers are shown on the top right. (b) Comparison of intron contents among 15 mitogenomes. The intron numbers are shown on the top right.
  • [1]

    Roger AJ, Muñoz-Gómez SA, Kamikawa R. 2017. The origin and diversification of mitochondria. Current Biology 27(21):R1177−R1192

    doi: 10.1016/j.cub.2017.09.015


    [2]

    Dyall SD, Brown MT, Johnson PJ. 2004. Ancient invasions: from endosymbionts to organelles. Science 304:253−57

    doi: 10.1126/science.1094884


    [3]

    Yu SB, Pekkurnaz G. 2018. Mechanisms orchestrating mitochondrial dynamics for energy homeostasis. Journal of Molecular Biology 430:3922−41

    doi: 10.1016/j.jmb.2018.07.027


    [4]

    van Loo G, Saelens X, van Gurp M, MacFarlane M, Martin SJ, et al. 2002. The role of mitochondrial factors in apoptosis: a Russian roulette with more than one bullet. Cell Death & Differentiation 9:1031−42

    doi: 10.1038/sj.cdd.4401088


    [5]

    Bi C, Sun N, Han F, Xu K, Yang Y, et al. 2024. The first mitogenome of Lauraceae (Cinnamomum chekiangense). Plant Diversity 46:144−48

    doi: 10.1016/j.pld.2023.11.001


    [6]

    Ma Q, Wang Y, Li S, Wen J, Zhu L, et al. 2022. Assembly and comparative analysis of the first complete mitochondrial genome of Acer truncatum Bunge: a woody oil-tree species producing nervonic acid. BMC Plant Biology 22:29

    doi: 10.1186/s12870-021-03416-5


    [7]

    Han F, Qu Y, Chen Y, Xu LA, Bi C. 2022. Assembly and comparative analysis of the complete mitochondrial genome of Salix wilsonii using PacBio HiFi sequencing. Frontiers in Plant Science 13:1031769

    doi: 10.3389/fpls.2022.1031769


    [8]

    Wang XD, Xu CY, Zheng YJ, Wu YF, Zhang YT, et al. 2022. Chromosome-level genome assembly and resequencing of camphor tree (Cinnamomum camphora) provides insight into phylogeny and diversification of terpenoid and triglyceride biosynthesis of Cinnamomum. Horticulture Research 9:uhac216

    doi: 10.1093/hr/uhac216


    [9]

    Morley SA, Nielsen BL. 2017. Plant mitochondrial DNA. Frontiers in Bioscience-Landmark (FBL) 22:1023−32

    doi: 10.2741/4531


    [10]

    Wynn EL, Christensen AC. 2019. Repeats of unusual size in plant mitochondrial genomes: identification, incidence and evolution. G3: Genes, Genomes, Genetics 9:549−59

    doi: 10.1534/g3.118.200948


    [11]

    Møller IM, Rasmusson AG, Van Aken O. 2021. Plant mitochondria – past, present and future. The Plant Journal 108:912−59

    doi: 10.1111/tpj.15495


    [12]

    Skippington E, Barkman TJ, Rice DW, Palmer JD. 2015. Miniaturized mitogenome of the parasitic plant Viscum scurruloideum is extremely divergent and dynamic and has lost all nad genes. Proceedings of the National Academy of Sciences of the United States of America 112:E3515−E24

    doi: 10.1073/pnas.1504491112


    [13]

    Putintseva YA, Bondar EI, Simonov EP, Sharov VV, Oreshkova NV, et al. 2020. Siberian larch (Larix sibirica Ledeb.) mitochondrial genome assembled using both short and long nucleotide sequence reads is currently the largest known mitogenome. BMC Genomics 21:654

    doi: 10.1186/s12864-020-07061-4


    [14]

    Bi C, Qu Y, Hou J, Wu K, Ye N, et al. 2022. Deciphering the Multi-Chromosomal Mitochondrial Genome of Populus simonii. Frontiers in Plant Science 13:914635

    doi: 10.3389/fpls.2022.914635


    [15]

    Logacheva MD, Schelkunov MI, Fesenko AN, Kasianov AS, Penin AA. 2020. Mitochondrial genome of Fagopyrum esculentum and the genetic diversity of extranuclear genomes in buckwheat. Plants 9:618

    doi: 10.3390/plants9050618


    [16]

    Adams KL, Qiu YL, Stoutemyer M, Palmer JD. 2002. Punctuated evolution of mitochondrial gene content: High and variable rates of mitochondrial gene loss and transfer to the nucleus during angiosperm evolution. Proceedings of the National Academy of Sciences of the United States of America 99:9905−12

    doi: 10.1073/pnas.042694899


    [17]

    Filip E, Skuza L. 2021. Horizontal gene transfer involving chloroplasts. International Journal of Molecular Sciences 22:4484

    doi: 10.3390/ijms22094484


    [18]

    Rodríguez-Moreno L, González VM, Benjak A, Carmen Martí M, Puigdomènech P, et al. 2011. Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin. BMC Genomics 12:424

    doi: 10.1186/1471-2164-12-424


    [19]

    Veltjen E, Testé E, Palmarola Bejerano A, et al. 2022. The evolutionary history of the Caribbean magnolias (Magnoliaceae): Testing species delimitations and biogeographical hypotheses using molecular data. Molecular Phylogenetics and Evolution 167:107359

    doi: 10.1016/j.ympev.2021.107359


    [20]

    Law YW. 1984. A preliminary study on the taxonomy of the family Magnoliaceae. Journal of Systematics and Evolution 22(2):89−109


    [21]

    Taprial S. 2015. A review on phytochemical and pharmacological properties of Michelia champaca Linn. family: Magnoliaceae. International Journal of Pharmaceutical Sciences and Research 2:430−36


    [22]

    Shang C, Hu Y, Deng C, Hu K. 2002. Rapid determination of volatile constituents of Michelia alba flowers by gas chromatography-mass spectrometry with solid-phase microextraction. Journal of Chromatography A 942:283−88

    doi: 10.1016/S0021-9673(01)01382-6


    [23]

    Khan MR, Kihara M, Omoloso AD. 2002. Antimicrobial activity of Michelia champaca. Fitoterapia 73:744−48

    doi: 10.1016/S0367-326X(02)00248-4


    [24]

    Cheng KK, Nadri MH, Othman NZ, Rashid SNAA, Lim YC, et al. 2022. Phytochemistry, bioactivities and traditional uses of Michelia × alba. Molecules 27:3450

    doi: 10.3390/molecules27113450


    [25]

    Chericoni S, Testai L, Campeol E, Calderone V, Morelli I, et al. 2004. Vasodilator activity of Michelia figo Spreng (Magnoliaceae) by in vitro functional study. Journal of Ethnopharmacology 91:263−66

    doi: 10.1016/j.jep.2003.12.021


    [26]

    Zhai M. 2020. The complete chloroplast genome sequence of Michelia figo based on landscape design, and a comparative analysis with other Michelia species. Mitochondrial DNA Part B 5:2723−24

    doi: 10.1080/23802359.2020.1788446


    [27]

    Hinsinger DD, Strijk JS. 2017. The chloroplast genome sequence of Michelia alba (Magnoliaceae), an ornamental tree species. Mitochondrial DNA Part B 2:9−10

    doi: 10.1080/23802359.2016.1275850


    [28]

    Li Y, Zhou M, Wang L, Wang J. 2021. The characteristics of the chloroplast genome of the Michelia chartacea (Magnoliaceae). Mitochondrial DNA Part B 6:493−95

    doi: 10.1080/23802359.2020.1871432


    [29]

    Sima Y, Li Y, Yuan X, Wang Y. 2020. The complete chloroplast genome sequence of Michelia chapensis Dandy: an endangered species in China. Mitochondrial DNA Part B 5:1594−95

    doi: 10.1080/23802359.2020.1742619


    [30]

    Arseneau JR, Steeves R, Laflamme M. 2017. Modified low-salt CTAB extraction of high-quality DNA from contaminant-rich tissues. Molecular Ecology Resources 17:686−93

    doi: 10.1111/1755-0998.12616


    [31]

    Shi Q, Tian D, Wang J, Chen A, Miao Y, et al. 2023. Overexpression of miR390b promotes stem elongation and height growth in Populus. Horticulture Research 10:uhac258

    doi: 10.1093/hr/uhac258


    [32]

    Bi C, Shen F, Han F, Qu Y, Hou J, et al. 2024. PMAT: an efficient plant mitogenome assembly toolkit using low coverage HiFi sequencing data. Horticulture Research 11:uhae023

    doi: 10.1093/hr/uhae023


    [33]

    Dong S, Chen L, Liu Y, Wang Y, Zhang S, et al. 2020. The draft mitochondrial genome of Magnolia biondii and mitochondrial phylogenomics of angiosperms. PLOS ONE 15:e0231020

    doi: 10.1371/journal.pone.0231020


    [34]

    Wick RR, Schultz MB, Zobel J, Holt KE. 2015. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics 31:3350−52

    doi: 10.1093/bioinformatics/btv383


    [35]

    Li J, Ni Y, Lu Q, Chen H, Liu C. 2024. PMGA: A plant mitochondrial genome annotator. Plant Communications 00:101191

    doi: 10.1016/j.xplc.2024.101191


    [36]

    Chen Y, Ye W, Zhang Y, Xu Y. 2015. High speed BLASTN: an accelerated MegaBLAST search tool. Nucleic Acids Research 43:7762−68

    doi: 10.1093/nar/gkv784


    [37]

    Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Research 25:955−64

    doi: 10.1093/nar/25.5.955


    [38]

    Zhang X, Chen H, Ni Y, Wu B, Li J, et al. 2024. Plant mitochondrial genome map (PMGmap): a software tool for the comprehensive visualization of coding, noncoding and genome features of plant mitochondrial genomes. Molecular Ecology Resources 24:e13952

    doi: 10.1111/1755-0998.13952


    [39]

    Beier S, Thiel T, Münch T, Scholz U, Mascher M. 2017. MISA-web: a web server for microsatellite prediction. Bioinformatics 33:2583−85

    doi: 10.1093/bioinformatics/btx198


    [40]

    Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 27:573−80

    doi: 10.1093/nar/27.2.573


    [41]

    Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, et al. 2001. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Research 29:4633−42

    doi: 10.1093/nar/29.22.4633


    [42]

    Mao J, Wei S, Chen Y, Yang Y, Yin T. 2023. The proposed role of MSL-lncRNAs in causing sex lability of female poplars. Horticulture Research 10:uhad042

    doi: 10.1093/hr/uhad042


    [43]

    Chen C, Wu Y, Li J, Wang X, Zeng Z, et al. 2023. TBtools-II: a “one for all, all for one” bioinformatics platform for biological big-data mining. Molecular Plant 16:1733−42

    doi: 10.1016/j.molp.2023.09.010


    [44]

    Marçais G, Delcher AL, Phillippy AM, Coston R, Salzberg SL, et al. 2018. MUMmer4: A fast and versatile genome alignment system. PLOS Computational Biology 14:e1005944

    doi: 10.1371/journal.pcbi.1005944


    [45]

    He W, Yang J, Jing Y, Xu L, Yu K, et al. 2023. NGenomeSyn: an easy-to-use and flexible tool for publication-ready visualization of syntenic relationships across multiple genomes. Bioinformatics 39:btad121

    doi: 10.1093/bioinformatics/btad121


    [46]

    Katoh K, Rozewicki J, Yamada KD. 2019. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics 20:1160−66

    doi: 10.1093/bib/bbx108


    [47]

    Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972−73

    doi: 10.1093/bioinformatics/btp348


    [48]

    Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution 32:268−74

    doi: 10.1093/molbev/msu300


    [49]

    Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688−90

    doi: 10.1093/bioinformatics/btl446


    [50]

    Letunic I, Bork P. 2021. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Research 49:W293−W296

    doi: 10.1093/nar/gkab301


    [51]

    Wu ZQ, Liao XZ, Zhang XN, Tembrock LR, Broz A. 2022. Genomic architectural variation of plant mitochondria—A review of multichromosomal structuring. Journal of Systematics and Evolution 60:160−68

    doi: 10.1111/jse.12655


    [52]

    Sloan DB, Alverson AJ, Chuckalovcak JP, Wu M, McCauley DE, et al. 2012. Rapid evolution of enormous, multichromosomal genomes in flowering plant mitochondria with exceptionally high mutation rates. PLoS Biology 10:e1001241

    doi: 10.1371/journal.pbio.1001241


    [53]

    Alverson AJ, Wei X, Rice DW, Stern DB, Barry K, et al. 2010. Insights into the evolution of mitochondrial genome size from complete sequences of Citrullus lanatus and Cucurbita pepo (Cucurbitaceae). Molecular Biology and Evolution 27:1436−48

    doi: 10.1093/molbev/msq029


    [54]

    Gualberto JM, Newton KJ. 2017. Plant mitochondrial genomes: dynamics and mechanisms of mutation. Annual Review of Plant Biology 68:225−52

    doi: 10.1146/annurev-arplant-043015-112232


    [55]

    Gualberto JM, Mileshina D, Wallet C, Niazi AK, Weber-Lotfi F, et al. 2014. The plant mitochondrial genome: dynamics and maintenance. Biochimie 100:107−20

    doi: 10.1016/j.biochi.2013.09.016


    [56]

    Rice DW, Alverson AJ, Richardson AO, Young GJ, Sanchez-Puerta MV, et al. 2013. Horizontal transfer of entire genomes via mitochondrial fusion in the angiosperm Amborella. Science 342:1468−73

    doi: 10.1126/science.1246275


    [57]

    Yu R, Sun C, Zhong Y, Liu Y, Sanchez-Puerta MV, et al. 2022. The minicircular and extremely heteroplasmic mitogenome of the holoparasitic plant Rhopalocnemis phalloides. Current Biology 32:470−479.e5

    doi: 10.1016/j.cub.2021.11.053


    [58]

    Yang H, Ni Y, Zhang X, Li J, Chen H, et al. 2023. The mitochondrial genomes of Panax notoginseng reveal recombination mediated by repeats associated with DNA replication. International Journal of Biological Macromolecules 252:126359

    doi: 10.1016/j.ijbiomac.2023.126359


    [59]

    Cole LW, Guo W, Mower JP, Palmer JD. 2018. High and variable rates of repeat-mediated mitochondrial genome rearrangement in a genus of plants. Molecular Biology and Evolution 35:2773−85


    [60]

    Sun M, Zhang M, Chen X, Liu Y, Liu B, et al. 2022. Rearrangement and domestication as drivers of Rosaceae mitogenome plasticity. BMC Biology 20:181

    doi: 10.1186/s12915-022-01383-3


    [61]

    Zardoya R. 2020. Recent advances in understanding mitochondrial genome diversity. F1000Research 9:270

    doi: 10.12688/f1000research.21490.1


    [62]

    Richardson AO, Rice DW, Young GJ, Alverson AJ, Palmer JD. 2013. The “fossilized” mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate. BMC Biology 11:29

    doi: 10.1186/1741-7007-11-29


    [63]

    Mower JP, Sloan DB, Alverson AJ. 2012. Plant mitochondrial genome diversity: The genomics revolution. In Plant Genome Diversity Volume 1: Plant Genomes, their Residents, and their Evolutionary Dynamics, ed. JF Wendel, J Greilhuber, J Dolezel, IJ Leitch. Vienna: Springer Vienna. pp. 123−44

    [64]

    Sloan DB, Alverson AJ, Štorchová H, Palmer JD, Taylor DR. 2010. Extensive loss of translational genes in the structurally dynamic mitochondrial genome of the angiosperm Silene latifolia. BMC Evolutionary Biology 10:274

    doi: 10.1186/1471-2148-10-274


    [65]

    Hecht J, Grewe F, Knoop V. 2011. Extreme RNA editing in coding islands and abundant microsatellites in repeat sequences of Selaginella moellendorffii mitochondria: the root of frequent plant mtDNA recombination in early tracheophytes. Genome Biology and Evolution 3:344−58

    doi: 10.1093/gbe/evr027


    [66]

    Regina TMR, Quagliariello C. 2010. Lineage-specific group II intron gains and losses of the mitochondrial rps3 gene in gymnosperms. Plant Physiology and Biochemistry 48:646−54

    doi: 10.1016/j.plaphy.2010.05.003


    [67]

    Timmis JN, Ayliffe MA, Huang CY, Martin W. 2004. Endosymbiotic gene transfer: organelle genomes forge eukaryotic chromosomes. Nature Reviews Genetics 5:123−35

    doi: 10.1038/nrg1271


    [68]

    Cheng Y, He X, Priyadarshani SVGN, Wang Y, Ye L, et al. 2021. Assembly and comparative analysis of the complete mitochondrial genome of Suaeda glauca. BMC Genomics 22:167

    doi: 10.1186/s12864-021-07490-9


    [69]

    Sloan DB, Wu Z. 2014. History of plastid DNA insertions reveals weak deletion and AT mutation biases in angiosperm mitochondrial genomes. Genome Biology and Evolution 6:3210−21

    doi: 10.1093/gbe/evu253


    [70]

    Zhang T, Fang Y, Wang X, Deng X, Zhang X, et al. 2012. The complete chloroplast and mitochondrial genome sequences of Boea hygrometrica: insights into the evolution of plant organellar genomes. PLOS ONE 7:e30531

    doi: 10.1371/journal.pone.0030531


    [71]

    Wang D, Rousseau-Gueutin M, Timmis JN. 2012. Plastid sequences contribute to some plant mitochondrial genes. Molecular Biology and Evolution 29:1707−11

    doi: 10.1093/molbev/mss016


    [72]

    Wang D, Wu YW, Shih AC, Wu CS, Wang YN, et al. 2007. Transfer of Chloroplast Genomic DNA to Mitochondrial Genome Occurred At Least 300 MYA. Molecular Biology and Evolution 24:2040−48

    doi: 10.1093/molbev/msm133


    [73]

    Notsu Y, Masood S, Nishikawa T, Kubo N, Akiduki G, et al. 2002. The complete sequence of the rice (Oryza sativa L.) mitochondrial genome: frequent DNA sequence acquisition and loss during the evolution of flowering plants. Molecular Genetics and Genomics 268:434−45

    doi: 10.1007/s00438-002-0767-1


    [74]

    Clifton SW, Minx P, Fauron CMR, Gibson M, Allen JO, et al. 2004. Sequence and Comparative Analysis of the Maize NB Mitochondrial Genome. Plant Physiology 136:3486−503

    doi: 10.1104/pp.104.044602


    [75]

    Nie ZL, Wen J, Azuma H, Qiu YL, Sun H, et al. 2008. Phylogenetic and biogeographic complexity of Magnoliaceae in the Northern Hemisphere inferred from three nuclear data sets. Molecular Phylogenetics and Evolution 48:1027−40

    doi: 10.1016/j.ympev.2008.06.004


    [76]

    The Angiosperm Phylogeny Group, Chase MW, Christenhusz MJM, Fay MF, Byng JW, et al. 2016. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Botanical Journal of the Linnean Society 181:1−20

    doi: 10.1111/boj.12385


  • Cite this article

    Wang S, Qiu J, Sun N, Han F, Wang Z, et al. 2025. Characterization and comparative analysis of the first mitochondrial genome of Michelia (Magnoliaceae). Genomics Communications 2: e001 doi: 10.48130/gcomm-0025-0001



    • Mitochondria, as semi-autonomous organelles with their own genetic material and systems, play crucial roles in plant energy metabolism by generating ATP through oxidative phosphorylation[1,2]. In plants, mitochondria not only participate in energy production but also collaborate with other organelles to maintain cellular homeostasis[3]. Mitochondria also help regulate processes such as apoptosis, playing an important regulatory role in plant life activities[4]. However, despite the undisputed significance of mitochondria, previous research on plant mitochondrial genomes (mitogenomes) has been scant. Sequencing and assembly techniques, which were primarily developed for nuclear genomes, initially encountered significant challenges in handling mitogenomes because of their complicated structures, leading to fragmented or incomplete mitogenome assemblies. Moreover, early bioinformatic tools were not optimized for the unique characteristics of mitogenomes, which further hindered the accurate assembly and analysis of plant mitogenomes. Nevertheless, with the continuous advancement of sequencing technology, especially the widespread application of long-read sequencing, research on plant mitogenomes has gradually increased in recent years[5−8].

      The mitogenomes of most higher plants exhibit substantial variability in structure and size[9]. Plant mitogenomes contain numerous repetitive sequences (repeats), and frequent recombination among these repeats leads to significant variations in size and structure[10]. Plant mitogenomes are 100−1,000 times larger than those of animals (15−18 kb)[11], and their size also varies markedly across species. For example, Viscum scurruloideum has a mitogenome of only 66 kb[12], while the mitogenome of Larix sibirica is 11.7 Mb[13]. The intricate structure of plant mitogenomes adds to their complexity: most exist as a single circular molecule, while a minority exist as linear or branched molecules. The mitogenomes of Populus simonii[14] and Fagopyrum esculentum[15] have been reported to consist of three and ten circular molecules, respectively. Although plant mitogenomes differ significantly in size and structure, the number of genes remains comparatively stable and conserved, with a similar core set of protein-coding genes (PCGs), rRNAs, and tRNAs that are essential for respiratory function and translation[16]. Intracellular gene transfer (IGT) can further complicate the mitogenome, as sequences from both the plastid and nuclear genomes coexist in plant mitogenomes[17]. For instance, sequences of nuclear and plastid origin account for 46.5% and 1.4% of the Cucumis melo mitogenome, respectively[18]. Given these complex characteristics, the plant mitogenome is an ideal system for exploring genome complexity.

      Michelia figo (M. figo) belongs to Michelia, the second-largest genus and a relatively evolved group in the Magnoliaceae family. There are about 80 Michelia species in the world, predominantly distributed in tropical, subtropical, and temperate regions of Asia, of which approximately 70 species are distributed in China[19,20]. The broad spectrum of physiological activities exhibited by the genus Michelia underscores its potential applications in medicine, food, agriculture, and other domains[21]. The flowers, leaves, branches, and other parts of these species contain abundant aromatic oil that has been traditionally used in China, India, and other regions for treating fever, leprosy, inflammation, and other ailments[22]. Michelia species are valuable sources of bioactive compounds with antibacterial[23] and antioxidant[24] properties. Furthermore, the methanolic extract from the leaves of M. figo has a concentration-dependent vasodilatory effect, pointing to potential applications in medicine[25].

      A previous study reported the complete plastome of M. figo and analyzed its phylogenetic relationship with other Michelia species based on plastid genomes (plastomes)[26]. However, the mitogenome of M. figo and its phylogenetic status based on mitogenomes remain unexplored. Additionally, many plastomes of Michelia have been released[27−29], but no mitogenomes have been reported for this genus. Consequently, to further explore the evolution and genetics of M. figo, this study assembled the complete mitogenome of M. figo. Comparative genomic and phylogenetic analyses were undertaken to elucidate the characteristics of the mitogenomes of M. figo and other Magnoliaceae species. These analyses will offer crucial theoretical and data-driven support for research on genomics, biological functions, and mitogenome evolution in M. figo and other Michelia species.

    • In this study, we collected fresh leaves of M. figo at Nanjing Forestry University, Nanjing, Jiangsu Province, China (118.81° E, 32.07° N). Before DNA extraction, fresh leaves were immediately frozen in liquid nitrogen to preserve their integrity and subsequently stored in a laboratory freezer maintained at −80 °C. Total genomic DNA was extracted using the CTAB method[30]. The quality of the DNA sample was evaluated using 1% agarose gel electrophoresis, while its concentration was determined using a NanoDrop ND 2000 (ThermoFisher Scientific, Waltham, MA, USA)[31]. The genomic insert fragments were 15−18 kb in size. Sequencing libraries were then constructed from the high-integrity genomic DNA with the SMRTbell Express Template Prep Kit 2.0 (Pacific Biosciences, Menlo Park, CA, USA). HiFi sequencing data were generated on the PacBio Revio platform.

    • The HiFi sequencing data were fed into PMAT v1.31[32] to assemble the mitogenome of M. figo with the parameters 'autoMito -st hifi -g 2.2G -CPU 50'. The nuclear genome size of M. figo was estimated using the genome of Magnolia biondii as a reference[33]. The raw assembly graph of the M. figo mitogenome produced by PMAT comprised 12 contigs, including four pairs of repeats. Using Bandage[34], we obtained the circular mitogenome of M. figo by resolving the raw assembly graph, taking into account the copy number of each contig. The mitogenome of M. figo was annotated using the online program PMGA[35]. The rRNA and tRNA genes were then verified by BLASTN[36] and tRNAscan-SE v2.0[37], respectively. Finally, the online tool PMGmap[38] was used to draw the mitogenome map.

    • The online tool MISA[39] was used to detect simple sequence repeats (SSRs) in the M. figo, M. biondii, and M. officinalis mitogenomes. We set the minimum repeat numbers at 10, 5, 4, 3, 3, and 3 for mononucleotides, dinucleotides, trinucleotides, tetranucleotides, pentanucleotides, and hexanucleotides, respectively. The minimal distance between two SSRs was set to 100 bp. Meanwhile, the online tool TRF[40] was used to detect tandem repeats with default parameters. REPuter[41] was used to detect dispersed repeats with the following parameters: a Hamming distance of three, a maximum of 5,000 computed repeats, and a minimal repeat size of 30 bp[42]. Codon composition and usage of the M. figo mitogenome were analyzed using CodonW v1.4.4 (https://codonw.sourceforge.net/) with default parameters.
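
The SSR thresholds described above can be illustrated with a minimal Python sketch. This is not MISA itself but a regex-based approximation written for illustration: the function name `find_ssrs` and the toy sequences are hypothetical, only perfect repeats are reported, and the 100 bp distance setting between SSRs is not modeled.

```python
import re

# Minimum repeat counts per motif length, as stated in the text
# (mono- to hexanucleotide: 10, 5, 4, 3, 3, 3).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    Illustrative only: motifs that are themselves repeats of a shorter
    unit (e.g. "ATAT") are skipped so that each locus is reported once.
    Start positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # ([ACGT]{size}) captures one motif; \1{n,} requires n more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs reducible to a shorter repeating unit.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "A" * 12 + "CCG" + "AT" * 6 + "GCG" + "AGC" * 3 + "T"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

In the toy sequence, the three copies of AGC fall below the trinucleotide threshold of four and are therefore not reported, which mirrors how the thresholds filter short repeat tracts.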

    • We obtained the plastid genome (plastome) of M. figo from NCBI (accession number NC_053861.1). We then used BLASTN[36] to identify homologous fragments between the mitogenome and the plastome, and used TBtools to visualize the results[43]. We selected three Magnoliaceae mitogenomes (L. tulipifera, M. biondii, and M. officinalis) for the collinearity analysis with M. figo. The collinear blocks were identified using MUMmer v4.0[44] with default parameters, and only blocks longer than 5,000 bp were retained for subsequent analysis. Finally, NGenomeSyn v1.0[45] was used to visualize the results.
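
The length filter applied to the collinear blocks can be sketched as follows. This is a hedged illustration, not the authors' script: the `Block` record, the `filter_blocks` function, and the example coordinates are hypothetical, and the rule of comparing the shorter of the two aligned spans against the cutoff is an assumption, since the text only states that blocks exceeding 5,000 bp were kept.

```python
from typing import List, NamedTuple


class Block(NamedTuple):
    """One collinear block; coordinates are 1-based and inclusive."""
    ref_start: int
    ref_end: int
    qry_start: int
    qry_end: int


def span(start: int, end: int) -> int:
    # abs() handles blocks on the reverse strand, where start > end.
    return abs(end - start) + 1


def filter_blocks(blocks: List[Block], min_len: int = 5000) -> List[Block]:
    """Keep blocks whose shorter aligned span exceeds min_len (assumed rule)."""
    return [b for b in blocks
            if min(span(b.ref_start, b.ref_end),
                   span(b.qry_start, b.qry_end)) > min_len]


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    blocks = [
        Block(1, 6000, 100, 6100),     # kept: both spans exceed 5,000 bp
        Block(10, 5009, 20, 5019),     # dropped: exactly 5,000 bp
        Block(9000, 2000, 1, 7001),    # kept: inverted on the reference
    ]
    for b in filter_blocks(blocks):
        print(b)
```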

    • To further clarify the phylogenetic position of M. figo, two phylogenetic trees were constructed using 15 plant mitogenomes and plastomes, respectively, including two species of gymnosperms (Cycas taitungensis and Ginkgo biloba), three species of the ANA clade (Amborella trichopoda, Nymphaea colorata, and Schisandra sphenanthera), four species of Magnoliidae (L. tulipifera, M. figo, M. officinalis, and M. biondii), three species of monocots (Apostasia shenzhenica, Cocos nucifera, and Sorghum bicolor), and three species of core eudicots (Ilex pubescens, Sapindus mukorossi, and Ficus carica). Among these species, C. taitungensis and G. biloba were selected as outgroups. We used in-house Python scripts to select shared genes and used MAFFT v7.407[46] to align them. After trimming the alignments using trimAl v1.4[47], IQ-TREE v2.0.3[48] was used to construct the phylogenetic trees based on the maximum likelihood (ML) method with 1,000 bootstrap replicates[49]. GTR + F + I + G4 was the best-fit model for both the plastid and mitochondrial trees. Finally, the online tool iTOL[50] was used to visualize and optimize the results.
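
The selection of shared genes is described only as being done with in-house Python scripts, which are not shown. A minimal sketch of one way such a step could work is given below; the function name `shared_genes` and the toy gene sets are hypothetical and do not reflect the actual gene content of these species.

```python
from typing import Dict, List, Set


def shared_genes(genes_by_species: Dict[str, Set[str]]) -> List[str]:
    """Return the sorted gene names annotated in every species."""
    gene_sets = list(genes_by_species.values())
    if not gene_sets:
        return []
    # Intersect the first set with all remaining sets.
    return sorted(set(gene_sets[0]).intersection(*gene_sets[1:]))


if __name__ == "__main__":
    # Toy annotation sets for illustration only.
    toy = {
        "species_A": {"atp1", "cox1", "nad1", "rps7"},
        "species_B": {"atp1", "cox1", "nad1"},
        "species_C": {"atp1", "cox1", "rps7"},
    }
    print(shared_genes(toy))
```

Each shared gene would then be aligned across species and trimmed before tree inference, as described in the text.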

    • Using the Revio sequencing platform, we obtained a total of 410,107 HiFi reads totaling 5.83 Gb, with a read N50 of 14,355 bp. After using PMAT v1.31 to generate the raw assembly graph of the M. figo mitogenome (Fig. 1a), we used Bandage to disentangle the graph, resulting in a circular molecule of 773,377 bp (Fig. 1b). The total GC content is 46.83%, with 26.56%, 26.21%, 23.37%, and 23.46% for bases A, T, G, and C, respectively. The M. figo mitogenome was annotated with 66 genes, comprising 41 protein-coding genes (PCGs), 21 tRNA genes, and three rRNA genes, as detailed in Table 1. Figure 1c shows the functional classification and positions of the annotated genes. Most genes are single-copy; the exceptions are three genes (rps7, trnM-CAU, and trnP-UGG) that are present in multiple copies. Moreover, we found that a total of 10 genes harbor introns (ccmFc, rpl2, rps3, rps10, cox2, nad1, nad2, nad4, nad5, and nad7) (Supplementary Fig. S1). Most of these introns are cis-spliced, with nad1, nad2, and nad5 containing a few trans-spliced introns.
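
The base composition and GC content reported above follow from simple counting: the GC content is the sum of the G and C percentages (23.37% + 23.46% = 46.83%). A minimal Python sketch of that calculation is shown below; the function name `base_composition` and the toy sequence are hypothetical, and ambiguous bases such as N are simply ignored, which is an assumption rather than a documented choice of the authors.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, G, and C plus the GC content."""
    seq = seq.upper()
    # Count only unambiguous bases; anything else (e.g. N) is ignored.
    counts = Counter(b for b in seq if b in "ATGC")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/G/C bases")
    pct = {b: 100.0 * counts[b] / total for b in "ATGC"}
    pct["GC"] = pct["G"] + pct["C"]
    return pct


if __name__ == "__main__":
    for base, value in base_composition("AATTGGCCGC").items():
        print(f"{base}: {value:.2f}%")
```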

      Figure 1. 

      Structural and functional features of the M. figo mitogenome. (a) The raw assembly graph of the M. figo mitogenome. (b) The disentangled graph of the M. figo mitogenome. (c) Circular mitogenome map of M. figo. Genes shown outside the circle are transcribed clockwise, and those shown inside are transcribed counter-clockwise. The color legend in the bottom left corner indicates the functional group of each gene.

      Table 1.  Gene compositions of the Michelia figo mitogenome.

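      | Group of genes | Name | Start codon | Stop codon | Length (bp) | Amino acids |
      |---|---|---|---|---|---|
      | ATP synthase | atp1 | ATG | TGA | 1,530 | 509 |
      | | atp4 | ATG | TAA | 582 | 193 |
      | | atp6 | ATG | TAG | 891 | 296 |
      | | atp8 | ATG | TAA | 480 | 159 |
      | | atp9 | ATG | TAA | 261 | 86 |
      | Cytochrome c biogenesis | ccmB | ATG | TGA | 621 | 206 |
      | | ccmC | ATG | TAA | 960 | 319 |
      | | ccmFc* | ATG | TAA | 1,359 | 452 |
      | | ccmFn | ATG | TAG | 1,806 | 602 |
      | Ubiquinol cytochrome c reductase | cob | ATG | TGA | 1,182 | 393 |
      | Cytochrome c oxidase | cox1 | ACG | TAA | 1,584 | 527 |
      | | cox2** | ATG | TAA | 759 | 252 |
      | | cox3 | ATG | TGA | 798 | 265 |
      | Maturases | matR | ATG | TAG | 1,959 | 652 |
      | Transport membrane protein | mttB | ACG | TGA | 768 | 255 |
      | NADH dehydrogenase | nad1***+ | ACG | TAA | 978 | 325 |
      | | nad2**** | ATG | TAA | 1,467 | 488 |
      | | nad3 | ATG | TAA | 357 | 118 |
      | | nad4*** | ATG | TGA | 1,488 | 495 |
      | | nad4L | ACG | TAA | 303 | 100 |
      | | nad5**++ | ATG | TAA | 2,013 | 670 |
      | | nad6 | ATG | TGA | 735 | 244 |
      | | nad7**** | ATG | TAG | 1,185 | 394 |
      | | nad9 | ATG | TAA | 573 | 190 |
      | Large subunit of ribosome (LSU) | rpl10 | ATG | TAA | 471 | 156 |
      | | rpl16 | GTG | TAA | 435 | 144 |
      | | rpl2* | ATG | TAG | 1,665 | 697 |
      | | rpl5 | ATG | TAA | 561 | 186 |
      | Small subunit of ribosome (SSU) | rps1 | ATG | TAA | 606 | 201 |
      | | rps10* | ATG | TGA | 420 | 139 |
      | | rps11 | ATG | TGA | 552 | 183 |
      | | rps12 | ATG | TGA | 378 | 125 |
      | | rps13 | ATG | TGA | 351 | 116 |
      | | rps14 | ATG | TAG | 303 | 100 |
      | | rps19 | ATG | TAA | 282 | 93 |
      | | rps2 | ATG | TAA | 657 | 218 |
      | | rps3* | ATG | TAA | 1,572 | 523 |
      | | rps4 | ACG | TAA | 1,071 | 356 |
      | | rps7 (2) | ATG/ATG | TAA/TAA | 450 | 149 |
      | Succinate dehydrogenase | sdh3 | ATG | TAA | 330 | 109 |
      | | sdh4 | ATG | TGA | 447 | 148 |
      | Ribosomal RNAs | rrn5 | | | 117 | |
      | | rrnL | | | 3,560 | |
      | | rrnS | | | 2,087 | |
      | Transfer RNAs | trnC-GCA | | | 71 | |
      | | trnD-GUC | | | 74 | |
      | | trnE-UUC | | | 72 | |
      | | trnF-GAA | | | 74 | |
      | | trnfM-CAU | | | 74 | |
      | | trnG-GCC | | | 73 | |
      | | trnH-GUG | | | 74 | |
      | | trnI-CAU | | | 81 | |
      | | trnK-UUU | | | 73 | |
      | | trnM-CAU (2) | | | 73/73 | |
      | | trnN-GUU | | | 72 | |
      | | trnP-UGG (3) | | | 75/74/75 | |
      | | trnQ-UUG | | | 72 | |
      | | trnS-GCU | | | 88 | |
      | | trnS-UGA | | | 87 | |
      | | trnV-UAC | | | 73 | |
      | | trnW-CCA | | | 74 | |
      | | trnY-GUA | | | 83 | |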
      * indicates cis-spliced introns, and + indicates trans-spliced introns; the number of symbols gives the number of introns. Numbers in parentheses give the number of gene copies.
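
The marker notation in Table 1 can be tallied mechanically. The sketch below (the helper name is hypothetical) splits each annotated gene name into its base name and its counts of * and + symbols, then sums them over the ten intron-containing genes exactly as they are written in the table.

```python
def count_introns(annotated_name):
    """Split a Table 1 entry such as 'nad1***+' into (gene, cis, trans)."""
    gene = annotated_name.rstrip("*+")
    markers = annotated_name[len(gene):]
    return gene, markers.count("*"), markers.count("+")


# Intron-containing entries copied from Table 1.
entries = [
    "ccmFc*", "cox2**", "nad1***+", "nad2****", "nad4***",
    "nad5**++", "nad7****", "rpl2*", "rps10*", "rps3*",
]

cis_total = sum(count_introns(entry)[1] for entry in entries)
trans_total = sum(count_introns(entry)[2] for entry in entries)
print(cis_total, trans_total)  # 22 3
```

The totals agree with the 22 cis-spliced and three trans-spliced introns reported for the M. figo mitogenome.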
    • The relative synonymous codon usage (RSCU) value equals 1 when a codon shows no usage bias relative to its synonymous codons. In the M. figo mitogenome, the RSCU values of AUG (Met), UGG (Trp), and AGC (Ser) are 1 (Supplementary Table S1). Twenty-nine codons have RSCU values above 1, among which AGA (Arg) has the highest value, 1.57. Additionally, 32 codons have RSCU values lower than 1, with CGU (Arg) having the lowest value, 0.64. We also compared the RSCU of the M. figo mitogenome with that of the other three Magnoliaceae mitogenomes. The results show that relative synonymous codon usage is highly consistent (Fig. 2a), with AGA (Arg) having the highest RSCU value in all of these mitogenomes (Supplementary Table S1). We further calculated the frequency of codon usage, which is remarkably similar across the Magnoliaceae mitogenomes (Fig. 2b & Supplementary Table S2).
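
As an illustration of the standard RSCU definition (the observed count of a codon divided by the mean count of all codons that encode the same amino acid), here is a minimal sketch. The codon counts are invented for demonstration and are not taken from Supplementary Table S1, and the function name is hypothetical.

```python
def rscu(codon_counts, synonymous_families):
    """Compute RSCU = observed count / mean count within each codon family."""
    values = {}
    for codons in synonymous_families.values():
        family_total = sum(codon_counts.get(codon, 0) for codon in codons)
        if family_total == 0:
            continue  # amino acid absent: RSCU is undefined for this family
        expected = family_total / len(codons)
        for codon in codons:
            values[codon] = codon_counts.get(codon, 0) / expected
    return values


# Two codon families only, to keep the example short.
families = {
    "Met": ["AUG"],
    "Arg": ["CGU", "CGC", "CGA", "CGG", "AGA", "AGG"],
}
# Invented counts for demonstration.
counts = {"AUG": 50, "CGU": 10, "CGC": 10, "CGA": 20,
          "CGG": 10, "AGA": 30, "AGG": 20}

values = rscu(counts, families)
print(round(values["AGA"], 2), round(values["CGU"], 2), values["AUG"])
```

A single-codon amino acid such as Met always yields an RSCU of 1, and the RSCU values within a family sum to the number of codons in that family.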

      Figure 2. 

      Codon usage of four Magnoliaceae mitogenomes. (a) Stacked column plots of the relative synonymous codon usage. (b) Heatmap of the codon usage frequencies.

    • The M. figo mitogenome contains abundant repeats (Fig. 3). Using the online tool REPuter, we detected 1,514 pairs of dispersed repeats (≥ 30 bp), including 758 pairs of forward repeats and 756 pairs of palindromic repeats (Supplementary Table S3). No complementary or reverse repeats were detected. Additionally, the M. figo mitogenome harbors 39 tandem repeats, with lengths of 14 to 52 bp and matching identities greater than 64% (Supplementary Table S4). Altogether, 262 SSRs were identified in the M. figo mitogenome (Supplementary Table S5), most of which are tetranucleotides (96), followed by mononucleotides (55) and dinucleotides (55).
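
The four dispersed-repeat categories differ only in how the second copy relates to the first. The sketch below classifies a pair of exact copies; it is an illustration of the definitions, not a reimplementation of REPuter, which also tolerates mismatches and searches the genome for the pairs itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_repeat(first, second):
    """Classify an exact repeat pair by the relation of `second` to `first`."""
    complement = first.translate(COMPLEMENT)
    if second == first:
        return "forward"
    if second == first[::-1]:
        return "reverse"
    if second == complement:
        return "complementary"
    if second == complement[::-1]:
        return "palindromic"
    return "none"


print(classify_repeat("AACG", "AACG"))  # forward
print(classify_repeat("AACG", "CGTT"))  # palindromic
```

A palindromic pair in this sense is a copy that matches the reverse complement of the first, which is why forward and palindromic repeats can both mediate recombination between distant regions of the genome.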

      Figure 3. 

      The distribution of repeats in the M. figo mitogenome. From the center outward, the first circle shows the mitogenome of M. figo, and the second and third circles show tandem repeats and simple sequence repeats, respectively. The inner lines represent the dispersed repeats. The color legend in the bottom left corner indicates dispersed repeats of different lengths.

      To further investigate the repeats in the mitogenomes of Magnoliaceae, we detected and compared the tandem repeats, SSRs, and dispersed repeats in the mitogenomes of three Magnoliaceae species (M. biondii, M. officinalis, and M. figo). The results show that only M. officinalis contains two pairs of reverse repeats and one pair of complementary repeats (Fig. 4a), while the number of tandem repeats does not vary significantly (Supplementary Tables S6 & S7). The M. officinalis mitogenome has the highest number of dispersed repeats (3,609), followed by M. biondii (2,800) and M. figo (1,514) (Fig. 4a, Supplementary Tables S8 & S9). The distribution of dispersed repeat lengths is also similar across the three mitogenomes (Fig. 4b), with most repeats ranging from 30 to 49 bp and only a few exceeding 500 bp. Additionally, the SSR comparison reveals that all three mitogenomes contain six SSR types (Fig. 4c, Supplementary Tables S10 & S11), with M. officinalis having the highest number of SSRs (327). The numbers of each SSR type do not vary significantly among the Magnoliaceae mitogenomes, with the notable exception of mononucleotides.

      Figure 4. 

      (a) Type and number of simple sequence repeats in the mitogenomes of two Magnolia species and M. figo. (b) Length and number of dispersed repeats in the mitogenomes of two Magnolia species and M. figo. (c) The different colored legends indicate different species.

    • We identified 81 fragments transferred from the plastome to the mitogenome of M. figo (Fig. 5 & Supplementary Table S12), ranging from 50 to 4,665 bp. The total length of the MTPTs is 42,791 bp, constituting 5.53% of the whole mitogenome. Most of these MTPTs range from 50 to 500 bp in length, and only 11 fragments exceed 1 kb, with the longest fragment reaching 4,666 bp. A total of 15 plastid genes were found to be located on MTPTs, including nine PCGs (psbL, psbF, psbE, petL, petG, rps8, rpl14, rps7, and ndhB) and six tRNA genes (trnD-GUC, trnY-GUA, trnE-UUC, trnW-CCA, trnP-UGG, and trnV-GAC). Notably, trnD-GUC, trnW-CCA, and trnV-GAC were completely transferred from the plastome to the mitogenome. Additionally, we found that MTPT22, MTPT65, MTPT68, and MTPT69 are located in repeat regions.

      Figure 5. 

      (a) Homologous sequences between mitogenome and plastome. The plastome is represented by the green circular segment and the mitogenome by the gray circular segment, and two different kinds of yellow lines represent the homologous fragments. The legends of different colors positioned in the bottom right corner represent fragments of different lengths. (b) Lengths and numbers of these homologous fragments in the M. figo mitogenome.

    • We conducted a collinearity analysis by comparing the mitogenome of M. figo with three other Magnoliaceae mitogenomes (L. tulipifera, M. officinalis, and M. biondii). As illustrated in Fig. 6a, a total of 40 locally collinear blocks (LCBs) were identified between the mitogenomes of M. figo and M. officinalis (Supplementary Table S13). The cumulative length of these collinear blocks is 416,577 bp, approximately 53.86% of the M. figo mitogenome. Among these blocks, the longest is 34,737 bp, and the average length is 10,413 bp. Between the mitogenomes of M. biondii and M. figo, we detected 42 LCBs, accounting for 59.04% (456,622 bp) of the M. figo mitogenome (Supplementary Table S14). The longest collinear block is 44,187 bp, and the average length is 10,871 bp. Between the mitogenomes of M. biondii and L. tulipifera, a total of 27 LCBs were identified (Supplementary Table S15), accounting for 52.14% (287,725 bp) of the L. tulipifera mitogenome. The longest collinear block is 30,420 bp, and the average length is 10,656 bp. The average collinear block lengths are highly consistent across the three pairwise comparisons (Fig. 6b).
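
The coverage percentages quoted above follow directly from the cumulative block lengths and the reference mitogenome sizes. A short sketch that reproduces them from the numbers given in this article (the function and variable names are illustrative):

```python
def coverage_percent(block_bp, genome_bp):
    """Return the share of a genome covered by collinear blocks, in percent."""
    return 100 * block_bp / genome_bp


M_FIGO_BP = 773_377        # M. figo mitogenome length
L_TULIPIFERA_BP = 551_806  # L. tulipifera mitogenome length

# Comparison -> (cumulative collinear length, reference genome length).
comparisons = {
    "M. figo vs M. officinalis": (416_577, M_FIGO_BP),
    "M. figo vs M. biondii": (456_622, M_FIGO_BP),
    "M. biondii vs L. tulipifera": (287_725, L_TULIPIFERA_BP),
}

for name, (block_bp, genome_bp) in comparisons.items():
    print(name, round(coverage_percent(block_bp, genome_bp), 2))
```

The loop prints 53.86, 59.04, and 52.14, matching the values reported in the text.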

      Figure 6. 

      Schematic representation of the collinearity among four Magnoliaceae mitogenomes. (a) Collinearity plots of the four Magnoliaceae mitogenomes. The mitogenomes are shown by the bars in each row, and collinear regions are indicated by the connecting lines in the center. (b) Lengths and numbers of collinear blocks. The different colored legends indicate homologous fragments between different species.

    • To further elucidate the phylogenetic position of M. figo, we constructed two phylogenetic trees based on 18 mitochondrial and 61 plastid PCGs, respectively, from 15 species (Supplementary Tables S16 & S17). As illustrated in Fig. 7, 91.67% of all nodes have bootstrap support values exceeding 80%, including 20 nodes with the maximum support of 100%. Moving from the base of the trees toward the tips, the separation of Magnoliidae from the clade consisting of monocots and core eudicots has a bootstrap value of 100%. Within Magnoliidae, the separation of Magnoliales and Laurales also has a bootstrap value of 100%. Furthermore, M. biondii and M. officinalis grouped first, and this clade then grouped with M. figo with a 100% bootstrap value, indicating that Michelia is closely related to Magnolia. The trees constructed from mitogenomes and plastomes are highly consistent with each other, which further supports this relationship.

      Figure 7. 

      The phylogenetic trees constructed from the mitogenomes and plastomes of M. figo and 14 other plant species. Bootstrap values are shown at each node. Colors indicate the groups to which the species belong. (a) The tree constructed from 18 shared mitochondrial genes. (b) The tree constructed from 61 shared plastid genes.

    • Plant mitogenomes frequently undergo repeat-mediated recombination, resulting in great differences in their size[51]. Even among closely related species, notable variations in mitogenome size can be observed. For example, the Silene conica mitogenome (11.3 Mb) is about 45 times larger than that of S. latifolia (253 kb)[52], and the mitogenomes of Cucumis melo (2.9 Mb) and Citrullus lanatus (379 kb) differ more than sevenfold[53]. Frequent recombination may also integrate a large amount of foreign DNA during evolution, potentially contributing to the great differences in plant mitogenome size. In this study, the mitogenome of M. figo (773,377 bp) is relatively short among Magnoliaceae: the longest is that of M. biondii (967,100 bp), followed by M. officinalis (930,306 bp) and M. liliiflora (865,191 bp). The shortest mitogenome is that of L. tulipifera (551,806 bp), which is only about 60% of the size of the M. biondii and M. officinalis mitogenomes.

      Frequent recombination events not only lead to great differences in mitogenome size, but also contribute to the complex and diverse structures of plant mitogenomes[54], which range from single circular and linear molecules to branched linear, branched circular, and other complex structures[55]. The mitogenomes of Amborella trichopoda[56], Rhopalocnemis phalloides[57], and Panax notoginseng[58] have been reported to be complex dynamic structures resulting from recombination. The mitogenome structures of Magnoliaceae are relatively conserved, with the majority assembled into a single circular chromosome (M. biondii, M. officinalis, M. figo, and L. tulipifera). However, the mitogenome of M. liliiflora was assembled as a linear chromosome. Additionally, the mitogenomes of angiosperms show rapid structural differentiation and loss of collinearity, even among closely related species[59,60]. In this study, using the nucmer program of MUMmer, numerous collinear regions and genomic rearrangements were identified among four Magnoliaceae mitogenomes. The lengths of these collinear blocks account for more than half of each mitogenome. The results of the collinearity analysis suggest that significant genomic rearrangements may have occurred in the mitogenomes of Magnoliaceae species during their evolutionary history.

      The mitogenomes of angiosperms generally encode a core set of 24 PCGs: nad1-7, 9, and 4L; cob; cox1-3; ccmB, C, Fc, and Fn; atp1, 4, 6, 8, and 9; mttB/tatC; and matR. Although these core genes are present in most mitogenomes, there are significant variations in their quantity, position, and arrangement, even within mutants of the same species. In addition to the 24 conserved PCGs, plant mitogenomes also possess 19 standard variable genes, consisting of five genes for large ribosomal subunit proteins (rpl2, 5, 6, 10, and 16), 12 genes for small ribosomal subunit proteins (rps1-4, 7, 8, 10-14, and 19), and two respiratory genes (sdh3-4). Among these variable genes, the ribosomal protein genes are lost relatively frequently[61]. The mitogenome of M. figo harbors all 24 core PCGs, with only two variable PCGs (rpl6 and rps8) being lost. Similarly, the mitogenomes of M. biondii[33] and L. tulipifera[62] have retained nearly all ancestral PCGs. By contrast, the Silene vulgaris mitogenome has lost nearly all variable PCGs, with the exception of rps13. Moreover, the Viscum scurruloideum mitogenome has lost 11 of the 24 core PCGs, including ccmB, matR, and all NADH dehydrogenase genes[12]. The gene content in the mitogenomes of Magnoliaceae is relatively abundant[63], suggesting that they may have undergone less gene loss during mitogenome evolution.

      Plant mitogenomes vary significantly in the number of introns. The Silene latifolia mitogenome has only 19 introns[64], while the Selaginella moellendorffii mitogenome contains the largest number, 37 introns[65]. The M. figo mitogenome contains 25 introns in 10 PCGs (ccmFc, rpl2, rps3, rps10, cox2, nad1, nad2, nad4, nad5, and nad7), consisting of 22 cis-spliced and three trans-spliced introns. Cis-splicing is prevalent among the introns of angiosperm mitogenomes, whereas nad1, nad2, and nad5 evolved a split structure that requires trans-splicing[63]. As in the majority of angiosperm mitogenomes, the intron rps3i257 has been completely lost from the M. figo mitogenome during differentiation[66]. These results indicate that introns are frequently gained or lost during the evolution of plant mitogenomes (Supplementary Fig. S2)[63].

      Plant mitogenomes are characterized by abundant repeats, which contribute to the complexity and diversity of mitogenome sizes and structures through frequent recombination events[10]. Intense recombination mediated by long repeats (> 500 bp) facilitates reversible recombination, regulates the molecular conformation of the mitogenome, and ultimately contributes to the expansion and complexity of plant mitogenomes[54]. In this study, the M. figo mitogenome has the lowest abundance of SSRs and long repeats (> 500 bp), whereas the M. biondii mitogenome has the highest abundance in Magnoliaceae. It can be inferred that the M. figo mitogenome has likely undergone fewer recombination events during its evolution, whereas the M. biondii mitogenome may have experienced more. Variations in the number of repeats may also result in significant differences in mitogenome size. For example, the mitogenome sizes of bryophytes remain relatively stable at approximately 110 kb, probably due to the scarcity of repeats within their mitogenomes. This scarcity contributes to the conserved and stable structure of bryophyte mitogenomes. By contrast, the mitogenomes of ferns contain a large number of repeats, accounting for their relatively large sizes[10]. In this study, the mitogenomes of M. biondii and M. officinalis contain a significantly higher number of repeats than that of M. figo, potentially explaining the differences in their mitogenome sizes.

      DNA fragment transfers between plastomes and mitogenomes, as well as among different species, recur throughout the evolution of plant mitogenomes[67]. The lengths and similarities of these plastid-derived fragments vary among species[68]. The total length of MTPTs in the M. figo mitogenome is 42,791 bp, constituting 5.53% of the whole mitogenome. This proportion is considerably higher than that observed in numerous other mitogenomes, such as Arabidopsis thaliana (0.8%), Glycine max (0.6%), Silene conica (0.2%), and Vigna angularis (0.1%)[69]. At the other extreme, MTPTs account for 10.5% of the Boea hygrometrica mitogenome[70]. The MTPTs in the M. figo mitogenome are notably abundant, with the longest fragment spanning 4,666 bp and the majority ranging from 50 to 500 bp. These sizable MTPTs are presumed to have significant impacts on plant mitogenome evolution, thereby contributing to genetic diversity[17,71]. Additionally, these transferred fragments frequently contain PCGs. The number of PCGs in MTPTs varies considerably among plant mitogenomes, ranging from seven in Brassica to 22 in Nicotiana[72]. In the mitogenome of M. figo, nine PCGs (psbL, psbF, psbE, petL, petG, rps8, rpl14, rps7, and ndhB) are located in MTPTs. However, PCGs in MTPTs tend to degenerate as a result of sequence alterations and the absence of RNA editing[73,74]. Consequently, PCGs in MTPTs may have limited functional significance in mitogenomes, potentially acting as non-essential sequences[72].

      In this study, we reconstructed two phylogenetic trees based on 18 mitochondrial and 61 plastid PCGs, respectively, from 15 species. The two trees are highly consistent with each other and support a close relationship between Michelia and Magnolia, in agreement with previous studies[26,75]. The topology of both trees is also highly consistent with the Angiosperm Phylogeny Group IV (APG IV) system[76]. However, due to the scarcity of mitogenomes in Michelia, we are unable to expand our discussion of the phylogenetic relationships within this genus.

    • In this study, we sequenced and assembled the mitogenome of M. figo for the first time. The circular mitogenome of M. figo is 773,377 bp in length and encodes 41 PCGs, 21 tRNA genes, and three rRNA genes. A total of 22 cis-spliced and three trans-spliced introns were identified in the M. figo mitogenome. The M. figo mitogenome contains abundant repeats, with 1,514 pairs of dispersed repeats, 39 tandem repeats, and 262 SSRs. Additionally, we identified 81 fragments (42,791 bp) that were transferred from the plastome to the mitogenome of M. figo, constituting 5.53% of the whole mitogenome. The abundance of repeats and MTPTs contributes to the complexity and diversity of mitogenome size and structure. Furthermore, comparative analyses of four Magnoliaceae mitogenomes reveal significant genetic diversity within this family. Two phylogenetic trees, constructed independently from the mitogenomes and plastomes of 15 species, depict the phylogenetic relationship of M. figo. This study presents the first comprehensive genomic and phylogenetic analyses of the M. figo mitogenome, providing theoretical insights and data support for the development of genetic markers, classification, and resource utilization within the genus Michelia.

      • This work was supported by the Natural Science Foundation of Jiangsu Province (BK20220414) and the Jiangsu Students' Innovation and Entrepreneurship Training Program (202210298119Y). We thank Assoc. Prof. Kewang Xu from Nanjing Forestry University for collecting the sample of M. figo.

      • This study has rigorously adhered to relevant institutional, national, and international guidelines and regulations. Moreover, the study did not involve the use of any endangered or protected species. The M. figo plant leaves utilized in this experiment were collected at Nanjing Forestry University.

      • The authors confirm contribution to the paper as follows: study conception and design: Bi C, Yang Y; analysis and interpretation of results: Wang S, Sun N, Qiu J, Han F; material collection and conduct of experiments: Bi C, Qiu J; draft manuscript preparation: Wang S; manuscript revision and presentation of comments: Bi C, Wang Z, Yang Y. All authors reviewed the results and approved the final version of the manuscript.

      • The mitochondrial genome supporting this study is available at GenBank with accession number: NC_082234.1. The HiFi sequencing data of M. figo is deposited in the SRA repository under SRR28267342.

      • The authors declare that they have no conflict of interest.


      • Supplementary Table S1 The relative synonymous codon usage of amino acids in the mitogenome of Michelia figo, Magnolia biondii, Magnolia officinalis, and Liriodendron tulipifera.
      • Supplementary Table S2 The frequency of codon usage in the mitogenome of Michelia figo, Magnolia biondii, Magnolia officinalis, and Liriodendron tulipifera.
      • Supplementary Table S3 Dispersed repeat sequences identified in the Michelia figo mitogenome.
      • Supplementary Table S4 Tandem repeat sequences identified in the Michelia figo mitogenome.
      • Supplementary Table S5 The simple sequence repeats identified in the Michelia figo mitogenome.
      • Supplementary Table S6 Dispersed repeat sequences identified in the Magnolia biondii mitogenome.
      • Supplementary Table S7 Dispersed repeat sequences identified in the Magnolia officinalis mitogenome.
      • Supplementary Table S8 Tandem repeat sequences identified in the Magnolia biondii mitogenome.
      • Supplementary Table S9 Tandem repeat sequences identified in the Magnolia officinalis mitogenome.
      • Supplementary Table S10 The simple sequence repeats identified in the Magnolia biondii mitogenome.
      • Supplementary Table S11 The simple sequence repeats identified in the Liriodendron tulipifera mitogenome.
      • Supplementary Table S12 The homologous DNA fragment between mitogenome and cpgenome of Michelia figo.
      • Supplementary Table S13 The collinear blocks between mitogenomes of Michelia figo and Magnolia officinalis.
      • Supplementary Table S14 The collinear blocks between mitogenomes of Michelia figo and Magnolia biondii.
      • Supplementary Table S15 The collinear blocks between mitogenomes of Magnolia biondii and Liriodendron tulipifera.
      • Supplementary Table S16 Genes used for phylogenetic analysis.
      • Supplementary Table S17 The genomic information of the species used in this study.
      • Supplementary Fig. S1 The map of genes containing introns. This diagram illustrates the distribution of cis- and trans-introns.
      • Supplementary Fig. S2 Heat maps of PCG and intron contents among 15 mitogenomes. (a) Comparison of PCG contents among 15 mitogenomes. The gene numbers are shown on the top right. (b) Comparison of intron contents among 15 mitogenomes. The intron numbers are shown on the top right.
      • Copyright: © 2025 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.
  • About this article
    Cite this article
    Wang S, Qiu J, Sun N, Han F, Wang Z, et al. 2025. Characterization and comparative analysis of the first mitochondrial genome of Michelia (Magnoliaceae). Genomics Communications 2: e001 doi: 10.48130/gcomm-0025-0001