Evolution of key enzymes in the alkaloid biosynthetic pathway of Ranunculales

Zijian Dai; Feng Xu; Xiaomei Wei; Zhu Qiao; Weijia Wang; Zeyu Zhou; Guangning Zhang; Yang Dong; Xuzhen Li; Ying Hu; Dazhong Guo

doi:10.48130/abd-0026-0003

2026 Volume 3

1. State Key Laboratory of Biological Big Data in Yunnan Province, Yunnan Agricultural University, Kunming 650201, China
2. National Center for Traditional Chinese Medicine (TCM) Inheritance and Innovation, Guangxi Botanical Garden of Medicinal Plants, Nanning 530023, China
3. Guangxi Key Laboratory of Medicinal Resources Protection and Genetic Improvement, Guangxi Botanical Garden of Medicinal Plants, Nanning 530023, China
4. College of Science, Yunnan Agricultural University, Kunming 650201, China
5. College of Food Science and Technology, Yunnan Agricultural University, Kunming 650201, China
6. State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming 650201, China
7. College of Plant Protection, Yunnan Agricultural University, Kunming 650201, China
^#Authors contributed equally: Zijian Dai, Feng Xu

Corresponding authors: loyalyang@163.com (Dong Y); manlixuzhen@163.com (Li X); hying5340@163.com (Hu Y); gdz@dongyang-lab.org (Guo D)

Received: 31 December 2025
Revised: 26 February 2026
Accepted: 05 March 2026
Published online: 31 March 2026
Agrobiodiversity 2026, 3(1): 31−40

Abstract

The order Ranunculales is an early-diverging lineage of eudicots that is rich in medicinal plants and has attracted considerable attention for its diverse alkaloid biosynthetic capabilities. In this study, we integrated transcriptomic data from 319 Ranunculales species to construct a well-supported phylogenetic framework. In addition to previously reported whole-genome duplication events, we identified a lineage-specific duplication event in Berberidaceae. These events drove a marked expansion of key enzyme gene families involved in the benzylisoquinoline alkaloid (BIA) pathway, most prominently in Berberidaceae and Ranunculaceae. Furthermore, we found that NCS genes in Ranunculaceae have undergone strong positive selection, suggesting adaptive functional innovation during evolution. These results elucidate the evolutionary mechanisms underlying alkaloid diversity in Ranunculales from both phylogenetic and genomic perspectives, and provide a theoretical foundation for the development of medicinal compounds and for synthetic biology.
Keywords: Ranunculales, Transcriptomics, Gene family, Alkaloid biosynthesis

Supplementary information

Supplementary Table S1 Summary of the RNA sequencing data of Ranunculales plants.
Supplementary Table S2 Orthogroup assignment statistics per species.
Supplementary Table S3 Transcriptome assembly of Ranunculales plants and closely related species.
Supplementary Table S4 Completeness evaluation of transcriptome assembly of Ranunculales using BUSCO.
Supplementary Table S5 Transcriptome annotation of Ranunculales plants using six public protein databases.
Supplementary Table S6 Transcription factors of Ranunculales plants identified using iTAK.
Supplementary Table S7 Simple sequence repeats of Ranunculales plants identified using MISA.
Supplementary Table S8 Estimated Ks distribution for Ranunculales species.
Supplementary Table S9 List of functionally validated genes from previous studies.
Supplementary Table S10 Names of sequences from the four identified gene families in Ranunculales.
Supplementary Table S11 Statistics of conserved amino acid sites in the four screened gene families.
Supplementary Table S12 Critical amino acid sites in positively selected sequences.
Supplementary Table S13 Expression values (FPKM) of sequences from the four gene families in Ranunculales.
Supplementary Fig. S1 Phylogenetic reconstruction of 319 Ranunculales species using a concatenation approach.
Supplementary Fig. S2 Phylogenetic analysis of the Bet v1 gene family showing its division into two subfamilies: NCS-I and NCS-II.
Supplementary Fig. S3 Phylogenetic analysis of CYP80B/NMCH.
Supplementary Fig. S4 Alignment of key amino acid sites in major CYP80B/NMCH gene sequences.
Supplementary Fig. S5 Phylogenetic analysis of the CNMT gene family showing its division into two subfamilies: CNMT-I and CNMT-II.
Supplementary Fig. S6 Alignment of key amino acid sites in major CNMT gene family sequences.
Supplementary Fig. S7 Phylogenetic analysis of the OMT gene family showing its division into three subfamilies: 6OMT, 7OMT and 4'OMT.
Supplementary Fig. S8 Alignment of key amino acid sites in major OMT gene family sequences.
Supplementary Fig. S9 Prediction of 3D structural models for positively selected sequences in key enzymes.

Rights and permissions
Copyright: © 2026 by the author(s). Published by Maximum Academic Press on behalf of Yunnan Agricultural University. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1] The Angiosperm Phylogeny Group. 2016. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Botanical Journal of the Linnean Society 181:1−20 doi: 10.1111/boj.12385
[2] Hao DC, Xu LJ, Zheng YW, Lyu HY, Xiao PG. 2022. Mining therapeutic efficacy from treasure chest of biodiversity and chemodiversity: pharmacophylogeny of ranunculales medicinal plants. Chinese Journal of Integrative Medicine 28:1111−1126 doi: 10.1007/s11655-022-3576-x
[3] Bhambhani S, Kondhare KR, Giri AP. 2021. Diversity in chemical structures and biological properties of plant alkaloids. Molecules 26(11):3374 doi: 10.3390/molecules26113374
[4] Tian Y, Kong L, Li Q, Wang Y, Wang Y, et al. 2024. Structural diversity, evolutionary origin, and metabolic engineering of plant specialized benzylisoquinoline alkaloids. Natural Product Reports 41:1787−1810 doi: 10.1039/D4NP00029C
[5] Minami H, Dubouzet E, Iwasa K, Sato F. 2007. Functional analysis of norcoclaurine synthase in Coptis japonica. The Journal of Biological Chemistry 282:6274−6282 doi: 10.1074/jbc.M608933200
[6] Morishige T, Tamakoshi M, Takemura T, Sato F. 2010. Molecular characterization of O-methyltransferases involved in isoquinoline alkaloid biosynthesis in Coptis japonica. Proceedings of the Japan Academy, Series B, Physical and Biological Sciences 86:757−768 doi: 10.2183/pjab.86.757
[7] Guo L, Winzer T, Yang X, Li Y, Ning Z, et al. 2018. The opium poppy genome and morphinan production. Science 362:343−347 doi: 10.1126/science.aat4096
[8] Hong UVT, Tamiru-Oli M, Hurgobin B, Lewsey MG. 2025. Genomic and cell-specific regulation of benzylisoquinoline alkaloid biosynthesis in opium poppy. Journal of Experimental Botany 76:35−51 doi: 10.1093/jxb/erae317
[9] Menéndez-Perdomo IM, Facchini PJ. 2023. Elucidation of the (R)-enantiospecific benzylisoquinoline alkaloid biosynthetic pathways in sacred lotus (Nelumbo nucifera). Scientific Reports 13:2955 doi: 10.1038/s41598-023-29415-0
[10] Lee EJ, Facchini P. 2010. Norcoclaurine synthase is a member of the pathogenesis-related 10/bet v1 protein family. The Plant Cell 22:3489−3503 doi: 10.1105/tpc.110.077958
[11] Ziegler J, Facchini PJ. 2008. Alkaloid biosynthesis: metabolism and trafficking. Annual Review of Plant Biology 59:735−769 doi: 10.1146/annurev.arplant.59.032607.092730
[12] Hu Y, Wang J, Liu L, Yi X, Wang X, et al. 2025. Evolutionary history of magnoliid genomes and benzylisoquinoline alkaloid biosynthesis. Nature Communications 16:4039 doi: 10.1038/s41467-025-59343-8
[13] Shen G, Luo Y, Yao Y, Meng G, Zhang Y, et al. 2022. The discovery of a key prenyltransferase gene assisted by a chromosome-level Epimedium pubescens genome. Frontiers in Plant Science 13:1034943 doi: 10.3389/fpls.2022.1034943
[14] Chen S, Zhou Y, Chen Y, Gu J. 2018. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34:i884−i890 doi: 10.1093/bioinformatics/bty560
[15] Fu L, Niu B, Zhu Z, Wu S, Li W. 2012. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28:3150−3152 doi: 10.1093/bioinformatics/bts565
[16] Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research 25:3389−3402 doi: 10.1093/nar/25.17.3389
[17] Zheng Y, Jiao C, Sun H, Rosli HG, Pombo MA, et al. 2016. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Molecular Plant 9:1667−1670 doi: 10.1016/j.molp.2016.09.014
[18] Beier S, Thiel T, Münch T, Scholz U, Mascher M. 2017. MISA-web: a web server for microsatellite prediction. Bioinformatics 33:2583−2585 doi: 10.1093/bioinformatics/btx198
[19] Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, et al. 2013. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nature Protocols 8:1494−1512 doi: 10.1038/nprot.2013.084
[20] Rozewicki J, Li S, Amada KM, Standley DM, Katoh K. 2019. MAFFT-DASH: integrated protein sequence and structural alignment. Nucleic Acids Research 47:W5−W10 doi: 10.1093/nar/gkz342
[21] Capella-Gutiérrez S, Silla-Martínez JM, Gabaldón T. 2009. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25:1972−1973 doi: 10.1093/bioinformatics/btp348
[22] Nguyen LT, Schmidt HA, von Haeseler A, Minh BQ. 2015. IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Molecular Biology and Evolution 32:268−274 doi: 10.1093/molbev/msu300
[23] Zhang C, Rabiee M, Sayyari E, Mirarab S. 2018. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics 19:153 doi: 10.1186/s12859-018-2129-y
[24] Kumar S, Suleski M, Craig JM, Kasprowicz AE, Sanderford M, et al. 2022. TimeTree 5: an expanded resource for species divergence times. Molecular Biology and Evolution 39(8):msac174 doi: 10.1093/molbev/msac174
[25] Sanderson MJ. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19:301−302 doi: 10.1093/bioinformatics/19.2.301
[26] Chen H, Zwaenepoel A, Van de Peer Y. 2024. wgd v2: a suite of tools to uncover and date ancient polyploidy and whole-genome duplication. Bioinformatics 40(5):btae272 doi: 10.1093/bioinformatics/btae272
[27] Van Dongen S. 2008. Graph clustering via a discrete uncoupling process. SIAM Journal on Matrix Analysis and Applications 30:121−141 doi: 10.1137/040608635
[28] Price MN, Dehal PS, Arkin AP. 2010. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS One 5(3):e9490 doi: 10.1371/journal.pone.0009490
[29] Emms DM, Kelly S. 2019. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biology 20:238 doi: 10.1186/s13059-019-1832-y
[30] Lu S, Wang J, Chitsaz F, Derbyshire MK, Geer RC, et al. 2020. CDD/SPARCLE: the conserved domain database in 2020. Nucleic Acids Research 48:D265−D268 doi: 10.1093/nar/gkz991
[31] Bailey TL, Boden M, Buske FA, Frith M, Grant CE, et al. 2009. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Research 37:W202−W208 doi: 10.1093/nar/gkp335
[32] Kosakovsky Pond SL, Poon AFY, Velazquez R, Weaver S, Hepler NL, et al. 2020. HyPhy 2.5-a customizable platform for evolutionary hypothesis testing using phylogenies. Molecular Biology and Evolution 37:295−299 doi: 10.1093/molbev/msz197
[33] Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, et al. 2018. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Research 46:W296−W303 doi: 10.1093/nar/gky427
[34] Pettersen EF, Goddard TD, Huang CC, Meng EC, Couch GS, et al. 2021. UCSF ChimeraX: structure visualization for researchers, educators, and developers. Protein Science 30:70−82 doi: 10.1002/pro.3943
[35] Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. 2015. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31:3210−3212 doi: 10.1093/bioinformatics/btv351
[36] Jiao Y, Wickett NJ, Ayyampalayam S, Chanderbali AS, Landherr L, et al. 2011. Ancestral polyploidy in seed plants and angiosperms. Nature 473:97−100 doi: 10.1038/nature09916
[37] Wu S, Han B, Jiao Y. 2020. Genetic contribution of paleopolyploidy to adaptive evolution in angiosperms. Molecular Plant 13:59−71 doi: 10.1016/j.molp.2019.10.012
[38] Landis JB, Soltis DE, Li Z, Marx HE, Barker MS, et al. 2018. Impact of whole-genome duplication events on diversification rates in angiosperms. American Journal of Botany 105:348−363 doi: 10.1002/ajb2.1060
[39] Becker A, Bachelier JB, Carrive L, Conde E Silva N, Damerval C, et al. 2024. A cornucopia of diversity-Ranunculales as a model lineage. Journal of Experimental Botany 75:1800−1822 doi: 10.1093/jxb/erad492
[40] Liu X, Bu J, Ma Y, Chen Y, Li Q, et al. 2021. Functional characterization of (S)-N-methylcoclaurine 3'-hydroxylase (NMCH) involved in the biosynthesis of benzylisoquinoline alkaloids in Corydalis yanhusuo. Plant Physiology and Biochemistry 168:507−515 doi: 10.1016/j.plaphy.2021.09.042
[41] Morris JS, Yu L, Facchini PJ. 2020. A single residue determines substrate preference in benzylisoquinoline alkaloid N-methyltransferases. Phytochemistry 170:112193 doi: 10.1016/j.phytochem.2019.112193
[42] Li K, Chen X, Zhang J, Wang C, Xu Q, et al. 2022. Transcriptome analysis of Stephania tetrandra and characterization of norcoclaurine-6-O-methyltransferase involved in benzylisoquinoline alkaloid biosynthesis. Frontiers in Plant Science 13:874583 doi: 10.3389/fpls.2022.874583
[43] Hagel JM, Morris JS, Lee EJ, Desgagné-Penix I, Bross CD, et al. 2015. Transcriptome analysis of 20 taxonomically related benzylisoquinoline alkaloid-producing plants. BMC Plant Biology 15:227 doi: 10.1186/s12870-015-0596-0
[44] Leebens-Mack JH, Barker MS, Carpenter EJ, Deyholos MK, Gitzendanner MA, et al. 2019. One thousand plant transcriptomes and the phylogenomics of green plants. Nature 574:679−685 doi: 10.1038/s41586-019-1693-2
[45] Wang W, Lu AM, Ren Y, Endress ME, Chen ZD. 2009. Phylogeny and classification of Ranunculales: evidence from four molecular loci and morphological data. Perspectives in Plant Ecology, Evolution and Systematics 11:81−110 doi: 10.1016/j.ppees.2009.01.001
[46] Kim S, Soltis DE, Soltis PS, Zanis MJ, Suh Y. 2004. Phylogenetic relationships among early-diverging eudicots based on four genes: were the eudicots ancestrally woody? Molecular Phylogenetics and Evolution 31:16−30 doi: 10.1016/j.ympev.2003.07.017
[47] Sun Y, Moore MJ, Lin N, Adelalu KF, Meng A, et al. 2017. Complete plastome sequencing of both living species of Circaeasteraceae (Ranunculales) reveals unusual rearrangements and the loss of the ndh gene family. BMC Genomics 18:592 doi: 10.1186/s12864-017-3956-3
[48] Torsvik TH, Cocks LRM. 2016. Earth history and palaeogeography. Cambridge: Cambridge University Press
[49] He J, Lyu R, Luo Y, Xiao J, Xie L, et al. 2022. A phylotranscriptome study using silica gel-dried leaf tissues produces an updated robust phylogeny of Ranunculaceae. Molecular Phylogenetics and Evolution 174:107545 doi: 10.1016/j.ympev.2022.107545
[50] Linnert C, Robinson SA, Lees JA, Bown PR, Pérez-Rodríguez I, et al. 2014. Evidence for global cooling in the Late Cretaceous. Nature Communications 5:4194 doi: 10.1038/ncomms5194
[51] Westerhold T, Marwan N, Drury AJ, Liebrand D, Agnini C, et al. 2020. An astronomically dated record of Earth's climate and its predictability over the last 66 million years. Science 369:1383−1387 doi: 10.1126/science.aba6853
[52] Favre A, Päckert M, Pauls SU, Jähnig SC, Uhl D, et al. 2015. The role of the uplift of the Qinghai-Tibetan Plateau for the evolution of Tibetan biotas. Biological Reviews 90:236−253 doi: 10.1111/brv.12107
[53] Hewitt GM. 1996. Some genetic consequences of ice ages, and their role in divergence and speciation. Biological Journal of the Linnean Society 58:247−276 doi: 10.1006/bijl.1996.0035
[54] Liu Y, Wang B, Shu S, Li Z, Song C, et al. 2021. Analysis of the Coptis chinensis genome reveals the diversification of protoberberine-type alkaloids. Nature Communications 12:3276 doi: 10.1038/s41467-021-23611-0
[55] Yang X, Gao S, Guo L, Wang B, Jia Y, et al. 2021. Three chromosome-scale Papaver genomes reveal punctuated patchwork evolution of the morphinan and noscapine biosynthesis pathway. Nature Communications 12:6030 doi: 10.1038/s41467-021-26330-8
[56] Weng JK, Philippe RN, Noel JP. 2012. The rise of chemodiversity in plants. Science 336:1667−1670 doi: 10.1126/science.1217411
[57] Firn RD, Jones CG. 2000. The evolution of secondary metabolism - a unifying model. Molecular Microbiology 37:989−994 doi: 10.1046/j.1365-2958.2000.02098.x

Cite this article

Dai Z, Xu F, Wei X, Qiao Z, Wang W, et al. 2026. Evolution of key enzymes in the alkaloid biosynthetic pathway of Ranunculales. Agrobiodiversity 3(1): 31−40 doi: 10.48130/abd-0026-0003

Introduction

Ranunculales is one of the early-diverging lineages of eudicots. As the sister group to all other eudicots, it had already formed an independent evolutionary branch before the divergence of core eudicots^[1]. Within this order, plants of the families Berberidaceae and Ranunculaceae are commonly used as medicinal resources, and many members of the order accumulate high levels of benzylisoquinoline alkaloids (BIAs), such as berberine, tetrandrine, morphine, papaverine, and sanguinarine^[2]. These compounds exhibit diverse physiological activities and ecological functions, including antimicrobial activity and defense against herbivores, as well as pharmacological effects such as anticancer and analgesic action^[3,4]. To date, BIA biosynthetic pathways have been studied extensively in several species, particularly Coptis japonica, Papaver somniferum, Nelumbo nucifera, and Thalictrum flavum^[5−10]. In these plants, the intermediates and key enzymes of the BIA biosynthetic pathway have been progressively elucidated.

In this biosynthetic pathway, the core intermediate (S)-reticuline is formed through the action of key rate-limiting enzymes, including (S)-norcoclaurine synthase (NCS), norcoclaurine 6-O-methyltransferase (6OMT), coclaurine N-methyltransferase (CNMT), N-methylcoclaurine-3'-hydroxylase (NMCH), and 4'-O-methyltransferase (4'OMT)^[11]. The genes encoding these enzymes originated in ancient terrestrial plants and underwent multiple duplication events prior to the divergence of core angiosperms. During subsequent evolution, however, the vast majority of monocots and core eudicots systematically lost these duplicated gene copies and, with them, the capacity for BIA biosynthesis^[12].

Previous genome-based studies of representative Ranunculales species, such as Coptis japonica (Ranunculaceae), Papaver somniferum (Papaveraceae), and Epimedium pubescens (Berberidaceae)^[5,8,13], have shown that whole-genome duplication (WGD) and local gene duplication events are closely associated with the expansion of key enzyme gene families involved in alkaloid biosynthesis. Meanwhile, pronounced differences in alkaloid composition and type among Ranunculales species suggest that their secondary metabolic systems have undergone complex, lineage-specific diversification. However, most of these conclusions are derived from a limited number of model or representative species, and whether they hold across the whole order remains largely untested. A comprehensive order-level analysis of the evolutionary dynamics of key enzymes in the BIA biosynthetic pathway is therefore essential for elucidating the mechanisms and evolutionary history underlying chemical diversity in Ranunculales.

In this study, transcriptomic data for a total of 319 species were assembled by integrating newly generated leaf transcriptome sequencing data from 227 Ranunculales species with open-access data from the National Center for Biotechnology Information (NCBI). A high-quality transcript dataset was constructed by processing all samples through a single unified pipeline. On this basis, functional annotation, motif structure analysis, selection pressure assessment, and expression level comparison were performed for several key enzymes involved in alkaloid biosynthesis, with the aim of exploring the sequence conservation and functional evolutionary trends of key enzymes in the BIA biosynthetic pathway across different lineages within Ranunculales. This study provides rich genetic resources for research on secondary metabolism in Ranunculales and offers insights into the intra-lineage evolution of key enzymes in specific metabolic pathways.

Materials and methods

Sample collection, library construction, and RNA sequencing

Fresh and healthy leaf tissues were collected from 227 species of Ranunculales. The samples were immediately flash-frozen in liquid nitrogen and stored at −80 °C for subsequent RNA extraction and analysis of key enzymes involved in alkaloid biosynthesis. Total RNA was extracted using the cetyltrimethylammonium bromide (CTAB) method, followed by DNase I treatment. After confirming RNA integrity with an Agilent 2100 Bioanalyzer, high-quality RNA was used to construct sequencing libraries following the standard library preparation protocol. Paired-end sequencing was performed on a BGISEQ-500 platform. To enhance data coverage and lineage representation, transcriptome data of an additional 92 Ranunculales species were downloaded from the NCBI database. In total, transcriptomic data from 319 species were processed and analyzed (the species list is provided in Supplementary Table S1).

Data preprocessing, transcriptome assembly, and annotation

Raw sequencing data were quality-controlled using fastp (v0.23.4)^[14], which included adapter trimming, removal of low-quality reads, and filtering of sequences containing ambiguous bases (N). Quality control results showed high rates of valid data retention across all samples. The cleaned data were then used for de novo transcriptome assembly with Trinity (v2.15.1) under default parameters. After obtaining the initial transcript sets, CD-HIT (v4.8.1)^[15] was used to cluster transcripts and remove redundancy at a 95% sequence similarity threshold, retaining representative transcripts. Subsequently, the Trinity script 'get_longest_isoform_seq_per_trinity_gene.pl' was used to extract the longest isoform per gene as the representative sequence for annotation and comparative analysis.
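
The longest-isoform selection described above can be illustrated with a minimal Python sketch. It assumes Trinity-style identifiers in which the gene name is the transcript name without its trailing `_i<N>` suffix. The function name and the toy sequences are hypothetical, and this is not the Trinity script itself.

```python
import re

def longest_isoform_per_gene(transcripts):
    """Return {gene_id: (transcript_id, sequence)}, keeping the longest isoform.

    Assumes Trinity-style IDs such as 'TRINITY_DN10_c0_g1_i2', where the
    gene ID is the transcript ID without the trailing '_i<number>'.
    """
    best = {}
    for tid, seq in transcripts.items():
        gene = re.sub(r"_i\d+$", "", tid)
        # Only a strictly longer isoform replaces the current one,
        # so the first isoform seen wins on ties.
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (tid, seq)
    return best

# Toy input (hypothetical sequences, for illustration only)
toy_transcripts = {
    "TRINITY_DN10_c0_g1_i1": "ATGAAA",
    "TRINITY_DN10_c0_g1_i2": "ATGAAATTTGGG",
    "TRINITY_DN11_c0_g1_i1": "ATGCCC",
}
representatives = longest_isoform_per_gene(toy_transcripts)
```

Reducing each gene to one representative in this way keeps isoform redundancy from inflating gene family counts in the later comparative steps.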

To assess the completeness of the assemblies, BUSCO (v5.4.4) was run against the embryophyta_odb10 database, and the proportions of 'complete', 'fragmented', and 'missing' genes were calculated. For functional annotation, BLASTALL (E-value ≤ 1e-5)^[16] was used to align transcripts against multiple authoritative protein databases, including NR (www.ncbi.nlm.nih.gov/refseq/about/nonredundantproteins), Swiss-Prot (www.uniprot.org), Pfam (https://pfam.xfam.org), KEGG (www.genome.jp/kegg), COG (www.ncbi.nlm.nih.gov/research/cog-project), and GO (https://geneontology.org). Transcription factors (TFs) were identified via the iTAK tool^[17] by matching to known TF family definitions. Meanwhile, the MISA^[18] software was applied to detect simple sequence repeats (SSRs) in the transcripts, and their types and distribution patterns were summarized.

Transcript quantification and prediction of open reading frames and proteins

Expression levels of each gene in each species were estimated using the quantification utilities provided with Trinity in combination with the Salmon expression estimation software. Open reading frames (ORFs) and proteins were then predicted in three steps. First, the TransDecoder.LongOrfs module of TransDecoder^[19] (v5.5.0) was applied to predict ORFs from the longest transcript sequences. Second, to provide protein homology evidence, Viridiplantae protein sequences were downloaded from the UniProt database and built into a local BLAST database, which was used to generate alignment files. Finally, based on the preceding results, TransDecoder was used to generate the predicted protein files.
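
As a rough illustration of what ORF prediction involves, the following Python sketch scans the three forward reading frames for the longest ATG-to-stop ORF. It is a simplified stand-in under stated assumptions (forward strand only, standard stop codons, hypothetical function name and toy sequence), not the TransDecoder algorithm.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_forward_orf(seq, min_codons=1):
    """Return the longest ATG..stop ORF found in the three forward frames.

    min_codons counts coding codons, excluding the stop codon.
    Returns an empty string when no ORF qualifies.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]         # include the stop codon
                if len(orf) // 3 - 1 >= min_codons and len(orf) > len(best):
                    best = orf
                start = None                   # look for the next ORF
    return best

# Toy transcript (hypothetical): the ORF sits in the third forward frame
toy_orf = longest_forward_orf("CCATGAAATTTTAGGG")
```

A real predictor also scans the reverse strand and applies a minimum length, which is why homology evidence is useful for retaining short but genuine ORFs.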

Phylogenetic reconstruction and divergence time estimation

Based on the BUSCO assessment results, putative low-copy nuclear genes (BUSCO-defined single-copy orthologs) that were present in more than 80% of the sampled species were selected for phylogenetic analysis.

For the coalescent-based approach, each single-copy gene was first aligned using MAFFT (v7.505)^[20], and trimAl (v1.4.rev22)^[21] was then applied under stringent parameters to remove low-quality and highly gapped regions from the alignments. The processed alignments were input separately into IQ-TREE (v2.1.4-beta)^[22], where the model-selection option (-m MFP) was used to determine the best substitution model automatically. Single-gene trees were inferred with 1,000 ultrafast bootstrap replicates and then summarized with ASTRAL (v5.7.8)^[23] to obtain a coalescent-based species tree.

For the concatenation-based approach, the aligned sequences of all single-copy genes were concatenated into a supermatrix after quality control. Model selection and phylogenetic analysis were then conducted in IQ-TREE, with 1,000 ultrafast bootstrap replicates used to assess topological confidence.

To obtain divergence time estimates, previously reported species divergence times from the TimeTree database (https://timetree.org)^[24] were used as calibration points. Specifically, the divergence time between Amborella trichopoda and Ranunculales was set to 196 Ma (95% confidence interval: 179.9–205 Ma). The inferred phylogenetic tree was subsequently time-calibrated using the r8s software^[25], resulting in a time-scaled species phylogeny.
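
The occupancy filter and the concatenation step can be sketched in Python as follows. This is a minimal illustration with hypothetical function names and toy data. Two choices are assumptions rather than details given in the text: the 80% threshold is read as strictly greater than, and species missing from a gene are padded with gaps.

```python
def filter_by_occupancy(gene_presence, n_species, min_fraction=0.8):
    """Keep genes found as single-copy in more than min_fraction of species.

    gene_presence maps a gene ID to the set of species in which it occurs.
    """
    return sorted(
        gene for gene, species in gene_presence.items()
        if len(species) / n_species > min_fraction
    )

def concatenate_alignments(alignments, species):
    """Join per-gene alignments into one supermatrix.

    alignments is a list of {species: aligned_sequence}; a species missing
    from a gene is padded with gaps of that gene's alignment length.
    """
    parts = {s: [] for s in species}
    for aln in alignments:
        length = len(next(iter(aln.values())))
        for s in species:
            parts[s].append(aln.get(s, "-" * length))
    return {s: "".join(chunks) for s, chunks in parts.items()}

# Toy data (hypothetical): five species, three candidate genes
presence = {
    "g1": {"A", "B", "C", "D", "E"},   # 100% occupancy, kept
    "g2": {"A", "B", "C", "D"},        # exactly 80%, not "more than 80%"
    "g3": {"A", "B"},                  # 40%, dropped
}
kept_genes = filter_by_occupancy(presence, n_species=5)

supermatrix = concatenate_alignments(
    [{"A": "AT-", "B": "ATG"}, {"A": "CC", "C": "CG"}],
    species=["A", "B", "C"],
)
```

Gap-padding keeps every row of the supermatrix the same length, which maximum-likelihood programs require of an alignment.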

Estimation of whole-genome duplication (WGD) events

To investigate ancient WGD events in Ranunculales, representative species were selected and analyzed using the wgd (v2.0) pipeline^[26]. First, all-against-all BLAST searches were performed on the predicted protein sequences of each species with an E-value cutoff of 1e-5^[16], and homologous gene families were identified via the MCL clustering algorithm implemented in the wgd pipeline^[27]. Phylogenetic trees were then constructed for each gene family with FastTree^[28], and the distribution of synonymous substitutions per synonymous site (Ks) between homologous gene pairs was estimated with the ksd command of the wgd toolkit.

In addition, OrthoFinder (v2.5.5)^[29] was run with default parameters on the protein sequences of all species to infer orthogroups, identify orthologous and paralogous relationships (Supplementary Table S2), and construct the species tree. OrthoFinder assigns genes to orthogroups using sequence similarity searches and graph-based clustering, and simultaneously infers a gene tree for each orthogroup, thereby providing a homology framework for subsequent WGD identification and evolutionary analyses.

To further assess the reliability of potential WGD events, Whale analyses were performed using the gene family trees and the time-calibrated species phylogeny. Whale was run with an MCMC chain length of 500,000 generations, sampling every 100 generations, with a burn-in of 100,000. Branches with a posterior probability > 0.7 for gene duplication events were considered candidates for WGD.
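
The MCMC settings and the posterior threshold can be made concrete with a short Python sketch. It assumes that the burn-in of 100,000 is counted in generations rather than in samples, which the text does not state. The branch labels and posterior values are invented for illustration and are not results from this study.

```python
def retained_samples(chain_length, sample_every, burn_in):
    """Number of posterior samples left after discarding the burn-in.

    Assumes burn_in is expressed in generations, like chain_length.
    """
    return (chain_length - burn_in) // sample_every

def candidate_wgd_branches(posteriors, threshold=0.7):
    """Return branches whose duplication posterior exceeds the threshold."""
    return sorted(b for b, p in posteriors.items() if p > threshold)

# Settings reported in the text
n_samples = retained_samples(chain_length=500_000, sample_every=100, burn_in=100_000)

# Hypothetical branch posteriors, for illustration only
toy_posteriors = {"branch_1": 0.91, "branch_2": 0.70, "branch_3": 0.35}
candidates = candidate_wgd_branches(toy_posteriors)
```

Under that assumption, 4,000 samples per chain contribute to each posterior estimate. A branch at exactly 0.7 is excluded because the criterion is a strict inequality.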

Key enzyme profiling for alkaloid biosynthesis

To elucidate the biosynthetic mechanism of benzylisoquinoline alkaloids (BIAs), this study focused on two key enzymes in the pathway, norcoclaurine synthase (NCS) and N-methylcoclaurine-3'-hydroxylase (NMCH), and on two key methyltransferase gene families, the O-methyltransferases (OMT) and coclaurine N-methyltransferases (CNMT). Using previously collected known protein sequences as references, a combined BLASTP and HMMER strategy was employed to conduct homology searches and systematically identify homologous gene members for each target. Non-redundant sequences were further validated using the Conserved Domain Database (CDD; www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi)^[30] to ensure accuracy.
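
One way to combine the two searches is sketched below in Python. The text does not say how BLASTP and HMMER hits were merged, so taking their intersection is an assumption, as are the 1e-5 E-value cutoff for this step, the function name, and the toy identifiers. The parser relies on the standard 12-column BLAST tabular layout, in which the subject ID is the second column and the E-value the eleventh.

```python
def blast_subject_hits(tabular_lines, max_evalue=1e-5):
    """Collect subject IDs from BLAST tabular output (12-column format).

    Columns: qseqid sseqid pident length mismatch gapopen
             qstart qend sstart send evalue bitscore
    """
    hits = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue                      # skip blank and comment lines
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) <= max_evalue:
            hits.add(fields[1])
    return hits

# Toy BLASTP output: reference enzyme as query, candidate proteins as subjects
toy_blast = [
    "NCS_ref\tsp1_g1\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-50\t300",
    "NCS_ref\tsp1_g2\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20",
]
# Toy HMMER result, already reduced to a set of sequence IDs
toy_hmmer_hits = {"sp1_g1", "sp1_g3"}

blast_hits = blast_subject_hits(toy_blast)
candidate_members = blast_hits & toy_hmmer_hits   # supported by both searches
```

Requiring support from both methods trades sensitivity for specificity. Taking the union instead would be the more permissive choice before domain validation.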

Based on the filtered sequences, phylogenetic trees were constructed separately with IQ-TREE to assess their lineage-specific evolutionary relationships. Conserved motifs in candidate proteins were identified using the MEME SUITE^[31], and functional domains were annotated jointly with the Pfam database.

To investigate adaptive evolution of the genes, the aBSREL method in the HyPhy software package^[32] was applied to detect signals of positive selection across branches, based on the phylogenetic trees and the corresponding coding sequences (CDS). For lineages showing positive selection, the HyPhy MEME method (Mixed Effects Model of Evolution, distinct from the MEME SUITE motif tool) was further used for site-specific analysis, and the detected codon sites were mapped to their corresponding positions in the protein sequences. To associate expression patterns with evolutionary features, gene expression levels were obtained from the transcriptomic data and visualized in heatmaps. For proteins under positive selection, homology modeling was performed via the SWISS-MODEL server (https://swissmodel.expasy.org/interactive)^[33] to predict their three-dimensional structures. Finally, positively selected sites, key residues, catalytic sites, and predicted substrate-binding sites were mapped onto the three-dimensional structures and visualized with ChimeraX (v1.11)^[34].
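
Mapping a selected codon column back to a residue position requires accounting for alignment gaps. The Python sketch below shows one way to do this, assuming an in-frame codon alignment in which gaps occur in whole-codon blocks. The function name and the toy sequence are hypothetical.

```python
def codon_site_to_residue(aligned_cds, codon_site):
    """Map a 1-based codon column of a codon alignment to the 1-based
    residue position in the ungapped protein of this sequence.

    Returns None when the column is gapped or lies beyond the sequence.
    """
    start = (codon_site - 1) * 3
    codon = aligned_cds[start:start + 3]
    if len(codon) < 3 or "-" in codon:
        return None
    # Count the ungapped codons that precede this column
    upstream = aligned_cds[:start].replace("-", "")
    return len(upstream) // 3 + 1

# Toy aligned CDS (hypothetical): the second codon column is a gap
toy_cds = "ATG---AAATTT"
positions = [codon_site_to_residue(toy_cds, site) for site in (1, 2, 3, 4)]
```

Here alignment columns 3 and 4 correspond to residues 2 and 3 of the protein, which is the numbering needed when marking sites on a structural model.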
