[1]

Youngblut ND, Carpenter C, Nayebnazar A, Adduri A, Shah R, et al. 2025. scBaseCount: an AI agent-curated, uniformly processed, and continually expanding single cell data repository. bioRxiv 640494

doi: 10.1101/2025.02.27.640494
[2]

Ruan W, Lyu Y, Zhang J, Cai J, Shu P, et al. 2025. Large language models for bioinformatics. arXiv 2501.06271v1

doi: 10.48550/arXiv.2501.06271
[3]

Gao S, Fang A, Huang Y, Giunchiglia V, Noori A, et al. 2024. Empowering biomedical discovery with AI agents. Cell 187(22):6125−6151

doi: 10.1016/j.cell.2024.09.022
[4]

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, et al. 2023. Attention is all you need. arXiv 1706.03762v7

doi: 10.48550/arXiv.1706.03762
[5]

Ji Y, Zhou Z, Liu H, Davuluri RV. 2021. DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome. Bioinformatics 37(15):2112−2120

doi: 10.1093/bioinformatics/btab083
[6]

Zhou Z, Ji Y, Li W, Dutta P, Davuluri R, et al. 2024. DNABERT-2: efficient foundation model and benchmark for multi-species genome. arXiv 2306.15006v2

doi: 10.48550/arXiv.2306.15006
[7]

Zhou Z, Wu W, Ho H, Wang J, Shi L, et al. 2024. DNABERT-S: pioneering species differentiation with species-aware DNA embeddings. arXiv 2402.08777v3

doi: 10.48550/arXiv.2402.08777
[8]

Dalla-Torre H, Gonzalez L, Mendoza-Revilla J, Lopez Carranza N, Grzywaczewski AH, et al. 2025. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nature Methods 22(2):287−297

doi: 10.1038/s41592-024-02523-z
[9]

Boshar S, Evans B, Tang Z, Picard A, Adel Y, et al. 2025. A foundational model for joint sequence-function multi-species modeling at scale for long-range genomic prediction. bioRxiv 695963

doi: 10.1101/2025.12.22.695963
[10]

Nguyen E, Poli M, Faizi M, Thomas A, Birch-Sykes C, et al. 2023. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. arXiv 2306.15794v2

doi: 10.48550/arXiv.2306.15794
[11]

Fishman V, Kuratov Y, Shmelev A, Petrov M, Penzar D, et al. 2025. GENA-LM: a family of open-source foundational DNA language models for long sequences. Nucleic Acids Research 53(2):gkae1310

doi: 10.1093/nar/gkae1310
[12]

Brixi G, Durrant MG, Ku J, Poli M, Brockman G, et al. 2025. Genome modeling and design across all domains of life with Evo 2. bioRxiv 638918

doi: 10.1101/2025.02.18.638918
[13]

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, et al. 2024. Sequence modeling and design from molecular to genome scale with Evo. Science 386:eado9336

doi: 10.1126/science.ado9336
[14]

Wu W, Zhou Z, Riley R, Abdulqader M, Song X, et al. 2025. Uncovering the genomic manifold via scalable learning from the global microbiome. bioRxiv 635558

doi: 10.1101/2025.01.30.635558
[15]

Avsec Ž, Latysheva N, Cheng J, Novati G, Taylor KR, et al. 2025. AlphaGenome: advancing regulatory variant effect prediction with a unified DNA sequence model. bioRxiv 661532

doi: 10.1101/2025.06.25.661532
[16]

Penić RJ, Vlašić T, Huber RG, Wan Y, Šikić M. 2025. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. Nature Communications 16:5671

doi: 10.1038/s41467-025-60872-5
[17]

Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, et al. 2025. Simulating 500 million years of evolution with a language model. Science 387(6736):850−858

doi: 10.1126/science.ads0018
[18]

Chen B, Cheng X, Li P, Geng YA, Gong J, et al. 2024. xTrimoPGLM: unified 100B-scale pre-trained transformer for deciphering the language of protein. arXiv 2401.06199v2

doi: 10.48550/arXiv.2401.06199
[19]

Senior AW, Evans R, Jumper J, Kirkpatrick J, Sifre L, et al. 2020. Improved protein structure prediction using potentials from deep learning. Nature 577(7792):706−710

doi: 10.1038/s41586-019-1923-7
[20]

Agarwal V, McShan AC. 2024. The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins. Nature Chemical Biology 20(8):950−959

doi: 10.1038/s41589-024-01638-w
[21]

Jumper J, Evans R, Pritzel A, Green T, Figurnov M, et al. 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596(7873):583−589

doi: 10.1038/s41586-021-03819-2
[22]

Abramson J, Adler J, Dunger J, Evans R, Green T, et al. 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630(8016):493−500

doi: 10.1038/s41586-024-07487-w
[23]

Lewis S, Hempel T, Jiménez-Luna J, Gastegger M, Xie Y, et al. 2025. Scalable emulation of protein equilibrium ensembles with generative deep learning. Science 389(6761):eadv9817

doi: 10.1126/science.adv9817
[24]

Nijkamp E, Ruffolo JA, Weinstein EN, Naik N, Madani A. 2023. ProGen2: exploring the boundaries of protein language models. Cell Systems 14(11):968−978.e3

doi: 10.1016/j.cels.2023.10.002
[25]

Yang J, Bhatnagar A, Ruffolo JA, Madani A. 2024. Function-guided conditional generation using protein language models with adapters. arXiv 2410.03634v2

doi: 10.48550/arXiv.2410.03634
[26]

Garau-Luis JJ, Bordes P, Gonzalez L, Roller M, de Almeida BP, et al. 2024. Multi-modal transfer learning between biological foundation models. arXiv 2406.14150v1

doi: 10.48550/arXiv.2406.14150
[27]

de Almeida BP, Richard G, Dalla-Torre H, Blum C, Hexemer L, et al. 2025. A multimodal conversational agent for DNA, RNA and protein tasks. Nature Machine Intelligence 7(6):928−941

doi: 10.1038/s42256-025-01047-1
[28]

Liu T, Xiao Y, Luo X, Xu H, Zheng WJ, et al. 2024. Geneverse: a collection of open-source multimodal large language models for genomic and proteomic research. arXiv 2406.15534v1

doi: 10.48550/arXiv.2406.15534
[29]

St John P, Lin D, Binder P, Greaves M, Shah V, et al. 2024. BioNeMo Framework: a modular, high-performance library for AI model development in drug discovery. arXiv 2411.10548v5

doi: 10.48550/arXiv.2411.10548
[30]

Theodoris CV, Xiao L, Chopra A, Chaffin MD, Al Sayed ZR, et al. 2023. Transfer learning enables predictions in network biology. Nature 618(7965):616−624

doi: 10.1038/s41586-023-06139-9
[31]

Chen H, Venkatesh MS, Ortega JG, Mahesh SV, Nandi TN, et al. 2024. Quantized multi-task learning for context-specific representations of gene network dynamics. bioRxiv 608180

doi: 10.1101/2024.08.16.608180
[32]

Cui H, Wang C, Maan H, Pang K, Luo F, et al. 2024. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nature Methods 21(8):1470−1480

doi: 10.1038/s41592-024-02201-0
[33]

Wang C, Cui H, Zhang A, Xie R, Goodarzi H, et al. 2025. scGPT-spatial: continual pretraining of single-cell foundation model for spatial transcriptomics. bioRxiv 636714

doi: 10.1101/2025.02.05.636714
[34]

Zeng Y, Xie J, Shangguan N, Wei Z, Li W, et al. 2025. CellFM: a large-scale foundation model pre-trained on transcriptomics of 100 million human cells. Nature Communications 16:4679

doi: 10.1038/s41467-025-59926-5
[35]

Hao M, Gong J, Zeng X, Liu C, Guo Y, et al. 2024. Large-scale foundation model on single-cell transcriptomics. Nature Methods 21(8):1481−1491

doi: 10.1038/s41592-024-02305-7
[36]

Cao S, Yang K, Cheng J, Li J, Shen HB, et al. 2024. stFormer: a foundation model for spatial transcriptomics. bioRxiv 615337

doi: 10.1101/2024.09.27.615337
[37]

Schaar AC, Tejada-Lapuerta A, Palla G, Gutgesell R, Halle L, et al. 2024. Nicheformer: a foundation model for single-cell and spatial omics. bioRxiv 589472

doi: 10.1101/2024.04.15.589472
[38]

Levine D, Rizvi SA, Lévy S, Pallikkavaliyaveetil N, Zhang D, et al. 2024. Cell2Sentence: teaching large language models the language of biology. bioRxiv 557287

doi: 10.1101/2023.09.11.557287
[39]

Rizvi SA, Levine D, Patel A, Zhang S, Wang E, et al. 2025. Scaling large language models for next-generation single-cell analysis. bioRxiv 648850

doi: 10.1101/2025.04.14.648850
[40]

Su Z, Fang M, Smolnikov A, Dinger ME, Oates EC, et al. 2025. GeneRAIN: multifaceted representation of genes via deep learning of gene expression networks. Genome Biology 26(1):288

doi: 10.1186/s13059-025-03749-6
[41]

Ouyang Z, Li J. 2026. Scouter predicts transcriptional responses to genetic perturbations with large language model embeddings. Nature Computational Science 6(1):21−28

doi: 10.1038/s43588-025-00912-8
[42]

Luo E, Hao M, Wei L, Zhang X. 2024. scDiffusion: conditional generation of high-quality single-cell data using diffusion model. Bioinformatics 40(9):btae518

doi: 10.1093/bioinformatics/btae518
[43]

Luo E, Wei L, Hao M, Zhang X, Liu Q. 2025. Multi-modal diffusion model with dual-cross-attention for multi-omics data generation and translation. bioRxiv 640020

doi: 10.1101/2025.02.27.640020
[44]

Cornejo-Páramo P, Zhang X, Louis L, Li Z, Yang Y, et al. 2025. Motif-based models accurately predict cell type-specific distal regulatory elements. Nature Communications 16:10370

doi: 10.1038/s41467-025-65362-2
[45]

Chen W, Zhang P, Tran TN, Xiao Y, Li S, et al. 2025. A visual–omics foundation model to bridge histopathology with spatial transcriptomics. Nature Methods 22(7):1568−1582

doi: 10.1038/s41592-025-02707-1
[46]

Ding T, Wagner SJ, Song AH, Chen RJ, Lu MY, et al. 2025. A multimodal whole-slide foundation model for pathology. Nature Medicine 31(11):3749−3761

doi: 10.1038/s41591-025-03982-3
[47]

Kong Z, Qiu M, Boesen J, Lin X, Yun S, et al. 2025. SPATIA: multimodal model for prediction and generation of spatial cell phenotypes. arXiv 2507.04704v2

doi: 10.48550/arXiv.2507.04704
[48]

Qian L, Dong Z, Guo T. 2025. Grow AI virtual cells: three data pillars and closed-loop learning. Cell Research 35(5):319−321

doi: 10.1038/s41422-025-01101-y
[49]

Bunne C, Roohani Y, Rosen Y, Gupta A, Zhang X, et al. 2024. How to build the virtual cell with artificial intelligence: priorities and opportunities. Cell 187(25):7045−7063

doi: 10.1016/j.cell.2024.11.015
[50]

Noutahi E, Hartford J, Tossou P, Whitfield S, Denton AK, et al. 2025. Virtual cells: predict, explain, discover. arXiv 2505.14613v3

doi: 10.48550/arXiv.2505.14613
[51]

Wei Z, Ma R, Wang Z, Li Z, Song S, et al. 2025. VCWorld: a biological world model for virtual cell simulation. arXiv 2512.00306v2

doi: 10.48550/arXiv.2512.00306
[52]

Johnson JAI, Bergman DR, Rocha HL, Zhou DL, Cramer E, et al. 2025. Human interpretable grammar encodes multicellular systems biology models to democratize virtual cell laboratories. Cell 188(17):4711−4733.e37

doi: 10.1016/j.cell.2025.06.048
[53]

Chen Z, Tian S, Pei J, Gu R, Li Y, et al. 2025. UniCure: a foundation model for predicting personalized cancer therapy response. bioRxiv 658531

doi: 10.1101/2025.06.14.658531
[54]

Adduri AK, Gautam D, Bevilacqua B, Imran A, Shah R, et al. 2025. Predicting cellular responses to perturbation across diverse contexts with State. bioRxiv 661135

doi: 10.1101/2025.06.26.661135
[55]

Zhang J, Ubas AA, de Borja R, Svensson V, Thomas N, et al. 2025. Tahoe-100M: a giga-scale single-cell perturbation atlas for context-dependent gene function and cellular modeling. bioRxiv 639398

doi: 10.1101/2025.02.20.639398
[56]

Ji Y, Tejada-Lapuerta A, Schmacke NA, Zheng Z, Zhang X, et al. 2025. Scalable and universal prediction of cellular phenotypes enables in silico experiments. bioRxiv 607533

doi: 10.1101/2024.08.12.607533
[57]

Xu J, Yang X, Li Y, Wang H, Li Y, et al. 2025. ODFormer: a virtual organoid for predicting personalized therapeutic responses in pancreatic cancer. bioRxiv 663664

doi: 10.1101/2025.07.08.663664
[58]

Peidli S, Green TD, Shen C, Gross T, Min J, et al. 2024. scPerturb: harmonized single-cell perturbation data. Nature Methods 21(3):531−540

doi: 10.1038/s41592-023-02144-y
[59]

Chandrasekaran SN, Cimini BA, Goodale A, Miller L, Kost-Alimova M, et al. 2024. Three million images and morphological profiles of cells treated with matched chemical and genetic perturbations. Nature Methods 21(6):1114−1121

doi: 10.1038/s41592-024-02241-6
[60]

Kraus O, Comitani F, Urbanik J, Kenyon-Dean K, Arumugam L, et al. 2025. RxRx3-core: benchmarking drug-target interactions in high-content microscopy. arXiv 2503.20158v2

doi: 10.48550/arXiv.2503.20158
[61]

Huang AC, Hsieh THS, Zhu J, Michuda J, Teng A, et al. 2025. X-Atlas/Orion: genome-wide perturb-seq datasets via a scalable fix-cryopreserve platform for training dose-dependent biological foundation models. bioRxiv 659105

doi: 10.1101/2025.06.11.659105
[62]

Wu Y, Wershof E, Schmon SM, Nassar M, Osiński B, et al. 2025. PerturBench: benchmarking machine learning models for cellular perturbation analysis. arXiv 2408.10609v4

doi: 10.48550/arXiv.2408.10609
[63]

Li C, Ziyadeh E, Sharma Y, Dumoulin B, Levinsohn J, et al. 2025. Nephrobase cell+: multimodal single-cell foundation model for decoding kidney biology. arXiv 2509.26223v1

doi: 10.48550/arXiv.2509.26223
[64]

Liu L, Li W, Wang F, Li Y, Huang LK, et al. 2025. A pre-trained large generative model for translating single-cell transcriptomes to proteomes. Nature Biomedical Engineering 1−20

doi: 10.1038/s41551-025-01528-z
[65]

Kedzierska KZ, Crawford L, Amini AP, Lu AX. 2025. Zero-shot evaluation reveals limitations of single-cell foundation models. Genome Biology 26(1):101

doi: 10.1186/s13059-025-03574-x
[66]

DenAdel A, Hughes M, Thoutam A, Gupta A, Navia AW, et al. 2025. Evaluating the role of pre-training dataset size and diversity on single-cell foundation model performance. bioRxiv 628448

doi: 10.1101/2024.12.13.628448
[67]

Wang Q, Pan Y, Zhou M, Tang Z, Wang Y, et al. 2025. scDrugMap: benchmarking large foundation models for drug response prediction. arXiv 2505.05612v1

doi: 10.48550/arXiv.2505.05612
[68]

Zhang F, Liu T, Zhu Z, Wu H, Wang H, et al. 2025. CellVerse: do large language models really understand cell biology? arXiv 2505.07865v1

doi: 10.48550/arXiv.2505.07865
[69]

Xiao Y, Liu J, Zheng Y, Jiao S, Hao J, et al. 2025. CellAgent: LLM-driven multi-agent framework for natural language-based single-cell analysis. bioRxiv 593861

doi: 10.1101/2024.05.13.593861
[70]

Wang H, He Y, Coelho PP, Bucci M, Nazir A, et al. 2025. SpatialAgent: an autonomous AI agent for spatial biology. bioRxiv 646459

doi: 10.1101/2025.04.03.646459
[71]

Alber S, Chen B, Sun E, Isakova A, Wilk AJ, et al. 2025. CellVoyager: AI compbio agent generates new insights by autonomously analyzing biological data. bioRxiv 657517

doi: 10.1101/2025.06.03.657517
[72]

Schaefer M, Peneder P, Malzl D, Lombardo SD, Peycheva M, et al. 2025. Multimodal learning enables chat-based exploration of single-cell data. Nature Biotechnology 1−11

doi: 10.1038/s41587-025-02857-9
[73]

Huang S, Šabanović B, Peng Y, Zheng Q, Alessandri L, et al. 2026. GPTBioInsightor − leveraging large language models for transparent scRNA-seq cell type annotations. Bioinformatics Advances 6:vbag025

doi: 10.1093/bioadv/vbag025
[74]

Xie E, Cheng L, Shireman J, Cai Y, Liu J, et al. 2026. CASSIA: a multi-agent large language model for automated and interpretable cell annotation. Nature Communications 17:389

doi: 10.1038/s41467-025-67084-x
[75]

Liu W, Li J, Tang Y, Zhao Y, Liu C, et al. 2025. DrBioRight 2.0: an LLM-powered bioinformatics chatbot for large-scale cancer functional proteomics analysis. Nature Communications 16:2256

doi: 10.1038/s41467-025-57430-4
[76]

Zhou J, Zhang B, Li G, Chen X, Li H, et al. 2024. An AI agent for fully automated multi-omic analyses. Advanced Science 11:2407094

doi: 10.1002/advs.202407094
[77]

Mehandru N, Hall AK, Melnichenko O, Dubinina Y, Tsirulnikov D, et al. 2025. BioAgents: bridging the gap in bioinformatics analysis with multi-agent systems. Scientific Reports 15:39036

doi: 10.1038/s41598-025-25919-z
[78]

Hong G, Banos DT. 2025. Nano bio-agents (NBA): small language model agents for genomics. arXiv 2509.19566v1

doi: 10.48550/arXiv.2509.19566
[79]

Roohani Y, Lee A, Huang Q, Vora J, Steinhart Z, et al. 2025. BioDiscoveryAgent: an AI agent for designing genetic perturbation experiments. arXiv 2405.17631v3

doi: 10.48550/arXiv.2405.17631
[80]

Xu Q, Soto C, Shahnawaz M, Liu X, Jiang X, et al. 2025. Multi-agent large language models for biomedical hypothesis generation in drug combination discovery. iScience 28(12):113984

doi: 10.1016/j.isci.2025.113984
[81]

Qu Y, Huang K, Yin M, Zhan K, Liu D, et al. 2026. CRISPR-GPT for agentic automation of gene-editing experiments. Nature Biomedical Engineering 10(2):245−258

doi: 10.1038/s41551-025-01463-z
[82]

Ghafarollahi A, Buehler MJ. 2024. ProtAgents: protein discovery via large language model multi-agent collaborations combining physics and machine learning. arXiv 2402.04268v1

doi: 10.48550/arXiv.2402.04268
[83]

Liu S, Lu Y, Chen S, Hu X, Zhao J, et al. 2025. DrugAgent: automating AI-aided drug discovery programming through LLM multi-agent collaboration. arXiv 2411.15692v2

doi: 10.48550/arXiv.2411.15692
[84]

Averly R, Baker FN, Watson IA, Ning X. 2025. LIDDIA: language-based intelligent drug discovery agent. arXiv 2502.13959v3

doi: 10.48550/arXiv.2502.13959
[85]

Zhang F, Zhao Y, Zhang W, Lai L. 2025. BioScientist agent: designing LLM-biomedical agents with KG-augmented RL reasoning modules for drug repurposing and mechanism of action elucidation. bioRxiv 669291

doi: 10.1101/2025.08.08.669291
[86]

Velez-Arce A, Lin X, Li MM, Huang K, Gao W, et al. 2024. Signals in the cells: multimodal and contextualized machine learning foundations for therapeutics. bioRxiv 598655

doi: 10.1101/2024.06.12.598655
[87]

Gao S, Zhu R, Kong Z, Noori A, Su X, et al. 2025. TxAgent: an AI agent for therapeutic reasoning across a universe of tools. arXiv 2503.10970v1

doi: 10.48550/arXiv.2503.10970
[88]

Schmidgall S, Su Y, Wang Z, Sun X, Wu J, et al. 2025. Agent laboratory: using LLM agents as research assistants. arXiv 2501.04227v2

doi: 10.48550/arXiv.2501.04227
[89]

Lu C, Lu C, Lange RT, Foerster J, Clune J, et al. 2024. The AI scientist: towards fully automated open-ended scientific discovery. arXiv 2408.06292v3

doi: 10.48550/arXiv.2408.06292
[90]

Penadés JR, Gottweis J, He L, Patkowski JB, Daryin A, et al. 2025. AI mirrors experimental science to uncover a mechanism of gene transfer crucial to bacterial evolution. Cell 188(23):6654−6665.e2

doi: 10.1016/j.cell.2025.08.018
[91]

Swanson K, Wu W, Bulaong NL, Pak JE, Zou J. 2025. The Virtual Lab of AI agents designs new SARS-CoV-2 nanobodies. Nature 646(8085):716−723

doi: 10.1038/s41586-025-09442-9
[92]

Huang K, Zhang S, Wang H, Qu Y, Lu Y, et al. 2025. Biomni: a general-purpose biomedical AI agent. bioRxiv 656746

doi: 10.1101/2025.05.30.656746
[93]

Zhang Z, Qiu Z, Wu Y, Li S, Wang D, et al. 2026. OriGene: a self-evolving virtual disease biologist automating therapeutic target discovery. bioRxiv 657658

doi: 10.1101/2025.06.03.657658
[94]

Cong L, Smerkous D, Wang X, Yin D, Zhang Z, et al. 2025. LabOS: the AI-XR co-scientist that sees and works with humans. arXiv 2510.14861v2

doi: 10.48550/arXiv.2510.14861
[95]

Zhu L, Lai Y, Xie J, Mou W, Huang L, et al. 2025. Evaluating the potential risks of employing large language models in peer review. Clinical and Translational Discovery 5(4):e70067

doi: 10.1002/ctd2.70067
[96]

Zhu L, Lai Y, Mou W, Zhang H, Lin A, et al. 2024. ChatGPT's ability to generate realistic experimental images poses a new challenge to academic integrity. Journal of Hematology & Oncology 17(1):27

doi: 10.1186/s13045-024-01543-8
[97]

Rudin C. 2019. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence 1(5):206−215

doi: 10.1038/s42256-019-0048-x
[98]

Kim Y, Jeong H, Chen S, Li SS, Park C, et al. 2025. Medical hallucinations in foundation models and their impact on healthcare. arXiv 2503.05777v2

doi: 10.48550/arXiv.2503.05777
[99]

Zhao H, Chen H, Yang F, Liu N, Deng H, et al. 2024. Explainability for large language models: a survey. ACM Transactions on Intelligent Systems and Technology 15(2):1−38

doi: 10.1145/3639372
[100]

Atti S, Subramaniam S. 2025. Fundamental limitations of foundation models in single-cell transcriptomics. bioRxiv 661767

doi: 10.1101/2025.06.26.661767
[101]

Li H, Zhang Z, Squires M, Chen X, Zhang X. 2025. scMultiSim: simulation of single-cell multi-omics and spatial data guided by gene regulatory networks and cell–cell interactions. Nature Methods 22(5):982−993

doi: 10.1038/s41592-025-02651-0
[102]

Li CP, Kalisa AT, Roohani S, Hummedah K, Menge F, et al. 2025. The imitation game: large language models versus multidisciplinary tumor boards: benchmarking AI against 21 sarcoma centers from the ring trial. Journal of Cancer Research and Clinical Oncology 151(9):248

doi: 10.1007/s00432-025-06304-9
[103]

Zhang Z, Zhou Z, Jin R, Cong L, Wang M. 2025. GeneBreaker: jailbreak attacks against DNA language models with pathogenicity guidance. arXiv 2505.23839v1

doi: 10.48550/arXiv.2505.23839
[104]

Wang M, Dupré la Tour T, Watkins O, Makelov A, Chi RA, et al. 2025. Persona features control emergent misalignment. arXiv 2506.19823v2

doi: 10.48550/arXiv.2506.19823
[105]

Guo W, Kundu J, Tos U, Kong W, Sisto G, et al. 2025. System-performance and cost modeling of large language model training and inference. arXiv 2507.02456v1

doi: 10.48550/arXiv.2507.02456
[106]

Wang Y, He J, Du Y, Chen X, Li JC, et al. 2025. Large language model is secretly a protein sequence optimizer. arXiv 2501.09274v2

doi: 10.48550/arXiv.2501.09274
[107]

Gao Y, Xiong Y, Gao X, Jia K, Pan J, et al. 2024. Retrieval-augmented generation for large language models: a survey. arXiv 2312.10997v5

doi: 10.48550/arXiv.2312.10997
[108]

Wang C, Long Q, Xiao M, Cai X, Wu C, et al. 2024. BioRAG: a RAG-LLM framework for biological question reasoning. arXiv 2408.01107v2

doi: 10.48550/arXiv.2408.01107
[109]

Jeong M, Sohn J, Sung M, Kang J. 2024. Improving medical reasoning through retrieval and self-reflection with retrieval-augmented large language models. arXiv 2401.15269v3

doi: 10.48550/arXiv.2401.15269
[110]

Anthropic Public Benefit Corporation (Anthropic PBC). 2024. Introducing the model context protocol. Anthropic PBC, USA. www.anthropic.com/news/model-context-protocol

[111]

Khoei TT, Ehtesham A, Kumar S, Khoei TT. 2025. A survey of the model context protocol (MCP): standardizing context to enhance large language models (LLMs). Preprints

doi: 10.20944/preprints202504.0245.v1
[112]

Hou X, Zhao Y, Wang S, Wang H. 2025. Model context protocol (MCP): landscape, security threats, and future research directions. arXiv 2503.23278v3

doi: 10.48550/arXiv.2503.23278
[113]

Haase J, Pokutta S. 2026. Human−AI cocreativity: exploring synergies across levels of creative collaboration. In Generative Artificial Intelligence and Creativity, eds. Worwood MJ, Kaufman JC. Amsterdam: Elsevier. pp. 205−221

doi: 10.1016/B978-0-443-34073-4.00009-5

[114]

Kim Y, Lee SJ, Donahue C. 2025. Amuse: human-AI collaborative songwriting with multimodal inspirations. arXiv 2412.18940v2

doi: 10.48550/arXiv.2412.18940
[115]

Wu A, Kuang K, Zhu M, Wang Y, Zheng Y, et al. 2024. Causality for large language models. arXiv 2410.15319v1

doi: 10.48550/arXiv.2410.15319
[116]

Liang H, Wang C, Yu H, Kirsch D, Pant R, et al. 2025. Real-time experiment-theory closed-loop interaction for autonomous materials science. Science Advances 11(27):eadu7426

doi: 10.1126/sciadv.adu7426
[117]

Bayley O, Savino E, Slattery A, Noël T. 2024. Autonomous chemistry: navigating self-driving labs in chemical and material sciences. Matter 7(7):2382−2398

doi: 10.1016/j.matt.2024.06.003