ARTICLE   Open Access    

Genetic diversity, population structure and core collection analysis of Hunan tea plant germplasm through genotyping-by-sequencing

  • Tea is considered to be a well-known and widely consumed beverage and Hunan province is rich in tea plant germplasm. In order to better conserve and utilize Hunan tea plant resources, 110 tea accessions from seven geographical origins were used to assess genetic diversity of Hunan tea plant germplasm through genotyping by sequencing (GBS) technology. As a result, a total of 311,044 high-quality single nucleotide polymorphism (SNP) markers were obtained. Population structure, phylogenetic relationships and principal component analysis (PCA) divided the entire accessions into three groups. The genetic diversity and population differentiation analysis showed that the mean observed heterozygosity (Ho) ranged from 0.16 to 0.24, while the mean polymorphic information content (PIC) ranged from 0.14 to 0.17, and mean minor allele frequency (MAF) ranged from 0.11 to 0.14. Analysis of molecular variance (AMOVA) indicated that 81.38% of the total variance was derived from within populations, which suggested a rich genetic diversity in Hunan tea germplasms. Furthermore, a core tea germplasm set was developed, which was comprised of 22 tea plant accessions and maintained the whole genetic diversity of the entire collection. This work should be valuable for conservation and utilization of tea germplasm in Hunan.
  • Salvia rosmarinus L. (formerly Rosmarinus officinalis), commonly known as rosemary, thrives in dry regions, on hills and low mountains, and on calcareous, shale, clay, and rocky substrates[1]. Its use in traditional medicine since ancient times is justified by its antiseptic, antimicrobial, anti-inflammatory, antioxidant, and antitumorigenic activity[1,2]. The main objective of this study is to evaluate the in vitro antimicrobial activity of different extracts of Salvia rosmarinus and to examine in silico how its compounds target enzymes involved in cervical cancer. Since the start of the 20th century, studies have shown that microbial infections can cause cancers, including cervical cancer; worldwide, infections are linked to about 15% to 20% of cancers[3]. More recently, infections with certain viruses such as human papillomavirus (HPV) and human immunodeficiency virus (HIV), bacteria such as Chlamydia trachomatis, and parasitic diseases such as schistosomiasis have been recognized as risk factors for cancer in humans[3]. Cancer is a group of diseases characterized by uncontrolled growth and spread of abnormal cells. Many factors are known to increase the risk of cancer, including dietary factors, certain infections, lack of physical activity, obesity, and environmental pollutants[4]. Some studies have found that an imbalance of the normal Lactobacillus flora of the female reproductive tract increases the growth of yeast species (such as Candida albicans), and others have found that women whose blood tests showed past or current Chlamydia trachomatis infection may be at greater risk of cervical cancer. It could therefore be that human papillomavirus (HPV) promotes cervical cancer growth[3].
In traditional healing, Salvia rosmarinus is used as a muscle relaxant, as a treatment for cutaneous allergy and tumors, to aid digestion, and to treat depressive behavior; mothers also wash their bodies with it to remove bacterial and fungal infections, promote hair growth, and fight bad smells[5].

    The study of plant-based chemicals, known as phytochemicals, in medicinal plants is gaining popularity because of their numerous pharmacological effects[6] against drug-resistant pathogens and cancers. The causes of drug resistance in bacteria, fungi, and cancer are diverse, complex, and only partially understood. These factors may act together to initiate or promote infections and carcinogenesis in the human body, a leading cause of death[7]. Antimicrobial medicines are the cornerstone of modern medicine. The emergence and spread of drug-resistant pathogens such as bacteria and fungi threaten our ability to treat common infections and to perform life-saving procedures, including cancer chemotherapy, cesarean sections, hip replacements, organ transplantation, and other surgeries[7]. Information about the current magnitude of the burden of bacterial and fungal drug resistance, its trends in different parts of the world, and the leading pathogen–drug combinations contributing to that burden is therefore crucial. If left unchecked, the spread of drug resistance could make many microbial pathogens much more lethal in the future than they are today. In addition, cancers can affect almost any part of the body and have many anatomical and molecular subtypes, each requiring a specific management strategy. More than 200 different types of cancer have been identified. The world's most common cancers affecting men are lung, prostate, colorectal, stomach, and liver cancers[8], while breast, cervical, colorectal, lung, and stomach cancers are the most commonly diagnosed among women[8]. Although some cancers are said to be preventable, they remain a cause of death in humans; cervical cancer is one example. There is therefore a need to search for both antimicrobial and anticancer agents from a single source, Salvia rosmarinus.

    Cervical cancer is a common cancer in women and a prominent cause of death[9]. In Ethiopia, it is the second most common cancer among women aged 15 to 44[9]. Globally, it is the fourth most common cancer in women[10]. Aberrant methylation of the promoters of tumor-suppressor genes can shut down their functions and plays a major role in causing cervical tumors[10]. There are various cervical cancer repressor genes (genes whose proteins turn off or reduce expression of the affected gene), such as CCNA1, CHF, HIT, PAX1, PTEN, SFRP4, and TSC1. These genes play a crucial role in cervical cancer: promoter hypermethylation alters their transcription and expression, leading to precursor lesions during cervical development and to malignant transformation[11]. DNA methylation is primarily carried out by a group of enzymes known as DNA methyltransferases, which includes DNMT1. It has been reported that DNMT1 (PDB ID: 4WXX), a protein responsible for DNA methylation, can contribute to the development of cervical cancer. DNMT1 inhibits the transcription of tumor suppressor genes, facilitating tumorigenesis that eventually develops into cervical cancer. Inhibiting the DNMT1 enzyme may decrease the hypermethylation of repressed genes, increase their expression, and reverse the phenotype of malignant tumors.

    In addition, infection by human papillomavirus (HPV) type 16, whose E6 protein (PDB ID: 4XR8) is studied here, has been correlated with a greatly increased risk of cervical cancer worldwide[12]. Based on variations in the nucleotide sequences of the virus genome, over 100 distinct types of HPV have been identified (e.g. types 1, 2, etc.). Certain sexually transmitted HPV types, such as types 6 and 11, can cause genital warts. Other HPV types that infect the genitalia, however, do not produce any symptoms of infection[8]. Persistent infection with a subset of approximately 13 so-called 'high-risk' sexually transmitted HPVs (types 16, 18, 31, 33, 35, 39, 45, 51, 52, 56, 58, 59, and 68), which differ from the types that cause warts, may lead to the development of cervical intraepithelial neoplasia (CIN), vulvar intraepithelial neoplasia (VIN), penile intraepithelial neoplasia (PIN), and/or anal intraepithelial neoplasia (AIN). These are precancerous lesions that can progress to invasive cancer. HPV infection is a required component in almost all occurrences of cervical cancer[13]. Infection by HPV type 16 E6 (PDB ID: 4XR8) in particular has been correlated with a greatly increased risk of precursor genital lesions and cervical cancer worldwide[11]. Researchers have further characterized the major biochemical and biological activities of HPV type 16 E6 (PDB ID: 4XR8) among the high-risk HPV oncogenes, and how these may work together in the development of cervical disease and cancer[13].

    One potential approach to treating cervical cancer is to specifically inhibit the activity of the DNMT1 and HPV type 16 E6 enzymes[13–16]. Over 50% of clinical drug forms worldwide originate from plant compounds[17]. In the past, developing new drugs was a lengthy and costly process. With the emergence of bioinformatics, however, computer-based tools and methods have become increasingly important in drug discovery. One such method is molecular docking combined with ADMET profiling, which uses the structure of a drug to screen for potential candidates. This approach is known as structure-based drug design and can save both time and resources during the research process[15]. Structure-based drug design addresses ligand binding sites on a known protein structure[15]. Docking is a computational method that uses free binding energies to examine a large number of molecules and to suggest structural hypotheses for inhibiting the target molecule[17]. Owing to increasing drug resistance in bacteria, fungi, and cancer cells, natural products remain an important source for discovering antimicrobial compounds and novel anticancer drugs, including drugs against cervical cancer. The purpose of this research is therefore to assess the antimicrobial activity of extracts of Salvia rosmarinus and, through molecular docking and ADMET profiling, the anticancer properties of compounds isolated from it that target DNMT1 and HPV type 16 E6 in human cervical cancer. In the present study, crude extracts of Salvia rosmarinus obtained with various solvents were tested for antimicrobial activity, and the isolated compounds 1 and 2 were submitted to an in silico study targeting the DNMT1 and HPV type 16 E6 enzymes to inhibit the growth of human cervical cancer cells.

    Healthy Salvia rosmarinus leaves were collected in Bacho district, Southwest Showa, Oromia, Ethiopia, during the dry season of November 2022. The plant materials were authenticated by Melaku Wondafrish, Natural Science Department, Addis Ababa University and deposited with a voucher number 3/2-2/MD003-80/8060/15 in Addis Ababa University's National Herbarium.

    The organic solvents used for extraction were petroleum ether, chloroform/methanol (1:1), and methanol (2.5 L each), which are among the solvents most commonly used to extract medicinal plants. Microbial tests were performed in sterile Petri dishes (100 mm diameter) containing sterile Muller–Hinton Agar medium (25 mL, pH 7) for bacteria and Sabouraud Dextrose Agar (SDA) for fungi. Sterile Whatman No. 1 filter paper discs (6 mm diameter) were used to determine the sensitivity of each organism, expressed as the mean zone of inhibition (MZI). Ciprofloxacin (tablets manufactured by Wellona Pharma, India) and ketoconazole 2% (made in Bangladesh) were used as positive controls for the antibacterial and antifungal tests, respectively, and dimethyl sulfoxide (DMSO) 98.9% was used as a negative control for the antimicrobial tests. The chromatography column was 650 mm high and 80 mm wide, and the stationary phase was silica gel 60 (0.200 mm particles). Several previous studies have shown acceptable efficiency of column chromatography (up to 43.0% w/w recovery) in the fractionation and separation of phenolic compounds from plant samples[18]. The 1H-NMR spectra of the compounds were recorded on a 600 MHz NMR instrument, and the 13C-NMR spectra at 150 MHz. For NMR analysis, compound 1 was dissolved in MeOD and compound 2 in DMSO. A UV spectrophotometer (made in China) was used at 570 nm to determine the absorbance of flavonoid phytochemicals (mg·g−1).

    The extracts were analyzed for the presence of alkaloids (Wagner's reagent), saponins (froth test), steroids (Liebermann–Burchard test), terpenoids (Liebermann–Burchard test), quinones, and flavonoids (Shinoda test)[19].

    The leaves of Salvia rosmarinus (500 g) were successively extracted by maceration with petroleum ether, chloroform/methanol (1:1), and methanol (2.5 L each) for 72 h, affording 3.6, 6, and 53 g of crude extract, respectively. The methanol/chloroform (1:1) extract (6 g) was loaded onto a silica gel (150 g) column and eluted with a petroleum ether and methanol/chloroform (1:1) solvent system of increasing polarity to afford 80 fractions (100 mL each). The fraction obtained from chloroform/methanol 1:1 (3:2) yielded compound 1 (18 mg) after repeated column chromatography. Fractions 56-65, eluted with chloroform/methanol (1:1), were combined and purified by column chromatography to give compound 2 (10 mg).

    The microorganisms were obtained from the Ethiopia Biodiversity Institution (EBI). Two gram-positive bacteria, namely Staphylococcus aureus (ATCC 25923) and Staphylococcus epidermidis (ATCC 14990), and three gram-negative bacteria, namely Escherichia coli (ATCC 25922), Pseudomonas aeruginosa (ATCC 5702), and Klebsiella pneumoniae (ATCC e13883), were cultured overnight at 37 °C on Muller–Hinton Agar (MHA) medium. Two fungal strains, Candida albicans (ATCC 16404) and Aspergillus niger (ATCC 11414), were cultured overnight at 27−30 °C on Sabouraud Dextrose Agar (SDA) medium[20].

    The antibacterial and antifungal activities of different crude extracts obtained from Salvia rosmarinus plant leaves were evaluated by the disk diffusion method (in accordance with the 13th edition of the CLSI M02 document on hardydiagnostics.com/disk-diffusion). Briefly, the test was performed in sterile Petri dishes (100 mm diameter) containing solid and sterile Muller–Hinton Agar medium (25 mL, pH 7) and Sabouraud Dextrose Agar (SDA) for bacteria and fungi, respectively. The extracts were placed on the surface of the media that had previously been injected with a sterile microbial suspension (one microbe per petri dish) after being adsorbed on sterile paper discs (5 μL per Whatman disc of 6 mm diameter). To prevent test samples from eventually evaporating, all Petri dishes were sealed with sterile laboratory films. They were then incubated at 37 °C for 24 h, and the zone diameter of the inhibition was measured and represented in millimeters. Ciprofloxacin antibiotic reference (manufactured by Wellona Pharma Ciprofloxacin tablet, India) was used as a positive control and DMSO was used as a negative control for antibacterial activity test while Ketoconazole 2% (Bangladesh) was used as a positive control and 10 μL of 0.2% agar as a negative control for antifungal activity tests[20]. The term 'inhibitory concentration' refers to the minimum sample concentration required to kill 99.9% of the microorganisms present[21]. Three repetitions of the crude extract sample were used to precisely measure the inhibitory halo diameter (in mm), which was then expressed as mean ± standard deviation to assess the anti-microbial activity.
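    The reporting step described above (triplicate zone diameters summarized as mean ± standard deviation) can be sketched as follows. This is a minimal illustration, assuming the sample standard deviation (n − 1 denominator) is intended; the function names and the replicate readings are hypothetical and are not measurements from this study.

```python
from statistics import mean, stdev


def summarize_zone(diameters_mm):
    """Return (mean, sample SD) of replicate inhibition-zone diameters in mm."""
    if len(diameters_mm) < 2:
        raise ValueError("at least two replicates are needed for a standard deviation")
    return mean(diameters_mm), stdev(diameters_mm)


def format_zone(diameters_mm):
    """Format replicates in the 'mean ± SD mm' style used for zone diameters."""
    m, sd = summarize_zone(diameters_mm)
    return f"{m:.2f} ± {sd:.2f} mm"


# Hypothetical triplicate readings for one extract/organism pair
replicates = [12.0, 13.0, 14.0]
print(format_zone(replicates))
```

    If the population standard deviation were intended instead, `statistics.pstdev` could be substituted for `stdev`.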

    Proteins implicated in cervical cancer were identified from the relevant literature. The structures of DNA (cytosine-5)-methyltransferase 1 (DNMT1) (PDB ID: 4WXX)[21] and HPV type 16 E6 (PDB ID: 4XR8)[21], proteins known to contribute to cervical cancer, were downloaded from the Protein Data Bank[22]. The stability of each protein structure was assessed using Rampage[23].

    Phytochemical constituents of Salvia rosmarinus leaves were used as the source of secondary metabolites (ligands). The ligand molecules were obtained through plant extraction and isolation, and their structures were retrieved from PubChem (https://pubchem.ncbi.nlm.nih.gov/). The ligands were downloaded in structure-data file (SDF) format and then converted to PDB format using an online SMILES translator (https://cactus.nci.nih.gov/translate/). The resulting PDB files were used as input for the various tools and software[24].

    The Biovia Discovery Studio Visualizer software was used to prepare the protein molecule. The protein was opened in PDB format and its hierarchy was inspected to select the bound ligands and water molecules, which were then removed from the structure. Finally, the cleaned crystal structure of the protein was saved as a PDB file[25].

    PyRx software was used to screen the secondary metabolites and identify the ligands with the lowest binding energy to the protein target. The ligands with the lowest binding energy were then screened for their drug-likeness properties. PyRx works with the PDBQT format (Protein Data Bank format with partial charge (Q) and atom type (T)). The protein molecule was loaded first and converted from PDB to PDBQT format. The ligands were then imported from a folder in structure-data file (SDF) format, energy-minimized, and converted to PDBQT format. The protein was docked with each ligand, and the ligands were screened on the basis of minimum binding energy (https://cactus.nci.nih.gov/translate/).
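    The ranking step (keeping the ligands with the lowest binding energy) can be sketched as below. This is an illustration only: it assumes the docked output follows the AutoDock Vina PDBQT convention, in which each pose carries a `REMARK VINA RESULT:` line whose first number is the affinity in kcal/mol. The ligand names and scores are hypothetical and are not results from this study.

```python
def best_vina_affinity(pdbqt_text):
    """Return the lowest (most favourable) affinity in kcal/mol, or None if absent."""
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, affinity, RMSD l.b., RMSD u.b.
            scores.append(float(line.split()[3]))
    return min(scores) if scores else None


def rank_ligands(outputs):
    """Sort ligands by best affinity; `outputs` maps ligand name -> docked PDBQT text."""
    scored = [(name, best_vina_affinity(text)) for name, text in outputs.items()]
    scored = [(name, score) for name, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical docking outputs (placeholder names and scores)
demo = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:    -6.1      0.000      0.000\nENDMDL\n"
                "MODEL 2\nREMARK VINA RESULT:    -5.8      1.200      2.300\nENDMDL",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:    -7.4      0.000      0.000\nENDMDL",
}
for name, score in rank_ligands(demo):
    print(name, score)
```

    More negative values indicate more favourable predicted binding, so the first entry in the sorted list is the top-ranked ligand.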

    The optimal ligand was selected for final docking using AutoDock Vina and Biovia Discovery Studio Client 2021 (https://cactus.nci.nih.gov/translate/).

    The protein target from the Protein Data Bank (PDB) was loaded onto the graphical interface of AutoDock Vina. To prepare the protein for docking, water molecules were removed, polar hydrogen atoms were added, and Kollman charges were assigned to the protein molecule. The prepared protein was then saved in PDBQT format. The ligand molecule was imported in PDB format and converted to PDBQT format. Next, a grid box was chosen to define the docking region. AutoDock Vina was run from the command prompt and the outcomes were examined (https://cactus.nci.nih.gov/translate/).
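    The command-prompt step can be illustrated with a small helper that assembles an AutoDock Vina command line from the prepared files and the grid box. This is a sketch under stated assumptions: the file names and grid-box centre and size are placeholders, not the values used in this study, and the helper only builds the argument list without running the program.

```python
def build_vina_command(receptor, ligand, center, size,
                       out="docked.pdbqt", exhaustiveness=8):
    """Assemble an AutoDock Vina command line as a list of arguments."""
    args = ["vina", "--receptor", receptor, "--ligand", ligand]
    # Grid-box centre (in Angstrom) along x, y, z
    for axis, value in zip("xyz", center):
        args += [f"--center_{axis}", str(value)]
    # Grid-box edge lengths (in Angstrom) along x, y, z
    for axis, value in zip("xyz", size):
        args += [f"--size_{axis}", str(value)]
    args += ["--exhaustiveness", str(exhaustiveness), "--out", out]
    return args


# Placeholder file names and grid box (assumptions for illustration)
cmd = build_vina_command("protein.pdbqt", "ligand.pdbqt",
                         center=(0.0, 0.0, 0.0), size=(20, 20, 20))
print(" ".join(cmd))
```

    The resulting list could be passed to `subprocess.run(cmd, check=True)` on a machine where the `vina` executable is installed.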

    Docking of the ligand with the protein targets DNMT1 (PDB ID: 4WXX)[22] and HPV type 16 E6 (PDB ID: 4XR8)[21] was performed using Biovia Discovery Studio Client 2021 by loading the protein target first, followed by the ligand in PDB format. Charges were assigned to the protein molecule, and the energy of the ligands was minimized. Both the protein and ligand molecules were thus prepared for docking. Once docking was complete, the results were analyzed on the basis of several parameters, including absolute energy, clean energy, conf number, mol number, relative energy, and pose number. The interaction between the protein and ligand was analyzed using structure visualization tools such as Biovia Discovery Studio Visualizer and PyMol (https://cactus.nci.nih.gov/translate/).

    The process of visualizing the structure was carried out using the PyMol tool. PyMol is a freely available software. Firstly, the protein molecule in PDBQT form was loaded on the PyMol graphical screen. Then, the output PDBQT file was added. The docked structure was visualized and the 'molecule' option was changed to 'molecular surface' under the 'shown as' menu (https://cactus.nci.nih.gov/translate/).

    The drug-likeness properties of the screened ligands were evaluated using the SwissADME online server. SMILES notations were obtained from PubChem and submitted to the SwissADME web server for analysis. The ligands were assessed against Lipinski's rule of five[20]. Ligands that satisfied the rule were selected for final docking with AutoDock Vina and Biovia Discovery Studio Client 2021; ligands 1 and 2 were analyzed in this way.
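    Lipinski's rule of five can be expressed as a simple check on four molecular descriptors. The sketch below assumes the commonly quoted thresholds (molecular weight ≤ 500 Da, logP ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors) and tolerates one violation, which is a common convention rather than a setting confirmed by this study. The descriptor values are rounded, illustrative inputs and are not SwissADME output from this work; the second molecule is entirely hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float   # g/mol
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d, max_violations=1):
    """Treat a ligand as drug-like if it has at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


# Approximate descriptors for benzocaine, quoted for illustration only
benzocaine = Descriptors(mol_weight=165.19, logp=1.9, h_donors=1, h_acceptors=3)
# Hypothetical large, lipophilic molecule that fails every criterion
bulky = Descriptors(mol_weight=620.0, logp=6.2, h_donors=6, h_acceptors=11)

print(lipinski_violations(benzocaine), passes_rule_of_five(benzocaine))
print(lipinski_violations(bulky), passes_rule_of_five(bulky))
```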

    The antimicrobial data, generated from triplicate measurements, are reported as mean ± standard deviation. The analysis was performed and the bar graphs were generated with GraphPad Prism version 8.0.1 (244) for Windows. GraphPad Prism combines scientific graphing, comprehensive curve fitting (nonlinear regression), understandable statistics, and data organization, and it supports the basic statistical tests commonly used in microbiology labs (https://graphpad-prism.software.informer.com/8.0/).

    The different extracts were screened for the presence (+) or absence (−) of alkaloids, steroids, glycosides, coumarins, terpenoids, flavonoids, carbohydrates, tannins, and saponins. Alkaloids, terpenoids, flavonoids, and tannins were highly present in the petroleum ether, chloroform/methanol (1:1), and methanol extracts of S. rosmarinus leaves, whereas glycosides, coumarins, and carbohydrates were moderately present. The leaf extracts of S. rosmarinus thus contain common bioactive constituents such as alkaloids, steroids, terpenoids, flavonoids, tannins, and saponins. These bioactive chemicals have active medicinal properties, and the phytochemical compounds found in S. rosmarinus leaves have the potential to act against cancer cells and pathogens. The flavonoids detected belong to the natural phenolic compounds of the human diet that have anticancer and antimicrobial properties (Table 1).

    Table 1.  Phytochemical screening tests result of petroleum ether, chloroform/methanol (1:1) and methanol extracts of Salvia rosmarinus leaves.
    Botanical name Phytochemicals Phytochemical screening tests Different extracts
    Petroleum ether Chloroform/methanol (1:1) Methanol
    Salvia rosmarinus Alkaloids Wagner's test ++ ++ ++
    Steroids Liebermann–Burchard test ++ + ++
    Glycoside Keller-Killiani test +
    Coumarins Appirade test + +
    Terpenoids Liebermann–Burchard test ++ ++ ++
    Flavonoids Shinoda test ++ ++ ++
    Carbohydrate Fehling's test ++ ++
    Tannins Lead acetate test ++ ++ ++
    Saponins Foam test + + +
    + indicates moderate presence, ++ indicates highly present, − indicates absence.

    Two compounds were isolated and characterized using NMR spectroscopic methods (Fig. 1 & Supplementary Fig. S1a–c). Compound 1 (10 mg) was isolated as yellow crystals from the methanol/chloroform (1:1) leaf extract of Salvia rosmarinus. The TLC profile showed a spot at Rf 0.42 with methanol/chloroform (3:2) as the mobile phase. The 1H-NMR spectrum (600 MHz, MeOD, Table 2, Supplementary Fig. S1a) of compound 1 showed one olefinic proton signal at δ 5.3 (t, J = 3.7 Hz, 1H), two deshielded protons at δ 4.7 (m, 1H) and 4.1 (m, 1H) associated with the C-30 exocyclic methylene group, one O-bearing methine proton at δH 3.2 (m, 1H), and six methyl signals at δ 1.14 (s, 3H), 1.03 (d, J = 6.3 Hz, 3H), 1.00 (s, 3H), 0.98 (s, 3H), 0.87 (s, 3H), and 0.80 (s, 3H). A proton signal at δ 2.22 (d, J = 13.5 Hz, 1H) was attributed to the methine proton H-18. Other proton signals, integrating for 20 protons, were observed in the range δ 2.2 to 1.2. The proton-decoupled 13C-NMR and DEPT-135 spectra (151 MHz, MeOD, Supplementary Fig. S1b & c) of compound 1 revealed 30 well-resolved carbon signals, suggesting a triterpene skeleton. The 13C-NMR spectrum displayed signals corresponding to six methyl, nine methylene, seven methine, and eight quaternary carbons. Among them, the signal at δ 125.5 (C-12) belongs to an olefinic carbon. The methylene carbons gave signals at δC 39.9, 28.5, 18.1, 36.7, 23.9, 30.4, 26.5, 32.9, and 38.6. The quaternary carbons gave signals at δC 39.4, 41.9, 38.4, 138.2, 41.8, and 47.8. The exocyclic methylene carbon signals appeared at δ 153.1 and 103.9. The spectrum also showed an sp3 oxygenated methine carbon at δ 78.3 and a carboxyl carbon at δ 180.2. The methyl groups gave signals at δC 27.4, 16.3, 15.0, 20.2, 22.7, and 16.4. The remaining signals, for aliphatic methine carbons, appeared at δC 55.3, 55.2, 53.0, and 37.1. The NMR spectral data of compound 1 are in good agreement with the data for micromeric acid, previously reported from the same species by Abdel-Monem et al.[26] (Fig. 1, Table 2).

    Figure 1.  Structure of isolated compounds from the leaves of Salvia rosmarinus.
    Table 2.  Comparison of the 13C-NMR spectral data of compound 1 and micromeric acid (MeOD, δ in ppm).
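    | Position | 1H-NMR (compound 1) | 13C-NMR (compound 1) | 13C-NMR (Abdel-Monem et al.[26]) |
    |---|---|---|---|
    | 1 | | 38.60 | 39.9 |
    | 2 | | 27.8 | 28.5 |
    | 3 | 3.2 (m, 1H) | 78.3 | 80.3 |
    | 4 | | 39.4 | 39.9 |
    | 5 | | 55.3 | 56.7 |
    | 6 | | 18.1 | 18.3 |
    | 7 | | 36.7 | 34.2 |
    | 8 | | 41.9 | 40.7 |
    | 9 | | 53 | 48.8 |
    | 10 | | 38.4 | 38.2 |
    | 11 | | 23.9 | 24.6 |
    | 12 | 5.3 (t, J = 3.7 Hz, 1H) | 125.5 | 127.7 |
    | 13 | | 138.2 | 138 |
    | 14 | | 41.8 | 43.3 |
    | 15 | | 30.4 | 29.1 |
    | 16 | | 26.5 | 25.6 |
    | 17 | | 47.8 | 48 |
    | 18 | 2.22 (d, J = 13.5 Hz, 1H) | 55.2 | 56.1 |
    | 19 | | 37.1 | 38.7 |
    | 20 | | 153.1 | 152.8 |
    | 21 | | 32.9 | 33.5 |
    | 22 | | 39.0 | 40.1 |
    | 23 | | 27.4 | 29.4 |
    | 24 | | 16.3 | 16.9 |
    | 25 | | 15.0 | 16.6 |
    | 26 | | 20.2 | 18.3 |
    | 27 | | 22.7 | 24.6 |
    | 28 | | 180.2 | 177.8 |
    | 29 | | 16.4 | 17.3 |
    | 30 | 4.7 (m, 1H), 4.1 (m, 1H) | 103.9 | 106.5 |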

    Compound 2 (18 mg) was obtained as a white amorphous solid from the fraction eluted with 40% methanol/chloroform (1:1) in petroleum ether, with an Rf value of 0.49. The 1H-NMR spectrum (600 MHz, DMSO, Supplementary Fig. S2a) showed two doublets at δ 7.79 (d, J = 8.7 Hz, 2H) and 6.90 (d, J = 8.7 Hz, 2H), which indicate a 1,4-disubstituted aromatic ring. The oxygenated methylene and terminal methyl protons appeared at δ 4.25 (q, J = 7.1 Hz, 2H) and 1.29 (t, J = 7.1 Hz, 3H), respectively. The 13C-NMR spectrum, together with the DEPT-135 spectrum (151 MHz, DMSO, Table 3, Supplementary Fig. S2b & c), confirmed seven well-resolved carbon peaks corresponding to nine carbons: three quaternary carbons, one oxygenated methylene carbon, one terminal methyl carbon, and two signals for symmetrical aromatic methine carbons. The quaternary carbon signals appeared at δ 120.9 (C-1) and 148.2 (C-4), and the ester carbonyl at δ 166.0 (C-7). The symmetrical aromatic carbons gave signals at δ 131.4 (C-2, 6) and 116.8 (C-3, 5). The oxygenated methylene and terminal methyl carbons appeared at δC 60.4 (C-8) and 14.7 (C-9), respectively. These spectral results are in good agreement with those reported for benzocaine by Alotaibi et al.[27]. Accordingly, compound 2 was identified as benzocaine (4-aminobenzoic acid ethyl ester) (Table 3, Fig. 1, Supplementary Fig. S2a–c). This compound has not previously been reported from the leaves of Salvia rosmarinus.

    Table 3.  Comparison of the 1H-NMR, and 13C-NMR spectral data of compound 2 and benzocaine (DMSO, δ in ppm).
    | Position | 1H-NMR (compound 2) | 13C-NMR (compound 2) | 1H-NMR (Alotaibi et al.[27]) | 13C-NMR (Alotaibi et al.[27]) |
    |---|---|---|---|---|
    | 1 | | 120.9 | | 119 |
    | 2 | 7.79 (d, J = 8.7 Hz, 2H) | 131.4 | 7.86 (d, J = 7.6 Hz) | 132 |
    | 3 | 6.90 (d, J = 8.7 Hz, 2H) | 116.8 | 6.83 (d, J = 7.6 Hz) | 114 |
    | 4 | | 148.2 | | 151 |
    | 5 | 6.90 (d, J = 8.7 Hz, 2H) | 116.8 | 6.83 (d, J = 7.6 Hz) | 114 |
    | 6 | 7.79 (d, J = 8.7 Hz, 2H) | 131.4 | 7.86 (d, J = 7.6 Hz) | 132 |
    | 7 | | 166.0 | | 169 |
    | 8 | 4.3 (q, J = 7.1 Hz, 2H) | 60.4 | 4.3 (q, J = 7.0 Hz) | 61 |
    | 9 | 1.3 (t, J = 7.1 Hz, 3H) | 14.7 | 1.36 (t, J = 7.0 Hz) | 15 |

    The extracts and isolated compounds from Salvia rosmarinus were evaluated in vitro against gram-positive bacteria (S. aureus and S. epidermidis), gram-negative bacteria (E. coli, P. aeruginosa, and K. pneumoniae), and fungi (C. albicans and A. niger) (Table 4). The petroleum ether extract exhibited significant activity against all the tested microbes at 100 μg·mL−1, with inhibition zones ranging from 7 to 21 mm. The chloroform/methanol (1:1) and methanol extracts were also active at 100 μg·mL−1, with inhibition zones from 6 to 14 mm and 6 to 13 mm, respectively (Table 4). The chloroform/methanol (1:1) extract was significantly active against the bacteria E. coli and K. pneumoniae and the fungus A. niger at 100 μg·mL−1, with inhibition zones of 12 to 14 mm, but was inactive against the bacteria S. epidermidis and P. aeruginosa and the fungus C. albicans (Table 4). The methanol extract exhibited significant activity against the bacteria S. aureus and E. coli and the fungus A. niger at 100 μg·mL−1, with inhibition zones of 11 to 13 mm, but was inactive against K. pneumoniae (Table 4). Overall, the extracts of Salvia rosmarinus evaluated in vitro exhibited significant antibacterial and antifungal activity, with inhibition zones between 6 and 21 mm for bacteria and between 5 and 21 mm for fungi. The ciprofloxacin positive control gave inhibition zones of 21.33 ± 1.15 mm, 15.00 ± 0.00 mm, and 14.20 ± 0.50 mm in the petroleum ether, chloroform/methanol (1:1), and methanol extract series, respectively.
Similarly, the ketoconazole positive control gave inhibition zones of 22.00 ± 1.00 mm, 13.67 ± 0.58 mm, and 15.00 ± 0.58 mm in the petroleum ether, chloroform/methanol (1:1), and methanol extract series, respectively. Additionally, the mean values of flavonoids (mg/g) tested were 92.2%, 90.4%, and 94.0% for the petroleum ether, chloroform/methanol (1:1), and methanol extracts, respectively. This suggests that the groups of phenolic compounds evaluated play a significant role in antimicrobial activity, particularly against antibiotic-resistant strains.

    Table 4.  Comparison of mean zone of inhibition (MZI) of leaf extracts of Salvia rosmarinus.
    Average values of the zone of inhibition (mm). Concentration is the concentration (μg·mL−1) of extract in 99.8% DMSO. S. aureus and S. epidermidis are gram-positive (+) bacteria; E. coli, P. aeruginosa, and K. pneumoniae are gram-negative (−) bacteria; C. albicans and A. niger are fungi.
    | Extract | Sample | Concentration | S. aureus | S. epidermidis | E. coli | P. aeruginosa | K. pneumoniae | C. albicans | A. niger |
    |---|---|---|---|---|---|---|---|---|---|
    | Petroleum ether | S. rosmarinus | 50 | 18.50 ± 0.50 | 15.33 ± 0.58 | 0.00 ± 0.00 | 0.00 ± 0.00 | 10.00 ± 0.00 | 15.93 ± 0.12 | 4.47 ± 0.50 |
    | Petroleum ether | S. rosmarinus | 75 | 19.87 ± 0.06 | 17.00 ± 0.00 | 9.33 ± 0.29 | 10.53 ± 0.50 | 10.93 ± 0.12 | 18.87 ± 0.23 | 5.47 ± 0.50 |
    | Petroleum ether | S. rosmarinus | 100 | 21.37 ± 0.78 | 17.50 ± 0.50 | 11.47 ± 0.50 | 13.17 ± 0.29 | 12.43 ± 0.51 | 20.83 ± 0.76 | 6.70 ± 0.10 |
    | Petroleum ether | Cipro. | | 21.33 ± 1.15 | 18.33 ± 0.58 | 9.33 ± 0.58 | 12.30 ± 0.52 | 15.00 ± 0.00 | | |
    | Petroleum ether | Ketocon. | | | | | | | 22.00 ± 1.00 | 10.67 ± 0.58 |
    | Chloroform/methanol (1:1) | S. rosmarinus | 50 | 5.47 ± 0.42 | 0.00 ± 0.00 | 10.33 ± 0.00 | 0.00 ± 0.00 | 9.70 ± 0.00 | 0.00 ± 0.12 | 8.47 ± 0.50 |
    | Chloroform/methanol (1:1) | S. rosmarinus | 75 | 5.93 ± 0.06 | 0.00 ± 0.00 | 11.33 ± 0.29 | 0.00 ± 0.50 | 12.50 ± 0.12 | 0.00 ± 0.23 | 10.67 ± 0.50 |
    | Chloroform/methanol (1:1) | S. rosmarinus | 100 | 6.47 ± 0.06 | 0.00 ± 0.00 | 14.17 ± 0.50 | 7.33 ± 0.29 | 14.17 ± 0.51 | 0.00 ± 0.76 | 12.67 ± 0.10 |
    | Chloroform/methanol (1:1) | Cipro. | | 15.00 ± 0.00 | 11.00 ± 1.00 | 11.33 ± 0.58 | 10.00 ± 0.52 | 12.67 ± 0.00 | | |
    | Chloroform/methanol (1:1) | Ketocon. | | | | | | | 7.00 ± 1.00 | 13.67 ± 0.58 |
    | Methanol | S. rosmarinus | 50 | 9.17 ± 0.29 | 5.50 ± 0.50 | 0.00 ± 0.00 | 7.50 ± 0.00 | 0.00 ± 0.00 | 6.57 ± 0.12 | 0.00 ± 0.50 |
    | Methanol | S. rosmarinus | 75 | 9.90 ± 0.10 | 6.93 ± 0.12 | 9.33 ± 0.29 | 8.50 ± 0.50 | 0.00 ± 0.00 | 8.70 ± 0.23 | 0.00 ± 0.50 |
    | Methanol | S. rosmarinus | 100 | 11.63 ± 0.55 | 7.97 ± 0.06 | 11.47 ± 0.50 | 9.90 ± 0.10 | 0.00 ± 0.00 | 10.83 ± 0.76 | 13.13 ± 0.10 |
    | Methanol | Cipro. | | 13.00 ± 0.00 | 11.50 ± 0.50 | 14.20 ± 0.58 | 13.33 ± 0.29 | 10.00 ± 0.00 | | |
    | Methanol | Ketocon. | | | | | | | 12.00 ± 1.00 | 15.00 ± 0.58 |

    Mean values of flavonoids (mg·g−1) by 570 nm, S. rosmarinus
    | Concentration | Petroleum ether extracts | Chloroform/methanol (1:1) extracts | Methanol extracts |
    |---|---|---|---|
    | 50 | 0.736 | 0.797 | 0.862 |
    | 75 | 0.902 | 0.881 | 0.890 |
    | 100 | 0.922 | 0.904 | 0.940 |
    Samples: Antibiotics: Cipro., Ciprofloxacin; Ketocon., ketoconazole (Nizoral); DMSO 99.8%, Dimethyl sulfoxide.

    The three solvent extracts of S. rosmarinus produced inhibition zones that were broadly comparable with those of the positive (+) controls. At 100 μg·mL−1, the petroleum ether leaf extract gave mean zones of inhibition (MZI) of 21.37 ± 0.78, 17.50 ± 0.50, 11.47 ± 0.50, 13.17 ± 0.29, and 12.43 ± 0.51 mm against the drug-resistant human pathogenic bacteria S. aureus, S. epidermidis, E. coli, P. aeruginosa, and K. pneumoniae, respectively, and 20.83 ± 0.76 and 6.70 ± 0.10 mm against the human pathogenic fungi C. albicans and A. niger, respectively. Against S. aureus, the MZI of this extract (21.37 ± 0.78 mm) was marginally higher than that of the positive control (21.33 ± 1.15 mm). The chloroform/methanol (1:1) extract gave larger zones than the positive control against E. coli (14.17 ± 0.50 vs 11.33 ± 0.58 mm) and K. pneumoniae (14.17 ± 0.51 vs 12.67 ± 0.00 mm). The methanol leaf extract gave MZI values that were lower overall than those of the positive control. The Salvia rosmarinus crude extracts showed better activity against fungi than against gram-negative (−) bacteria (Table 4, Fig. 2, Supplementary Fig. S3). The three extracts, obtained with solvents of different polarity indexes, can therefore be associated with specific biological activities. For example, the antimicrobial activities of Salvia rosmarinus extracts may be due to the presence of alkaloids, terpenoids, flavonoids, tannins, and saponins (Table 1).
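
The comparison with the positive control can be checked directly against the Table 4 means. The following is a minimal Python sketch, not the authors' analysis: the names `ORGANISMS`, `EXTRACT_MZI`, `CIPRO_MZI`, and `exceeds_control` are hypothetical, only the 100 μg·mL−1 bacterial means are used, and standard deviations are ignored.

```python
# Mean zones of inhibition (mm) at 100 ug/mL for the five bacteria, copied from Table 4.
# Standard deviations are deliberately left out, so this is a comparison of means only.
ORGANISMS = ["S. aureus", "S. epidermidis", "E. coli", "P. aeruginosa", "K. pneumoniae"]

EXTRACT_MZI = {
    "petroleum ether": [21.37, 17.50, 11.47, 13.17, 12.43],
    "chloroform/methanol (1:1)": [6.47, 0.00, 14.17, 7.33, 14.17],
    "methanol": [11.63, 7.97, 11.47, 9.90, 0.00],
}

# Ciprofloxacin positive-control means (mm) reported alongside each extract in Table 4.
CIPRO_MZI = {
    "petroleum ether": [21.33, 18.33, 9.33, 12.30, 15.00],
    "chloroform/methanol (1:1)": [15.00, 11.00, 11.33, 10.00, 12.67],
    "methanol": [13.00, 11.50, 14.20, 13.33, 10.00],
}


def exceeds_control(extract):
    """Return the organisms for which the extract mean is larger than the control mean."""
    pairs = zip(ORGANISMS, EXTRACT_MZI[extract], CIPRO_MZI[extract])
    return [organism for organism, mzi, control in pairs if mzi > control]


if __name__ == "__main__":
    for name in EXTRACT_MZI:
        print(name, "->", exceeds_control(name))
```

Because only means are compared, a narrow margin such as 21.37 vs 21.33 mm for S. aureus lies within the reported standard deviations and should not be read as a significant difference.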

    Figure 2.  Antimicrobial activity of Salvia rosmarinus extracts against the test microbes relative to standard antibiotics. The panels show the three extracts under study: (a) petroleum ether, (b) chloroform/methanol (1:1), and (c) methanol extracts of Salvia rosmarinus.

    Compounds 1 and 2 were isolated from the chloroform/methanol (1:1) extract of Salvia rosmarinus (Fig. 1, Tables 2 & 3). At a concentration of 100 μg·mL−1, the plant extracts gave their highest antibacterial results, with mean inhibition diameters of 21 mm against S. aureus and 14 mm against E. coli and K. pneumoniae. Overall, the highly active petroleum ether extract of Salvia rosmarinus inhibited the growth of S. aureus and C. albicans, with inhibition zones of 21 and 20 mm, respectively. The petroleum ether extracts showed good efficacy against all tested microbes, particularly gram-positive bacteria and fungi (Table 4). The activity against gram-negative bacteria is noteworthy because these bacteria generally exhibit greater resistance to antimicrobial agents. At 100 μg·mL−1, the petroleum ether and chloroform/methanol (1:1) leaf extracts gave inhibition zone diameters of 11 and 14 mm for E. coli, 13 and 7 mm for P. aeruginosa, and 12 and 14 mm for K. pneumoniae, respectively.

    The present study found that at a concentration of 50 μg·mL−1, the petroleum ether, chloroform/methanol (1:1), and methanol extracts produced no inhibition zone against several of the tested microbes, whereas zones appeared or widened at 75 and 100 μg·mL−1 (Table 4). This implies that the samples have a dose-dependent inhibitory effect on the pathogens. The leaf extracts of Salvia rosmarinus showed notable activity against the gram-negative bacteria E. coli, P. aeruginosa, and K. pneumoniae, with inhibition zones of 14.17 ± 0.50 mm in chloroform/methanol (1:1), 13.17 ± 0.29 mm in petroleum ether, and 14.17 ± 0.51 mm in chloroform/methanol (1:1), respectively. The largest zones of inhibition, however, were recorded against the gram-positive bacteria S. aureus and S. epidermidis, with diameters of 21.37 ± 0.78 and 17.50 ± 0.50 mm, respectively (Supplementary Fig. S3). The results are summarized in Fig. 2a−c.

    The crystal structure of human DNMT1 (residues 351−1600; classification: transferase; resolution: 2.62 Å; PDB ID: 4WXX) was used as the receptor. The active-site grid was centered at X = −12.800500 Å, Y = 34.654981 Å, and Z = −24.870231 Å, with a radius of 59.081291. Molecular docking analysis was used to investigate the binding interactions of the isolated compounds 1 and 2 from the leaves of Salvia rosmarinus with the binding sites of the DNMT1 enzyme, a target in human cervical cancer (PDB ID: 4WXX).

    The results were also compared with those of the standard anti-cancer agent jaceosidin (Table 5 & Fig. 3). The isolated compounds had binding energies ranging from −5.3 to −8.4 kcal·mol−1, as shown in Table 5, compared with −7.8 kcal·mol−1 for jaceosidin. Compound 1 (−8.4 kcal·mol−1) showed a more favorable (more negative) binding energy than the standard drug jaceosidin (−7.8 kcal·mol−1). Compound 2 showed lower docking affinity (−5.3 kcal·mol−1) but good matching of amino acid residue interactions compared to jaceosidin. Overall, the isolated compounds had residue interactions and docking scores comparable with those of jaceosidin.
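
To give the docking scores a more intuitive scale, a binding energy can be converted into an apparent dissociation constant through the standard relation ΔG = RT ln(Kd). The Python sketch below does this for the Table 5 values. It is an illustration only: docking scores are approximate scoring-function outputs rather than measured free energies, the temperature of 298.15 K is an assumption not stated in the study, and the names `R_KCAL`, `estimated_kd`, and `DNMT1_SCORES` are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def estimated_kd(delta_g_kcal, temperature_k=298.15):
    """Apparent Kd (mol/L) implied by delta G = R*T*ln(Kd); temperature is an assumed value."""
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# Docking scores (kcal/mol) against DNMT1, copied from Table 5.
DNMT1_SCORES = {"compound 1": -8.4, "compound 2": -5.3, "jaceosidin": -7.8}

if __name__ == "__main__":
    # List ligands from most to least favorable score.
    for name, score in sorted(DNMT1_SCORES.items(), key=lambda item: item[1]):
        kd_micromolar = estimated_kd(score) * 1e6
        print(f"{name}: {score} kcal/mol, apparent Kd about {kd_micromolar:.2f} uM")
```

Under these assumptions, the 0.6 kcal·mol−1 gap between compound 1 and jaceosidin corresponds to a difference of less than threefold in apparent Kd, which is small relative to the usual uncertainty of docking scores.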

    Table 5.  Molecular docking results of ligand compounds 1 and 2 against DNMT1 enzyme (PDB ID: 4WXX).

    | Ligand | Binding affinity (kcal·mol−1) | H-bond | Residue interactions: hydrophobic/electrostatic | Residue interactions: Van der Waals |
    | --- | --- | --- | --- | --- |
    | 1 | −8.4 | ARG778 (2.85249), ARG778 (2.97417), VAL894 (2.42832) |  | Lys-889, Pro-879, Tyr-865, His-795, Cys-893, Gly-760, Val-759, Phe-892, Phe-890, Pro-884, Lys-749 |
    | 2 | −5.3 | ARG596 (2.73996), ALA597 (1.84126), ILE422 (2.99493), THR424 (2.1965), ILE422 (2.93653) | Electrostatic Pi-Cation-ARG595 (3.56619), Hydrophobic Alkyl-ARG595 (4.15839), Hydrophobic Pi-Alkyl-ARG595 (5.14967) | Asp-423, Glu-428, Gly-425, Ile-427, Trp-464, Phe-556, Gln-560, Gln-594, Glu-559, Gln-598, Ser-563 |
    | Jaceosidin | −7.8 | ASP571 (2.93566), GLN573 (2.02126), GLU562 (2.42376), GLN573 (3.49555), GLU562 (3.46629) | Hydrophobic Alkyl-PRO574 (4.59409), Hydrophobic Alkyl-ARG690 (5.09748), Hydrophobic Pi-Alkyl-PHE576 (5.1314), Hydrophobic Pi-Alkyl-PRO574 (4.97072), Hydrophobic Pi-Alkyl-ARG690 (5.07356) | Glu-698, Cys-691, Ala-695, Pro-692, Val-658, Glu-566, Asp-565 |
    Figure 3.  The 2D and 3D binding interactions of compounds 1 and 2 and of jaceosidin (standard) against DNMT1 enzyme (PDB ID: 4WXX).

    Hence, compound 1 might be a potential anti-cancer agent. However, in vitro anti-cancer analysis has not yet been performed. The promising in silico results indicate that further research could be beneficial. The 2D and 3D binding interactions of compounds 1 and 2 with the DNMT1 enzyme (PDB ID: 4WXX), a human cervical cancer target, are presented in Fig. 3. Compounds and amino acids are connected by hydrogen bonds (green dashed lines) and hydrophobic interactions (non-green lines).

    The crystal structure of the HPV16 E6/E6AP/p53 ternary complex (resolution: 2.25 Å; classification: viral protein; PDB ID: 4XR8; observed R-value: 0.196) was used as the receptor. The active-site grid was centered at X = −43.202782 Å, Y = −39.085513 Å, and Z = −29.194115 Å, with a radius of 65.584122. Molecular docking analysis was used to investigate the binding interactions of the isolated compounds 1 and 2 from the leaves of Salvia rosmarinus with the binding sites of the human papillomavirus (HPV) type 16 E6 enzyme (PDB ID: 4XR8). The results were also compared with those of the standard anti-cancer agent jaceosidin (Table 6 & Fig. 4). The isolated compounds had binding energies ranging from −6.5 to −10.1 kcal·mol−1, as shown in Table 6, compared with −8.8 kcal·mol−1 for jaceosidin. Compound 1 (−10.1 kcal·mol−1) showed a more favorable (more negative) binding energy than the standard drug jaceosidin (−8.8 kcal·mol−1). Compound 2 showed lower docking affinity (−6.5 kcal·mol−1) but good matching of amino acid residue interactions compared to jaceosidin. Overall, the isolated compounds had residue interactions and docking scores comparable with those of jaceosidin.

    Table 6.  Molecular docking results of ligand compounds 1 and 2 against HPV type 16 E6 (PDB ID: 4XR8).

    | Ligand | Binding affinity (kcal·mol−1) | H-bond | Residue interactions: hydrophobic/electrostatic | Residue interactions: Van der Waals |
    | --- | --- | --- | --- | --- |
    | 1 | −10.1 | ASN101 (2.25622), ASP228 (2.88341) |  | Asp-148, Lys-176, Lys-180, Asp-178, Ile-179, Tyr-177, Ile-334, Glu-382, Gln-336, Pro-335, Gln-73, Arg-383, Tyr-100 |
    | 2 | −6.5 | TRP63 (1.90011), ARG67 (2.16075), ARG67 (2.8181) | Hydrophobic Pi-Sigma-TRP341 (3.76182), Hydrophobic Pi-Pi Stacked-TYR156 (4.36581), Hydrophobic Pi-Pi T-shaped-TRP63 (5.16561), Hydrophobic Pi-Pi T-shaped-TRP63 (5.44632), Hydrophobic Alkyl-PRO155 (4.34691), Hydrophobic Pi-Alkyl-TRP341 (4.11391), Hydrophobic Pi-Alkyl-ALA64 (4.61525) | Glu-154, Arg-345, Asp-66, Met-331, Glu-112, Lys-16, Trp-231 |
    | Jaceosidin | −8.8 | ARG146 (2.06941), GLY70 (3.49991), GLN73 (3.38801) | Electrostatic Pi-Cation-ARG67 (3.93442), Hydrophobic Pi-Alkyl-PRO49 (5.40012) | Tyr-342, Tyr-79, Ser-338, Arg-129, Pro-335, Leu-76, Tyr-81, Ser-74, Tyr-71, Ser-80, Glu-46 |
    Figure 4.  The 2D and 3D binding interactions of compounds 1 and 2 and of jaceosidin (standard) against HPV type 16 E6 (PDB ID: 4XR8).

    Hence, compounds 1 and 2 might be potential anti-cancer agents that act as inhibitors of HPV. However, in vitro anti-cancer analysis has not yet been performed on the HPV that causes cervical cancer. The promising in silico results indicate that further research could be beneficial. The 2D and 3D binding interactions of compounds 1 and 2 with the human papillomavirus (HPV) type 16 E6 enzyme (PDB ID: 4XR8) are presented in Fig. 4. Compounds and amino acids are connected by hydrogen bonds (magenta lines) and hydrophobic interactions (non-green lines).

    In silico evaluation of a drug candidate, including drug-likeness and toxicity, predicts its oral activity on the basis of Lipinski's rule of five[25]. The results of the current study showed that the compounds conform to Lipinski's rule of five (Table 7). Therefore, both compounds 1 and 2 should undergo further investigation as potential anti-cancer agents. Table 9 shows the acute toxicity predictions, such as LD50 values and toxicity class (ranging from 1 for toxic to 6 for non-toxic), for each ligand, revealing that none of them were acutely toxic. Furthermore, they were found to be similar to standard drugs. Isolated compound 1 was assigned toxicity class 4 (harmful if swallowed), while compound 2 showed an even more favorable prediction, being inactive for all endpoints, including hepatotoxicity, mutagenicity, cytotoxicity, and irritancy (Table 9). All the isolated compounds were predicted to be non-hepatotoxic, non-irritant, and non-cytotoxic. However, compound 1 was predicted to be carcinogenic and immunotoxic (Table 9). Hence, based on the ADMET prediction analysis, none of the compounds showed acute toxicity, so they might prove to be good drug candidates.
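
The rule-of-five screen can be illustrated with the descriptor values reported in Table 7. The Python sketch below counts violations of the classic cutoffs (molecular weight above 500, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors, log P above 5). It is not the procedure used by the prediction tool: the function name `lipinski_violations`, the `LIGANDS` dictionary, and the choice to apply the log P cutoff to the tabulated iLOGP values are assumptions made for illustration.

```python
def lipinski_violations(mol_wt, h_donors, h_acceptors, log_p, log_p_limit=5.0):
    """Count violations of the classic rule-of-five cutoffs for one ligand."""
    checks = [
        mol_wt > 500,          # molecular weight in g/mol
        h_donors > 5,          # hydrogen-bond donors (NHD)
        h_acceptors > 10,      # hydrogen-bond acceptors (NHA)
        log_p > log_p_limit,   # lipophilicity; the cutoff depends on the log P model used
    ]
    return sum(checks)


# Descriptor values copied from Table 7 (log_p is the iLOGP column).
LIGANDS = {
    "compound 1": dict(mol_wt=454.68, h_donors=2, h_acceptors=3, log_p=3.56),
    "compound 2": dict(mol_wt=165.19, h_donors=1, h_acceptors=2, log_p=1.89),
    "jaceosidin": dict(mol_wt=330.3, h_donors=3, h_acceptors=7, log_p=1.7),
}

if __name__ == "__main__":
    for name, descriptors in LIGANDS.items():
        print(name, "violations:", lipinski_violations(**descriptors))
```

With the tabulated values, none of the ligands exceeds any of these cutoffs. The single violation listed for compound 1 in Table 7 therefore cannot be reproduced from the tabulated columns alone and may reflect a different lipophilicity estimate than the iLOGP value shown.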

    Table 7.  Drug-likeness predictions of compounds computed by Swiss ADME.

    | Ligand | Formula | Mol. Wt. (g·mol−1) | NRB | NHA | NHD | TPSA (Å2) | Log P (iLOGP) | Log S (ESOL) | Lipinski's rule of five (violations) |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 1 | C30H46O3 | 454.68 | 1 | 3 | 2 | 57.53 | 3.56 | −6.21 | 1 |
    | 2 | C9H11NO2 | 165.19 | 3 | 2 | 1 | 52.32 | 1.89 | −2.21 | 0 |
    | Jaceosidin | C17H14O7 | 330.3 | 3 | 7 | 3 | 105 | 1.7 | 1 | 0 |

    NHD, number of hydrogen donors; NHA, number of hydrogen acceptors; NRB, number of rotatable bonds; TPSA, topological polar surface area; Log P, octanol-water partition coefficient; Log S, turbidimetric solubility.
    Table 8.  Pre ADMET predictions of compounds, computed by Swiss ADME.

    | Ligand | Formula | Skin permeation value (log Kp, cm·s−1) | GI absorption | BBB permeability | Pgp substrate | CYP1A2 inhibitor | CYP2C19 inhibitor | CYP2C9 inhibitor | CYP2D6 inhibitor |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 1 | C30H46O3 | −4.44 | Low | No | No | No | No | No | No |
    | 2 | C9H11NO2 | −5.99 | High | Yes | No | No | No | No | No |
    | Jaceosidin | C17H14O7 | −6.13 | High | No | No | Yes | No | Yes | Yes |

    GI, gastrointestinal; BBB, blood brain barrier; Pgp, P-glycoprotein; and CYP, cytochrome-P.
    Table 9.  Toxicity prediction of compounds, computed by ProTox-II and OSIRIS property explorer.

    | Ligand | Formula | LD50 (mg·kg−1) | Toxicity class | Hepatotoxicity | Carcinogenicity | Immunotoxicity | Mutagenicity | Cytotoxicity | Irritant |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 1 | C30H46O3 | 2,000 | 4 | Inactive | Active | Active | Inactive | Inactive | Inactive |
    | 2 | C9H11NO2 | NA | NA | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |
    | Jaceosidin | C17H14O7 | 69 | 3 | Inactive | Inactive | Inactive | Inactive | Inactive | Inactive |

    NA, not available.
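
The toxicity classes in Table 9 are consistent with the oral LD50 bands of the Globally Harmonized System (GHS), on which six-class schemes of this kind are based. The Python sketch below maps an LD50 value to a class under that assumption. The helper name `toxicity_class` and the constant `GHS_UPPER_BOUNDS` are hypothetical, and the thresholds are the GHS category limits rather than values taken from the study.

```python
# Upper LD50 bound (mg/kg, inclusive) for classes 1 to 5; anything above 5000 is class 6.
GHS_UPPER_BOUNDS = [(5, 1), (50, 2), (300, 3), (2000, 4), (5000, 5)]


def toxicity_class(ld50_mg_per_kg):
    """Return the acute oral toxicity class (1 = most toxic, 6 = non-toxic) for an LD50."""
    for upper_bound, tox_class in GHS_UPPER_BOUNDS:
        if ld50_mg_per_kg <= upper_bound:
            return tox_class
    return 6


if __name__ == "__main__":
    # LD50 values copied from Table 9; compound 2 is omitted because its LD50 is not available.
    for name, ld50 in {"compound 1": 2000, "jaceosidin": 69}.items():
        print(name, "-> class", toxicity_class(ld50))
```

For the two ligands with a reported LD50, this mapping reproduces the classes listed in Table 9 (class 4 for compound 1 and class 3 for jaceosidin).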

    Rosemary, previously known as Rosmarinus officinalis, is an evergreen perennial plant that belongs to the family Lamiaceae. Following a phylogenetic study, the genus Rosmarinus was recently merged into the genus Salvia, and the species became known as Salvia rosmarinus[28,29]. It has been used since ancient times for various medicinal, culinary, and ornamental purposes. In food science, rosemary essential oil is well known as a food preservative thanks to its antimicrobial and antioxidant properties, and rosemary has many other culinary, medicinal, and pharmacological applications[30]. Certain phytochemical compounds found in Salvia rosmarinus leaves have the potential to halt the growth of cancer cells and pathogens, or even kill them[31]. In the literature, alkaloids are reported mostly in fungi and are known for their strong antimicrobial properties, which make them valuable in traditional medicine[32,33]. In the present study, however, S. rosmarinus was also shown to contain alkaloids. Most alkaloids have a bitter taste and are used as antimalarial, antiasthma, anticancer, antiarrhythmic, analgesic, and antibacterial agents[33]; some nitrogen-containing alkaloids, such as vincristine, are used to treat cancer.

    Steroids occur naturally in the human body. They are hormones that help regulate the body's reaction to infection or injury, the speed of metabolism, and more. Steroids are also reported to have various biological activities, for example in chronic obstructive pulmonary disease (COPD) and multiple sclerosis, and some imitate male sex hormones[34]. Steroids are natural compounds occurring in both plants and animals[35], and they were found in the present study. Terpenoids are derived from mevalonic acid (MVA) and are composed of multiple isoprene (C5) structural units. Terpenoids such as monoterpenes and sesquiterpenes are widely found in nature; more than 50,000 have been found in plants, and some are reported to reduce tumors and cancers. Many volatile terpenoids, such as menthol and perillyl alcohol, are used as raw materials for spices, flavorings, and cosmetics[36]. In the present study, high levels of these compounds were found in Salvia rosmarinus leaves.

    Flavonoids are a class of phenolic compounds commonly found in fruits and vegetables and are considered excellent antioxidants[37]. The results of this study likewise revealed that S. rosmarinus contains flavonoids. According to the literature, the activities of flavonoids, terpenoids, and steroids include anti-diabetic, anti-inflammatory, anti-cancer, anti-bacterial, hepatoprotective, and antioxidant effects[36]. Tannins are commonly found in most terrestrial plants[38] and have the potential to treat cancer and HIV/AIDS as well as inflamed or ulcerated tissues. In the present study, tannins were likewise found at high levels in the plant. Moreover, because of a sudden rise in the number of contagious diseases and the development of antimicrobial resistance against current drugs, drug development studies are vital for discovering novel medicinal compounds[30]. In addition, cancer is a complex multi-gene disease[39]. In cervical cancer, various repressor genes[11] encode proteins that turn off or reduce expression of the affected gene, and transcription and expression are regulated through promoter hypermethylation (DNMT1), leading to precursor lesions during cervical development and malignant transformation.

    In a previous study[40], a good antibacterial result was recorded at a median concentration (65 μg·mL−1). In that study, the methanol extract gave an inhibition zone of 14 ± 0.71 mm against the gram-negative bacterium E. coli, and most of the petroleum ether tests showed no zone of inhibition. In the present study, at a concentration of 100 μg·mL−1, the methanol extract gave an inhibition zone of 11.47 ± 0.50 mm against E. coli, whereas the petroleum ether extract showed zones of inhibition that increased with concentration. Further research may be necessary to determine the optimal concentration for this extract to maximize its efficacy. The results obtained for gram-negative bacteria such as E. coli, P. aeruginosa, and K. pneumoniae are consistent with previous research findings[41]. In the present study, however, Salvia rosmarinus produced large zones of inhibition, with diameters of 21.37 ± 0.78 and 17.50 ± 0.50 mm, against the gram-positive bacteria S. aureus and S. epidermidis, respectively (Table 4 & Fig. 2, Supplementary Fig. S3). According to a previous study[42], the ethanolic leaf extract of Salvia rosmarinus exhibited activity against C. albicans strains. In the present study, the antifungal activity of petroleum ether extracts from Salvia rosmarinus was evaluated against two human pathogenic fungi, namely C. albicans and A. niger. The findings showed that at a concentration of 100 μg·mL−1, the extracts inhibited the growth of C. albicans, with a mean zone of inhibition of 20.83 ± 0.76 mm.

    Antimicrobial agents can be divided into groups based on their mechanism of antimicrobial activity. The main groups are agents that inhibit cell wall synthesis, depolarize the cell membrane, inhibit protein synthesis, inhibit nucleic acid synthesis, and inhibit metabolic pathways in bacteria. Antimicrobial resistance mechanisms, in turn, fall into four main categories: limiting the uptake of a drug; modifying a drug target; inactivating a drug; and active drug efflux. Because of structural and other differences, the mechanisms used by gram-negative and gram-positive bacteria vary. Gram-negative bacteria make use of all four main mechanisms, whereas gram-positive bacteria less commonly limit the uptake of a drug[43]. Consistent with this, in the present study gram-negative bacteria such as P. aeruginosa and Klebsiella pneumoniae showed resistance to the chloroform/methanol (1:1) and methanol leaf extracts of Salvia rosmarinus. However, the gram-positive bacterium Staphylococcus epidermidis similarly showed resistance to the chloroform/methanol (1:1) extract. This may be due to intrinsic resistance involving limited uptake, drug inactivation, or drug efflux, which needs further study. Given the differences in cell wall thickness between gram-negative and gram-positive bacteria, two main scenarios may occur when cells are exposed to an antimicrobial agent: resistance and persistence. In the first scenario, resistant cells survive after non-resistant ones are killed. When these resistant cells regrow, the culture consists entirely of resistant bacteria. In the second scenario, dormant persistent cells survive. While the non-persistent cells are killed, the persistent cells remain. When regrown, any active cells from this group will still be susceptible to the antimicrobial agent.

    Ferreira et al.[44] explained that molecular docking can be used to calculate, as an energy, the interaction of small molecular weight compounds with macromolecules such as target proteins (enzymes), including hydrophobic interactions and hydrogen bonds at the atomic level. Several studies have shown that natural products such as epigallocatechin-3-gallate (EGCG), curcumin, and genistein can be used as inhibitors of DNMT1[45−47]. In the literature, micromeric (1) is reported to have antimicrobial activity, including against antibiotic-resistant strains such as methicillin-resistant Staphylococcus aureus (MRSA)[48], and benzocaine (2) is used to relieve pain and itching caused by conditions such as sunburn or other minor burns, insect bites or stings, poison ivy, poison oak, poison sumac, and minor cuts or scratches[49]. In the present study, Salvia rosmarinus was used as a source of secondary metabolites (ligands): the chloroform/methanol (1:1) extract of the plant leaves yielded micromeric (1) and benzocaine (2), which were evaluated as candidate drug structures for inhibiting the activity of the DNMT1 enzyme and thereby preventing the formation of cervical cancer cells.

    Cervical cancer, which is caused by human papillomaviruses (HPV), is one of the most dangerous and deadly cancers in women. Some sexually transmitted HPVs (such as type 6, which carries E6) may cause genital warts. There are several options for the treatment of early-stage cervical cancer, such as surgery, nonspecific chemotherapy, radiation therapy, laser therapy, hormonal therapy, targeted therapy, and immunotherapy, but there is no effective cure for an ongoing HPV infection. In the present study, compounds 1 and 2, extracted and isolated from Salvia rosmarinus leaves, were evaluated as candidate drug structures for inhibiting the HPV type 16 E6 enzyme. Similarly, numerous researchers have studied the impact of plant metabolites on the treatment of cervical cancer. Their research has demonstrated that several compounds, such as jaceosidin, resveratrol, berberine, gingerol, and silymarin, may be active in inhibiting the growth of cancer cells[47].

    Small-molecule drugs are still the most commonly used agents in the treatment of cancer[50]. In silico molecular docking searches for novel small molecules (ligands) that interact with genes, DNA, or protein structures. Such agents are still in demand, and newly designed compounds are required to have a specific, or even multi-targeted, anticancer mechanism of action and good selectivity over normal cells. In addition, anti-cancer drugs are not easily classified into different groups[51]. Drugs have therefore been grouped according to their chemical structure, presumed mechanism of action, and cytotoxic activity related to cell cycle arrest, transcription regulation, modulation of autophagy, inhibition of signaling pathways, suppression of metabolic enzymes, and membrane disruption[52]. Another problem often encountered with anticancer drugs is resistance, which may emerge after a brief period of positive response to the therapy or may even occur in drug-naïve patients[50]. In recent years, many studies have investigated the molecular mechanisms by which compounds affect cancer cells. The results suggest that compounds exert their anticancer effects by providing free electron charge that inhibits some of the signaling pathways involved in the progression of cancer cells[53]. Numerous studies have also shown that plant-based compounds such as phenolic acids and sesquiterpenes act as anticancer agents by affecting a wide range of molecular mechanisms related to cancer[53]. The present findings may similarly support molecular mechanisms involving the suppression of metabolic enzymes in cervical cancer.

    The main aim of the study was to evaluate the in vitro antimicrobial activity of different extracts of Salvia rosmarinus and to assess its compounds in silico against enzymes involved in cervical cancer. The phytochemical screening tests indicated the presence of phytochemicals such as alkaloids, terpenoids, flavonoids, and tannins in its extracts. The plant also exhibited high antimicrobial activity, inhibiting pathogens with varying efficacy in a dose-dependent manner (50−100 μg·mL−1). The extracts produced comparatively large inhibition zones against gram-positive bacteria and smaller zones against the gram-negative bacteria E. coli, P. aeruginosa, and K. pneumoniae, along with strong antifungal activity (20.83 ± 0.76 mm inhibition zone against C. albicans). Molecular docking is a promising approach to developing effective drugs through a structure-based drug design process. Based on the docking results, the in silico study predicts the best interaction between the ligand molecules and the protein targets DNMT1 and HPV type 16 E6. Compounds 1 (−8.4 kcal·mol−1) and 2 (−5.3 kcal·mol−1) interacted with DNMT1 (PDB ID: 4WXX), and the same compounds 1 (−10.1 kcal·mol−1) and 2 (−6.5 kcal·mol−1) interacted with HPV type 16 E6 (PDB ID: 4XR8). Compounds 1 and 2 may have potential as medicines for treating cancer by inhibiting the DNMT1 and HPV type 16 E6 enzymes, as well as for antimicrobial applications. None of the compounds exhibited acute toxicity in the ADMET prediction analysis, indicating their potential as drug candidates. Further studies using the in silico approach are required to generate a potential drug through structure-based drug design.

  • The authors confirm contribution to the paper as follows: all authors designed and comprehended the research work; plant materials collection, experiments performing, data evaluation and manuscript draft: Dejene M; research supervision and manuscript revision: Dekebo A, Jemal K; NMR results generation: Tufa LT; NMR data analysis: Dekebo A, Tegegn G; molecular docking analysis: Aliye M. All authors reviewed the results and approved the final version of the manuscript.

  • All data generated or analyzed during this study are included in this published article.

  • This work was partially supported by Adama Science and Technology University under Grant (ASTU/SP-R/171/2022). We are grateful for the fellowship support from Adama Science and Technology University (ASTU), the identification of plants by Mr. Melaku Wendafrash, and pathogenic strain support from the Ethiopian Biodiversity Institute (EBI). We also thank the technical assistants of the Applied Biology and Chemistry departments of Haramaya University (HU) for their help.

  • The authors declare that they have no conflict of interest.

  • Supplemental Table S1 Detailed information on the 110 tea plant accessions collected in this study.
    Supplemental Table S2 Sequencing and identified SNP statistics of the 110 tea plant accessions.
  • [1]

    Liu S, Liu H, Wu A, Hou Y, An Y, et al. 2017. Construction of fingerprinting for tea plant (Camellia sinensis) accessions using new genomic SSR markers. Molecular Breeding 37:93

    doi: 10.1007/s11032-017-0692-y

    CrossRef   Google Scholar

    [2]

    Wambulwa MC, Meegahakumbura MK, Kamunya S, Muchugi A, Möller M, et al. 2016. Insights into the genetic relationships and breeding patterns of the african tea germplasm based on nSSR markers and cpDNA sequences. Frontiers in Plant Science 7:1244

    doi: 10.3389/fpls.2016.01244

    CrossRef   Google Scholar

    [3]

    Xia E, Zhang H, Sheng J, Li K, Zhang Q, et al. 2017. The tea tree genome provides insights into tea flavor and independent evolution of caffeine biosynthesis. Molecular Plant 10:866−77

    doi: 10.1016/j.molp.2017.04.002

    CrossRef   Google Scholar

    [4]

    Liang Y, Shi M. 2015. Advances in tea plant genetics and breeding. Journal of Tea science 35:103−9

    doi: 10.13305/j.cnki.jts.2015.02.001

    CrossRef   Google Scholar

    [5]

    Barut M, Nadeem MA, Karaköy T, Baloch FS. 2020. DNA fingerprinting and genetic diversity analysis of world quinoa germplasm using iPBS-retrotransposon marker system. Turkish Journal of Agriculture and Forestry 44:479−91

    doi: 10.3906/tar-2001-10

    CrossRef   Google Scholar

    [6]

    Guney M, Kafkas S, Keles H, Zarifikhosroshahi M, Bujdoso G. 2021. Genetic diversity among some walnut (Juglans regia L.) genotypes by SSR markers. Sustainability 13:6830

    doi: 10.3390/su13126830

    CrossRef   Google Scholar

    [7]

    Savaş Tuna G, Yücel G, Kaygisiz Aşçioğul T, Ateş D, Eşİyok D, et al. 2020. Molecular cytogenetic characterization of common bean (Phaseolusvulgaris L.) accessions. Turkish Journal of Agriculture and Forestry 44:612−30

    doi: 10.3906/tar-1910-33

    CrossRef   Google Scholar

    [8]

    Chen T, Wang H, Luo J, Zheng D, Dai S, et al. 2017. Genetic diversity and relationship of tea germplasm resources Camellia sinensis var. assamica cv. Rucheng revealed by ISSR markers. Molecular Plant Breeding 17:16

    Google Scholar

    [9]

    Liu Z, Cheng Y, Yang P, Zhao Y, Ning J, Yang Y. 2020. Genetic diversity and structure of Chengbudong tea population revealed by nSSR and cpDNA markers. Journal of Tea Science 40:250−58

    Google Scholar

    [10]

    Wu Y, Deng T, Li J, Li Y, Liu S, et al. 2013. Genetic diversity of tea germplasm resource 'Huangjincha' (Camellia sinensis) revealed by AFLP analysis. Journal of Tea Science 33:526−31

    doi: 10.13305/j.cnki.jts.2013.06.013

    CrossRef   Google Scholar

    [11]

    Ni J, Li J, Dong L, Yang Y, Zhang S, et al. 2010. Genetic diversity and relationship of tea germplasm resources 'Huangjincha' (Camellia sinensis) revealed by ISSR markers. Journal of Tea Science 30:149−56

    doi: 10.13305/j.cnki.jts.2010.02.008

    CrossRef   Google Scholar

    [12]

    Yang P, Liu Z, Zhao Y, Cheng Y, Ning J, et al. 2021. Evaluation of Jianghua Kucha tea strains based on agronomic and SSR molecular marker relationship analysis. Molecular Plant Breeding 19:2402−9

    Google Scholar

    [13]

    Li D, Li D, Yang C, Wang Q, Luo J. 2012. Genetic diversity and relationship of tea germplasm resources Camellia sinensis var. assamica cv. Jianghua revealed by ISSR markers. Journal of Tea Science 32:135−41

    Google Scholar

    [14]

    Shen C, Huang Y, Huang Ja, Luo J, Liu C, Liu D. 2007. RAPD analysis for genetic diversity of typical tea populations in Hunan province. Chinese Journal of Agricultural Biotechnology 15:855−60

    doi: 10.1017/s147923620800199x

    CrossRef   Google Scholar

    [15]

    Shen C, Luo J, Shi Z, Gong Z, Tang H, et al. 2002. Study on genetic polymorphism of tea plants in Anhua Yuntaishan population by RAPD. Journal of Hunan Agricultural University: Natural Science Edition 28:320−25

    doi: 10.13331/j.cnki.jhau.2002.04.014

    CrossRef   Google Scholar

    [16]

    Taranto F, D'Agostino N, Greco B, Cardi T, Tripodi P. 2016. Genome-wide SNP discovery and population structure analysis in pepper (Capsicum annuum) using genotyping by sequencing. BMC Genomics 17:943

    doi: 10.1186/s12864-016-3297-7

    [17]

    Wang X, Bao K, Reddy UK, Bai Y, Hammar SA, et al. 2018. The USDA cucumber (Cucumis sativus L.) collection: genetic diversity, population structure, genome-wide association studies, and core collection development. Horticulture Research 5:64

    doi: 10.1038/s41438-018-0080-8

    [18]

    Kim K, Oh Y, Han H, Oh S, Lim H, et al. 2019. Genetic relationships and population structure of pears (Pyrus spp.) assessed with genome-wide SNPs detected by genotyping-by-sequencing. Horticulture, Environment, and Biotechnology 60:945−53

    doi: 10.1007/s13580-019-00178-w

    [19]

    Kobayashi F, Tanaka T, Kanamori H, Wu J, Katayose Y, et al. 2016. Characterization of a mini core collection of Japanese wheat varieties using single-nucleotide polymorphisms generated by genotyping-by-sequencing. Breeding Science 66:213−25

    doi: 10.1270/jsbbs.66.213

    [20]

    Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754−60

    doi: 10.1093/bioinformatics/btp324

    [21]

    Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, et al. 2009. The sequence alignment/map format and SAMtools. Bioinformatics 25:2078−9

    doi: 10.1093/bioinformatics/btp352

    [22]

    Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, et al. 2007. PLINK: a tool set for whole-genome association and population-based linkage analyses. The American Journal of Human Genetics 81:559−75

    doi: 10.1086/519795

    [23]

    Alexander DH, Novembre J, Lange K. 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Research 19:1655−64

    doi: 10.1101/gr.094052.109

    [24]

    Weir BS, Cockerham CC. 1984. Estimating F-statistics for the analysis of population structure. Evolution 38:1358−70

    doi: 10.1111/j.1558-5646.1984.tb05657.x

    [25]

    Keller MC, Visscher PM, Goddard ME. 2011. Quantification of inbreeding due to distant ancestors and its detection using dense single nucleotide polymorphism data. Genetics 189:237−49

    doi: 10.1534/genetics.111.130922

    [26]

    Excoffier L, Smouse PE, Quattro JM. 1992. Analysis of molecular variance inferred from metric distances among DNA haplotypes: application to human mitochondrial DNA restriction data. Genetics 131:479−91

    doi: 10.1093/genetics/131.2.479

    [27]

    Ronfort J, Jenczewski E, Bataillon T, Rousset F. 1998. Analysis of population structure in autotetraploid species. Genetics 150:921−30

    doi: 10.1093/genetics/150.2.921

    [28]

    Xia E, Tong W, Hou Y, An Y, Chen L, et al. 2020. The reference genome of tea plant and resequencing of 81 diverse accessions provide insights into its genome evolution and adaptation. Molecular Plant 13:1013−26

    doi: 10.1016/j.molp.2020.04.010

    [29]

    Zhang W, Zhang Y, Qiu H, Guo Y, Wan H, et al. 2020. Genome assembly of wild tea tree DASZ reveals pedigree and selection history of tea varieties. Nature Communications 11:3719

    doi: 10.1038/s41467-020-17498-6

    [30]

    Wang X, Feng H, Chang Y, Ma C, Wang L, et al. 2020. Population sequencing enhances understanding of tea plant evolution. Nature Communications 11:4447

    doi: 10.1038/s41467-020-18228-8

    [31]

    Zhang X, Chen S, Shi L, Gong D, Zhang S, et al. 2021. Haplotype-resolved genome assembly provides insights into evolutionary history of the tea plant Camellia sinensis. Nature Genetics 53:1250−59

    doi: 10.1038/s41588-021-00895-y

    [32]

    Zhang Q, Li W, Li K, Nan H, Shi C, et al. 2020. The chromosome-level reference genome of tea tree unveils recent bursts of non-autonomous LTR retrotransposons in driving genome size evolution. Molecular Plant 13:935−38

    doi: 10.1016/j.molp.2020.04.009

    [33]

    Wang P, Yu J, Jin S, Chen S, Yue C, et al. 2021. Genetic basis of high aroma and stress tolerance in the oolong tea cultivar genome. Horticulture Research 8:107

    doi: 10.1038/s41438-021-00542-x

    [34]

    Niu S, Song Q, Koiwa H, Qiao D, Zhao D, et al. 2019. Genetic diversity, linkage disequilibrium, and population structure analysis of the tea plant (Camellia sinensis) from an origin center, Guizhou plateau, using genome-wide SNPs developed by genotyping-by-sequencing. BMC Plant Biology 19:328

    doi: 10.1186/s12870-019-1917-5

    [35]

    Yang H, Wei C, Liu H, Wu J, Li Z, et al. 2016. Genetic divergence between Camellia sinensis and its wild relatives revealed via genome-wide SNPs from RAD sequencing. Plos One 11:e0151424

    doi: 10.1371/journal.pone.0151424

    [36]

    Hazra A, Kumar R, Sengupta C, Das S. 2021. Genome-wide SNP discovery from Darjeeling tea cultivars - their functional impacts and application toward population structure and trait associations. Genomics 113:66−78

    doi: 10.1016/j.ygeno.2020.11.028

    [37]

    Luo J, Shi Z, Shen C, Liu C, Gong Z, Huang Y. 2004. The genetic diversity of tea germplasms [Camellia sinensis (L.) O. Kuntze] by RAPD analysis. Acta Agronomica Sinica 30:266−69

    [38]

    Chen L, Yu F, Yang Y. 2006. Tea germplasm resources and genetic improvement. Beijing: China Agricultural Science and Technology Press

    [39]

    Jiang H, Yi B, Liang M, Wang P. 2011. Morphological diversity analysis of tea germplasm resources in Yunnan. Journal of Yunnan Agricultural University (Natural Science) 26:833−40

    [40]

    Yao M, Ma C, Qiao T, Jin J, Chen L. 2012. Diversity distribution and population structure of tea germplasms in China revealed by EST-SSR markers. Tree Genetics & Genomes 8:205−20

    doi: 10.1007/s11295-011-0433-z

    [41]

    Hu K, He D, Shui X, Hu W. 2017. Genetic diversity of Colocasia esculenta germplasm based on SSR markers. Amino Acids & Biotic Resources 37:40−45

    doi: 10.14188/j.ajsh.2015.03.009

    [42]

    Su W, Wang L, Lei J, Chai S, Liu Y, et al. 2017. Genome-wide assessment of population structure and genetic diversity and development of a core germplasm set for sweet potato based on specific length amplified fragment (SLAF) sequencing. Plos One 12:e0172066

    doi: 10.1371/journal.pone.0172066

    [43]

    Wadl PA, Olukolu BA, Branham SE, Jarret RL, Yencho GC, et al. 2018. Genetic diversity and population structure of the USDA Sweetpotato (Ipomoea batatas) germplasm collections using GBSpoly. Frontiers in Plant Science 9:1166

    doi: 10.3389/fpls.2018.01166

  • Cite this article

    Huang F, Duan J, Lei Y, Liu Z, Kang Y, et al. 2022. Genetic diversity, population structure and core collection analysis of Hunan tea plant germplasm through genotyping-by-sequencing. Beverage Plant Research 2:5 doi: 10.48130/BPR-2022-0005

Genetic diversity, population structure and core collection analysis of Hunan tea plant germplasm through genotyping-by-sequencing

Beverage Plant Research 2, Article number: 5 (2022)

Abstract: Tea is considered to be a well-known and widely consumed beverage and Hunan province is rich in tea plant germplasm. In order to better conserve and utilize Hunan tea plant resources, 110 tea accessions from seven geographical origins were used to assess genetic diversity of Hunan tea plant germplasm through genotyping by sequencing (GBS) technology. As a result, a total of 311,044 high-quality single nucleotide polymorphism (SNP) markers were obtained. Population structure, phylogenetic relationships and principal component analysis (PCA) divided the entire accessions into three groups. The genetic diversity and population differentiation analysis showed that the mean observed heterozygosity (Ho) ranged from 0.16 to 0.24, while the mean polymorphic information content (PIC) ranged from 0.14 to 0.17, and mean minor allele frequency (MAF) ranged from 0.11 to 0.14. Analysis of molecular variance (AMOVA) indicated that 81.38% of the total variance was derived from within populations, which suggested a rich genetic diversity in Hunan tea germplasms. Furthermore, a core tea germplasm set was developed, which was comprised of 22 tea plant accessions and maintained the whole genetic diversity of the entire collection. This work should be valuable for conservation and utilization of tea germplasm in Hunan.

    • The tea plant, Camellia sinensis, belongs to the genus Camellia and is one of the most important economic crops in the world. Tea, one of the most popular and widely consumed beverages, contains nearly 700 bioactive compounds, including catechins, theanine, caffeine, and volatiles[1−4]. Tea plants originated in the Yunnan-Guizhou Plateau of China and gradually spread to the east and southeast of China. Hunan is located in central China, a transitional zone of biodiversity from southwest to southeast and northeast, which created an excellent natural environment for broad genetic variation of tea plants in Hunan.

      Plant genetic resources are among the most important natural resources and have become a significant research topic in which major advances have been made. Gene banks maintain germplasm and genetic diversity, and in recent years the conservation of plant genetic resources has attracted immense attention. Developing effective and efficient conservation practices requires an understanding of genetic diversity between and within populations[5−7]. Analysis of genetic diversity and population genetic structure is also important for verifying domestication events and genetic relationships of tea plants. In the past, molecular markers, including restriction fragment length polymorphism (RFLP), amplified fragment length polymorphism (AFLP), random amplified polymorphic DNA (RAPD), inter-simple sequence repeat (ISSR) and simple sequence repeat (SSR) markers, have been used effectively to assess the genetic diversity of tea resources in Hunan. These analyses showed that tea plant germplasm of Hunan origin could be categorized into five subpopulations: 'Rucheng Baimaocha'[8], 'Chengbu Dongcha'[9], 'Huangjincha'[10,11], 'Jianghua Kucha'[12−14] and 'Anhua Yuntaishancha'[15]. With the development of high-throughput sequencing technologies, genotyping-by-sequencing (GBS) has been successfully applied to germplasm diversity analysis, and it provides accurate results independently of the target species or population. Because it is simple to operate, cost-effective and stable, GBS has become widely used in research on genetic relationships, genetic diversity and genetic evolution[16]. Recently, GBS has been applied to study the origin and evolution of many crops, such as cucumber[17], pear[18] and wheat[19].

      In the present study, the population structure, genetic diversity and core collection of 110 tea accessions from Hunan (including 15 Yunnan origin cultivars as control) were analyzed by GBS. Our findings provide a valuable resource for understanding the genetic composition and genetic relationships of tea resources in Hunan, and a scientific reference for the protection and utilization of Hunan tea plants.

    • A total of 110 tea plant accessions were collected in this study (Supplemental Table S1). All accessions were classified into seven populations according to geographical location: six populations from six different regions of Hunan province and one population from Yunnan province, China. The Yunnan population was composed of six accessions collected from the Yunnan Tea Research Institute and was used as a control. One population with 17 accessions was collected from the Mangshan nature reserve, and the remaining five populations, with 87 accessions, were collected from the Hunan tea germplasm resource garden (Fig. 1).

      Figure 1. 

      Geographical distribution of Hunan tea plant accessions used in this study. The geographical locations were indicated under the corresponding regions, followed by the abbreviated population name.

    • DNA was extracted from 200 mg of fresh leaf tissue of each sample with the QIAGEN plant mini kit (Qiagen, Valencia, CA, USA). DNA purity and concentration were analyzed with a NanoPhotometer® spectrophotometer (IMPLEN, CA, USA) and a Qubit® 2.0 Fluorometer (Life Technologies, CA, USA), respectively. Subsequently, genomic DNA of the accessions was digested with the restriction enzymes MseI and NlaIII, and degradation and contamination were monitored on 1% agarose gels. After barcoded adaptors were added, DNA fragments 375−400 bp in length were selected for amplification to construct a paired-end sequencing library, which was then sequenced on the Illumina HiSeq PE150 system.

    • The original image data obtained via sequencing were transformed into raw reads in FASTQ format by base calling. Adaptor-containing reads and low-quality read pairs were filtered out to obtain clean data. A read was discarded if it had ≥ 10% unidentified nucleotides (N), > 10 nt aligned to the adaptor (allowing ≤ 10% mismatches), or > 50% of bases with a Phred quality < 5. The clean reads were mapped to the 'Shuchazao' reference genome[3] (http://tpia.teaplant.org/download.html) using BWA (Burrows-Wheeler Aligner, version 0.7.8)[20]. SNP calling was performed using SAMtools[21].
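
The N-content and base-quality criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, a pair is assumed to be dropped when either mate fails, and the adaptor-alignment criterion is left out because it needs the adaptor sequences.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string (assumed Phred+33) into integer scores."""
    return [ord(ch) - 33 for ch in qual_string]


def read_passes(seq, quals, max_n_frac=0.10, low_q=5, max_low_frac=0.50):
    """Apply the N-content and low-quality criteria to a single read."""
    if not seq or len(seq) != len(quals):
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_frac = sum(q < low_q for q in quals) / len(quals)
    # Discard reads with >= 10% N or with > 50% of bases below Phred 5.
    return n_frac < max_n_frac and low_frac <= max_low_frac


def pair_passes(mate1, mate2):
    """Assumption: a read pair is dropped as soon as either mate fails."""
    return read_passes(*mate1) and read_passes(*mate2)


good = ("ACGTACGTAC", phred33("IIIIIIIIII"))         # every base at Q40
too_many_n = ("ACGTACGTAN", phred33("IIIIIIIIII"))   # exactly 10% N
low_quality = ("ACGTACGTAC", phred33("!!!!!!IIII"))  # 60% of bases at Q0

print(pair_passes(good, good))         # True
print(pair_passes(good, too_many_n))   # False
print(pair_passes(good, low_quality))  # False
```

The thresholds are keyword arguments so that the same sketch can be reused with other cut-offs.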

    • Heterozygosity analysis was performed with PLINK v1.9[22]. A phylogenetic tree was constructed in MEGA (www.megasoftware.net) with the neighbor-joining (NJ) method and visualized with the web tool iTOL (https://itol.embl.de). Population structure was analyzed using ADMIXTURE v1.3.0[23] with 10 independent runs for each K value from 1 to 5 (Fig. 2). The optimal number of clusters was determined by the minimum cross-validation (CV) error, and the population structure map was drawn in R with plot. PLINK was also used for principal component analysis with default parameters[22], and the principal component distribution map was drawn in R with plot3d.

      Figure 2. 

      Calculation of CV errors for K values from 1 to 5.
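
Choosing K by the lowest CV error can be sketched as follows. The sketch assumes the ADMIXTURE logs contain lines of the form `CV error (K=3): 0.55`, and it averages the error over repeated runs per K. The log text below is an invented placeholder, not the values behind Fig. 2, and the function names are hypothetical.

```python
import re

# Pattern for the cross-validation line printed by ADMIXTURE when run with --cv.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9]*\.?[0-9]+)")


def parse_cv_errors(log_text):
    """Collect every CV error found in the log text, grouped by K."""
    errors = {}
    for k, err in CV_LINE.findall(log_text):
        errors.setdefault(int(k), []).append(float(err))
    return errors


def best_k(errors):
    """Return the K with the lowest mean CV error, plus the mean per K."""
    means = {k: sum(v) / len(v) for k, v in errors.items()}
    return min(means, key=means.get), means


# Placeholder log lines (two runs per K), for illustration only.
example_log = """
CV error (K=1): 0.640
CV error (K=2): 0.590
CV error (K=3): 0.550
CV error (K=4): 0.570
CV error (K=5): 0.600
CV error (K=1): 0.642
CV error (K=2): 0.588
CV error (K=3): 0.560
CV error (K=4): 0.575
CV error (K=5): 0.610
"""

k, mean_errors = best_k(parse_cv_errors(example_log))
print(k)  # 3
```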

    • Genetic diversity parameters, including Nei's genetic diversity index (H), polymorphic information content (PIC), minor allele frequency (MAF) and observed heterozygosity (Ho)[24,25], were calculated using the popgen function of the R package snpReady. The poppr.amova function of the R package poppr was used for the analysis of molecular variance (AMOVA)[26,27]. The core collection of Hunan tea plant germplasm was developed using the R package for Core Hunter 3.0.
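
For a biallelic SNP, the four parameters can be written out per locus as below. This is a minimal sketch using the standard textbook formulas (H = 2pq without a small-sample correction, PIC = 1 − (p² + q²) − 2p²q²). It is not the snpReady implementation, and the 0/1/2 genotype coding and the function name are assumptions made for the example.

```python
def locus_stats(genotypes):
    """Per-locus statistics for one biallelic SNP.

    genotypes: alternate-allele counts per individual (0, 1 or 2),
    with None marking a missing call.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return None
    p = sum(called) / (2 * n)  # alternate-allele frequency
    q = 1 - p
    return {
        "MAF": min(p, q),
        "H": 2 * p * q,  # expected heterozygosity (gene diversity)
        "PIC": 1 - (p * p + q * q) - 2 * p * p * q * q,
        "Ho": sum(g == 1 for g in called) / n,  # observed heterozygosity
    }


# Two homozygotes and two heterozygotes; the missing call is ignored.
print(locus_stats([0, 1, 2, 1, None]))
# {'MAF': 0.5, 'H': 0.5, 'PIC': 0.375, 'Ho': 0.5}
```

With these formulas a biallelic locus cannot exceed H = 0.5, PIC = 0.375 and MAF = 0.5, all reached at p = 0.5.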

    • A total of 195.85 GB of sequencing data was obtained from the 110 tea plant accessions. After low-quality data were filtered out, 195.82 GB of high-quality sequence data remained. On average, 6,178,038 clean reads were obtained per sample. The average amount of high-quality sequence data per sample was 1.78 GB, equivalent to about 60.75% of the tea plant genome size (2.93 GB). The filtered sequences were aligned to the tea reference genome, and the average mapping rate of the 110 samples was 96.76%. SAMtools was used to detect sequence variation in each accession relative to the reference genome. After filtering, 311,044 high-quality SNPs were obtained, of which transitions (Ts, A/G or C/T) accounted for 76.6% and transversions (Tv) for 23.4%. Annotation of gene structure showed that 89.3% of the high-quality SNP loci were located in intergenic regions. Among the SNP loci in genic regions, 18,330 were located in introns, 3,657 upstream and 3,607 downstream of genes, while in exons 2,952 SNPs resulted in synonymous mutations and 3,534 in nonsynonymous mutations. The average Q30 was 89.31% and the average GC content was 49.36% across the 110 accessions (Supplemental Table S2).
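
The figures in this paragraph can be cross-checked with a short sketch, which also shows how a SNP is classed as a transition or a transversion. The function name is hypothetical and the numbers are the ones quoted in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def snp_class(ref, alt):
    """Classify a biallelic SNP as a transition (Ts) or a transversion (Tv)."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # A transition stays within purines (A<->G) or within pyrimidines (C<->T).
    within_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "Ts" if within_class else "Tv"


print(snp_class("A", "G"), snp_class("c", "t"), snp_class("A", "C"))  # Ts Ts Tv

# Arithmetic behind the summary statistics quoted in the text.
per_sample_gb = 195.82 / 110           # clean data per accession
genome_fraction = 100 * 1.78 / 2.93    # share of the 2.93 GB genome
ts_tv_ratio = 76.6 / 23.4              # transitions per transversion

print(round(per_sample_gb, 2))    # 1.78
print(round(genome_fraction, 2))  # 60.75
print(round(ts_tv_ratio, 2))      # 3.27
```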

    • A total of 311,044 high-quality SNPs were used to analyze population structure with ADMIXTURE. First, the cross-validation (CV) error was calculated for each K to select the optimal number of populations. The CV error was lowest at K = 3, indicating that the optimal number of populations was three, and the whole collection was divided into three groups under that condition. When K = 2, the YN (Yunnan) population could not be separated from the Hunan populations; accessions in AN and HJ clustered into one group, while the remaining populations clustered into another (Fig. 3). At K = 3, the SNP panel separated the populations into three geographical types: the Yunnan group, the south of Hunan group and the north of Hunan group (Fig. 3). However, some accessions in the MS population were assigned to the YN group. When K = 4, the YN population was clearly separated from the Hunan populations, and accessions in RC were clearly separated from those in CB, MS and JH (Fig. 3). When K = 5, the group formed by AN and HJ, which were assigned to the north of Hunan type, was divided into two subgroups (Fig. 3). However, the south of Hunan group, including CB, MS and JH, could not be clearly separated at any K value (Fig. 3), which suggests extensive gene flow among these three geographical populations.

      Figure 3. 

      Analysis of population structure by ADMIXTURE. The x-axis indicates different research materials and the y-axis shows membership probability belonging to different populations.
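
Membership probabilities like those plotted in Fig. 3 are commonly turned into group labels by taking the largest coefficient for each accession. The sketch below illustrates that step. The 0.5 cut-off for calling an accession admixed is an assumption, and the Q values are invented placeholders rather than results from this study.

```python
def assign_groups(q_rows, threshold=0.5):
    """Return the index of the dominant cluster for each accession.

    q_rows: one list of membership probabilities per accession.
    An accession whose largest coefficient is below `threshold`
    is returned as None, meaning it is treated as admixed.
    """
    assignments = []
    for row in q_rows:
        best = max(range(len(row)), key=row.__getitem__)
        assignments.append(best if row[best] >= threshold else None)
    return assignments


# Placeholder Q matrix for four accessions at K = 3.
q_matrix = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.40, 0.35, 0.25],
    [0.02, 0.08, 0.90],
]

print(assign_groups(q_matrix))  # [0, 1, None, 2]
```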

      To validate the results of the structure analysis, PCA was performed, and all 110 accessions clustered clearly into three groups (Fig. 4), consistent with the structure analysis at K = 3 (Fig. 3). An NJ tree (Fig. 5) built from the SNPs was used to determine the genetic relationships among tea plant accessions, and it gave a result similar to the structure analysis at K = 3. The tea plant accessions from the seven geographical populations fell into three independent branches (Fig. 5): 14, 33 and 63 accessions were assigned to groups I, II and III, respectively. Eighty-seven percent of the accessions in the YN population were distinguished from the Hunan accessions and belonged to a single group (Fig. 5), in agreement with the PCA (Fig. 4) and the structure analysis at K = 3 (Fig. 3). Most accessions of MS and JH and part of CB were clustered into group II. Most of RC, AN and HJ and part of CB were clustered into group III, in which three subgroups were formed (Fig. 5). Eighty-seven percent of the accessions in the RC population were clustered into one subgroup, while eight accessions in CB were assigned to another subgroup (Fig. 5). Meanwhile, most accessions in AN and HJ were clustered into one subgroup (Fig. 5).

      Figure 4. 

      PCA plot of the 110 samples based on the top three principal components with different colors representing the populations, which were divided into three groups by the range of circles with 95% confidence level.

      Figure 5. 

      Phylogenetic tree of the 110 samples with three different colors indicating three groups obtained from the ADMIXTURE analysis result.

    • To analyze the genetic diversity of the seven tea plant populations, the genetic parameters H, Ho, PIC and MAF were calculated. As shown in Table 1, the H values indicated that the HJ population had the highest genetic variation and the YN population the lowest. The mean Ho ranged from 0.16 (YN) to 0.24 (HJ) (Table 1). The lowest mean PIC value was 0.14 and the highest was 0.17. The mean MAF ranged from 0.11 in the YN population to 0.14 in the HJ population (Table 1), showing a tendency similar to that of the PIC values.

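      Table 1.  H, Ho, PIC and MAF values among seven tea plant populations and three inferred groups.

      | Population | H (mean) | H (range) | Ho (mean) | Ho (range) | PIC (mean) | PIC (range) | MAF (mean) | MAF (range) |
      |---|---|---|---|---|---|---|---|---|
      | Seven population accessions | | | | | | | | |
      | YN | 0.16 | 0.08−0.50 | 0.16 | 0.11−0.25 | 0.14 | 0.11−0.38 | 0.11 | 0.08−0.50 |
      | RC | 0.18 | 0.07−0.50 | 0.20 | 0.15−0.22 | 0.15 | 0.09−0.38 | 0.12 | 0.09−0.50 |
      | CB | 0.20 | 0.10−0.50 | 0.21 | 0.18−0.24 | 0.17 | 0.12−0.38 | 0.12 | 0.08−0.50 |
      | MS | 0.20 | 0.10−0.52 | 0.22 | 0.17−0.25 | 0.16 | 0.07−0.38 | 0.13 | 0.10−0.50 |
      | JH | 0.20 | 0.11−0.53 | 0.22 | 0.18−0.25 | 0.17 | 0.09−0.38 | 0.13 | 0.10−0.50 |
      | AN | 0.20 | 0.10−0.50 | 0.23 | 0.17−0.28 | 0.17 | 0.12−0.38 | 0.13 | 0.11−0.50 |
      | HJ | 0.22 | 0.12−0.54 | 0.24 | 0.20−0.31 | 0.16 | 0.10−0.38 | 0.14 | 0.12−0.50 |
      | Three groups based on MEGA and ADMIXTURE | | | | | | | | |
      | I | 0.15 | 0.10−0.50 | 0.15 | 0.11−0.25 | 0.12 | 0.05−0.38 | 0.10 | 0.07−0.50 |
      | II | 0.21 | 0.17−0.50 | 0.21 | 0.15−0.28 | 0.18 | 0.10−0.38 | 0.13 | 0.08−0.50 |
      | III | 0.21 | 0.16−0.50 | 0.22 | 0.17−0.31 | 0.17 | 0.09−0.38 | 0.14 | 0.09−0.50 |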
    • Fst analysis and AMOVA were used to assess genetic differentiation among the seven populations. Pairwise Fst values ranged from 0.052 to 0.221; the highest differentiation was between YN and HJ, followed by MS and HJ (Table 2). Fst values between the YN population and the Hunan populations were generally higher than those among the Hunan populations, which was consistent with geographical differences. Among the Hunan populations, HJ showed the greatest differentiation from the other populations, except AN (Table 2). Moreover, Fst values between the CB population and the other Hunan populations were relatively low (Table 2). AMOVA indicated that only 18.62% of the total variance was attributable to differentiation among the seven populations, while 81.38% was attributable to variation within populations (Table 3), implying rich genetic diversity in the Hunan tea plant germplasm. Furthermore, AMOVA of the three groups defined by ADMIXTURE showed that the majority of the variance (about 80.77%) came from within groups (Table 3), further supporting the idea that variation among individuals contributes more to the differentiation of Hunan tea plant resources than geographical factors do.

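      Table 2.  Matrix of pairwise Nei's genetic distance and Fst among the seven populations.

      | Population | YN | RC | CB | MS | JH | AN | HJ |
      |---|---|---|---|---|---|---|---|
      | YN | | 0.162 | 0.133 | 0.165 | 0.154 | 0.170 | 0.221 |
      | RC | 0.035 | | 0.076 | 0.102 | 0.083 | 0.088 | 0.145 |
      | CB | 0.043 | 0.042 | | 0.076 | 0.058 | 0.052 | 0.114 |
      | MS | 0.051 | 0.043 | 0.031 | | 0.070 | 0.128 | 0.185 |
      | JH | 0.046 | 0.037 | 0.038 | 0.038 | | 0.077 | 0.136 |
      | AN | 0.046 | 0.038 | 0.035 | 0.044 | 0.038 | | 0.078 |
      | HJ | 0.046 | 0.052 | 0.044 | 0.059 | 0.053 | 0.035 | |
      Notes: above diagonal, Fst; below diagonal, Nei's genetic distance.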
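
The Fst range and the comparison between Yunnan and Hunan can be read off the upper triangle of Table 2 with a few lines of Python. The values are copied from the table; the variable and function names are hypothetical.

```python
POPS = ["YN", "RC", "CB", "MS", "JH", "AN", "HJ"]

# Above-diagonal Fst values of Table 2, row by row.
FST_ROWS = [
    [0.162, 0.133, 0.165, 0.154, 0.170, 0.221],
    [0.076, 0.102, 0.083, 0.088, 0.145],
    [0.076, 0.058, 0.052, 0.114],
    [0.070, 0.128, 0.185],
    [0.077, 0.136],
    [0.078],
]


def pairwise(pops, rows):
    """Map each population pair to its Fst value."""
    pairs = {}
    for i, row in enumerate(rows):
        for offset, value in enumerate(row, start=1):
            pairs[(pops[i], pops[i + offset])] = value
    return pairs


fst = pairwise(POPS, FST_ROWS)
lowest = min(fst, key=fst.get)
highest = max(fst, key=fst.get)
print(lowest, fst[lowest])    # ('CB', 'AN') 0.052
print(highest, fst[highest])  # ('YN', 'HJ') 0.221

# Mean Fst for pairs that involve YN versus pairs within Hunan.
with_yn = [v for pair, v in fst.items() if "YN" in pair]
within_hunan = [v for pair, v in fst.items() if "YN" not in pair]
print(sum(with_yn) / len(with_yn) > sum(within_hunan) / len(within_hunan))  # True
```

Individual pairs can still deviate from the averages: MS and HJ (0.185) are more differentiated than YN and CB (0.133).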

      Table 3.  AMOVA of the whole population.

      | Source of variation | Degrees of freedom | Sum of squares | Mean square | Components of covariance (Sigma) | Components of covariance (%) |
      |---|---|---|---|---|---|
      | Seven population accessions | | | | | |
      | Between populations | 6 | 17,773.89 | 2,962.31 | 147.64 | 18.62 |
      | Within populations | 103 | 66,455.06 | 645.19 | 645.19 | 81.38 |
      | Total | 109 | 84,228.95 | 772.74 | 792.84 | 100.00 |
      | Three groups based on MEGA and ADMIXTURE | | | | | |
      | Between groups | 2 | 11,758.90 | 5,879.45 | 161.28 | 19.23 |
      | Within groups | 107 | 72,470.05 | 677.29 | 677.29 | 80.77 |
      | Total | 109 | 84,228.95 | 772.74 | 838.57 | 100.00 |
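
The percentages in Table 3 follow directly from the variance components: each share is a component divided by their sum. A minimal check, with a hypothetical function name and the Sigma values copied from the table:

```python
def variance_shares(sigma_between, sigma_within):
    """Percentage of total variance between and within groups."""
    total = sigma_between + sigma_within
    return 100 * sigma_between / total, 100 * sigma_within / total


# Sigma values from Table 3.
seven_pops = variance_shares(147.64, 645.19)
three_groups = variance_shares(161.28, 677.29)

print([round(x, 2) for x in seven_pops])    # [18.62, 81.38]
print([round(x, 2) for x in three_groups])  # [19.23, 80.77]
```

The between-group share, read as a proportion (about 0.19 in both partitions), is the AMOVA analogue of Fst.
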
    • A core collection containing 22 individuals from the seven populations was constructed using the R package corehunter (Table 4). To check whether the core germplasm effectively represents the genetic diversity of the whole tea germplasm, the genetic parameters of the core collection were estimated; the H, Ho, PIC and MAF values of the core collection were consistent with those of the entire collection (Table 5). AMOVA showed no significant difference between the core germplasm set and the rest of the collection (P = 0.85). The between-germplasm variance component was slightly negative (−0.42%), so that 100.41% of the total variation was attributed to genetic differences within the collection, suggesting that the core germplasm set represents the whole germplasm well (Table 6).

      Table 4.  The core collection.

      | Population | Core collection |
      |---|---|
      | YN | YN2, YN9 |
      | RC | RC11, RC14, RC16 |
      | CB | CB1, CB6, CB8, CB9, CB11 |
      | MS | MS4, MS5, MS14, MS17 |
      | JH | JH9, JH16 |
      | AN | AN1, AN2, AN5, AN11, AN12 |
      | HJ | HJ1 |

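      Table 5.  Genetic parameters of the core collection and the whole germplasm.

      | Germplasm | H (mean) | H (range) | Ho (mean) | Ho (range) | PIC (mean) | PIC (range) | MAF (mean) | MAF (range) |
      |---|---|---|---|---|---|---|---|---|
      | Entire germplasm | 0.22 | 0.06−0.50 | 0.21 | 0.13−0.31 | 0.19 | 0.05−0.38 | 0.14 | 0.03−0.50 |
      | Core germplasm set | 0.21 | 0.00−0.50 | 0.20 | 0.11−0.28 | 0.18 | 0.00−0.38 | 0.13 | 0.00−0.50 |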

      Table 6.  The AMOVA results among the core germplasm and non-core germplasm.

      | Source of variation | Degrees of freedom | Sum of squares | Mean square | Components of covariance (Sigma) | Components of covariance (%) | P-value |
      |---|---|---|---|---|---|---|
      | Between germplasm | 1 | 660.86 | 660.86 | −3.21 | −0.42 | 0.85 |
      | Within germplasm | 108 | 83,568.10 | 773.78 | 773.78 | 100.41 | |
      | Total | 109 | 84,228.95 | 772.74 | 770.57 | 100.00 | |
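
The idea behind a core collection, keeping a small subset that stays spread out across the genetic space, can be illustrated with a greedy farthest-point sketch. This is a simplified stand-in chosen for brevity and is not the optimization performed by Core Hunter. The distance measure, the function names and the toy genotypes are all assumptions made for the example.

```python
def allele_distance(a, b):
    """Mean absolute difference in allele dosage (0/1/2), scaled to [0, 1]."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0
    return sum(abs(x - y) for x, y in pairs) / (2 * len(pairs))


def greedy_core(genotypes, size):
    """Pick `size` accessions by repeatedly adding the most isolated one."""
    names = sorted(genotypes)
    if not 2 <= size <= len(names):
        raise ValueError("size must be between 2 and the number of accessions")
    dist = {(i, j): allele_distance(genotypes[i], genotypes[j])
            for i in names for j in names}
    # Seed the core with the most distant pair of accessions.
    seed = max(((i, j) for i in names for j in names if i < j),
               key=lambda pair: dist[pair])
    core = list(seed)
    while len(core) < size:
        rest = [n for n in names if n not in core]
        # Add the accession whose nearest core member is farthest away.
        core.append(max(rest, key=lambda n: min(dist[(n, c)] for c in core)))
    return core


# Toy genotypes (alternate-allele counts at four SNPs), for illustration only.
toy = {
    "A": [0, 0, 0, 0],
    "B": [0, 0, 0, 1],
    "C": [2, 2, 2, 2],
    "D": [2, 2, 1, 2],
    "E": [1, 1, 1, 1],
}

print(greedy_core(toy, 3))  # ['A', 'C', 'E']
```

In the toy data B sits next to A and D next to C, so the third pick is the intermediate accession E rather than a near-duplicate.
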
    • Genomics research on tea plants has developed rapidly over the past decade. Several reference genomes of tea plants, including 'Yunkang 10'[3], 'Shuchazao'[28], a wild tea plant[29], 'Longjing 43'[30], 'Tieguanyin'[31], 'Biyun'[32] and 'Huangdan'[33], have been released. Recently, the genetic diversity of C. sinensis has been analyzed using genome sequencing technology[34−36]. In this study, the genetic diversity, population structure, population differentiation and core germplasm of Hunan tea plant resources were evaluated using GBS.

      Analysis of cross-validation errors showed that the lowest value was reached at K = 3, and PCA and phylogenetic tree analysis showed that the seven geographical populations clustered clearly into three groups. Multiple analyses based on the high-quality SNPs, including population structure analysis, PCA and phylogenetic analysis, confirmed that the YN population clustered into a single group, that most accessions in AN and HJ from the north of Hunan were assigned to one group, and that the remaining accessions in RC, MS and JH from the south of Hunan were classified into another group. The separation of the YN population from the Hunan populations verified that the SNP data obtained by GBS were reliable and indicated that geographical barriers led to genetic differences between the Hunan and YN populations. At the same time, the three populations from southern Hunan and the two populations from northern Hunan were divided into two groups, which was consistent with their geographical distribution[37]. The population structure, phylogenetic and PCA results showed that the RC population clustered into one subgroup, in agreement with morphological results[38] and RAPD marker analysis[14], which indicated that the RC population was derived from other tea plant populations in Hunan. The phylogenetic tree showed that RC shared a closer evolutionary relationship with MS, JH, CB, AN and HJ than with C. sinensis var. pubilimba in Yunnan[39]. However, accessions in CB were divided into two different subgroups, indicating that population differentiation had occurred within the CB population[29]. Accessions in CB, MS, JH and other populations from Hunan showed more gene exchange, which was also confirmed by the genetic structure analysis.

      The above results confirmed the reliability of the phylogenetic tree analysis and suggested that there was obvious gene flow between different cohabiting groups at the genomic level.

      AMOVA revealed that differentiation among the seven surveyed regions and among the three groups (Table 3) contributed only 18.62% and 19.23% of the total variance, respectively, and that most of the genetic variation came from within populations, similar to the observations of Yao et al.[40]. The AMOVA results therefore indicate rich genetic diversity within populations of Hunan tea germplasm. These results help explain the NJ tree, in which accessions from the same geographical region, such as the CB, JH and AN populations, were not completely clustered into the same group. Frequent introduction of tea plants from different geographical regions for breeding possibly promoted the exchange of genetic material, leading to similar genetic backgrounds at different locations. Geographical location thus had a limited effect on genetic diversity, and a similar lack of geographical differentiation has been found in taro[41] and sweet potato[42,43].

      Furthermore, a core tea germplasm set containing 22 tea accessions was developed in this study based on 311,044 genome-wide SNPs. The core collection preserved the genetic diversity of the whole resource population to the greatest extent with the smallest number of accessions, and it represented both the genetic diversity and the geographical distribution of the whole resource population (Table 4). It should therefore improve the efficiency of germplasm exchange, utilization and germplasm resource nursery management. This work is the first report of a core tea germplasm set for Hunan, which should help breeders use Hunan tea plant resources effectively and reduce redundant breeding. Additionally, based on the core germplasm, genetically similar accessions can be removed so that work on important agronomic and quality traits can focus on a relatively small number of tea plant accessions suitable as breeding materials.

      • This work was financially supported by The Central Government Guides Local Funds (2019XF5041), Hunan Agricultural Science and Technology Innovation Fund (2020CX035), the National Natural Science Foundation of China (32172629, U19A2030, 31670689), Provincial Natural Science Foundation of Hunan (2020JJ4358), and Hunan Provincial Seed Industry Innovation Project (2021NK1008).

      • The authors declare that they have no conflict of interest.

      • Copyright: © 2023 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.