GAGE is a method for identification of plant species based on whole genome analysis and genome editing

Construction of the target sequence library of Crocus sativus

Establishing GAGE requires whole-genome sequences. The whole genome of Crocus sativus (saffron) was determined in an earlier study and, considering its high economic and medicinal value, we selected Cr. sativus to demonstrate GAGE. Cr. sativus, one of the most expensive herbs in the world, has long been widely cultivated in the Mediterranean, East Asia, and the Irano-Turanian region for its highly valued applications in medicine, tea, and cooking seasonings17,18. We first identified the candidate target sequences (hereinafter referred to as Targets), each with a nearby PAM, in the genome of Cr. sativus. The genome sequence analysis revealed more than 178 million Targets in the genome of Cr. sativus, nearly one-third of which remained after deduplication (Table 1). On average, there was one potential target sequence per 26.8 bp in the genome of Cr. sativus. Of the Targets in the annotated regions of the genome, most were located in coding genes; only 21,275 were located in non-coding RNAs. Among the Targets located in coding genes, 1,997,115 were located in DNA coding sequences (CDSs) and could be used for screening Targets in functional genes. Targets in intraspecifically conserved, species-specific regions are more likely to serve as the final target sequence for identification, and the universal DNA barcoding regions have been demonstrated to have these characteristics5. To evaluate GAGE rapidly, we focused on the Targets in the ITS2 region of Cr. sativus, although Targets in other regions with the above characteristics, such as the ycf1 and leafy genes19,20, would also be suitable. Only one target sequence was located in the ITS2 region; we named it Cs_target1 (Fig. 2a), and it had 201 copies in the genome (Fig. 2b). The matched crRNA of Cs_target1 (Cs_crRNA) was designed by adding the crRNA repeat upstream of the spacer (Fig. 2a).
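The Target search and crRNA design described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the text does not state the PAM or the spacer length, so the sketch assumes the TTTV PAM commonly used with LbaCas12a, a 20-nt protospacer immediately 3′ of the PAM, and the commonly cited LbCas12a direct repeat. The function names are hypothetical.

```python
from collections import Counter

PAM_LEN = 4       # assumption: TTTV PAM (V = A, C, or G)
SPACER_LEN = 20   # assumption: protospacer length chosen for illustration
# Assumption: commonly cited LbCas12a direct repeat; verify before synthesis.
LB_CAS12A_REPEAT = "UAAUUUCUACUAAGUGUAGAU"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (non-ACGT symbols are left as-is)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(seq):
    """Yield every protospacer that directly follows a TTTV PAM, on both strands."""
    for strand in (seq.upper(), reverse_complement(seq.upper())):
        for i in range(len(strand) - PAM_LEN - SPACER_LEN + 1):
            pam = strand[i:i + PAM_LEN]
            if pam[:3] == "TTT" and pam[3] in "ACG":
                protospacer = strand[i + PAM_LEN:i + PAM_LEN + SPACER_LEN]
                # Skip windows containing ambiguous bases such as N.
                if set(protospacer) <= set("ACGT"):
                    yield protospacer


def copy_numbers(seq):
    """Deduplicate Targets and count how many copies of each one occur."""
    return Counter(find_targets(seq))


def design_crrna(protospacer):
    """Place the crRNA repeat upstream of the spacer, written as RNA."""
    return LB_CAS12A_REPEAT + protospacer.replace("T", "U")
```

The number of keys in `copy_numbers(...)` corresponds to the deduplicated Target count, and the values correspond to per-Target copy numbers of the kind plotted in Fig. 2b.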

Table 1 Statistics of Targets in different classifications.
Fig. 2: Targets in the ITS2 region of Crocus sativus.

a Target sequence located in the ITS2 region of Crocus sativus, Cs_target1, and its crRNA, Cs_crRNA. The red sequence is the PAM and the purple sequence is the protospacer in Cs_target1; the gray sequence is the universal sequence of the crRNA for LbaCas12a and the purple sequence is the spacer in Cs_crRNA. b Density of Targets' copy numbers ranging from 1 to 300 in Cr. sativus. Cs_target1 has 201 copies in the genome (a).

Feasibility, specificity, and sensitivity of GAGE

Using the crRNA matched to Cs_target1 (Cs_crRNA) and ITS2 fragments of Cr. sativus as the DNA substrate, we confirmed the feasibility, specificity, and sensitivity of GAGE. We first investigated the endonuclease activity and the collateral cleavage activity of LbaCas12a to test the feasibility of GAGE. As shown in Fig. 3a, the complex of Cas12a and Cs_crRNA cleaved the DNA substrate and generated short DNA fragments. In Fig. 3b, lanes b1, b2, and b3 contained the same 50 bp ssDNA (Fig. S1), but only the ssDNA in b1 was digested, by the collateral cleavage activity of Cas12a in the presence of both the DNA substrate and the crRNA. Encouraged by this, we next assembled the reaction with an ssDNA reporter (Poly_A_FQ: 5′-FAM-AAAAAAAAAA-BHQ-3′) to evaluate the feasibility of GAGE and successfully detected a fluorescence signal (Fig. 3c). We therefore concluded that GAGE is a feasible approach.

Fig. 3: Feasibility, specificity and sensitivity of GAGE.

a Endonuclease activity of Cas12a. Lane Marker: DL1000 (Takara Biomedical Technology (Beijing) Co., Ltd., China); Lane a1: Cas12a + Cs_crRNA + ITS2 fragments of Cr. sativus; Lane a2: Cas12a + Cs_crRNA; Lane a3: ITS2 fragments of Cr. sativus. b Collateral cleavage activity of Cas12a. Lane b1: Cas12a + Cs_crRNA + ITS2 fragments of Cr. sativus + 50 bp ssDNA; Lane b2: 50 bp ssDNA; Lane b3: Cas12a + Cs_crRNA + 50 bp ssDNA. c Fluorescence signal of GAGE. The two groups were incubated at 37 °C for 25 min and detected by excitation with 470 nm light. Cs: Cas12a + Cs_crRNA + ITS2 fragments of Cr. sativus + ssDNA reporter; CK (negative control): Cas12a + Cs_crRNA + H2O + ssDNA reporter. d Specificity of GAGE. The plant materials in the circle are the stigma of saffron. The DNA substrates of the two groups were as follows: Cs: ITS2 fragments of Cr. sativus; CK (negative control): nuclease-free water. e Sensitivity of GAGE. The seven groups had different concentrations of DNA substrate (ITS2 fragments of Cr. sativus): e1: 10 ng/μL; e2: 1 ng/μL; e3: 0.1 ng/μL; e4: 0.01 ng/μL; e5: 0.001 ng/μL; e6: 0.0001 ng/μL; and CK (negative control): 0 ng/μL. All plotted data represent means +/− standard deviation (SD) from three independent replicates.

The specificity of GAGE is shown in Fig. 3d; the two groups had the same components except for the DNA substrate. Only when the ITS2 fragments of Cr. sativus were present did Cas12a digest Poly_A_FQ and generate a fluorescence signal, which rose rapidly and was significantly higher than that of the negative control (CK). This result firmly supports the specificity of GAGE. To determine the sensitivity of GAGE, the ITS2 fragments of Cr. sativus were serially diluted ten-fold to obtain DNA substrates at six final concentrations, ranging from 0.0001 to 10 ng/μL, with 0 ng/μL as the negative control. As shown in Fig. 3e, e2 (1 ng/μL) had the highest fluorescence signal and reached its maximum in the shortest time. The t-test showed no significant difference (P > 0.01) between e5 (0.001 ng/μL) and CK during the 35 min, but a significant difference (P < 0.01) between e4 (0.01 ng/μL) and CK, so the limit of detection (LOD) of GAGE was taken to be 0.01 ng/μL.
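The LOD reasoning above (compare each dilution with CK and take the lowest concentration that still differs significantly) can be sketched as below. This is an illustrative sketch only: it assumes a two-sample Student's t-test with pooled variance on three replicates per group, uses the tabulated two-sided critical value for α = 0.01 with 4 degrees of freedom, and runs on hypothetical readings that are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided critical t value for alpha = 0.01 and df = 4 (3 + 3 replicates).
T_CRIT_DF4_P01 = 4.604


def pooled_t(a, b):
    """Student's two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def limit_of_detection(readings, control):
    """Lowest concentration that differs from the control, walking down
    from the highest concentration and stopping at the first non-significant one."""
    lod = None
    for conc in sorted(readings, reverse=True):
        if abs(pooled_t(readings[conc], control)) <= T_CRIT_DF4_P01:
            break
        lod = conc
    return lod


# Hypothetical end-point fluorescence readings (arbitrary units), keyed by ng/uL.
example_readings = {
    10: [900, 920, 910],
    1: [950, 960, 940],
    0.1: [700, 690, 710],
    0.01: [300, 310, 290],
    0.001: [52, 49, 51],
    0.0001: [50, 51, 49],
}
example_control = [50, 48, 52]
```

With these made-up readings, `limit_of_detection(example_readings, example_control)` returns 0.01, mirroring the logic (not the measurements) of the e4 versus e5 comparison.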

Identification of Crocus sativus with GAGE

Cr. sativus and its adulterants were first subjected to GAGE. The stigma of Cr. sativus is used for medicinal purposes and as a perfume material. Because of its high value and low yield, plant materials from other species with similar characteristics, including the flower of Carthamus tinctorius (safflower), the stamen of Nelumbo nucifera (lotus), and the style and stigma of Zea mays (corn), are dyed red and passed off as the stigma of Cr. sativus21. For the identification of Cr. sativus (saffron) and its adulterants, only Targets present in the genome of Cr. sativus and absent from the genomes of Ca. tinctorius, N. nucifera, and Z. mays were considered candidates for the final target sequence. In addition, previous studies showed that off-target effects occur with three or fewer mismatches22. Based on the target sequence, the crRNA was designed to recognize and bind to the DNA substrate from Cr. sativus, which in turn drove the generation of fluorescence. Because no sequence in the genomes of the adulterants matched the designed crRNA, no fluorescence would be generated when sequences from the adulterants were used as the DNA substrate. We therefore analyzed the specificity and predicted off-targets of Cs_target1 by mapping it to the genomes of Cr. sativus and its three adulterants. The prediction showed that the genomes of Ca. tinctorius, N. nucifera, and Z. mays contained no sequence within three base mismatches of Cs_target1, and that the genome of Cr. sativus contained 13 sequences with one base mismatch and 8 sequences with two base mismatches (Fig. 4a and Tables S1, S2), so we chose Cs_crRNA as the crRNA to identify Cr. sativus. The ITS2 fragments of Cr. sativus and its three adulterants were used as DNA substrates, and every group had the same composition except for the DNA substrate. As depicted in Fig. 4b, only Cs generated a detectable fluorescence signal, which reached its maximum at 25 min and remained steady until the end of the assay. The groups containing ITS2 fragments from the adulterants did not generate a fluorescence signal. These results indicated that GAGE can specifically distinguish Cr. sativus from its adulterants within a short time.
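The off-target screen described above, counting genomic sites within three mismatches of a Target, can be sketched with a brute-force Hamming-distance scan. This is an illustration under simplifying assumptions rather than the mapping tool the authors used: it ignores the PAM, treats only substitutions (no indels), and is far too slow for real genomes. All function names are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_profile(target, genome, max_mismatches=3):
    """Count windows on both strands at each distance 0..max_mismatches.

    Key 0 holds exact copies; keys 1..3 hold potential off-target sites.
    """
    profile = Counter()
    k = len(target)
    for strand in (genome.upper(), reverse_complement(genome.upper())):
        for i in range(len(strand) - k + 1):
            distance = hamming(target, strand[i:i + k])
            if distance <= max_mismatches:
                profile[distance] += 1
    return profile


def is_species_specific(target, own_genome, other_genomes, max_mismatches=3):
    """True if the target occurs in its own genome and no other genome
    contains any site within max_mismatches of it."""
    if mismatch_profile(target, own_genome, 0)[0] == 0:
        return False
    return all(
        not mismatch_profile(target, genome, max_mismatches)
        for genome in other_genomes
    )
```

In this framing, the reported result for Cs_target1 would correspond to an empty profile for each adulterant genome and counts of 13 and 8 under keys 1 and 2 for the Cr. sativus genome.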

Fig. 4: Identification of Crocus sativus with GAGE.

a Selection of target sequences. The red lines represent Cs_target1 and its copies, and the white lines represent sequences within three base mismatches of Cs_target1. b Results of the identification of Cr. sativus with GAGE. The plant materials in the circle, square, triangle, and inverted triangle are the stigma of saffron, the flower of safflower, the stamen of lotus, and the style and stigma of corn, respectively. The DNA substrates of each group were as follows: Cs: ITS2 fragments of Cr. sativus; Ct: ITS2 fragments of Ca. tinctorius; Nn: ITS2 fragments of N. nucifera; Zm: ITS2 fragments of Z. mays; and CK (negative control): nuclease-free water. All plotted data represent means +/− standard deviation (SD) from three independent replicates.

Application of GAGE in plants from different classes

To evaluate the applicability of GAGE, we further applied it to Ricinus communis, Setaria italica, Ginkgo biloba, Alsophila spinulosa, and Selaginella tamariscina. Among them, R. communis and Set. italica belong to the angiosperms, G. biloba to the gymnosperms, A. spinulosa to the ferns, and Sel. tamariscina to the lycophytes. Following the procedure used for Cr. sativus, we first analyzed the genomes of R. communis, Set. italica, A. spinulosa, and Sel. tamariscina: there was one target sequence per 18.5 bp, 29.9 bp, 26.3 bp, and 24.6 bp, covering 99.97%, 94.14%, 99.94%, and 99.91% of coding genes (CDS regions), respectively (Fig. 5a). The Targets located in the ITS2 region of the four species were also extracted: there were three targets in both R. communis and Set. italica, two targets in A. spinulosa, and only one in Sel. tamariscina. All three targets in the R. communis genome had more than 200 copies, whereas the targets in Sel. tamariscina, Set. italica, and A. spinulosa had only 9–23 copies (Table S3). Because the full genome of G. biloba is too large to analyze, we constructed a smaller Targets library from its ITS2 region rather than from the whole genome and extracted two targets (Gb_target1 and Gb_target2). Finally, GAGE was performed using the high-copy Rc_target3, Si_target3, As_target2, Gb_target1, and St_target1 as targets, and the matched crRNAs (Rc_crRNA, Si_crRNA, Gb_crRNA, As_crRNA, and St_crRNA) were synthesized based on them. The ITS2 fragments of the above plants were used as DNA substrates for their respective assays. The results (Fig. 5b) showed that each group produced a significant fluorescence signal, which confirmed that GAGE is broadly applicable across plant classes, including angiosperms, gymnosperms, ferns, and lycophytes.
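The summary statistics quoted above (base pairs per Target, share of CDS regions containing a Target, and the choice of a high-copy Target) can be illustrated with the sketch below. How the authors computed coverage is not stated, so this assumes a CDS counts as covered when at least one Target site lies entirely inside it; interval conventions and all names are hypothetical.

```python
from bisect import bisect_left


def bp_per_target(genome_length, n_targets):
    """Average spacing between Targets, in base pairs."""
    return genome_length / n_targets


def cds_coverage(cds_intervals, target_starts, site_len):
    """Fraction of CDS intervals that fully contain at least one Target site.

    cds_intervals: list of half-open (begin, end) coordinates.
    target_starts: start coordinate of every Target site (PAM included).
    site_len: length of one Target site in bp.
    """
    starts = sorted(target_starts)
    covered = 0
    for begin, end in cds_intervals:
        # The first site starting at or after `begin` is the earliest-ending
        # candidate, so it is the only one that needs checking.
        i = bisect_left(starts, begin)
        if i < len(starts) and starts[i] + site_len <= end:
            covered += 1
    return covered / len(cds_intervals)


def pick_high_copy(copy_numbers):
    """Name of the Target with the most genomic copies."""
    return max(copy_numbers, key=copy_numbers.get)
```

For example, `pick_high_copy({"t1": 9, "t2": 23})` selects `"t2"`; the copy numbers here are placeholders, not values from Table S3.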

Fig. 5: Application of GAGE in plants from different classes.

a Statistics of Targets in different classifications derived from the four genomes. Genome: Targets located in the whole genome; Unannotated regions: Targets located in the unannotated regions of the genome; Coding genes: Targets located in coding genes; CDS: Targets located in CDS. b Results of applying GAGE to plants from different classes. The DNA substrate and crRNA of each group were as follows: Rc: ITS2 fragments of R. communis + Rc_crRNA; Si: ITS2 fragments of Set. italica + Si_crRNA; Gb: ITS2 fragments of G. biloba + Gb_crRNA; As: ITS2 fragments of A. spinulosa + As_crRNA; St: ITS2 fragments of Sel. tamariscina + St_crRNA; and CK (negative control): nuclease-free water + Cs_crRNA. All plotted data represent means +/− standard deviation (SD) from three independent replicates.
