Garlic common latent virus (GarCLV) is a member of the genus Carlavirus in the family Betaflexiviridae, order Tymovirales. GarCLV has flexuous, filamentous particles typical of carlaviruses [1], containing a single molecule of linear ssRNA of ~8.6 kb [2]. GarCLV alone is symptomless on garlic, but it induces severe yellowing and mosaic in mixed infections with potyviruses such as Onion yellow dwarf virus and Leek yellow stripe virus [3–5]. Although GarCLV has been reported from many countries, its molecular characteristics, such as genetic diversity, phylogenetic relationships, and intra- and interspecies recombination, have not been sufficiently investigated to date. The present study reports the molecular characterization of five GarCLV isolates from India, along with their phylogenetic relationships and evidence of recombination. We analyzed the coat protein (CP) sequence because the ICTV considers it one of the most important regions for species demarcation in the genus Carlavirus [1].

Five isolates of GarCLV were selected from previously tested garlic accessions [6]: GarCLV-G1 (Northern India), GarCLV-RAU (Eastern India), GarCLV-Kolar (Southern India), GarCLV-JN and -Anand (Western India).

Primers GCLVCP-F (ATGTCAA/GC/TGAGTGAAACAGA) and GCLVCP-R (CTAGTCTGCATTGTTGGATCC) were designed and synthesized (Sigma Aldrich Co., India) for amplification of the CP gene. Total RNA was extracted using the RNeasy Plant Mini Kit (QIAGEN GmbH, Hilden, Germany) following the manufacturer’s instructions. First-strand cDNA of the GarCLV CP gene was synthesized using the GCLVCP-R primer. The 20 μl reverse transcription mixture, containing 0.2 μM reverse primer (GCLVCP-R), 5 U MuLV RT (MBI Fermentas, USA), 2 μl of 5X reaction buffer, and 0.3 mM dNTPs, was incubated at 42 °C for 45 min. The complete CP gene was then amplified under the PCR conditions described previously [6].

A 960 bp product corresponding to the complete CP gene of GarCLV was amplified from all five isolates. Purified RT-PCR products were cloned into the pGEM-T Easy vector according to the manufacturer’s instructions (Promega, Madison, USA), and clones were confirmed by standard colony PCR with the specific CP primers as well as by EcoRI digestion. Two clones of each isolate were sequenced using an ABI 3130 Genetic Analyzer (Xcelaris, Ahmadabad, India). The consensus sequences were deposited in GenBank under accession numbers JQ818255, JQ818256, JQ818257, JQ818258, and JQ818259, which correspond to the JN, RAU, Kolar, Anand, and G1 isolates, respectively. In all isolates the CP gene was 960 nucleotides (nt) long, and the isolates shared nt and aa identities of 95.7–99.0 and 98.1–99.6 %, respectively.
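The 960 nt gene length can be related to the 319 aa protein length used later in the alignment with a short calculation. The sketch below assumes that the 960 nt figure includes the terminal stop codon (the reverse primer begins with CTA, the reverse complement of TAG); the function name `cp_protein_length` is hypothetical and used only for illustration.

```python
def cp_protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: the ORF length is a whole number of codons and, when
    includes_stop is True, the last codon is a stop codon that is not translated.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 960 nt = 320 codons; one stop codon leaves 319 amino acids.
print(cp_protein_length(960))
```

Under this assumption the result agrees with the 319 aa CP described in the alignment analysis.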

The CP sequences of 23 GarCLV isolates and a reference CP sequence for each of 37 Carlavirus species were retrieved from GenBank and used for sequence comparisons. Comparison of the 28 GarCLV CP sequences (5 from this study) revealed identities of 88.1–99.8 % and 91.5–100 % at the nt and aa levels, respectively (Table 1). Comparison of the CP sequences of the 37 Carlavirus species with the five Indian GarCLV isolates revealed identities of 33–62 and 28–61 % at the nt and aa levels, respectively. The Carlavirus species closest to and farthest from GarCLV were Coleus vein necrosis virus (CVNV) and Cole latent virus (CoLV), sharing 61–62 and 33–34 % nt identity, respectively (Suppl. Table 1). This study shows a high sequence diversity (11.9 %) in the CP gene among GarCLV isolates worldwide. However, when only the Indian isolates were considered, a diversity of just 4.3 % was observed in the CP nucleotide sequences, suggesting that the Indian isolates are distinct from isolates from other parts of the world. The lack of substantial diversity among Indian isolates could be attributed to the vegetatively propagated nature of the crop and the wide distribution of the single variety “G1” in the major garlic-growing areas of India.
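The diversity values quoted above appear to be 100 % minus the lowest pairwise nt identity (100 − 88.1 = 11.9 for all isolates; 100 − 95.7 = 4.3 for the Indian isolates). The following minimal sketch shows how such figures could be derived from aligned sequences. It is not the software used in the study: the function names, the toy sequences, and the rule of skipping gapped columns are assumptions made for illustration.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are excluded pairwise
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def identity_range(seqs: dict) -> tuple:
    """Lowest and highest pairwise identity among all sequences."""
    values = [percent_identity(seqs[p], seqs[q]) for p, q in combinations(seqs, 2)]
    return min(values), max(values)


def max_diversity(min_identity: float) -> float:
    """Maximum diversity, taken as 100 % minus the lowest pairwise identity."""
    return round(100.0 - min_identity, 1)


# Toy aligned sequences (hypothetical, not GarCLV data).
toy = {"iso1": "ATGTCAGTGA", "iso2": "ATGTCGGTGA", "iso3": "ATGTCAGTAA"}
low, high = identity_range(toy)
print(low, high, max_diversity(low))

# Reproducing the arithmetic reported in the text.
print(max_diversity(88.1), max_diversity(95.7))
```

Identity computed this way equals 100 × (1 − p-distance) for the same pair, so the same helper relates directly to distance-based trees.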

Table 1 Sequence identity (%) of coat protein (CP) gene of Garlic common latent virus isolates

A multiple alignment of the CP aa sequences revealed that most of the variability was located within the N-terminal 48 aa, where 21 aa were highly variable among the 28 isolates (Suppl. Fig. 1). Conversely, the central and C-terminal regions (aa 49 to 319) were highly conserved in all isolates (data not shown). The presence of conserved domains such as Flexi_CP_N and Flexi_CP has been reported previously in the CP of carlaviruses and potexviruses [7]. In the present study, 13 different conserved domains and 3 motifs were predicted within the CP using the SMART [8] and MEME [9] programs, respectively (Suppl. Table 2). The sequences of all three motifs were submitted to the TOMTOM motif comparison web tool, which indicated that they share signatures with transcription factors and may have a role in nucleic acid binding.
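The regional comparison above can be illustrated by counting variable columns in separate windows of a protein alignment. This is a sketch only: the helper `variable_sites` is hypothetical, the toy alignment is not GarCLV data, and a site is counted as variable as soon as more than one residue occurs in the column, which is probably looser than the "highly variable" criterion used in the study.

```python
def variable_sites(alignment: list, start: int, end: int) -> int:
    """Count alignment columns in [start, end] (1-based, inclusive)
    that contain more than one distinct residue, ignoring gaps."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    if not 1 <= start <= end <= length:
        raise ValueError("window is outside the alignment")
    count = 0
    for i in range(start - 1, end):
        residues = {seq[i] for seq in alignment if seq[i] != "-"}
        if len(residues) > 1:
            count += 1
    return count


# Toy alignment: variable N-terminal window, conserved remainder.
toy_alignment = ["MSVTEKAA", "MPATEKAA", "MSLTEKAA"]
print(variable_sites(toy_alignment, 1, 4))  # N-terminal window
print(variable_sites(toy_alignment, 5, 8))  # remaining positions
```

For a real CP alignment the two windows would be positions 1 to 48 and 49 to 319.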

Fig. 1

a Phylogenetic analysis of the coat protein (CP) of Garlic common latent virus (GarCLV) isolates from different parts of the world. The evolutionary history was inferred using the Maximum Likelihood method based on the Tamura-Nei model. The percentage of trees in which the associated taxa clustered together is shown next to the branches. Initial tree(s) for the heuristic search were obtained automatically as follows. When the number of common sites was <100 or less than one-fourth of the total number of sites, the maximum parsimony method was used; otherwise the BIONJ method with MCL distance matrix was used. The analysis involved 28 sequences. Codon positions included were 1st + 2nd + 3rd + Noncoding. All positions containing gaps and missing data were eliminated. Evolutionary analyses were conducted in MEGA4. b Phylogenetic analysis of the coat protein (CP) of Carlavirus species infecting diverse plant species. The evolutionary history was inferred using the Neighbor-Joining method. The bootstrap consensus tree inferred from 1,000 replicates is taken to represent the evolutionary history of the taxa analyzed. Branches corresponding to partitions reproduced in less than 50 % of bootstrap replicates are collapsed. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 replicates) is shown next to the branches. The evolutionary distances were computed using the p-distance method and are in units of the number of base differences per site. The analysis involved 43 nucleotide sequences. All ambiguous positions were removed for each sequence pair. Evolutionary analyses were conducted in MEGA4.
Full names of the viruses: NSV Narcissus symptomless virus, SPCFV Sweet potato chlorotic fleck virus, RCVNV Red clover vein mosaic virus, NCLV Narcissus common latent virus, HpMV Hop mosaic virus, CVNV Coleus vein necrosis virus, CoLV Cole latent virus, AcLV Aconitum latent virus, BlScV Blueberry scorch virus, CLV Carnation latent virus, CVB Chrysanthemum virus B, CPMMV Cowpea mild mottle virus, HpLV Hop latent virus, LSV Lily symptomless virus, MYaV Melon yellowing-associated virus, PLV Passiflora latent virus, PeSV Pea streak virus, PotLV Potato latent virus, PVM Potato virus M, PVS Potato virus S, PVP Potato virus P, SLV Shallot latent virus, AHLV American hop latent virus, DVS Daphne virus S, ButMV Butterbur mosaic virus, HeMV Helleborus mosaic virus, HeNNV Helleborus net necrosis virus, HdCMV Hydrangea chlorotic mottle virus, KLV Kalanchoe latent virus, LNRSV Ligustrum necrotic ringspot virus, MjMV Mirabilis jalapa mottle virus, PhlVB Phlox virus B, PhlVS Phlox virus S, PopMV Poplar mosaic virus, CapLV Caper latent virus, HVS Helenium virus S, NeLV Nerine latent virus, and GarCLV Garlic common latent virus K2 isolate

The maximum likelihood phylogenetic tree constructed in MEGA4 [10] split the GarCLV isolates into two major clusters, designated Subgroup I and Subgroup II (Fig. 1a). Subgroup I included isolates from the USA and China, whereas Subgroup II included isolates from Australia, Brazil, India, Japan, and South Korea. The five Indian isolates grouped together with the K2 isolate from South Korea and the GCLV-BZL isolate from Brazil. On the basis of CP gene sequences, isolates of various virus species, such as Cucumber mosaic virus, Prunus necrotic ringspot virus, Turnip mosaic virus, Chrysanthemum virus B (CVB), and Peanut stunt virus, have been grouped into 2–4 subgroups [11–17]. In our study, phylogenetic analysis of the CP revealed a distinct clustering pattern based on the geographic origin of the GarCLV isolates. CP gene sequences also grouped the viruses of the genus Carlavirus into two distinct phylogenetic subgroups. A similar subgrouping of Carlavirus species has been shown when other genomic regions, i.e., the nucleic acid binding protein and replicase genes, were used [7, 18, 19].

Phylogenetic analysis of the Indian GarCLV isolates together with representative isolates of 37 other Carlavirus species showed the formation of two phylogenetic subgroups (Fig. 1b). All the Indian isolates formed a single cluster and grouped with Shallot latent virus (SLV) and 10 other Carlavirus species. The remaining 26 Carlavirus species clustered in a separate subgroup.

Potential recombination events were identified using RDP3 v.3.34 [20]. Intraspecies recombination analysis of the 28 GarCLV isolates detected a recombination event in the CP of the Anand isolate (breakpoint from nt position 167 to 5), with the JN and Kolar isolates as donors. The event was detected simultaneously by two algorithms of RDP3 (3Seq and SiScan) (Suppl. Fig. 2). Intraspecies recombination in the CP gene has already been reported for the carlaviruses CVB [13, 21] and Lily symptomless virus [22]. In our study, intraspecies recombination was detected in only one of the 28 isolates. The lack of strong recombination signals in the CP gene of the remaining GarCLV isolates suggests that the CP gene may be a recombination cold spot, as reported for other carlaviruses such as CVB [21], SLV, and Potato virus S [23]. The existence of a recombinant isolate in India could be attributed to the presence of multiple isolates in the same garlic accessions, which are brought together during the extensive multilocational trials and screening conducted before their release for commercial cultivation.

Interspecies recombination analysis of the 37 Carlavirus species together with GarCLV in RDP3 detected recombination signals in the CP of 22 Carlavirus species. Interspecies recombination has played a major role in the evolution of members of the ssDNA virus family Geminiviridae [24], whereas among positive-sense ssRNA viruses it has been reported in the Potyviridae [25] and Luteoviridae [26]. In the current study, the analysis revealed no extensive transfer of genetic material from other Carlavirus species to GarCLV, but it still provided evidence for interspecies recombination events in the CP gene of 22 Carlavirus species (Suppl. Table 3). Interestingly, SLV, another carlavirus infecting garlic, neither contributed CP fragments to GarCLV nor received them from it (Suppl. Table 3), although both viruses infect the same hosts and are transmitted by the same vector.

In conclusion, molecular analysis of CP of GarCLV has clearly indicated the existence of two phylogenetic subgroups and the occurrence of a recombinant isolate in India.