A total of 555 fecal or anal samples from fruit bats were collected at four locations in Yunnan province, China, from 2009 to 2016 (Fig. 1). RT-PCR targeting a partial RdRp fragment detected HKU9 in 46 samples (8.29%) and GCCDC1 or closely related viruses in 13 samples (2.34%) (Table 1). Detection rates for HKU9 varied with sampling time and site. No positive samples were detected in Mengla in 2011 or in Mojiang in 2013 (Table 1). HKU9 detection rates in Chuxiong, Mengla, and Jinghong were 18.59% (29/156), 5.32% (10/188), and 6.14% (7/114), respectively. GCCDC1 was not detected until 2015, with a positive rate of 5.26% in 2015 and a markedly higher positive rate of 18.86% in 2016 in Mengla.
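The detection rates quoted above are positive counts divided by the number of samples tested. As a minimal sketch for checking that arithmetic, the snippet below recomputes them from the counts given in the text; the dictionary labels and the function name `detection_rate` are illustrative and not part of the study.

```python
# Counts taken from the paragraph above: (positive samples, samples tested).
# The labels are illustrative names chosen for this sketch.
counts = {
    "HKU9, all sites": (46, 555),
    "GCCDC1, all sites": (13, 555),
    "HKU9, Chuxiong": (29, 156),
    "HKU9, Mengla": (10, 188),
    "HKU9, Jinghong": (7, 114),
}


def detection_rate(positive, tested):
    """Return the detection rate as a percentage of samples tested."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of samples")
    return 100.0 * positive / tested


rates = {label: detection_rate(p, n) for label, (p, n) in counts.items()}

for label, rate in rates.items():
    print(f"{label}: {rate:.2f}%")
```

Rounded to two decimal places, the computed values agree with the percentages reported in the text.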
Figure 1. Map of sampling sites in Yunnan province of China. Red regions indicate the four districts where bat samples were collected.
Table 1. Detection of BatCoV HKU9 and BatCoV GCCDC1 by RT-PCR in bat fecal or anal samples collected from four districts in Yunnan province, China, during 2009-2016.
The partial RdRp sequences amplified in this study shared 74.4%-100% identity at the nucleotide (nt) level. A phylogenetic tree was constructed from an alignment of these partial RdRp sequences with previously reported HKU9, GCCDC1, and related strains, as well as representative strains of other betacoronaviruses. The 59 sequences were classified into two coronavirus species, HKU9 and GCCDC1 (Fig. 2A). All sequences from Rousettus bats were HKU9-related viruses, and those from E. spelaea were GCCDC1-related viruses. In contrast to the GCCDC1 strains, which were highly similar to one another, the HKU9-related strains were highly diverse. Within the HKU9 species, the sequences from this study and previously reported sequences fell into 5 lineages: Lineage 1 comprised 28 sequences and the previously reported HKU9-10-2, HKU9-5-2, and HKU9-2, exclusively from R. leschenaulti; Lineage 2 comprised 5 sequences and the previously reported HKU9-1 from R. leschenaulti; Lineage 3 comprised 10 sequences and the previously reported HKU9-4 from an unidentified Rousettus species (R. sp.); Lineage 4 comprised the previously detected HKU9-3, 9-5, and 9-10 from R. leschenaulti; and Lineage 5 comprised 3 sequences from Rousettus species. The other 13 sequences were exclusively from E. spelaea and grouped with the previously reported BatCoV GCCDC1 (Huang et al. 2016).
Figure 2. Phylogenetic analysis of the coronaviruses detected in this study. Partial RdRp sequences (A), complete nucleoprotein gene sequences (B), and the full-length genomic sequence of BatCoV HKU9-2202 (C) were aligned with corresponding sequences of representative viral species in the genus Betacoronavirus. Phylogenetic trees were constructed using the neighbor-joining method implemented in MEGA7, with bootstrap values calculated from 1000 replicates. Sequences obtained in this study are labeled in color and named by the sample isolate identifier followed by bat species, location, and collection year.
To further characterize the relationships between the newly detected coronaviruses, we attempted to amplify the full-length S, N, and p10 genes from selected positive samples. N was amplified from 9 HKU9-related viruses and 5 GCCDC1-related viruses, and p10 from 13 GCCDC1-related viruses. Amplification of S failed for all positive samples. The p10 sequences amplified in this study shared 99%-100% similarity with previously reported sequences (Huang et al. 2016). The amplified N sequences of the HKU9-related and GCCDC1-related viruses showed 74.5%-100% and 95.2%-97.4% nt identity with each other, respectively. The phylogenetic tree based on N showed a topology similar to that of the RdRp tree (Fig. 2B).
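The identity ranges reported above are pairwise comparisons between aligned sequences. The sketch below shows one common way such a percentage can be computed, assuming the two sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped. The study does not state which convention was used, so that choice, the function name `pairwise_identity`, and the toy sequences are all assumptions made for illustration.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Assumption for this sketch: columns where either sequence has a gap
    are excluded from both the numerator and the denominator.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not sequences from this study).
print(pairwise_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 9 of 10 columns match
print(pairwise_identity("ACG-ACGT", "ACGTACGA"))      # 6 of 7 ungapped columns match
```

Other conventions, such as counting gaps as mismatches, give slightly different percentages, so values from different tools are not always directly comparable.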
The full-length genome sequence was obtained from one sample in lineage 5 (BatCoV HKU9-2202) by high-throughput sequencing and RACE. The genome of HKU9-2202 is 29,118 nt in length, excluding the poly(A) tail, with a G/C content of 42%. The main ORFs of HKU9-2202 were predicted in the order 5'-ORF1ab-Spike (S)-NS3-Envelope (E)-Membrane (M)-Nucleocapsid (N)-NS7a-NS7b-3' (Table 2). The putative transcription regulatory sequences (TRSs) and their genomic locations were predicted based on the conserved TRS core sequence of betacoronaviruses (5'-ACGAAC-3'). Notably, the putative TRS of E differed from the consensus core sequence by one nucleotide (Table 2).
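Two of the quantities in this paragraph, the G/C content and the positions of candidate TRS core sequences, are simple to compute from a genome string. The sketch below illustrates both on a short toy sequence, and allows an optional number of mismatches so that a core differing from 5'-ACGAAC-3' by one nucleotide (as reported for E) can still be found. It is not the prediction procedure used in the study; the function names, the mismatch parameter, and the toy sequence are hypothetical.

```python
TRS_CORE = "ACGAAC"  # conserved betacoronavirus TRS core named in the text


def gc_content(seq):
    """Return the G/C content of a nucleotide sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def find_motif(seq, motif=TRS_CORE, max_mismatches=0):
    """List (1-based start, matched window, mismatches) for each motif hit."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - len(motif) + 1):
        window = seq[start:start + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start + 1, window, mismatches))
    return hits


# Toy sequence (hypothetical, not from the HKU9-2202 genome).
toy = "TTACGAACTTGGACGAATCC"
print(gc_content(toy))                    # percentage of G and C bases
print(find_motif(toy))                    # exact matches to the core only
print(find_motif(toy, max_mismatches=1))  # also reports one-mismatch variants
```

On a real genome, a scan like this would return many hits, so candidate sites would still need to be filtered by their position relative to the start of each predicted ORF.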
Table 2. Amino acid identity, TRS and sequence comparisons of BatCoV HKU9-2202 with BatCoV HKU9 and BatCoV GCCDC1.
Comparative genomic sequence analysis indicated that HKU9-2202 shared 83% nt identity with previously reported BatCoV HKU9 strains. The most divergent region was the S protein, which shared only 68% amino acid (aa) identity with those of other BatCoV HKU9 strains. The seven concatenated replicase domains used by the International Committee on Taxonomy of Viruses to define coronavirus species shared 93% aa identity with those of other BatCoV HKU9 strains, above the 90% threshold for species demarcation. Thus, the newly identified HKU9-2202 likely belongs to the BatCoV HKU9 species. To determine the evolutionary position of HKU9-2202, the full genome was subjected to phylogenetic analysis. HKU9-2202 formed a separate branch within the clade of the BatCoV HKU9 species (Fig. 2C).
Viral RNA in tissues (heart, liver, spleen, lung, kidney, brain, intestine) from five bats positive for coronavirus was quantified by qPCR (Fig. 3). Viral genome copy numbers were higher in the intestines than in other tissues and ranged from 4.89 × 10^2 to 5.67 × 10^6 copies/g across tissues. Three HKU9-positive bats (Bt9431, Bt9446 and Bt9466) showed wider tissue tropism, as demonstrated by the presence of viral RNA in the kidney, heart, and lung tissues (Fig. 3A). Three GCCDC1-positive bats (Bt9444, Bt9463, and Bt967) showed exclusive intestinal tropism (Fig. 3B). Viral RNA was not detected in brain, spleen, or liver tissues.
Prevalence of Betacoronavirus HKU9 and GCCDC1 and Related Viruses in Fruit Bats
Genomic Characterization of the Novel Strain BatCoV HKU9-2202
Tissue Tropism of BatCoV HKU9 and GCCDC1-Related Virus
Table S1. Primer sequences used for quantitative PCR.