Introduction

Noroviruses (NoVs) are the primary cause of epidemic and sporadic infectious gastroenteritis cases in people of all ages, worldwide (Allen et al. 2008; Luo et al. 2015). Although the disease is usually mild and self-limiting, prolonged and severe symptoms are observed in children, the elderly and immunocompromised patients (Kroneman et al. 2008). Each year, NoV-associated gastroenteritis is estimated to cause 200,000 deaths of children <5 years old in developing countries and 64,000 cases of hospitalisations in industrialised countries (Patel et al. 2008). Extensive outbreaks are more prominent in confined areas where there are frequent human–human interactions and shared sources of food and water such as hospitals, nursing homes, sporting events, cruise ships and holiday resorts (Kroneman et al. 2008; Currier et al. 2015).

Noroviruses are classified in the Norovirus genus within the Caliciviridae family. Noroviruses are grouped into seven genogroups (GI–GVII) based on the complete capsid gene nucleotide sequence (De Graaf et al. 2015; Vinjé 2015). Even though there is a high diversity amongst NoVs, the majority of epidemic and sporadic gastroenteritis infections are caused by strains of a single genotype, GII.4 (Dai et al. 2015; Vinjé 2015). Every 2–3 years, genetically and antigenically novel GII.4 variants emerge and replace the previously predominant strains (Eden et al. 2014; Giammanco et al. 2014). Since 1995, at least nine distinct GII.4 variants have circulated globally and six of them were linked to NoV-associated gastroenteritis pandemics (Dai et al. 2015). Since the emergence of Sydney 2012 (Eden et al. 2014; Van Beek et al. 2013), no new pandemic strain has emerged. However, between 2012 and 2013, a novel recombinant of the two pandemic variants (GII.P4_New Orleans_2009 with GII.4_Sydney_2012) emerged in Asia, Canada, Europe, Denmark, United States of America (USA) and South Africa (SA) (Hasing et al. 2013; Mans et al. 2014; Wong et al. 2013; Martella et al. 2013; Fonager et al. 2013). In late 2015, this strain was detected at low frequencies in Australia and re-emerged with genetic changes in June 2016. This novel strain, named Melbourne 2016, is believed to be a possible emerging pandemic variant (Bruggink et al. 2016). During the 2014/15 gastroenteritis season in Asia, NoV GII.4 was replaced by the novel GII.17 Kawasaki 2014 strain as the predominant genotype causing gastroenteritis outbreaks (De Graaf et al. 2015). These strains were also identified in other parts of the world, but only in sporadic cases (Medici et al. 2015; Lu et al. 2015; Parra and Green 2015).
In 2008, NoV was detected in 14.3% of the stool samples collected from hospitalised paediatric patients with gastroenteritis in Pretoria, SA, with the Apeldoorn 2007 and New Orleans 2009 GII.4 variants being described in SA for the first time (Mans et al. 2010). Between 2008 and 2011, three sewage-polluted rivers from Gauteng were monitored for NoV GI and GII, and wastewater samples from different communities in five provinces of SA were screened for NoV GI, GII, GIV and sapovirus. Noroviruses were detected in 63% of the sewage-polluted river water samples and at least one calicivirus was detected in 82% of the wastewater samples (Mans et al. 2013; Murray et al. 2013). In these studies, 16 NoV genotypes were shown to be circulating in SA wastewater and sewage-polluted rivers. Of these genotypes, eight were previously found to be circulating in the hospitalised paediatric patients in SA (Mans et al. 2010). In addition, novel recombinants have also been reported in SA (Mans et al. 2014). These data clearly indicate that there is a need for real-time surveillance of NoVs circulating in SA. This study focused on the detection, quantification and characterisation of NoV GI and GII from wastewater samples collected from selected wastewater treatment plants (WWTPs) in the Free State and Gauteng provinces of SA. These data will contribute to the growing NoV surveillance database in SA.

Methods

Site Selection and Sample Collection

From April 2015 to March 2016, wastewater samples (raw sewage [R] and wastewater discharge [effluent, E]) were collected from five different WWTPs (designated WWTP1 [DV], WWTP2 [FF], WWTP3 [OV], WWTP4 [VM], and WWTP5 [VL]) within the catchment area of the Vaal Dam. These WWTPs service small communities situated on the banks of the Vaal River/Dam, into which effluent from these communities is discharged. Raw sewage (1 L) and effluent water (10 L) samples were collected monthly from each WWTP. The temperature and pH of each sample were recorded upon arrival at the laboratory.

Bacterial Indicators

Escherichia coli (E. coli), which indicates faecal contamination from warm-blooded animals, was quantified in raw and effluent wastewater to provide an estimate of the functionality of the different WWTPs. A total of 108 samples were received and screened for E. coli, using the membrane filtration technique and m-ColiBlue24® broth (Merck KGaA, Darmstadt, Germany). Ten millilitres of effluent, or ten-fold serial dilutions of raw sewage in maximum recovery diluent (MRD) (Oxoid Ltd., Hampshire, England), were filtered through cellulose nitrate filters (0.45 µm) (Sartorius AG, Goettingen, Germany) and incubated at 35 °C for 24 h. The blue colonies were counted and expressed as colony forming units per 100 mL (cfu/100 mL).
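The conversion from a plate count to cfu/100 mL can be sketched as follows. This is a minimal illustration, not the laboratory's actual worksheet: the function name, the colony counts and the 1 mL volume filtered for the diluted raw sewage are hypothetical, and only the 10 mL effluent volume comes from the text.

```python
def cfu_per_100ml(colonies: int, volume_filtered_ml: float,
                  dilution_factor: float = 1.0) -> float:
    """Scale a membrane-filtration colony count to cfu/100 mL.

    dilution_factor is the reciprocal of the dilution that was filtered,
    e.g. 1000 for a 10^-3 dilution, and 1 for an undiluted sample.
    """
    if volume_filtered_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return colonies * dilution_factor * 100.0 / volume_filtered_ml


# Effluent: 10 mL filtered undiluted (volume from the text, count hypothetical).
effluent = cfu_per_100ml(colonies=25, volume_filtered_ml=10.0)

# Raw sewage: hypothetical 1 mL of a 10^-3 dilution with 40 blue colonies.
raw = cfu_per_100ml(colonies=40, volume_filtered_ml=1.0, dilution_factor=1000.0)

print(f"effluent: {effluent:.2e} cfu/100 mL, raw sewage: {raw:.2e} cfu/100 mL")
```

In practice only plates within a countable colony range would be used, a rule that this sketch does not enforce.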

Virus Concentration and Nucleic Acid Extraction

The glass wool adsorption-elution technique was used for viral recovery from effluent water samples (Vilaginès et al. 1993; Mans et al. 2013). The recovered viruses were eluted from glass wool using 100 millilitres (mL) of glycine-beef extract buffer (pH 9.5) (Glycine; Merck KGaA) (BBL Beef Extract; Becton, Dickinson and Company, Sparks, MD) and the pH of the eluate was adjusted to pH 7 using 1 M HCl (Merck KGaA). The viruses in 100 mL eluate were further concentrated into a final volume of 10 mL in phosphate-buffered saline (PBS) (pH 7.4) (Sigma-Aldrich Co., St. Louis, MO) by polyethylene glycol 8000/sodium chloride (PEG8000/NaCl) precipitation (PEG8000; Amresco LLC, Solon, OH) (NaCl; Merck KGaA) (Minor 1985; Lowther et al. 2013). The raw sewage samples (1 L) were subjected to PEG8000/NaCl precipitation and viruses were concentrated to a final volume of 10 mL in PBS. Aliquots (1 mL) of viral concentrates were spiked with 10 µL of mengovirus (5 × 10⁴ genome copies) before total nucleic acid extraction using the automated NucliSENS® EasyMAG® Instrument (BioMérieux, Marcy-l’Étoile, France) to assess extraction efficiency. The extracted nucleic acids were eluted in a final volume of 100 µL and stored in 10 µL aliquots at −70 °C.

Detection and Quantification of NoVs

Commercial real-time reverse transcription-PCR (RT-PCR) kits were used for the detection and quantification of NoV GI (norovirusGI@ceeramTools™) (Ceeram S.A.S, La Chapelle sur Erdre Cedex, France), GII (norovirusGII@ceeramTools™) (Ceeram S.A.S) and quantification of mengovirus (Mengo Extraction Control) (Ceeram S.A.S). All kits used contained internal controls to monitor the efficiency of target amplification. The mengovirus concentrations were used to calculate the extraction efficiency for each sample. NoV GI and GII were detected and quantified using one-step real-time qRT-PCR according to manufacturer’s instructions. Norovirus and mengovirus standard curves were generated using plasmid DNA standards (Norovirus GI Q Standard) (Norovirus GII Q Standard) (Ceeram S.A.S) and RNA standard (Mengo Extraction Control) (Ceeram S.A.S), respectively. Norovirus concentrations were then adjusted to compensate for extraction efficiencies below 100% and expressed as genome copies/litre (gc/L). All NoV-negative samples were re-tested using a 1:10 dilution of RNA in nuclease-free water to exclude possible false-negative results due to inhibition.
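The efficiency adjustment described above can be illustrated with a short sketch. It assumes that efficiency is the ratio of recovered to spiked mengovirus copies and that concentrations are divided by that ratio only when it is below 1. The function names, the 5 µL template volume and the example copy numbers are hypothetical, whereas the 100 µL eluate, 1 mL extracted aliquot, 10 mL concentrate and 1 L or 10 L sample volumes come from the Methods.

```python
def extraction_efficiency(recovered_copies: float, spiked_copies: float) -> float:
    """Fraction of the spiked mengovirus control recovered after extraction."""
    if spiked_copies <= 0:
        raise ValueError("spiked_copies must be positive")
    return recovered_copies / spiked_copies


def adjusted_gc_per_litre(copies_per_reaction: float, template_ul: float,
                          eluate_ul: float, extracted_ml: float,
                          concentrate_ml: float, sample_l: float,
                          efficiency: float) -> float:
    """Back-calculate genome copies per litre of the original sample."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    copies_in_eluate = copies_per_reaction * eluate_ul / template_ul
    copies_in_concentrate = copies_in_eluate * concentrate_ml / extracted_ml
    gc_per_l = copies_in_concentrate / sample_l
    # Assumption: compensate only for efficiencies below 100%.
    return gc_per_l / min(efficiency, 1.0)


# Hypothetical effluent sample: half of the 5 x 10^4 spiked copies recovered.
eff = extraction_efficiency(recovered_copies=2.5e4, spiked_copies=5e4)
conc = adjusted_gc_per_litre(copies_per_reaction=100, template_ul=5,
                             eluate_ul=100, extracted_ml=1, concentrate_ml=10,
                             sample_l=10, efficiency=eff)
print(f"efficiency = {eff:.0%}, adjusted concentration = {conc:.2e} gc/L")
```

The sketch does not correct for losses during glass wool adsorption-elution or PEG8000/NaCl precipitation, so values computed this way would still tend to underestimate the true concentration.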

Genotyping of Noroviruses

Complementary DNA (20 µL) was prepared using 10 µL of the extracted nucleic acids, 30 µM random hexamers, 10 mM dNTPs and 50 U RevertAid Premium reverse transcriptase (Thermo Scientific, Waltham, MA). The partial capsid (region C) (GI; 326 bp, GII; 343 bp) was amplified using a two-step semi-nested RT-PCR. The first 50 µL PCR reaction comprised 5 µL cDNA, 1.25 U AmpliTaq Gold® DNA polymerase (Applied Biosystems, Foster City, CA), 200 µM dNTPs and primers QNIF4 (0.2 µM) and G1SKR (0.4 µM) for GI (Kojima et al. 2002; da Silva et al. 2007) or primers QNIF2 (0.2 µM) and G2SKR (1 µM) for GII (Kojima et al. 2002; Loisy et al. 2005) (Table 1). One microlitre of the first round PCR was used as a template for the second round PCR. The second round reaction mixture comprised the same reagents except for the forward primers, which were G1SKF (0.2 µM) for GI and G2SKF (0.2 µM) for GII (Kojima et al. 2002). The cycling parameters for GI were as follows: 95 °C for 10 min, 45 cycles of 94 °C for 30 s, 50 °C for 30 s and 72 °C for 45 s, followed by 72 °C for 5 min. For NoV GII, the parameters were 95 °C for 10 min, 45 cycles of 94 °C for 30 s, 55 °C for 30 s and 72 °C for 45 s, followed by 72 °C for 5 min.

Table 1 Primer sequences used for genotyping of norovirus GI and GII

Norovirus RNA-dependent RNA polymerase (RdRp) (region A, 326 bp) was amplified from randomly selected NoV GII positive samples. The conventional PCR was performed with published primers JV12Y and JV13I (Table 1) (Vennema et al. 2002) and AmpliTaq Gold® (Applied Biosystems) according to the manufacturer’s recommendations. If no PCR product was obtained after one PCR reaction, a second round PCR was performed using the same primers with 1 µL of the first PCR reaction as template. The following cycling parameters were used for the first round: 95 °C for 10 min, 45 cycles of 94 °C for 30 s, 37 °C for 1 min 30 s and 72 °C for 1 min, followed by 72 °C for 5 min. Second round PCR cycling parameters were as follows: 95 °C for 10 min, 45 cycles of 94 °C for 30 s, 45 °C for 1 min and 72 °C for 1 min, followed by 72 °C for 5 min. In-house designed primers (Table 1) were used to amplify the GII.17 P2 domain (726 bp). A two-step RT-PCR was performed using EmeraldAmp® Max HS PCR Master Mix (Takara Bio Inc., Kusatsu, Japan) and primers 17PF532 and 17PR1346 (Inqaba Biotech, Pretoria, South Africa). A second round PCR was performed to increase the yield of the amplified product using the following cycling parameters: 95 °C for 10 min, 30 cycles of 94 °C for 30 s, 48 °C for 45 s and 72 °C for 90 s, followed by 72 °C for 5 min. All amplicons were analysed on 1.5% agarose gels and purified with the DNA Clean and Concentrator™-25 Kit (Zymo Research, Irvine, CA) or the Gel DNA Recovery Kit (Zymo Research). The PCR products were cloned using the CloneJET PCR cloning kit (Thermo Scientific) and transformed into Lucigen E. cloni 10G competent cells (Lucigen Corp., Middleton, WI). To confirm presence of the insert fragment, ten clones were randomly selected and subjected to colony PCR. The PCR reaction (20 µL) comprised OneTaq® Quick-load® Master Mix (New England Biolabs Inc., Ipswich, MA) and 0.2 µM each of the forward and reverse pJET1.2 primers.
PCR products were analysed on 1.5% agarose gels and clones with the correct insert size were purified with the DNA Clean and Concentrator™-25 Kit (Zymo Research). Subsequently, clones were sequenced using pJET1.2 primers (Thermo Scientific) and the ABI PRISM BigDye® Terminator v. 3.1 Cycle Sequencing kit on an ABI 3130 automated analyser (Applied Biosystems).

Phylogenetic Analysis

Nucleotide sequences of the selected clones were assembled using Sequencher 4.9 (Gene Codes Corporation, Pittsfield, MI) and BioEdit Sequence Alignment Editor (V.7.0.9.0). The online NoV Genotyping Tool version 1.0 (http://www.rivm.nl/mpf/norovirus/typingtool/) was used to assign NoV genotypes (Kroneman et al. 2011). BLAST-N (http://blast.ncbi.nlm.nih.gov/Blast.cgi) was used to compare the detected sequences to NoV sequences in GenBank. To elucidate the relationship between the strains, these sequences were aligned with the reference strains from genogroup I and II using MAFFT version 7 (http://mafft.cbrc.jp/alignment/software/). Phylogenetic analysis was performed in MEGA6 using the Neighbour-Joining method supported by 1000 bootstrap replicates and the evolutionary distances were determined using the Kimura 2-parameter model (Tamura et al. 2013).
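For readers unfamiliar with the Kimura 2-parameter model, the pairwise distance it assigns to two aligned sequences is d = −½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The sketch below is an illustrative stand-alone implementation rather than the MEGA6 code. The function name and toy sequences are hypothetical, and skipping gapped or ambiguous sites per pair is an assumption, since the gap treatment used in the study is not stated.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes (assumed handling)
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    inner = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(inner)


# Toy example: one transition and one transversion over ten aligned sites.
print(round(k2p_distance("AACCGGTTAA", "GACCGGTTAC"), 4))
```

A matrix of such pairwise distances is the input to Neighbour-Joining tree construction.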

Nucleotide Sequence Accession Numbers

The nucleotide sequences determined in this investigation were submitted to GenBank under the following accession numbers: partial capsid sequences (MF182117-MF182315, MF182325-MF182350) and partial RdRp sequences (MF182316-MF182324).

Results

Quantification of Bacterial Indicators

Escherichia coli concentrations ranged from 1.30 × 10³ to 2.00 × 10⁷ cfu/100 mL in raw sewage and from 1.00 × 10¹ to 1.00 × 10⁷ cfu/100 mL in effluent samples. The E. coli concentrations were significantly reduced during wastewater treatment at WWTP1, WWTP2 and WWTP4. At WWTP3 and WWTP5, E. coli concentrations in discharged effluent decreased only slightly and remained above the allowed target water quality range (ATWQR) (1000 cfu/100 mL) recommended by the Department of Water Affairs and Forestry (DWAF), SA for discharged effluents (National Water Act No. 36 of 1998) (DWAF 2013). There was no correlation observed between the E. coli concentrations or presence and NoV concentrations or presence. There was also no clear pattern of seasonality in the E. coli distribution throughout the study period (data not presented).

Detection and Quantification of NoV GI and GII

Noroviruses were detected in 72.2% (78/108) [(61.1%; 33/54) raw sewage and (83.3%; 45/54) effluent] of the samples tested. Norovirus GI alone was detected in 4.6% (5/108) of the wastewater samples tested, whereas 31.5% (34/108) of the samples were positive for GII alone. Both GI and GII were detected in 36.1% (39/108) of the wastewater samples. Norovirus concentrations ranged from 1.02 × 10² to 3.41 × 10⁶ gc/L for GI and 5.00 × 10³ to 1.31 × 10⁶ gc/L for GII. Similar NoV concentrations were observed in the raw sewage and effluent water samples in all five WWTPs, throughout the study (Table 2). This is suggestive of inefficient removal of NoVs during wastewater treatment. The highest concentrations of NoV GI (6.80 × 10⁶ gc/L) and GII (2.60 × 10⁶ gc/L) were recorded in November 2015 from an effluent sample collected at WWTP1. In all WWTPs, NoV GII concentrations were generally higher than the GI concentrations throughout the study. Even though no clear seasonality was documented, NoV GI concentrations in raw sewage peaked between November 2015 and February 2016 (late spring and summer), whereas NoV GII concentrations in raw sewage showed no distinct pattern. Norovirus GI and GII concentrations in effluent wastewater were overall higher during the warmer months (Fig. 1). In addition, NoVs with concentrations ranging between 1.06 × 10³ gc/L and 8.04 × 10⁵ gc/L were detected in 61.1% (33/54) of effluent samples with E. coli levels below the ATWQR.
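The detection figures above can be cross-checked with a few lines of arithmetic. The sketch assumes that the 5/108 and 34/108 counts refer to samples positive for a single genogroup only. That reading is an interpretation, but it reproduces the overall 78/108 positives and the per-genogroup totals quoted in the Discussion (44/108 for GI and 73/108 for GII). Variable names are illustrative.

```python
def pct(n: int, total: int) -> float:
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * n / total, 1)


TOTAL = 108                             # wastewater samples tested
gi_only, gii_only, both = 5, 34, 39     # assumed mutually exclusive groups

positive = gi_only + gii_only + both    # any norovirus detected
gi_total = gi_only + both               # all GI-positive samples
gii_total = gii_only + both             # all GII-positive samples

print(f"NoV positive: {positive}/{TOTAL} ({pct(positive, TOTAL)}%)")
print(f"GI positive:  {gi_total}/{TOTAL} ({pct(gi_total, TOTAL)}%)")
print(f"GII positive: {gii_total}/{TOTAL} ({pct(gii_total, TOTAL)}%)")
```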

Table 2 Summary of NoV quantification and genotyping results
Fig. 1
Norovirus GI and GII concentrations in raw sewage and effluent samples from five wastewater treatment plants tested over a 12-month period. The concentrations are expressed in log10 genome copies/litre. a NoV GI concentrations in raw sewage. b NoV GI concentrations in effluent. c NoV GII concentrations in raw sewage. d NoV GII concentrations in effluent

Norovirus Genotyping and Phylogenetic Analysis

Norovirus partial capsid sequence information was successfully obtained from 80.8% (63/78) of the NoV-positive wastewater samples. The partial RdRp and P2 domain sequences were also obtained with success from the selected samples. In total, 16 NoV capsid genotypes were characterised (Fig. 2). Thirty-five NoV GI positive samples with a Ct value < 35 were selected for capsid genotyping. Five GI genotypes (GI.2, GI.3, GI.4, GI.5, and GI.6) were characterised from 51.4% (18/35) of the NoV GI positive samples. The dominant strain, GI.4, was detected in 66.7% (12/18) of these samples and circulated in four of the five WWTPs (WWTP1, WWTP2, WWTP3, WWTP4) between May 2015 and January 2016. All GI strains detected, except GI.6 strains, were closely related (96–100% similarity over 280 nucleotides [nt]) to strains that were previously detected in SA sewage-polluted surface waters (Fig. 3) (Mans et al. 2013). Eleven GII genotypes (GII.2, GII.3, GII.4, GII.7, GII.9, GII.10, GII.14, GII.16, GII.17, GII.20, and GII.21) were successfully genotyped from 82.2% (60/73) of the NoV GII positive samples. The untypeable strains could be due to low template concentration and primer-template mismatches. Norovirus GII.2 and GII.17 co-dominated as they were detected in 40% (24/60) and 35% (21/60) of the successfully genotyped samples, respectively. The majority of GII.17 strains detected clustered with the novel Kawasaki 2014 variant (Fig. 4). Because individual samples could contain mixed genotypes, multiple clones were sequenced per sample and representative strains for each genotype are shown in Figs. 3 and 4. On average, five different NoV genotypes were detected in wastewater every month. Five NoV RdRps (GII.Pe, GII.Pg, GII.P13, GII.P15 and GII.P17) (Fig. 5) were detected in four selected samples in which GII.2, GII.3, GII.4, GII.7 and GII.17 capsids were successfully amplified and sequenced.
Attempts to amplify the NoV GII.17 ORF1/2 overlapping region failed but the P2 domain was successfully amplified and sequenced and phylogenetic analysis indicated that both Kawasaki 2014 sub-variants 308 and 323 were present (Fig. 6).

Fig. 2
Distribution of NoV genotypes characterised from 63 wastewater samples from April 2015 to March 2016. The numbers in parentheses indicate the number of samples successfully genotyped/number of samples positive for NoVs on real-time qRT-PCR/number of samples tested

Fig. 3
Neighbour-joining phylogenetic tree of partial capsid sequences (285 bp, region C) that represent 19 norovirus GI strains detected in this study. Reference sequences and closely related strains from GenBank are indicated by accession numbers, strains from this study (closed circle). The percentage of replicate trees in which the associated strains clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The sample site, type of samples, month of collection and clone number are included in each sample name; e.g. OV E Feb 24 = Clone 24 from effluent sample collected in OV in February. The scale bar indicates 0.05 nucleotide differences per site, over the indicated region

Fig. 4
Neighbour-joining phylogenetic tree of partial capsid sequences (303 bp, region C) representing 62 norovirus GII strains detected in this study. Reference sequences and closely related strains from GenBank are indicated by accession numbers, strains from this study (closed circle). The percentage of replicate trees in which the associated strains clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The sample site, type of samples, month of collection and clone number are included in each sample name. The scale bar indicates 0.05 nucleotide differences per site, over the indicated region

Fig. 5
Neighbour-joining phylogenetic tree of partial RdRp sequences (251 bp, region A) that represent six norovirus GII strains from this study (closed circle). Reference sequences and closely related strains from GenBank are indicated by accession numbers, RdRp previously associated with GII.17 (closed triangle). The percentage of replicate trees in which the associated strains clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The scale bar indicates 0.05 nucleotide differences per site, over the indicated region

Fig. 6
Neighbour-joining phylogenetic tree of complete P2 domain sequences (726 bp) that represent two GII.17 strains (Kawasaki 308 and 323) detected. Reference sequences and closely related strains from GenBank are indicated by accession numbers, strains from this study (closed circle). The percentage of replicate trees in which the associated strains clustered together in the bootstrap test (1000 replicates) is shown next to the branches. The scale bar indicates 0.005 nucleotide differences per site, over the indicated region

Discussion

There is a lack of data on the prevalence and diversity of NoV GI and GII circulating in the population from countries in the African continent. To date, the available data is from wastewater, sewage-polluted river water (Sdiri-Loulizi et al. 2010; Mans et al. 2013; Kiulia et al. 2014; Murray et al. 2013) and clinical specimens in 17 of the 54 African countries, with large-scale studies in Burkina Faso (Nordgren et al. 2013), Ghana (Krumkamp et al. 2015), Malawi (Tran et al. 2013), SA (Mans et al. 2016), Tanzania (Moyo et al. 2014) and Tunisia (Hassine-Zaafrane et al. 2013). The available SA NoV molecular epidemiology data is mostly based on the use of the 320 bp region at the 5′-end of the NoV capsid gene. However, the first NoV GII.4 full genome analysis from clinical specimens in Africa was completed in SA in 2015 (Botha et al. 2017). In spite of this progress, characterisation of NoVs from clinical specimens does not reflect the true epidemiology and distribution of the enteric viruses, as it excludes specimens from individuals with mild disease and from asymptomatic individuals who shed the virus but do not seek medical advice. Therefore, screening and characterising NoVs from environmental samples, which include treated and raw wastewater, might be a feasible method to determine the prevalence and diversity of NoVs (Murray et al. 2013).

Noroviruses were successfully detected, quantified and genotyped from wastewater samples received from all five WWTPs. Nucleic acid extraction efficiency from each sample was calculated using pre- and post-extraction mengovirus concentration. Even though the NoV concentrations were adjusted accordingly, they are likely to be an underestimation of the true abundance of NoVs in wastewater given that the efficiency of viral recovery of glass wool adsorption-elution and PEG8000/NaCl precipitation were not taken into consideration. In addition, the use of a DNA standard for quantification is not ideal and leads to an underestimation of virus concentration in the water samples. Noroviruses were detected in 72.2% (78/108) of samples and were successfully genotyped with a success rate of 80.8% (63/78). The inability to genotype NoVs from the remaining 19.2% (15/78) environmental samples could be influenced by the turbidity of the water samples, low viral recovery efficiency and presence of reverse transcriptase inhibitors. Different protocols (sample volume and virus recovery techniques) were used for processing raw sewage and effluent samples. This is likely to account for the discrepancy seen in the detection results (more NoV-positive effluent than raw sewage samples). The results revealed that NoV GII (67.6%, 73/108) is more prevalent than NoV GI (40.7%, 44/108) in the environment, which is similar to what has been observed in previous environmental surveillance studies in SA (Murray et al. 2013), Kenya (Kiulia et al. 2014), the Netherlands (Lodder and de Roda Husman 2005) and Nicaragua (Bucardo et al. 2011). According to data from this study, the largest proportion of NoV-positive samples was collected during the winter/early-spring period (42.3%; 33/78), whereas 32.1% (25/78) were detected during the late-spring/summer months (Fig. 2). This data correlates with environmental NoV surveys in Japan (Kitajima et al. 2010, 2012; Iwai et al. 2009; Katayama et al. 2008; Haramoto et al. 2006).
However, despite the “lower than expected” prevalence in warm months, a great diversity of NoVs circulated in wastewater; six to eight genotypes were characterised each month between November 2015 and January 2016 (Fig. 2). This diversity correlates with NoV concentrations which also peaked between November 2015 and February 2016 (Fig. 1).

The majority of the GI.4 strains detected share high identity (99–100% similarity over 285 nt) with strains that previously circulated in SA (JN191356) and Italy (JX142184) (Fig. 3). Despite their predominance in the environment, GI.4 strains have never been detected in South African clinical specimens. Like the GI.4 strains, the GI.3 strain detected is closely related to a strain that was previously detected in SA wastewater (JN191361). There is a vast diversity of GI.2 strains; phylogenetic analysis shows three different sub-clusters within this genotype. All GI.2 strains characterised in this study are closely related and form their own cluster, which is distinct from a strain previously detected in SA wastewater (HQ201648). Two GI.5 strains detected in the current study are closely related to strains previously identified in a South African environmental sample (HQ201654) and a clinical specimen (KR904276). For the first time in SA, two GI.6 strains were identified. They share between 96% and 99% nt similarity with a clinically significant strain detected in Moscow (FJ383817) and a strain identified from wastewater in China (KX245202).

Norovirus GII.4 was previously shown to be the most prevalent genotype in SA polluted surface water (Mans et al. 2013). The same trend was therefore expected in the wastewater samples screened in this study. However, NoV GII.4 was one of the least detected (1.9%; 2/108) genotypes. These GII.4 strains were detected in two different sites, WWTP2 and WWTP3 in August and September 2015, respectively. They were identified as the Sydney 2012 and Osaka 2016 variants and form two sub-clusters with GII.4 strains previously detected in clinical samples from KwaZulu-Natal province, SA in 2012 and 2013 (Fig. 4). The decline in GII.4 in environmental samples has been documented previously on the African continent where GII.17 was the most prevalent genotype detected in polluted surface waters (Kiulia et al. 2014). Currently, there is no published clinical surveillance data to determine GII.17 prevalence and impact on the NoV burden in SA but on-going surveillance has not detected the GII.17 Kawasaki 2014 variant in children hospitalised with gastroenteritis. However, given that GII.4 (median age = 1 year) and GII.17 (median age = 49 years) have different age distributions (Chan et al. 2015) and the surveillance is mainly focused on children <5 years, GII.17 infections are likely missed and therefore underreported. In SA, GII.17 was first detected in sewage-polluted river water in 2009 and in 2011, a novel recombinant GII.P16/GII.17 was characterised from a stool sample (Mans et al. 2013, 2014). Since then, the parental GII.17 has been detected as a co-dominating genotype in sewage wastewater from different provinces around the country (Murray et al. 2013). Similar GII.17 strains have been detected before from stool samples of healthy children in the Netherlands and they have also played a role in nosocomial infections around the world (Sukhrie et al. 2012; Ayukekbong et al. 2011). 
In this study, GII.17 (35%; 21/60) was the second most abundant genotype detected after GII.2 (40%; 24/60). The majority of the GII.17 strains characterised clustered together with strains from Japan (AB983218, LC037415), Kenya (KF916585) and South Korea (KT438790, KT384068).

The original GII.17 variant is still circulating in SA. Two closely related strains were detected in two different wastewater samples from WWTP1 in October 2015 and January 2016. They form a cluster away from a strain that was previously detected in wastewater in SA (KC495674). Norovirus Kawasaki 2014 strains have a novel RdRp, GII.P17, which has never been seen before amongst GII.17 variants. Therefore, more gastroenteritis outbreaks associated with this variant could be expected as it has escaped herd immunity (Matsushima et al. 2015). The novel GII.P17 detected in this study shares 98% similarity over 251 nt with a strain detected in a clinical specimen in China (KT633396) (Fig. 5). Phylogenetic analysis of the GII.17 P2 domain confirmed the presence of two strains closely related to Kawasaki 2014 variants (LC037415 and AB983218) (Fig. 6). Although these novel SA GII.17 strains cluster with GII.17 Kawasaki 2014 variants, it cannot be presumed that they are equally clinically significant without supporting data from clinical surveillance. It is still not clear if Kawasaki 2014 variants will replace the previously dominant GII.4 Sydney 2012 strains globally. Clinically important strains that have been previously detected in hospitalised children in SA (GII.2, GII.3, GII.4, GII.7, GII.10, GII.14, GII.16, GII.17, and GII.21) (Mans et al. 2014, 2016) make up over 80% (9/11) of GII genotypes detected in this study. Two sub-clusters of GII.2 strains circulated in SA wastewater between June 2015 and March 2016. The majority of the strains are closely related to a strain previously detected in a SA clinical sample (KR904490). One strain clusters away from other strains and it shares 99% similarity over 302 nt with an emerging recombinant strain (GII.P16/GII.2) detected in USA (KY865307). Interestingly, all GII.16 strains were characterised from one site, WWTP3.
These strains group into two sub-clusters and are not closely related to a strain previously detected during clinical surveillance in SA (KR904317).

This study has contributed important information to the growing data of NoVs circulating in SA. It has shown that there are at least 16 NoV genotypes circulating in Free State and Gauteng, SA. These results are comparable to the previous data from environmental samples in SA (Mans et al. 2013). Although the data collected in this study has revealed a vast diversity of NoVs, the real-time qRT-PCR and genotyping results are likely an underestimation of the true NoV prevalence and diversity in SA. The presence of NoVs in effluent water samples is suggestive of inefficient removal of potentially harmful pathogens. In addition, data from this study showed that the use of E. coli alone as an indicator for acceptable water quality is not sufficient. Significant NoV concentrations were detected in more than 50% of effluent samples within the ATWQR for E. coli. This correlates with previously described observations where enteric viruses were detected in wastewater samples with E. coli levels within the ATWQR (Pusch et al. 2005; Adefisoye et al. 2016). Even though virus infectivity and risk of human infection were not investigated in this study, other potentially infectious cytopathogenic viruses were detected in the NoV-positive samples (data not presented). Effluent water is discharged into surface waters which in most instances, are used for agricultural and recreational purposes. Given the high prevalence of HIV/AIDS and significant number of people who live in informal settlements with no proper sanitation and clean running water in SA, contaminated surface waters are a public health risk. Therefore, continuous surveillance of pathogenic viruses in wastewater, specifically effluent water is important in ensuring the safety of downstream waters and public health (Adefisoye et al. 2016; Mans et al. 2013). This study has shown that sewage surveillance is useful to identify NoV strains circulating in the broader community. 
Hellmér and co-workers reported that the presence of NoV and hepatitis A in sewage provided early warning of virus outbreaks (Hellmér et al. 2014). Such information is vital to alert public health officials to the introduction of new strains. Since SA lacks NoV outbreak reporting systems, there is currently no data to determine whether the presence of a new strain detected during sewage surveillance successfully predicts NoV outbreaks. However, combined clinical and sewage NoV surveillance is likely the best approach to obtain a comprehensive understanding of NoV epidemiology.