MOLECULAR CHARACTERIZATION OF THE MOVEMENT AND COAT PROTEINS OF A NEW ELM MOTTLE VIRUS ISOLATE INFECTING EUROPEAN WHITE ELM (ULMUS

European white elms (Ulmus laevis Pall.) growing in a park in Caputh near Berlin (Germany) were monitored regularly over a period of 18 years and showed virus infection-like symptoms such as chloroses, chlorotic ringspots, mottling and dieback. To obtain evidence of viral infection, RNA-seq was conducted on an Illumina HiSeq 2500, and three contigs were obtained. They match the three Elm mottle virus (EMoV) genomic RNAs and cover the open reading frames for the viral replicase, the polymerase and the movement and coat proteins (MP, CP). The contigs show identities of 95.3–96.4%, 91.9–93.3% and 89.0–92.5% at the nucleotide level with RNA 1, RNA 2 and RNA 3 of reference sequences, respectively. Analysis of the MP and CP showed significant differences in amino acid sequence composition compared with reference EMoV sequences. These results demonstrate the presence of a previously unknown isolate of EMoV. This is the first report of sequence data of EMoV infecting U. laevis.


INTRODUCTION
A population of European white elms (Ulmus laevis Pall.) in the park in Caputh (federal state of Brandenburg, Germany) has suffered from chlorotic ringspots, mottling and necrosis for decades (Figure 1). On the assumption that a virus is the causal agent, long-term investigations were conducted and revealed an increase in the extent of symptoms on the affected branches. At the same time, distinct dieback and a decline in annual growth were observed. Efforts to identify and characterize the causal pathogen have been under way for several years. In previous investigations using biological and serological methods, viruses known to infect elm trees, such as Elm mottle virus (EMoV), were excluded as the causal agent (Bandte et al., 2004). Nonetheless, recent findings reveal the presence of EMoV in German elms with the described symptoms (Büttner et al., 2015). Although EMoV, which belongs to subgroup 2 of the genus Ilarvirus (Bromoviridae), was recently described as common in elm species (EPPO, 2017), recorded detections are few (Table 1). EMoV has a tripartite positive-sense single-stranded RNA ((+) ssRNA) genome with cap structures at the 5'-termini (Figure 2). The three RNA molecules lack polyadenylation at the highly conserved 3'-termini but form strong, non-aminoacylated secondary structures. The EMoV genomic RNAs carry four open reading frames (ORFs) and are encapsidated separately in quasi-isometric and bacilliform virions (King et al., 2012). RNA 1 (3,431 bp) codes for the replicase (ORF 1), which comprises the conserved domains (CDs) of the viral methyltransferase (Vmethyltransf, pfam01660) and a helicase motif (Viral_helicase1, pfam01443). This protein and the RNA-dependent RNA polymerase (RdRP_2, pfam00978) encoded by ORF 2 of RNA 2 (2,874 bp) are regarded as subunits of the viral replicase complex, with strictly linked and coordinated RNA 1 and RNA 2 replication strategies (Bol, 2005).
RNA 2 of EMoV includes a second ORF (ORF 2b) that codes for a 2b protein, which has been shown to be involved in gene silencing (Shimura et al., 2013). RNA 3 (2,325 bp) encodes the MP (proximal ORF 3a), which belongs to the "30K" Bromo_MP superfamily (Melcher, 2000; pfam01573), and the CP (distal ORF 3b), which comprises the Ilar_coat superfamily CD (pfam01787).
This work presents a sequence analysis of the EMoV genomic RNAs. It focuses on EMoV RNA 3, which codes for the viral MP and CP, and provides new insights into the variability of RNA 3 at the nucleotide and amino acid levels. The data show a previously unreported variability in sequence composition and point to the presence of a new EMoV strain infecting U. laevis in Germany.

MATERIALS AND METHODS
In 2014, leaves of a diseased tree were sampled.
Complementary to the EMoV-specific RT-PCR described by Büttner et al. (2015), a next-generation sequencing method was employed. Areas with ringspots were cut from symptomatic leaves, and 70 mg of fresh material was used for RNA extraction with the Invitrap Spin Plant RNA Mini Kit (STRATEC Molecular). The NucleoSpin® RNA Kit (Macherey-Nagel) containing rDNase was used to remove residual DNA, and the sample was then cleaned with the NucleoSpin® RNA clean-up Kit (Macherey-Nagel). Plant large ribosomal RNA was depleted from 10 µg of high-integrity total RNA using the RiboMinus Plant Kit for RNA-Seq (Invitrogen). Double-stranded full-length cDNA was synthesized from 1-2 µg of RiboMinus RNA using the Maxima H Minus Double-Stranded cDNA Synthesis Kit (Thermo Scientific) primed with random hexamers. All kits were used according to the manufacturers' instructions. Approximately 1-2 µg of double-stranded cDNA was sent to BaseClear (Netherlands) for RNA sequence analysis. Paired-end 100 bp sequence reads (≈ 50 Mb) were generated on an Illumina HiSeq 2500 system. The reads were mapped on Biolinux, and virus sequences were assembled de novo with CLC Genomics Workbench. Out of a dataset of 1,011,396 paired-end reads, 908 contigs were constructed and used to identify viral sequences, which were analyzed with Clustal W (Larkin et al., 2007) and Geneious version 9.1.3 (Kearse et al., 2012). Genomic EMoV RNA was identified with BLASTX 2.2.25 (Altschul et al., 1990). The resulting EMoV RNA sequences were aligned with reference EMoV isolates to determine variability in sequence composition. ORF 3a and ORF 3b were verified by PCR followed by Sanger sequencing (data not shown).
[...] (Thomas et al., 1983; Ge et al., 1997; Scott et al., 2003), show 89.0–92.5% identity with the German EMoV isolate from U. laevis (Table 2).
This value is markedly lower than the 95.0–100% identity found among the British isolates. ORF 3a is 852 nt long and encodes a protein with a computed molecular weight of 31.3 kDa. The sequence identity among the British isolates ranges between 95% and 99.7%, whereas their identity to the German isolate is 93.7% and 94.1%, respectively. Within ORF 3a, 49 base substitutions were found. At the amino acid level, six substitutions occur: F67 is replaced by L67, I75 by L75, R157 by K157, P258 by S258, R260 by G260 and H280 by R280. The protein sequence of the German isolate therefore shares 97.5% identity with the reference sequences, while the reference sequences show identities between 99.3% and 100% to each other (Table 2). Of the six substituted residues in the MP of the German isolate, residues 67 and 75 are located within one of the two conserved RNA-binding domains (RBD). Hybridization studies revealed that some basic amino acids in the RBD are essential for viral cell-to-cell movement as well as for RNA binding (Herranz et al., 2005). In the EMoV isolate 'Berlin', the residues R62, R73 and K74 are basic, and residues K71 and H79 are highly conserved (Pallás et al., 2012). Because F67 and I75 are replaced by leucine and all three residues are non-polar, the polarity of the RBD motif does not change. Accordingly, the substitutions in the isolate 'Berlin' are assumed to hinder neither RNA affinity nor cell-to-cell movement.
The nucleotide sequence of ORF 3b shows 57 base substitutions. In addition, the sequence obtained from U. laevis possesses three prominent nucleotide insertions in ORF 3b. Two nucleotide triplets, a GGG and a GCA motif, were found at positions 1,469 and 1,499. A further insertion, a CACAAA motif, extends the open reading frame at position 1,526 (Figure 3A). These insertions do not shift the open reading frame, but they are translated as additional amino acids (Figure 3B).
Consequently, the computed 24.0 kDa CP of the EMoV isolate 'Berlin' contains 217 instead of 213 amino acids. The additional amino acids glycine, glutamine, asparagine and proline characterize the CP structure of this EMoV isolate and distinguish it significantly from those of isolates obtained from U. glabra and H. macrophylla. Within the first 25 amino acids of the N-terminus of the CP, which are proven to be mandatory for binding to viral RNA (Bol, 2005), the German isolate differs from the reference isolates at three positions: A16 is replaced by T16, G23 by R23 and S24 by G24.
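The length bookkeeping behind these figures can be checked with a minimal sketch. It assumes only the three insertion motifs (GGG, GCA, CACAAA) and the reference CP length of 213 amino acids stated in the text; the function name added_codons is hypothetical, and the actual isolate sequences are not used.

```python
# Insertion motifs reported for ORF 3b of the isolate 'Berlin' (taken from the text).
INSERTIONS = ["GGG", "GCA", "CACAAA"]
# Length of the reference CP in amino acids (taken from the text).
REFERENCE_CP_LENGTH = 213


def added_codons(insertions):
    """Return the number of codons added by a list of insertions.

    Raises ValueError if any single insertion is not a multiple of three
    nucleotides, because such an insertion would shift the reading frame.
    """
    for motif in insertions:
        if len(motif) % 3 != 0:
            raise ValueError(f"insertion {motif!r} would shift the reading frame")
    return sum(len(motif) for motif in insertions) // 3


extra = added_codons(INSERTIONS)
print(f"{extra} additional codons, CP length {REFERENCE_CP_LENGTH + extra} aa")
```

Under these assumptions, the 12 inserted nucleotides correspond to four additional codons, which is consistent with a CP of 217 rather than 213 amino acids.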
Overall, 17 out of 217 amino acids are uniquely substituted within the CP of the EMoV isolate 'Berlin'. To initiate infection, ilarviruses require binding of the CP to the 3'-NTR of the genomic RNAs (the phenomenon of genome activation). Thus, the CP acts as a key structural component for the initiation as well as for the propagation of the viral infection (Sánchez-Navarro et al., 1997; Pallás et al., 1999; Bol, 2005; MacFarlane and McGavin, 2009). Within the N-terminus of the EMoV CP, an arginine-rich basic motif that is considered to bind to the 3'-NTR of its RNA 3 is located between amino acid residues 22 and 44 (Pallás et al., 1999; Aparicio et al., 2003; Pallás et al., 2013). R34 is proposed to be the central essential residue for specific binding of the CP to the 3'-NTR (Ansel-McKinney et al., 1996). Within this motif, the EMoV isolate 'Berlin' has three substitutions and one insertion (Figure 3A), which increase the number of arginine residues. This leads to the assumption that R25, R28, R29, R34 and R36 are conserved within the EMoV CP, whereas R23 appears only in the German isolate. With regard to the functional role of the basic motif in the infection process, the presence of six instead of five arginine residues is supposed to support the genome activation process. Whether the higher arginine content in this region enhances the infectivity of this isolate remains to be investigated, as does a putative host specificity.
The EMoV isolate investigated here features distinct differences at the nucleotide as well as the amino acid sequence level that distinguish it markedly from other EMoV isolates. This study contributes to data collection and genome analysis of EMoV and adds to the knowledge about viruses infecting elms. It provides, for the first time, sequence data of a previously unknown EMoV isolate affecting U. laevis.
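The substitution notation (e.g. F67L) and the percent identities used in this comparison can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on Clustal W and Geneious. The function names are hypothetical, the two short sequences are invented toy data rather than EMoV sequences, and gap columns are counted as mismatches, which is only one of several conventions and may differ from the one underlying Table 2.

```python
def substitutions(reference, isolate):
    """List substitutions such as 'F4L' between two aligned, equal-length proteins.

    Positions are 1-based alignment columns; columns containing a gap ('-')
    are skipped because they are insertions or deletions, not substitutions.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to equal length")
    return [
        f"{ref}{pos}{iso}"
        for pos, (ref, iso) in enumerate(zip(reference, isolate), start=1)
        if ref != iso and "-" not in (ref, iso)
    ]


def percent_identity(reference, isolate):
    """Percentage of alignment columns with identical residues.

    Assumption: gap columns count as mismatches.
    """
    if len(reference) != len(isolate) or not reference:
        raise ValueError("sequences must be non-empty and aligned to equal length")
    matches = sum(
        1 for ref, iso in zip(reference, isolate) if ref == iso and ref != "-"
    )
    return 100.0 * matches / len(reference)


# Invented toy sequences for illustration only (not EMoV data).
toy_reference = "MKRFTI"
toy_isolate = "MKRLTL"
print(substitutions(toy_reference, toy_isolate))
print(round(percent_identity(toy_reference, toy_isolate), 1))
```

With real data, the inputs would be two rows of a multiple sequence alignment, so that column positions correspond between the reference and the isolate.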

ACKNOWLEDGEMENT
The author gratefully thanks the Division Phytomedicine at the Humboldt-Universität zu Berlin, in particular Martina Bandte for contributing to the sample collection, Markus Rott for supporting the preparation steps, Artemis Rumbou for establishing the NGS project and Carmen Büttner for facilitating the experiments by providing the laboratory equipment and funding of the Illumina high throughput sequencing technique.

COMPLIANCE WITH ETHICAL STANDARDS
The author declares that ethical standards have been followed and that no human participants or animals were involved in this research.

COMPETING INTERESTS
The author declares that there are no competing interests.