Tracking MERS-CoV evolution

About three years ago, a novel respiratory disease was diagnosed in Saudi Arabia and named Middle East Respiratory Syndrome. The responsible agent was identified as a betacoronavirus, a single-stranded RNA virus, and is called MERS-CoV. Until a few weeks ago, the disease was restricted to the Middle East, with most of the 1200 reported cases in Saudi Arabia. In early May, a business traveller brought the virus to South Korea, where it spread rapidly within a hospital, with 120+ confirmed cases so far. From there, one infected person travelled to China. The chart below shows case numbers over time.

[Figure: MERS-CoV case numbers over time]

Sequences of recently isolated viruses from China and Korea were deposited in GenBank this week. To track the spread and evolution of the virus, we adapted nextflu to MERS-CoV. The result is now live at mers.nextflu.org.

mers.nextflu.org lets you color the tree by country, host (human/camel), and sampling date.

The tree is characterized by multiple clusters of virus sequences that are very similar and were isolated at almost the same time (see the tree colored by sampling date below). The clusters contain almost exclusively viruses isolated from humans, whereas the viruses isolated from camels are more isolated and scattered around the tree, pointing towards multiple camel-to-human transmissions followed by localized transmission among humans.

Among RNA viruses, MERS-CoV has an exceptionally large genome of more than 30 kb. Together with the high mutation rate of RNA viruses, this would in principle result in well-resolved trees, with a new mutation every couple of days. However, MERS-CoV also seems to recombine, which makes sequence differences between viruses difficult to interpret. Viruses can be similar in one region of the genome and quite different in another. This is illustrated in the matrices of pairwise distances below. Trees constructed from sequences that have undergone recombination are still useful as a summary of the average differences between sequences, but deep branches in particular don’t necessarily represent the true history of the virus. The tight clusters associated with recent outbreaks don’t suffer from this problem as much. For the virus, recombination is probably a necessity, since maintaining a long genome at a high mutation rate is next to impossible without it.

Color encodes distance between viral sequences (blue = similar, red = distant). The part of the genome used to compare sequences is indicated above the panels. The order of sequences is the same as in the tree with the oldest sequences at the bottom left.
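
To make the idea of region-wise distance matrices concrete, here is a minimal sketch in Python. It is not the code behind mers.nextflu.org; the function names (region_distance, distance_matrix) and the three toy sequences are made up for illustration, and it assumes the sequences are already aligned. The toy sequence "R" is built to match "A" in the first half and "B" in the second half, which is the kind of pattern recombination would produce.

```python
def region_distance(a, b, start, end):
    """Fraction of differing sites between aligned sequences a and b in [start, end).

    Sites where either sequence has a gap ('-') or an ambiguous base ('N') are skipped.
    """
    diffs = 0
    compared = 0
    for x, y in zip(a[start:end], b[start:end]):
        if x in "-N" or y in "-N":
            continue
        compared += 1
        if x != y:
            diffs += 1
    return diffs / compared if compared else float("nan")


def distance_matrix(seqs, start, end):
    """Pairwise distance matrix over one genome region, rows/columns in dict order."""
    names = list(seqs)
    return [[region_distance(seqs[p], seqs[q], start, end) for q in names]
            for p in names]


# Hypothetical toy alignment (20 sites), invented for this example.
toy = {
    "A": "AAAAAAAAAA" + "CCCCCCCCCC",
    "B": "AAGAAGAAGA" + "CCTCCTCCTC",
    "R": "AAAAAAAAAA" + "CCTCCTCCTC",  # first half like A, second half like B
}

left = distance_matrix(toy, 0, 10)    # first half of the toy genome
right = distance_matrix(toy, 10, 20)  # second half of the toy genome

for name, row_l, row_r in zip(toy, left, right):
    print(name, row_l, row_r)
```

In the first region "R" is identical to "A" and differs from "B"; in the second region the relationship flips. A single tree built from the full sequences would average over these two conflicting signals.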

Look here for a more detailed analysis of recombination in MERS-CoV by Gytis Dudas and Andrew Rambaut.
