Wednesday, 5 February 2020

What could I find out with a Wuhan COVID-19 coronavirus sequence?

When I heard on the news that the sequence of the new Wuhan COVID-19 coronavirus had been made public, the nerd in me was awoken. It triggered memories of when I used to work with plant viruses two decades ago. Would I still be able to find out more about the virus using some of the methods I would have used back then? Would I be able to understand how the up-to-date and vastly more experienced teams in animal and human virology might approach the problem? Could I explain it to others who don’t work with viruses?

If you are simply looking for general information on the current COVID-19 epidemic, have a look at my earlier post here: http://www.miltoncontact-blog.com/2020/02/should-i-worry-about-wuhan-2019-ncov.html. This includes up-to-date charts and tables on the progress of the epidemic using data from the WHO situation reports.

For my adventure with the Covid-19 sequence, read on.

Finding the Wuhan COVID-19 sequence


One of the great early benefits of the internet was the setting up of DNA databases accessible to all scientists, via The European Molecular Biology Laboratory (EMBL), a molecular biology research institution supported by 27 member states - https://www.embl.org/. This database is currently held by the EBI, the European Bioinformatics Institute, whose centre is based in Hinxton, just outside of Cambridge.

My first step was to see if I could find and download the 2019 nCoV (COVID-19) sequence, using the European Nucleotide Sequence Browser that I found at https://www.ebi.ac.uk/ena/browser/home. In the last week of January, I found a sequence of "A novel coronavirus associated with a respiratory disease in Wuhan of Hubei province, China" provided by F. Wu and a further 18 co-workers. It had been submitted on 05-JAN-2020 by the Shanghai Public Health Clinical Center & School of Public Health, Fudan University, Shanghai, China and entered into the database on 13-JAN-2020. Its entry number on the EBI database is MN908947.

Note that if you search now, you will pick up a different set of 7 later sequences under the accession number MN988668. The first two are from "RNA based mNGS approach identifies a novel human coronavirus from two individual pneumonia cases in 2019 Wuhan outbreak"; Emerg Microbes Infect :0(2019) by Chen L and others, submitted 23-JAN-2020, State Key Laboratory of Virology, Wuhan University, Bayi Road, Wuchang District, Wuhan, Hubei 430072, P.R. China.

There are 5 further sequences from US laboratories from isolates of the virus from US patients in Arizona, California and Illinois, also from around the end of January.

The virus sequence is given as 29,881 nucleotides.

Are there differences between the eight COVID-19 sequences?


I downloaded all the sequences and combined them into a merged file using the software SeqVerter, part of a downloaded freeware called GenStudio Pro. I then uploaded the merged file to Clustal Omega, an EBI program that compares multiple sequences online. After a cup of tea and a scone (with jam), the result appeared on the screen.
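For readers who would rather script the merging step, here is a minimal Python sketch. It is not what SeqVerter does internally, only an assumption about the simplest equivalent: the downloads are taken to be plain FASTA text files, and the function name, file names and the two short records are made up for illustration.

```python
from pathlib import Path
import tempfile

def merge_fasta(paths, out_path):
    """Concatenate FASTA files into one multi-FASTA file; return the record count."""
    count = 0
    with open(out_path, "w") as out:
        for path in paths:
            text = Path(path).read_text().strip()
            if not text:
                continue  # skip empty files
            # Every FASTA record starts with a ">" header line.
            count += sum(1 for line in text.splitlines() if line.startswith(">"))
            out.write(text + "\n")
    return count

# Toy demonstration with two made-up records (not real virus data).
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "isolate_a.fasta"
    b = Path(tmp) / "isolate_b.fasta"
    a.write_text(">isolate_a\nACGTACGT\n")
    b.write_text(">isolate_b\nACGTACGA")
    merged = Path(tmp) / "merged.fasta"
    n_records = merge_fasta([a, b], merged)
    merged_text = merged.read_text()

print(n_records)
```

The merged file is what would then be uploaded to an alignment tool such as Clustal Omega.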

They were all 100% identical: the three Chinese and the five US isolates.

Short section showing sequence identity over the eight available 2019 nCoV (COVID-19) sequences

From a disease point of view, this was ‘good’ news. It showed that from the beginning of the outbreak in China to the first cases in the US, the virus had not mutated into a different strain.

From an old virologist’s point of view, this was quite an unusual result. Why? Well, virus RNA is replicated with a far higher error rate than DNA. The rate is 1 in 10,000 nucleotides. The virus RNA is almost 30,000 nucleotides long. So every time the virus reproduces, I would expect two to three differences to be introduced. When you get millions of virus particles made during an infection, the sequences are actually a spread of mutations that average out around a consensus sequence. An RNA virus is thus not a species as such, but technically a quasispecies. This holds true for another coronavirus disease, MERS (Middle East Respiratory Syndrome), as explained in Mandary et al. (2019) “Impact of RNA Virus Evolution on Quasispecies Formation and Virulence” https://www.researchgate.net/publication/335938867_Impact_of_RNA_Virus_Evolution_on_Quasispecies_Formation_and_Virulence.
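The back-of-envelope arithmetic behind that expectation can be written out as a short Python sketch. It uses only the two figures quoted in this post (a 29,881 nucleotide genome and an error rate of 1 in 10,000). It also assumes, for simplicity, that errors at different positions are independent, which is an assumption of the sketch and not a measured property of this virus.

```python
genome_length = 29_881      # nucleotides, the sequence length quoted in this post
error_rate = 1 / 10_000     # errors per nucleotide copied, the figure quoted in this post

# Average number of changes introduced each time the genome is copied.
expected_changes = genome_length * error_rate

# Chance that a single copy carries no change at all (independence assumed).
p_error_free = (1 - error_rate) ** genome_length

print(f"expected changes per copy: {expected_changes:.2f}")
print(f"chance of an error-free copy: {p_error_free:.1%}")
```

On these assumptions only a small minority of copies would be exact, which is why a run of completely identical sequences looks surprising at first sight.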

I would expect it to be true for the COVID-19.

Diseases like polio, for example, took advantage of this spread of mutations. After infecting a body and causing mild symptoms, a few viruses were able to break through the blood brain barrier and cause the more severe paralysis. This spread of mutations is also what might make it possible for a virus to jump species and infect a host it does not normally reproduce in.

When I did sequencing more than 20 years ago, we had to clone virus fragments and then sequence each clone. We would have seen the different sequence mutations and would have had to sequence a number of clones to get the average quasispecies sequence.

Modern virus sequencing gets around the individual cloning by using NGS, next generation sequencing (https://bitesizebio.com/21193/a-beginners-guide-to-next-generation-sequencing-ngs-technology/). The viral RNA, with its whole population of different sequences, is extracted, amplified and sequenced. The sequence obtained is the consensus, the average quasispecies sequence (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3708773/pdf/1471-2164-14-444.pdf).
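To make the idea of a consensus sequence concrete, here is a minimal Python sketch that takes the most common base at each position across a set of aligned reads. It is a toy illustration and not how a real NGS pipeline works. The reads are made up, already aligned and all the same length, and the function name is hypothetical.

```python
from collections import Counter

def consensus(reads):
    """Return the most common base at each position of equal-length aligned reads."""
    if len({len(r) for r in reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    # zip(*reads) walks through the alignment one column at a time.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))

# Made-up aligned reads (not real data); three of them carry a single change.
reads = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC"]
consensus_seq = consensus(reads)
print(consensus_seq)
```

Individual mutations that appear in only a few reads are outvoted at their position, so they do not show up in the consensus.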


What is the closest relative to the COVID-19?


The beauty of having a public sequence database is that you can take a new sequence, like COVID-19, and use it to see if you find similar sequences amongst the millions already there. I did this online at EBI, using the nucleotide similarity search, Fasta. It was limited to finding 50 related sequences.

I had the results displayed as a “Phylogenetic tree cladogram”, a branching pattern showing the degree of similarity between the different sequences.

My first phylogenetic tree using 2019 nCoV (Wuhan Seafood Market, COVID-19) against the EBI nucleotide database
The tree showed that the closest similarity of the COVID-19 (Wuhan Seafood Market) coronavirus was to Bat SARS-like viruses, and more distantly to other SARS viruses. Those simply called SARS are human isolates. There was a SARS epidemic, which also originated in China, back in 2002, which was finally brought under control in 2004. SARS stands for Severe Acute Respiratory Syndrome. It killed almost one in ten of the people infected.

It is interesting how many patent sequences were also picked up. Presumably from companies and organisations that wanted to provide detection and possibly treatment products against SARS.

I wanted a different display to put the COVID-19 in a wider context. I therefore downloaded a number of species specific Coronavirus sequences that I found by searching the European Nucleotide Sequence Browser for coronavirus. I also removed all the patent sequences from the original set found. The new set of data was uploaded for analysis.

The new phylogenetic tree is shown below. I’ve left the accession numbers in to make it easier for future work. I also stretched the tree horizontally from the original, to make the branching clearer, and coloured different groups for interpretation.

My second phylogenetic tree from using the results from the first search, minus the patent sequences, plus 15 other coronaviruses from different animals


The different human SARS (simply called SARS) sequences in dark red divide into two groups. One has similarities to Civet SARS (in orange), the other has links that reach to Bat SARS strains (orange). The Wuhan COVID-19, marked in bold red, is more closely related to the Bat SARS. MERS, Middle East Respiratory Syndrome (dark red), was first identified in Saudi Arabia in 2012. It seems to be more lethal than SARS, with about 36% of individuals diagnosed with the disease dying from it. However, it does not seem to spread easily and there were around 2000 cases recorded in the period 2012 to 2017. My advice, and that of the professionals, is: keep away from sick camels.

The remaining animal coronaviruses marked in black cover a range from pig to human to rat. The human coronavirus OC43 is one of a number of viruses that cause the common cold.

Using the COVID-19 sequence to find a vaccine


Companies and organisations around the world, including the US and Porton Down in the UK, are now racing to develop a vaccine. One company in the news recently is hoping to get to human trials in the summer.

Where would I begin?

A Google search for antigenic regions in SARS will find a number of publications from after the last SARS outbreak. Researchers use blood serum from people who are ill with, or have just recovered from, SARS and see if their sera react with any of the virus’s proteins (in other words, whether those proteins are antigenic). If they do, the sera probably contain useful antibodies.

A paper I particularly liked studied the SARS spike protein. The spike protein sits on the outside of the virus capsule. It interacts with the human cells during infection and plays a part in the absorption of the virus into the cell. The spike protein has two domains (stretches), S1 and S2. S1 is not very antigenic, but S2 is. Hong Zhang et al (2004) used a strain of the human SARS called BJ01 (accession no. AY278488). They created 12 different overlapping fragments of the S2 domain of the spike protein. They labelled them F1 to F13. They looked at which of these fragments were bound by antibodies in sera from 15 different SARS patients. (published as “Identification of an Antigenic Determinant on the S2 Domain of the Severe Acute Respiratory Syndrome Coronavirus Spike Glycoprotein Capable of Inducing Neutralizing Antibodies” https://www.ncbi.nlm.nih.gov/pmc/articles/PMC421668/). 

The fragments F3 and F9 in SARS turned out to be antigenic, i.e., the patients’ sera reacted with them. Proteins are made up of chains of amino-acids. F3 stretched from the amino acid Arginine at position 797 in the spike protein to amino acid Proline at position 844. F9 stretched from amino-acid Leucine at position 1045 to Aspartic acid at 1109.

Using a single letter code for each amino acid, the sequences of F3 and F9 are:

F3=RSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLP

F9=LHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGNCD
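As a quick consistency check, this short Python sketch confirms that the two sequences above have the lengths and end residues implied by the positions quoted earlier (797 to 844 for F3, 1045 to 1109 for F9). The helper name `check` is made up for this example. It only tests internal consistency and does not verify the fragments against the database entry.

```python
F3 = "RSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLP"
F9 = "LHVTYVPSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTDNTFVSGNCD"

# (sequence, first position, last position, first residue, last residue)
fragments = {
    "F3": (F3, 797, 844, "R", "P"),    # Arginine ... Proline
    "F9": (F9, 1045, 1109, "L", "D"),  # Leucine ... Aspartic acid
}

def check(seq, start, end, first, last):
    """True if the length matches the position range and both end residues match."""
    return len(seq) == end - start + 1 and seq[0] == first and seq[-1] == last

results = {name: check(*spec) for name, spec in fragments.items()}
print(results)
```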

First I compared how similar the human SARS BJ01, a Bat SARS, a MERS and a human coronavirus spike proteins were to that of COVID-19 (labelled 2019 nCoV). I did this in pairs using the alignment program in GenStudioPro. The results are shown in the figure below. You can get a good first impression how well sequences match by looking for how much of the aligned protein sequences is coloured darkly, indicating a 100% match.

Pairs of alignments between COVID-19 (2019 nCoV) and human SARS BJ01, a Bat SARS, a MERS and human coronavirus OC43. The locations of the antigenic fragments F3 and F9 from human SARS BJ01 are marked in red bars. The dark colours in the aligned sequences show a 100% match.
Just from the colour patterns alone, you can see that the spike protein of COVID-19 is very similar to both the human and the bat SARS sequences. There is much less matching between 2019 nCoV and the MERS or human coronavirus OC43.

I then looked more closely at the similarities between the F3 and F9 antigenic fragments from human SARS BJ01 and a variety of sequences. I always included the F3 or F9 fragment, the human SARS BJ01 sequence from which it was derived, and the COVID-19 sequence.

Looking at similarities to the F3 antigen from human SARS BJ01 with COVID-19  (2019 nCoV) and Bat SARS, a MERS and human coronavirus OC43. Differences just in the COVID-19 (2019 nCoV) are highlighted in red. Differences in the other sequences are highlighted in orange.
Looking at similarities to the F9 antigen from human SARS BJ01 with COVID-19 (2019 nCoV) and Bat SARS, a MERS and human coronavirus OC43. Differences just in the COVID-19 (2019 nCoV) are highlighted in red. Differences in the other sequences are highlighted in orange.
COVID-19 shows three differences in amino acids from F3, marked in red and a further three differences from Bat SARS marked in orange. Compared to F9, COVID-19 shows 9 differences marked in red, as well as a further three differences from Bat SARS.
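Counting differences like these by eye is error-prone, so here is a minimal Python sketch of how it could be scripted once two sequences are aligned to the same length. The `variant` below is not the COVID-19 sequence. It is a made-up string with two residues changed by hand, purely to show the mechanics, and the function name is hypothetical.

```python
def differences(reference, other):
    """List (position, reference residue, other residue) where two aligned sequences differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    return [(i + 1, a, b) for i, (a, b) in enumerate(zip(reference, other)) if a != b]

F3 = "RSFIEDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLP"
# Made-up variant for illustration only: residues 1 and 11 changed by hand.
variant = "K" + F3[1:10] + "A" + F3[11:]

diffs = differences(F3, variant)
print(diffs)
```

Positions are reported counting from 1 within the fragment. Adding 796 would convert an F3 position into a spike protein position, given that F3 starts at 797.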

The question I might try to answer in experimental trials would be, if I changed the F3 and the F9 sequences to match those of COVID-19, would these changes give me antigens that could generate antibodies, and therefore help create potential vaccines against the current COVID-19 outbreak?

Other approaches


The answer to my question might actually be no. Remember earlier on we learnt that an RNA virus like COVID-19 is likely to be a quasispecies. It is not one definite sequence but a population of viruses with an average around the published sequence. The altered F3 and F9 based vaccines would only affect those viruses in the population with exactly these changes. Those with a different mutation might slip through.

Vaccines made with live attenuated viruses are often the most effective and can be made by using such a mixed population found in a quasispecies virus. The other strategy could well be to create an attenuated version of the COVID-19. Weakened in some functions so that it does not cause disease, it might still be able to induce the same antibodies against the native and more dangerous virus and so create an effective vaccine. The availability of the consensus viral sequence and existing information on related SARS viruses might make this work easier.

Conclusion


I hope this gives an insight on what can be done if you have available sequences for a new disease like COVID-19. But do remember, I am just an outsider, 20 years behind the times, who used to work with plant viruses, which are very different to animal and human viruses.
It is reassuring to know, as the disease continues its growth and spread, that vastly more experienced research teams in public labs and private companies are racing to generate an effective vaccine.
They are likely to do so at a pace that I would have found unbelievable when I was working in my field.

Monday, 3 February 2020

Should I Worry About Covid-19, the Wuhan 2019 nCoV Coronavirus?

Photo: CDC/C.S. Goldsmith - https://www.cdc.gov/sars/lab/images.html

Updated 16 February 2020

What is the COVID-19 coronavirus?


Includes charts of epidemic cases and deaths

COVID-19, formerly the Wuhan 2019 nCoV, is the WHO (World Health Organisation) name for the 2019 novel Coronavirus. This is a new coronavirus that began in the Chinese city of Wuhan. It causes a viral pneumonia. Symptoms can include fever, sore throat, dry cough, fatigue and breathing difficulties. There are lots of different coronaviruses infecting humans and animals. Some of the colds you had in the past were probably caused by a human coronavirus. This Covid-19 appears to be mild in most cases. However, those with pre-existing health problems, the very young and the elderly are more likely to be severely affected.  See below for more information. 

COVID-19 seems to be at least as infectious as flu, with one person infecting on average between 1.4 and 4 people, with suggestions tending towards the higher figure. One person with Rubella infects 5 to 7 people and with measles, 12 to 18 people.

As of 16th February 2020, more than 69,267 people have been infected, the vast majority of them in China. As many people only have very mild symptoms and might slip through unnoticed, there is speculation that the figure could be considerably higher. Only 9 cases have been detected in the UK. There have been more than 1,666 deaths in China and 4 deaths outside of China.

The charts below appear to indicate that the number of new cases in China is tailing off and the number of deaths is also slowing down, in that they have stopped increasing exponentially (ever faster) and are almost linear over the past week. In the rest of the world, cases are still increasing exponentially but the numbers, spread over so many countries, are low and hopefully manageable.

Current estimates are that the death rate is likely to be 2%, or 1 in 50. For comparison, measles kills 1 in 500; a previous coronavirus, SARS, killed almost 1 in 10. There were 80,000 seasonal influenza and pneumonia deaths during the particularly bad 2018 season in the US. On average there are about 8,000 flu deaths per year in England.
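For comparison with that estimate, the crude ratio of reported deaths to reported cases can be worked out from the figures quoted above for 16th February. This Python sketch is only that simple division. It is not how the death rate estimate is derived, and it ignores both unnoticed mild cases and patients who are still ill, so treat the result as a rough indication.

```python
cases = 69_267        # reported cases on 16 February 2020, as quoted in this post
deaths = 1_666 + 4    # deaths in China plus deaths elsewhere, as quoted in this post

crude_ratio = deaths / cases   # fraction of reported cases that have died so far
one_in = cases / deaths        # the same figure expressed as "1 in N"

print(f"crude deaths/cases: {crude_ratio:.1%}, roughly 1 in {one_in:.0f}")
```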

If you are fit and healthy, the COVID-19 disease is much less likely to be severe. You are at much greater risk if you have a compromised immune system or are affected by other illnesses. As with many diseases, the very young (under fives) or over 65s are vulnerable.

Recent WHO figures from their situation reports are:

Figure 2. Chart of progress of COVID-19 epidemic, using data from WHO Situation Reports. Note that China changed its assessment method from laboratory tests to clinical cases (green triangle). For 5 days in January, WHO also gave values for predicted cases (red squares), which might seem to lead up naturally to the clinical case number in China today. WHO is still including laboratory confirmed cases (blue squares).

Figure 3. COVID-19 cases in rest of world

Figure 4. COVID-19 deaths in China

Table of progress of COVID-19 epidemic. Using data from WHO Situation Reports and also numbers from new Chinese clinical assessment method of counting cases.

Will I be infected?

Initial information was that you have to be within 2 meters of someone having the illness for at least 15 minutes to breathe in enough virus to cause an infection. More recent cases suggest it may infect more easily. Touching surfaces contaminated by the virus and then touching your mouth or eyes can also transfer the virus. This does not mean that you will always become ill (see How can I protect myself below).

The virus is shed into the air by an infected person. The idea that the virus is spread even when people are not displaying obvious symptoms like fever or a runny nose seems to be less likely.

The virus is unstable outside of the body and is thought to be inactive after 24 hours.
Your risk of infection is dramatically reduced the further away you are from ill people.

What does the virus COVID-19 do?

The COVID-19 coronavirus infects the cells lining the airways of the body, the epithelial cells. In severe cases it seems to progress to the lungs, causing pneumonia.

The virus consists of a complex protein capsule that contains the virus genes. The virus genes are on a single strand of RNA, not DNA. This strand is 29903 bases (units) long. At least seven isolates of the virus have been sequenced and their sequences made publicly available for all scientists - see https://www.ebi.ac.uk/ena/browser/text-search?query=Wuhan%202019%20nCoV.

On contact with one of your epithelial cells, some of the proteins, probably the spike protein, on the outside of the capsule interact with proteins on the surface of your cell. The cell is triggered to take up the virus. Inside the cell, the virus hijacks your cell’s own functions to make copies of its single RNA strand. The genes encoded on the virus RNA are also translated into a range of virus proteins. New virus particles are then assembled and either exported by the cell or released when the cell dies. Neighbouring cells are then infected and, if the conditions are right, the virus begins to spread along your airways.

The symptoms you may get include fever, sore throat, dry cough, fatigue and breathing difficulties. They are in part due to the virus affecting or killing cells, but also due to your body going into overdrive to try to fight the virus infection. How ill you are is a balance between how fast the virus multiplies and how quickly and effectively your body’s defences respond. There is more information in the next section on how you can protect yourself.

How can I protect myself?

There is no vaccine for COVID-19 yet, but with the full sequence of the virus available, work is in progress to provide a vaccine in the next months. Until then, isolation and quarantine remain the most effective means to prevent the spread of the disease. Things you can do are:

  • Keeping healthy by eating and sleeping well, exercising
  • Avoiding locations and people with the illness
  • Hand-washing
  • Use of hand sanitisers
  • Good personal hygiene generally

Keeping healthy is a great prophylactic as it means that your immune system is in best condition. Our bodies are actually geared to be alert to any foreign invaders and illnesses and the incoming virus does not have it all its own way.

In their paper “Mechanisms of Severe Acute Respiratory Syndrome Pathogenesis and Innate Immunomodulation”, (2008 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2593566/) Matthew Frieman and Ralph Baric describe how incoming coronaviruses trigger the production of interferons within the cell and initiate other yet unknown responses. These seem to slow down virus action. In turn, viruses continually evolve to overcome the cell’s defences.

If a cell is overcome and dies, this triggers other chemical signals which alert a variety of white blood cells. Some, called macrophages, come to absorb the invading foreign viruses and take the information back to T-cells. The T-cells in turn use this information to help create killer T-cells and antibodies. There are also memory cells that will remember the antibodies required to fight any future infections by the same strain of virus.

Some of my neutrophils from a cold,
photographed at over 1000x magnification
using anoptral contrast

In the meantime, a whole army of another sort of white cell, neutrophils, invade the infected area and gobble up all the debris of damaged cells and the viruses they come across. When you have pus from a spot or your runny nose produces the thick white stuff, that is mostly made up of these short lived neutrophils that have gorged themselves on what is infecting you.

In chemotherapy and radiotherapy, these immune systems are weakened, hence you become more susceptible to infections taking over. So take extra care.

What nations and the international community can do

Whilst the media is enjoying a feeding frenzy in response to the current epidemic, countries and the World Health Organisation have plans and structures in place to trigger action when diseases are spreading. Vaccination programs are not there to poison people, they are there to build up a preventative immunity in individuals against diseases. Every year, new strains of influenza arise naturally by mutation as the virus adapts to us changing humans. Rather than letting a large proportion of the world’s population become ill and therefore retrospectively develop resistance to the newest strain, letting many people die, we humans are proactive. Up and coming new strains of viruses are identified and vaccines produced so that by the time the disease arrives in your part of the world, you are protected in advance and do not get ill.

With totally new viruses, like the COVID-19 coronavirus, there is no immediate vaccine defence. It is therefore vital that a country keeps tabs on new illnesses that arise. They need to have plans in place to deal with the isolation of infected people. They also need to provide care whilst patients go through the illness, to mitigate symptoms until they get well.

Until we have a vaccine, severe cases may be helped by giving them antibodies from people who have recovered from the disease. This method was used against diphtheria in the early 1900's in Alaska. I currently do not know if this is being pursued.

In this interconnected world, nations also have a responsibility to alert the WHO early about upcoming diseases. This time round, full marks for the Chinese response, because we were made aware of the issue earlier than in the past. The world could start monitoring for carriers of the illness and put in place travel restrictions. Whilst China has to bear the brunt of the current epidemic with tens of thousands likely to suffer, we are hopefully able to catch the disease before it spreads through our populations.

The UK and WHO have the following information on responses to epidemics and the teams and mechanisms in place. They can be found in public documents such as:



So, should I be concerned?

You should be aware that the COVID-19 virus is currently an epidemic. In the UK, as of today, 15th February 2020, we have two sets of UK citizens released from 14-day quarantine and 9 cases of illness recognised and contained in Newcastle.

Look out for public advice from the authorities dealing with the outbreak. The following advice is practical not just in this instance but to minimise your risk of getting ill from any disease that is circulating:
  • Keep healthy by eating and sleeping well and exercising
  • Avoid locations and people with the illness
  • Wash your hands regularly
  • Use hand sanitisers where there might be a risk in public
If you have returned from a region seriously affected by COVID-19, or met an individual who has subsequently succumbed to the illness, and begin to experience chest and cough symptoms, stay at home and call your surgery. This ensures that you get the right response and treatment and do not accidentally spread the disease further, endangering people in public places, doctors’ surgeries or hospital receptions.


