How do we know the genome sequence?

Imagine someone asked you to explain how a car works. Even if you knew nothing about cars, you could take the car apart piece by piece, inspect each piece in your hand and probably draw a pretty good diagram of how a car is put together.  You wouldn’t understand how it works, but you’d have a good start in trying to figure it out.

Now what if someone asked you to figure out how the genome works? You know it’s made of DNA, but it’s the ORDER of the nucleotides that helps us understand how the genome works (remember genes and proteins?). All the time in the news, you hear about a scientist or a doctor who looked at the sequence of a patient’s genome and from that information could identify possible causes of a disease or a way to target the treatment. DNA sequencing forms a cornerstone of personalized medicine, but how does this sequencing actually work? How do you take apart the genome like a car so you can start to understand how it works?

As a quick reminder – DNA is made out of four different nucleotides, A, T, G, and C, that are lined up in a specific order to make up the 3 billion nucleotides in the human genome.  DNA looks like a ladder where the rungs are made up of bases that stick to one another: A always sticking to T and G always sticking to C.  Since A always sticks to T and G always sticks to C, if you know the sequence that makes up one side of the ladder, you also know the sequence of the other side.
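
The pairing rule above is simple enough to write down as a few lines of code. This is only a minimal sketch: the function names and the example sequence are made up for illustration. The second function relies on one fact not spelled out above, namely that the two sides of the ladder are conventionally read in opposite directions.

```python
# Pairing rules from the paragraph above: A sticks to T, G sticks to C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the partner base for each position, rung by rung."""
    return "".join(PAIRS[base] for base in strand.upper())

def reverse_complement(strand: str) -> str:
    """Return the other side of the ladder, written in its own reading direction."""
    return complement(strand)[::-1]

one_side = "ATGGCA"                  # made-up example sequence
print(complement(one_side))          # TACCGT
print(reverse_complement(one_side))  # TGCCAT
```

Knowing one side really is enough: the other side falls out of a four-entry lookup table.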


The first commonly used sequencing method is called Sanger sequencing, named after Frederick Sanger, who invented it in 1977. Sanger sequencing takes advantage of this DNA ladder: the method splits the ladder in half and then, using glowing (fluorescent) nucleotides of different colors, rebuilds the other side one nucleotide at a time. A detector records the different fluorescent colors as an image, which a program then “reads” to give the researcher the sequence of the nucleotides (see image below to see what this looks like).  These sequences are just long strings of As, Ts, Gs, and Cs that the researcher can analyze for their experiments.


This was a revolutionary technique, and when the Human Genome Project started in 1990, Sanger sequencing was the only technique available to scientists. However, this method can only sequence about 700 nucleotides at one time, and even the most advanced machine in 2015 only runs 96 sequencing reactions at one time.  In 1990, using Sanger sequencing, scientists planned on running lots and lots of sequencing reactions at one time, and they expected this effort would take 15 years and cost $3 billion. The first draft of the human genome was published in 2000 through a public effort and a parallel private effort by Celera Genomics that cost only $300 million and took only 3 years once they jumped into the ring in 1998 (why was it cheaper and faster, you ask? They developed a fast “shotgun” method and analysis techniques that sped up the process).
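
To get a feel for why this was such a big job, here is a back-of-the-envelope calculation using only the numbers quoted above. It is a rough sketch, not a model of the real project: it assumes every nucleotide is read exactly once (the hypothetical `coverage` parameter), whereas real projects read each region many times over.

```python
import math

GENOME_SIZE = 3_000_000_000   # nucleotides in the human genome (from the text)
READ_LENGTH = 700             # nucleotides per Sanger reaction (from the text)
REACTIONS_PER_RUN = 96        # reactions per machine run (from the text)

def sanger_workload(coverage: int = 1) -> tuple[int, int]:
    """Return (reads needed, machine runs needed) to cover the genome."""
    reads = math.ceil(GENOME_SIZE * coverage / READ_LENGTH)
    runs = math.ceil(reads / REACTIONS_PER_RUN)
    return reads, runs

reads, runs = sanger_workload()
print(f"{reads:,} reads, {runs:,} runs")   # 4,285,715 reads, 44,643 runs
```

Even in this best case, that is more than four million separate reactions.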

As you may imagine, for personalized medicine, where sequencing a huge part of the genome may be necessary for every man, woman, and child, 3-15 years and $300 million to $3 billion per genome is not feasible. Fortunately, genome sequencing technology advanced in the 2000s to what’s called Next Generation Sequencing. There are a lot of different versions of Next Gen Sequencing (often abbreviated as NGS), but basically all of them run thousands and thousands of sequencing reactions at the same time. Instead of reading 700 nucleotides at a time as in Sanger sequencing, NGS methods can read up to 3 billion bases in one experiment.

How does this work? Short DNA sequences are stuck to a slide and replicated over and over. This makes dots of the exact same sequence, and thousands and thousands of these dots are created on one slide. Then, as in Sanger sequencing, glowing nucleotides build the other side of the DNA ladder one nucleotide at a time. In this case though, the surface looks like a confetti of dots that has to be read by a sophisticated computer program to determine the millions of sequences.
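
The “reading” step can be pictured as a toy program. This is a heavily simplified sketch: the color-to-base table, the `call_bases` function, and the three-dot “slide” are all invented for illustration, and real instruments use their own chemistry, colors, and far more careful image analysis.

```python
# Hypothetical dye colors, one per nucleotide (not any real instrument's scheme).
COLOR_TO_BASE = {"green": "A", "red": "T", "blue": "G", "yellow": "C"}

def call_bases(images):
    """images[cycle][dot] is the color seen at that dot in that cycle.

    Returns one sequence per dot, built up one nucleotide per cycle.
    """
    reads = [""] * len(images[0])
    for cycle in images:
        for dot, color in enumerate(cycle):
            reads[dot] += COLOR_TO_BASE[color]
    return reads

# Three cycles of imaging a slide with three dots (made-up data).
images = [
    ["green", "blue", "red"],
    ["red", "blue", "yellow"],
    ["yellow", "green", "green"],
]
print(call_bases(images))   # ['ATC', 'GGA', 'TCA']
```

Every dot is read in parallel during each cycle, which is why adding more dots to the slide adds more sequences without adding more time.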


So what has this new technology allowed scientists to do? It has decreased the cost of sequencing a genome to around $1000. It has also allowed researchers to sequence large numbers of genomes to better understand the genetic differences between people, to better understand other species’ genomes (including the bacteria that colonize us or the viruses that infect us), and to help determine the genetic changes in tumors to better detect and treat these diseases. Next Generation Sequencing allows doctors to actually use genome sequencing in the clinic. A version of genome sequencing has been developed called “exome sequencing” that only sequences the genes.  Since genes only make up about 1-2% of the genome, NGS of the exome takes less time and money but provides lots of information about what some argue is the most important part of the genome – the part that encodes proteins.  Much of the promise of personalized medicine can be found through this revolutionary DNA sequencing technique – and with the cost getting lower and lower, there may be a day soon when you too will have your genome sequence as part of your medical record.
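
For a sense of scale, the 1-2% figure can be turned into a number of nucleotides. This small sketch uses only the genome size and percentages quoted in this post; the function name is made up for illustration.

```python
GENOME_SIZE = 3_000_000_000   # nucleotides in the human genome (from the text)

def exome_size(percent: int) -> int:
    """Nucleotides in the exome if genes make up `percent` percent of the genome."""
    return GENOME_SIZE * percent // 100

low, high = exome_size(1), exome_size(2)
print(f"{low:,} to {high:,} nucleotides")   # 30,000,000 to 60,000,000 nucleotides
```

So exome sequencing means reading tens of millions of nucleotides instead of billions.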

For more information about the history of Sequencing, check out this article “DNA Sequencing: From Bench to Bedside and Beyond” in the journal Nucleic Acids Research.

Here is an amusing short video about how Next Generation Sequencing works described by the most interesting pathologist in the world.

Personalized Medicine: A Cure for HIV

Personalized Medicine – finding the right treatment for the right patient at the right time – is quickly becoming a buzzword both in the medical field and with the public. But is it just hype? No!  I discussed a number of examples of how personalized medicine is currently being used in breast cancer in a previous post. In this and future posts, I’ll talk about a few fascinating emerging examples of the promise of personalized medicine.  These are NOT currently being used for patient treatment as part of standard of care, but could be someday.


HIV lentivirus

The Human Immunodeficiency Virus (HIV), the cause of AIDS, is a virus that attacks the immune system.  This attack prevents immune cells from fighting other infections.  The result of this is that the patient is more likely to acquire other infections and cancers that ultimately kill them.  When first discovered in the early 1980s, HIV infection was a death sentence. Untreated, survival is 9 to 11 years.  In the past 30 years, antiviral treatments have been developed that, when taken as prescribed, essentially make HIV infection a chronic disease, extending life to 25-50 years. But there is no cure for HIV, and as of 2012, over 35.3 million people were infected with the virus.

The lack of a vaccine to prevent the disease or of a cure to treat those infected isn’t because no one is trying. Since the virus was identified as the cause of the disease, scientists have been working to find a way to prevent or cure it (along with developing all of the antiretroviral drugs that delay/treat the disease). I’m not going to discuss all of this interesting research (though it is worthy of discussion). Instead, I’m going to talk about one patient, Timothy Ray Brown, who was cured of HIV/AIDS through a stroke of genetic understanding and luck!

Brown was HIV positive and had been on antiretroviral therapy for over 10 years when he was diagnosed with leukemia in 2007. His leukemia – Acute Myeloid Leukemia (AML) – is caused by too many white blood cells in the bone marrow, which interferes with the creation of red blood cells, platelets and normal white blood cells. Chemotherapy and radiation are used to treat AML by wiping out all of the cells in the bone marrow – both the cancer cells and the normal cells. Brown’s doctors then replaced the cells in the bone marrow with non-cancerous bone marrow cells of a donor.  This is called a stem cell transplant, and it is commonly used to treat leukemia – often resulting in long term remission or a cure of the disease.

But the really cool part of this story isn’t the treatment itself.  Rather, it’s that Brown’s doctor selected bone marrow from a donor who had a mutation in the gene CCR5. So what? The CCR5 protein is found on the outside of the cells that HIV infects. CCR5 is REQUIRED for the virus to get inside the cell, replicate, and kill the cell. Without CCR5, HIV is harmless. There is a deletion mutation in CCR5 called delta32 that prevents HIV from binding to the cell and infecting it.  Blocking HIV from getting into the cell prevents HIV infection.  In fact, it’s been found that some people are naturally resistant to HIV infection because they have this deletion. Two copies of the mutated gene are found in 1% of the Caucasian population, and it’s thought that this mutation was selected for because it also prevents smallpox infection.
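
The 1% figure can be turned into a rough estimate of how common the mutation itself is. This is only a sketch under an assumption this post does not make: it uses Hardy-Weinberg proportions, a textbook simplification in which the fraction of people with two copies equals the square of the mutation’s frequency. The function names are made up for illustration.

```python
import math

def allele_frequency(two_copy_fraction: float) -> float:
    """Mutation frequency q, ASSUMING Hardy-Weinberg proportions (q*q = two-copy fraction)."""
    return math.sqrt(two_copy_fraction)

def carrier_frequency(two_copy_fraction: float) -> float:
    """Fraction of people with exactly one copy (2 * p * q) under the same assumption."""
    q = allele_frequency(two_copy_fraction)
    return 2 * (1 - q) * q

print(f"{allele_frequency(0.01):.0%}")    # 10%
print(f"{carrier_frequency(0.01):.0%}")   # 18%
```

Under that assumption, far more people would carry one copy than two, but only the two-copy case removes CCR5 from the cell surface entirely.
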
So Brown’s doctors repopulated his bone marrow with cells that had the CCR5-delta32 mutation.  This didn’t just cure his leukemia but it also prevented the HIV from infecting his new blood cells, curing his HIV. He is still cured of HIV today!

What does this mean for others who are infected with HIV? Is a stem cell transplant going to work for everyone?  Unfortunately, no. This mutation is very rare, so finding donors with this mutation isn’t feasible.  Plus, this is a very expensive therapy that comes with risks such as graft-versus-host disease from the mismatch between the person receiving the transplant and the transplanted cells themselves. However, there are possible options for overcoming these challenges, including “gene editing.” In this method, T cells from HIV-positive patients would be removed from the body and then gene editing would be used to make the CCR5-delta32 mutation in these cells.  These cells could then be re-introduced into the patient.  With the mutation, HIV won’t be able to infect these T cells, which would hopefully cure the disease while avoiding some of the major graft-versus-host side effects. A small clinical trial tested this idea in 2014 (full article can be found in the New England Journal of Medicine), and HIV couldn’t be detected in one out of four patients who could be evaluated. Although this is a preliminary study using an older gene-editing technique, it shows promise for “personalized gene therapy” to potentially cure HIV.

Why does everyone, including me, like astronomy so much? OR – how I became a biologist.

I LOVED SPACE.  I loved space so much that in the sixth grade I spent most of my time at recess – without shame – with a friend planning how to create a tractor beam (for those who aren’t complete geeks, that’s the force in Star Trek that allowed the Enterprise to latch on to other spaceships).  Our solution – a very big, long rope. Completely ignoring my fear of heights and adventure rides, I was convinced that I was going to be the female Jean-Luc Picard.  And while everyone else wrote in their sixth grade yearbooks that they wanted to be a teacher, doctor or lawyer, I wanted to be an aerospace engineer (and I honestly cannot believe that I’m showing the proof with my sixth grade yearbook photo).

"Aequorea victoria" by Mnolf - Photo taken in the Monterey Bay Aquarium, CA, USA. Licensed under CC BY-SA 3.0 via Wikimedia Commons -

With this deep-seated love of astronomy, you may be asking yourself how I became a biologist.  In my senior year of high school, I attended a Boston University Medical Center program called City Lab.  This was a six week program that my mom drove my friend Missy and me to (an hour each way in rush hour traffic) so that we could do a lab experiment.  Each week, we spent several hours in the Boston University lab doing different parts of the experiment I describe below.

The goal of this experiment was to take a piece of DNA that coded for the green fluorescent protein (also known as GFP) and put it into a piece of DNA that could make bacteria glow green.  We haven’t talked in detail about genes or protein expression yet, so I’ll stick to the basics. GFP is from a jellyfish called Aequorea victoria, and it’s what makes the jellyfish glow green (see above).  A scientist isolated this one gene, and if transferred properly into another organism, it can make that organism glow green.  And yes, people have tried this.  People have created GFP rabbits and mice and…NO, NOT PEOPLE.  Why not?  Well, first, because it’s highly unethical to do genetic engineering in people for no clinical reason (and having glowing eyebrows will not cure any disease).  Also, it’s very difficult to manipulate the DNA in humans for a variety of reasons that we will discuss when talking about gene therapy.

So, at City Lab our job was to cut the GFP gene using restriction enzymes (which are essentially DNA scissors that cut DNA in a specific place) and then insert the GFP gene into another piece of DNA using ligase (essentially, DNA glue).  This new piece of DNA (called a bacterial expression vector) makes the GFP protein in bacterial cells.  When the GFP protein is expressed in bacterial cells, the bacteria glow green (like the picture to the right).  It was easy to figure out if your six weeks of effort was worth it: your bacteria either glowed green or they didn’t.
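
The scissors-and-glue idea can be mimicked with plain text. This is a toy sketch, not the actual City Lab protocol: I don’t remember which enzyme we used, so the example borrows the well-known EcoRI site (GAATTC, cut after the first G), and both sequences are invented placeholders. It also ignores the second strand of the ladder.

```python
SITE = "GAATTC"    # EcoRI recognition site, used here only as an example enzyme
CUT_OFFSET = 1     # EcoRI cuts after the first G of its site

def digest(dna: str) -> list[str]:
    """Cut `dna` at every occurrence of SITE, like a restriction enzyme."""
    fragments, start = [], 0
    pos = dna.find(SITE)
    while pos != -1:
        fragments.append(dna[start:pos + CUT_OFFSET])
        start = pos + CUT_OFFSET
        pos = dna.find(SITE, pos + 1)
    fragments.append(dna[start:])
    return fragments

def ligate(*fragments: str) -> str:
    """Glue fragments back together end to end, like ligase."""
    return "".join(fragments)

# Made-up sequences: a placeholder "gene" flanked by two sites, and a tiny "vector".
source = "CCGAATTCATGGTGAGCAAGGAATTCTT"
vector = "TTTTGAATTCAAAA"

_, gene, _ = digest(source)            # keep only the middle fragment
vector_left, vector_right = digest(vector)
recombinant = ligate(vector_left, gene, vector_right)
print(recombinant)   # TTTTGAATTCATGGTGAGCAAGGAATTCAAAA
```

The gene ends up inside the vector with a cut site restored on each side, which is the text-only version of what we were doing with tubes and enzymes.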

OUR EXPERIMENT WAS THE ONLY ONE THAT WORKED. Not only had we understood how a piece of DNA worked and moved it from one place to another, but we were then able to get it to do something in a bacterial cell.  At the time, I didn’t realize that this was what scientists called “recombinant DNA technology”.  I didn’t know that this was used all of the time in the laboratory as a foundation of molecular biology studies.  I had no idea that someday I would be managing a facility that stored hundreds of thousands of these pieces of DNA to help researchers worldwide with their experiments.  I only knew the thrill of “discovery” and I wanted more.

That’s how I became a biologist #tbt