How do we know the genome sequence?

Imagine someone asked you to explain how a car works. Even if you knew nothing about cars, you could take the car apart piece by piece, inspect each piece in your hand and probably draw a pretty good diagram of how a car is put together.  You wouldn’t understand how it works, but you’d have a good start in trying to figure it out.

Now what if someone asked you to figure out how the genome works? You know it’s made of DNA, but it’s the ORDER of the nucleotides that helps us understand how the genome works (remember genes and proteins?). All the time in the news, you hear about a scientist or a doctor who looked at the sequence of a person’s genome and from that information could identify possible causes of a disease or a way to target the treatment. DNA sequencing forms a cornerstone of personalized medicine, but how does this sequencing actually work? How do you take apart the genome like a car so you can start to understand how it works?

As a quick reminder – DNA is made out of four different nucleotides, A, T, G, and C, that are lined up in a specific order to make up the 3 billion nucleotides in the human genome.  DNA looks like a ladder where the rungs are made up of bases that stick to one another: A always sticking to T and G always sticking to C.  Since A always sticks to T and G always sticks to C, if you know the sequence that makes up one side of the ladder, you also know the sequence of the other side.
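The pairing rule is simple enough to write down as a few lines of code. Here is a minimal Python sketch of it; the function name `other_side` and the example sequence are made up for illustration, and the sketch ignores strand direction (real tools also reverse the sequence, because the two sides of the ladder run in opposite directions).

```python
# Base-pairing rules from the text: A sticks to T, and G sticks to C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def other_side(strand: str) -> str:
    """Return the base sitting opposite each base of `strand`, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())

# A made-up eight-letter example, not a real gene.
print(other_side("ATGCCGTA"))  # TACGGCAT
```

Running the function twice gets you back to the sequence you started with, which is another way of saying that each side of the ladder fully determines the other.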


The first commonly used sequencing method is called Sanger sequencing, named after Frederick Sanger, who invented the method in 1977. Sanger sequencing takes advantage of this DNA ladder: the method splits the ladder in half and, using glowing (fluorescent) nucleotides of different colors, rebuilds the other side one nucleotide at a time. A detector that can distinguish the different fluorescent colors creates an image of these colors that a program then “reads” to give the researcher the sequence of the nucleotides (see image below to see what this looks like). These sequences are just long strings of As, Ts, Gs, and Cs that the researcher can analyze for their experiments.


This was a revolutionary technique, and when the Human Genome Project started in 1990, Sanger sequencing was the only technique available to scientists. However, this method can only sequence about 700 nucleotides at a time, and even the most advanced machine in 2015 runs only 96 sequencing reactions at once. In 1990, scientists planned on running lots and lots of Sanger sequencing reactions at the same time, and they expected this effort would take 15 years and cost $3 billion. The first draft of the human genome was announced in 2000 through a public effort and a parallel private effort by Celera Genomics, which cost only $300 million and took only 3 years once Celera jumped into the ring in 1998 (why was it cheaper and faster, you ask? They developed a fast “shotgun” method and analysis techniques that sped up the process).
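To get a feel for the scale of the problem, here is a back-of-the-envelope calculation in Python using the numbers above. It assumes every nucleotide is read exactly once with no overlap between reads, which is a simplification: real projects read each position many times over, so the true numbers are considerably larger.

```python
GENOME_LENGTH = 3_000_000_000  # nucleotides in the human genome
READ_LENGTH = 700              # nucleotides read by one Sanger reaction
REACTIONS_PER_RUN = 96         # reactions one machine runs at a time

def ceil_div(a: int, b: int) -> int:
    """Integer division that rounds up instead of down."""
    return (a + b - 1) // b

# Simplifying assumption: each nucleotide is read exactly once.
reads_needed = ceil_div(GENOME_LENGTH, READ_LENGTH)
runs_needed = ceil_div(reads_needed, REACTIONS_PER_RUN)

print(f"{reads_needed:,} reactions")    # 4,285,715 reactions
print(f"{runs_needed:,} machine runs")  # 44,643 machine runs
```

Even under this generous assumption, that is more than four million separate reactions, which helps explain why the original plan was measured in years and billions of dollars.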

As you may imagine, for personalized medicine, where sequencing a huge part of the genome may be necessary for every man, woman, and child, 3-15 years and $300 million to $3 billion per genome is not feasible. Fortunately, genome sequencing technology advanced in the 2000s to what’s called Next Generation Sequencing. There are a lot of different versions of Next Gen Sequencing (often abbreviated as NGS), but basically all of them run thousands and thousands of sequencing reactions at the same time. Instead of reading 700 nucleotides at a time as in Sanger sequencing, NGS methods can read up to 3 billion bases in one experiment.

How does this work? Short DNA sequences are stuck to a slide and replicated over and over. This makes dots of the exact same sequence, and thousands and thousands of these dots are created on one slide. Then, like Sanger sequencing, glowing nucleotides build the other side of the DNA ladder one nucleotide at a time. In this case though, the surface looks like a confetti of dots that have to be read by a sophisticated computer program to determine the millions of sequences.
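The idea of reading every dot in parallel, one glowing nucleotide per round, can be mimicked with a toy Python sketch. Everything here is an illustrative assumption rather than how any real instrument works: the color assigned to each base, the function names, and the three made-up template sequences are all hypothetical, and the sketch assumes every template is at least as long as the number of rounds.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hypothetical color for each glowing nucleotide (real chemistries differ).
COLOR_OF = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
BASE_OF = {color: base for base, color in COLOR_OF.items()}

def image_cycles(templates, cycles):
    """Return one 'image' per round: the color each dot glows as its next base is added."""
    images = []
    for cycle in range(cycles):
        # Every dot gets its next complementary base in the same round.
        images.append([COLOR_OF[PAIRS[t[cycle]]] for t in templates])
    return images

def call_bases(images):
    """Turn the stack of images back into one sequence per dot."""
    n_dots = len(images[0])
    return ["".join(BASE_OF[img[dot]] for img in images) for dot in range(n_dots)]

# Three made-up dots; a real slide holds vastly more.
templates = ["ATGC", "GGCA", "TTAC"]
images = image_cycles(templates, cycles=4)
print(images[0])           # ['red', 'blue', 'green']
print(call_bases(images))  # ['TACG', 'CCGT', 'AATG']
```

The point of the sketch is the loop structure: each round produces a single picture of the whole slide, so adding more dots adds more sequences without adding more rounds.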


So what has this new technology allowed scientists to do? It has decreased the cost of sequencing a genome to around $1000. It has also allowed researchers to sequence large numbers of genomes to better understand the genetic differences between people, to better understand other species’ genomes (including the bacteria that colonize us or the viruses that infect us), and to help determine the genetic changes in tumors to better detect and treat these diseases. Next Generation Sequencing allows doctors to actually use genome sequencing in the clinic. A version of genome sequencing called “exome sequencing” has been developed that sequences only the genes. Since genes make up only about 1-2% of the genome, NGS of the exome takes less time and money but provides lots of information about what some argue is the most important part of the genome: the part that encodes proteins. Much of the promise of personalized medicine can be found through this revolutionary DNA sequencing technique, and with the cost getting lower and lower, there may be a day soon when you too will have your genome sequence as part of your medical record.

For more information about the history of Sequencing, check out this article “DNA Sequencing: From Bench to Bedside and Beyond” in the journal Nucleic Acids Research.

Here is an amusing short video about how Next Generation Sequencing works described by the most interesting pathologist in the world.

Journal Club: The Microbiome Autism Connection

As I’ve mentioned in other posts, scientists have to read about and understand the current scientific literature. Lots of the time this is done alone, at your desk in the office or the lab, for hours and hours so that you can really understand whatever topic it is that you’re studying. But one of my favorite ways to share scientific papers is through a weekly meeting with the whole lab called a Journal Club. Although my husband laughs about this kind of nerdy science “club” (akin to his amusement about scientific societies), it’s a great way to discuss a particular topic and dive deep into a discussion about how the researchers got their results and came to their conclusions. This is the first of many Journal Clubs where we will do an abbreviated version of what we would discuss in a typical journal club in the lab. 

Paper Title: Reduced Incidence of Prevotella and Other Fermenters in Intestinal Microflora of Autistic Children

Authors: Dae-Wook Kang, Jin Gyoon Park, Zehra Esra Ilhan, Garrick Wallstrom, Joshua LaBaer, James B. Adams, Rosa Krajmalnik-Brown
Full disclosure, Dr. LaBaer is the Director of the center I previously worked in at the Biodesign Institute at ASU. Drs. Park and Wallstrom worked in offices down the hall from me and Dr. Krajmalnik-Brown was in another center at the Biodesign Institute.

Journal: PLOS One (PLOS stands for “Public Library of Science”). In case you want to read the whole article, it can be downloaded (for free) here

Background/Introduction: Before this paper was published, scientists knew that many children with autism also had gastrointestinal (GI) issues suggesting that there may be a connection between the two. There have been some studies looking at antibiotic treatment (which could change the gut microbiome) before 3 years of age and how this might be connected with autism.  There have also been studies connecting the gut microbiome and the brain. So there was evidence that the gut microbiome and autism may be related in some way.

When this paper was written, scientists also knew about the microbiome and that changes in the bacteria living in the gut (all 1,000,000,000,000,000 of them) are found in patients with many different diseases, from C. diff infection to obesity to depression.

Goal of this paper: Look closely at the changes in the gut microbiome of children with autism to better understand how these two might be related.

Methods/what did they do?: Bacteria in fecal samples are considered representative of the bacteria in the gut microbiome. Therefore, the researchers collected fecal samples from children both with (20 children) and without (20 children) autism. The samples from children without autism were used as a “control” to compare to the samples from children with autism. The researchers also asked the children (or their parents) questions to help determine the level of GI issues, the severity of their autism, and environmental factors like their diet. The researchers isolated bacterial DNA from each of these fecal samples and then sequenced the DNA to determine what types of bacteria were in the gut.

A figure from the paper comparing the phylogenetic diversity (PD) of the bacteria in autistic versus non-autistic children. You can see that the red boxes (for autistic children) are lower than the blue boxes (for non-autistic children), indicating a lower microbiome diversity.

Results: Through sequence analysis and other statistical methods, the authors found that children who did not have autism had a more diverse microbiome than autistic children. Higher diversity means that the gut contains more different types of bacteria, and lower diversity means a smaller variety of bacteria in the gut. They also found that in the autistic patients with a greater diversity in their microbiome, their autism was generally less severe. They did not find any correlation between age, gender, or diet and these microbiome changes.
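To make “diversity” a little more concrete, here is a small Python sketch of two common ways to put a number on it: richness (how many types of bacteria are present) and the Shannon index (which also accounts for how evenly they are spread). This is a deliberate swap-in for illustration, not the measure used in the paper, which relies on phylogenetic diversity. The two samples and their counts are entirely made up and are not data from the study.

```python
import math

def richness(counts):
    """Number of bacterial types that appear at least once in the sample."""
    return sum(1 for n in counts.values() if n > 0)

def shannon(counts):
    """Shannon index: higher when there are more types and they are more evenly spread."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)

# Made-up read counts per bacterial type (hypothetical, NOT from the paper).
even_sample = {"genus_a": 25, "genus_b": 25, "genus_c": 25, "genus_d": 25}
skewed_sample = {"genus_a": 97, "genus_b": 3, "genus_c": 0, "genus_d": 0}

for name, sample in [("even", even_sample), ("skewed", skewed_sample)]:
    print(name, richness(sample), round(shannon(sample), 2))
```

In this toy comparison, the even sample scores higher on both measures than the skewed one, which is the kind of difference the phrase “lower diversity” describes.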

The scientists also looked at which specific groups of bacteria were more represented in non-autistic versus autistic children. Specifically, bacteria from the genera Prevotella and Coprococcus and from the family Veillonellaceae were less abundant in autistic children.

Discussion/Significance: What does this all mean? The researchers did find a correlation between decreased gut microbiome diversity and autism. It should be clarified that just because GI problems are often found in autistic children and the severity of the GI issues correlates with the severity of autism, this does not necessarily mean that GI issues cause autism or vice versa. That still needs to be determined. Similarly, although the diversity of bacteria in autistic children is low, it is not clear whether this is a cause of autism or an effect of a child having autism. However, this paper does provide a “stepping stone” to better understand what is happening in the gut of autistic children and may help define a target for diagnosing autism (by looking at the decreased diversity in the gut as a diagnostic test) or treatment (perhaps through fecal transplant).

What has been done since? This paper was published in 2013. So what has changed since the paper was published? Do we know whether or not changes in the gut microbiome cause autism? Unfortunately, this is still unclear. However, if these microbiome changes are a cause of the neurological changes in autism, then one would want to do a clinical trial to test what happens to autism symptoms when the microbiome has been altered. This could be done in a number of ways, including diet modulations, prebiotics, probiotics, synbiotics, postbiotics, antibiotics, fecal transplantation, and activated charcoal. Researchers have started this process by holding a meeting that included patients and their families to figure out how this type of trial could be designed (for more details, check out this journal article).

For more information about the microbiome/autism connection, check out Autism Speaks. To read more Journal Clubs, visit the archives here.