January 19, 2020
Science

The Mystery in DNA Ancestry Testing

Does your ethnicity estimate look inaccurate? A little biology and genetics can explain why it may seem off, and how to find your hidden family members.
By Tomohiro Takano

When people get their Ethnicity Estimate back from a DNA testing company, the results page has a neat and simple breakdown of the global ethnicities that have contributed to their genetic code. Typically, these results are condensed down from complex algorithms, DNA sequence matches, and a bit of guesswork into a perfectly palatable percentage.

For some people, these hard percentages fully support their family’s oral history, written records, or family trees. For others, it may feel as if their family has lied to them their whole lives. But both of these groups are looking at their Ethnicity Estimate in the wrong way.

To understand the right way to look at your Ethnicity Estimate, you need to understand how it was created, what it measures, and a bit of basic genetics. 

An Ethnicity Estimate is:

Well, it is exactly that: an estimate of the regional ethnic groups that have contributed to your genome.

This is how companies make your Ethnicity Estimate:

  1. They sequence the DNA from people who have lived in the same area for many generations.
  2. They identify changes in the DNA code that are specific to each of these populations.
  3. The populations become “Reference Populations” to which other DNA samples can be compared.
  4. You send in your DNA sample.
  5. Specific regions of your DNA are compared to many Reference Populations.
  6. If you have many of the same genetic markers as a Reference Population, it is assumed that you inherited these markers from that Reference Population.
  7. Some math and algorithms determine what percentage of your total genome came from each Reference Population.
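To make the steps above a little more concrete, here is a toy sketch in Python. It is not how any company actually computes its estimates: the marker names, the two reference populations, and the simple counting rule are all assumptions made up for illustration. Real reference panels hold hundreds of thousands of markers and rely on statistical models rather than a straight tally.

```python
from collections import Counter

# Hypothetical reference panel: for each marker, the nucleotide that is
# characteristic of each Reference Population (illustrative values only).
REFERENCE_PANEL = {
    "marker_1": {"Irish": "A", "Japanese": "G"},
    "marker_2": {"Irish": "C", "Japanese": "T"},
    "marker_3": {"Irish": "G", "Japanese": "A"},
    "marker_4": {"Irish": "T", "Japanese": "C"},
}


def estimate_ethnicity(sample):
    """Return the percentage of matched markers attributed to each population."""
    matches = Counter()
    for marker, nucleotides in sample.items():
        reference = REFERENCE_PANEL.get(marker, {})
        # Each marker has two nucleotides: one inherited from each parent.
        for nucleotide in nucleotides:
            for population, ref_nucleotide in reference.items():
                if nucleotide == ref_nucleotide:
                    matches[population] += 1
    total = sum(matches.values())
    if total == 0:
        return {}
    return {pop: 100 * count / total for pop, count in matches.items()}


# A made-up customer sample: two nucleotides per marker.
sample = {
    "marker_1": "AG",
    "marker_2": "CC",
    "marker_3": "GA",
    "marker_4": "TT",
}

print(estimate_ethnicity(sample))  # {'Irish': 75.0, 'Japanese': 25.0}
```

Six of the eight nucleotides in this sample match the Irish column and two match the Japanese column, which is where the 75% and 25% come from.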


And that’s it! That is your Ethnicity Estimate. You can see an example of an Ethnicity Estimate from MyHeritage, below.


Plus, it is very accurate when done right. DNA sequencing technology is extremely accurate, close to 100%. So, if a company finds a match between you and a Reference Population, it is highly likely that the match actually exists.


But many people think that an Ethnicity Estimate is something it is not.


An Ethnicity Estimate is not:


An Ethnicity Estimate is not a family history, a genealogy, or a perfect picture of your family’s past. (That’s why companies call them ‘estimates.’)  


Many of the worst reviews of DNA companies seem to focus on this confusion, as customers do not understand why their results are not more specific. Other customers think that their Ethnicity Estimate is somehow inaccurate because it does not match their family’s oral or recorded history.


To fully understand the difference between an Ethnicity Estimate and genealogy or family history, some biology and genetics fundamentals can help.


Don’t worry. There is not a test at the end. 


DNA and Ethnicity Estimates


You are made of cells, and each cell in your body contains DNA molecules. Each DNA molecule is made up of a specific sequence of nucleotides, the small building blocks of DNA, bonded together.


A sequence of 3 nucleotides is called a codon. Almost every codon calls for a specific amino acid; a few instead signal where a protein ends.


Amino acids are the building blocks of proteins: little cellular machines that carry out a specific function. Many codons together in the DNA molecule specify the exact sequence of amino acids in a protein. These codons, together, are referred to as a gene. Each gene carries the information to build a different protein, as seen below.  
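If you like to see things in code, the codon-to-amino-acid idea can be sketched in a few lines of Python. The table here is only a small excerpt of the standard genetic code (the full table has 64 codons), and the DNA string is an invented example, not a real gene.

```python
# A small excerpt of the standard genetic code, written as DNA codons.
CODON_TABLE = {
    "ATG": "Met", "TTT": "Phe", "GGC": "Gly", "GCA": "Ala",
    "AAA": "Lys", "TGG": "Trp", "GAA": "Glu",
    "TAA": "Stop", "TAG": "Stop", "TGA": "Stop",
}


def translate(dna):
    """Read the DNA three nucleotides at a time and list the amino acids."""
    amino_acids = []
    usable_length = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        codon = dna[start:start + 3]
        amino_acid = CODON_TABLE.get(codon, "???")  # "???" = not in our excerpt
        if amino_acid == "Stop":
            break  # a stop codon ends the protein
        amino_acids.append(amino_acid)
    return amino_acids


print(translate("ATGTTTGGCAAATAA"))  # ['Met', 'Phe', 'Gly', 'Lys']
```

The five codons ATG, TTT, GGC, AAA, and TAA produce four amino acids, and the final codon tells the cell to stop.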




Altogether, you have tens of thousands of genes within your DNA. The proteins they encode carry out different functions within each cell. Essentially, your body is the combined effort of trillions of cells, each with thousands of different proteins and functions, all operating to keep you alive. Modern DNA technology allows us to take a peek inside this incredibly complicated process.


Modern DNA tests measure single nucleotide polymorphisms, called ‘SNPs’ or ‘snips’ for short. These are positions in the DNA where a single nucleotide commonly differs from person to person. DNA tests for health and ancestry typically measure around 700,000 different positions in your DNA to see whether each one holds the nucleotide adenine (A), thymine (T), guanine (G), or cytosine (C). With hundreds of thousands of positions tested, multiple positions are tested in most of your genes.


These nucleotides change spontaneously (or mutate) at a very low rate. So, if you carry the same nucleotides at these positions as a Reference Population, the most likely reason is that one of your ancestors came from that Reference Population. It is also why Ethnicity Estimates are pretty accurate. They are based purely on biology and data science.
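To get a feel for how low "a very low rate" is, here is a rough back-of-the-envelope calculation in Python. The mutation rate used is an assumption: roughly one change per hundred million nucleotides per generation is a commonly quoted ballpark for humans, but treat the exact figure as illustrative. The 700,000 positions come from the typical test size mentioned above.

```python
# Assumed ballpark: about 1 new mutation per 100 million nucleotides
# per generation. This is an illustrative figure, not a measured constant.
MUTATION_RATE = 1e-8
TESTED_POSITIONS = 700_000  # typical number of positions on an ancestry test


def expected_new_mutations(generations):
    """Expected number of tested positions that mutate along one line of descent."""
    return MUTATION_RATE * TESTED_POSITIONS * generations


for generations in (1, 10, 100):
    expected = expected_new_mutations(generations)
    print(f"{generations:>3} generations: about {expected:.3f} changed positions")
```

Under these assumptions, fewer than one of the tested positions is expected to change even across a hundred generations, which is why a shared marker usually points to shared ancestry rather than coincidence.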


However, Ethnicity Estimates are not the same as a family tree or a detailed family history. Genetics can explain why.


Genetics and Ethnicity Estimates 


One of the first lessons in genetics is that each one of your parents donates 50% of their DNA to create you! 


Most animals have 2 copies of their entire genetic code. In order to reproduce, these 2 copies are separated into individual cells during the process of meiosis, creating haploid cells: sperm or egg cells with a single copy each. When a sperm cell and an egg cell meet, they fuse their DNA copies together to create a cell with 2 full copies of the DNA (called the diploid condition). One copy is from your mother, and the other is from your father. Ta-da!




While this seems very simple, there is actually some very important mixing that happens during the process of meiosis. The genes you received from your mom and the genes you received from your dad get all mixed up as your body creates egg or sperm cells. This is important for two reasons:


  1. Variation in a population tends to help a species adapt to change and survive. 
  2. Without this mixing, many traits would be linked, which would reduce variability.


Without this mixing, your children would have traits from only your mother or your father, but not both. With this mixing, the variation is almost endless. It is likely one of the main reasons animals have been reproducing this way for hundreds of millions of years.
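Here is a very simplified Python sketch of that mixing for a single chromosome. It assumes exactly one swap point (a "crossover") at a random position, whereas real chromosomes can have none or several, and it uses the letters M and P as stand-ins for DNA from your mother and father. The function name and the chromosome length of 20 are invented for the example.

```python
import random


def make_gamete(maternal, paternal, rng):
    """Build one mixed chromosome copy from the two copies a person carries.

    Simplifying assumption: exactly one crossover at a random position.
    """
    if len(maternal) != len(paternal):
        raise ValueError("both copies must have the same length")
    # Pick a swap point that leaves at least one position on each side.
    crossover = rng.randint(1, len(maternal) - 1)
    # Randomly decide which copy supplies the first part.
    if rng.random() < 0.5:
        first, second = maternal, paternal
    else:
        first, second = paternal, maternal
    return first[:crossover] + second[crossover:]


rng = random.Random(42)   # fixed seed so the run is repeatable
maternal = "M" * 20       # the copy you received from your mother
paternal = "P" * 20       # the copy you received from your father

for _ in range(5):
    print(make_gamete(maternal, paternal, rng))
```

Each printed line is a different blend of M and P, which is the point: every egg or sperm cell carries its own mix of what you inherited from each parent.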


But more importantly for this article, this mixing process makes combining genetics and genealogy a difficult task. In fact, there is an entire scientific field, one that would put Sherlock Holmes to shame, aimed at using genetics to understand a person’s genealogy and human relationships. (Disappointingly, this field is simply called “Genetic Genealogy.”)


Unfortunately, because of this constant cycle of genetic mixing and inheritance, even a genetic genealogist cannot be 100% sure about your family history based on genetics alone.


Your Hidden Family


Due to the cycle of separating 2 DNA copies into different cells and the various mixing mechanisms that ensure variability, certain genes are simply lost over time. To understand why your grandparent (or others in your family tree) may not have contributed a single gene to your DNA, let’s look at a simple example.


Let’s say that you are 100% Irish, and your family has lived in Ireland for many generations. You have children with a partner who is 100% Japanese. Your children will have 50% of their DNA from the Irish population and 50% from the Japanese population. When your children create gametes (sperm or egg cells), the genes from these two populations are mixed and then separated into individual cells. The catch is that the genes are not always mixed equally.


Sometimes, an individual sperm or egg cell will have more of the Irish DNA, while other times it will have more of the Japanese DNA. Theoretically, your children could create gametes that are completely unmixed and contain only Irish or Japanese DNA. While this is extremely unlikely because the mixing process is very thorough, as a grandparent you could have a grandchild with whom you do not share a single genetic variant. 
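Just how unlikely is a completely unmixed gamete? A quick Python calculation gives a rough upper bound. It leans on two simplifying assumptions: it ignores the swapping of segments within chromosomes entirely (which only makes an unmixed result rarer), and it treats each of the 23 human chromosome pairs as an independent coin flip between the two grandparents.

```python
from fractions import Fraction

CHROMOSOME_PAIRS = 23  # humans have 23 pairs of chromosomes


def chance_of_unmixed_gamete(pairs=CHROMOSOME_PAIRS):
    """Chance that every chromosome in a gamete traces to the same grandparent.

    Assumes no crossovers and an independent 50/50 pick for each pair.
    """
    return Fraction(1, 2) ** pairs


chance = chance_of_unmixed_gamete()
print(chance)         # 1/8388608
print(float(chance))
```

So even under these generous assumptions, only about one gamete in eight million would carry DNA from a single grandparent, and the real mixing process pushes the odds lower still.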


Plus, with each subsequent generation and mixing with other populations that have different genetic markers, the chances increase that Irish genetic variants will be lost from the DNA completely.
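The arithmetic behind this dilution is simple enough to sketch in Python. The numbers below are averages only, and they assume every ancestor in a generation is a different person. Actual inheritance varies around these averages because of the uneven mixing described above.

```python
def ancestors_in_generation(n):
    """Number of ancestors n generations back (1 = parents, 2 = grandparents)."""
    return 2 ** n


def expected_share(n):
    """Average fraction of your DNA traced to one ancestor n generations back."""
    return 1 / 2 ** n


print("generations back | ancestors | average share each")
for n in range(1, 11):
    share = 100 * expected_share(n)
    print(f"{n:>16} | {ancestors_in_generation(n):>9} | {share:.3f}%")
```

By ten generations back you have 1,024 ancestor slots, and each one is expected to account for less than a tenth of a percent of your DNA. With shares that small, it is easy for a particular ancestor's markers to drop out altogether.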


For more than a few ancestry DNA kit users, Ethnicity Estimates do not reflect their known family tree. This can happen when the genes from a relative do not make it through the many cycles of mixing and reduction that happen over generations. As we have seen, under the right conditions this can wipe out markers from entire sides of your family fairly quickly.


The Mystery Continues: Find Your Hidden Family


While certain branches of your family tree are not traceable using ancestry DNA testing alone, there are several companies that can give you access to historical records and family tree building software to really get your search started.


MyHeritage and Ancestry.com both offer DNA ancestry testing, family trees, and subscription-based access to massive collections of historical records. Both companies allow you to place living DNA matches within your family tree, which can expand your results greatly. Then, you can fill in the blanks by using historical records to find out about marriages, children, and the lives of your ancestors.


Your Ethnicity Estimate, DNA matches, and other genetic evidence can be a great way to start a family tree. While some of your relatives and family history may not show up in your DNA results, a little bit of good old-fashioned detective work should get the job done!


You can read [our review of MyHeritage] and [our review of Ancestry] for more information about each of these companies. Each company offers slightly different advantages.



Tomohiro Takano
Co-Founder and CEO