July 15, 2024
Science

About our Genomelink Deep Ancestry report

Here are FAQs about our Genomelink Deep Ancestry report. Learn how the report works and how it is different from other services.
By
Tomohiro Takano

Outline

What is the Genomelink Deep Ancestry report?

How does Genomelink Deep Ancestry calculate your ethnicity estimate?

How is it different from other Genomelink reports?

Why are my results different from other reports?

I lost the % of ethnicities on Deep Ancestry I had on Global Ancestry. Why could this happen?

How should we interpret the % of results between modern and ancient ancestry analysis?

What is the Genomelink Deep Ancestry report?

The Genomelink Deep Ancestry report offers a detailed look into your genetic heritage with over 100 ethnic and regional labels. This report is designed to give you confidence in your ancestry results by providing clear and transparent information.

Genomelink Deep Ancestry uses a curated scientific reference dataset and a unique dataset from Genomelink users. Each ethnicity estimate is based on careful statistical analysis and comes with a reliability score, so you can see which parts of your ancestry are most reliable.

The results page shows detailed comparisons and is structured into three levels: Tier 1 provides a broad overview of major continental groups; Tier 2 covers sub-regional detail within those continental categories; and Tier 3 uses advanced genetic analysis to identify specific regions, sometimes within individual countries, at the finest resolution the report offers. This structure takes you from broad continental origins to specific regional roots.

How does Genomelink Deep Ancestry calculate your ethnicity estimate?

At Genomelink, we build genetic ancestry reports by analyzing specific positions in your DNA called genetic markers. These are positions where the DNA sequence varies among individuals. Because markers are inherited from one generation to the next, they offer insights into a person's ethnic background and ancestral origins.

Ethnicity Estimate Calculation:

  1. DNA Analysis: We begin by collecting and analyzing a person’s DNA, focusing on genetic markers.
  2. Comparison to Reference Datasets: These markers are compared against comprehensive reference datasets—collections of genetic information from various population groups—to map out ancestral heritage.
  3. Statistical Models: We use advanced statistical models to estimate the geographical origins and percentage composition of an individual’s ancestry.
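
The three steps above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each reference population is summarized by one allele frequency per marker and that ancestry fractions are fitted with a textbook expectation-maximization (EM) update for a mixture model. The population names, frequencies, and genotype are made up, and this is not a description of Genomelink's actual statistical model.

```python
def estimate_ancestry(genotype, ref_freqs, iterations=100):
    """Estimate ancestry fractions with a simple EM update.

    genotype:  alternate-allele counts (0, 1, or 2), one per marker.
    ref_freqs: {population: [alternate-allele frequency per marker]},
               with every frequency strictly between 0 and 1.
    """
    pops = list(ref_freqs)
    n_markers = len(genotype)
    q = {pop: 1.0 / len(pops) for pop in pops}  # start from equal fractions

    for _ in range(iterations):
        new_q = {pop: 0.0 for pop in pops}
        for j, g in enumerate(genotype):
            # Probability of an alternate / reference allele at marker j
            # under the current mixture of populations.
            p_alt = sum(q[pop] * ref_freqs[pop][j] for pop in pops)
            p_ref = sum(q[pop] * (1.0 - ref_freqs[pop][j]) for pop in pops)
            for pop in pops:
                f = ref_freqs[pop][j]
                # Expected number of the two alleles at marker j
                # that are attributed to this population.
                new_q[pop] += g * q[pop] * f / p_alt
                new_q[pop] += (2 - g) * q[pop] * (1.0 - f) / p_ref
        # Each marker contributes two alleles, so divide by 2 * n_markers.
        q = {pop: new_q[pop] / (2 * n_markers) for pop in pops}
    return q


# Hypothetical reference frequencies for four markers (illustrative values).
ref_freqs = {
    "Population A": [0.9, 0.1, 0.8, 0.2],
    "Population B": [0.1, 0.9, 0.2, 0.8],
}
genotype = [2, 0, 2, 0]  # this toy genotype resembles Population A

fractions = estimate_ancestry(genotype, ref_freqs)
print({pop: round(100 * share, 1) for pop, share in fractions.items()})
```

A real analysis would use hundreds of thousands of markers and many more populations; the toy data here only shows how comparing markers against reference groups turns into percentages.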

Reference Datasets and Data Sources:

  • Our reference datasets are collected from publicly available sources and Genomelink’s data warehouse with users' consent.
  • An initial analysis selects a pool of candidate ethnicities. Ethnicities lacking sufficient data samples are excluded.
  • The final ethnicities for the report are selected if their reference samples can be isolated from others.
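
As a rough sketch of the first filtering step above, the snippet below drops candidate labels that have too few reference samples. The label names and the minimum of 30 samples are assumptions chosen for illustration; the article does not state the actual threshold.

```python
from collections import Counter


def select_candidate_labels(sample_labels, min_samples):
    """Keep only labels with at least `min_samples` reference samples."""
    counts = Counter(sample_labels)
    return sorted(label for label, n in counts.items() if n >= min_samples)


# Toy reference panel: one label per reference sample (names are illustrative).
reference_labels = ["Label A"] * 120 + ["Label B"] * 45 + ["Label C"] * 8

kept = select_candidate_labels(reference_labels, min_samples=30)
print(kept)  # "Label C" is excluded because it has only 8 samples
```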

Advanced Clustering and Training Models:

  • A cutting-edge clustering method is applied to the reference samples to train a statistical model.
  • This model estimates the percentages of each ethnicity based on the user’s genotyping data.

Reliability Score Introduction: We introduced a reliability score to enhance the transparency and usefulness of the Deep Ancestry reports. This score gives users a quantifiable measure of confidence in their ancestry results based on several factors:

  1. Depth of Reference Data: Reflects the comprehensiveness and diversity of the reference datasets used, with higher scores for results based on more extensive datasets.
  2. Analytical Confidence: Assesses the statistical confidence of the ancestry estimates, factoring in the precision of the models and algorithms employed.
  3. Marker Coverage & Imputation Quality: Considers the number and diversity of genetic markers analyzed, with higher scores indicating a more thorough genetic analysis.
  4. Comparative Analysis: Compares an individual’s results across multiple reference datasets or different statistical models, offering a measure of consistency.
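
One simple way to picture how four factors could be combined into a single score is a weighted average. The sketch below assumes each factor has already been scaled to a value between 0 and 1; the factor keys and weights are hypothetical and are not the formula used in the report.

```python
# Hypothetical weights for the four factors (assumed, not from the report).
FACTOR_WEIGHTS = {
    "reference_depth": 0.3,
    "analytical_confidence": 0.3,
    "marker_coverage": 0.2,
    "comparative_consistency": 0.2,
}


def reliability_score(factors, weights=FACTOR_WEIGHTS):
    """Combine factor scores (each in [0, 1]) into one weighted average."""
    for name in weights:
        if not 0.0 <= factors[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * factors[name] for name in weights)
    return weighted_sum / total_weight


example_factors = {
    "reference_depth": 0.8,
    "analytical_confidence": 0.9,
    "marker_coverage": 0.7,
    "comparative_consistency": 0.6,
}
score = reliability_score(example_factors)
print(round(score, 2))
```

With these example inputs, a label backed by deep reference data but inconsistent across models would land in the middle of the range rather than at either extreme.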

Ethnicity Composition Breakdown: The 'Ethnicity Composition' provides a comprehensive analysis of your genetic heritage, structured into three distinct levels for clarity and depth of understanding:

  • Tier 1: Continental Level Breakdown: This initial stage offers a broad overview, categorizing your ancestry into major continental groups. It sets the stage for understanding the global regions from which your ancestors originated.
  • Tier 2: Sub-Regional Breakdown: This tier focuses on sub-regional nuances within the broader continental categories. It may include specific geographical regions encompassing groups of countries, peninsular areas, or other significant subdivisions.
  • Tier 3: Most Magnified Regional Level Breakdown: The most detailed tier utilizes advanced genetic ancestry analysis and algorithmic clustering to identify your roots with the highest precision, sometimes pinpointing particular areas within countries.
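
The relationship between the tiers can be sketched as a roll-up: if each Tier 3 label belongs to one Tier 2 sub-region and one Tier 1 continent, the broader tiers are sums of the finer ones. The hierarchy and percentages below are placeholders, and the report may compute each tier differently.

```python
from collections import defaultdict

# Hypothetical hierarchy: Tier 3 label -> (Tier 2 sub-region, Tier 1 continent).
HIERARCHY = {
    "Region X1": ("Sub-region X", "Continent 1"),
    "Region X2": ("Sub-region X", "Continent 1"),
    "Region Y1": ("Sub-region Y", "Continent 2"),
}


def roll_up(tier3_percentages, hierarchy):
    """Sum Tier 3 percentages into Tier 2 and Tier 1 totals."""
    tier2 = defaultdict(float)
    tier1 = defaultdict(float)
    for label, pct in tier3_percentages.items():
        sub_region, continent = hierarchy[label]
        tier2[sub_region] += pct
        tier1[continent] += pct
    return dict(tier1), dict(tier2)


tier3 = {"Region X1": 40.0, "Region X2": 25.0, "Region Y1": 35.0}
tier1, tier2 = roll_up(tier3, HIERARCHY)
print(tier1)
print(tier2)
```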

Result Logic: We provide results from three different models to offer a balanced view of your ancestry:

  • Balanced Result: Considers the trade-off between the most accurate results and including as many ethnicities as possible. It weighs the outputs of the accurate and discovery models for optimal predictive performance.
  • Accurate: Aims to provide a set of ethnicities with high reliability, focusing on recent admixtures and showing fewer ethnicities.
  • Discovery: Traces back to older admixture events than the Accurate model, displaying more ethnicities, some of which may have lower reliability.
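
To illustrate what weighing the outputs of two models can look like, the sketch below blends two sets of percentages with a single weight and rescales the result to 100%. The 50/50 weight and the example percentages are assumptions; the article does not say how the Balanced result is actually weighted.

```python
def blend(accurate, discovery, accurate_weight=0.5):
    """Blend two percentage breakdowns and rescale the result to 100%."""
    labels = sorted(set(accurate) | set(discovery))
    mixed = {
        label: accurate_weight * accurate.get(label, 0.0)
        + (1.0 - accurate_weight) * discovery.get(label, 0.0)
        for label in labels
    }
    total = sum(mixed.values())
    return {label: 100.0 * value / total for label, value in mixed.items()}


# Illustrative outputs: the discovery-style result shows one extra label.
accurate_result = {"Label A": 70.0, "Label B": 30.0}
discovery_result = {"Label A": 60.0, "Label B": 25.0, "Label C": 15.0}

balanced = blend(accurate_result, discovery_result)
print(balanced)
```

In this toy case the extra label from the discovery-style result still appears in the blend, but at a smaller share than the discovery output alone would give it.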

This comprehensive approach empowers users with a deeper understanding of the strengths and limitations of their ancestry results, fostering a more nuanced appreciation of their genetic heritage.

Summary

At Genomelink, we craft genetic ancestry reports by analyzing specific DNA segments called genetic markers. These markers are compared against comprehensive reference datasets to estimate geographical origins and ancestry composition. Our reference datasets are sourced from publicly available data and, with users' consent, from our data warehouse. Using advanced statistical models and clustering methods, we provide detailed ancestry estimates.

To enhance the transparency and usefulness of our Deep Ancestry reports, we have introduced a reliability score based on the depth of reference data, analytical confidence, marker coverage, and comparative analysis. The Ethnicity Composition is structured into three tiers: Continental Level Breakdown, Sub-Regional Breakdown, and Most Magnified Regional Level Breakdown, offering a detailed view of your heritage. We provide results from three models—Balanced, Accurate, and Discovery—to ensure a comprehensive and nuanced understanding of your genetic heritage.

How is it different from other Genomelink reports?

  • Differences with Global Ancestry
    • Global Ancestry uses a chromosome breakdown approach to detect and show minor ethnicities. We co-developed the report with a research group spun out of Stanford, using their proprietary XGMix algorithm. It applies lower cut-off criteria so that it can detect and show more of the minor ethnicities you might have.
    • Deep Ancestry focuses on providing a detailed breakdown of over 100 ethnicity and regional labels. Its goal is not to detect minor ethnicities but to give a comprehensive view of your ancestry with a broader perspective. Therefore, Deep Ancestry uses regular cut-off criteria, similar to those used by other major DNA testing services like Ancestry and 23andMe. This ensures that only the most significant ethnic contributions are highlighted in your report.
    • If you had a small percentage (1% or below) of an ethnicity in your Global Ancestry results but do not see it in Deep Ancestry, the likely cause is this intended difference in cut-off criteria.
  • Differences with other regional ancestry reports, including European Breakdown, Native American, Asian, UK, African, Latino
    • Our regional reports are co-developed with partners, including GalateaBio, LivingDNA, and SOMOS, using their proprietary algorithms and datasets. It is very common to see some differences in your results when you use different services’ algorithms and datasets.
    • Each company uses different segmentation to define each ethnicity and region.
  • Differences with Ancient ancestry reports, including Ancient Bloodline, Viking, Ancient Ancestor
    • The Genomelink team develops all of our ancient ancestry reports. Ancient ancestry analyses differ from modern ones, such as the Deep Ancestry, Global Ancestry, or Regional Ancestry reports. Some reports are developed with genealogy experts, utilizing datasets with exclusive accessibility and broader outreach (Viking, Ancient Ancestor, etc.). Other reports are created using open-access data and datasets published by academically acclaimed scientists (Ancient Bloodline, Neanderthal, etc.).
    • Timelines span from the Pleistocene (up to 10,000 BC) to the Iron Age (1 BC).

Why are my results different from other reports?

Several factors can contribute. First and foremost, the reference dataset used to build Deep Ancestry differs from those behind other reports. Because genetic ancestry is inferred from reference data, differences in reference labels or samples can produce different results. We will continue to update the results as more samples become available to make them more accurate. Second, the difference may come from the genetic markers themselves: files from different providers often cover different sets of markers, and imputation can sometimes produce different genotypes. Lastly, every report uses its own algorithm, and performance varies from one algorithm to another.
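
The point about marker differences can be made concrete with a small comparison of two marker lists. The marker IDs below are placeholders, not real identifiers, and the function is only an illustrative sketch of how much two providers' files might overlap.

```python
def marker_overlap(markers_a, markers_b):
    """Summarize how many markers two raw-data files share."""
    set_a, set_b = set(markers_a), set(markers_b)
    shared = set_a & set_b
    union = set_a | set_b
    return {
        "shared": len(shared),
        "only_a": len(set_a - shared),
        "only_b": len(set_b - shared),
        # Share of all observed markers that appear in both files.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder marker IDs standing in for two providers' files.
file_a = ["marker_1", "marker_2", "marker_3", "marker_4"]
file_b = ["marker_3", "marker_4", "marker_5"]

summary = marker_overlap(file_a, file_b)
print(summary)
```

Markers missing from one file either go unused or have to be imputed, which is one reason the same person can receive slightly different estimates from different uploads.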

I lost the % of ethnicities on Deep Ancestry I had on Global Ancestry. Why could this happen?

The difference in results between Global Ancestry and Deep Ancestry reports is due to the distinct methodologies, algorithms, and criteria each report employs to analyze your DNA.

Background: 

Global Ancestry uses a chromosome breakdown approach co-developed with a research group spin-out from Stanford University. This approach utilizes a proprietary algorithm designed to detect and show minor ethnicities with high precision. By applying lower cut-off criteria, Global Ancestry can reveal even the most minor ethnic contributions in your genetic makeup, often identifying ethnicities that constitute less than 1% of your DNA.

On the other hand, Deep Ancestry focuses on providing a detailed breakdown of over 100 ethnicity and regional labels. Its goal is not to detect minor ethnicities but to give a comprehensive view of your ancestry with a broader, more general perspective. Therefore, Deep Ancestry uses regular cut-off criteria, similar to those used by other major DNA testing services like Ancestry and 23andMe. This ensures that only more significant ethnic contributions are highlighted in your report.

Key Points:

  1. Different Algorithms and Datasets:
    • Global Ancestry: The algorithm employed by Global Ancestry is specifically designed to be highly sensitive to minor ethnic contributions. It uses advanced global and local ethnicity estimate techniques to analyze your DNA in detail. The dataset supporting this algorithm includes extensive yet exclusive reference populations that allow for detecting even the most minor ethnicities.
    • Deep Ancestry: In contrast, Deep Ancestry uses a different set of algorithms optimized to provide a broader overview of your ethnic and regional background. These algorithms are designed to identify and classify larger ethnic groupings and regions using a more standard set of reference populations and cut-off criteria. This approach ensures a comprehensive but less granular view of your ancestry.
  2. Different Cut-off Criteria:
    • Global Ancestry: The lower cut-off criteria in Global Ancestry allow for detecting tiny percentages of various ethnicities, capturing even the most minor traces in your DNA.
    • Deep Ancestry: Deep Ancestry applies regular cut-off criteria similar to those used by major DNA testing services like Ancestry and 23andMe. This means that only ethnic contributions above a certain threshold are included in your report, providing a clearer and more generalized view of your ancestry.

Therefore, if you notice a small percentage (1% or below) of ethnicities in your Global Ancestry results that are absent in the Deep Ancestry report, this is likely due to the differences in algorithms, datasets, and cut-off criteria. Global Ancestry aims to identify every trace of your diverse heritage, while Deep Ancestry offers a broader overview of your major ethnic and regional backgrounds.
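
The effect of a cut-off can be sketched in a few lines. The thresholds (0.5% and 1%) and the step of rescaling the remaining labels to 100% are assumptions made for illustration; the article does not publish the exact cut-off values or post-processing used by either report.

```python
def apply_cutoff(estimates, cutoff_percent):
    """Drop labels below the cut-off, then rescale the rest to 100%."""
    kept = {label: pct for label, pct in estimates.items() if pct >= cutoff_percent}
    total = sum(kept.values())
    if total == 0:
        return {}
    return {label: 100.0 * pct / total for label, pct in kept.items()}


# Illustrative raw estimates that include one small contribution.
raw = {"Label A": 62.0, "Label B": 37.2, "Label C": 0.8}

lower_cutoff = apply_cutoff(raw, cutoff_percent=0.5)    # keeps the small label
regular_cutoff = apply_cutoff(raw, cutoff_percent=1.0)  # drops the small label
print(lower_cutoff)
print(regular_cutoff)
```

The same underlying estimate therefore shows three labels under the lower threshold and two under the regular one, with the remaining percentages shifting slightly upward.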

How should we interpret the % of results between modern and ancient ancestry analysis?

Modern and ancient ancestry analysis results will likely differ due to variations in methodology and reference datasets used for each type of analysis. Here’s how to interpret the differences:

  1. Methodology Differences:
    • Modern Ancestry: This analysis focuses on your recent ancestry, typically covering the past 10 generations. It uses advanced algorithms and large, diverse reference datasets to provide a detailed overview of your genetic makeup and recent admixture.
    • Ancient Ancestry: This analysis delves much further back in time, often thousands of years, to trace your deep ancestral origins. It relies on a limited number of ancient DNA samples, which continue to be uncovered and studied.
  2. Reference Datasets:
    • Modern Ancestry: Utilizes comprehensive and up-to-date reference populations that reflect genetic diversity and historical migrations over the past few centuries to millennia.
    • Ancient Ancestry: Uses ancient DNA samples, which are less numerous and represent populations from a much older time period, providing a snapshot of your distant past.
  3. Evolution of DNA Analysis:
    • Both modern and ancient ancestry results can change over time as DNA analysis techniques improve and new ancient DNA samples are discovered and added to the databases. This ongoing research can lead to more accurate and refined ancestry estimates.
  4. Interpretation of Percentages:
    • The percentages in your modern ancestry results represent your genetic contributions from various populations over the last few hundred years.
    • The percentages in your ancient ancestry results reflect your genetic ties to ancient populations and how those ancient genes have been passed down through millennia.

In summary, while modern ancestry analysis provides a snapshot of your recent genetic heritage, ancient ancestry analysis offers insights into your deep ancestral past. The differences in results stem from the distinct methodologies and datasets used, and both are subject to change as DNA research evolves.
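
A short calculation shows why modern ancestry estimates lose resolution after roughly 10 generations. Each generation back doubles the number of ancestor slots in a family tree, so the average share of DNA from any single ancestor halves. The figures below are averages; actual inheritance varies because DNA is passed down in uneven chunks.

```python
def ancestors_in_generation(n):
    """Number of ancestor slots n generations back (2 parents, 4 grandparents, ...)."""
    return 2 ** n


def expected_share_percent(n):
    """Average share of DNA expected from one ancestor n generations back."""
    return 100.0 / ancestors_in_generation(n)


for n in (1, 2, 5, 7, 10):
    print(
        f"{n:>2} generations back: {ancestors_in_generation(n):>4} ancestor slots, "
        f"about {expected_share_percent(n):.3f}% each"
    )
```

Ten generations back there are 1,024 ancestor slots, each contributing just under 0.1% on average, and a single ancestor's average share already falls below 1% at seven generations.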

Summary

The percentages in modern and ancient ancestry analysis results will differ due to the distinct methodologies and reference datasets used for each type of analysis. Modern ancestry focuses on recent genetic heritage, tracing up to 10 generations with detailed algorithms and extensive, current reference populations. In contrast, ancient ancestry delves into your distant past, relying on limited ancient DNA samples representing populations from thousands of years ago.

Both types of analysis are subject to change as DNA analysis techniques improve and new ancient samples are discovered. Modern ancestry results reflect recent genetic contributions, while ancient ancestry results highlight deep ancestral origins. These differences provide complementary insights into your genetic heritage, from recent centuries to ancient times.


Tomohiro Takano
Co-Founder and CEO