Tuesday, November 7, 2017

Visualizing Ancestry Relationships using Google Fusion Tables (Part 1 - Creating the sheet)

First, I want to thank Randy W Whited for planting the seed to use Fusion Tables in the first place. Thank you!

This blog post describes how to create a simple but insightful network plot like the following (some names are blurred to protect the privacy of living people--no ancestors were harmed in the making of this chart):


The difficulty in creating this chart is not the chart itself, it is arriving at the required data to build it.

The hard part:

Some prerequisites:
1-you have an existing tree and can get a simple listing of your direct ancestors (your pedigree).
2-you have installed, and already run the DNAGedcom Client and have an "A_File" that is a listing of all your DNA matches' ancestors.
3-the format of your pedigree (Surname, Given name etc) match the format in the A_File.
4-using Excel (or Access) you can compare 1 above with 2 and come up with a listing of all the ancestors of all your matches who match your direct ancestors

If you have done the prerequisite work above, you probably already know where I am going with this.

Before diving in and setting up the required table, it might be good to take a moment to understand what a network plot is trying to do. In this case, I am only trying to do the most basic of things: compare items in column 1 to items in column 2. That is it. Very simple.

From that simple idea can come complex relationships. For this example, I want to visualize how I am related to those Ancestry DNA matches of mine who share common ancestors.

In a network diagram, the most basic explanation is that it plots a line between the value in column 1 with the value next to it in column 2. That is it. If I created a simple spreadsheet that looked like this:


The resulting network diagram would look like this:

So far, so good. Easy, right? So if I repeat the same in col 1 and something different in col 2, I get the following result:


So you can see, that with a simple idea can come some complex relationships:

That is the foundation for how Network diagrams work, whether in NodeXL, Gephi, or Google Fusion tables.

Setting up a spreadsheet with ancestor data is basically the same as right above. In the first column, add your ancestors list and in column 2, add those DNA matches who have them in their tree. Pay attention to the ancestor details because small difference will give different results (circled names in red, even though the same person in your mind, will be treated as two people):


Once you have that part done, you need to connect it to you. I found the easiest way was to simply copy the ancestors in column 1, and paste it under the last entry in column 2.


While they are still highlighted, go ahead and remove any duplicates, because this is where we connect your ancestors to you, and even if they occupy more than one Ahnentafel number, you only want them in this list one time while connecting them to you. (Under the Data Menu, you can find the Remove Duplicates Button. Don't expand the selection if asked):


The last thing to do now, is simply add your name into the blanks cells of column 1 next to your ancestors: 

That's it! You now have a file you can upload to Google's Fusion Tables and have some fun.

I will describe the process of uploading the file and creating the chart in Fusion Tables in the next post. (see part 2).





















9 comments:

  1. Clark, I converted my GEDcom to Excel get the ancestors to enter in column 1. You indicate "in column 2, add those DNA matches who have them in their tree." What column of my ancestor (a_XXXXX) file do I copy to column 2? Thanks.

    ReplyDelete
    Replies
    1. If you are familiar with MS Access, you will have to compare your pedigree (ancestor list) to your A_File ancestors to see which are the same.

      Or you can email both files to me and I can do it for you. Or upload to Google drive or dropbox or whatever you use, and share it with me. cwlind at gmail dot com.

      Delete
    2. This comment has been removed by the author.

      Delete
    3. Here is a form where you can upload your two files to my google drive folder. Copy/paste this entire text into your browser:


      https://docs.google.com/forms/d/e/1FAIpQLSfEpML_rHqvhiDyEf63A3dVCAMHCUPGjwDWnAd9SwICtV1-rw/viewform

      Delete
  2. Thank you Clark. I hate to have you do this for me as I'd rather understand how to do it myself. Nonetheless I signed onto the Google site but must have made and error as I got the following message:
    "You've already responded
    You can only fill out this form once.
    Try contacting the owner of the form if you think this is a mistake."
    Will mail the two lists to your gmail account. Thank again. --Dave

    ReplyDelete
    Replies
    1. Thanks Dave. I changed the form to allow more than one response. --Clark (Your Fusion Table should be shared with you as the Owner now.)

      Delete
  3. Clark; thanks for the detail in creating a fusion table. After two days of downloading FTDNA and Ancestry I thought I was ready to continue with your instructions. This may seem like a stupid question, but here goes... In prerequisite 2 it says install and run DNAGedcom Client and create an "A_File". This is the first of the three Ancestry files I downloaded, correct? Is the "A_File" something I can do with FTDNA?

    ReplyDelete
  4. Hi! The A_File is created by the client for the Ancestry data. The kinda, sorta equivalent file for FTDNA is the "_Kit Number_ Trees.csv" file. Unfortunately, the FTDNA file doesn't have birth/death dates & places, so comparing the trees of your FTDNA matches is trickier. A James McDonald in one tree and a James McDonald in another may or may not be the same person. However, because you also have the chromosome browser info, if both owners of those James McDonalds also match you on the same chromosome in the same location, and the only person in common between their trees is James McDonald,that could be a good clue :)

    ReplyDelete
  5. Clark,

    Thank you for explaining how this tool works. I've used it to match my results' ICW connections in DNAGedcom. That helps me visualize how the linked matches line up with my various branches.

    ReplyDelete