Sunday, August 20, 2017

Visualizing Your Family Tree Part 1

The graphic below is courtesy of B.F. Lyon Visualizations at https://learnforeverlearn.com/ancestors/.

One neat option is displaying the flags of where each person was born to get an idea of your heritage.

Try it out for yourself. [click image for a clean version.]


One thing you may notice is that some lines cross other lines. Yup! Pedigree collapse, where the same ancestors show up in more than one branch of your tree!

Wednesday, August 9, 2017

Playing with DNA Information Part 4

Well, all the preliminary work and background stuff is out of the way.

***From here on out, I will assume all csv files and any other files you've created have been imported into their own tables in MS Access.


We can finally start playing around with the information. What to do?

How about we try to find out which of my Ancestry matches can be identified on Gedmatch? How would you go about finding out?

This is what I did:

In Access:
I first wanted to isolate the "A" kit numbers, so I ran a simple query on my Gedmatch Match list. I added all fields to the query, then put Like "A*" in the Criteria row of the Kit Number field and >=7 in the Criteria row of the Shared cM field. This produced a list of all my Ancestry matches at Gedmatch that share at least 7 cM with me. Save this query with a unique name; we'll use it in a minute as the source for our next query.
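
If you would rather test the same filter outside Access, here is a minimal Python sketch of it. It is not the Access query itself. The column names "Kit Number" and "Shared cM" and the sample rows are assumptions for illustration, so check them against the headers in your own Gedmatch export. The prefix test ignores case, since Access comparisons normally do as well.

```python
import csv
import io

def filter_ancestry_kits(rows, min_cm=7.0):
    """Keep rows whose kit number starts with "A" and whose shared cM is at least min_cm."""
    kept = []
    for row in rows:
        kit = row["Kit Number"].strip()
        try:
            shared_cm = float(row["Shared cM"])
        except ValueError:
            continue  # skip rows where the cM value is not a number
        if kit.upper().startswith("A") and shared_cm >= min_cm:
            kept.append(row)
    return kept

# Made-up sample data standing in for a Gedmatch match list export.
SAMPLE = """Kit Number,Full Name,Shared cM
A123456,jdoe,25.4
T654321,asmith,40.0
A222222,bjones,6.9
a333333,cwhite,7.0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ancestry_kits = filter_ancestry_kits(rows)
print([r["Kit Number"] for r in ancestry_kits])
```

To run it on a real file, replace io.StringIO(SAMPLE) with an open file handle for your own CSV.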

With the query above saved, let's see what we get. My Ancestry Match file has over 21K entries. My Gedmatch Match file has over 12K entries. The query above (>=7 cM) reduced the Gedmatch list to just over 1000 entries. Now we need enough identifying information to be able to conclude that the person in the Ancestry list is the same as the person in the Gedmatch list. I used the following fields, and linked the Full Name from the Gedmatch Query with the Admin name in the Ancestry table:

This produced some interesting results! I was able to "verify" (I use the term loosely) the accounts for over 50 people. Here is an example of the results. Many were easy to determine since people tend to use the same username everywhere. When a match is not obvious, don't keep it. We don't need to be creating false positives!

Save the results in a new table in the database. I specifically kept the KitID from Gedmatch and the MatchID from Ancestry in the above query. The new table has effectively become a join table that lets me link a match's Gedmatch chromosome browser information to that person's Ancestry tree (assuming there is one).
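
For anyone who wants to try the name link and the resulting join table outside Access, here is a rough Python sketch. It is not the query I ran. The dictionary keys ("Kit Number", "Full Name", "Admin", "MatchID") and the sample rows are assumptions for illustration. Names are compared after lowercasing and collapsing spaces, which is my own choice for the sketch, and every pair it returns is only a candidate that still needs to be checked by eye.

```python
def normalize(name):
    """Lowercase a name and collapse runs of whitespace so small differences do not block a match."""
    return " ".join(name.lower().split())

def build_link_table(gedmatch_rows, ancestry_rows):
    """Pair Gedmatch and Ancestry rows whose normalized names are identical."""
    by_admin = {}
    for a in ancestry_rows:
        key = normalize(a["Admin"])
        if key:  # ignore blank names, they would match each other
            by_admin.setdefault(key, []).append(a)

    links = []
    for g in gedmatch_rows:
        key = normalize(g["Full Name"])
        for a in by_admin.get(key, []):
            links.append({
                "KitID": g["Kit Number"],   # Gedmatch side of the link
                "MatchID": a["MatchID"],    # Ancestry side of the link
                "Name": g["Full Name"],
            })
    return links

# Made-up sample rows standing in for the saved Gedmatch query and the Ancestry match table.
gedmatch_rows = [
    {"Kit Number": "A123456", "Full Name": "jdoe"},
    {"Kit Number": "A333333", "Full Name": "C White"},
    {"Kit Number": "A444444", "Full Name": ""},
]
ancestry_rows = [
    {"MatchID": "m-001", "Admin": "JDoe"},
    {"MatchID": "m-002", "Admin": "someone else"},
    {"MatchID": "m-003", "Admin": ""},
]

links = build_link_table(gedmatch_rows, ancestry_rows)
for link in links:
    print(link["KitID"], link["MatchID"], link["Name"])
```

If one name pairs with several rows on the other side, treat that as "not obvious" and drop it rather than guess.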

We'll find out in the next post!

Sunday, August 6, 2017

Playing with DNA Information Part 3

How many of your matches on Ancestry have trees? Is there a way to tell? Of course! It is one of the files you will want to create for yourself anyway as a resource and reference.

Let's look at the "a" file created by the DNAGedcom client. In my case it is called "a_Clark_Lind.csv".

What is it telling us? Yes, it is a listing of the people in our matches' trees. But if you think about it, isn't it also a listing of matches who have trees? No tree, then they wouldn't be in this file!

So here is how you can create a list of just names.
-Create a new (blank) Excel file.
-Open your "a" file in Excel (a_Your_Name.csv). BE PATIENT, it can take a while to load.

Once open,
-Click on columns C and D, highlighting them both.
-Right-click and select Copy (or ctrl+c)
-Go to the new Excel file and select cell A1. Right-click and select Paste (or ctrl+v).
-With both columns still highlighted, go to the Data Menu, and select Remove Duplicates.
-Save the file. Put it with the other files since you may as well import this into MS Access later anyway.
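
If the "a" file is too big for Excel to handle comfortably, the same copy-and-remove-duplicates steps can be sketched in Python. This is only an illustration: it picks columns C and D by position (the third and fourth columns), and the header names and rows in the sample are invented, not taken from a real DNAGedcom file.

```python
import csv
import io

def unique_pairs(reader, col_indexes=(2, 3)):
    """Return the distinct value pairs from the chosen columns, in first-seen order."""
    seen = set()
    result = []
    for row in reader:
        if len(row) <= max(col_indexes):
            continue  # skip short or blank lines
        pair = tuple(row[i] for i in col_indexes)
        if pair not in seen:
            seen.add(pair)
            result.append(pair)
    return result

# Invented sample standing in for an "a" file; columns C and D are positions 2 and 3.
SAMPLE = """col_a,col_b,match_name,match_admin,person
1,x,Pat Smith,psmith,John Smith
1,x,Pat Smith,psmith,Mary Jones
2,y,Lee Brown,lbrown,Ann Brown
"""

pairs = unique_pairs(csv.reader(io.StringIO(SAMPLE)))
for pair in pairs:
    print(pair)
```

The header row passes through as the first pair, so the output keeps its column titles just like the Excel version does.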

Now you know which of your matches on Ancestry have a tree. If you compare this file to your "m" file, you can also see who DOES NOT have a tree.
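
The comparison against the "m" file boils down to "everyone in the full match list who is not in the tree list". Here is a minimal Python sketch of that idea. It assumes both files share some identifier for each match; the IDs below are made up, and you would need to confirm which column plays that role in your own files.

```python
def matches_without_tree(all_match_ids, ids_with_tree):
    """Return the matches from the full list that never appear in the tree list."""
    with_tree = set(ids_with_tree)
    return [m for m in all_match_ids if m not in with_tree]

# Made-up identifiers: the full match list ("m" file) and the de-duplicated tree list.
all_matches = ["m-001", "m-002", "m-003"]
matches_with_tree = ["m-001", "m-003", "m-001"]

no_tree = matches_without_tree(all_matches, matches_with_tree)
print(no_tree)
```

Once both files are imported into MS Access, an unmatched query gives you the same kind of list.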

--------------------------

Another file you will want to have is a listing of your ancestors. Not your complete tree, just your direct ancestors. This will help out later when you start comparing matches. At this stage, we are not looking for matches by casting a wide net; that should already have happened. Now we are trying to see where those people match you in your tree.

There is no really easy way to create such a listing without using some genealogical software. One of the free programs I use for such things is Gramps (Gramps-project.org). If you download your GEDCOM file from Ancestry (or have one already), you can open it in Gramps. Set yourself as the home person, then export to a new CSV file (not GEDCOM!!) using the option "Ancestors of Home Person" [you].

That will give you a csv file with just you and your direct ancestors. Place it in the same folder as the other files.

These are just "utility" files that will come in handy later once you start comparing data.

More in the next part!

Playing with DNA Information Part 2

Now let's look at what is actually in the different files.

Before we import anything into MS Access, it is a good idea to know what each file contains. It will become important later when we start developing queries.

On the left are the files created by the DNAGedcom client for uploading into the DNAGedcom site. Once they've been uploaded and processed, you will have the files on the right available to you (look in the Members/Files area on the DNAGedcom site).


If we look at what each file contains, we can start to get an idea of how we might combine information that the sites otherwise keep separate.

In the next part, I'll discuss a file or two that you will want to create that will come in handy.


Playing with DNA Information Part 1

This diagram is a quick layout of how the data moves around from different sites, and what that data looks like. This is not intended to be an exhaustive chart to end all charts. (click to enlarge)


Data created by the different sites and tools can also be utilized locally for our own purposes. "Why?" you may ask. Because even after using many of the tools, many questions still go unanswered. My ultimate goal will be revealed soon. :)