Tuesday, April 17, 2018

How Close Are We - GEDmatch and Known Matches

A few days ago, Liza Lizarraga called our attention on Facebook to a year-old blog about the "generations" column in the GEDmatch "one-to-many" results. In her blog GenGenAus, Cate Pearce tells us how the generations that GEDmatch shows us line up with her own reality. I had never seen this blog or any other form of this analysis.

Above are my own one-to-many matches with four people in my Pikholz project. I am not sure how I am related to them, but I call them "my fourth cousins". The first two are known to have multiple Pikholz ancestors, but the approximations that GEDmatch shows (in the red box) do not help much.
Yet some people want them to be useful - even definitive - and the occasional poster on Facebook or in discussion groups will ask "What does 4.7 generations mean, exactly?" Well, there is no "exactly" once you get past parent/child relationships. And beyond that, Cate shows us what her actual family matches look like:
Gen 2.3 1C1R (first cousins once removed)
Gen 2.5 1C1R (first cousins once removed) Again, this makes sense: my cousin is a generation older than me, his grandparents, which is 2 generations, are my great-grandparents, which is 3 generations. Therefore, the Gedmatch Generation is calculated as being between 2 and 3 = 2.5
Gen 2.6 1C1R (first cousins once removed) 2C (second cousins)
Gen 2.9 2C (second cousins)
Gen 3.0 2C (second cousins)
This is the ideal scenario, with the common shared ancestors for me and my match both being 3 generations back.
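Cate's arithmetic can be written out as a small sketch. It only reproduces her reasoning about the "expected" figure, the average of each person's distance to the shared ancestors. It is not GEDmatch's own calculation, and the function name is made up for illustration. The figure GEDmatch displays is its own estimate, which is why her real 1C1R matches scatter from 2.3 to 2.6 rather than all landing on 2.5.

```python
def expected_generations(gens_a: int, gens_b: int) -> float:
    """Average the two sides' distances (in generations) to the shared ancestors.

    Hypothetical helper mirroring Cate's reasoning, not GEDmatch's algorithm.
    """
    return (gens_a + gens_b) / 2


# (generations back for one person, generations back for the other)
relationships = {
    "1C1R": (2, 3),  # his grandparents are my great-grandparents
    "2C": (3, 3),    # shared great-grandparents on both sides
}

for name, (a, b) in relationships.items():
    print(name, expected_generations(a, b))
# prints:
# 1C1R 2.5
# 2C 3.0
```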
I figured it would be useful to do something similar for endogamous populations and felt that it would be more user-friendly to put it in a structure like Blaine Bettinger's Shared cM Project. So with Blaine's kind permission, I prepared an analysis of my 1184 known family relationships, shown in the chart below. (There are more data points on the way, waiting to get a few more kits onto GEDmatch.)

This chart makes no claim to precision. The averages exclude matches that do not appear in the traditional one-to-many search on GEDmatch, which we call "the zeroes."

Some do not show up because they do not meet the conditions for a match. Some are good matches but have only a single segment and for some reason GEDmatch does not display these. Some are matches but do not fit into the 2000-match limit which GEDmatch imposes.

Multiple known relationships are listed by the closest one and there is no special acknowledgement of my double second cousins.

I also show the sample size, which does not include the "zeroes."
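For anyone who wants to tabulate their own matches the same way, here is a minimal sketch of the averaging described above: group by closest known relationship, leave out the zeroes, and report the sample size alongside the average. The layout of the data, the use of None to mark a zero, and the function name are all assumptions for illustration. The sample values are borrowed from Cate's list, plus one hypothetical zero.

```python
from collections import defaultdict
from statistics import mean


def summarize(matches):
    """Return {relationship: (average Gen, sample size)}, skipping the zeroes."""
    by_relationship = defaultdict(list)
    for relationship, gen in matches:
        if gen is not None:  # None marks a "zero": no row in one-to-many
            by_relationship[relationship].append(gen)
    return {rel: (mean(vals), len(vals)) for rel, vals in by_relationship.items()}


matches = [
    ("1C1R", 2.3), ("1C1R", 2.5),
    ("2C", 2.9), ("2C", 3.0),
    ("2C", None),  # hypothetical zero, excluded from both average and count
]

for rel, (average, n) in summarize(matches).items():
    print(f"{rel}: average Gen {average:.2f} from {n} matches")
```

Because the zeroes are left out of both the average and the count, a relationship with many zeroes will look closer on the chart than it would if they were included.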

I would be really pleased to have some more data. Anyone in the endogamous community who wishes to join in can download the simple Excel form at www.pikholz.org/GenerationsForm.xlsx. Needless to say, privacy will be maintained. And you do not need to identify the specific matches, just the relationships.
Housekeeping notes
I just took DNA from the wife of the man who was my boss forty years ago. She told me back then that she is a Pikholz descendant, and now I know that her third-great-grandparents are Mordecai Pikholz and his wife Taube. I still don't know how Mordecai and my second-great-grandfather Isak Fischel are connected. Maybe they were brothers. I have been after her for years to give me DNA.

I'd be pleased to see any of you at the following three events.

30 April 2018, 7:00 – Jewish SIG of the St. Louis Genealogical Society, Holocaust Museum & Learning Center Theater, 12 Millstone Drive, St Louis Missouri
Using Genetics for Genealogy Research
(Lessons in Jewish DNA – One Man’s Successes and What He Learned On the Journey)

2 May 2018, 6:00 – Jewish Genealogical Society of Kansas City, Johnson County Central Resource Library, Carmack Room, 9875 West 87th Street, Overland Park Kansas
Lessons in Jewish DNA – One Man’s Successes and What He Learned On the Journey

8 May 2018, 7:00 – Youngstown Area Jewish Federation and the Mahoning County Chapter of the Ohio Genealogical Society, Jewish Community Center of Youngstown, 505 Gypsy Lane, Youngstown Ohio
Why Did My Father Know That His Grandfather Had An Uncle Selig?
(because genealogy is more than names and dates) 

JRI-Poland still needs considerable funds for the new indexing projects. Among the towns I am responsible for, Skalat and Skole are far from their goals, but Rozdol, Komarno, Zbarazh and Podkamen need funds as well. See instructions for donations here - and don't forget to say which projects you are supporting. And let me know what you have contributed.


  1. I believe Lara Diamond collected a lot of our example data that could be useful.

    1. As far as I know, she is doing an endogamous version of Shared cM. The data submitted to her does not include GEDmatch generations.

  2. Israel,

    Your numbers will give the closest relationship for a GEDmatch Gen number when a relationship is known.

    What I think would also be very useful is if you determine (at least for yourself, or for others as well if they want to contribute), what the GEDmatch Generation numbers are for matches of people who you don't know the relationship to.

    I'm not sure how to best present it. Maybe % of known relationships at each Gen number. e.g. Maybe 80% of all matches at Gen 2.0 have known relationships, but only 20% at Gen 3.0.

    I sent you by private email 3 relationships I know: an uncle (1.4), a 2C1R (3.6) and a 3C (3.7).

    At gen 1.4, I know the closest relationship of 100% of my matches.

    At gen 3.6, I know the closest relationship of 0.6% (1 out of 168) of my matches.

    At gen 3.7, I know the closest relationship of 0.2% (1 out of 444) of my matches.

    At all other gen levels, I know the closest relations of 0% of my matches.

    Thinking about this, I have hundreds of 3.6 and 3.7 matches I haven't identified relationships for. I find it hard to believe that they'll all be 2nd to 4th cousins. I expect they are more likely to be 3rd to 7th cousins at the closest, with other endogmatic (if that's a word) relationships at that level and further back adding up to a 3.6 or 3.7 equivalence.

    So I think your table (and Lara's cM table) are useful to figure out where known closest relationships should fall, but without further info we should be wary of using them as guides to what the closest relationship of an unknown 3.6 might be.
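The share-of-known-relationships figures in the comment above are easy to reproduce for your own match list. This is only a sketch: the counts are the two the commenter gives (1 of 168 at Gen 3.6, 1 of 444 at Gen 3.7), and the function name is hypothetical.

```python
def percent_known(known: int, total: int) -> float:
    """Percentage of matches at one Gen level whose closest relationship is known."""
    return 100 * known / total


# Gen level -> (matches with a known relationship, all matches at that level)
counts = {3.6: (1, 168), 3.7: (1, 444)}

for gen, (known, total) in counts.items():
    print(f"Gen {gen}: {percent_known(known, total):.1f}% known")
# prints:
# Gen 3.6: 0.6% known
# Gen 3.7: 0.2% known
```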