Sunday, October 26, 2014

A Look at the Matching Segment Search (GEDmatch)

Last week, for the first time, I wrote about GEDmatch, just in time for the launch of their four new subscription tools, which they call "Tier1."

Kitty Cooper has blogged about the triangulation feature and Blaine Bettinger has blogged about a tool they call Lazarus, which recreates the DNA of ancestors based on the tests of living descendants.

Miriam and my second cousins - a directed search
Last week I looked at GEDmatch results for a woman named Miriam who is connected to quite a few of my family. Miriam tested with Ancestry. Her matches were with two second cousins of mine on my mother's side, first cousins to one another. Miriam's two matches with these two cousins are on the order of 9-10 centiMorgans and I wrote to Miriam that we really need a tool that allows us to see who else matches her on those two specific segments - on chromosomes 8 and 15.

Later that same day, I saw the announcement of the GEDmatch Tier1 tool Matching Segment Search. It took a few days to register my subscription but by early Friday afternoon, I was ready to have a go.

I logged in at GEDmatch and found the four Tier1 links on the bottom right. You don't see the utilities until you have completed your donation. I chose the first - Matching Segment Search - and it gave me the screen below:

Note that what was called "Matching Segment Search" on the first screen is "GEDmatch DNA Segment Search" on the second.

I entered Miriam's kit number and left the minimum default values untouched. I also chose the chromosome bar (the default) in order to get a better visual picture.

The results came up in about a minute. Today (Sunday) as I repeat the same process, it is taking several minutes. I assume this is a server issue.

This is what the heading and the first few results look like after I blurred the identifying information for privacy:

The results are the matches for the twenty-two chromosomes - not the X.

You can copy and paste the results into an Excel file where you can manipulate them as you wish and save them for future use. But in this case, I had two specific segments in mind, so I saw no need for anything more than a single screen shot for each of the two relevant chromosomes.
My cousins are marked by the arrows. Kit numbers, names and emails are hidden for privacy.

The table on the left is the segment on chromosome 8 and the one on the right is the segment on chromosome 15. First I looked for people who matched Miriam and my cousins on both segments and I was surprised to find none. I know a few of the names - one is a Pikholz descendant - but nothing jumped out at me as interesting.
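
For anyone who pastes the results into a spreadsheet and would rather not scan the two lists by eye, the same check can be sketched in a few lines of Python. This is only an illustration: the field names, kit labels and positions below are invented and are not GEDmatch's actual column headings or anyone's real data. It finds the kits that overlap each target segment and then looks for kits that appear on both.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    kit: str          # hypothetical kit label, not a real GEDmatch kit number
    chromosome: int
    start: int        # start position of the matching segment
    end: int          # end position of the matching segment


def overlaps(seg, chromosome, start, end):
    """True if seg is on the given chromosome and shares positions with [start, end]."""
    return seg.chromosome == chromosome and seg.start <= end and seg.end >= start


def kits_on_segment(segments, chromosome, start, end):
    """Set of kits with at least one match overlapping the target segment."""
    return {s.kit for s in segments if overlaps(s, chromosome, start, end)}


# Invented rows standing in for one person's pasted results.
rows = [
    Segment("cousin1", 8, 100, 200),
    Segment("kitA", 8, 150, 260),
    Segment("kitB", 8, 300, 400),
    Segment("cousin2", 15, 50, 120),
    Segment("kitB", 15, 60, 110),
]

on_chr8 = kits_on_segment(rows, 8, 100, 200)
on_chr15 = kits_on_segment(rows, 15, 50, 120)
on_both = on_chr8 & on_chr15   # kits that overlap both target segments
print(sorted(on_both))
```

With these made-up rows no kit lands on both segments, which is the same kind of empty result described above.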

I suggested to Miriam that she write to those matches, starting with the ones nearest my cousins on the list and ask if they have any of the ancestral surnames which are relevant for my cousins: Gordon, Kugel and Jaffe, or anything else in the right parts of Lithuania and Belarus. She can also show the charts to the matches to try to determine which of these matches match each other and if any are known family to one another.

The party on chromosome 6 - too many matches
About six months ago, I discussed the matches we have with Steve Turner on chromosome 6 and I decided to have a look at those with this new tool.

I entered Steve's kit number and set the minimum at 8 cM. Fifteen minutes of waiting and I gave up. I raised it to 9 cM and the same thing. At 10 cM, I got results - but of course the only matches were 10 cM or more.

As I have discussed before, we are told we should ignore the smaller matches as they are probably Identical By State (IBS), splinters of DNA from the far distant past, beyond what we call genealogical time. But it seems obvious to me (though not to everyone) that when you have several matches of 10 or 12 or 16 cM and probable family members fall in the same segment with matches of 6 or 8 cM, these are almost certainly relevant.
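
That rule of thumb can be written down as a small Python sketch: keep a small segment only when it overlaps several larger segments on the same chromosome. The cut-offs used here (10 cM counts as "larger", three supporting matches are required) and the sample rows are assumptions chosen for illustration, not thresholds that GEDmatch or anyone else prescribes.

```python
def supported_small_segments(segments, large_cm=10.0, min_support=3):
    """Return small segments that overlap at least min_support larger ones.

    Each segment is a tuple (kit, chromosome, start, end, cm).
    """
    large = [seg for seg in segments if seg[4] >= large_cm]
    kept = []
    for kit, chrom, start, end, cm in segments:
        if cm >= large_cm:
            continue  # only the small segments are being judged
        support = sum(
            1
            for _, l_chrom, l_start, l_end, _ in large
            if l_chrom == chrom and l_start <= end and l_end >= start
        )
        if support >= min_support:
            kept.append((kit, chrom, start, end, cm))
    return kept


# Invented rows: three larger matches in one place, one small match inside
# that cluster, and one small match off on its own.
rows = [
    ("k1", 6, 100, 300, 16.0),
    ("k2", 6, 120, 280, 12.0),
    ("k3", 6, 150, 310, 10.5),
    ("k4", 6, 160, 240, 7.2),
    ("k5", 6, 900, 960, 6.5),
]

kept = supported_small_segments(rows)
print([seg[0] for seg in kept])
```

Here the small segment sitting inside the cluster is kept and the isolated one is dropped.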

So I wanted Steve's matches from at least 8 cM and couldn't get them. (This was Friday.)

I tried to look at Aunt Betty's matches and couldn't get anything below 10 cM there either.

Then I had a look at a few of my people whose other side is not Jewish - people who have fewer than 2000 matches on FTDNA rather than the 3500 or more that the 100%-Jewish descendants have. Those came up with no problem.

So this was obviously an issue of too many matches and the solution looked simple. GEDmatch should allow us to download a person's matches in two or three pieces.

I discussed this with GEDmatch Friday and although they understood my problem, they felt that my solution would create server pressure.

After Shabbes, I found the following message:
I have increased the maximum number of segments to 10,000.  Please let us know if this works better for you.

John Olson
Co-Administrator, GEDmatch.Com
I wasn't sure if the 10,000 match limit was a temporary solution or meant for the long term. Keep in mind, the number of tests is rising all the time, as is the percentage of tests uploaded to GEDmatch, so what works now may not work a few months hence.

I had a look and was immediately pleased by a new screen:

Now I know the system is working on retrieving my data and I'm not just hanging around.

Of course, I had no idea how many segments they had been allowing before, so I did not know what to expect from the new 10,000 limit. I see now that they are at the end of the results.

Aunt Betty's results came up at 8 and even 7 cM within an eminently reasonable two-three minutes.  Aunt Betty had 4174 matched segments with a minimum of 8 cM and 6477 with a minimum of 7 cM.

Steve Turner's did not. GEDmatch was having server problems.

Mark Halpern, guinea pig
One of the earliest non-Pikholz to join our project is my friend Mark Halpern, a veteran researcher with known Skalat ancestors. Mark matches twenty-three known Pikholz descendants plus my two Kwoczka cousins. He matches seven of the nine descendants of my great-grandparents plus several others whom we think are close to us. Eight of his matches with us are suggested third-fifth cousins. Seven of his matches with us are from the Rozdol Pikholz family.

Tier1 looked like a good place to see who else matches in the same segments. It took maybe five minutes to pull down his matches at 8 cM and then quite a while to move it into Excel in six or seven pieces - probably an Excel problem on my end. Of 3771 segment matches, 61 match my families' kits. 

Of those sixty-one segments, one is 20.8 cM and only two more are more than 15 cM. Twenty-four are less than 9 cM. That suggests a huge number of segments less than 8 cM - the IBS splinters from the distant past.

Taking Mark's matches with me as an example, we have a total of 21 matching segments at a total of 88.79 cM, but only three segments over 8 cM totalling 30.1 cM. And of the small ones, only one is more than 4.89 cM. Truly a lot of splinters.
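
The size of those splinters follows from the figures just quoted. The short Python calculation below uses only those four numbers; the variable names are mine.

```python
total_segments = 21
total_cm = 88.79
large_segments = 3      # segments over 8 cM
large_cm = 30.1

small_segments = total_segments - large_segments
small_cm = round(total_cm - large_cm, 2)
average_small = round(small_cm / small_segments, 2)
print(small_segments, small_cm, average_small)
```

That leaves eighteen small segments totalling 58.69 cM, or a little over 3 cM apiece on average.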

The Pikholz who matches Mark the most is Charlie, with five segments and 46.8 cM, followed by my Uncle Bob with three segments and 34.4 cM. In my personal family, Rhoda and Lee have three segments each, as do one other Skalater, two Rozdolers and one of my second cousins on my mother's side.

This is not an impressive set of matches. But there are nine matches which involve more than one of mine together with Mark and it's worth a look to see exactly who and where. However, I think the place to do so is the triangulation tool.

So now that the GEDmatch server seems to be working...
I went back and looked at the Steve Turner matches. There were over two hundred matches on the same segment of chromosome 6. About half of those are between 7 and 9 cM and about half of the rest are below 12 cM. About half the matches are from 23andMe kits, so these are clearly matches I would not see if I worked within FTDNA.

I am not quite ready to draw conclusions about the Matching Segment Search. It is certainly an excellent solution for a directed search. We'll see what else.

Housekeeping notes
Well, not strictly housekeeping, but some clean-up from last week.

I wrote last week about an apparent connection on the X chromosome between a second cousin (Rhoda) on my grandfather's side and my grandmother's Hungarian side. This was due to the fact that Rhoda and my two sisters matched Aunt Betty on the same segment of the X chromosome.

A closer look solved that. Aunt Betty, like all women, has two X chromosomes. Rhoda matches her on one, her father's, and my sisters match her on the other, her mother's. I know this because Rhoda does not match my sisters on that segment.

The match that Aunt Betty and Rhoda share comes from my great-grandmother's Kwoczka side.
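
The X chromosome reasoning above boils down to a simple test that can be sketched in Python. Two people both match a third on the same segment; if they also match each other there, the shared DNA is probably on the same copy of the chromosome, and if they do not, it must be on different copies. The function name and labels are hypothetical, and the sketch assumes all the matches are genuine.

```python
def copies_shared(matches, hub, first, second):
    """Classify how two people relate to a third person on one segment.

    matches is a set of frozenset pairs of people who match on that segment.
    """
    def match(x, y):
        return frozenset((x, y)) in matches

    if not (match(hub, first) and match(hub, second)):
        return "not applicable"   # both must match the hub person first
    if match(first, second):
        return "likely same copy"
    return "different copies"


# The situation described above: Rhoda and my sisters each match Aunt Betty
# on the segment, but Rhoda does not match my sisters there.
x_matches = {
    frozenset(("Aunt Betty", "Rhoda")),
    frozenset(("Aunt Betty", "sisters")),
}

verdict = copies_shared(x_matches, "Aunt Betty", "Rhoda", "sisters")
print(verdict)
```

For the case described here the answer is "different copies", one from Aunt Betty's father and one from her mother.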


  1. So, I am a Pickholz imposter .... or what. So confusing

    1. I like the idea of splinters. Fits well with the "holtz" part.