Sunday, June 7, 2015

Improved Strategies with GEDmatch

It's been not quite two years since I first began uploading raw autosomal and X-chromosome data to GEDmatch. Because of the nature of my research, I really want all the kits I manage to show up together when sorted alphabetically on a "one-to-many" comparison. For that reason, I assigned each of my kits an alias that begins with "Pikholz" followed by initials or a nickname. My own kit was called "Pikholz - IP." Aunt Betty's was "Pikholz - AB" and Gary's was "Pikholz - GZP."

For some reason, GEDmatch treats all these as though they have an asterisk in front of them (i.e., "*Pikholz - IP"). In an alphabetical sort, the names with an asterisk come before those without, and that's fine with me.

When a few of my mother's Gordon family began testing, I added a "G" after "Pikholz" to help both with the identification and the sorting.

Some months ago, I decided that I wanted the Rozdol Pikholz descendants (there are twelve of these now) to sort together, so I added "Roz" to their aliases. Gary is now "*Pikholz - Roz - GZP."

Recently, I made two other changes. I have nearly fifty Skalat kits and it was getting cumbersome. First of all, I added "Sk" to all the Skalat Pikholz descendants - and further added coding for descendants of my great-grandfather Hersch Pikholz and for descendants of Peretz and Nachman Pikholz. I became "*Pikholz - SkH - IP" and others begin with "*Pikholz - SkP -" and "*Pikholz - SkN." Other Skalaters begin with "*Pikholz - Sk -."

I did one other thing. All my kits now begin with the number "1." My alias is now "*1Pikholz - SkH - I." I did this of course because I wanted my kits to sort near the top. But it isn't just an issue of convenience.

GEDmatch processes all the data but only shows the first 1500 results. When you want the results sorted to show your closest matches, 1500 is plenty, even as the number of kits in the system has grown. But for those of us who sort alphabetically, that's not good enough because often our matches will not make the cut. For a while I have been raising the threshold to 8 cM (the default is 7 cM) in order to reduce my matches, but often that is not enough and frequently I have to raise the threshold to 9 cM to reduce the number of matches displayed even further.
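The interaction between the threshold and the 1500-result cap can be illustrated with a small Python sketch. It is a toy model, not a description of GEDmatch's internals: the function name `displayed`, the made-up match list, and above all the assumption that the cap simply truncates the list in whatever order the site produces it are hypothetical.

```python
DISPLAY_CAP = 1500  # number of results shown, per the post


def displayed(matches, threshold_cm, cap=DISPLAY_CAP):
    """Keep matches at or above the threshold, then truncate to the cap.

    matches: list of (alias, largest_segment_cm) tuples, in whatever order
    the site happens to produce them (assumption: not alphabetical).
    """
    kept = [m for m in matches if m[1] >= threshold_cm]
    return kept[:cap]


# Toy data: 2400 unrelated kits (800 each at 7, 8 and 9 cM),
# followed by one family kit at 9.5 cM.
matches = [(f"Stranger {i:04d}", 7.0 + (i % 3)) for i in range(2400)]
matches.append(("*Pikholz - AB", 9.5))

for threshold in (7.0, 8.0, 9.0):
    shown = displayed(matches, threshold)
    found = any(alias == "*Pikholz - AB" for alias, _ in shown)
    print(threshold, len(shown), found)
```

In this toy data the family kit is cut at 7 cM and at 8 cM, because more than 1500 kits survive the threshold, and only appears at 9 cM, when the surviving list is short enough to be shown in full.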

Sorting by email doesn't help because the email that represents all my kits begins with "israelp@," which comes out somewhere in the middle. I could change my email to something beginning with "ZZZisraelP@" and sort in reverse, but that seemed like a lot of trouble.

So the aliases of all my kits now begin with "*1Pikholz" and I can go back to the default threshold of 7 cM. Eventually, enough other people will figure this out and perhaps I'll have to change them to "*00Pikholz," but for now this will do.
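Why a leading "1" or "00" moves an alias up the list can be seen with a plain character-by-character sort. The sketch below uses Python's default string ordering (by character code), which is only an assumption about how GEDmatch sorts aliases; the helper name `sort_aliases` and the non-Pikholz names are made up for illustration.

```python
def sort_aliases(aliases):
    """Sort aliases by character code, as Python's sorted() does for str."""
    return sorted(aliases)


aliases = [
    "Zimmer",
    "*Pikholz - Roz - GZP",
    "Adams",
    "*1Pikholz - Roz - GZP",
    "*Adams",
    "*00Pikholz - Roz - GZP",
]

for alias in sort_aliases(aliases):
    print(alias)
```

Under this ordering the asterisk sorts ahead of digits and digits sort ahead of letters, so "*00Pikholz" comes before "*1Pikholz," which comes before any "*A..." alias, and all of them come before names without an asterisk.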


  1. I am looking for suggestions on finding my biological father (I was adopted and have only my mother's side documented). My ancestry shows as 47% Ashkenazi Jewish. You and I match on Gedmatch within 3.7 generations. I match with 1-Pikholz. What should I do to find my father? I don't have his name. Thank you!

    1. We'll have this conversation by email.

  2. A cousin of mine who's researching our family tree has told me that you and I are related - "cousins." He has just discovered through a DNA test on my uncle that my maternal grandfather was 100% Ashkenazi Jewish. This is of great interest to me since I am a convert to Judaism. My grandfather, Alfred Kussel, emigrated from Berlin to the US in about 1910 and married Anne Waldron. He revealed very little of his past but I do know that his mother's maiden name was Bernhardt. I have heard that he had 2 sisters who perished in the Holocaust and a brother who died, but I don't know under what circumstances. If you can put this together at all, I would be glad to hear from you -

    1. I don't see anything obvious but I also do not know why your cousin thinks we are related.
      If you want to have this discussion, email is the way.

  3. Israel, I'm not sure I understood everything you are saying about sorting. I understand increasing minimum cM to get back further but it also sounds like you are getting it to produce a different list of 1500 by sorting on kit names. How are you telling it to do that? I thought you always get the top 1500 based on the estimated generations to MRCA.