The following is an essay that was originally submitted to the journal American Renaissance for publication. They declined to run it; the reason given was that it was too similar to other race-genetic essays published there in the past. Nevertheless, I believe the findings discussed here to be of extreme importance. Thus, I am reproducing the proposed essay here, with minor revisions to remove it from its previous “Amren” context.


Racial Genetic Similarity and Difference: A New Study

One scientific topic that I have previously discussed is the biological validity of the race concept. This, unfortunately, has become necessary, because some people, perhaps with political motivations, assert, contrary to the evidence, that “race does not exist” and that race is a “social construct” with “no biological foundation.” These views have been effectively refuted in various forums, and more objective researchers support the race concept as well, if for no other reason than the important medical implications of racial differences.

One popular and misinterpreted finding that has been eagerly grasped at by those who preach that “race is not real” is derived from the work of Richard Lewontin, which demonstrated that more genetic variation exists within groups than between them. I have previously explained how Lewontin’s finding in no way discredits the race concept. However, there are “anti-racist” activists who still claim, based on their misinterpretations of population genetics, that individual Europeans (“whites”) can be more genetically similar to sub-Saharan Africans (“blacks”) than to other Europeans. Until now, there has been no formal proof that this assertion is incorrect. I am now pleased to say that a recent scientific paper has delved into this very topic and that the findings of this paper clearly demonstrate that the race deniers are wrong. First, let me give a brief introduction for the sake of clarity.

A number of scientific studies have shown that it is possible to genetically cluster individuals to their self-identified race with near 100% accuracy. Further, racial categories can be determined by the genetic data even without any a priori information about the groups involved. In other words, racial groups can be empirically observed through genetic analysis without any prior assumptions about these groups by the researchers.

However, does that imply that individual members of these races will always be more genetically similar to members of their own racial group compared to members of other groups? Or, are genetic clustering and individual genetic similarity so different that this may not be always so? Can individuals share more genetic similarity to members of other groups rather than to members of their own group, even if everyone is properly clustered with their self-identified race? In other words, can there be significant genetic overlap between individuals on the fringes of, say, the European and African clusters?

These are the questions asked, and answered, in the paper Genetic Similarities Within and Between Human Populations by Witherspoon et al., Genetics 176, pgs. 351-359, 2007. I will simplify the authors’ statements and analogies so as to make the work more understandable to the broad readership; although this may mean that certain detailed specifics are glossed over, the main “take home” points and essential interpretations remain intact. And, since the paper is available online at no cost, any reader interested in delving into the scientific details can do so at their leisure.

The authors introduced the metric “w”, which they defined as:

… the frequency with which a pair of individuals from different populations is genetically more similar than a pair from the same population.

In other words, what is being determined with “w” is the frequency with which, for example, individual whites and individual blacks may be more similar to each other than to members of their own race. This measurement, which is based upon gene by gene comparisons between individuals, is different from the two clustering measurements that the authors compare to “w.” Unlike “w”, the clustering measurements incorporate population-level genetic information, and thus consider the “aggregate” qualities of the population’s genetic information. To put it simply, and bypassing many details, “w” compares individuals to each other, while clustering is, essentially, comparisons of individuals to the “genetic average” (or “centroid”) of different populations. By crude analogy, we can consider physical traits. “W” would be analogous to how similar two individuals are to each other in height, weight, eye color, skin color, hair color, facial features, etc. Clustering, in contrast, is more analogous to how similar each individual is to the average measurements of height, weight, eye color, etc. for any group. Thus “w” can tell us how similar individuals are to each other, while clustering tells us whether an individual is more similar to one group or another. Clustering allows us to “bin” (or “cluster”) individuals as belonging to one group or another.
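The definition quoted above can be sketched in a few lines of Python. This is only a minimal illustration, not the authors' estimator: the function names `dissimilarity` and `omega`, the 0/1 coding of markers, and the four toy individuals are all assumptions made for the example.

```python
from itertools import combinations

def dissimilarity(a, b):
    """Count the markers at which two individuals differ."""
    return sum(x != y for x, y in zip(a, b))

def omega(genotypes, labels):
    """Fraction of (between-group pair, within-group pair) comparisons in
    which the between-group pair is strictly more similar."""
    within, between = [], []
    for i, j in combinations(range(len(genotypes)), 2):
        d = dissimilarity(genotypes[i], genotypes[j])
        (within if labels[i] == labels[j] else between).append(d)
    closer = sum(b < w for b in between for w in within)
    return closer / (len(between) * len(within))

# Hypothetical toy data: four individuals, six markers coded 0/1.
genotypes = [
    [0, 0, 0, 0, 0, 0],  # group A
    [0, 0, 0, 1, 1, 1],  # group A
    [0, 0, 1, 1, 1, 1],  # group B
    [1, 1, 1, 1, 1, 1],  # group B
]
labels = ["A", "A", "B", "B"]

print(omega(genotypes, labels))  # 0.25
```

Here the second and third individuals belong to different groups but differ at only one marker, so that cross-group pair is closer than either same-group pair. It accounts for two of the eight comparisons, giving 0.25.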

Is it possible for individuals from different groups to be more genetically similar to each other than to members of their own group? More importantly, can this occur even if all of these individuals are correctly “binned” by genetic cluster analysis to their correct racial group? In other words, is it possible to correctly cluster everyone to their self-identified race, even though members of different groups are more similar to each other than to some members of their own group? In theory, yes, and the authors provide an example of how this may occur. For the sake of understanding, I will simplify their explanation and calculations. Assume that the measurement “q” represents the averaged gene frequencies for groups or for individuals. The African genetic average (or “centroid”) of “q” may be 0.46; the European “q”, 0.61. This “q” measures the average frequency of different gene types at various parts of the genome. Assume three individuals, two Africans and one European, with their own individual “q” measurements of 0.4, 0.52, and 0.55 respectively. Consider the African with q = 0.52. He is closer to the African average of 0.46 than to the European average of 0.61. Thus, he clusters with Africans; in fact all three individuals would cluster with their identified group. Yet, at the individual level, the African at 0.52 is closer to the European’s 0.55 value than to the other African’s 0.4 value. Thus, it would seem that individual racial overlap can be possible even though clustering is absolutely correct. Does this actually occur in reality?
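The numbers in this example can be checked directly. The sketch below uses the values given above (centroids of 0.46 and 0.61; individuals at 0.40, 0.52, and 0.55); the single-number “q” and the helper name `nearest_centroid` are simplifications for illustration, not the paper's actual procedure.

```python
# Group averages ("centroids") of q, as given in the text.
centroids = {"African": 0.46, "European": 0.61}

# (self-identified group, individual q), as given in the text.
individuals = [
    ("African", 0.40),
    ("African", 0.52),
    ("European", 0.55),
]

def nearest_centroid(q):
    """Assign an individual to the group whose average q is closest."""
    return min(centroids, key=lambda group: abs(q - centroids[group]))

# Clustering: compare each individual to the group averages.
assigned = [nearest_centroid(q) for _, q in individuals]
print(assigned)  # ['African', 'African', 'European']

# Individual similarity: compare the African at 0.52 to the other two.
cross_group = abs(0.52 - 0.55)  # distance to the European, about 0.03
same_group = abs(0.52 - 0.40)   # distance to the other African, about 0.12
print(cross_group < same_group)  # True
```

All three individuals are assigned to their own group, yet the individual at 0.52 is closer to the individual from the other group than to the member of his own.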

Bamshad et al. (“Deconstructing The Relationship Between Genetics and Race,” Nat. Rev. Genet. 5, 598-609, 2004), using 377 DNA markers in 1,056 individuals, found that in 38% of the cases, individual Europeans were more similar to individual Asians than to other Europeans. So it would seem that significant genetic overlap across broad racial lines exists, even if everyone is correctly binned to their own racial group. But, is this really true? Will that hold true when more markers are used?

These are the questions that the Witherspoon et al. paper attempted to address. What were their basic findings? The authors first examined the amount of genetic overlap between individual Europeans and sub-Saharan Africans using 175 markers, comparing the “w” metric with two measurements of clustering. Since clustering is a less stringent measurement than is genetic similarity (“w”), it is not surprising that, with a given number of genetic markers, there is less overlap with clustering than with “w.” For example, in the case of Africans vs. Europeans and 175 markers, the two measures of clustering gave overlaps of 4.9% and 1.9%; in contrast, the “w” measure of genetic similarity gave an overlap of 23%. This “w” means that, given these 175 markers, nearly one quarter of the time an individual European will be genetically more similar to an African than to another European. This tracks fairly well with the findings of Bamshad, discussed above. At the same time, 175 markers were sufficient to yield clustering at an accuracy of ~95-98%. Thus, given a moderate number of markers, accurate racial clustering of individuals may not coincide with individual members of a group always being more similar to members of their own group than to individuals of another group. Are the racial liberals then correct? Is it possible for a Dane to be more similar, genetically, to a Nigerian than to a fellow Dane, even if this happens less than ¼ of the time? The answer is, simply put, no. This genetic overlap between individuals from the major racial groups is an artifact of not using sufficient numbers of markers.

As the authors used more and more markers to compare the three major racial groups (Europeans, East Asians, and sub-Saharan Africans), the less stringent clustering measurements rapidly fell to a 0% overlap, as expected from previous studies. What about the more stringent measurement “w”, which looks at comparisons between individuals, and does not consider group data? Once the authors reached 1,000 (or more) markers, the genetic overlap between these groups essentially reached zero. It is useful at this point to quote the authors about this fundamentally important finding: “This implies that, when enough loci are considered, individuals from these population groups will always be genetically more similar to members of their own group.” With respect to the question of whether individual members of one group may be genetically more similar to members of another group, they write:
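The dependence of “w” on the number of markers can be illustrated with a small simulation. This is a toy model only, not a reanalysis of the paper's data: it assumes two groups, independent markers coded 0/1, and the same hypothetical frequency difference (0.4 vs. 0.6) at every marker. The function name `simulate_omega` and all of its parameters are invented for the example.

```python
import random

def simulate_omega(n_loci, n_per_group=15, freq_a=0.4, freq_b=0.6, seed=0):
    """Estimate "w" for two simulated groups typed at n_loci markers."""
    rng = random.Random(seed)

    def draw(freq):
        # Each individual is a list of 0/1 markers drawn independently.
        return [[1 if rng.random() < freq else 0 for _ in range(n_loci)]
                for _ in range(n_per_group)]

    def dist(x, y):
        return sum(p != q for p, q in zip(x, y))

    group_a, group_b = draw(freq_a), draw(freq_b)
    within = [dist(g[i], g[j])
              for g in (group_a, group_b)
              for i in range(n_per_group)
              for j in range(i + 1, n_per_group)]
    between = [dist(x, y) for x in group_a for y in group_b]
    # Fraction of comparisons in which a cross-group pair is strictly
    # more similar than a same-group pair.
    closer = sum(b < w for b in between for w in within)
    return closer / (len(between) * len(within))

for n_loci in (10, 100, 1000):
    print(n_loci, round(simulate_omega(n_loci), 3))
```

Under these assumptions the estimate shrinks as markers are added, because the small per-marker difference accumulates faster than the sampling noise. How quickly it shrinks depends entirely on the assumed frequency difference, so the printed values should not be read as estimates for any real populations.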

However, if genetic similarity is measured over many thousands of loci, the answer becomes “never” when individuals are sampled from geographically separated populations.

Thus, the naïve “anti-racist” view, actually stated by some people, that it is possible for individual Europeans and Africans to be more genetically similar to each other than to members of their own race, is simply false. Any such “finding” is simply due to insufficient numbers of DNA markers being used. With an adequate methodology, individual members of the major racial groups will always be more similar to members of their own group than to members of other groups. Some may not like this, and deem it “racist,” but these are the scientific facts, nonetheless.

For whatever reason, the authors were not satisfied with ending their study with these findings and decided to repeat their data analysis incorporating populations they term “intermediate” or “admixed.” These included New Guineans, South Asians, Native Americans, African Americans and “Hispano-Latino” groups. Not unexpectedly, it became somewhat more difficult to distinguish between groups, with a given number of markers, when these additional “intermediate/admixed” populations were added. Even with more than 10,000 markers, the “w” measurement and the clustering measurements never quite reached zero with respect to overlap, although the numbers were low. For example the authors state that with 1,000 or more markers the “w” measurement reached a value of 3.1%, meaning that even with the intermediate/admixed populations, genetic overlap was at a frequency of less than 5%.

Do these latter findings mean that there will always be genetic overlap between members of more closely related groups, especially when so-called “intermediate” and “admixed” populations are considered? Although some people may fervently wish that 100% accurate classification will remain impossible, except for the most widely divergent groups, this may well not be the case. We are entering an era in which reasonably affordable whole genome sequencing will be possible, and with the proper methodologies, it will be possible to compare a number of markers considerably larger than what is used in the current paper. While 10,000 markers may not be sufficient to eliminate overlap between all groups completely – although it does reduce the overlap to very low levels – it is possible that larger numbers of markers, or even whole genome comparisons, could do so. With more data, it may well be possible to distinguish, with near 100% accuracy, between groups that still demonstrate a low level of “w” with current data.

Then we must consider the issue of genetic structure, not directly addressed in this study. Although structure can include such genetic phenomena as inversions, deletions, and copy number variation, the major component of genetic structure is the co-inheritance of specific genes. In other words, we must consider not only the frequencies of each gene taken in turn, but the frequencies of specific genes together. For example, there are genes that code for eye color, skin color, hair color, etc. One can examine the frequency of each gene on a one-by-one basis in an individual (or group) and do all the pairwise comparisons to another individual (or group) and determine “w.” But what are the frequencies of particular combinations of gene types inherited together? For example, what is the frequency of having genes for blue eyes and blonde hair and fair skin, etc. co-inherited, rather than measuring the frequencies of each of these genes in turn and averaging the results? Genetic structure superimposes further genetic differences on top of one-by-one consideration of genes; therefore, differences between groups are going to be larger when structure is considered compared to when only frequency differences of individual genes are measured and averaged.

To further explain the difference between genetic similarity and genetic structure, I present an analogy using colored marbles. Assume that individuals of different races each have a set of marbles, numbered from one to 100, with the marbles being of various colors. Genetic similarity (the basis of the “w” metric) would be analogous to comparing the marbles of two individuals one-by-one; first comparing the color of marble #1, then #2, then #3, and so forth, on an individual basis and then counting the total number of matches. Genetic structure, on the other hand, would be analogous to asking if the two individuals have similar, or even identical, combinations of colors for specific marbles. For example, person A may have red marbles for #1, 6, and 15; blue marbles for # 3, 10, 33, and 95; green marbles for # 7, 8, 22, and 84, and a yellow marble for # 38. If this particular, specific combination of colored marbles is of importance, we can then ask if person B has a similar combination. What is important here is not the one-by-one counting of matches, but whether the whole pattern is replicated, or almost replicated, between two individuals (or groups).
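The two questions in the marble analogy can be written out as a short sketch. The pattern positions and colors are the ones listed above; the filler color "white", the three example persons, and the function names are hypothetical and exist only to make the example concrete.

```python
# The specific combination from the text: marble number -> color.
pattern = {1: "red", 6: "red", 15: "red",
           3: "blue", 10: "blue", 33: "blue", 95: "blue",
           7: "green", 8: "green", 22: "green", 84: "green",
           38: "yellow"}

# Person A: hypothetical filler color everywhere, plus the full pattern.
person_a = {n: "white" for n in range(1, 101)}
person_a.update(pattern)

# Person B: identical to A except at marble #6.
person_b = dict(person_a)
person_b[6] = "blue"

# Person C: filler color at every position, none of the pattern.
person_c = {n: "white" for n in range(1, 101)}

def matches(x, y):
    """One-by-one comparison: count positions with the same color."""
    return sum(x[n] == y[n] for n in x)

def pattern_share(person, pattern):
    """Count how many positions of the specific pattern the person carries."""
    return sum(person[n] == color for n, color in pattern.items())

print(matches(person_a, person_b), pattern_share(person_b, pattern))  # 99 11
print(matches(person_a, person_c), pattern_share(person_c, pattern))  # 88 0
```

The one-by-one count and the pattern check answer different questions: C still matches A at 88 of 100 marbles while carrying none of the twelve pattern positions, whereas B almost replicates the pattern.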

What about the relation between genetic ancestry and individual phenotype? The authors do state that: “Thus it may be possible to infer something about an individual’s phenotype from knowledge of his or her ancestry.” However, since phenotypic traits are coded for by a number of genes smaller than that required to yield low genetic overlap, the authors assert that there may be significant phenotypic overlap between people of different groups. They give an example of a trait “determined by 12…loci,” which would yield a 36% overlap of phenotypes between individuals of different groups. Yet, racial groups show markedly different phenotypes. How is this so, if what the authors state is true? There are two points that the authors neglect to emphasize. First, many phenotypic traits, including racially relevant ones, have been selected for because of their adaptive value, or the populations commonly exhibiting these traits have been subject to genetic drift isolated from other populations. Thus, it is not reasonable to assume that genes that code for phenotype are going to have the same “worldwide distributions” as markers used in this study. For example, gene alleles coding for skin color show markedly higher frequency differences between populations than do the neutral markers used in population genetics. A second point is that racial phenotypes are the result of genetic structure, of many types of traits co-inherited together, and it is the sum total of all these differences that allow for racial distinction at the phenotypic level. Looking at individual phenotypic traits, just like looking at individual gene frequencies, is going to provide a markedly incomplete picture of human racial variation.

How do the findings of the paper relate to the subject of Frank Salter’s concept of “ethnic genetic interests”? This paper strongly supports the concept, which is dependent upon genetic differences between peoples. After all, there is essentially zero genetic overlap between individual members of the major racial groups; a member of one of these groups is always going to be more similar to a member of their own group than to that of another. Multiplied over the large numbers of people that constitute racial groups, this yields a very substantial genetic interest. Even if we take at face value this paper’s findings concerning the intermediate/admixed populations, the ethnic genetic interest concept holds as well. In the vast majority of cases, an individual will be more similar to members of their own group; overlap, while not zero, is low. When one multiplies these differences over the large numbers of people involved, then there are very large and crucial differences of genetic interests regardless of which populations are considered.

But that is not all. First, consider that with sufficient numbers of genes assayed, the small degree of overlap observed with the intermediate/admixed groups may disappear; it would almost certainly disappear if genetic structure is considered. Second, and perhaps most important, the ethnic genetic interest concept is not based on overall genetic similarity/difference, but rather on differences in frequencies of distinctive genes, above and beyond random gene sharing. After all, those genes that do not differ in frequency between groups do not contribute to differences in genetic interests, because their frequency stays unchanged regardless of the outcome of competition. Even if an entire racial group were to die out, the frequency of these “shared genes” would remain unchanged. Note that measurements of overall genetic similarity, such as “w,” will as a matter of course also include genes that do not differ in frequency between groups. Therefore, even when “w” shows a low degree of overlap, there may well be no overlap at all with respect to those genes that are distinctive, that vary in frequency between populations.

To further explain the importance of distinctive genes vs. “w,” I will go back to my colored marbles analogy. Imagine that the distribution of colors for marbles 1-80 was completely random, but the colors for marbles 81-100 were specific to a person’s race. Overall similarity in marble color (analogous to “w”) would consider all 100 marbles. However, if we were to ask how the color frequencies of the marbles were to change if people of one race were completely removed from the example, we would observe that only marbles 81-100 would be affected. For marbles 1-80, since the color distribution is completely random with respect to race, it doesn’t matter if one race or another is eliminated from this marble counting exercise. Only the “population-distinctive marbles” are at issue here. Likewise, when considering competition and conflicting genetic interests between human groups, the gene frequencies that really matter are those that exhibit differences in frequency between the groups, not those that are randomly distributed between the groups.

Thus, while this new paper strongly supports the concept of ethnic genetic interests, we need to remember that ethnic genetic interests is a more stringent and specific concept than simply measuring the degree of genetic similarity. If we are not careful, we may otherwise conclude that a group of mice constitutes a greater genetic interest for a person than does another person, since the group of mice would contain more copies of the person’s gene sequences than would another single person (by some measurements, mice and humans are ~90% genetically similar)! But this is not the case; it is precisely those gene frequencies that are distinctive between humans and mice (as well as differences in genetic structure between the two species) that determine genetic interests when comparing these species, not the overall genetic similarity, and not counting the numbers of gene sequences held in common.

In summary, this is a crucially important paper that demonstrates that individual members of the major racial groups will always be more genetically similar to members of their own group than to individuals of the other major races. The paper demonstrates the importance of using sufficient numbers of markers in these studies, and the findings also underscore the differences between the concepts of clustering (“binning”) of individuals into groups vs. measurements of the genetic similarity between individual members of these groups (“w”). Although the inclusion of “intermediate” and “admixed” populations prevented the genetic overlap of cross-racial individuals from reaching zero, with a sufficient number of markers the overlap was at a very low level. Further, it is quite possible that when utilizing a greater number of markers, or even a whole genome analysis, this genetic overlap may vanish completely. In addition, another important point to consider when evaluating this (and any other) genetic study is that genetic structure is an important part of human genetic variation that has not yet been carefully examined, but which will likely amplify the differences in genetic variation between human population groups. When considering the totality of genetic structure, individual overlap between racial population groups, including “intermediate” and “admixed” groups, will almost certainly be nil. Finally, the data from this paper support Frank Salter’s conception of ethnic genetic interests, although we must remember that genetic interests are properly thought of as derived from differences in the frequencies of distinctive genes, rather than counting total copies of genes shared in common. In the final analysis, the primary findings of this paper are a devastating blow to politically motivated assertions of “no genetic differences between human races.”

JW Holliday



Posted by rocket on Tue, 04 Mar 2008 01:40 | #

JW, being a layman in this area of study (actually more like a novice), could you explain to me what you agree with in the work of Frank Salter, and what you disagree with, in a nutshell. Thanks.


Posted by lothar on Tue, 04 Mar 2008 18:24 | #

... everyone is properly clustered with their self-identified race?

... any reader interested in delving into the scientific details can do so at their leisure.
... is it possible to correctly cluster everyone to their self-identified race

... everyone is correctly binned to their own racial group

... a member of one of these groups is always going to be more similar to a member of their own group

... an individual will be more similar to members of their own group

Mr. Holliday, remember the principle of agreement in number?
“everyone” is singular,  “their” is plural.
“any reader” is singular,  “their” is plural.
“a member” is singular,  “their” is plural.
“an individual” is singular,  “their” is plural.
please attend to your English!


Posted by JWH on Wed, 05 Mar 2008 00:33 | #

I essentially agree with Salter on all the basics.  My disagreements center mostly on the fact that he does not, in my opinion, go far enough in his estimation of genetic interests.  For example, he focuses on distinctive functional genes, while I see all distinctive genetic information as contributing to genetic interests.  That doesn’t mean that all base-pairs are of equal value, but that it’s not correct to divide the distinctive genome into “interests” and “non interests.”  I see instead a continuum of interest with functional genes being of greatest interest, but distinctive “non-functional” gene sequences - which carry information on kinship - also being of significant, albeit lesser, value (since functional genes can carry kinship information as well, and are thus more “information rich”).

Information is fundamental - a topic for future analysis, most likely.

Salter has also ignored genetic structure - a topic that has been discussed here previously.

Finally, on a more “proximate” level, I disagree that “democracy” is an optimal political form for the expression of EGI.

Glad to see you extracted from the essay the most important points.  Thank you for those thought-provoking insights.


Posted by rocket on Wed, 05 Mar 2008 23:57 | #

JWH, thank you for your response. It seems the field of genetics has as much controversy and nuance as the study of theology, where some of us toil.


Posted by lothar on Wed, 12 Mar 2008 16:16 | #

Apparently my comments were not nearly as thought-provoking to you as they should have been.

If you think the trashing of the English language is an indifferent matter, you are badly mistaken.

If you think that you can effectively combat the enemies of scientific enquiry or the enemies of White racial survival and at the same time use corrupted, politically-correct language in your writing, you are again badly mistaken.

Edward Abbey said, “Freedom begins between the ears.”

You can not expect to be an effective fighter for anything you believe in if the enemy has already an outpost in your mind.

And yes, your article was useful - except for the linguistic atrocities - and I appreciated your contribution.

