Salter and Sallis and Harpending Oh My!
Against Lewontin; some points on genetic variation. In all cases, emphasis is added.
I have previously written about the “more genetic variation within than between” scam of the race deniers, the scam building on the infamous dictum of Lewontin.
I went back to read an older Salter piece I had previously reviewed, and realized that there is material there, particularly from Harpending, that approaches the question in another manner.
First, I will provide some relevant excerpts from my piece, but I suggest you read the whole thing (linked above), including the Fst calculations.
You hear these people make the most bizarre claims – that “more variation within than between” means that “Whites are genetically more similar to Blacks than they are to other Whites” – comments that reflect a complete misunderstanding of the concept (to be fair, those “academics” who have for decades championed Lewontinism to the rubes have, in my opinion, intentionally attempted to promote such a misunderstanding for political reasons).
You see, the basic problem is that these people think there is something special – in the negative sense – about classifying people by race (or ethnicity) that creates the Lewontin finding. Because there is more genetic variation within “races” – for example, more variation within Whites than between Whites and Blacks – they think that means that if you were to compare one random group of Whites to another similar group of Whites then there would be more genetic variation between those groups of Whites than within those same groups (ignore the gaps of logic in this implicit, or sometimes overt, leftist “argument”). In other words, they say or imply something like this:
Race is such a bad way to divide people, it is so wrong and meaningless, that WHEN you divide people by race THEN you get the result that there is more genetic variation within groups than between them. [Implication: this difference in the apportionment of variation occurs as a result of binning people by race]. If we were to bin people randomly, arbitrarily, or by how “closely related they are independent of race” (whatever that means), then there would be more variation between than within groups, but when we use this stupid artificial racial boundary we see more variation within. Indeed, the fact that binning people by race creates a situation that genetic variation is greater within the group proves that race is an invalid concept – how can a grouping that creates “more genetic variation within groups” be better than random groupings or aracial groupings that do not (we assume) do so?
You see, this is the implied message. Race (and ethnicity) are negatively “privileged” groupings that create the Lewontin “finding” – after all, that’s how he reported it, and after all, that’s how it’s been discussed for decades, through the lens of racial classification.
My argument has been that this is a complete misunderstanding.
With respect to Lewontin’s well known “there is more genetic variation within groups than between groups” we need to clarify whether the 85:15 split has any meaning other than the fact that the bulk of human genetic variation is randomly distributed.
Comparing Danes vs. Nigerians: 85% variation within each group and 15% between. The same would be observed with Japanese vs. Iranians.
What if you considered a mixed group of Danes + Nigerians as a single population, and the same for Japanese + Iranians? If you then apportioned genetic variation between D+N vs. J+I you would still get more variation within than between.
If you went in the opposite direction, and considered Japanese from Tokyo as one population and Japanese from Kyoto as another population, the same within/between distinction would hold. If you compared one Japanese family to another, you would also see more genetic variation within the group (family) than between families.
As has been pointed out previously by others, a significant amount of genetic variation is found within single individuals; thus, if you were to compare one Japanese individual to another, roughly half the genetic variation would be found within the single individual.
For any set of human groups, one would expect to find more genetic variation within the group than between groups.
Hence, the “within group” component of genetic variation is found within any defined set of individuals, and is randomly distributed among individuals. It cannot be used to assert that members of an ethny are more dissimilar to each other than to members of other ethnies, nor can it be used as a legitimate argument against the reality of genetically distinct population groups.
And this doesn’t even touch upon the fact that with respect to many phenotypically relevant traits under selective pressure, racial differences in allele frequency are so great that there is actually greater genetic variation between compared to within groups.
Thus, most genetic variation is randomly distributed among individuals irrespective of classification. It has nothing to do with race (or ethnicity). Racial classifications are not – as the leftists slyly imply – in any way special in exhibiting more variation within than between. ALL and ANY human groups – even random, arbitrary groupings of people from within the same race or ethnic group – will show the same pattern of more variation within than between. You can mix up groups of different races and get the same result. You can create any arbitrary groups of individuals, in endless combination, and no matter how you do it, you will always get more variation within than between.
I doubt Lewontin and all the other academics who have foisted his “finding” on the masses were/are so stupid as to not realize this. They must understand that any and all human groupings, no matter how random or absurd, will show the same pattern. Then, I suspect, knowing this, they decided to specifically choose racial classification as an example in order to trick people to believe that race is invalid, and do so for political reasons.
In actuality, the reality is the opposite: the genetic variation argument actually supports race, since the portion of genetic variation that is between groups is greatest when you bin people based on this concrete biological concept, and the between group variation portion is smaller (or in some cases virtually non-existent) when you bin people by random, or other arbitrary, methods. Dividing Whites from Blacks is when you get the greatest amount of variation between, NOT dividing Whites from other Whites. There was never reason to expect that human genetic differentiation was so extreme that the differences in genetic variation between groups would be greater than the unstructured variation found within groups. If that were so, we would be totally different species, rather than variations (no pun intended) of one species. One could continue playing around with genetic data in this manner, with larger data sets, random number generators to form groups, etc., but the point has already been established. Thus, you can pick names randomly out of any diverse big city phonebook – New York for example – and use these random people to form groups, and if you analyze the genetic variation of these random and arbitrary aracial groupings you will find more variation within than between AND a smaller Fst compared to real inter-racial comparisons.
Now, it can be – and should be – argued that the arguments and findings in this blog post are simple, common-sense, intuitive, even trivial. OF COURSE random groups would have even more genetic variation within and OF COURSE racial groups will have a larger Fst, indicative of a larger share of variation between. Of course races are real biological groups and of course the Left is wrong. But given leftist hysteria and mendacity over race and genetics, the issue had to be formally demonstrated, which it was here. It is unfortunate one must waste time “proving” things so obvious it is the equivalent of “the sky is blue” but so it goes in the modern world.
In summary, my point – supported by actual calculations of SNP data – is that ANY human grouping (random, multiracial, whatever you may choose) is going to show “more variation within than between” with respect to genetics. The Lewontin meme is completely irrelevant for the biological reality of race.
Now, an excerpt from Salter’s piece:
In fact, intra-family variation is about three times inter-family variation. Fully half of the variation within a population exists within any randomly chosen individual (Harpending, Appendix; Pääbo 2003). Should we then conclude about the family what Lewontin concludes about race, that it is of ‘no social value and is positively destructive of social and human relations’?
Pääbo (*) actually said that 30% of the total human genetic variation in haplotype blocks is found within single individuals. So, fine, let’s put the single individual genetic variation as between, generally speaking, 30-50% of the total human genetic variation (depending on how you measure it). Well, OF COURSE, there is going to be more variation within than between groups if 30-50% of that is in a SINGLE INDIVIDUAL. As Harpending shows – and Pääbo’s differing estimate wouldn’t change the outcome much at all – any two to four random people will contain the overwhelming majority of human genetic variation. So what? Do families not exist in a biological sense? The whole variation argument is nonsense.
And the relevant material from Harpending from that piece:
Appendix:
The Apportionment of Variation Within and Among Families
Henry Harpending
[Henry Harpending’s derivation of within-family variation is unpublished as I write. Following is his derivation, received as a personal communication.]
If we choose an allele A at some locus that has frequency p in a randomly mixed population, and if we pick a single gene from this population from this locus, the probability that it is A is just p. The variance of this frequency is just the variance of a single Bernoulli trial, p(1-p) or pq if we let q=1-p. If our population of genes is grouped in certain ways, we can partition this variance into within-group and between-group components. We are doing precisely what Lewontin (1972) and others have done, partitioning diversity (variance) into within and between-group parts.
First consider diploid individuals in a random mating population. What is the variance of the frequency of A in diploid individuals? Since mating is random, diploids are simply random alleles taken 2 at a time. The variance of the frequency of A in samples of 2 is binomial, pq/2. This shows that half the variance is among diploid individuals.
Now consider the variance within an individual. Call the frequency in an individual $p_2$. The variance of the frequency of A in a single gene chosen from an individual is $p_2(1-p_2)$, and this figure averaged over all individuals is
$$\mathrm{Average}\big(p_2(1-p_2)\big) = \mathrm{Average}\big(p_2 - p_2^2\big) = p - p^2 - \mathrm{Var}(p_2) = pq - \frac{pq}{2} = \frac{pq}{2}$$
since the average of the square of any random variable is the mean of that variable squared plus the variance of that variable. This shows that half the variance of a gene frequency is within any individual member of a random mating population. We have partitioned the variance into between and within individual components as 1/2 within and 1/2 between. (Once stated, this result is obvious, but I cannot find an earlier reference to it. Perhaps it was considered too obvious to publish.)
Now consider couples chosen at random, that is, with no assortative mating. Each couple carries 4 genes at the locus. Each couple has a frequency of A: it can be 0, 1/4, 1/2, 3/4, or 1. Call the frequency in a clump $p_4$, and ask what is the variance of $p_4$? It is just the variance of a binomial with n=4, or pq/4. We have established that one-fourth of the variance is among couples.
Now consider the variance within a couple. Pick one gene from a couple. The mean is still p and the variance is $p_4 q_4$, where $q_4 = 1 - p_4$. The average value of $p_4(1-p_4)$ over all couples is the average of $p_4 - p_4^2$, which is $p - p^2 - \mathrm{Var}(p_4)$, or $p - p^2 - p(1-p)/4 = pq(1 - 1/4) = (3/4)pq$.
This shows that the variance within couples is 3/4 of the total and among couples 1/4 of the total. Another way of saying that 0.25 of the variance is among couples is that the coefficient of kinship of full sibs, offspring of a single couple, is 0.25. We could continue with larger and larger sets. For example two random couples from a population contain 7/8 of the total diversity, while 1/8 of the diversity is among couples. This partitioning roughly corresponds to that among human races. What this means, for example, is that if humans were to disappear save a single race that would repopulate the earth, the diversity loss would be the same as the loss if two couples from a random mating population were to reconstitute a population.
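The fractions in the appendix (1/2 for an individual, 3/4 for a couple, 7/8 for two couples) follow one pattern: for a clump of n genes drawn at random, 1/n of the variance pq is among clumps and 1 - 1/n is within them. What follows is a minimal sketch of a check on that arithmetic. It is not part of Harpending's derivation; the function name `partition` and the allele frequency 3/10 are arbitrary choices made here for illustration. It assumes, as the appendix does, a random mating population, so that the number of A alleles in a clump is binomial.

```python
from fractions import Fraction
from math import comb


def partition(n, p):
    """Exact within/between split of the variance pq for clumps of n genes.

    Assumes the count of allele A in a clump is Binomial(n, p), i.e. random
    mating. p must lie strictly between 0 and 1, otherwise pq is zero.
    Returns (within_share, between_share) as exact fractions of pq.
    """
    p = Fraction(p)
    q = 1 - p
    total = p * q
    within = Fraction(0)
    between = Fraction(0)
    for k in range(n + 1):
        prob = comb(n, k) * p**k * q ** (n - k)  # binomial probability of k copies of A
        freq = Fraction(k, n)                    # frequency of A in this clump
        between += prob * (freq - p) ** 2        # variance of clump frequencies
        within += prob * freq * (1 - freq)       # average variance inside a clump
    return within / total, between / total


# n = 2: one diploid individual; n = 4: one couple; n = 8: two couples
for n in (2, 4, 8):
    within, between = partition(n, Fraction(3, 10))
    print(n, within, between)
# prints:
# 2 1/2 1/2
# 4 3/4 1/4
# 8 7/8 1/8
```

Exact fractions are used instead of floating point so the shares come out as 1/2, 3/4 and 7/8 with no rounding. Any other allele frequency strictly between 0 and 1 gives the same shares, since both components scale with pq.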
That basically says the same thing as my piece but in a different way. Harpending says that most of the human genetic variation will be in one to two random couples; I said – and proved with a small SNP set – that random or racially mixed human groups will also contain most or all of the total human genetic variation. Indeed, in such groups, between group variation approaches zero.
Lewontin’s dogma is once again falsified as any sort of relevant comment against the biological reality of race.
Note:
* To demonstrate the mendacity of race-deniers like Pääbo, he made a big deal about how a European may be genetically more similar to an African than to another European for a particular haplotype block. Maybe so (and that raises questions of the validity of testing companies’ “chromosome painting” accuracy, but that’s another story), but that is for only ONE given block. What about multiple blocks? What about the entire genome? We know the answer – Europeans are always closer to other Europeans. Of course, there can be overlap in any given sequence block. So what? The entire autosomal genome matters (and not NRY or mitochondrial DNA that behave as single haplotypes as well).
Posted by Ted at 1:00 AM
Labels: debunking Lewontin, Fst/Gst, Harpending, Lewontin, reality of race, Salter




Really fascinating stuff.