The Genomics of Race and Identity

Fear of Biological Difference

When I started my first academic job in 2003, I bet my career on the idea that the history of mixture of West Africans and Europeans in the Americas would make it possible to find risk factors that contribute to health disparities for diseases like prostate cancer, which occurs at a rate about 1.7 times higher in African Americans than in European Americans.¹ This particular disparity had not been possible to explain based on dietary and environmental differences across populations, suggesting that genetic factors might play a role.

African Americans today derive about 80 percent of their ancestry from enslaved Africans brought to North America between the sixteenth and nineteenth centuries. In a large group of African Americans, the proportion of African ancestry at any one location in the genome is expected to be close to the average (defining the proportion of African ancestry as the fraction of ancestors that were in West Africa before around five hundred years ago). However, if there are risk factors for prostate cancer that occur at higher frequency in West Africans than in Europeans, then African Americans with prostate cancer are expected to have inherited more African ancestry than the average in the vicinity of these genetic variations. This idea can be used to pinpoint disease genes.

To make such studies possible, I set up a molecular biology laboratory to identify mutations that differed in frequency between West Africans and Europeans.
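The case-only logic described above (look for a place in the genome where people with the disease carry more African ancestry than their genome-wide average) can be sketched in a few lines. This is a toy illustration under loud assumptions, not the method used in the studies discussed here: the function names, the simulated data, and the treatment of each location as independent are all hypothetical, and real ancestry is inherited in long correlated segments.

```python
import random
import statistics


def simulate_local_ancestry(n_cases, n_loci, mean_ancestry, risk_locus, excess, seed=0):
    """Return a cases x loci table of African-ancestry copies (0, 1, or 2).

    Toy model (an assumption): each of the two chromosome copies at each
    locus is African with probability `mean_ancestry`, except at
    `risk_locus`, where the probability is raised by `excess` to mimic the
    enrichment expected among people with the disease.
    """
    rng = random.Random(seed)
    table = []
    for _ in range(n_cases):
        row = []
        for locus in range(n_loci):
            p = mean_ancestry + (excess if locus == risk_locus else 0.0)
            row.append(sum(rng.random() < p for _ in range(2)))
        table.append(row)
    return table


def case_only_scan(table):
    """Z-score of each locus's average African ancestry among cases.

    Each locus is compared with the mean and spread across all loci, so a
    large positive score marks a region with more African ancestry than is
    typical for the rest of the genome.
    """
    n_cases = len(table)
    n_loci = len(table[0])
    per_locus = [sum(row[j] for row in table) / (2 * n_cases) for j in range(n_loci)]
    genome_mean = statistics.mean(per_locus)
    genome_sd = statistics.stdev(per_locus)
    return [(x - genome_mean) / genome_sd for x in per_locus]


if __name__ == "__main__":
    cases = simulate_local_ancestry(500, 100, 0.8, risk_locus=42, excess=0.10, seed=1)
    scores = case_only_scan(cases)
    top = max(range(len(scores)), key=lambda j: scores[j])
    print("locus with the largest ancestry excess:", top)
```

The simulated excess here is deliberately exaggerated so that the signal stands out in a small sample; a subtler rise would need far more cases to detect.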
My colleagues and I developed methods that used information from these mutations to identify where in the genome people harbor segments of DNA derived from their West African and European ancestors.² To prove that these ideas worked in practice, we applied them to many traits, including prostate cancer, uterine fibroids, late-stage kidney disease, multiple sclerosis, low white blood cell count, and type 2 diabetes.

In 2006, my colleagues and I applied our methods to 1,597 African American men with prostate cancer, and found that in one region of the genome, they had about 2.8 percent more African ancestry than the average in the rest of their genomes.³ The odds of seeing a rise in African ancestry this large by accident were about ten million to one. When we looked in more detail, we found that this region contained at least seven independent risk factors for prostate cancer, all more common in West Africans than in Europeans.⁴ Our findings could account entirely for the higher rate of prostate cancer in African Americans than in European Americans. We could conclude this because African Americans who happen to have entirely European ancestry in this small section of their genomes had about the same risk for prostate cancer as random European Americans.⁵

In 2008, I gave a talk about my work on prostate cancer to a conference on health disparities across ethnic groups in the United States. In my talk, I tried to communicate my excitement about the scientific approach and my conviction that it could help to find genetic risk factors for other diseases. Afterward, though, I was angrily questioned by an anthropologist in the audience, who believed that by studying “West African” or “European” segments of DNA to understand biological differences between groups, I was flirting with racism. Her questions were seconded by several others, and I encountered similar responses at other meetings.
A legal ethicist who heard me talk on a similar theme suggested that I might want to refer to the populations from which African Americans descend as “cluster A” and “cluster B.” But I replied that it would be dishonest to disguise the model of history that was driving this work. Every feature of the data I looked at suggested that this model was a scientifically meaningful one, providing accurate estimates of where in the genome people harbor segments of DNA from ancestors who lived in West Africa or in Europe in the last twenty generations, prior to the mixture caused by colonialism and the slave trade. It was also clear that the approach was identifying real risk factors for disease that differ in frequency across populations, leading to discoveries with the potential to improve health.

Far from being extremists, my questioners were articulating a mainstream view about the danger of work exploring biological differences among human populations. In 1942, the anthropologist Ashley Montagu wrote Man’s Most Dangerous Myth: The Fallacy of Race, arguing that race is a social concept and has no biological reality, and setting the tone for how anthropologists and many biologists have discussed this issue ever since.⁶ A classic example often cited is the inconsistent definition of “black.” In the United States, people tend to be called “black” if they have sub-Saharan African ancestry, even if it is a small fraction and even if their skin color is very light. In Great Britain, “black” tends to mean anyone with sub-Saharan African ancestry who also has dark skin. In Brazil, the definition is different yet again: a person is only “black” if he or she is entirely African in ancestry. If “black” has so many inconsistent definitions, how can there be any biological meaning to “race”?
Beginning in 1972, genetic arguments began to be incorporated into the assertions that anthropologists were making about the lack of substantial biological differences among human populations. In that year, Richard Lewontin published a study of variation in protein types in blood.⁷ He grouped the populations he analyzed into seven “races” (West Eurasians, Africans, East Asians, South Asians, Native Americans, Oceanians, and indigenous Australians) and found that around 85 percent of variation in the protein types could be accounted for by variation within populations and “races,” and only 15 percent by variation across them. He concluded: “Races and populations are remarkably similar to each other, with the largest part by far of human variation being accounted for by the differences between individuals. Human racial classification is of no social value and is positively destructive of social and human relations. Since such racial classification is now seen to be of virtually no genetic or taxonomic significance either, no justification can be offered for its continuance.”

In this way, through the collaboration of anthropologists and geneticists, a consensus was established that there are no differences among human populations that are large enough to support the concept of “biological race.” Lewontin’s results made it clear that for the great majority of traits, human populations overlap to such a degree that it is impossible to identify a single biological trait that distinguishes people in any two groups, which is intuitively what some people think of when they conceive of “biological race.”

But this consensus view of many anthropologists and geneticists has morphed, seemingly without questioning, into an orthodoxy that the biological differences among human populations are so modest that they should in practice be ignored, and moreover, because the issues are so fraught, that study of biological differences among
populations should be avoided if at all possible. It should come as no surprise, then, that some anthropologists and sociologists see genetic research into differences across populations, even if done in a well-intentioned way, as problematic. They are concerned that work on such differences will be used to validate concepts of race that should be considered discredited. They see this work as located on a slippery slope to the kinds of pseudoscientific arguments about biological difference that were used in the past to try to justify the slave trade, the eugenics movement to sterilize the disabled as biologically defective, and the Nazis’ murder of six million Jews.

The concern is so acute that the political scientist Jacqueline Stevens has even suggested that research and even emails discussing biological differences across populations should be banned, and that the United States “should issue a regulation prohibiting its staff or grantees... from publishing in any form, including internal documents and citations to other studies, claims about genetics associated with variables of race, ethnicity, nationality, or any other category of population that is observed or imagined as heritable unless statistically significant disparities between groups exist and description of these will yield clear benefits for public health, as deemed by a standing committee to which these claims must be submitted and authorized.”⁸

The Language of Ancestry

But whether we like it or not, there is no stopping the genome revolution. The results that it is producing are making it impossible to maintain the orthodoxy established over the last half century, as they are revealing hard evidence of substantial differences across populations.
The first major engagement between the genome revolution and anthropological orthodoxy came in 2002, when Marc Feldman and his colleagues showed that by studying enough places in the genome (they analyzed 377 variable positions) it is possible to group most people in a worldwide population sample into clusters that correlate strongly with popular categories of race in the United States: “African,” “European,” “East Asian,” “Oceanian,” or “Native American.”⁹ While Feldman’s conclusions were broadly consistent with Lewontin’s in that his data also showed more variation within groups than among them, his study defined clusters in terms of combinations of mutations instead of looking at mutations individually as Lewontin had done.

Scientists were quick to respond. One was Svante Pääbo, who eight years later would go on to lead the work to sequence whole genomes of archaic Neanderthals and Denisovans. Pääbo came to the debate about the nature of human population structure as a founding director of the Max Planck Institute for Evolutionary Anthropology in Leipzig, which was set up in 1997 in an effort to return Germany to a field in which it had played a leading role before the Second World War but that it had largely abandoned due to anthropologists’ central contribution to developing Nazi race theory.

Pääbo took seriously his moral responsibility as head of an ambitious German institute of anthropology, and wondered whether the truth about human population structure could be more like the anthropologist Frank Livingstone’s suggestion that “there are no races, there are only clines,” a view in which human genetic variation is characterized by gradual geographic gradients that reflect interbreeding among neighbors.¹⁰ To explore this possibility, Pääbo investigated whether the clusters the Feldman study found appeared sharply defined because the analyzed populations had been sampled in a nonrandom fashion across the world.
To understand how nonrandom sampling could contribute to this result, consider the United States, which harbors extraordinary diversity, but where genetic discontinuities among groups such as African Americans, European Americans, and East Asians are sharper than in the places from which immigrant populations came because the United States has drawn its immigrants from a subset of world locations. For example, in the United States, most of the African ancestry is from a handful of groups in West Africa,¹¹ most of the European ancestry is from northwest Europe, and most of the Asian ancestry is from Northeast Asia. Pääbo showed that such nonrandom sampling could account for some of the effects Feldman and colleagues observed. However, later work proved that nonrandom sampling could not account for most of the structure, as substantial clustering of human populations is observed even when repeating analyses on geographically more evenly distributed sets of samples.¹²

Another flurry of discussion followed a 2003 paper led by Neil Risch, who argued that racial grouping is useful in medical research, not just to adjust for socioeconomic and cultural differences, but also because it correlates with genetic differences that are important to know about when diagnosing and treating disease.¹³ Risch was convinced by examples like sickle cell disease, which occurs far more often in African Americans than in other populations in the United States. He argued that it was appropriate for doctors to be more likely to think of sickle cell disease if the patient is African American. In 2005, the U.S. Food and Drug Administration lent support to this way of thinking when it approved BiDil, a combination of two medications approved to treat heart failure in African Americans because data suggested it was more effective in African Americans than in European Americans. But on the other side of the argument, David Goldstein suggested that U.S.
racial categories are so weakly predictive of most biological outcomes that they do not have long-term value.¹⁴ He and his colleagues showed that the frequencies of genetic variants that determine dangerous reactions to drugs are poorly predicted by U.S. census categories. He acknowledged that the reliance on racial and ethnic categories is useful given our poor present knowledge, but predicted that the future will involve testing individuals directly for what mutations they have, and doing away altogether with racial classification as a basis for making individualized decisions about care.

Against this backdrop of controversy emerged work like mine, focusing on methods to determine population origin not just of our ancestors but also of individual segments of our genomes. The anthropologist Duana Fullwiley has written that the development of what she calls “admixture technology” and the language of “ancestry” that geneticists like me have adopted is a reversion to traditional ideas of biological race.¹⁵ She has pointed out that in the United States, the “ancestry” terms that we use map relatively closely to traditional racial categories, and her view is that the population genetics community has invented a set of euphemisms to discuss topics that had become taboo. The belief that we have embraced euphemisms is also shared by some on the other side of the political spectrum. At a 2010 meeting I attended at Cold Spring Harbor Laboratory, the journalist Nicholas Wade described his resentment of the population genetics community’s “ancestry” terminology, asserting that “race is a perfectly good English word.”

But “ancestry” is not a euphemism, nor is it synonymous with “race.” Instead, the term is born of an urgent need to come up with a precise language to discuss genetic differences among people at a time when scientific developments have finally provided the tools to detect them.
It is now undeniable that there are nontrivial average genetic differences across populations in multiple traits, and the race vocabulary is too ill-defined and too loaded with historical baggage to be helpful. If we continue to use it we will not be able to escape the current debate, which is mired in an argument between two indefensible positions. On the one side there are beliefs about the nature of the differences that are grounded in bigotry and have little basis in reality. On the other side there is the idea that any biological differences among populations are so modest that as a matter of social policy they can be ignored and papered over. It is time to move on from this paralyzing false dichotomy and to figure out what the genome is actually telling us.

Real Biological Difference

I have deep sympathy for the concern that genetic discoveries about differences among populations may be misused to justify racism. But it is precisely because of this sympathy that I am worried that people who deny the possibility of substantial biological differences among populations across a range of traits are digging themselves into an indefensible position, one that will not survive the onslaught of science. In the last couple of decades, most population geneticists have sought to avoid contradicting the orthodoxy. When asked about the possibility of biological differences among human populations, we have tended to obfuscate, making mathematical statements in the spirit of Richard Lewontin about the average difference between individuals from within any one population being around six times greater than the average difference between populations.
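One plausible reading of the “around six times” figure is that it is simply the ratio of Lewontin’s two shares, 85 percent within populations to 15 percent across them, which is about 5.7. The sketch below works through that arithmetic and shows how such a split can be computed at a single two-allele site. It is an illustration under stated assumptions: the function names are hypothetical, the allele frequencies are made up, populations are weighted equally, and heterozygosity is used as a simple diversity measure, which is not necessarily the measure used in the original study.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a two-allele site with frequency p."""
    return 2.0 * p * (1.0 - p)


def apportion_diversity(freqs):
    """Split diversity at one site into (within, between) population shares.

    `freqs` holds the frequency of one allele in each population; every
    population gets equal weight (an assumption of this sketch).
    """
    mean_p = sum(freqs) / len(freqs)
    total = heterozygosity(mean_p)  # diversity if all populations were pooled
    if total == 0.0:
        raise ValueError("site does not vary, so there is nothing to apportion")
    within = sum(heterozygosity(p) for p in freqs) / len(freqs)
    between_share = (total - within) / total
    return 1.0 - between_share, between_share


def within_to_between_ratio(within_share, between_share):
    """How many times larger the within-population share is."""
    return within_share / between_share


if __name__ == "__main__":
    # Made-up frequencies for two hypothetical populations.
    within, between = apportion_diversity([0.3, 0.7])
    print("within share:", round(within, 3), "between share:", round(between, 3))
    # Ratio implied by an 85 percent / 15 percent split.
    print("85/15 ratio:", round(within_to_between_ratio(0.85, 0.15), 2))
```

Note that a site can differ substantially in frequency between two populations, as in the made-up example, and still show most of its diversity within each of them.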
We point out that the mutations that underlie some traits that differ dramatically across populations (the classic example is skin color) are unusual, and that when we look across the genome it is clear that the typical differences in frequencies of mutations across populations are far less.¹⁶ But this carefully worded formulation is deliberately masking the possibility of substantial average differences in biological traits across populations.

To understand why it is no longer an option for geneticists to lock arms with anthropologists and imply that any differences among human populations are so modest that they can be ignored, go no further than the “genome bloggers.” Since the genome revolution began, the Internet has been alive with discussion of the papers written about human variation, and some genome bloggers have even become skilled analysts of publicly available data. Compared to most academics, the politics of genome bloggers tend to the right; Razib Khan¹⁷ and Dienekes Pontikos¹⁸ post on findings of average differences across populations in traits including physical appearance and athletic ability. The Eurogenes blog spills over with sometimes as many as one thousand comments in response to postings on the charged topic of which ancient peoples spread Indo-European languages,¹⁹ a highly sensitive issue since, as discussed in part II, narratives about the expansion of Indo-European speakers have been used as a basis for building national myths,²⁰ and sometimes have been abused as happened in Nazi Germany.²¹ The genome bloggers’ political beliefs are fueled partly by the view that when it comes to discussion about biological differences across populations, the academics are not honoring the spirit of scientific truth-seeking.
The genome bloggers take pleasure in pointing out contradictions between the politically correct messages academics often give about the indistinguishability of traits across populations and their papers showing that this is not the way the science is heading.

What real differences do we know about? We cannot deny the existence of substantial average genetic differences across populations, not just in traits such as skin color, but also in bodily dimensions, the ability to efficiently digest starch or milk sugar, the ability to breathe easily at high altitudes, and susceptibility to particular diseases. These differences are just the beginning. I expect that the reason we don’t know about a much larger number of differences among human populations is that studies with adequate statistical power to detect them have not yet been carried out. For the great majority of traits, there is, as Lewontin said, much more variation within populations than across populations. This means that individuals with extreme high or low values of the great majority of traits can occur in any population. But it does not preclude the existence of subtler, average differences in traits across populations.

The indefensibility of the orthodoxy is obvious at almost every turn. In 2016, I attended a lecture on race and genetics by the biologist Joseph L. Graves Jr. at the Peabody Museum of Archaeology and Ethnology at Harvard. At one point, Graves compared the approximately five mutations known to have large effects on skin pigmentation and that are obviously different in frequency across populations to the more than ten thousand genes known to be active in human brains. He argued that in contrast to pigmentation genes, the patterns at genes particularly active in the brain would surely average out over so many locations, with some mutations nudging cognitive and behavioral traits in one direction and some pushing in the other direction. But this argument doesn’t work, because
But this argument doesn’t work, because 256 Who We Are and How We Got Here in fact, if natural selection has exerted different pressures on two populations since they separated, traits influenced by many muta- tions are just as capable of achieving large average differences across populations as traits influenced by few mutations. And indeed, it is already known that traits shaped by many mutations (as is probably the case for behavior and cognition) are at least as important tar- gets of natural selection as traits like skin color that are driven by a small number of mutations.”” The best example we currently have of a trait governed by many mutations is height. Studies in hundreds of thousands of people have shown that height is determined by thou- sands of variable positions across the genome. A 2012 analysis led by Joel Hirschhorn showed that natural selection on these is respon- sible for the shorter average height in southern Europeans compared to northern Europeans.”’ Height isn’t the only example. Jonathan Pritchard led a study showing that in the last approximately two thousand years there has been selection for genetic variations that affect many other traits in Britain, including an increase in average infant head size and an increase in average female hip size (possibly to accommodate the increased higher average infant head size dur- ing childbirth).”* It is tempting to argue that genetic influence on bodily dimensions is one thing, but that cognitive and behavioral traits are another. But this line has already been crossed. Often when a person participates in a genetic study of a disease, he or she fills out a form providing information on height, weight, and number of years of education. 
By compiling the information on the number of years of education for over four hundred thousand people of European ancestry whose genomes have been surveyed in the course of various disease studies, Daniel Benjamin and colleagues identified seventy-four genetic variations each of which has overwhelming evidence of being more common in people with more years of education than in people with fewer years even after controlling for such possibly confounding factors as heterogeneity in the study population.²⁵ Benjamin and colleagues also showed that the power of genetics to predict number of years of education is far from trivial, even though social influences surely have a greater average influence on this behavior than genetics. They showed that in the European ancestry population in which they carried out their study, it should be possible to build a genetic predictor in which the probability of completing twelve years of education is 96 percent for the twentieth of people with the highest prediction compared to 37 percent for the lowest.²⁶

How do these genetic variations influence educational attainment? The obvious guess is that they have a direct effect on academic abilities, but that is probably wrong. A study of more than one hundred thousand Icelanders showed that the variations also increase the age at which a woman has her first child, and that this is a more powerful effect than the one on the number of years of education. It is possible that these variations exert their effect indirectly, by nudging people to defer having children, which makes it easier for them to complete their education.²⁷ This shows that when we discover biological differences governing behavior, they may not be working in the way we naively assume.

Average differences across populations in the frequencies of the mutations that affect educational attainment have not yet been identified.
But a sobering finding is that older people in Iceland are systematically different from younger people in having a higher genetically predicted number of years of education.²⁸ Augustine Kong, the lead author of the Icelandic study, showed that this reflects natural selection over the last century against people with more predicted education, likely because of selection for people who began having children at a younger age. Given that the genetic underpinnings of the number of years of education a person achieves have measurably changed within a century in a single population under the pressure of natural selection, it seems highly likely that the trait differs across populations too.

No one knows how the genetic variations that influence educational attainment in people of European ancestry affect behavior in people of non-European ancestries, or in differently structured social systems. That said, it seems likely that if these mutations have an effect on behavior in one population they will do so in others, too, even if the effects differ by social context. And educational attainment as a trait is likely to be only the tip of an iceberg of behavioral traits affected by genetics. The Benjamin study has already been joined by others finding genetic predictors of behavioral traits,²⁹ including one of more than seventy thousand people that found mutations in more than twenty genes that were significantly predictive of performance on intelligence tests.³⁰

For those who wish to argue against the possibility of biological differences across populations that are substantial enough to make a difference in people’s abilities or propensities, the most natural refuge might be to make the case that even if such differences exist, they will be small.
The argument would be that even if there are average differences across human populations in genetically determined traits affecting cognition or behavior, so little time has passed since the separation of populations that the quantitative differences across populations are likely to be trivially small, harkening back to Lewontin’s argument that the average genetic difference between populations is much less than the average difference between individuals. But this argument doesn’t hold up either. The average time separation between pairs of human populations since they diverged from common ancestral populations, which is up to around fifty thousand years for some pairs of non-African populations, and up to two hundred thousand years or more for some pairs of sub-Saharan African populations, is far from negligible on the time scale of human evolution. If selection on height and infant head circumference can occur within a couple of thousand years,³¹ it seems a bad bet to argue that there cannot be similar average differences in cognitive or behavioral traits.

Even if we do not yet know what the differences are, we should prepare our science and our society to be able to deal with the reality of differences instead of sticking our heads in the sand and pretending that differences cannot be discovered. The approach of staying mum, of implying to the public and to colleagues that substantial differences in traits across populations are unlikely to exist, is a strategy that we scientists can no longer afford, and that in fact is positively harmful. If as scientists we willfully abstain from laying out a rational framework for discussing human differences, we will leave a vacuum that will be filled by pseudoscience, an outcome that is far worse than anything we could achieve by talking openly.
The Genome Revolution’s Insight

On the question of whether traditional social categories of race correspond to meaningful biological categories, the genome revolution has already provided us with new insights that go far beyond the information that was available to the first population geneticists and anthropologists who grappled with the issue. In this way, the data provided by the genome revolution are potentially liberating, providing an opportunity for intellectual progress beyond the current stale framing of the debate.

As recently as 2012, it still seemed reasonable to interpret human genetic data as pointing to immutable categories such as “East Asians,” “Caucasians,” “West Africans,” “Native Americans,” and “Australasians,” with each group having been separated and unmixed for tens of thousands of years. The 2002 study led by Marc Feldman produced clusters that corresponded relatively well to these categories, and the model seemed to be doing a good job of describing variation in many parts of the world (with some exceptions).³² In other papers, Feldman and his colleagues proposed a model for how this kind of structure could arise among human populations. Their proposal was that modern humans expanding out of Africa and the Near East after around fifty thousand years ago left descendant populations along the way, which in turn budded off their own descendant populations, with the present-day inhabitants of each region being descended directly from the modern humans who first arrived.³³ Their “serial founder” model was more sophisticated than that imagined by biological race theorists in the seventeenth to twentieth centuries, but shared with it the prediction that after being established, human populations hardly mixed with each other.

But ancient DNA discoveries have rendered the serial founder model untenable.
We now know that the present-day structure of populations does not reflect the one that existed many thousands of years ago.³⁴ Instead, the current populations of the world are mixtures of highly divergent populations that no longer exist in unmixed form: for example, the Ancient North Eurasians, who contributed a large amount of the ancestry of present-day Europeans as well as of Native Americans,³⁵ and multiple ancient populations of the Near East, each as differentiated from the other as Europeans and East Asians are differentiated from each other today.³⁶ Most of today’s populations are not exclusive descendants of the populations that lived in the same locations ten thousand years ago.

The findings that the nature of human population structure is not what we assumed should serve as a warning to those who think they know that the true nature of human population differences will correspond to racial stereotypes. Just as we had an inaccurate picture of early human origins before the ancient DNA revolution unleashed an avalanche of surprises, so we should distrust the instincts that we have about biological differences. We do not yet have sufficient sample sizes to carry out compelling studies of most cognitive and behavioral traits, but the technology is now available, and once high-quality studies are performed (which they will be somewhere in the world whether we like it or not) any genetic associations they find will be undeniable. We will need to deal with these studies and react responsibly to them when they are published, but we can already be sure that we will be surprised by some of the outcomes.

Unfortunately, today there is a new breed of writers and scholars who argue not only that there are average genetic differences, but that they can guess what they are based on traditional racial stereotypes.
The person who has most recently made a prominent argument that there is a genetic basis to stereotypes about differences across human populations is the New York Times journalist Nicholas Wade, who in 2014 published A Troublesome Inheritance: Genes, Race and Human History.³⁷ The abiding theme of Wade’s reporting is the propensity of academics to band together to enforce orthodoxies and to be shown up by a band of rebels speaking the truth (he has written on scientific fraud, described the Human Genome Project as a monolith wastefully spending the public’s money, and attacked the value of genome-wide association studies for finding common genetic variations contributing to risk for diseases). Wade’s Troublesome Inheritance ran with the theme again, suggesting that a politically correct alliance of anthropologists and geneticists has banded together to suppress the truth that there are significant differences among human populations and that those differences correspond to classic stereotypes. One part of the argument has something to it: Wade correctly highlights the problem of an academic community trying to enforce an implausible orthodoxy. Yet the “truth” that he puts forward in opposition, the idea that not only are there substantial differences, but that they likely correspond to traditional racial stereotypes, has no merit. Wade’s book combines compelling content with parts that are entirely speculative, presenting everything with the same authority and in the same voice, so that naive readers who accept the parts of it that are well argued are tempted to accept the rest.
Worse, in contrast to his previous writing, in which the rebels speaking the truth were scholars of creativity and accomplishment, Wade does not identify any serious scholarship in genetics supporting his speculations. And yet by celebrating those who have opposed the flawed orthodoxy, he implies wrongly that their alternative theories must be right.

As an example of the speculations to which Wade gives pride of place, one of his chapters focuses on a 2006 essay by Gregory Cochran, Jason Hardy, and Henry Harpending suggesting that the high average intelligence quotient (IQ) of Ashkenazi Jews (more than one standard deviation above the world average), and their disproportionate share of Nobel Prizes (about one hundred times the world average), might reflect natural selection due to a millennium-long history in which Jewish populations practiced moneylending, a profession that required writing and calculation. They also pointed to the high rate in Ashkenazi Jews of Tay-Sachs disease and Gaucher disease, which are due to mutations that affect storage of fat in brain cells, and which they hypothesized rose in frequency under the pressure of selection for genetic variations contributing to intelligence (they argued that these mutations might be beneficial when they occur in one copy rather than the two needed to cause disease). This argument is contradicted by the evidence that these diseases almost certainly owe their origin to random bad luck (the fact that during the medieval population bottleneck that affected Ashkenazi Jews, the small number of individuals who had many descendants happened to carry these mutations), yet Wade highlights the work on the basis that it might be right. Harpending has a track record of speculating without evidence on the causes of behavioral differences among populations.
In a talk he gave at a 2009 conference on “Preserving Western Civilization,” he asserted that people of sub-Saharan African ancestry have no propensity to work when they don’t have to (“I’ve never seen anyone with a hobby in Africa,” he said) because, he thought, sub-Saharan Africans have not gone through the type of natural selection for hard work in the last thousands of years that some Eurasians had.

Wade also highlighted A Farewell to Alms, a book by the economist Gregory Clark suggesting that the reason the Industrial Revolution took off in Britain before it did elsewhere was the relatively high birth rate among wealthy people in Britain for the preceding five centuries compared to less wealthy people. Clark argued that this higher birth rate spread through the population the traits needed for a capitalist surge, including individualism, patience, and willingness to work long hours. Clark admits that he cannot distinguish between the transmission of genes and the transmission of culture across the generations, but Wade nevertheless takes his argument as evidence that genetics might have played a role.

I have spent some space discussing errors in Wade’s book because I feel it is important to explain that just because many academics have been engaged in trying to maintain an implausible orthodoxy, it does not mean that every unorthodox “heretic” is right. And yet Wade suggests precisely this. He writes, “Each of the major civilizations has developed the institutions appropriate for its circumstances and survival. But these institutions, though heavily imbued with cultural traditions, rest on a bedrock of genetically shaped human behavior.
And when a civilization produces a distinctive set of institutions that endures for many generations, that is the sign of a supporting suite of variations in the genes that influence human social behavior.” In a written version of a nod and a wink, Wade is suggesting that popular racist ideas about the differences that exist among populations have something to them.

Wade is far from the only person who is convinced he knows the truth about the differences among populations. At the same 2010 meeting on “DNA, Genetics, and the History of Mankind” at which I first met Wade, I heard a rustling behind my shoulder and turned with a shock to see James Watson, who in 1953 codiscovered the structure of DNA. Watson had until a few years earlier been the director of the Cold Spring Harbor Laboratory at which the meeting was held. A century ago, the laboratory was the epicenter of the eugenics movement in the United States, keeping records on traits in many people to help guide selective breeding, and lobbying for legislation that was passed in many states to sterilize people considered to be defective and to combat a perceived degradation of the gene pool. It was ironic, then, that Watson was forced to retire as head of Cold Spring Harbor after being quoted in an interview with the British Sunday Times newspaper as having said that he was “inherently gloomy about the prospect of Africa,” adding that “[all] our social policies are based on the fact that their intelligence is the same as ours, whereas all the testing says not really.” (No genetic evidence for this claim exists.)
When I saw Watson at Cold Spring Harbor, he leaned over and whispered to me and to the geneticist Beth Shapiro, who was sitting next to me, something to the effect of “When are you guys going to figure out why it is that you Jews are so much smarter than everyone else?” He then said that Jews and Indian Brahmins were both high achievers because of genetic advantages conferred by thousands of years of natural selection to be scholars. He went on to whisper that Indians in his experience were also servile, much like he thought they had been under British colonialism, and he speculated that this trait had come about because of selection under the caste system. He also talked about how East Asian students tended to be conformist, because of selection for conformity in ancient Chinese society.

The pleasure Watson takes in challenging establishment views is legendary. His obstreperousness may have been important to his success as a scientist. But now as an eighty-two-year-old man, his intellectual rigor was gone, and what remained was a willingness to vent his gut impressions without subjecting them to any of the testing that characterized his scientific work on DNA.

Writing now, I shudder to think of Watson, or of Wade, or their forebears, behind my shoulder. The history of science has revealed, again and again, the danger of trusting one’s instincts or of being led astray by one’s biases, of being too convinced that one knows the truth.
From the errors of thinking that the sun revolves around the earth, that the human lineage separated from the great ape lineage tens of millions of years ago, and that the present-day human population structure is fifty thousand years old (whereas in fact we know that it was forged through population mixtures largely over the last five thousand years), from all of these errors and more, we should take the cautionary lesson not to trust our gut instincts or the stereotyped expectations we find around us. If we can be confident of anything, it is that whatever differences we think we perceive, our expectations are most likely wrong. What makes Watson’s and Wade’s and Harpending’s statements racist is the way they jump from the observation that the academic community is denying the possibility of differences that are plausible, to a claim with no scientific evidence that they know what those differences are and also that the differences correspond to long-standing popular stereotypes, a conviction that is essentially guaranteed to be wrong.

We truly have no idea right now what the nature or direction of genetically encoded differences among populations will be. An example is the extreme overrepresentation of people of West African ancestry among elite sprinters. All the male finalists in the Olympic hundred-meter race since 1980, even those from Europe and the Americas, had recent West African ancestry. The genetic hypothesis most often invoked to explain this is that there has been an upward shift in the average sprinting ability of people of West African ancestry due to natural selection. A small increase in the average might not sound like much, but it can make a big difference at the extremes of high ability: for example, a 0.8-standard-deviation increase in the average sprinting ability in West Africans would be expected to lead to a hundredfold enrichment in the proportion of people above the 99.9999999th percentile point in Europeans.
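The tail arithmetic in this sprinting example can be checked with a short calculation. The sketch below is illustrative only: it assumes that sprinting ability follows a normal (bell-curve) distribution, measures everything in units of the reference population's standard deviation, and, for the wider-spread alternative that the text turns to next, assumes that 33 percent higher genetic diversity translates into 33 percent higher trait variance. The function names `upper_tail` and `enrichment` are made up for this illustration.

```python
from math import erfc, sqrt
from statistics import NormalDist

# Assumption: ability is normally distributed; the reference population
# has mean 0 and standard deviation 1.
TAIL = 1e-9  # fraction of the reference population above the 99.9999999th percentile

# Cutoff (in reference standard deviations) that leaves TAIL above it.
# By symmetry, the upper cutoff is minus the lower-tail quantile.
threshold = -NormalDist().inv_cdf(TAIL)


def upper_tail(cutoff, mean=0.0, sd=1.0):
    """Return P(X > cutoff) for X ~ Normal(mean, sd); erfc stays accurate far in the tail."""
    return 0.5 * erfc((cutoff - mean) / (sd * sqrt(2.0)))


def enrichment(mean_shift=0.0, variance_ratio=1.0):
    """Ratio of the comparison population's tail to the reference population's tail."""
    comparison = upper_tail(threshold, mean=mean_shift, sd=sqrt(variance_ratio))
    return comparison / upper_tail(threshold)


# Scenario 1: the average is shifted up by 0.8 standard deviations, same spread.
shift_only = enrichment(mean_shift=0.8)

# Scenario 2 (assumption): same average, but 33 percent more variance.
spread_only = enrichment(variance_ratio=1.33)

print(f"cutoff: {threshold:.3f} standard deviations")
print(f"enrichment from mean shift:   {shift_only:.0f}x")
print(f"enrichment from wider spread: {spread_only:.0f}x")
```

Under these assumptions both scenarios come out close to a hundredfold, which is why the count of elite sprinters alone cannot tell the two explanations apart.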
But an alternative explanation that would predict the same magnitude of effect is that there is simply more variation in sprinting ability in people of West African ancestry, with more people of both very high and very low abilities. A wider spread of abilities around the same mean and a hundredfold enrichment in West Africans in the proportion of people above the 99.9999999th percentile point seen in Europeans is in fact exactly what is expected given the approximately 33 percent higher genetic diversity in West Africans than in Europeans. Whether or not this explains the dominance of West Africans in sprinting, for many biological traits, including cognitive ones, there is expected to be a higher proportion of sub-Saharan Africans with extreme genetically predicted abilities.

So how should we prepare for the likelihood that in the coming years, genetic studies will show that behavioral or cognitive traits are influenced by genetic variation, and that these traits will differ across human populations, both in their average and in their variation within populations? Even if we do not yet know what those differences will be, we need to come up with a new way of thinking that can accommodate such differences, rather than deny categorically that differences can exist and so find ourselves caught without a strategy once they are found.

It would be tempting, in the wake of the genome revolution, to settle on a new comforting platitude, invoking the history of repeated admixture in the human past as an argument for population differences being meaningless. But such a statement is wrongheaded, because if we were to randomly pick two people living in the world today, we would find that many of the population lineages contributing to them have been isolated from each other for long enough that there has been ample opportunity for substantial average biological differences to arise between them.
The right way to deal with the inevitable discovery of substantial differences across populations is to realize that their existence should not affect the way we conduct ourselves. As a society we should commit to according everyone equal rights despite the differences that exist among individuals. If we aspire to treat all individuals with respect regardless of the extraordinary differences that exist among individuals within a population, it should not be so much more of an effort to accommodate the smaller but still significant average differences across populations.

Beyond the imperative to give everyone equal respect, it is also important to keep in mind that there is a great diversity of human traits, including not just cognitive and behavioral traits, but also areas of athletic ability, skill with one’s hands, and capacity for social interaction and empathy. For most traits, the degree of variation among individuals is so large that any one person in any population can excel at any trait regardless of his or her population origin, even if particular populations have different average values due to a mixture of genetic and cultural influences. For most traits, hard work and the right environment are sufficient to allow someone with a lower genetically predicted performance at some task to excel compared to people with a higher genetically predicted performance. Because of the multidimensionality of human traits, the great variation that exists among individuals, and the extent to which hard work and upbringing can compensate for genetic endowment, the only sensible approach is to celebrate every person and every population as an extraordinary realization of our human genius and to give each person every chance to succeed, regardless of the particular average combination of genetic propensities he or she happens to display.
For me, the natural response to the challenge is to learn from the example of the biological differences that exist between males and females. The differences between the sexes are in fact more profound than those that exist among human populations, reflecting more than a hundred million years of evolution and adaptation. Males and females differ by huge tracts of genetic material: a Y chromosome that males have and that females don’t, and a second X chromosome that females have and males don’t. Most people accept that the biological differences between males and females are profound, and that they contribute to average differences in size and physical strength as well as in temperament and behavior, even if there are questions about the extent to which particular differences are also influenced by social expectations and upbringing (for example, many of the jobs in industry and the professions that women fill in great numbers today had few women in them a century ago). Today we aspire both to recognize that biological differences exist and to accord everyone the same freedoms and opportunities regardless of them. It is clear from the abiding average inequities that persist between women and men that fulfilling these aspirations is a challenge, and yet it is important to accommodate and even embrace the real differences that exist, while at the same time struggling to get to a better place.

The real offense of racism, in the end, is to judge individuals by a supposed stereotype of their group, to ignore the fact that when applied to specific individuals, stereotypes are almost always misleading. Statements such as “You are black, you must be musical” or “You are Jewish, you must be smart” are unquestionably very harmful. Everyone is his or her own person with unique strengths and weaknesses, and should be treated as such.
Suppose you are the coach of a track-and-field team, and a young person walks on and asks to try out for the hundred-meter race, in which people of West African ancestry are statistically highly overrepresented, suggesting the possibility that genetics may play a role. For a good coach, race is irrelevant. Testing the young person’s sprinting speed is simple: take him or her out to the track to run against the stopwatch. Most situations are like this.

A New Basis for Identity

The genome revolution is actually a far more effective force for coming to a new understanding of human difference and identity (for understanding our own personal place in the world around us) than for promoting old beliefs that more often than not are mistaken.

To understand the power of the genome revolution for undermining old stereotypes about identity and building up a new basis for identity, consider how its finding of repeated mixture in human history has destroyed nearly every argument that used to be made for biologically based nationalism. The Nazi ideology of a “pure” Indo-European-speaking Aryan race with deep roots in Germany, traceable through artifacts of the Corded Ware culture, has been shattered by the finding that the people who used these artifacts came from a mass migration from the Russian steppe, a place that German nationalists would have despised as a source. The Hindutva ideology that there was no major contribution to Indian culture from migrants from outside South Asia is undermined by the fact that approximately half of the ancestry of Indians today is derived from multiple waves of mass migration from Iran and the Eurasian steppe within the last five thousand years. Similarly, the idea that the Tutsis in Rwanda and Burundi have ancestry from West Eurasian farmers that Hutus do not (an idea that has been incorporated into arguments for genocide) is nonsense.
We now know that nearly every group living today is the product of repeated population mixtures that have occurred over thousands and tens of thousands of years. Mixing is in human nature, and no one population is, or could be, “pure.”

Nonscientists have already realized the potential of the genome revolution for forming new narratives. African Americans have been at the forefront of this movement. During the slave trade, Africans were uprooted and forcibly deprived of their culture, with the effect that within a few generations much of their ancestors’ religion, language, and traditions were gone. In 1976, Alex Haley’s novel Roots used literature to begin to reclaim lost roots by recounting the odyssey of the slave Kunta Kinte and his descendants. Following in this tradition, Harvard professor of literature Henry Louis Gates Jr. has capitalized on the potential of genetic studies to recover lost roots for African Americans. In his Faces of America television series and the Finding Your Roots series that followed it, he declares to the cellist Yo-Yo Ma, who is able to trace his ancestry back to thirteenth-century China, that he, as an African American, will never know how that feels, but he shows that genetics can provide richly informative insights even for African Americans with limited genealogical records.

A new industry, “personal ancestry testing,” has sprung up to capitalize on the potential of the genome revolution to form the basis for new narratives and to compare the genomes of consumers to others who have already been tested. The television programs that Gates has produced have been built around the idea of tracing the genealogies and DNA of celebrity guests, using the literary device of telling the personal stories of famous people to help viewers understand the power of genetic data to reveal features of their family’s past of which they could not otherwise have been aware.
For example, the programs revealed unknown deep relationships between pairs of guests on the program (shared ancestors within the last few hundred years). They also used genetic tests to determine not only the continents on which people’s ancestors lived, but also the regions within continents.

As a white person in the United States with its history of forcible deprivation of peoples of their roots, I feel that everyone (African Americans and Native Americans especially) has the right to try to use genetic data to help fill in missing pieces in his or her family history. Nevertheless, for those who assume that personal ancestry testing results have the authority of science, it is important to keep in mind that many of the results are easily misinterpreted and rarely include the warnings that scientists attach to tentative findings.

Some of the best examples come from the industry that sprang up to provide genetic results to African Americans. One company is African Ancestry, which provides customers with information on the West African tribe and country in which their Y-chromosome or mitochondrial DNA type is most common. Such results are easy to overinterpret, as the frequencies of Y-chromosome and mitochondrial DNA types are too similar across West Africa to make exact determinations with confidence. As an example, consider a Y-chromosome type that is carried slightly more often in the Hausa ethnic group than in the neighboring Yoruba, Mende, Fulani, and Beni groups. When African Ancestry sends its report, it might state that an African American man has a Y-chromosome type that is most common in the Hausa. But it is quite possible and even likely that the true ancestor was not Hausa, because there are many tribes in West Africa, and no one tribe contributed more than a modest fraction of the African ancestry of African Americans.
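The reasoning behind the Hausa example is an application of Bayes' rule, and a small sketch can make it concrete. Every number below is hypothetical and chosen only for illustration: neither the frequencies of the Y-chromosome type nor the ancestry shares are measured values, and a realistic calculation would include many more groups than the five named in the text. The function name `posterior_by_group` is likewise made up for this sketch.

```python
# Hypothetical share of men in each group who carry the Y-chromosome type.
# The type is "most common in the Hausa", but only slightly.
type_frequency = {
    "Hausa": 0.12,
    "Yoruba": 0.10,
    "Mende": 0.10,
    "Fulani": 0.09,
    "Beni": 0.10,
}

# Hypothetical share of African American paternal lineages tracing to each
# group (the prior). No single group dominates.
ancestry_share = {
    "Hausa": 0.10,
    "Yoruba": 0.30,
    "Mende": 0.20,
    "Fulani": 0.15,
    "Beni": 0.25,
}


def posterior_by_group(frequency, prior):
    """Bayes' rule: P(group | type) is proportional to P(type | group) * P(group)."""
    joint = {group: frequency[group] * prior[group] for group in frequency}
    total = sum(joint.values())
    return {group: value / total for group, value in joint.items()}


posterior = posterior_by_group(type_frequency, ancestry_share)
reported_group = max(type_frequency, key=type_frequency.get)
most_probable_group = max(posterior, key=posterior.get)

print(f"group where the type is most common: {reported_group}")
print(f"probability the ancestor was {reported_group}: {posterior[reported_group]:.2f}")
print(f"most probable group of origin: {most_probable_group}")
```

With these made-up inputs, the type is most common in the Hausa, yet the probability that the ancestor was Hausa stays well below one half, because small differences in frequency are outweighed by how many lineages each group contributed.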
And yet people who have taken these tests often return with the impression that they know their origin. Rick Kittles, a population geneticist who is the cofounder of African Ancestry, described this feeling, asserting, “My female line goes back to northern Nigeria, the land of the Hausa tribe. I then went to Nigeria and talked to people and learned about the Hausa’s culture and tradition. That gave me a sense about who I am.” Whole-genome ancestry tests in theory have much more power than tests based on Y chromosomes and mitochondrial DNA. But at present, even whole-genome methods are not good enough to provide high-resolution information about where the ancestors of an African American person lived within Africa, in part because the databases of present-day populations in West Africa are not complete enough. Much more research needs to be done to make it possible to carry out studies like these with any reliability.

For African Americans, another frustration may be that the cultural upheaval that occurred after African slaves arrived in North America has been so enormous that today there are few differences among African Americans with respect to the places in Africa from which their ancestors came. Africans from one part of the continent were traded around and mixed with those from another, with the result that within a few generations the great cultural diversity and variation of ancestry that existed among the first slaves were blurred to the point of unrecognizability.
The nearly complete homogenization of African ancestry that occurred was evident in an unpublished study I carried out in 2012 with Kasia Bryc, who analyzed genome-wide data from more than fifteen thousand African Americans from Chicago, New York, San Francisco, Mississippi, North Carolina, and the South Carolina Sea Islands, and tested if some African American populations were more closely related to particular West Africans than others, as might be expected based on the heterogeneous supply routes for U.S. slaves. It made sense to expect some differences. Of the four big slave ports, New Orleans was supplied mostly by French slave traders, whereas Baltimore, Savannah, and Charleston were supplied mostly by the British drawing from different points in Africa. But what we found is that the mixing of the West African ancestors of African Americans has been so thorough that we could not detect any differences in the African source populations for mainland populations. Only in the Sea Islands off South Carolina did we detect evidence of a particular connection to one place in Africa, in this case to people of the country of Sierra Leone, the place of origin of the language with an African grammar still spoken by Gullah Sea Islanders. It will take ancient DNA studies of first-generation enslaved Africans to actually trace roots to Africa.

The problem with the results sometimes provided by personal ancestry testing companies is not limited to African Americans. It is a more general pitfall that stems from the financial incentive that such companies have to provide people with what feel like meaningful findings. This is a problem even for the most rigorous of the companies.
Between 2011 and 2015, the genetic testing company 23andMe provided customers with an estimate of their proportion of Neanderthal ancestry, allowing them to make a personal connection to the research showing that non-Africans derive around 2 percent of their genomes from Neanderthals. The measurement made by the test was highly inaccurate, however, since the true variation in Neanderthal proportion within most populations is only a few tenths of a percent, and the test reported variation of a few percentage points. Several people have told me excitedly that their 23andMe Neanderthal testing result put them in the top few percent of people in the world in Neanderthal ancestry, but because of the test’s inaccuracies, the probability that people who got such a high 23andMe Neanderthal reading really do have more than the average proportion of Neanderthal ancestry is only slightly greater than 50/50. I raised this problem to members of the 23andMe team and even highlighted the problems in a 2014 scientific paper. Later, 23andMe changed its report to no longer provide these statements. However, the company continues to provide its customers with a ranking of the number of Neanderthal-derived mutations they carry. This ranking, too, does not provide strong evidence that customers have inherited more Neanderthal DNA than their population average.

Not all the findings reported by the personal ancestry companies are inaccurate, and many people have obtained what for them is satisfying information from such testing, especially when it comes to tracing genealogies where the paper trail runs cold. One example is adoptees seeking their biological parents. Another is tracking down extended families. From my own perspective, though, I do not find this approach to be satisfying.
In preparing to write this book, I considered whether I should send my DNA to a personal testing company or study it in my own lab, and then describe the results, in imitation of the approach taken by many journalists covering the field of personal ancestry testing. But honestly, I am not interested. My own group, Ashkenazi Jews, is already overstudied. I am confident that my genome will be much like that of anyone else from this population. I would much rather use any resources I have to sequence the genomes of people who are understudied. I am also worried about the intellectual pitfall of self-study. I am innately suspicious of scientists who are hyper-interested in their own family or culture. They simply care too much. In my own laboratory, there are researchers from all over the world, and I encourage them, not always successfully, to choose projects on peoples not their own.

For me, the approach of using the genome as a tool to connect myself to the world around me through personal links of family and tribe seems parochial and unfulfilling. What the genome revolution has given us, though, is an even more important way to come to grips with who we are: a way to hold in our minds the extraordinary human diversity that exists today and has existed in our past.

The problem of understanding the connections between self and the world is a central one for me, and has driven my lifelong interest in geography, history, and biology. Ironically for a person like myself, who is not at all religious, it is an example from the Bible that provides me with insight into how the genome revolution might be able to help solve this existential problem.

Every year on the holiday of Passover, Jews sit around the dinner table and recount the story of the Exodus from Egypt. The Passover holiday is important to Jews because it reminds them of their place in the world and encourages them to draw lessons about how they should behave.
This narrative has been extraordinarily successful, as measured by the fact that it has sustained Jews in their identity for thousands of years as a minority living in foreign lands.

The Passover story begins with the myth of the patriarchs in ancient Israel: the first generation of Abraham and Sarah; the second of Isaac and Rebecca; the third of Jacob, Leah, Rachel, Bilhah, and Zilpah; and the fourth generation of twelve male children (the forefathers of the tribes of Israel) and a daughter, Dinah. These people are too removed from the huge populations of today to seem meaningfully connected to the present. The literary device that connects this ancient family to the multitudes that follow is Joseph, one of the sons of Jacob, who is sold by his brothers into slavery in Egypt, and who rises to a position of great power. When a famine strikes the land, the rest of the family also migrates to Egypt, where they are welcomed by Joseph despite the earlier crime they had committed against him. Four hundred years pass, and their descendants exponentially multiply into a nation numbering more than six hundred thousand military-age men and an even larger number of women and children. Under the leadership of Moses, they break their bonds of oppression, wander for dozens of years, and work out their code of laws. They then return to the Promised Land of their ancestors.

After reading the Passover story, Jews intuitively understand how within their population, numbering millions of people, they are related to each other and the past. The story allows Jews to think of those millions of coreligionists as direct relations (and to treat them with equal respect and seriousness even if they do not understand their exact relationships), to break out from the trap of thinking of the world from the perspective of the relatively small families we were raised in.
For me, the multitude of interconnected populations that have contributed to each of our genomes provides a similar narrative that helps me to understand my own place in the world and to avoid being daunted by the vast number of people in our species, the immensity of the human population numbering in the billions. The centrality of mixture in the history of our species, as revealed in just the last few years by the genome revolution, means that we are all interconnected and that we will all keep connecting with one another in the future. This narrative of connection allows me to feel Jewish even if I may not be descended from the matriarchs and patriarchs of the Bible. I feel American, even if I am not descended from indigenous Americans or the first European or African settlers. I speak English, a language not spoken by my ancestors a hundred years ago. I come from an intellectual tradition, the European Enlightenment, which is not that of my direct ancestors. I claim these as my own, even if they were not invented by my ancestors, even if I have no close genetic relationship to them. Our particular ancestors are not the point. The genome revolution provides us with a shared history that, if we pay proper attention, should give us an alternative to the evils of racism and nationalism, and make us realize that we are all entitled equally to our human heritage.