Master Class: A Conversation with Jonathan Birch About the Equivalence of Theories of Social Evolution

In my seemingly endless writing on multilevel selection (my first article was published in 1975), I often lament that the general public lags so far behind the peer-reviewed literature. The former still thinks that group selection has been rejected in favor of kin selection and all that, whereas the latter has embraced the concept of equivalence.

Equivalence notes that theories can differ, not by invoking different causal processes, but by offering different perspectives on the same causal processes. Three metaphors can make the concept of equivalence clear (see my book Does Altruism Exist? for a fuller treatment). First, imagine a graph that shows the distribution of a variable (such as body weight), and imagine calculating a mean and variance from these data. The two summary statistics look different from the distribution, but they are based on the same numbers and there is a way to translate between them. Actually, the translation only flows one way, because the summary statistics can be calculated from the distribution but the distribution cannot be recovered from the summary statistics. Many distributions can result in the same mean and variance, a fact to which I will return below.
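The one-way translation can be checked with a small numerical sketch. The two body-weight samples below are made up purely for illustration (the names `clustered` and `peaked` are hypothetical); they have different shapes but the same mean and variance, so the summary statistics alone cannot tell them apart.

```python
from statistics import mean, pvariance

# Two hypothetical samples (illustrative numbers, not real data).
clustered = [1, 1, 3, 3]             # two equal-sized clusters
peaked = [0, 2, 2, 2, 2, 2, 2, 4]    # one central peak plus two extremes


def summarize(sample):
    """Return (mean, population variance) of a sample."""
    return mean(sample), pvariance(sample)


# Both samples give the same pair of summary statistics,
# even though the underlying distributions differ.
print(summarize(clustered))
print(summarize(peaked))
```

Going from either list to its summary is a simple calculation; going from the summary back to a unique list is impossible, because both lists are valid answers.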

Second, imagine two people who speak different languages—say, English and Spanish—declaring that each other’s languages are confusing and wrong. Maybe they’d change their minds if they became bilingual. Then they might appreciate the different perspectives offered by different languages.

Third, imagine that you decide to climb a mountain with a friend and are standing together to plan your ascent. You decide that it would be better to view the mountain from different locations to get a better sense of its contours. It’s the same mountain, but it can usefully be seen from different vantages.

These three metaphors illustrate the concept of theories that invoke the same causal processes but deserve to coexist by virtue of their different perspectives. The controversy over group selection that emerged in the 1960’s made it seem as if one theory could be rejected in favor of another, but it was really more like monolingual people declaring each other to be confusing and wrong. Thankfully, the scientists closest to the subject, who contribute to the peer-reviewed literature, have become more multi-lingual. If only the general public would catch up!

Jonathan Birch is definitely a scientist close to the subject, but in a recent academic article titled “Are Kin and Group Selection Rivals or Friends?” he offers an account of equivalence quite different from my own. Yes, equivalence has become the consensus among the cognoscenti, but it hasn’t really drawn the two sides closer together. Moreover, equivalence has become “a purely statistical formulation” that isn’t really worth wanting. Perhaps we should be drawing distinctions between kin selection and group selection after all, in what Birch calls a “K-G space”.

I agree with about half of Birch’s analysis but disagree substantively with the rest. Rather than debating the issues by trading articles in the peer-reviewed literature (a process that can require years), I invited him to have a conversation on TVOL and he graciously accepted. TVOL readers can think of it as a Master Class in kin selection, group selection, and all that. The best way to prepare is by reading Jonathan’s academic article, which is open access.

David Sloan Wilson (DSW): Welcome Jonathan! I admire your work and look forward to taking a deep dive with you. Please introduce yourself to our audience. What is your academic background and what drew you to the study of social evolution?

Jonathan Birch (JB): I started with an undergraduate degree in Natural Sciences at Cambridge, gravitated towards the excellent History & Philosophy of Science Department they have there, and then took a PhD in philosophy of science.

At first, my main motivation was to understand what an organism is. Bacteria, trees and elephants are organisms, whereas biofilms, forests and herds of elephants are not. Organisms, though, are themselves groups of lower-level entities (cells, organelles, etc.). So what does it take for something to constitute an organism rather than a group? Obviously, it has something to do with “integration”, but this is just restating the problem: what kind of integration does the job and why? And what should we say about really integrated groups like ant colonies—the kind of groups many (including you, I think!) want to call “superorganisms”?

I still don’t have a good answer to these questions. But they’re the questions that led me towards social evolution theory. I came across the idea (in a paper by Dave Queller and Joan Strassmann) that organisms are literally nothing more than social groups with very high levels of cooperation and very low levels of conflict. That led to a lasting interest in trying to understand social evolution theory, and in seeing whether it can help explain the origins and nature of organisms.

DSW: Thanks! Let me begin by outlining what I see as our zone of agreement. I think it’s safe to say that we are both multi-lingual and can speak the “languages” of not just two but a whole family of theories of social evolution: group selection, kin selection (aka inclusive fitness), evolutionary game theory (aka reciprocity), selfish gene theory, and extended phenotypes. New terms continue to be coined, such as “social selection”. Why these perspectives proliferate is an interesting question, suited for sociologists, historians, and philosophers of science as much as practicing scientists.

We agree that something called equivalence has become the consensus view among the cognoscenti. But I’d like to push back on the impression you give in your article that the two camps are as far apart as before. When I was a graduate student in the 1970’s, it was almost mandatory for authors to assure their readers that group selection was not being invoked. Now it is almost mandatory to acknowledge equivalence. That’s a huge difference! I, for one, always acknowledge that the gene-centered view is insightful, as long as it’s not used as an argument against group selection. To pick someone from the “other side”, Andy Gardner has made a useful contribution to MLS theory which includes the statement “social evolution theorists now widely agree that a covariance between group trait and group fitness may arise in the natural world, resulting in a response to group selection.” Please—let’s acknowledge the progress that has occurred!

JB: This is progress! And yet, I’m still troubled by the divisions that clearly exist.

Andy Gardner’s paper is an example: although he acknowledges that group selection can happen, and acknowledges a form of equivalence, he goes on to claim that “there can be kin selection in the absence of group selection, as defined above, even in populations that are structured into clearly defined kin groups.”

Gardner’s argument is that, although kin and group selection (or multi-level selection, MLS) theory are “equivalent” when applicable, kin selection theory is the mathematically richer of the two, and the one that is better able to handle complexities like class structure (i.e. populations where you have distinct classes of organism such as “worker” and “queen” that differ in their reproductive value). This point about methodology is then taken to imply that group selection is “absent” when you have class structure.

The other side of the divide is well illustrated by the last sentence of your 2007 paper with E. O. Wilson. This paper too acknowledges equivalence, but ends with the line: “Selfishness beats altruism within groups. Altruistic groups beat selfish groups. Everything else is commentary.”

The sentiment being expressed here, if I’m reading it correctly, is that, equivalence notwithstanding, the group selection tradition has latched on to the fundamental causal explanation of why altruism evolves. All the kin selection tradition has done is develop an unhelpful way of expressing that insight that buries it deep in a superficially group-free formalism.

To return to your linguistic analogy, both sides seem to regard the other as a bit like Pig Latin, where you move the first letter of each word to the end. Yes, all the same claims can be expressed in that language, but why on earth do you think that’s a good idea?

DSW: There is a lot to unpack here. I think it is important to distinguish between two debates taking place in parallel and, being debates, both have two sides. The first debate is about the virtues of different mathematical and statistical formalisms, which you describe in your article. This is what Gardner means when he describes his model of kin selection as “mathematically richer”. On the other side we have the infamous Nature paper by Martin Nowak, Corina Tarnita, and Ed Wilson asserting that their mathematical formalism is better than inclusive fitness theory.

The second debate is about causal processes that can be described in words, graphs, computer simulations, or any number of formal analytical models. Darwin’s theory of natural selection is like this. It is a series of three causal claims that can be described in words: 1) Individuals vary; 2) Their differences make a difference in terms of survival and reproduction; and 3) Offspring resemble their parents. Therefore…

Of course, the verbal description can be enhanced with models of various sorts, which have the virtue of precision but also the limitation of their simplifying assumptions. The more complex the real world being modelled, the greater the limitations of the models. The physics of celestial bodies rotating in space hits a complexity wall going from two bodies to three bodies. Population genetic theory hits a similar complexity wall going from two-locus to three-locus models, not to speak of the whole genome!

Against this background, the need for group selection can be seen as an addendum to Darwin’s theory of natural selection that can be described in words: 4) Prosocial behaviors oriented toward the welfare of others usually require time, energy, and risk on the part of the individual actors; 5) In the absence of other reinforcing behaviors such as rewards and punishments, this places prosocial individuals at a direct fitness disadvantage, compared to more self-oriented individuals with whom they interact, who receive social benefits without providing them; 6) Social behaviors are almost always expressed among sets of individuals (groups) that are small compared to the total evolving population; 7) To find the direct relative fitness advantage of prosocial behaviors, it is necessary to go up in scale. Groups (as defined above) with a higher frequency of prosocial individuals collectively survive and reproduce better than groups with a lower frequency of prosocial individuals. This is what Ed Wilson and I summarized with our pithy “Selfishness beats altruism within groups. Altruistic groups beat selfish groups. Everything else is commentary.”

Against this background, equivalence in the causal sense of the term asserts that most theories of social evolution reflect claims 4-7. Notice that this generality requires a certain definition of groups. Define groups another way, and the generality goes away. Also, I have been careful to use the term “direct fitness” and added the caveat “in the absence of reinforcing behaviors such as rewards and punishments”. I will elaborate on these points below. For now, the main point I want to make is this one: Like the theory of natural selection described in statements 1-3, MLS theory is a series of causal claims that can be stated in words (statements 4-7) without requiring mathematical or statistical formalisms at all. This is the second debate that takes place in parallel to debates about whose mathematical formalism is the best. Are we in agreement on these points?

JB: I think so! I agree that the core commitments of MLS theory can be stated verbally, and the theory doesn’t need to be wedded to any specific formalism.

Your claims 4-7 raise the question of what constitutes a group, and this seems like a crucial question to me. Your claims, as you note, rely on a very broad definition of group. And you assume that, even when group is defined in this very broad way, it still makes sense to talk about groups “collectively surviving and reproducing”. There is room for debate here. I think it might be a more substantial notion of a group, one requiring really tight integration, that gives rise to genuinely collective reproduction.

I talk in my Current Biology essay about the “quest for generality”. Both the kin selection modelling tradition and the multi-level tradition have sought, over the years, to cover as wide a range of cases as they possibly can. On the kin selection side, this has led to a broadening of the concepts of “relatedness”, “cost” and “benefit”. For example, relatedness is understood to encompass any kind of genetic assortment. It sometimes seems as though little importance is attached to the question of whether social partners are close genetic kin.

On the group selection side, the quest for generality has led to a remarkably broad definition of “group”, as captured in your “trait-group” concept. Trait-groups can be very ephemeral, very unstable, and they can be “continuous” rather than discrete: the groups can blur into one another, without sharp boundaries where one group ends and the other begins.

“Group” has to be defined in this very broad way for it to be the case that groups are present wherever altruism evolves. They have to be defined in this broad way for your slogan to be true. An example that brings this out is a “one-shot” two-player game where the players interact once, then disappear back into the population and never meet again. Altruism can evolve in such a model if the players’ genotypes are correlated. For your slogan to be true, you have to say that these two organisms, who interact once during their entire lifetime, form a “group”. That’s a broad definition of “group”.
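The one-shot example can be made concrete with a minimal sketch. It assumes a simple pairing rule that is not spelled out in the conversation: with probability `r` a player meets a partner of its own type, and otherwise it meets a random member of the population in which altruists have frequency `p`. The baseline fitness `w0` and all function names are hypothetical.

```python
def population_fitness(p, r, b, c, w0=1.0):
    """Average fitness of altruists and selfish players across all pairs.

    Assumed pairing rule: with probability r the partner is the same type,
    otherwise the partner is drawn at random (altruist with probability p).
    """
    partner_altruist_if_altruist = r + (1 - r) * p
    partner_altruist_if_selfish = (1 - r) * p
    w_altruist = w0 + b * partner_altruist_if_altruist - c
    w_selfish = w0 + b * partner_altruist_if_selfish
    return w_altruist, w_selfish


def mixed_pair_payoffs(b, c, w0=1.0):
    """Fitness inside a single altruist-selfish pair (the two-member 'group')."""
    return w0 - c, w0 + b  # (altruist, selfish)


w_alt, w_self = population_fitness(p=0.1, r=0.5, b=3.0, c=1.0)
print("population averages:", w_alt, w_self)
print("inside a mixed pair:", mixed_pair_payoffs(b=3.0, c=1.0))
```

Under this pairing rule the population-wide difference `w_altruist - w_selfish` works out to `r * b - c`, so correlated genotypes can favor altruism overall even though the altruist always does worse than its partner inside a mixed pair.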

In the essay, I push back against this a little bit. I suggest that it may be more helpful to reserve the term “group selection” for those cases where reasonably enduring, stable, well-bounded groups can be identified. (I call the degree of well-bounded and stable group structure “G”). On the other side, I suggest it may be more helpful to reserve the term “kin selection” for those cases where there is whole-genome relatedness between real genetic kin (I call the degree of whole-genome relatedness “K”). That makes it possible to have an empirically-driven debate about the relative importance of these causal factors.

If you think about the distinction my way (and perhaps we’ll come back to this), there will be some evolutionary processes that are clearly kin selection and not group selection at all, some that are clearly group selection and not kin selection at all, some that are both, and some that are neither. They are neither equivalent nor mutually exclusive. Instead, we think about them as things that come in degrees. A given process has a degree of “kin selection character” and a degree of “group selection character”. In the article, I explain how we can plot these variables in what I call a “K–G space”.

DSW: I will come back to this, but first I’d like to unpack the caveat “in the absence of reinforcing behaviors such as rewards and punishments”. It is intuitively obvious that altruism can beat selfishness within groups if altruism is rewarded and selfishness is punished by other members of the same group. To understand what is going on in detail, we must think about rewarding and punishing as separate traits that coevolve with the altruistic trait being reinforced. For example, punishing selfishness requires time, energy, and risk on the part of the punisher, which creates a direct relative fitness disadvantage compared to non-punishers within the same group. Invoking punishment to explain the evolution of altruism doesn’t change the problem, but merely relocates it to the punishing trait. Ditto for time, energy, and risk required to reward altruists. Agreed?

JB: That’s right—punishment itself calls for explanation, and it sometimes leads to “second-order free rider problems”, if individuals stand to gain from leaving all the punishment to others.

The exceptions are cases where the punishment is directly beneficial to the punisher in some way. For example, worker ants in many species are not fully sterile, but they will “police” each other’s egg-laying, eating any eggs they find. Eating the egg may be directly beneficial to the eater—at the very least, it offsets the energy cost.

DSW: Agreed. As for the distinction between direct vs. indirect fitness, this is worth spending time on to get it right. I said that we would be taking a deep dive! The concept of indirect fitness is the hallmark of inclusive fitness theory and requires calculating the effect of a behavior on the gene influencing the behavior in both the actor and the recipients of the behavior. To see how this works in a model of social interactions among full siblings, imagine a mutant dominant gene for altruism (A) in a large population of selfish individuals (a). Because it is the only one, the mutant gene exists as a heterozygote (Aa) that mates with a selfish homozygote (aa) and produces a clutch of offspring that socially interact with each other. It doesn’t matter whether the siblings confine their interactions to each other because they are spatially isolated or because they are capable of recognizing each other. It is the social interactions, and not the physical appearance of a group to our eyes, that counts.

In the sibling group, the altruists (Aa) exist at a frequency of about 50% (there will be sampling error around this mean). Altruists deliver a benefit (b) to their siblings at a cost (c) to themselves. They always bear the cost and the recipient shares the altruistic gene 50% of the time, so there is a net increase in copies of the altruistic gene (A) when (0.5)b-c>0. This is the inclusive fitness of the altruist, compared to its direct fitness of -c. However, we have not yet calculated relative fitness within the sibling group. There is an even larger increase in the copies of the selfish gene because selfish siblings are the recipients of altruism the other 50% of the time and they never bear the cost. Hence, calculating indirect fitness rather than direct fitness does not alter the conclusion that the altruistic gene is less fit than the selfish gene within every sibling group containing both genes.

To know whether the altruistic gene evolves in the total population, we need to compare the fitness of the two genes averaged across all of the sibling groups. The altruistic gene exists in only one group, so its net fitness effect is (0.5)b-c. In a large population, almost all of the selfish genes exist in sibling groups comprised entirely of selfish individuals. The lucky few selfish genes that exist in the sibling group with altruists are such a small proportion of all the selfish genes that they can be ignored. Hence, the altruistic gene increases in frequency in the total population when (0.5)b-c>0. This is Hamilton’s rule. It correctly predicts when altruism evolves in the total population, but obscures the fact that altruism evolves only by virtue of a fitness difference between groups and not a fitness difference among individuals within the group of socially interacting individuals.
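The two comparisons in the sibling-group argument (within the mixed group, and across the whole population) can be sketched numerically. The sketch makes assumptions that go beyond the verbal description: every member of a group receives an expected benefit of `p * b`, where `p` is the local frequency of altruists; there is a baseline fitness `w0`; and the group size and number of all-selfish groups are arbitrary illustrative values.

```python
def sibling_group_fitness(b, c, p=0.5, w0=1.0):
    """Fitness of (altruist, selfish) siblings in a group with altruist frequency p."""
    shared_benefit = p * b  # assumed: benefit is spread evenly over the group
    return w0 + shared_benefit - c, w0 + shared_benefit


def population_comparison(b, c, n_selfish_groups=10_000, group_size=10, w0=1.0):
    """Compare the rare altruistic gene with the selfish gene over all groups."""
    w_alt, w_self_mixed = sibling_group_fitness(b, c, w0=w0)
    n_self_mixed = group_size - group_size // 2   # selfish siblings in the one mixed group
    n_self_pure = n_selfish_groups * group_size   # selfish individuals everywhere else
    mean_selfish = (n_self_mixed * w_self_mixed + n_self_pure * w0) / (
        n_self_mixed + n_self_pure
    )
    return w_alt, mean_selfish


w_alt, w_self_mixed = sibling_group_fitness(b=3.0, c=1.0)
print("within the mixed group:", w_alt, "<", w_self_mixed)

w_alt, mean_selfish = population_comparison(b=3.0, c=1.0)
print("across the population:", w_alt, "vs", mean_selfish)
```

With `b=3.0` and `c=1.0` the altruist is less fit than its selfish siblings inside the mixed group, yet fitter than the average selfish individual in the population, because (0.5)b-c>0. Lowering `b` to 1.5 reverses the population-level comparison while leaving the within-group disadvantage in place.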

It is important to stress to our readers, and to agree between ourselves, that everyone at the time, from Hamilton on down, thought that inclusive fitness theory explained the evolution of altruism without invoking group selection. It was only by encountering the Price equation that Hamilton realized that altruism is selectively disadvantageous within each and every kin group containing both types (the negative within-group component of the Price equation) and requires the differential contribution of groups to the evolving populations to evolve (the positive between-group component of the Price equation). Whatever we might say about the failings of a statistical partitioning method such as the Price equation, it did deliver this fundamental insight to Hamilton. And Hamilton was happy to embrace the conclusion! This makes it especially intriguing why so many of his followers couldn’t follow him in this particular respect.
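The partition referred to here can be written compactly. The form below is the standard two-level Price equation under simplifying assumptions that are added for this sketch: groups of equal size and no transmission bias. The symbols are also introduced here for illustration: $p_{jk}$ and $w_{jk}$ are the altruistic-gene frequency and fitness of individual $j$ in group $k$, $P_k$ and $W_k$ are the corresponding group means, and $\bar{w}$ is mean fitness in the whole population.

```latex
\bar{w}\,\Delta\bar{p}
  = \underbrace{\operatorname{Cov}_k\!\left(W_k,\, P_k\right)}_{\text{between groups}}
  + \underbrace{\operatorname{E}_k\!\left[\operatorname{Cov}_j\!\left(w_{jk},\, p_{jk}\right)\right]}_{\text{within groups}}
```

For an altruistic trait the within-group term is negative in every group containing both types, so the gene can increase only if the between-group covariance is positive and large enough to outweigh it.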

To summarize, equivalence can be argued in causal terms, so it need not be an argument about mathematical formalisms or statistical abstractions. In addition, once we appreciate the limitations of formal models (their need for simplifying assumptions) in addition to their strengths (precision), then there will always be a diversity of formal models addressing complex topics in physics, biology, or the human social sciences. The idea that one formalism is superior to all the others emerges as wrongheaded.

JB: Yes—I support a genuine pluralism in this area, where researchers make free use of kin and group selectionist methodologies and “speak both languages”, as it were.

And one of the goals of my Current Biology essay is to encourage people to read Bill Hamilton’s 1975 paper on “Innate Social Aptitudes of Man”, in which an argument for equivalence based on the Price equation is made for the first time. Hamilton, I think, has a nuanced and appealing way of thinking about the concepts of kin and group selection, and my “K–G space” is inspired by a quotation from the paper.

I don’t agree with everything you said, however. When you say that the inclusive fitness approach “obscures the fact that altruism evolves only by virtue of a fitness difference between groups”, that sounds like the type of “Pig Latin” accusation I mentioned earlier. Hamilton says: altruism evolves by virtue of an inclusive fitness difference between individuals. You say: altruism evolves by virtue of fitness differences between groups. Both are correct descriptions in different languages, so why say that one “obscures” and the other reveals?

DSW: I think that I can say this based on the history of the subject. We have to remember that the main context for thinking about group selection, starting with Darwin, was not the evolution of cohesive groups that we associate with major evolutionary transitions, but the problem of explaining the evolution of single prosocial behaviors, despite their local relative fitness disadvantage. When modelers such as Sewall Wright, Ronald Fisher, and J.B.S. Haldane tackled this problem (which they did only briefly), each made different specific assumptions about the groups, since the purpose of their models was to capture the gist of the problem, similar to Darwin talking about “tribes” and “communities” in words. V.C. Wynne-Edwards was not a modeler. He gestured toward Sewall Wright for that kind of authority and proceeded to describe a diversity of population structures, including some that did not have sharp group boundaries. Gregory Pollock has written an important article on this subject that deserves to be widely read. Discrete group boundaries are prominent in the theoretical literature because they are mathematically tractable, not because naturalists such as Wynne-Edwards thought that they must be important.

Despite their diversity, what all of these models and verbal descriptions shared in common was that: 1) the trait of interest was locally disadvantageous, but; 2) nevertheless evolved because of the differential contribution of groups to the total evolving population. Nothing more was required for groups to count as “units of selection”.

This is the historical background for the advent of theories that claimed to explain the evolution of altruism without invoking group selection: not just inclusive fitness theory but also evolutionary game theory and selfish gene theory. The standard narrative was to acknowledge that group selection is a theoretical possibility but one that is seldom realized because within-group selection is invariably stronger than between-group selection. Then, in every case, individual-level or gene-level fitness is defined on the basis of what evolves in the total population, not what evolves within the groups defined by the model, essentially defining group selection out of existence. This is a fallacy, plain and simple, which Elliott Sober and I dubbed The Averaging Fallacy.

It is important to stress that there is something genuinely subtle about multilevel selection. The idea that a trait can evolve in the total population, despite being selectively disadvantageous in every group, seems impossible. In statistics, it is called Simpson’s Paradox and has applications outside of evolutionary theory. Thus, smart people can be forgiven for not seeing the elements of multilevel selection within their own models—but it is still a mistake that needs to be corrected!
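A small numerical sketch shows how this can happen. It assumes a simple fitness scheme chosen for illustration: everyone in a group gains `b` times the local frequency of altruists, and altruists additionally pay `c`. The two starting groups, the parameter values, and the function names are hypothetical.

```python
def reproduce(groups, b, c, w0=1.0):
    """Return offspring counts for each (n_altruists, n_selfish) group.

    Assumed fitness scheme: every group member gains b * p, where p is the
    local altruist frequency; altruists additionally pay cost c.
    """
    offspring = []
    for n_alt, n_self in groups:
        p = n_alt / (n_alt + n_self)
        w_alt = w0 + b * p - c
        w_self = w0 + b * p
        offspring.append((n_alt * w_alt, n_self * w_self))
    return offspring


def altruist_frequency(groups):
    """Frequency of altruists pooled over the given groups."""
    total_alt = sum(alt for alt, _ in groups)
    total = sum(alt + selfish for alt, selfish in groups)
    return total_alt / total


parents = [(80, 20), (20, 80)]  # one mostly altruistic group, one mostly selfish
children = reproduce(parents, b=2.0, c=0.25)

for before, after in zip(parents, children):
    print("group:", altruist_frequency([before]), "->", altruist_frequency([after]))
print("total:", altruist_frequency(parents), "->", altruist_frequency(children))
```

The altruist frequency falls inside each group, but the mostly altruistic group contributes far more offspring, so the pooled frequency rises. That is the signature of Simpson’s Paradox in this setting.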

Part of the standard narrative is to identify something called “naïve group selection”, which assumes that traits evolve “for the good of the group” without appreciating the special conditions that are required. Fair enough, but let’s also identify something called “naïve gene selection” and “naïve individual selection”, which assume that gene-level and individual-level fitness comparisons, all things considered, qualify as arguments against group selection! This was rampant among the cognoscenti in the second half of the 20th century and remains rampant among the general public. Doesn’t this entitle me to say that, historically, so-called alternatives to group selection obscured something that required a lot of time among some very smart people to work through?

JB: But this is a coin with two sides, don’t you think? You say that, when researchers describe the conditions for the evolution of altruism using Hamilton’s rule, they “obscure” the importance of group structure. They will retort that, if one describes the conditions for the evolution of altruism using the multi-level Price equation (or some other formulation of MLS theory), this “obscures” the importance of relatedness, cost and benefit. This is the kind of thing I was lamenting earlier.

To return to your mountain analogy: it’s true enough that climbing one side of the mountain will “obscure” the other side, but it would be odd to complain about that. That’s just how it is with mountains. Equally, there’s some truth in the idea that an organizing framework for social evolution research that puts relatedness at the centre will risk underplaying the importance of group structure, and a corresponding risk that a framework which puts group structure at the centre will underplay the importance of relatedness.

That’s why I’d like to see a deeper kind of pluralism than we have now. One in which researchers see the value in both frameworks and don’t identify as belonging to one “camp” or the other.

DSW: Agreed, and others can judge how well I achieve that deeper kind of pluralism in my own writing. To pick a single case, Hamilton’s observation that haplo-diploidy results in extra-high genetic relatedness among sisters, which might be relevant to the evolution of eusociality in insects, was genuinely novel.

Let’s take stock of our progress. We agree that the concept of equivalence can go beyond statistical abstractions and apply to causal models of social evolution. We agree that the averaging fallacy is a fallacy—in other words, averaging the fitness of lower-level units across higher-level units does not constitute an argument against higher-level selection. We agree that the major theories of social evolution justify their coexistence by offering different insights, in the same way that it is useful to view a mountain from different perspectives. Finally, we agree that even when folks such as Andy Gardner and you define groups more narrowly than I do, it is not the case that within-group selection invariably trumps between-group selection. This is real progress!

Now I’d like to defend my broad definition of groups, but in a way that is respectful of your K-G space. I think that my broad definition captures the essence of the problem that group selection was invented to solve: the local disadvantage of prosocial traits. The fact that my definition includes groups that are small and ephemeral in addition to large and enduring (not in an arbitrary way, but as required for each trait under consideration, which is the trait-group concept) strikes me as a strength. That said, a broad definition of groups does not alter the fact that there is a huge multi-factorial parameter space to explore, which will require defining types of groups, or more generally types of population structures. My challenge is roughly the same as your challenge in trying to define what you call a K-G space. If you want to define groups more narrowly than me, then we’ll need to become bi-lingual to translate between our equivalent frameworks.

Here is an example to distinguish between “types of groups” and “types of population structures”. When offspring are deposited close to parents, as with many species of plants, then neighbors tend to be genetic relatives. Hamilton called this “population viscosity” and assumed that it would favor the evolution of altruism. Given Hamilton’s rule, why conclude otherwise? But Hamilton’s rule includes an implicit assumption that groups of altruists export their productivity to other regions of the evolving population. Complexity theorists such as Yaneer Bar-Yam call this the mean field approximation. It is biologically justified when there is a global dispersal phase of the life cycle, as when sibling groups of caterpillars socially interact and then fly away as butterflies. But it is not justified with plant-like population viscosity, when the many progeny produced by altruistic patches are deposited primarily within the same patch. In this case, local competition can cancel out the collective advantages of altruism. Inclusive fitness theory can be adjusted to reflect local competition, but it was an MLS model that led to the original insight that purely viscous populations are not necessarily favorable for the evolution of altruism.

The main point that I’m trying to make in the context of our conversation is that even when you define a type of group in your K-G space, such as “groups of genetic relatives”, there is still a multi-factorial parameter space concerning such things as the presence or absence of a global dispersal stage. The parameter space that both of us must navigate with our verbal taxonomies goes beyond two dimensions!

JB: The scale of competition also matters, yes. Certainly, I don’t want to suggest that “K” and “G” are the only variables that matter for understanding the different varieties of social evolution. I’m just saying they are variables that matter.

It’s true that some of the important early insights into the importance of the scale of competition came from your work on multi-level selection. The idea was picked up by kin selection theorists, and incorporated, and is now an important idea for them too. This just underlines the absurdity of trying to draw any kind of sharp distinction between the two approaches. I think, all along, they have been pretty well intertwined and symbiotic, in constant dialogue with each other (as we’ve noted, Hamilton’s 1975 article is an example).

DSW: Now let’s switch gears and talk about major evolutionary transitions. Way back in 1997, I wrote an article titled “Altruism and Organism: Disentangling the Themes of Multilevel Selection Theory”. The first theme concerns the evolution of single traits that are selectively disadvantageous within groups and therefore require between-group selection to evolve. The second concerns groups that become so cooperative, in so many respects, that they qualify as superorganisms in their own right. Thinking of social insect colonies and human societies as superorganisms has a long history (the cover of Hobbes’s Leviathan features a big human made out of little humans), but it wasn’t until the 1970’s that Lynn Margulis proposed that nucleated cells are bacterial superorganisms. Then John Maynard Smith and Eörs Szathmáry began to pull it all together in their 1995 book The Major Transitions in Evolution, which covered everything from the origin of life as groups of cooperative molecular interactions to symbolic thought in humans. What are your impressions of how the concept has been developed up to the present? What are the major unanswered questions? It strikes me that this is one topic area where the different “camps” are largely in agreement, in part because what constitutes the group, selection at the group level, and the need to suppress disruptive selection within groups (e.g., cancer) are so obvious. Is that also your impression?

JB: As I said earlier, I’ve been fascinated by the idea of organisms as societies, and societies as organisms, for a long time. Something that’s easy to miss, in this context, is that every human being is a society of cells, or a “cell state”. The multicellular organism is, intrinsically, a social phenomenon. In my book The Philosophy of Social Evolution (Chapter 7), I discuss the history of this idea, which has its roots in the nineteenth century, and especially in the work of Rudolf Virchow, Ernst Haeckel, and Herbert Spencer (for more, see this blog post).

The more I read about this, the more I started to wonder whether we even need the concept of a super-organism. If all organisms can be seen as well integrated, highly cooperative societies, aren’t insect colonies just a type of organism? Isn’t an extremely cohesive social group just what an organism is? Strassmann and Queller have defended this view. Hamilton gestured towards it in his 1964 paper introducing the concept of inclusive fitness. At one point, he notes:

[O]ur theory predicts for clones a complete absence of any form of competition which is not to the overall advantage and also the highest degree of mutual altruism. This is borne out well enough by the behavior of clones which make up the bodies of multicellular organisms.

I think the inclusive fitness perspective is useful for understanding how multicellular organisms first evolved from groups of unicellular organisms, for the reason Hamilton gave, and I think the multi-level perspective is useful too, for the reasons you give.

In my Current Biology essay, I relate some of these ideas about transitions to the concept of K–G space. The evolution of multicellular organisms, or of organismal insect societies (i.e. a “fraternal transition”), is something that becomes possible when you have high G and high K at the same time: stable, well-bounded social groups related through kinship at every locus in the genome. Even then, it will still be rare and will require the right ecological conditions. I think that, when a population is in this region of K–G space, it is pointless to debate whether the process driving the evolution of cooperation is kin selection or group selection. It has the core features of both. It is “kin-group selection”.

The two “camps” of kin and group selection theorists should join forces—and throw everything they’ve got at the problem of understanding the transformation of a social group into a new, higher-level organism.

You ask about “major unanswered questions”. There are too many to list, of course! But here’s a puzzle that especially interests me: there may have been as many as 25 separate instances of the evolution of multicellular life, but in only three cases (plants, animals, and fungi) has the lineage in question proceeded to evolve large numbers of distinct cell types. As Eörs Szathmáry and Lewis Wolpert once put it, “three hits in 3.5 billion years is not that many”. It seems that evolving high numbers of specialized cell types remains very improbable even once a lineage has evolved multicellularity. Special further conditions are required, but we don’t understand these special conditions very well at all.

DSW: Thanks! I can’t help but jump in with some observations of my own. As you know, but as needs to be stressed for our readers, there are two forms of major transitions, called fraternal and egalitarian. Fraternal transitions start out genetically homogeneous, as you say, but egalitarian transitions start out diverse, so much so that they can begin as multi-species associations, as with the origin of the nucleated cell as a symbiotic community of bacteria. The key requirement for an egalitarian transition, as the name implies, is the absence of differential reproduction within the group.

Even the evolution of multicellularity, the strongest example of a fraternal major transition, requires elements of an egalitarian transition. There must be rules of meiosis ensuring the unbiased transmission of the two sets of genes. The cells might start out genetically identical but they become diverse with every new mutation. Mechanisms are required to suppress cancer, which is nothing other than disruptive selection among the cells of a multicellular organism.

Even when social insect colonies originate with a single female who mates with a single male, the colony is genetically diverse—a far cry from a multicellular organism that originates as a single cell. In current social insect colonies, single queens often mate with multiple males and there are often multiple queens. In these cases, kin selection writ small becomes part of the problem, as matrilines and patrilines act in disruptively self-serving ways within the colony. Mechanisms are required to suppress these disruptive processes, which are given terms such as “policing”, as when honey bee workers eat the eggs produced by other workers.

When we get to human social groups as organisms (I’m happy to drop the prefix “super” if you like), we enter even more into egalitarian territory. To the best of our knowledge, our ancestral social groups were mixes of genetic relatives and non-relatives, similar to modern hunter-gatherer societies. They became highly cooperative units thanks primarily to mechanisms that suppressed bullying and other forms of disruptive behaviour within groups, especially by wannabe alpha males. The difference between a chimp community and a human community is that in a chimp community, the dominant individuals get their way. In a human community, people who try to get their way are collectively suppressed so that high status must be earned by having a good reputation, which requires behaving as a solid citizen (go here for more). That’s just another way of describing an egalitarian major transition!

According to Peter Turchin, when human societies became larger with the advent of agriculture, they outstripped our genetically evolved levelling mechanisms and became despotic, more like chimp societies than human hunter-gatherer societies. But then cultural group selection went to work, leading to the remarkably cooperative megasocieties of today. In modern life, kin selection (aka nepotism) and reciprocity (aka cronyism) are seen as problems, similar to individual selfishness, that interfere with the functioning of the whole social organism—just like cancer and nepotism in social insect colonies.

Much of my current work involves studying small human groups in everyday life. They vary in how well they function, something that is obvious on the basis of our experience but can also be quantified. The groups that function best have rules in place (what the Nobel laureate Elinor Ostrom called “core design principles”) that suppress disruptive self-serving behaviours within the group. In other words, they have accomplished a miniature major transition! I find it extraordinary that the same set of ideas that explains our genetic evolution as a highly cooperative species can be applied to make our current-day groups work better at all scales (go here for more). What are your own thoughts about the distinction between fraternal and egalitarian transitions?

JB: Well, I essentially agree that the distinction is important but somewhat blurry. But don’t you think human social groups are, and probably always have been, too riven by conflicting interests to qualify as organisms or super-organisms to any substantial degree? I certainly tend to think this. So, I tend to think they haven’t undergone either type of transition.

Humans are a flashpoint, in lots of ways, for controversies about kin and group selection. My hope is that “K–G space” will be a helpful tool for thinking about these controversies. E. O. Wilson, for example, can be seen as someone whose views about early human evolution have moved through K–G space over the years. If I understand correctly, he once thought human social evolution had very little group selection character and a lot of kin selection character, and he now thinks (according to The Social Conquest of Earth) that it has a lot of group selection character and very little kin selection character.

You mention cultural group selection. I think one of my reservations about that idea (which I expressed in my review of Michael Tomasello’s A Natural History of Human Morality) is that it sometimes looks (at least at face value) to be assuming that human social life is structured into well-bounded, stable, persisting groups, which isn’t the case in the modern world and may well not have been the case in the Pleistocene world either. This feeds into a certain amount of scepticism about the idea that these groups could undergo a “miniature major transition”. But in your very broad sense of “group”, no doubt there are groups, and always have been groups.

DSW: Right—everything we have discussed about groups carries over to human groups. I’d like to end our conversation in a way that might seem circuitous but hopefully will soon make sense. Lately I have become fascinated by open-source software development, which is an amazing example of large-scale cooperation in humans [A good book on the subject is The Success of Open Source by Steven Weber]. It is imperative for the pieces of code contributed by different people to be compatible with each other. In a sense, you could say that they all need to speak the same language. When this fails to happen, different lineages of code develop that are incompatible with each other. This is called forking. Sometimes forking is okay, as when the different versions become adapted to different purposes and don’t need to communicate with each other. But mostly forking is a bad thing and elaborate efforts are made to prevent it.

Here is the connection with theories of social evolution. Some of them are useful and deserve to coexist by offering different perspectives. But others are like unhelpful forms of forking. They are developed by people who have an incomplete grasp of the literature, resulting in new terms that are redundant or altered definitions of old terms. The peer review process guards against this to a degree but not nearly as well as the oversight process in open source software development.

Another problem is when people invent new terms or ways of describing their work just to avoid using the dreaded words “group selection”. And their fear is well founded! I could tell you many stories of colleagues who had papers and grants rejected when they used the G-word, only to have better luck when the G-word was removed. In this case, the peer-review system becomes part of the problem by perpetuating a stigma. It’s easy to understand the short-term temptation to avoid using a stigmatized word, but it imposes a cost on the field of inquiry as a whole. This is where historians, sociologists, and philosophers of science can make a big contribution by providing the kind of conceptual oversight that is needed to prevent the equivalent of unhelpful forking. Do you agree with my assessment and would you like to add other thoughts on when the proliferation of equivalent theories becomes unhelpful?

JB: It’s a great question to ask any self-styled pluralist: Don’t you also see the value of unity? If you let a thousand flowers bloom, your window box is going to be a horrible mess.

There are definitely trade-offs here. I have some experience with this, because, when researching for my book, I spent a long time reviewing the different versions of Hamilton’s rule in the social evolution literature. I can see a clear case for a general version, to be used as a high-level organizing framework. I can also see a clear case for an approximate version, to be used for deriving specific predictions of evolutionarily stable strategies in particular cases. What is harder to justify are the many, many other formulations that also exist. Sometimes a bit of pruning is necessary.

DSW: And this is a diversity of approaches within Inclusive Fitness Theory, before we even get to MLS theory!

JB: Multi-level selection theory has a similar problem, I think, although in this case the theoretical literature is not as sprawling. It could do with settling on one or two preferred mathematical approaches, and developing them in real depth, so that inclusive fitness theorists can no longer taunt “our mathematical framework is more developed than yours”.

Like you, however, I see no problem with the co-existence of the inclusive fitness and multi-level approaches. They are designed to highlight and privilege different causal factors in social evolution: relatedness in one case, group structure in the other. This is fine.

DSW: In closing, I want to stress a point about mathematical models that I made earlier. A mathematical model is not like the sun that illuminates the entire earth. It is more like a flashlight that illuminates a tiny corner of the earth. There will always be many models, tailored to fit particular contexts, that must be cross-checked against each other and against empirical reality.

Thanks very much for taking this deep dive with me. I hope that it will increase literacy among the general public and might even prove interesting to our expert colleagues as well.

JB: Thanks David! I hope you don’t mind if I close with a few links, in case anyone happens to want further detail. Such a person should probably read The Philosophy of Social Evolution (introduced in this blog post), or “The Inclusive Fitness Controversy: Finding a Way Forward”, or “Are Kin and Group Selection Rivals or Friends?”.

DSW: Certainly!

David Sloan Wilson

David Sloan Wilson is SUNY Distinguished Professor of Biology and Anthropology at Binghamton University. He applies evolutionary theory to all aspects of humanity in addition to the rest of life, both in his own research and as director of EvoS, a unique campus-wide evolutionary studies program that recently received NSF funding to expand into a nationwide consortium. His books include Darwin’s Cathedral: Evolution, Religion, and the Nature of Society; Evolution for Everyone: How Darwin’s Theory Can Change the Way We Think About Our Lives; The Neighborhood Project: Using Evolution to Improve My City, One Block at a Time; and Does Altruism Exist? Culture, Genes, and the Welfare of Others.