ABSTRACT
Lactase persistence—the ability of adults to digest the lactose in milk—varies widely in frequency across human populations. This trait represents an adaptation to the domestication of dairying animals and the subsequent consumption of their milk. Five variants are currently known to underlie this phenotype, which is monogenic in Eurasia but mostly polygenic in Africa. Despite being a textbook example of regulatory convergent evolution and gene-culture coevolution, the story of lactase persistence is far from clear: Why are lactase persistence frequencies low in Central Asian herders but high in some African hunter-gatherers? Why was lactase persistence strongly selected for even though milk processing can reduce the amount of lactose? Are there other factors, outside of an advantage of caloric intake, that contributed to the selective pressure for lactase persistence? It is time to revisit what we know and still do not know about lactase persistence in humans.
INTRODUCTION
Mammals are defined by, among other things, the presence of mammary glands, specialized organs that allow females to produce milk to feed their young. Along with this evolutionary novelty arose the need for mammalian newborns to break down lactose, the only carbohydrate found in milk, into smaller, digestible molecules. This is the role of lactase-phlorizin hydrolase, a β-galactosidase also known as the lactase enzyme, which is present in the small intestine and hydrolyzes lactose into galactose and glucose monomers, molecules that are small enough to be absorbed by intestinal cells (113). The expression level of lactase starts declining after weaning and is then very low in all adult mammals (36)—with the notable exception of humans.
Instead, in approximately a third of humans (61), the expression of lactase persists throughout life, a phenotype known as lactase persistence (LP). The frequency of LP varies greatly among populations, ranging from 5% to almost 100%, with the highest frequencies found in people of northern European descent and some populations from West Africa, East Africa, and the Middle East (58, 61, 86, 114, 123, 126) ( Figure 1 ). Why is there such a physiological difference among human populations? Given that the frequency of LP parallels the ancestral milk-drinking habits of populations, with high frequencies found in pastoral or agropastoral populations that traditionally incorporated large amounts of milk into their diets, the increase in LP frequency was suggested to be the result of regular milk consumption in premodern times in populations with domesticated dairying animals, in what is known as the cultural-historical hypothesis (86, 87, 123, 124). After this phenotype was shown to be genetically inherited as an autosomal dominant trait (88, 112), LP became a textbook example of gene-culture coevolution. The story turned out to be that changes in food production practices during the Neolithic revolution (approximately 10,000 years ago, when human populations began domesticating various plants and animals) led to an increase in the frequency of genetic variants that maintain the expression of lactase in adulthood. These variants allowed carrier individuals to broaden their dietary repertoire, which in turn allowed populations to incorporate more milk into their diets. LP also became a prime example of convergent evolution: Different genetic variants that cause the same phenotypic change arose in multiple populations that adopted milking practices in different geographical areas (60, 62, 134).
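Convergent evolution
Lactase persistence is one of the classic examples of convergent evolution: the ability to keep producing lactase arose in different human populations.
Convergent evolution is the phenomenon in which organisms from different lineages evolve the same trait while adapting to similar environments.
Lactase persistence appeared independently several times during human evolution, presumably because it conferred a survival advantage in environments where dairy products were available.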
Figure 1 Lactase persistence (LP) phenotypic frequencies in the Old World. The frequencies are from the Global Lactase Persistence Association Database (GLAD; http://www.ucl.ac.uk/mace-lab/resources/glad ), originally published by Itan et al. (63) and updated in 2013. We kept only populations that included at least ten individuals and removed one recent migrant population (North Africans living in France), resulting in a total of 14,908 individuals in 194 populations. When multiple populations were sampled at the same geographical location, we modified the coordinates so that each point is visible. The histogram represents the density of populations with different LP frequencies.
Moreover, the intensity of natural selection for LP has been estimated to be among the strongest in the human genome, with a selection coefficient of approximately 0.04–0.05 (10, 29, 134). This estimate raises the question of why being able to consume milk as an adult has had such a strong influence on human reproduction and/or survival: Is it only a matter of energy intake, or is there more to the story? Despite nearly half a century of debate around this question (28, 43, 50, 58, 61, 64, 126), the nature of the selective pressures responsible for the increase of LP frequency in humans remains unclear.
Indeed, a number of observations challenge our understanding of this adaptive story. For example, milk consumption does not always correlate with LP frequency: Some pastoral and agropastoral populations have a low frequency even though they consume milk [e.g., traditional herders from Mongolia and Central Asia (55, 147)], and some populations have a high frequency even though they do not consume milk of any kind [e.g., Hadza and Yaaku hunter-gatherers from East Africa (107, 134)]. Also, ancient DNA studies have recently revealed that LP was not common in Europe until the Middle Ages (2, 18, 42, 71, 74, 75, 85, 91, 102), adding some confusion about when LP was first selected for. Given all of these puzzles, it might be time to revisit our understanding of the evolution of LP.
In this review, we discuss only the LP phenotype, defined as the presence or absence of the lactase enzyme in adults; its absence, lactase nonpersistence (LNP), is also called primary acquired lactase deficiency, adult hypolactasia, or lactose malabsorption. Lactose intolerance, by contrast, refers to the symptoms that can occur when consuming lactose, whether or not the cause is LNP. We do not cover other phenotypes, such as congenital and infant hypolactasia (the underexpression of lactase in newborns or children), secondary hypolactasia (the loss of lactase expression following certain intestinal diseases, surgeries, or drug use), and milk allergies (immune reactions of the body to various components of human or cow milk).
LACTASE PERSISTENCE: A TEXTBOOK EXAMPLE OF ADAPTATION
The Lactase Persistence Phenotype
The most straightforward way to test whether an individual is lactase persistent is to test for the activity of lactase in small-intestinal biopsies (137). However, this approach is highly invasive, so more convenient ways of assessing this phenotype have been developed. First, glucose can be measured in the blood before and after the uptake of a lactose load (typically 50 g of lactose when fasting, which is equivalent to approximately 1 L of cow milk) (93). A significant increase in glucose level means that the body is able to break down lactose into glucose in the small intestine. Alternatively, the hydrolysis of lactose can be measured based on the increase of galactose in urine excretion (48).
Lactase activity can also be evaluated by more indirect approaches, notably by looking at the amount of hydrogen in the breath (76) or by investigating the severity of intestinal symptoms (142) after a similar lactose load. Indeed, only LNP individuals will have large amounts of undigested lactose in their colon, which (a) will disrupt the osmotic balance and lead to an increase of water in the bowels, resulting in diarrhea (4), and (b) will be available to their colonic bacteria for fermentation ( Figure 2 ). During this transformation, hydrogen is produced, released into the bloodstream, and then excreted by the lungs. This production of gases during lactose fermentation (not only of hydrogen but also of carbon dioxide and methane) and the acidification of the milieu lead to several symptoms—namely flatulence, bloating, and intestinal cramps (4), although some variability among LNP individuals has been observed (20, 117, 120, 142). Furthermore, up to 8% of lactose can reach the colon in LP individuals, resulting in some additional difficulty in using these indirect tests to distinguish LP from LNP individuals (14).
Figure 2 The fate of milk and lactose in the human body. Lactose is first hydrolyzed by human or bacterial lactase enzymes and then fermented in the large intestine (colon) by lactic acid bacteria. Likely benefits of milk are shown in blue; potential harmful effects are shown in red. Abbreviations: LP, lactase persistence; LNP, lactase nonpersistence.
Despite their limitations, these phenotypic assays, performed individually or in combination, have been used since the 1960s to describe the prevalence of LP across the globe. Notably, worldwide frequencies have been compiled in several reviews (50, 58, 61, 63, 114, 126, 132) and suggest that approximately a third of humans are lactase persistent. The global picture is one of a large area of high frequencies in Europe and patchier distributions of high frequencies in West Africa, East Africa, the Middle East, and South Asia ( Figure 1 ).
More specifically, populations (from which at least ten individuals have been sampled) that have an LP frequency higher than 75% are found as follows:
In addition, in Pakistan, the results of the only two phenotypic studies are inconsistent with each other: Rab & Baseer (106) found high LP frequencies (80–100%) in five ethnic groups from the south (4–15 individuals per population), whereas Ahmad & Flatz (1) found lower frequencies (30–42%) in four similar ethnic groups from the north (8–132 individuals per population). Based on genetic data, Gallego Romero et al. (41) inferred that the LP frequency in India was at most 74% among water buffalo herders from the north of the country.
The distribution of the LP frequency across populations ( Figure 1 ) shows that some populations have a frequency of approximately 80%, implying that LP was strongly beneficial in these populations, whereas many others have a frequency of approximately 20%, more likely reflective of admixture with neighboring high-LP populations.
The Genetic Basis of Lactase Persistence
In parallel with the characterization of the LP phenotype, the question arose of whether this trait is genetically encoded or adaptively induced by the environment (e.g., by prolonged milk consumption). After a decade-long controversy, it was finally demonstrated, based on family studies, twin studies, studies of individuals who had divergent genetic ancestries but lived in the same environment, and studies of individuals with admixed genetic ancestry, that LP is an autosomal dominant trait (37, 46, 65, 79, 88, 112, 115). But not until two decades later was the first causal molecular change associated with lactase expression in adulthood finally discovered.
In 2002, a study of Finnish families by Enattah et al. (30) identified the first mutation associated with the LP phenotype: −13.910:C>T (rs4988235). Its discovery almost certainly took so long because this regulatory mutation is located 14 kb upstream of the lactase (LCT) gene, in intron 13 of the minichromosome maintenance complex component 6 (MCM6) gene, rather than within LCT or immediately upstream of it. In vitro studies (using the human intestinal cell line Caco-2 with luciferase expression vectors) quickly confirmed that this variant is a cis-acting enhancer of the LCT promoter (97, 139), and this property was demonstrated in vivo more recently (33). A further study showed that the −13.910:T variant creates a new binding site for octamer-binding protein 1 (Oct-1), a transcription factor that interacts with human hepatocyte nuclear factor 1α (HNF1α) to bind to the LCT promoter (77). This allele therefore leads to an alternative path for LCT expression that is not downregulated, as the original path is. Interestingly, in vivo studies further revealed that, even though the LP phenotype is considered a binary trait encoded in a dominant manner, lactase activity is instead a codominant quantitative trait, with a clear trimodal distribution (56, 143). This difference in expression between homozygotes and heterozygotes likely has only a minor effect on the ability to efficiently break down lactose but, as suggested by Swallow (132), it could become important under certain conditions, such as stress or disease.
Enattah et al. (30) also reported another mutation associated with the LP phenotype: −22.018:G>A (rs182549), located in intron 9 of MCM6. In Finnish families, −22.018 is in complete linkage disequilibrium with −13.910, and in a sample set of 236 individuals from four populations (Finnish, Italian, German, and Korean), −13.910:T is perfectly associated with LP, whereas all 7 recombinant individuals with genotypes −13.910:C/C and −22.018:G/A are lactase nonpersistent (30), suggesting that −22.018:A is not able to drive LP by itself. Actually, −13.910 and −22.018 might interact epistatically; in vitro studies have shown that the −22.018 region is a weak silencer of the enhancer activity driven by −13.910 (97, 138, 139), and −13.910:T is in very strong linkage disequilibrium with −22.018:A worldwide (31).
As it turns out, −13.910:T is not only a European mutation; it also underlies the LP phenotype all over Asia, including in various populations from Russia, Pakistan, and Iran (31); Central Asia (prevalence of 30% in herders) (55); and India (prevalence of up to 45% in herders from the north) (41). It is also the main LP-associated mutation in Mozabites (27%) (107) and other Berber populations (15–22%) (10, 90) from North Africa, as well as in Fula (37–48%) from both Central Africa (Cameroon and Mali) (60, 80, 107) and East Africa (Sudan) (31). One major haplotype has been found to carry the −13.910:T mutation from Europe to Asia to North Africa, showing that it rapidly spread through gene flow. However, other divergent haplotypes have also been found in a restricted geographical area around the Volga, which could be explained by an independent (convergent) appearance of the −13.910:T allele in this region (31) or by multiple recombination or gene conversion events between haplotypes.
In all of Eurasia, therefore, LP appears to be monogenic. The situation is different in East Africa, where four different mutations have been found to be associated with LP: −13.907:C>G (rs41525747), −13.915:T>G (rs41380347), −14.009:T>G (ss820486563), and −14.010:G>C (rs145946881), all of which cluster in intron 13 of MCM6, within a 100-base-pair interval that also includes −13.910 (60, 62, 134). All of these mutations result in an increase of lactase activity in vitro (62, 66, 134), and two of them, −13.907:G and −13.915:G, seem to affect binding of Oct-1 (29, 96). Their geographical distributions differ: −14.010:C is most prevalent in various Afro-Asiatic and Nilo-Saharan pastoralists (or agropastoralists) from East Africa (32–46%) (134) and in pastoralists from South Africa (13–20%) (16, 136); both −13.907:G and −14.009:G are most prevalent among the Beja people of Sudan (21% and 24%, respectively) (62, 107); and −13.915:G is the most common variant in camel herders from the Middle East (72–88%) (29, 59, 104).
Other, less common mutations have also been proposed to be associated with LP. Notably, −13.913:T>C was found in one Ethiopian (60), one Jordanian (29), and up to 7.5% of some South African populations (16) but was not confirmed as significantly associated with LP in a larger study of Ethiopian individuals (66). Some very low-frequency variants, such as −14.009:C>G and −14.011:C>T, were further shown to influence lactase activity in vitro (78).
Thus, at least five variants clearly underlie the LP phenotype: −13.910:T (combined with −22.018:A) in most of Eurasia, North Africa, and Central Africa; −13.915:G, mostly in the Middle East; and −13.907:G, −14.009:G, and −14.010:C, mostly in East Africa (see Table 1 ). Interestingly, in Ethiopian populations, all of these mutations coexist, resulting in a higher diversity at this locus in LP as compared with LNP individuals (62, 66). The LP phenotype is therefore a beautiful illustration of regulatory convergent evolution, in which nearby variants underlying the same phenotypic change arose in different populations. Such a parallel increase in the frequency of multiple alleles in different geographical areas clearly suggests the action of natural selection.
Table 1
Known lactase persistence (LP)–associated mutations and their main geographical areas of distribution
| LP-associated mutation | Main geographical area of distribution |
| --- | --- |
| −13.910:T (combined with −22.018:A) | Eurasia, North Africa, and Central Africa |
| −13.915:G | Middle East |
| −13.907:G | East Africa (Ethiopia and Sudan) |
| −14.009:G | East Africa (Ethiopia and Sudan) |
| −14.010:C | East Africa (Kenya and Tanzania) and South Africa |
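Because LP is inherited as a dominant trait, the frequency of an LP-associated allele and the frequency of the LP phenotype are related but not identical. The following is a minimal sketch of that relationship. It assumes Hardy-Weinberg proportions and a single causal allele per population, neither of which is guaranteed in the populations discussed here; the function names are illustrative and do not come from the cited studies.

```python
import math


def lp_phenotype_frequency(allele_freq: float) -> float:
    """Expected LP phenotype frequency for a dominant allele at frequency p.

    Assumption (not from the source): Hardy-Weinberg proportions, so only
    q^2 = (1 - p)^2 of individuals carry no copy of the allele.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1.0 - allele_freq
    return 1.0 - q * q


def lp_allele_frequency(phenotype_freq: float) -> float:
    """Inverse calculation: allele frequency implied by an LP phenotype frequency."""
    if not 0.0 <= phenotype_freq <= 1.0:
        raise ValueError("phenotype frequency must be between 0 and 1")
    return 1.0 - math.sqrt(1.0 - phenotype_freq)


if __name__ == "__main__":
    # Illustrative allele frequencies only, not data from any study.
    for p in (0.10, 0.30, 0.50):
        print(f"allele frequency {p:.2f} -> LP phenotype {lp_phenotype_frequency(p):.2f}")
```

Under these assumptions, an allele present on half of all chromosomes would make three quarters of individuals lactase persistent, which is why phenotype frequencies run higher than allele frequencies for the same population.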
Population Genetic Evidence for Natural Selection
Soon after the discovery of the first LP-associated mutation, Bersaglieri et al. (10) showed that haplotypes carrying the −13.910:T variant present typical characteristics of recent and local positive selection. Indeed, in Europe, −13.910:T is located on an unusually long stretch of homozygous markers given its frequency (i.e., a haplotype block of >1 Mb). This observation is unexpected under neutrality, because high-frequency alleles are usually old, and therefore, owing to recombination, they are typically surrounded by short haplotype blocks (94). In addition, the difference in frequency among populations, as measured by the fixation index, is significantly larger than expected under neutrality: The authors calculated it to be 0.53 in 53 worldwide populations, exceeding 99.9% of the values for genome-wide single-nucleotide polymorphisms (10). The authors also estimated the strength and timing of selection from a sample of European Americans and found that the −13.910:T allele arose 2,188–20,650 years before present (BP) and was favored with a selection coefficient of 0.014–0.15. Although these confidence intervals are large, they are consistent with a recent (Neolithic) selection of strong intensity, as confirmed by additional estimates ( Table 2 ).
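To make the fixation index concrete, here is a minimal sketch of one textbook definition for a single biallelic locus: the proportional reduction in expected heterozygosity within populations relative to the pooled total. It weights all populations equally and ignores sampling error, so it is not the estimator used by Bersaglieri et al. (10); the function name and the frequencies are hypothetical.

```python
from typing import Sequence


def fst_biallelic(freqs: Sequence[float]) -> float:
    """Fixation index (H_T - H_S) / H_T for one biallelic locus.

    freqs holds the frequency of the same allele in each population.
    Assumption: populations are weighted equally and frequencies are exact.
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    if any(not 0.0 <= p <= 1.0 for p in freqs):
        raise ValueError("frequencies must be between 0 and 1")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2.0 * p_bar * (1.0 - p_bar)  # heterozygosity of the pooled population
    if h_total == 0.0:
        return 0.0  # allele fixed or absent everywhere: no differentiation
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


if __name__ == "__main__":
    # Hypothetical allele frequencies in three populations.
    print(round(fst_biallelic([0.05, 0.10, 0.75]), 3))
```

A locus whose allele is common in some populations and rare in others yields a high value, which is the pattern described above for −13.910:T.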
Table 2
Estimated selection coefficients of various lactase persistence (LP)–associated mutations in different studies
| Study (reference) | Population | Mutation | Selection coefficient (95% CI) | Timing of selection (95% CI) (BP) |
| --- | --- | --- | --- | --- |
| Bersaglieri et al. 2004 (10) | European American | −13.910:T | 0.014–0.15 | 2,188–20,650 |
| Bersaglieri et al. 2004 (10) | Finnish/Swedish | −13.910:T | 0.09–0.19 | 1,625–3,188 |
| Tishkoff et al. 2007 (134) | Kenya-Nilo-Saharan (lowest) | −14.010:C | 0.035 (0.008–0.080) | 6,925 (2,232–18,496) |
| Tishkoff et al. 2007 (134) | Tanzania-Niger (highest) | −14.010:C | 0.077 (0.026–0.142) | 2,778 (1,219–6,049) |
| Tishkoff et al. 2007 (134) | European American | −13.910:T | 0.039 (0.012–0.107) | 9,323 (2,231–19,228) |
| Enattah et al. 2008 (29) | Saudi Arabian | −13.915:G | 0.051 (0.034–0.101) | 4,075 (2,050–6,100) |
| Enattah et al. 2008 (29) | European American | −13.910:T | 0.048 (0.044–0.055) | 5,575 (4,950–6,200) |
| Enattah et al. 2008 (29) | Western Finnish | −13.910:T | 0.043 (0.039–0.049) | 5,200 (4,625–5,775) |
| Itan et al. 2009 (64) | European | −13.910:T | 0.095 (0.052–0.159) | 7,441 (6,256–8,683) |
| Gerbault et al. 2009 (44) | European | −13.910:T | 0.012 (0.008–0.018) | Not estimated |
| Peter et al. 2012 (101) | Finnish | −13.910:T | 0.025 (0.004–0.20) | 11,200 (1,500–64,900) |
All coefficients assume a dominant model for genotype-phenotype association. Gerbault et al. (44) did not estimate the timing of selection, because the dates were taken from archeological data as a parameter of the model, and those vary between 7,000 and 8,000 BP. Abbreviations: BP, years before present; CI, confidence interval.
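To give a sense of what selection coefficients of this size imply, the sketch below iterates the standard deterministic recursion for a dominant advantageous allele, with fitnesses 1 + s for carriers and 1 for non-carriers. It assumes an infinite, randomly mating population with constant s and no drift or migration, and it is not the inference method of any study in Table 2. The starting frequency, target frequency, and 25-year generation time are hypothetical values chosen for illustration.

```python
def next_allele_freq(p: float, s: float) -> float:
    """One generation of selection on a dominant allele at frequency p.

    Fitnesses: carriers (two copies or one) 1 + s, non-carriers 1.
    Mean fitness is 1 + s * (1 - q^2), with q = 1 - p.
    """
    q = 1.0 - p
    mean_fitness = 1.0 + s * (1.0 - q * q)
    return p * (1.0 + s) / mean_fitness


def generations_to_reach(p_start: float, p_target: float, s: float,
                         max_generations: int = 1_000_000) -> int:
    """Count generations needed for the allele to rise from p_start to p_target."""
    if not 0.0 < p_start < p_target < 1.0:
        raise ValueError("need 0 < p_start < p_target < 1")
    if s <= 0.0:
        raise ValueError("s must be positive for the allele to increase")
    p, generations = p_start, 0
    while p < p_target:
        p = next_allele_freq(p, s)
        generations += 1
        if generations >= max_generations:
            raise RuntimeError("target not reached within max_generations")
    return generations


if __name__ == "__main__":
    GENERATION_YEARS = 25  # assumed generation time, for illustration only
    for s in (0.01, 0.05, 0.10):
        g = generations_to_reach(p_start=0.01, p_target=0.50, s=s)
        print(f"s = {s:.2f}: {g} generations (about {g * GENERATION_YEARS} years)")
```

Running the loop for several values of s shows how strongly the time needed to reach a given frequency depends on the coefficient, which is why the estimates of strength and timing in Table 2 are tightly coupled.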
Similar signatures of positive selection have been found in East Africans, with average homozygous tracts of approximately 1.8 Mb, 1.4 Mb, and 1.1 Mb for −14.010:C, −13.907:G, and −13.915:G, respectively (134). The iHS scores of each variant, a statistic that reflects the haplotype structure (141), were shown to be highly unusual when compared with an empirical distribution of the rest of the genome. The authors further estimated that −14.010:C has been under selection since between 2,778 and 6,925 BP (95% confidence intervals: 1,219–6,049 and 2,232–18,496 BP, respectively), depending on the population ( Table 2 ). They in turn calculated the selection coefficients to be 0.077 and 0.035 (95% confidence intervals: 0.026–0.142 and 0.008–0.080, respectively). A comparison with the numbers obtained in Europeans with the same method suggests that the selective advantage of LP is similar in Europe and Africa (but see 119) and that the timing of selection might be a bit older in Europe, which is consistent with archeological records of pastoralism on both continents (see sidebar titled Dairying Animal Domestication in the Old World). Finally, Enattah et al. (29) also found that the haplotype structure in the Middle East deviates significantly from neutrality, with an estimated date of selection of 4,075 BP (95% confidence interval: 2,050–6,100 BP) and a selection coefficient of 0.051 (95% confidence interval: 0.034–0.101), similar to estimates obtained with the same method in Europeans ( Table 2 ). The genetic signatures of selection discussed above are expected under a hard-sweep model, in which only one allele underlies the selected trait and the allele was favored after its introduction into the population (130). However, it might be harder to infer the strength of selection in populations such as Ethiopians or Sudanese, where multiple alleles coexist, potentially resulting in a radically different signature of a soft sweep (62, 66, 107).
In any case, it seems that the selection of LP occurred recently and concomitantly in different continents soon after the beginning of cattle and camel domestication (see sidebar titled Dairying Animal Domestication in the Old World).
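The haplotype-based signatures discussed in this section rest on the idea that chromosomes carrying a recently selected allele remain identical over long distances. A minimal sketch of extended haplotype homozygosity (EHH), the quantity that statistics such as iHS build on, is given below. The haplotypes are toy strings invented for illustration, the function name is hypothetical, and real analyses use phased genome-wide data and dedicated software rather than this simplified calculation.

```python
from itertools import combinations
from typing import Sequence


def ehh(haplotypes: Sequence[str], core_index: int, core_allele: str,
        end_index: int) -> float:
    """Extended haplotype homozygosity between a core site and another marker.

    Returns the fraction of pairs of chromosomes carrying core_allele that
    are identical at every marker from core_index to end_index inclusive.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    lo, hi = sorted((core_index, end_index))
    pairs = list(combinations(carriers, 2))
    identical = sum(1 for a, b in pairs if a[lo:hi + 1] == b[lo:hi + 1])
    return identical / len(pairs)


if __name__ == "__main__":
    # Toy phased haplotypes; position 0 is the core site, "1" the derived allele.
    toy = ["1111", "1111", "1110", "0101", "0010"]
    for end in range(1, 4):
        print(end, ehh(toy, core_index=0, core_allele="1", end_index=end))
```

In this toy data set, carriers of the derived allele stay identical further from the core site than carriers of the other allele, which is the qualitative pattern expected around a recently favored variant.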
| DAIRYING ANIMAL DOMESTICATION IN THE OLD WORLD The first evidence of dairying animal domestication comes from Anatolia, where goats, sheep, and cattle were domesticated around 10,500 years before present (BP) (82, 140). These species then spread to Europe around 9,000 BP (140) and to Africa around 7,000 BP (8), following the migration of human farmers. In the Indus Valley, the zebu was domesticated around 8,000 BP, followed by the dairy buffalo around 4,500 BP (82, 140). The yak was probably domesticated in Tibet around 4,500 BP (105). Ungulates have also been domesticated for dairying, including the donkey (111) in Arabia or East Africa around 6,000 BP, the camel in Central Asia around 5,000 BP, and the dromedary in Arabia around 3,000 BP (98). Evidence for horse domestication has been more difficult to obtain owing to the high morphological similarity between wild and domesticated horses (148), but remains in Kazakhstan show that horses were harnessed around 5,500 BP (99). The domestication of the reindeer seems to date only from 2,500 BP at the earliest, and its domestication is ongoing (109). |
How strong are these selection coefficients? For comparison, other studies of strongly favored loci in humans have estimated selection coefficients to be (a) between 0.04 and 0.09 for genes associated with resistance to malaria (53, 135), (b) approximately 0.03 for genes associated with skin pigmentation in Europeans (145), (c) between 0.002 and 0.029 for genes associated with hypoxia response to high altitude in Tibetan populations (6), (d) 0.036 for the alcohol dehydrogenase 1B (ADH1B) gene associated with alcohol metabolism in East Asians (101), and (e) 0.14 for a signal in the ectodysplasin A receptor (EDAR) gene, which is involved in the development of hair follicles and associated with an increase in eccrine sweat glands in East Asians (101). Therefore, the signal around LCT represents one of the strongest examples of positive selection in the human genome.
THE NATURE OF THE SELECTIVE PRESSURES
Given the evidence of positive selection on the LP phenotype, a natural question is, What drove selection to favor similar regulatory shifts in multiple populations worldwide? As was noticed early on, there is a clear correlation between the milk-drinking habits of populations and their LP frequency, leading to the cultural-historical hypothesis (86, 87, 123, 124). However, the directionality is unclear: Did human populations first start drinking milk in the absence of LP-associated mutations, or were milk-drinking practices favored in populations that already had, in low frequency, LP-associated mutations? If we consider the mutational target of LP-associated mutations to be the Oct-1 binding site, which is 13 base pairs long (60), and use a theta of 0.1% and a generation time of 30 years, the waiting time for a new mutation to arise (let alone reach a substantial frequency) is approximately 7,000 years. This seems too long given that the domestication of dairying animals is no more than 10,000 years old and that two mutations are observed at high frequencies in the Oct-1 binding site: −13.910:T and −13.915:G. One possibility is that the effective population size of humans was much larger in the recent past, decreasing the waiting time for a new mutation. Alternatively, the mutational target might be larger than 13 base pairs; although LP-associated mutations are highly clustered (in a 100-base-pair region), −13.907:G, which is outside the predicted binding site, affects Oct-1 binding (60). Finally, some LP-associated mutations might have already been present at low frequency before animal domestication.
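The dependence of this waiting time on target size and population size can be made explicit with a back-of-the-envelope sketch. The text does not state the formula behind the figure of approximately 7,000 years, so the version below rests on assumptions: it treats theta as 4·Ne·μ per base pair, takes the population-wide rate of new mutations in a target of L base pairs as theta·L/2 per generation, and uses the inverse of that rate as the expected wait. The function name and the `useful_fraction` parameter (the share of substitutions in the target that would actually confer LP) are hypothetical.

```python
def waiting_time_years(theta_per_bp=0.001, target_bp=13,
                       generation_years=30, useful_fraction=1.0):
    """Expected wait, in years, for the first new mutation in the target.

    Assumption (not from the source): theta = 4*Ne*mu per base pair, so
    2*Ne*mu*L = theta*L/2 new mutations enter the population per generation.
    """
    if min(theta_per_bp, target_bp, generation_years, useful_fraction) <= 0:
        raise ValueError("all parameters must be positive")
    rate_per_generation = theta_per_bp * target_bp / 2 * useful_fraction
    return generation_years / rate_per_generation


# Compare the scenarios raised in the text: a larger target, a larger Ne.
for label, kwargs in [
    ("13-bp Oct-1 site", {}),
    ("100-bp cluster", {"target_bp": 100}),
    ("13-bp site, Ne doubled", {"theta_per_bp": 0.002}),
]:
    print(f"{label}: {waiting_time_years(**kwargs):,.0f} years")
```

Under these assumptions the 13-base-pair target gives a wait of a few thousand years, the same order of magnitude as the figure quoted above. The exact value shifts with the convention used and with `useful_fraction`, but the qualitative point holds: the wait scales inversely with both the target size and the effective population size.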
More generally, the validity of the cultural-historical hypothesis has often been questioned (28, 37, 43, 50), and in particular it is unclear whether and (if so) why the ability to consume milk as an adult created such a differential fitness between LP and LNP individuals. The first possibility is that all individuals had an advantage in consuming milk because it is a rich source of proteins, fat, minerals, and vitamins, but only LP individuals could benefit from these without symptoms. There is indeed a cost of drinking milk for LNP individuals, with diarrhea being the most likely cause of a selective disadvantage. However, lactose intolerance symptoms have been investigated by giving 50 g of lactose (equivalent to 1 L of milk) to fasting individuals, whereas most LNP individuals seem to tolerate moderate amounts of lactose (such as one glass of milk, or up to 15 g of lactose) without any symptoms, especially if they slow down the transit time by consuming other foods (120, 122, 131). Furthermore, the occurrence of diarrhea and the severity of intestinal symptoms depend on the fermentation ability of each individual, which varies widely (51, 52, 149) and depends mostly on the composition of their colonic microbiota (the sum of all microorganisms present in the colon) (4, 81). For example, if colonic bacteria are efficient at fermenting lactose, then the osmotic shock (which leads to diarrhea) will be reduced, but the quantity of gases might increase. In parallel, if methanogenic archaebacteria are prevalent, then carbon dioxide will be transformed mostly into methane, leading to constipation rather than diarrhea, as observed in 30% of individuals (19, 81). In addition, consumption of dairy products by LNP individuals can influence the colonic microbiota composition and lead to a reduction of intestinal symptoms (54, 133). Therefore, it seems that LNP individuals can consume milk in small amounts spread throughout the day, especially if it is taken together with other foods.
LNP individuals can also benefit from proteins and fat if they eat derived dairy products that are low in lactose (35, 81). Notably, milk can be fermented to produce yogurt or various fermented beverages, in which lactose is partially transformed by bacteria and/or yeasts, or it can be processed to obtain cheese, cream, and butter, in which lactose is almost entirely physically eliminated after protein coagulation. Finally, LNP individuals seem to be able to benefit from some amount of glucose when ingesting fermented products such as yogurts, as the lactic acid bacteria present in these products release bacterial lactase, and therefore a certain amount of lactose can be hydrolyzed during intestinal digestion (70), though it is not clear exactly how much this represents.
What is clear, however, is that LNP individuals cannot derive large amounts of glucose from any dairy products, as lactose, representing about 30% of the calories in human milk (17), is the sole sugar in milk. Flatz & Rotthauwe (37), however, argued that carbohydrates were not scarce in human societies, and they therefore believed that the selective pressure could not be caused by an “unspecific” caloric intake. Although some have proposed that this energetic advantage of milk was most strongly selected for during famines and drought (28, 43), this is not trivially the case, as milk production itself is affected by a scarcity of food for livestock and is therefore minimal in times of paucity (9, 13, 34). Fresh milk therefore might not constitute a realistic alternative in times of food shortages.
Because of these arguments, and because of the imperfect correlation between the degree of pastoralism and LP frequency (with LP being disproportionately present in Europeans), some researchers have proposed alternative explanations for the selective advantage of LP. Notably, Flatz & Rotthauwe (38) proposed the calcium assimilation hypothesis, in which milk would be an important source of calcium, vitamin D, and lactose, the latter two of which facilitate calcium absorption in humans. This property would be especially advantageous in farmers from high latitudes, such as Europeans, who have a low dietary supply of vitamin D (owing to their shift to cereals) and experience low levels of UV irradiation (which stimulates the production of vitamin D). However, recent studies have challenged the view that lactose facilitates calcium absorption in humans, proposing instead that LNP individuals could benefit from more calcium when eating derived dairy products than LP individuals do from consuming fresh milk (73).
Other authors proposed that milk might have represented a precious source of uncontaminated water and electrolytes, especially in populations inhabiting dry arid environments and facing, for example, cholera, such as some African and Middle Eastern pastoralists (21, 22). This hypothesis and the calcium assimilation one explain LP only in specific parts of the world, so a combination of them is required to explain the overall pattern. To disentangle these hypotheses and gain some understanding of the selective pressures responsible for the physiological differences among human populations, Holden & Mace (58) performed a joint analysis of LP phenotypic data in 62 populations together with data on dependence on pastoralism, levels of UV irradiation, and intensity of drought, controlling for the phylogenetic relationship between populations. Their conclusion was that the data are concordant only with the cultural-historical hypothesis, because LP frequencies significantly correlate with pastoralism but not with the two other factors.
A more recent study by Itan et al. (64) simulated selection in combination with underlying demographic processes in Europe and reached the same conclusion, finding no evidence that selection is stronger in high latitudes than in lower ones. The authors further estimated the origin of selection to be between central Europe and the northern Balkans 7,500 years ago. By contrast, Gerbault et al. (44) concluded that the most likely scenario in Europe is one of demic diffusion of farming together with an advantage of calcium assimilation in high latitudes, while favoring the cultural-historical hypothesis in Africa. In summary, it appears that LP frequencies in Africa are generally consistent with the cultural-historical hypothesis, whereas in Europe, results on the respective influences of pastoralism, latitude, and demography remain conflicting. Beyond studies of human genetic data, the geographical concordance between human LP frequencies and the genetic diversity of six milk genes in European cattle breeds (7) strongly supports the idea that pastoralism played a primary role in the increase in LP frequency in Europe.
And what about Asia, which is home to the largest populations of herders, the steppe populations? To address this question, we reanalyzed the data from Holden & Mace (58), analyzing separately populations from Africa, Europe, and Asia (a subset of 54 out of 62 total populations) and adding 14 populations for which LP frequencies were available in the Global Lactase Persistence Association Database (GLAD; http://www.ucl.ac.uk/mace-lab/resources/glad) (63) and for which we were able to find information on the proportions of pastoralism in Murdock's “Ethnographic Atlas: A Summary” (89). As expected, we found a highly significant correlation between LP and pastoralism in the worldwide data set (68 populations, Pearson coefficient = 0.47, p = 5 × 10⁻⁵) and within Africa (40 populations, Pearson coefficient = 0.57, p = 1 × 10⁻⁴). However, the correlation was not significant within Europe, Asia, or Eurasia as a whole (10, 18, and 28 populations, respectively; Pearson coefficient = 0.18, 0.01, and 0.10, respectively; p > 0.61) (see Figure 3). Even though we did not take into account the phylogenetic relationship between populations, as was done in the original paper (58), our findings suggest that the observed worldwide correlation could be driven mostly by African populations. Furthermore, we can see that Europeans have a high LP frequency despite moderate ancestral levels of pastoralism (because they are traditionally agropastoral populations that derive a considerable amount of energy from domesticated plants), whereas Asians all have low LP frequencies even though some populations have relied heavily on pastoralism for millennia. Why, then, are these Asian herders lactase nonpersistent?
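To see how significance values of this kind follow from a correlation coefficient and a sample size, the sketch below converts Pearson's r into a two-sided p-value using the usual statistic t = r·sqrt((n − 2)/(1 − r²)) on n − 2 degrees of freedom. It assumes independent populations (that is, no phylogenetic correction), and the function names are illustrative rather than taken from the original analysis. The Student t density is integrated numerically with Simpson's rule so that only the standard library is needed.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two equal-length samples of at least 3 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def pearson_p_value(r, n, steps=20000):
    """Two-sided p-value for H0: no correlation, given r and sample size n."""
    if n < 3:
        raise ValueError("n must be at least 3")
    if abs(r) >= 1.0:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1.0 - r * r))
    # Simpson's rule on [0, t]; steps must be even.
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_mass)


# Coefficients and sample sizes as reported in the text.
for r, n in [(0.47, 68), (0.57, 40), (0.18, 10), (0.01, 18), (0.10, 28)]:
    print(f"r = {r:.2f}, n = {n}: p = {pearson_p_value(r, n):.2g}")
```

With the coefficients and sample sizes quoted above, this calculation gives p-values on the order of 10⁻⁵ to 10⁻⁴ for the worldwide and African sets and values above 0.6 for the European, Asian, and Eurasian subsets.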
Figure 3 Correlation between lactase persistence (LP) phenotypic frequency and the proportion of pastoralism in populations from the Old World (top), Africa (bottom left), and Eurasia (bottom right). Most of the data are from Holden & Mace (58); to their data, we added 14 populations for which LP frequencies were available in the Global Lactase Persistence Association Database (GLAD; http://www.ucl.ac.uk/mace-lab/resources/glad ), originally published by Itan et al. (63) and updated in 2013, and for which we could find information on the proportions of pastoralism in Murdock's “Ethnographic Atlas: A Summary” (89) (Somali, !Kung, Wolof, Diola, Serere, Bisharin, Sandawe, Iraqw, Maasai, Kazakh, Burmese, Kashmiri, and Finnish populations as well as Arabs from Israel). We removed Holden & Mace's data from America and Oceania (8 populations) to focus on Africa, Europe, and Asia.
PUZZLING OBSERVATIONS
Non-Lactase-Persistent Herders
Under the cultural-historical hypothesis, populations with higher levels of milk dependence (notably nomadic herders) should have higher LP frequencies, provided that they did not shift to pastoralism recently and did not substantially admix with nonpastoralist populations. Yet in long-term herders from Asia, such as Mongols and Kazakhs, the observed frequency is quite low. Indeed, LP frequencies are estimated to be 12% in Mongols (147) and 24–30% in Kazakhs (55, 147). In other herders from Central Asia, where the LP phenotype is well correlated with the presence of −13.910:T (55), LP frequency can be estimated from genetic data. We collected such data for ten additional Central Asian herder populations (two Karakalpak, five Kyrgyz, one Kazakh, one Turkmen, and one Uzbek, totaling 301 individuals) and found LP frequencies to lie between 3% and 26%, with an average of 14% (L. Ségurel & E. Heyer, unpublished data). Similarly, in Tibetan populations, all five previously LP-associated mutations are absent, suggesting either a very low frequency of LP or an independent genetic basis for it (100). Therefore, in Asia (outside of northern India and Pakistan), the emerging picture is one of very low LP frequencies, whether populations engaged in a pastoralist lifestyle or not.
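Estimating LP frequency from genotype data amounts to a short calculation, sketched here under assumptions that are not spelled out above: LP behaves as a dominant trait, −13.910:T is the only LP-associated allele in the population, and genotypes are in Hardy-Weinberg proportions. The function names are illustrative.

```python
import math


def lp_frequency_from_allele(p_t):
    """Expected LP phenotype frequency given the -13.910:T allele frequency.

    Under dominance, only C/C homozygotes are non-persistent, and under
    Hardy-Weinberg proportions their frequency is (1 - p_t) ** 2.
    """
    if not 0.0 <= p_t <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 1.0 - (1.0 - p_t) ** 2


def allele_from_lp_frequency(lp):
    """Inverse conversion: allele frequency implied by an LP frequency."""
    if not 0.0 <= lp <= 1.0:
        raise ValueError("phenotype frequency must lie in [0, 1]")
    return 1.0 - math.sqrt(1.0 - lp)


for p in (0.05, 0.10, 0.50):
    print(f"allele frequency {p:.2f} -> LP frequency {lp_frequency_from_allele(p):.4f}")
```

For instance, an allele frequency of 50% corresponds to an LP frequency of 75% under these assumptions. Where several LP-associated alleles coexist, as in parts of East Africa, a single-allele calculation of this kind would underestimate LP.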
Central Asian herders are not the only exception to the expected pattern. Indeed, the Sami reindeer herders in Scandinavia have a lower LP frequency than the rest of the Swedish population (40–75% versus 91%) (110) despite a higher dependence on pastoralism (60% versus 30%). Similarly, some African pastoral ethnic groups who consume milk (50–90% pastoralism) have low LP frequencies, as in the case of the Dinka (LP frequency of 22%) and Nuer (25%) in Sudan (5), the Somali in Ethiopia (24%) (62), and the Herero in South Africa (3%) (58). Finally, in the area where animals were first domesticated (notably in Turkey) and more generally around the Mediterranean, populations that have used milk for millennia (see sidebar titled First Evidence of Milk Processing) have moderate LP frequencies.
| FIRST EVIDENCE OF MILK PROCESSING The slaughtering-age profiles of early Neolithic cattle (140) and the presence of residual milk proteins and lipids on ceramics as old as 9,000 years before present (BP) (32) revealed that dairying was practiced early on by the first farmers in Anatolia. This suggests that milk was used for human benefit soon after domestication. In Europe, a sieve from around 8,000 BP carries evidence of cheese processing (116), and β-lactoglobulin has been found in calculus, showing that milk or lactose-rich whey was drunk around 5,000 BP (144). In Asia, use of mare milk has been shown in Kazakhstan around 5,500 BP (99). In Africa, analyses of fatty acids in potsherds in Libya revealed the use of milk as early as around 7,000 BP (27). Therefore, populations knew how to make derived dairy products from milk soon after animal domestication. |
How can we explain these discrepancies? First, there could be important differences between pastoral groups in the time since domestication; the first dairying domesticated animals appeared 10,500 BP in the Middle East, whereas the latest ones seem to be the reindeer, 2,500 BP (see sidebar titled Dairying Animal Domestication in the Old World). Milk from different species also contains different amounts of lactose: 100 g of milk from mares, donkeys, and humans contains 6.4–6.9 g of lactose (the highest concentration); from cows, buffalo, yaks, goats, sheep, and camels contains 4.2–5.1 g of lactose; and from reindeer and moose contains 2.6–2.9 g of lactose (the lowest concentration) (35). The low lactose content in reindeer milk, combined with their more recent domestication, could explain the moderate LP frequency in the Sami people. However, this explanation is not likely for Central Asian and Mongolian herders, who have been consuming high-lactose mare milk for at least 5,500 years (99).
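To relate these concentrations to the doses discussed for lactose tolerance, the sketch below converts a volume of milk into grams of lactose. The ranges are those quoted above from Reference 35; treating 1 mL of milk as roughly 1 g is a simplifying assumption (milk is slightly denser than water), and the dictionary keys and function name are illustrative.

```python
# Lactose content in g per 100 g of milk, as quoted in the text (35).
LACTOSE_G_PER_100G = {
    "mare, donkey, human": (6.4, 6.9),
    "cow, buffalo, yak, goat, sheep, camel": (4.2, 5.1),
    "reindeer, moose": (2.6, 2.9),
}


def lactose_grams(volume_ml, group, g_per_ml=1.0):
    """Return the (low, high) lactose content in grams for a volume of milk.

    g_per_ml is an assumed density; 1.0 slightly understates real milk.
    """
    low, high = LACTOSE_G_PER_100G[group]
    mass_g = volume_ml * g_per_ml
    return (mass_g * low / 100.0, mass_g * high / 100.0)


for group in LACTOSE_G_PER_100G:
    low, high = lactose_grams(250, group)
    print(f"250 mL, {group}: {low:.1f}-{high:.1f} g lactose")
```

Under these assumptions, a 250 mL glass carries roughly 10.5–13 g of lactose for cow-type milk, about 16–17 g for mare milk, and about 6.5–7 g for reindeer milk.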
Ingram et al. (62) observed that part of a Somali population from East Africa did not present a baseline release of hydrogen and proposed that these individuals might have a colonic adaptation that reduces the symptoms associated with milk consumption. However, it is unclear whether these individuals are lactase persistent or nonpersistent, as neither the increase of glucose nor the presence of symptoms was investigated. In any case, although these individuals might have reduced symptoms, they are still not able to derive glucose from milk.
Importantly, there could have been major differences in dietary availability among populations during the Neolithic, with European agropastoral populations facing harsh times of food shortage during their transition to agriculture, whereas steppe populations would have undergone a smoother transition from hunter-gathering to herding. It has indeed been proposed that early farmers had much less balanced diets than hunter-gatherers, with many food deficiencies and a shortage of proteins (103, 108). Alternatively, it could be that some pastoral populations are admixing too often or too strongly with neighboring nonpastoral populations, and therefore are unable to maintain a strong signal of local adaptation. This could be the case for Central Asia, which is a migratory crossroads and lies in the middle of nonpastoral groups.
Another important factor is the transformation of milk into derived dairy products. Durham (28) proposed that, although domestication of dairying animals is a prerequisite for LP to evolve, it is expected to be high not in all pastoral populations, but rather only in those where milk is not entirely processed into low-lactose products and/or those under high dietary stress. He further showed that LP indeed correlates better with the amount of fresh milk consumed than with the proportion of pastoralism (28), even though such data are not reliably available for all populations. Notably, this explanation might account for the intermediate LP frequency in populations around the Mediterranean and north of the Middle East (39%), where dairying animals were first domesticated, given that these populations consume moderate amounts of fresh milk (102 L per person per year) and transform a large proportion of milk into cheese (38% on average) (28). Northern and central Europeans, by contrast, consume much more fresh milk (489 L per person per year), transform a lower proportion into cheese (18% on average), and have a very high LP frequency (91%). If so, given that all populations had access early on to milk-processing techniques (see sidebar titled First Evidence of Milk Processing), the question becomes, Why have they not avoided the selective pressure for LP by a more rapid cultural adaptation, i.e., milk transformation into secondary products? Indeed, why would a population continue to consume fresh milk despite having intestinal symptoms for thousands of years (the time it would take for the LP-associated mutations to increase substantially enough that an appreciable fraction of the population could enjoy the nutritional benefit of lactose) when the cultural adaptation of milk processing would allow them to benefit from proteins, lipids, vitamins, and minerals at a lower cost?
Cultural differences in terms of preferences or beliefs that influence the practices around milk could have been important, as was nicely described for South Asian populations by Simoons (125). Climatic differences could also influence the need or ability to preserve dairy products outside of fresh milk. (The primary function of fermentation, for example, is to increase the conservation time of dairy products.) A strong seasonality of milk availability would also increase the need to process milk in order to store dairy products for the winter. Furthermore, differences in mobility could influence consumption practices, given that a highly nomadic lifestyle likely favors the transportation of processed dairy products over a large quantity of fresh milk. Finally, differences in carbohydrate availability in the rest of the diet might matter, with some populations experiencing a stronger pressure to increase their protein and fat intake, whereas others place more importance on carbohydrates. Therefore, differences in cultural, ecological, nutritional, and environmental factors likely contribute to the quantities and relative amounts of fresh and processed milk consumed in different human populations (12).
Ancient DNA
Another puzzling observation that added some confusion about when and where LP started to increase in frequency is the late appearance of the −13.910:T LP-associated allele in ancient DNA from Europe (2, 18, 42, 71, 74, 75, 85, 91, 102). Ancient DNA studies are especially helpful because they can provide a direct temporal snapshot of the evolution of the LP phenotype. Furthermore, they provide information about where the selective constraints started, before further migration and admixture events occurred. Because of degradation, access to ancient DNA from before 10,000 BP and DNA conserved in warm areas is challenging, but fortunately, data from Europe from the Neolithic and more recent times are expected to yield substantial information, and indeed such data have started to accumulate about the −13.910:T mutation.
As expected, analysis of hunter-gatherers from the Paleolithic and Mesolithic revealed that no individual carried the LP allele before 5,000 BP (39, 40, 42, 67, 121) ( Figure 4 ), except one out of eight sequencing reads covering the −13.910 polymorphism in a Spanish individual from the Mesolithic (95), which has been interpreted as caused by a cytosine deamination, a typical damage in ancient DNA. The LP allele was therefore not present as a standing variation (at least at appreciable frequency) in pre-Neolithic European hunter-gatherer populations.
Figure 4 Evolution of lactase persistence (LP) in Europe over the last 10,000 years. The figure shows the theoretical expectations of the trajectory of an allele under selection for various selection coefficients (S) with a final allele frequency of 50% (light purple lines) or 60% (dark purple lines) superposed on −13.910:T allele frequencies observed in ancient DNA data sets (colored squares). The allele frequencies of 50% and 60% correspond to LP frequencies of 75% and 80%, respectively, as observed in modern populations from central Europe. The sizes of the colored squares are proportional to the number of samples (from 1 to 35), and the colors indicate the area of origin of the human remains. We used a generation time of 30 years to obtain dates in years. The frequencies are taken from References 2, 18, 39, 40, 42, 57, 67, 71, 74, 75, 83, 84, 91, 102, and 146. The value for Reference 85 was not numerically available and is therefore not included here.
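Theoretical trajectories of this kind can be approximated with a simple deterministic model. The sketch below is not the procedure used for the figure, which is not described here. It assumes genic selection in a large population with weak selection, under which the logit of the allele frequency rises by about s per generation, and it traces the frequency backward from a chosen present-day value. Function and parameter names are illustrative.

```python
import math


def allele_frequency_in_past(years_ago, s, final_freq=0.5, generation_years=30):
    """Allele frequency `years_ago` years before it reached `final_freq`.

    Assumed model: logistic growth, logit(p) increasing by s per generation.
    """
    if not 0.0 < final_freq < 1.0:
        raise ValueError("final_freq must lie strictly between 0 and 1")
    generations = years_ago / generation_years
    logit_now = math.log(final_freq / (1.0 - final_freq))
    logit_then = logit_now - s * generations
    return 1.0 / (1.0 + math.exp(-logit_then))


# Trajectories for a few selection coefficients, ending at 50% today.
for s in (0.01, 0.03, 0.05):
    row = [allele_frequency_in_past(t, s) for t in (8000, 6000, 4000, 2000, 0)]
    print(f"s = {s:.2f}: " + ", ".join(f"{p:.4f}" for p in row))
```

In this model, larger selection coefficients imply a later and steeper rise toward the same present-day frequency, so the allele spends a long time at frequencies too low to be detected in small samples.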
More surprisingly, no LP allele was found in 69 ancient Europeans dating back to the early or middle Neolithic (approximately 8,500–5,000 BP), whether from the Epicardial culture in southern Europe (75) or from the Linearbandkeramik (LBK)–associated cultures in central Europe (42, 146) (Figure 4). Both of these cultures are thought to have emerged from demographic migrations of the agropastoralist populations from Anatolia (42, 127, 128), who were the first to consume milk (see sidebar titled First Evidence of Milk Processing). Given that the LBKs emerged in the same area (the Danube basin) and at the same time (approximately 6,000 BP) (8) that were inferred for selection of LP (Table 1), and that these populations practiced dairying, the appearance of LP in Europe is often associated with the LBK complex (64).
LP was actually found for the first time in Europe in the late Neolithic (approximately 5,000–4,300 BP) (Figure 4). It was found (a) surprisingly, in one out of ten hunter-gatherers (−13.910:T frequency of 5%) from the Pitted Ware culture in Sweden (83); (b) in two burial sites from the Corded Ware culture in northern Spain (−13.910:T frequency of 14% and 26%) (102); and (c) in central Europeans (−13.910:T frequency of approximately 5%), with the oldest occurrence of LP being in a 4,350-year-old individual from the Bell-Beaker culture, which is strongly associated with the Corded Ware culture (85). Allentoft et al. (2) further estimated that the LP allele frequency was 7% during the Bronze Age in central Europe (i.e., in descendants from the Corded Ware culture). The first observation of LP in Europe is therefore concomitant with an important demographic and cultural event: the massive migration into Europe of eastern steppe populations related to the Yamnaya culture (2, 49), a nomadic steppe culture heavily reliant on cattle herding and sporadic agriculture (69), which subsequently admixed with European populations, resulting in the development of the Corded Ware culture in central Europe. Interestingly, this association between LP and steppe ancestry was also supported by a data set of 13 ancient Hungarians from the Iron Age (42): The individual with the LP allele was the only one with a high proportion of genetic ancestry from steppe populations. The absence of LP before the late Neolithic and the correlation between its appearance and migration from the steppes have therefore led to the alternative hypothesis of LP first arising in a pastoralist steppe population and then being brought to western Europe at the beginning of the Corded Ware culture (2).
However, these steppe populations have been estimated to display a very small amount of LP (−13.910:T frequency of 0% and approximately 6% in two studies from the Bronze Age) (2, 85), challenging the idea that they are the source populations for LP. When imputing the −13.910:T frequency based on surrounding markers, Allentoft et al. (2) reported a high LP allele frequency (approximately 20%) in ancient steppe populations. However, the major haplotype currently carrying the −13.910:T allele is also found carrying the −13.910:C (LNP) allele both in high frequency in modern Eurasian populations (31) and in one early Neolithic individual (the “Stuttgart” individual) (68). Therefore, the −13.910 genotype cannot be reliably imputed from surrounding sequences, whether from modern or ancient data.
In summary, the LP allele has not been found anywhere before 5,000 BP; its frequency then reached 5–26% (depending on the study) in late-Neolithic Europeans associated with the Corded Ware culture, 7% during the Bronze Age, and 19% during the Iron Age (in the Hallstatt culture in Poland) (146) (Figure 4). The allele frequency then clearly increased, reaching 36% and 53% at two Roman sites in Poland (11 and 20 individuals, respectively) (146).
In the Middle Ages, 80 individuals from four contemporaneous archaeological sites from the same area in Poland had heterogeneous LP allele frequencies (20–64%) (146). This heterogeneity was also found in Germany and Hungary, where medieval populations had an LP allele frequency of 50% (71) and 11% (91), respectively. In the latter case, two different cultural backgrounds were differentiated: the “commoners,” a third of whom displayed the LP phenotype, and the “conquerors,” who belonged to pastoralist nomad tribes who invaded Hungary in 895 AD, in whom no LP allele was found. The high variability of LP allele frequency may be due either to a strong stratification of protohistoric and historic populations or to noise, given the small number of individuals who have been analyzed from each population.
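The point about noise in small samples can be made concrete with a rough confidence interval on each allele frequency. The Python sketch below applies the Wilson score interval to allele counts back-calculated from the frequencies and sample sizes reported in this section (for example, 36% in 11 individuals is taken as 8 of 22 chromosomes). These counts are assumptions, since the original studies may have typed fewer chromosomes than twice the number of individuals, and treating chromosomes as independent draws ignores relatedness within a site.

```python
import math

def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a proportion k/n (z=1.96 gives ~95%)."""
    p = k / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

# (label, LP allele count, chromosomes typed): counts are ASSUMED,
# back-calculated as round(reported frequency * 2 * individuals).
samples = [
    ("Late Neolithic, northern Spain, 7 individuals", 2, 14),
    ("Late Neolithic, northern Spain, 19 individuals", 10, 38),
    ("Roman site, Poland, 11 individuals", 8, 22),
    ("Roman site, Poland, 20 individuals", 21, 40),
]

for label, k, n in samples:
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k / n:.0%} (approx. 95% interval {lo:.0%} to {hi:.0%})")
```

Under these assumptions the intervals for the two Roman sites overlap, so the difference between 36% and 53% would not by itself establish heterogeneity between them.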
To compare the observed ancient DNA data with theoretical expectations, we calculated the expected trajectory of a selected allele under a dominant model with a selection coefficient of 0.03, 0.04, or 0.05, reaching a modern allele frequency of 50% or 60% (corresponding to an LP frequency of 75% and 80%, respectively, as observed in central Europe), and assuming a constant population size and a constant strength of selection. As shown in Figure 4, except for the case of the 0.03 selection coefficient (which is lower than most estimated values based on modern data; Table 2), we do not expect to observe a substantial LP allele frequency before 3,000 BP. The theoretical allele trajectory is therefore broadly compatible with the observed ancient DNA data, as was also concluded by a recent study (92). The main puzzle actually runs in the opposite direction from the one previously highlighted: The estimated LP frequency for late-Neolithic European populations (14% and 26% in 7 and 19 individuals from northern Spain, respectively) is higher than expected, especially for a south European population. This discrepancy could result from sampling error or from particular demographic events in these populations, or it could indicate that the selection coefficient has not been constant over time, with a stronger selective pressure in the distant past than at present.
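A calculation of this kind can be sketched with the standard deterministic recursion for a dominant advantageous allele, in which carriers have fitness 1 + s and noncarriers have fitness 1, so that p' = p(1 + s) / [1 + s(1 − (1 − p)²)]. The Python sketch below is an illustration under stated assumptions, not the procedure used for Figure 4: the starting frequency (1e-4), the generation time (25 years), and the 5% threshold are all hypothetical choices, and drift is ignored. Note that under Hardy-Weinberg proportions the LP phenotype frequency is 1 − (1 − p)², which gives 75% for p = 0.5 and about 84% for p = 0.6, so the 80% quoted above is presumably approximate.

```python
def next_freq(p, s):
    """Allele frequency after one generation of selection (dominant model)."""
    q = 1.0 - p
    mean_fitness = 1.0 + s * (1.0 - q * q)  # carriers 1+s, noncarriers 1
    return p * (1.0 + s) / mean_fitness

def trajectory(s, p_now, p_start=1e-4):
    """Forward-iterate from p_start (assumed) until p_now is reached."""
    freqs = [p_start]
    while freqs[-1] < p_now:
        freqs.append(next_freq(freqs[-1], s))
    return freqs  # last element corresponds to the present

def years_before_present(freqs, threshold, gen_years=25):
    """Years before present at which the frequency first reached threshold."""
    for i, p in enumerate(freqs):
        if p >= threshold:
            return (len(freqs) - 1 - i) * gen_years
    return 0

def lp_phenotype_freq(p):
    """Fraction of LP individuals under dominance and Hardy-Weinberg."""
    return 1.0 - (1.0 - p) ** 2

for s in (0.03, 0.04, 0.05):
    for p_now in (0.5, 0.6):
        freqs = trajectory(s, p_now)
        when = years_before_present(freqs, 0.05)
        print(f"s={s}, modern p={p_now}: 5% reached about {when} years BP")
```

Changing `gen_years` or `p_start` shifts the dates, which is one reason such trajectories are best read as orders of magnitude rather than precise predictions.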
In conclusion, although ancient DNA data are valuable in this context and have allowed investigators to confirm a progressive increase of LP in Europe since the late Neolithic, the questions around the timing of selection, the strength of the selective pressure, and the geographical origin of the LP allele (LBK or steppe population) are still open. The accumulation of more population data from the late Neolithic, including samples from steppe populations that predate the Yamnaya, should allow a better evaluation of the frequency of the LP allele and the strength of selection in Europe.
CONCLUSION AND PERSPECTIVES
Although LP has been investigated in an impressive number of individuals since the 1960s, some geographical areas remain phenotypically understudied, notably North Africa, West Africa, and the Caucasus. The situation should also be clarified in western Asia, notably in Pakistan, where contrasting results have been obtained, and in Tibetans, for whom we lack phenotypic data entirely. Despite the identification of five LP-associated alleles, part of the molecular basis of the trait remains to be uncovered. Indeed, two populations with substantial LP completely lack known LP-associated mutations: the Hadza from Tanzania, who have an LP frequency of 47% (estimated in 19 individuals with the hydrogen test) (107, 134), and the Wolof from Senegal, who have an LP frequency of 51% (although the phenotypic and genetic data for the latter did not come from the same study) (3, 62). The case of the Hadza is particularly puzzling, because no associated mutation has been found despite a large resequencing effort, including intron 9 of MCM6 (1.3 kb), intron 13 of MCM6 (3.2 kb), and the LCT promoter (2 kb) (107, 134). More generally, the currently known alleles have been calculated to account for at most 45% of the phenotypic variation in African populations (107). Future studies assessing genotype-phenotype associations would benefit from testing LP with both the glucose and hydrogen tests, as indirect approaches are influenced by several confounding factors and likely have a high error rate. An interesting research direction would be to study the gut microbiome composition in LNP individuals, particularly to understand which factors are associated with intestinal symptoms. Additional molecular work (both in vitro and in vivo), notably testing the effect of multiple mutations on lactase expression through mutagenesis, would also help to understand the size of the mutational target and the pathways involved in the regulation of LCT expression.
Concerning the nature of the selective pressures responsible for the increase in LP frequency in multiple populations, it seems clear that broadening the dietary repertoire and being able to derive glucose from milk were strongly favored in some populations. Why this was not the case in all pastoralists and was especially the case in Europeans is not entirely clear but could have resulted from a combination of cultural, nutritional, and environmental factors, such as the preference or need to ferment and transform milk into low-lactose dairy products, the stability and availability of other food, the constraints resulting from seasonality and mobility, and the type of livestock. There is a spectrum of dietary practices, ranging from nonmilking pastoralists (although these are rare) to mostly fermenting pastoralists to heavily milk-drinking pastoralists, and this variety seems to be better correlated with LP frequency than the levels of pastoralism are (28). In any case, more nutritional anthropological studies investigating the amount, type, and seasonality of milk and dairy products consumed and the perception of these foods in traditional populations would be helpful in understanding why some populations took up drinking fresh milk, whereas others mostly transformed it. Additionally, patterns of admixture between pastoral and nonpastoral populations might have played an important role in limiting the efficacy of natural selection.
Interestingly, although most of the focus has been on the cost of drinking milk for LNP individuals, arguments can also be made that consuming milk or dairy has advantages for these individuals. Indeed, recent studies have shown that the gut microbiota of LNP individuals differ from those of LP individuals from the same population in that they have a higher prevalence of Bifidobacterium (11, 15, 47), which is explained by the larger amounts of lactose available for bacterial fermentation. In fact, this represents one of the strongest associations to date between genetic variants and variation in gut microbiome composition (11, 15, 47). A consequence is that LNP individuals who consume fermented dairy products have higher levels of short-chain fatty acids, the end products of fermentation (Figure 2), which represent an important source of energy for hosts (between 10% and 30% of their basal metabolic needs) (23). However, most of these fatty acids are usually derived from carbohydrates other than lactose, notably from starch and nonstarch polysaccharides (4, 81). It has been estimated that the ingestion of lactose provides 4 kcal/g when digested in the small intestine (as in LP), whereas it yields approximately 2 kcal/g if fermented in the colon (as in LNP) (118). The difference in energy uptake between LP and LNP individuals is therefore not that large (especially once the benefits from proteins and fat are added). Outside of the energetic value of lactose, some have proposed that lactose should be considered a prebiotic for LNP individuals (133) because it stimulates the growth of lactic acid bacteria that are thought to be beneficial for human health, notably owing to their production of antibacterial peptides and their stimulation of the host immune system (26, 72). This feature could therefore represent an alternative selective advantage of fermenting milk in populations with low amounts of vegetal carbohydrates or high pathogenic loads.
Finally, the lactase enzyme—more accurately known as the lactase-phlorizin hydrolase—has the ability to hydrolyze not only lactose but also other β-galactosides and β-glucosides, such as phlorizin and flavonoid glucosides in plants (24, 129). Because LP frequencies are high in some nonpastoral populations with no milk in their diets [namely the Khoisan-speaking Hadza hunter-gatherers (47%) and the Afro-Asiatic Yaaku hunter-gatherers (78%)], some have proposed that LP could also be selected for its broader role in the hydrolysis of vegetal molecules (107, 134). However, as pointed out by Gerbault et al. (45), this hypothesis may not be realistic, given that another intestinal enzyme, cytosolic β-glucosidase, is also able to hydrolyze the same compounds (25). In the end, more research will be needed to explain why these hunter-gatherers are lactase persistent.
DISCLOSURE STATEMENT
The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.
ACKNOWLEDGMENTS
We wish to thank Molly Przeworski for her helpful comments on an earlier version of this review.
LITERATURE CITED