
Molecular phylogenetics reveals past gene flow between indigenous North African, Iberian and Balkan cattle

Volume 64, Article 06 (JNS_AB_64_6.pdf)





1Unité Laboratoire des Productions Animales et Fourragères, Institut National de la Recherche Agronomique de Tunisie, Université de Carthage, Rue Hédi Karray, Ariana, 2049, Tunisia.

GABI, INRA, AgroParisTech, Université Paris Saclay, 78350 Jouy-en-Josas, France.

Abstract – North Africa has historically been a crossroads of many civilizations, human migrations and trade, which are expected to have left distinctive footprints in the genome of the indigenous cattle of this region as a result of their co-migration with humans. The aim of this study was to investigate the phylogenetic relationships and admixture patterns between indigenous Tunisian, Iberian and Balkan cattle in relation to historical human trade and migration between these three regions. For this purpose, we used data from 39 Tunisian individuals genotyped with the Illumina BovineSNP50 BeadChip v2 and compared them with six Spanish breeds, selected as representative of Iberian cattle, and four Balkan populations. In addition to the African taurine introgression into Iberian cattle (ranging from 10.3% to 16.2%) reported by previous studies, we found genetic evidence for past gene flow between Iberian and North African cattle. Likewise, our analysis based on f3 statistics and on estimates of genetic differentiation between populations clearly suggests past gene flow between North African and Balkan cattle. Further, in addition to a common ancestry shared by the three cattle types, model-based clustering showed that Tunisian cattle share with the Balkan populations a second common ancestry that is not present in Iberian breeds. This finding suggests at least two waves of admixture between North African and Balkan cattle. The first carried North African alleles into Balkan populations through Iberian cattle, while the second occurred directly between North African and Balkan cattle (probably during Ottoman control of North Africa between the 16th and the 19th century).
Our phylogenetic analyses of cattle from three key regions of the Mediterranean Basin show that admixture patterns between cattle populations are most likely more complex than previously thought, because of multiple waves of population admixture occurring at different periods of history.

Keywords: Tunisian Cattle, Iberian cattle, Balkan cattle, SNP, Admixture, Population structure.


  1. Introduction

It is commonly accepted that cattle were domesticated from wild aurochs (Bos primigenius) between 8000 and 10,000 years ago. Most studies favour at least two independent domestication sites, the first located in the Fertile Crescent and the second in the Indus valley (Loftus et al. 1994). Cattle spread to many parts of the world in conjunction with human migrations. For example, they accompanied the migration of Neolithic farmers from the Near East into Greece and the Balkan region during the early part of the seventh millennium BC (Pinhasi and von Cramon-Taubadel, 2009). Later on, cattle became widespread all over Europe following at least two main routes. The first is a land route, called the Danubian route, along which cattle moved through the Balkans into the plains of central and Northern Europe (Cymbron et al. 2005). The second route is sea-based and ran along the Mediterranean coast into the Iberian peninsula (Spain and Portugal) through Corsica and Southern France (Price 2000). In addition to their Near Eastern origin, Iberian cattle have been shown to carry a substantial influence from African taurines (Cymbron et al. 1999; Decker et al. 2014). Archaeological studies reported that ancient domesticated African cattle were introduced from the first domestication center (i.e., the Fertile Crescent) through Egypt and the Horn of Africa in about 5000 BC (Payne and Hodges 1997). These cattle spread across northeast and northern Africa, where they interbred with wild aurochs native to the region, causing their divergence from European and Asian taurines (Decker et al. 2014). Although many studies have provided genetic evidence for an African taurine influence on Iberian cattle, there is not yet any direct genetic evidence of past admixture between Iberian and North African cattle, because of the lack of North African DNA samples.
Recently, using medium-density SNP arrays, North African cattle were shown to have an admixed South European × African origin (Ben Jemaa et al. 2015; Ben Jemaa et al. 2018). For their part, indigenous Balkan and Anatolian cattle breeds were shown to form one of the five major groups that can be distinguished within European breeds using genetic markers (Felius et al. 2014). Analysis of mtDNA showed that the T3 haplogroup is dominant in most Balkan cattle breeds (Ivanković et al. 2014). Despite the Ottoman presence in North Africa (except Morocco) over three centuries, from 1574 to 1881 (Hathaway and Barbir, 2014), there is no evidence of cattle migration between North Africa and the Balkans during this period. In the present study, we sought to investigate the phylogenetic relationships and admixture patterns of cattle populations sampled from Tunisia (as representative of North African cattle), Spain (as representative of Iberian cattle) and the Balkans using medium-density SNP chips.

  1. Material and methods

    1. Genotyping

For this study, we selected genotypes of 39 Tunisian samples from a previous study that used the BovineSNP50 BeadChip v2 (Ben Jemaa et al. 2018). We combined these genotypes with those of individuals belonging to Iberian taurines (6 breeds), Balkan taurines (4 breeds) and one African taurine population (N’Dama). In addition, we included genotyping data from 20 Bali cattle (Bos javanicus domesticus), which were used as an outgroup in our phylogenetic analysis. These genotyping data were obtained from Decker et al. (2014). Genotypes for 49 233 SNPs were available for these breeds. The number of animals per breed ranged from 5 to 39 (Table 1).

    1. SNP quality control and marker selection

We used PLINK v1.9 (Purcell et al. 2007) for quality control of the genotyping data. Samples genotyped for less than 85 % of the markers, SNPs genotyped in less than 75 % of the animals and SNPs with a minor allele frequency (MAF) below 0.05 were discarded. Using these criteria, 10 423 SNPs were removed because of missing genotype data and 4 223 SNPs were removed by the MAF threshold, leaving 34 587 SNPs spread over all autosomes for further analysis.
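The filtering logic described above can be illustrated with a minimal pure-Python sketch. It is not the PLINK implementation: the function names, the genotype encoding (0, 1 or 2 copies of an allele, None for a missing call) and the order of the filters (samples first, then SNP call rate, then MAF) are assumptions made for illustration only. The last line simply checks the SNP counts reported in the text.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: 0, 1 or 2 copies of the counted allele; None = missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls."""
    if not calls:
        return 0.0
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """MAF of one SNP, computed from its non-missing calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1.0 - freq)


def quality_control(
    genotypes: List[List[Genotype]],
    min_sample_call: float = 0.85,
    min_snp_call: float = 0.75,
    min_maf: float = 0.05,
) -> Tuple[List[int], List[int], int, int]:
    """genotypes[i][j] is the call of sample i at SNP j.

    Returns (kept samples, kept SNPs, SNPs removed for missingness,
    SNPs removed by the MAF threshold).
    """
    n_snps = len(genotypes[0]) if genotypes else 0
    # Assumed order: discard poorly genotyped samples before filtering SNPs.
    kept_samples = [
        i for i, row in enumerate(genotypes) if call_rate(row) >= min_sample_call
    ]
    kept_snps: List[int] = []
    removed_missing = 0
    removed_maf = 0
    for j in range(n_snps):
        column = [genotypes[i][j] for i in kept_samples]
        if call_rate(column) < min_snp_call:
            removed_missing += 1
        elif minor_allele_frequency(column) < min_maf:
            removed_maf += 1
        else:
            kept_snps.append(j)
    return kept_samples, kept_snps, removed_missing, removed_maf


# Consistency check of the counts reported in the text.
print(49233 - 10423 - 4223)  # 34587
```

With real data the genotype matrix would be read from the PLINK files; the sketch only makes the three thresholds and the resulting SNP count explicit.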

    1. Phylogenetic analysis

We used several methods to investigate the relationships between the populations of the study. First, the pairwise fixation index (FST) between populations was estimated using Genepop 4.6 (Rousset 2008). Second, we inferred patterns of splits and mixtures among the 13 populations using TreeMix (Pickrell and Pritchard 2012), setting Bali cattle as the rooting outgroup. We first built a phylogenetic tree of these populations and then added migration events (modeled as edges) to the phylogenetic model. Migration edges were added until 99.89 % of the variance in ancestry between populations was explained by the model. The residuals from the fit of the model to the data were visualized using the R script distributed with TreeMix.
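To make the notion of pairwise genetic differentiation concrete, the following sketch computes FST between two populations from per-SNP allele frequencies. It uses Hudson's estimator in its ratio-of-averages form, chosen here only because it is compact; it is not the estimator implemented in Genepop and would not reproduce the values reported in this study exactly. The function name and arguments are hypothetical.

```python
from typing import Sequence


def hudson_fst(p1: Sequence[float], p2: Sequence[float], n1: int, n2: int) -> float:
    """Hudson's FST between two populations (ratio of averages over SNPs).

    p1, p2: per-SNP frequencies of the same allele in each population.
    n1, n2: number of sampled chromosomes in each population
            (twice the number of diploid individuals), assumed constant over SNPs.
    """
    if len(p1) != len(p2):
        raise ValueError("p1 and p2 must have the same number of SNPs")
    if n1 < 2 or n2 < 2:
        raise ValueError("at least two chromosomes per population are required")
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that two alleles drawn from different populations differ.
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        raise ValueError("no informative SNP")
    return numerator / denominator


# Two populations fixed for alternative alleles at every SNP are fully differentiated.
print(hudson_fst([1.0, 0.0], [0.0, 1.0], n1=78, n2=40))  # 1.0
```

Summing numerators and denominators over SNPs before dividing, rather than averaging per-SNP ratios, keeps rare variants from dominating the estimate.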







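Table 1. Sample description: The population name, the population abbreviation, the geographic origin, the number of individuals (N) and the data origin of each population.

| Population name | Abbreviation | Geographic origin | N | Data origin |
|---|---|---|---|---|
| Anatolian Black | ABB | | | Decker et al. 2014 |
| Illyrian Mountain Buša | IMB | | | Decker et al. 2014 |
| Cika | SIC | | | Decker et al. 2014 |
| Bali | BALI | | 20 | Decker et al. 2014 |
| Berrenda en Colorado | BC | | | Decker et al. 2014 |
| Tunisian local | TUN | | 39 | Ben Jemaa et al. 2018 |
| Busha | BU | | | Decker et al. 2014 |
| Cardena Andaluza | CAR | | | Decker et al. 2014 |
| Mostrenca | MOST | | | Decker et al. 2014 |
| Negra Andaluza | NGA | | | Decker et al. 2014 |
| N’Dama | ND1 | | | Decker et al. 2014 |
| Pirenaica | PIR | | | Decker et al. 2014 |
| Sayaguesa | SA | | | Decker et al. 2014 |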

    1. Population structure and admixture

Principal component analysis (PCA) was performed with the adegenet R package (Jombart 2008) and results were visualized using the same package. To further quantify the different ancestry proportions of the populations of the study, we carried out an unsupervised hierarchical clustering using Admixture 1.23 software (Alexander et al. 2009). Distruct software (Rosenberg, 2004) was then used to graphically display ancestry within each individual.

To provide further support for past admixture between populations, we ran the THREEPOP program implemented in TreeMix. This program calculates f3 statistics for all possible triplets of populations. If a population A is a mixture of two other populations B and C, the statistic f3(A; B, C) is expected to be negative, and the Z-score computed for that triplet takes a significantly negative value.
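The reasoning behind the f3 test can be sketched in a few lines of Python. The example below is a simplified illustration and not the THREEPOP code: it estimates f3(A; B, C) as the average over SNPs of (a - b)(a - c), where a, b and c are allele frequencies, omits the finite-sample bias correction used in practice, and obtains a standard error with a delete-one-block jackknife over contiguous SNP blocks. The function names and the number of blocks are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def f3_per_snp(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> List[float]:
    """Per-SNP contributions (a - b)(a - c) to f3(A; B, C)."""
    return [(x - y) * (x - z) for x, y, z in zip(a, b, c)]


def f3_with_zscore(
    a: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
    n_blocks: int = 10,
) -> Tuple[float, float, float]:
    """Return (f3 estimate, jackknife standard error, Z-score).

    No correction for finite sample size is applied (simplification).
    """
    values = f3_per_snp(a, b, c)
    n = len(values)
    if n_blocks < 2 or n < n_blocks:
        raise ValueError("need at least two blocks and one SNP per block")
    total = sum(values)
    estimate = total / n

    # Delete-one-block jackknife: recompute f3 leaving out each block in turn.
    block_size = n // n_blocks
    leave_one_out = []
    for k in range(n_blocks):
        start = k * block_size
        end = n if k == n_blocks - 1 else start + block_size
        block_sum = sum(values[start:end])
        leave_one_out.append((total - block_sum) / (n - (end - start)))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum((v - mean_loo) ** 2 for v in leave_one_out)
    std_error = math.sqrt(variance)
    z_score = estimate / std_error if std_error > 0 else float("nan")
    return estimate, std_error, z_score
```

If the frequencies of A are a mixture a = αb + (1 - α)c, each term equals -α(1 - α)(b - c)², which is never positive; this is why a significantly negative Z-score is read as evidence that A is admixed between populations related to B and C.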

  1. Results

    1. Phylogenetic analysis and admixture

Pairwise FST analysis revealed that the Anatolian Black (ABB) and the Illyrian Mountain Buša (IMB) had the lowest values with the Tunisian population (0.0482 and 0.0535, respectively). Iberian breeds had higher FST values with the Tunisian population (ranging from 0.0609 for TUN/PIR to 0.0823 for TUN/SA). Among the Balkan populations, IMB and Busha (BU) had the lowest FST values with the Iberian breeds (ranging from 0.0283 for IMB/PIR to 0.061 for BU/SA). It is also worth noting that Balkan breeds had low to moderate pairwise FST values (< 0.07), while Iberian breeds showed low pairwise FST values (0.019 < FST < 0.058).

TreeMix results are shown in Figure 1. The proportion of the variance in ancestry between populations explained by the model began to asymptote at 0.9989 when 7 migration edges were fitted. We found a substantial level of gene flow between the Tunisian cattle and the common ancestor of the Iberian breeds (~34%), as well as between Cika (SIC) and the Iberian Pirenaica (PIR) (14%). Additionally, TreeMix placed a migration edge from SIC to Busha (BU) (46%). We also found significant gene flow between an ancestral population in Asia (which lived after the divergence from Bali cattle) and the Anatolian Black cattle (~24%). Likewise, low levels of African taurine introgression into the Iberian breeds were detected, which occurred before the divergence of these breeds.

Most of the highly significant f3 statistics suggest admixture of African and Balkan cattle, as well as of African and Iberian cattle, in the Tunisian local cattle (Table 2). Additionally, nine of the most significant f3 statistics showed that Anatolian Black cattle are admixed between a non-taurine population that is not present in our sample and either a Balkan or an Iberian breed.



Figure 1. Maximum likelihood tree constructed with TreeMix, inferred from the 13 cattle populations of the study when 7 migration events (modeled as arrows) were allowed. Migration arrows are colored according to their weight.


Table 2. The twenty most significant f3 statistics for the populations of the study. Columns include the f3 statistic and its standard error.
    1. Population structure

Principal component analysis was carried out to establish the relationships among the populations of the study (Figure 2). The first and second components (PC1 and PC2) explained 15.7 % and 4.92 % of the variation, respectively. PCA grouped individuals into clusters according to their population of origin. IMB and BU were placed near the Iberian breeds, which were themselves grouped close to each other within a small area. Tunisian individuals occupied an intermediate position, both between Iberian and Anatolian Black cattle (along PC1) and between African and European cattle (along PC2). Bali individuals showed the highest dispersion around their center of gravity overall, while ABB individuals showed the highest dispersion among the Balkan breeds.

Figure 2. PCA results of allele frequencies obtained from 34,587 SNPs genotyped in 183 cattle individuals from 13 populations (PC1: 15.7 % and PC2: 4.92 %). ABB=Anatolian Black; IMB=Illyrian Mountain Buša; SIC=Cika; BALI=Bali; BC=Berrenda en Colorado; TUN=Tunisian local; BU=Busha; CAR=Cardena Andaluza; MOST=Mostrenca; NGA=Negra Andaluza; ND1=N’Dama; PIR=Pirenaica; SA=Sayaguesa.

To infer patterns of admixture and the proportions of ancestral populations, we carried out model-based unsupervised hierarchical clustering for different numbers K of predefined clusters (Figure 3). In the K=2 model, all taurines were assigned to a single cluster. N’Dama (ND1), Tunisian and Balkan populations nevertheless showed slight traces of non-taurine introgression in their genomes. When K was set to 3, the African taurine ND1 was separated from the European cattle. Tunisian individuals were mainly composed of African and European ancestries (58% and 41%, respectively), while all Balkan cattle except SIC also showed a significant amount of African taurine introgression in their genomes (ranging, on average, from 11% for IMB to 23.6% for ABB). Similarly, African taurine introgression was detected in all Iberian breeds, ranging, on average, from 13% for Mostrenca (MOST) to 14.16% for Cardena Andaluza (CAR). At K=5, Tunisian cattle, BU, IMB and most ABB individuals shared a substantial common ancestry with the Iberian cattle (ranging from 21% to 49%). Furthermore, Tunisian individuals showed a second common ancestry (different from the first one) with ABB, IMB and BU that is not present in Iberian breeds (blue color).



Figure 3. Unsupervised hierarchical clustering of the 183 individuals from the 13 populations. Results for K (number of clusters) = 2, K=3 and K=5 are shown. Individuals are grouped by population (separated by black lines). Each individual is represented by a vertical bar. The proportion of the bar in each of the K colors corresponds to the average posterior likelihood that the individual is assigned to the cluster indicated by that color.

  1. Discussion

In this study, we aimed to investigate the genetic relationships between Tunisian, Iberian and Balkan cattle using medium-density SNP chips. Previous studies showed that Tunisian cattle have a low degree of divergence from the ancient cattle population that lived in the area hundreds of years ago (Ben Jemaa et al. 2015). Furthermore, this population was shown to have the same genetic structure as Algerian indigenous cattle (Ben Jemaa et al. 2018). Therefore, we can consider the Tunisian local cattle to be representative of the indigenous cattle population that lived in Northern Africa.

Among the populations of the study, Anatolian Black and Tunisian individuals showed the highest genetic heterogeneity. This is reflected in the higher dispersion of these individuals around their center of gravity in the PCA (Figure 2) and in the Admixture results (K=5; Figure 3). In contrast, Iberian breeds have a homogeneous genetic structure (though low levels of admixture with foreign breeds are still observable) that results from the use of rigorous breeding schemes.

Our results support previous studies that reported a significant direct influence of African cattle (ND1) on the Iberian breeds (e.g., Cymbron et al. 1999; Anderung et al. 2005; Decker et al. 2014). Nonetheless, because of the absence of samples from North Africa, none of these studies reported direct genetic evidence for a North African influence on Iberian cattle. Carvajal-Carmona et al. (2003) inferred such an influence by studying the diversity of mtDNA in Colombian criollo cattle. Likewise, we previously reported indirect genetic evidence for past gene flow between Tunisian and Iberian cattle when we found low levels of genetic differentiation between the Tunisian population and the Creole breed from Guadeloupe (CGU), as the latter was introduced into the Caribbean islands by Spanish and Portuguese conquerors after the second voyage of Columbus in 1493 (Ben Jemaa et al. 2015). The present study provides direct genetic evidence for past admixture between North African and Iberian cattle, which occurred before the divergence of the Iberian breeds.

Furthermore, most studies reported exportation from Africa to Iberia, either during Muslim control of the Iberian peninsula (711-1492) (Payne 1978; Cymbron et al. 1999) or earlier, during the Bronze Age (3300 to 1200 B.C.) (Anderung et al. 2005). Studies reporting cattle exportation in the opposite direction are much less frequent (e.g., Decker et al. 2014).

According to the Genepop results, among the Iberian breeds, Pirenaica (PIR) had the lowest FST value with Tunisian cattle (0.0609), while Sayaguesa (SA) and Mostrenca (MOST) had the highest (0.0823 and 0.0816, respectively). This difference is most likely due to the greater genetic drift experienced by the SA and MOST breeds owing to stronger reproductive isolation, as shown by their longer branch lengths on the phylogenetic network and their higher FST values with the closest Iberian breed.

In this study, we also shed light on the phylogenetic relationships and admixture between North African and Balkan cattle. To our knowledge, no previous genetic study has reported past cattle admixture between these two regions. Our f3 statistics, the low genetic differentiation and the high admixture levels between the Tunisian population on the one hand and Anatolian Black, Illyrian Mountain Buša and Busha on the other hand all suggest past gene flow between North African and Balkan cattle. Interestingly, in addition to the common ancestry shared with Balkan and Iberian populations, Tunisian cattle share a second common ancestry with three Balkan populations that is not present in Iberian cattle (Figure 3, K=5). Furthermore, we found a lower level of differentiation between Tunisian and Balkan populations than between Tunisian and Iberian breeds. In addition, f3 statistics, pairwise FST and TreeMix provide strong evidence of gene flow between Iberian and Balkan cattle. Taken together, these three observations indicate at least two waves of admixture between North African and Balkan cattle. The first carried North African alleles into Balkan populations through Iberian cattle, while the second occurred directly between North African and Balkan cattle. We believe that this direct gene flow took place during Ottoman rule in North Africa, which lasted from the mid-16th century until the French conquest of Tunisia in 1881.

Decker et al. (2014) reported that Anatolian breeds carry introgression from African taurines. According to our results, the African taurine introgression into the ABB breed may come from North Africa. In addition, TreeMix placed two migration edges originating from a position remote from BALI into ABB and IMB, indicating introgression from an unsampled population into Balkan cattle. Previous studies suggest that this unsampled population belongs to indicine cattle (Decker et al. 2014; Upadhyay et al. 2017).

We also observed signals of past hybridization between Iberian and Balkan cattle (Figure 1, Table 2). Pairwise FST values suggest that the relationship between these two groups is closer than that between Tunisian and Iberian breeds. By the same argument, Iberian cattle are closer to Balkan cattle than Tunisian cattle are to any Balkan breed except the Anatolian Black, which showed a surprisingly low FST value with Tunisian cattle, indicating that these two populations are historically more connected with each other than either is with the Iberian breeds. This finding provides further support for the hypothesis put forward by Decker et al. (2014), who considered that modern Anatolian breeds do not represent the taurine populations originally domesticated in this region.

  1. Conclusions

The results of the present study have several important implications for understanding the complex history of cattle breed formation and admixture in key geographic regions such as North Africa, Iberia and the Balkans. Most importantly, the present study sheds light, for the first time, on the close genetic relationship between North African and Balkan cattle, which results from multiple waves of admixture occurring at different periods of history. Finally, our study provides, for the first time, direct genetic evidence of past gene flow between North African and Iberian cattle, which occurred in both directions.

  1. Acknowledgements

This work was supported by the International Foundation for Science (IFS) (Grant Number B4578-1).


  1. References

Alexander, D.H., Novembre, J., Lange, K., 2009. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664. https://doi.org/10.1101/gr.094052.109

Anderung, C., Bouwman, A., Persson, P., Carretero, J.M., Ortega, A.I., Elburg, R., Smith, C., Arsuaga, J.L., Ellegren, H., Götherström, A., 2005. Prehistoric contacts over the Straits of Gibraltar indicated by genetic analysis of Iberian Bronze Age cattle. Proc Natl Acad Sci U S A 102, 8431–8435. https://doi.org/10.1073/pnas.0503396102

Ben Jemaa, S., Boussaha, M., Ben Mehdi, M., Lee, J.H., Lee, S.-H., 2015. Genome-wide insights into population structure and genetic history of Tunisian local cattle using the Illumina BovineSNP50 BeadChip. BMC Genomics 16. https://doi.org/10.1186/s12864-015-1638-6

Ben Jemaa, S., Rahal, O., Gaouar, S.B.S., Mastrangelo, S., Boussaha, M., Ciani, E., 2018. Genomic characterization of Algerian Guelmoise cattle and their genetic relationship with other North African populations inferred from SNP genotyping arrays. Livestock Science 217, 19–25. https://doi.org/10.1016/j.livsci.2018.09.009

Carvajal-Carmona, L.G., Bermudez, N., Olivera-Angel, M., Estrada, L., Ossa, J., Bedoya, G., Ruiz-Linares, A., 2003. Abundant mtDNA diversity and ancestral admixture in Colombian criollo cattle (Bos taurus). Genetics 165, 1457–1463.

Cymbron, T., Freeman, A.R., Isabel Malheiro, M., Vigne, J.-D., Bradley, D.G., 2005. Microsatellite diversity suggests different histories for Mediterranean and Northern European cattle populations. Proc Biol Sci 272, 1837–1843. https://doi.org/10.1098/rspb.2005.3138

Cymbron, T., Loftus, R.T., Malheiro, M.I., Bradley, D.G., 1999. Mitochondrial sequence variation suggests an African influence in Portuguese cattle. Proc Biol Sci 266, 597–603.

Decker, J.E., McKay, S.D., Rolf, M.M., Kim, J., Molina Alcalá, A., Sonstegard, T.S., Hanotte, O., Götherström, A., Seabury, C.M., Praharani, L., Babar, M.E., Correia de Almeida Regitano, L., Yildiz, M.A., Heaton, M.P., Liu, W.-S., Lei, C.-Z., Reecy, J.M., Saif-Ur-Rehman, M., Schnabel, R.D., Taylor, J.F., 2014. Worldwide Patterns of Ancestry, Divergence, and Admixture in Domesticated Cattle. PLoS Genet 10. https://doi.org/10.1371/journal.pgen.1004254

Felius, M., Beerling, M.-L., Buchanan, D.S., Theunissen, B., Koolmees, P.A., Lenstra, J.A., 2014. On the History of Cattle Genetic Resources. Diversity 6, 705–750. https://doi.org/10.3390/d6040705

Green, R.E., Krause, J., Briggs, A.W., Maricic, T., Stenzel, U., Kircher, M., Patterson, N., Li, H., Zhai, W., Fritz, M.H.-Y., Hansen, N.F., Durand, E.Y., Malaspinas, A.-S., Jensen, J.D., Marques-Bonet, T., Alkan, C., Prüfer, K., Meyer, M., Burbano, H.A., Good, J.M., Schultz, R., Aximu-Petri, A., Butthof, A., Höber, B., Höffner, B., Siegemund, M., Weihmann, A., Nusbaum, C., Lander, E.S., Russ, C., Novod, N., Affourtit, J., Egholm, M., Verna, C., Rudan, P., Brajkovic, D., Kucan, Ž., Gušic, I., Doronichev, V.B., Golovanova, L.V., Lalueza-Fox, C., de la Rasilla, M., Fortea, J., Rosas, A., Schmitz, R.W., Johnson, P.L.F., Eichler, E.E., Falush, D., Birney, E., Mullikin, J.C., Slatkin, M., Nielsen, R., Kelso, J., Lachmann, M., Reich, D., Pääbo, S., 2010. A Draft Sequence of the Neandertal Genome. Science 328, 710–722. https://doi.org/10.1126/science.1188021

Hathaway, J., Barbir, K., 2014. The Arab Lands under Ottoman Rule: 1516-1800. Routledge.

Ivanković, A., Paprika, S., Ramljak, J., Dovč, P., Konjačić, M., 2014. Mitochondrial DNA-based genetic evaluation of autochthonous cattle breeds in Croatia. Czech journal of animal science 59, 519–528.

Jombart, T., 2008. adegenet: a R package for the multivariate analysis of genetic markers. Bioinformatics 24, 1403–1405. https://doi.org/10.1093/bioinformatics/btn129

Loftus, R.T., MacHugh, D.E., Bradley, D.G., Sharp, P.M., Cunningham, P., 1994. Evidence for two independent domestications of cattle. PNAS 91, 2757–2761. https://doi.org/10.1073/pnas.91.7.2757

Patterson, N., Moorjani, P., Luo, Y., Mallick, S., Rohland, N., Zhan, Y., Genschoreck, T., Webster, T., Reich, D., 2012. Ancient Admixture in Human History. Genetics 192, 1065–1093. https://doi.org/10.1534/genetics.112.145037

Payne, W.J.A., 1970. Cattle production in the tropics. [London] : Longman.

Payne, W.J.A., Hodges, J., 1997. Tropical cattle: origins, breeds and breeding policies.

Pickrell, J.K., Pritchard, J.K., 2012. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. PLoS Genet 8. https://doi.org/10.1371/journal.pgen.1002967

Pinhasi, R., Cramon-Taubadel, N. von, 2009. Craniometric Data Supports Demic Diffusion Model for the Spread of Agriculture into Europe. PLOS ONE 4, e6747. https://doi.org/10.1371/journal.pone.0006747

Price, T.D., 2000. Europe’s First Farmers. Cambridge University Press.

Purcell, S., Neale, B., Todd-Brown, K., Thomas, L., Ferreira, M.A.R., Bender, D., Maller, J., Sklar, P., de Bakker, P.I.W., Daly, M.J., Sham, P.C., 2007. PLINK: A Tool Set for Whole-Genome Association and Population-Based Linkage Analyses. Am J Hum Genet 81, 559–575.

Rousset, F., 2008. genepop’007: a complete reimplementation of the genepop software for Windows and Linux. Molecular Ecology Resources 8, 103–106. https://doi.org/10.1111/j.1471-8286.2007.01931.x

Upadhyay, M.R., Chen, W., Lenstra, J.A., Goderie, C.R.J., MacHugh, D.E., Park, S.D.E., Magee, D.A., Matassino, D., Ciani, F., Megens, H.-J., van Arendonk, J.A.M., Groenen, M.A.M., 2017. Genetic origin, admixture and population history of aurochs (Bos primigenius) and primitive European cattle. Heredity (Edinb) 118, 169–176. https://doi.org/10.1038/hdy.2016.79



This article is published under license to Journal of New Sciences. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
