Inference of Population Structure in Whale Sharks

Chris Keane, Hanna Jackson, Ross Pearsall, Brady Surbey

Department of Biological Sciences, Simon Fraser University

*This report is a class project, NOT peer-reviewed science.


We reanalysed a significant proportion of the currently available whale shark (Rhincodon typus) microsatellite population genetic data to make inferences about their population structure.

We tested the hypothesis that whale sharks form two populations rather than one global population. The current literature and our analysis indicate that admixture has been insufficient to counteract genetic drift, resulting in two subpopulations. Few concrete conclusions can be drawn, both because the species is difficult to observe and study and because assigning population structure to animals in seemingly continuous aquatic environments is inherently complicated.

We used the program STRUCTURE to test for the presence or absence of genetic structure. Two of the nine populations show less genetic similarity to the others, and one of them, the Gulf of Mexico population, displays by far the most discernible genetic structure. Although the genetic data show that this population is more isolated than those at the other locations, there is still genetic and tracking evidence of migration involving this population.

From our analysis, we learned that in studies of population structure, especially in aquatic species such as the whale shark, genetic data can reveal not only the genetic attributes of a population but also its physical and spatial characteristics. Further, we note that population structure is a complex concept that is difficult to define clearly, yet it has a significant effect on our understanding of other aspects of an organism’s biology.


Our goal was to obtain whale shark population genetic data and analyse them to better understand the species’ population structure. This project aims to advance current understanding of the potential biological consequences of population structure in whale sharks, such as decreased effective population size, increased genetic drift, decreased efficacy of selection and increased homozygosity. Knowing the extent of their population structure is therefore key to better understanding the biology of this charismatic animal.

We chose to reanalyse a 2014 paper by Vignaud and colleagues on the population structure of whale sharks, for which they collected genetic data from individuals all over the world (Vignaud et al. 2014). Their data set makes up a large proportion of the currently available population genetic data for whale sharks. They found that whale sharks do not form a single global meta-population, but rather make up at least two subpopulations.

Along with our reanalysis of this paper’s data, we reviewed the surrounding literature to determine the extent to which other research agrees with Vignaud’s result. Overall, the data are insufficient to support firm claims, but some tentative conclusions can be drawn.


2.1 | Distribution

Whale sharks (Rhincodon typus) are widely distributed in warm temperate and tropical seas worldwide, except the Mediterranean (Colman 1997, Compagno 2001, Stewart and Wilson 2005). Their distribution is related to oceanographic features such as areas of enhanced upwelling and plankton productivity (Eckert and Stewart 2001), temperature (Sequeira et al. 2012), ocean currents and biological productivity (Stewart and Wilson 2005). One study found that 90% of tracked animals stayed in waters between 27.5 and 29 degrees Celsius (Sequeira et al. 2012); however, they are also known to dive to depths of 1000 m or more, where temperatures are 5 degrees Celsius or less (Stewart and Wilson 2005). Their movement patterns consist of sporadic occurrences and seasonal aggregations (Colman 1997, Stewart and Wilson 2005), making them rather difficult to track effectively. Their numbers have recently declined, as measured by decreases in both catch per unit effort and sightings (Stewart and Wilson 2005).

2.2 | Biology

Whale sharks are large-bodied fish with slow growth, late maturation and extended longevity (Colman 1997). This limits recruitment and makes the species susceptible to exploitation (Colman 1997). They filter-feed on small crustaceans, fish, plankton and macroalgae (Colman 1997). Their effective population size has been estimated by several methods, with varying results. One genetic study estimated a population size of 238,000 to 476,000 animals (Castro et al. 2007), although such estimates are highly variable because of small sample sizes.

2.3 | Population Structure

Different population structures and migration patterns have been observed in large fish and cetaceans. Some species have distinct subpopulations, some have continuous populations, and others have high migration yet distinct subpopulations (Bremer et al. 1996, Fontaine et al. 2007, Boissin et al. 2019, Pirog et al. 2019).

Whale shark population structure has been the subject of much inquiry, via photo identification (McKinney et al. 2017), telemetry tracking (Eckert and Stewart 2001), DNA (Castro et al. 2007, Vignaud et al. 2014), eDNA (Sigsgaard et al. 2017) and iDNA analysis (Meekan et al. 2017). These studies vary in their statistical power to infer the population structure of whale sharks, and although the details of this structure have yet to be worked out, a general pattern has emerged.

One of the major studies on this topic collected genetic data from whale sharks all across the world. It found that mtDNA diversity, haplotype diversity and nucleotide diversity were similar across all sampling sites except one, the Gulf of Mexico (Vignaud et al. 2014). Based on this, as well as on FST values, the authors concluded that the Indo-Pacific population has little genetic structure; however, when the Atlantic Gulf of Mexico population was included, there was significant genetic structure (Vignaud et al. 2014). This supports the conclusion that there are at least two whale shark populations that rarely mix, one Atlantic and one Indo-Pacific (Vignaud et al. 2014).

Another influential genetic study analysed similar data and found less conclusive evidence (Castro et al. 2007). The most common haplotype was found globally; however, there was low gene flow and some genetic structure (Castro et al. 2007). Like Vignaud et al., the authors found no population structure between the Indian and Pacific basins, but they did not find that the Atlantic population was differentiated, indicating low overall population structure for this species (Castro et al. 2007).

A different research group analysed eDNA, sampling seawater and screening it for whale shark genetic material (Sigsgaard et al. 2017). Consistent with earlier findings, this group found that the Gulf of Mexico population was significantly differentiated from the Indo-Pacific population (FST=0.3) and that the Indo-Pacific populations were not significantly differentiated from one another (FST=0-0.3). iDNA techniques have also been deployed, with similar results indicating that the Atlantic Gulf of Mexico population is significantly differentiated from the Indo-Pacific population (Meekan et al. 2017).
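As a rough illustration of what an FST value like those quoted above measures, the following sketch computes FST = (HT - HS) / HT for a single locus from the allele frequencies of two equally weighted populations. This is a simplified textbook form, not necessarily the estimator used in the cited studies, and the function names and example frequencies are hypothetical rather than whale shark data.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one locus."""
    return 1.0 - sum(p * p for p in freqs)


def fst_two_populations(freqs_a, freqs_b):
    """FST = (H_T - H_S) / H_T for two equally weighted populations."""
    if len(freqs_a) != len(freqs_b):
        raise ValueError("both populations need the same allele list")
    # H_S: mean within-population expected heterozygosity
    h_s = (expected_heterozygosity(freqs_a)
           + expected_heterozygosity(freqs_b)) / 2.0
    # H_T: expected heterozygosity of the pooled allele frequencies
    pooled = [(a + b) / 2.0 for a, b in zip(freqs_a, freqs_b)]
    h_t = expected_heterozygosity(pooled)
    if h_t == 0.0:
        return 0.0  # locus is fixed for the same allele in both populations
    return (h_t - h_s) / h_t


# Invented allele frequencies for illustration only.
identical = fst_two_populations([0.5, 0.5], [0.5, 0.5])  # no differentiation
fixed = fst_two_populations([1.0, 0.0], [0.0, 1.0])      # complete differentiation
```

Identical allele frequencies give a value of 0, and populations fixed for different alleles give a value of 1, so intermediate values indicate partial differentiation.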

These analyses were performed with many different forms of population genetic data and with programs capable of analysing those data. Our objective was to reanalyse a significant portion of the currently available population genetic data.


Our analyses were done on the Vignaud team’s microsatellite dataset. We used STRUCTURE (Pritchard, Stephens, & Donnelly, 2000; Falush, Stephens, & Pritchard, 2003; Falush, Stephens, & Pritchard, 2007; Hubisz, Falush, Stephens, & Pritchard, 2009) to estimate how many subpopulations exist within the global whale shark population. We tested K values from 2 to 8, with a burn-in of 50,000 iterations followed by 50,000 repetitions of the Markov chain Monte Carlo algorithm. The simulation for each K value was replicated 11 times.
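The run design described above (K from 2 to 8, 11 replicates each) can be sketched as a small script that lists one STRUCTURE invocation per run. This is only an illustration under stated assumptions: the output prefix and seed scheme are hypothetical, the `-K`, `-o` and `-D` options should be checked against the STRUCTURE documentation for the version in use, and the burn-in and repetition counts are assumed to be set in the mainparams file rather than on the command line.

```python
from itertools import product

K_VALUES = range(2, 9)   # K = 2..8, as in the text
N_REPLICATES = 11        # replicate runs per K, as in the text
BURNIN = 50_000          # burn-in length, as in the text (set in mainparams)
NUMREPS = 50_000         # MCMC repetitions after burn-in (set in mainparams)


def build_runs(out_prefix="whale_shark"):
    """Return one argument list per (K, replicate) STRUCTURE run.

    The output prefix and the seed scheme are hypothetical choices made
    for this sketch; nothing is executed here.
    """
    runs = []
    for k, rep in product(K_VALUES, range(1, N_REPLICATES + 1)):
        seed = 1000 * k + rep  # distinct seed per run (illustrative choice)
        runs.append([
            "structure",
            "-K", str(k),
            "-o", f"{out_prefix}_K{k}_rep{rep}",
            "-D", str(seed),
        ])
    return runs


if __name__ == "__main__":
    runs = build_runs()
    print(len(runs), "runs; first:", " ".join(runs[0]))
```

Building the argument lists separately from executing them makes it easy to check the full grid of 7 x 11 = 77 runs before committing computing time.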

Vignaud et al. (2014) found that, by many metrics, the Gulf of Mexico population was significantly differentiated from the Indo-Pacific population. Our results further support their conclusions. At K = 2, few distinctions are apparent (Figure 1). However, as K increases, the Gulf of Mexico (group 7) subpopulation appears more distinct than any other deme (Figure 1). These simulations are consistent with the Vignaud team’s assessment that there are two distinct global subpopulations of whale sharks (Vignaud et al., 2014). Another pattern that emerged in our STRUCTURE analyses is the apparent differentiation of the Djibouti (group 2) subpopulation (Figure 1). Overall, the whale shark population appears to comprise at least two subpopulations at all values of K. However, Vignaud et al. (2014) suggest that any significant differentiation of the Djibouti group could be an artefact of some of its microsatellite data coming from preserved samples, which are of lower quality than the other samples.

Figure 1. STRUCTURE plots based on 14 microsatellite DNA loci from 406 whale sharks. K values from 2 (top) to 8 (bottom) were used. Each vertical bar represents one individual. Colours correspond to the clusters inferred at each value of K. Bars with multiple colours represent individuals assigned to multiple subpopulations, indicating possible gene flow between the subpopulations.
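One way to put numbers on the visual impression given by plots like Figure 1 is to average the membership coefficients of the individuals in each sampling group. The sketch below does this for a tiny made-up membership matrix at K = 2; the site labels and values are invented for illustration and are not results from our runs.

```python
from collections import defaultdict


def mean_membership_by_group(groups, q_matrix):
    """Average membership coefficients (rows of Q) within each sampling group."""
    sums = {}
    counts = defaultdict(int)
    for group, row in zip(groups, q_matrix):
        if group not in sums:
            sums[group] = [0.0] * len(row)
        # Add this individual's coefficients to the running total for its group
        sums[group] = [s + q for s, q in zip(sums[group], row)]
        counts[group] += 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


# Invented example for K = 2: two individuals from each of two sites.
groups = ["site A", "site A", "site B", "site B"]
q_matrix = [
    [0.50, 0.50],
    [0.75, 0.25],
    [0.25, 0.75],
    [0.00, 1.00],
]
means = mean_membership_by_group(groups, q_matrix)
```

A group whose mean membership is concentrated in a single cluster would correspond to a block of nearly uniform colour in a STRUCTURE plot, whereas evenly split means correspond to mixed colouring.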


We found no clear evidence that the whale shark population exists as anything other than two distinct subpopulations. Our analyses agree with our hypothesis and with previous studies that there are likely two distinct global subpopulations of whale sharks, one in the Gulf of Mexico and the other spanning the rest of the world (Vignaud et al., 2014; Meekan et al., 2017). However, our assessment leaves open the possibility that another group exists in the Djibouti area.

Whilst our simulations from this fascinating dataset are largely consistent with other papers, it seems clear that this is a complicated matter with no definitive answer yet. Even if the Gulf of Mexico subpopulation is distinct, the highly mixed colouring in the STRUCTURE plots suggests that interbreeding and migration could still be occurring across the globe.

This analysis shows that the global population genetic structure of whale sharks is more complex than a single panmictic population. This structure complicates the biology of the species. With population structure, the effective population size of any given intermixing group is smaller than it would be if the species were one globally panmictic population. With this decreased effective population size, genetic drift will have more influence over allele frequencies and selection will be less effective at purging deleterious alleles (Kimura, 1968). This could reduce the genetic health of the populations and ultimately contribute to declines in whale shark numbers.
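The link between effective population size and drift can be made concrete with the standard expectation that, under drift alone in an idealised diploid population, heterozygosity decays as H_t = H_0 (1 - 1/(2 Ne))^t. The sketch below evaluates this expression; the function name, starting heterozygosity and Ne values are arbitrary illustrations and are not estimates for whale sharks.

```python
def expected_heterozygosity_after_drift(h0, ne, generations):
    """H_t = H_0 * (1 - 1/(2*Ne))**t for an idealised diploid population."""
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    return h0 * (1.0 - 1.0 / (2.0 * ne)) ** generations


if __name__ == "__main__":
    # Arbitrary values: one large population versus a smaller isolated one.
    for ne in (100_000, 1_000):
        h = expected_heterozygosity_after_drift(h0=0.8, ne=ne, generations=1_000)
        print(f"Ne={ne}: expected heterozygosity after 1000 generations = {h:.3f}")
```

The smaller the effective population size, the faster heterozygosity is expected to be lost, which is the sense in which subdivision strengthens drift within each group.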

Future efforts in whale shark population genetic analysis could attempt to obtain entire genetic sequences from individuals of all regions to allow for a more thorough analysis. This could be made easier by a concerted effort to understand their reproductive habits, which are still poorly understood (Sequeira et al., 2013). If birthing areas were discovered, genetic samples could be obtained from both mothers and offspring, providing a much larger pool of potential data.


We would like to thank Dr. Michael Hart for his time, effort, resources and enthusiasm in aiding progress on our project.


Vignaud, T. M., J. A. Maynard, R. Leblois, M. G. Meekan, R. Vázquez-Juárez, D. Ramírez-Macías, S. J. Pierce, D. Rowat, M. L. Berumen, C. Beeravolu, S. Baksay, and S. Planes. 2014. Genetic structure of populations of whale sharks among ocean basins and evidence for their historic rise and recent decline. Molecular Ecology 23:2590–2601.

Colman, J. G. 1997. A review of the biology and ecology of the whale shark. Journal of Fish Biology 51:1219–1234.

Compagno, L. J. V. 2001. Sharks of the World: An annotated and illustrated catalogue of shark species known to date. Shark Research Center 2:186–209.

Stewart, B. S., and S. G. Wilson. 2005. Threatened fishes of the world: Rhincodon typus. Environmental Biology of Fishes 74:184–185.

Eckert, S. A., and B. S. Stewart. 2001. Telemetry and satellite tracking of whale sharks, Rhincodon typus, in the Sea of Cortez, Mexico, and the north Pacific Ocean. Environmental Biology of Fishes 60:299–308.

Sequeira, A., C. Mellin, D. Rowat, M. G. Meekan, and C. J. A. Bradshaw. 2012. Ocean-scale prediction of whale shark distribution. Diversity and Distributions 18:504–518.

Castro, A. L. F., B. S. Stewart, S. G. Wilson, R. E. Hueter, M. G. Meekan, P. J. Motta, B. W. Bowen, and S. A. Karl. 2007. Population genetic structure of Earth’s largest fish, the whale shark (Rhincodon typus). Molecular Ecology 16:5183–5192.

Bremer, J. R. A., J. Mejuto, T. W. Greig, and B. Ely. 1996. Global population structure of the swordfish (Xiphias gladius L.) as revealed by analysis of the mitochondrial DNA control region. Journal of Experimental Marine Biology and Ecology 197:295–310.

Fontaine, M. C., S. J. E. Baird, S. Piry, N. Ray, K. A. Tolley, S. Duke, A. A. Birkun, M. Ferreira, T. Jauniaux, Á. Llavona, B. Öztürk, A. A. Öztürk, V. Ridoux, E. Rogan, M. Sequeira, U. Siebert, G. A. Vikingsson, J. M. Bouquegneau, and J. R. Michaux. 2007. Rise of oceanographic barriers in continuous populations of a cetacean: The genetic structure of harbour porpoises in Old World waters. BMC Biology 5:1–16.

Boissin, E., S. R. Thorrold, C. D. Braun, Y. Zhou, E. E. Clua, and S. Planes. 2019. Contrasting global, regional and local patterns of genetic structure in gray reef shark populations from the Indo-Pacific region. Scientific Reports 9:1–9.

Pirog, A., V. Ravigné, M. C. Fontaine, A. Rieux, A. Gilabert, G. Cliff, E. Clua, R. Daly, M. R. Heithaus, J. J. Kiszka, P. Matich, J. E. G. Nevill, A. F. Smoothey, A. J. Temple, P. Berggren, S. Jaquemet, and H. Magalon. 2019. Population structure, connectivity, and demographic history of an apex marine predator, the bull shark Carcharhinus leucas. Ecology and Evolution:1–21.

McKinney, J. A., E. R. Hoffmayer, J. Holmberg, R. T. Graham, W. B. Driggers, R. de la Parra-Venegas, B. E. Galván-Pastoriza, S. Fox, S. J. Pierce, and A. D. M. Dove. 2017. Long-term assessment of whale shark population demography and connectivity using photo-identification in the Western Atlantic Ocean. PLoS ONE 12:1–18.

Sigsgaard, E. E., I. B. Nielsen, S. S. Bach, E. D. Lorenzen, D. P. Robinson, S. W. Knudsen, M. W. Pedersen, M. Al Jaidah, L. Orlando, E. Willerslev, P. R. Møller, and P. F. Thomsen. 2017. Population characteristics of a large whale shark aggregation inferred from seawater environmental DNA. Nature Ecology & Evolution 1:7–11.

Meekan, M., C. M. Austin, M. H. Tan, N. W. V. Wei, A. Miller, S. J. Pierce, D. Rowat, G. Stevens, T. K. Davies, A. Ponzo, and H. M. Gan. 2017. iDNA at sea: Recovery of whale shark (Rhincodon typus) mitochondrial DNA sequences from the whale shark copepod (Pandarus rhincodonicus) confirms global population structure. Frontiers in Marine Science 4:1–8.

Pritchard, J. K., Stephens, M., & Donnelly, P. (2000). Inference of population structure using multilocus genotype data. Genetics, 155(2), 945–959.

Falush, D., Stephens, M., & Pritchard, J. K. (2003). Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics, 164(4), 1567–1587.

Falush, D., Stephens, M., & Pritchard, J. K. (2007). Inference of population structure using multilocus genotype data: Dominant markers and null alleles. Molecular Ecology Notes, 7(4), 574–578.

Hubisz, M. J., Falush, D., Stephens, M., & Pritchard, J. K. (2009). Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources, 9(5), 1322–1332.

Kimura, M. (1968). Evolutionary rate at the molecular level. Nature, 217(5129), 624–626.