Sampling: geological and human error

The continental fossil record of vertebrates is notoriously patchy and there is a risk that studies such as this reflect little more than poor sampling (Smith 2007). It could be argued, for example, that the range chart and the plots of extinction and origination metrics merely document fluctuations in the quality of preservation of the fossils, variations in the environments represented, or variations in the number of localities and specimens recorded for each time division.

Such problems with sampling could reflect human efforts – perhaps geologists and palaeontologists have worked with different degrees of vigour on different rock units, so that their collecting efforts bias the apparent patterns of diversity through time. On the other hand, sampling problems more probably reflect the nature of the rock record. Terrestrial sediments in such red-bed successions reflect sporadic deposition in rivers, lakes and dune fields, and a great deal of the deposited sediment could well have been eroded by subsequent movement of sediment under water or air. So, how can palaeontologists attempt to rule out such sampling problems?

In our paper (Benton et al. 2004), we presented three tests for sampling that sought to determine whether we were looking at a geological or a biological signal.

  1. First, we plotted the numbers of genera and families against numbers of localities and specimens. If sampling intensity drove apparent diversity, then stratigraphic intervals that are well sampled (lots of specimens, lots of localities) might very well show higher diversity than more poorly sampled stratigraphic intervals (few specimens, few localities). Our plot (see Figure right, a and b) shows no correlation: if anything, time bins with large numbers of localities and specimens are associated with low-diversity faunas and vice versa. Further, when the distributions of generic and familial diversity through time are compared with the distributions of numbers of sites and numbers of specimens per time bin (see Figure right, c), there is no apparent tracking. Peaks and troughs in the diversity data do not match peaks and troughs in richness of the fossil record. And, crucially, the time of diversity decline across the PTB corresponds to a rising trend in numbers of sites and specimens.
  2. Secondly, we looked at sample sizes. Five of the 13 stratigraphic units are represented by small samples (fewer than 50 specimens): the Osinovskaya, Belebey, Bolshekinelskaya, Gostevskaya and Bukobay svitas, of which only the Gostevskaya falls near the PTB. Because these samples are much smaller than those of the remaining eight time zones, we re-examined the data with those samples either omitted or combined with neighbouring time bins. Both adjustments increase mean sample size; neither had any effect on the patterns of diversity, extinction or origination.
  3. Thirdly, we applied a statistical technique called rarefaction analysis, which is designed to adjust sample sizes to the lowest common level. The question asked is: what result would we find if we drew a subsample of a particular size from the overall samples? The idea is to pick a subsample size that matches the smallest actual sample, and to use the rarefaction analysis to determine how much of the pattern might be generated by variation in sample sizes through the 13 stratigraphic units. Our rarefaction analysis showed that the better-sampled time units – the Kopanskaya, Kzylsaiskaya and Staritskaya svitas – may overestimate diversity by one, or at most two, families in comparison to the other time bins. Normalizing all time bin sizes to the range of 49–63 specimens cuts diversity of the first three Triassic gorizonts by one or two families and, hence, makes the PTB extinction seem larger (91% instead of 82% extinction rate) and depresses earliest Triassic diversity even more than has been indicated from the raw figures.
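The logic of the third test can be illustrated with a minimal sketch. It uses the standard analytical rarefaction formula (expected number of taxa in a random subsample drawn without replacement), which may or may not match the exact procedure used in the original analysis. The function name `rarefied_richness`, the bin names and all specimen counts below are hypothetical, chosen only to show how a well-sampled bin is scaled down to the size of the smallest sample.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n specimens.

    counts: specimens per taxon (e.g. per family) in one time bin.
    n: subsample size, drawn without replacement from the whole bin.
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must lie between 0 and the bin total")
    denom = comb(total, n)
    # For each taxon, 1 - P(taxon is absent from the subsample).
    # math.comb returns 0 when fewer than n specimens remain, so a taxon
    # that cannot be missed contributes exactly 1.
    return sum(1 - comb(total - c, n) / denom for c in counts)


# Hypothetical specimen counts per family in two time bins (not real data).
bins = {
    "well_sampled_bin": [120, 60, 30, 10, 5, 3, 1, 1],
    "poorly_sampled_bin": [20, 15, 8, 4, 2],
}

# Rarefy every bin down to the size of the smallest sample.
target = min(sum(counts) for counts in bins.values())
rarefied = {name: rarefied_richness(counts, target) for name, counts in bins.items()}

for name, counts in bins.items():
    print(f"{name}: raw families = {len(counts)}, "
          f"expected families at n = {target}: {rarefied[name]:.2f}")
```

In this toy example the smaller bin keeps all of its families, because it is already at the target size, while the larger bin loses some of its rare families. If the difference in diversity between bins persists after rarefaction, it is less likely to be a product of sample size alone.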

Our conclusion is then that the patterns we see in the Russian sections are more biological than geological. Sampling effects are not ruled out completely, of course, but the pattern of data cannot be passed off simply as a geologically driven signal. It seems reasonable for the present to read the patterns as evolutionary and then to compare them with other areas.


  • Benton, M.J., Tverdokhlebov, V.P. and Surkov, M.V. 2004. Ecosystem remodelling among vertebrates at the Permian-Triassic boundary in Russia. Nature, 432, 97-100 (doi:10.1038/nature02950).
  • Smith, A.B. 2007. Marine diversity through the Phanerozoic: problems and prospects. Journal of the Geological Society, 164, 731-745.