Sampled to Death? The Rise and Fall of Probability Sampling in Archaeology
Edward B. Banning
After a heyday in the 1970s and 1980s, probability sampling became much less visible in archaeological literature as it came
under assault from the post-processual critique and the widespread adoption of full-coverage survey. After 1990, published
discussion of probability sampling rarely strayed from sample-size issues in analyses of artifacts along with plant and animal
remains, and most textbooks and archaeological training limited sampling to regional survey and did little to equip new generations of archaeologists with this critical aspect of research design. A review of the last 20 years of archaeological literature indicates a need for deeper and broader archaeological training in sampling; more precise usage of terms such as sample; use of randomization as a control in experimental design; and more attention to cluster sampling, stratified sampling, and nonspatial sampling in both training and research.
Keywords: probability sampling, statistics, survey, history of archaeology, archaeological pedagogy
Recently, I was asked to write a contribution on spatial sampling in archaeology
(Banning 2020), with case studies to
illustrate best practices. To my surprise, I had difficulty finding any examples, let alone best practices, of probability sampling, spatial or otherwise, in archaeological literature of the last 20 years, aside from Orton's (2000) excellent
book. Given that sampling theory is a critical
aspect of research design and control for bias,
this puzzled and concerned me.
There can be legitimate reasons not to employ
probability sampling. What rattled me when I tried to find those case studies is the possibility
that many archaeologists are neglecting probabil-
ity sampling for the wrong reasons.
Here, I explore some possible reasons for this
neglect before offering some suggestions for
restoring formal sampling to a substantive role
in our practice. First, however, let us review
briefly the purpose and nature of probability sampling, and a brief history of its use in archaeology.
Edward B. Banning (ted.banning@utoronto.ca, corresponding author) Department of Anthropology, University of Toronto,
19 Russell St., Toronto, Ontario, M5S 2S2, Canada
American Antiquity 86(1), 2021, pp. 43–60
Copyright © 2020 by the Society for American Archaeology. This is an Open Access article, distributed under the terms of the
Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use,
distribution, and reproduction in any medium, provided the original work is properly cited.
doi:10.1017/aaq.2020.39
What Is Sampling?
Sampling entails two important concepts: (1) the population, or set of phenomena (such as sites, features, spaces, artifacts, bone, and charcoal fragments) whose characteristics are of interest, and (2) the sample, a subset of the population
that we actually examine. Our interest in the sam-
ple is that it might tell us something about the
population. In archaeology, we often have popu-
lations that consist of spaces, such as excavation
squares, because we cannot enumerate popula-
tions of sites, artifacts, or ecofacts that we
have not yet surveyed or excavated. A sampling
frame is a list of the population's elements or
members, or a grid or map for identifying the set
of spatial elements in a spatial population. Sam-
ple size is just the number of elements in the sam-
ple, whereas sampling fraction is the sample size
divided by the number of elements in the whole
population, whether known or not. Even archaeologists who do not formally employ sampling
theory accept that a large sample is a better
basis for inferences than a very small sample.
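In these terms, sample size and sampling fraction amount to simple arithmetic; a minimal Python illustration (the sampling frame and numbers are hypothetical, not from the article):

```python
# A hypothetical sampling frame of 400 catalogued sherds:
population = [f"sherd_{i}" for i in range(400)]
sample_size = 80                        # number of elements actually examined
sampling_fraction = sample_size / len(population)
print(sampling_fraction)                # -> 0.2, i.e., a 20% sample
```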
Archaeologists sample all the time, if only
because cost or other factors make examination
of whole populations impractical or unethical.
We also recognize that taphonomic factors can
distance our sample of a fossil assemblage
still farther from a deposited assemblage that
may be our real population of interest (Holtzman
1979; Meadow 1980). The question is, How confident should we be about inferences based on a small subset of a population?
For some kinds of samples, not very. Convenience or opportunistic samples are just the sites, artifacts, or plant or animal remains that come to hand, often because they are already
sitting in some lab. Potentially better are pur-
posive samples that result from conscious selec-
tion of certain members of the population
because of the perception that they provide
superior information for some purpose, such as
excavation areas selected for their probability
of yielding a long, stratified sequence. Samples such as these are not flawed, for certain purposes
at least, but they entail the risk that they may not
be representative of the population of interest.
In other words, the sample's characteristics
might not be very similar to the characteristics
of the whole population. A nonrandom differ-
ence between the value of some population char-
acteristic (statisticians call this a parameter)
and the value of that characteristic (or statistic)
in a sample is bias.
Probability sampling is a set of methods with
the goal of controlling this risk of bias by ensur-
ing that the sample is representative of the
population so that we can estimate a parameter
on the basis of a statistic. This always involves
some randomness. The classic probability sam-
pling strategies include simple random sampling
with replacement, in which every element of the population (whether an artifact, bone fragment, space, or volume) has an equal probability of selection at each and every draw from the
population. This is like picking numbers from a
hat but then replacing them so that some ele-
ments can be selected more than once. Alterna-
tively, we may remove elements once they are
randomly selected (random sampling without
replacement) so that the probability of selection
changes as sampling progresses and no element
is selected more than once. Another is systematic
sampling, in which we randomly select the first
element and then all the others are strictly deter-
mined by a spacing rule. For example, we
might organize artifacts in rows, randomly select
one of the first four artifacts by rolling a die
(ignoring 5 and 6), and then take every fourth
artifact in sequence to yield a 25% sampling fraction. Stratified sampling involves dividing the
population into subpopulations (strata) that dif-
fer in relevant characteristics before sampling
within them randomly or systematically. System-
atic unaligned sampling is a specifically spatial
design meant to ensure reasonably even coverage
of a site or region without as much rigidity as a
systematic sample (Figure 1). Most probability
sampling designs are variations or combinations
of these basic ones.
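The basic designs described above can be sketched in a few lines of Python (a minimal illustration with a hypothetical sampling frame, not a procedure from the article):

```python
import random

def srs_with_replacement(population, n, rng):
    # Simple random sampling with replacement: equal probability of
    # selection at every draw; an element may be drawn more than once.
    return [rng.choice(population) for _ in range(n)]

def srs_without_replacement(population, n, rng):
    # Selection probabilities change as elements are removed;
    # no element is selected more than once.
    return rng.sample(population, n)

def systematic_sample(population, interval, rng):
    # Randomly choose a start among the first `interval` elements,
    # then take every `interval`-th element thereafter.
    start = rng.randrange(interval)
    return population[start::interval]

def stratified_sample(strata, n_per_stratum, rng):
    # Sample randomly within each predefined subpopulation (stratum).
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(42)
artifacts = list(range(100))                       # hypothetical sampling frame
print(len(systematic_sample(artifacts, 4, rng)))   # -> 25, a 25% sampling fraction
```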
Sample elements need not be spatial
(Figure 2), but the fact that archaeologists can
rarely specify populations of artifacts or eco-
facts in advance often forces them to employ
cluster sampling. Cluster samples occur when-
ever the population of interest consists of items
such as artifacts, charcoal, or bone fragments,
but the population actually sampled is a spatial
one, such as a population of 2 × 2 m squares
44 [Vol. 86, No. 1, 2021AMERICAN ANTIQUITY
https://doi.org/10.1017/aaq.2020.39 Published online by Cambridge University Press
(Mueller 1975a). Cluster samples require statis-
tical treatment that differs from that for simple
random samples (Drennan 2010:244–246; Orton 2000:212–213) because of the phenomenon called autocorrelation, which is that
observations that are close together are likely to
be more similar to one another than ones that
are far apart. In the case of lithics, it is likely that multiple flakes found near each other came from the same core, for example.
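A minimal Python sketch of a cluster-sample estimate (the counts, number of sampled squares, and population size are hypothetical; this is the standard between-cluster estimator, not a procedure from the article). The key point is that the standard error is computed between squares, not between individual items:

```python
import math

def cluster_estimate(cluster_counts, n_clusters_in_population):
    # cluster_counts: artifact counts from a simple random sample
    # (without replacement) of spatial units, e.g., 2 x 2 m squares.
    n = len(cluster_counts)
    mean = sum(cluster_counts) / n
    # Variance is computed between clusters, not between individual
    # items, because items within a square are autocorrelated.
    var = sum((c - mean) ** 2 for c in cluster_counts) / (n - 1)
    fpc = 1 - n / n_clusters_in_population   # finite population correction
    se_mean = math.sqrt(var / n) * math.sqrt(fpc)
    total = mean * n_clusters_in_population  # estimated population total
    return total, se_mean * n_clusters_in_population

counts = [3, 0, 7, 2, 5, 1, 4, 2]            # flakes per sampled square
total, se_total = cluster_estimate(counts, 200)
print(round(total), round(se_total, 1))      # -> 600 157.1
```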
Multistage sampling is a variety of cluster
sampling in which there is a hierarchy of clusters.
For example, we might first make a stratified random selection of sites that have been excavated,
then randomly select contexts or features from
the selected excavations (themselves generally
already samples of some kind), then analyze
the entire contents of the sampled contexts.
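A minimal Python sketch of such a two-stage design (the region, site, and context names are hypothetical):

```python
import random

def multistage_sample(sites_by_region, n_sites, n_contexts, rng):
    # Stage 1: stratified random selection of excavated sites, with
    # regions as strata.  Stage 2: random selection of contexts within
    # each selected site.  (Analyzing the full contents of each
    # selected context would be the final stage.)
    chosen = {}
    for region, sites in sites_by_region.items():
        for site in rng.sample(sorted(sites), n_sites):
            contexts = sites[site]
            chosen[site] = rng.sample(contexts, min(n_contexts, len(contexts)))
    return chosen

rng = random.Random(7)
sites = {
    "north": {"Site A": ["locus 1", "locus 2", "locus 3"],
              "Site B": ["locus 1", "locus 2"]},
    "south": {"Site C": ["locus 1", "locus 2", "locus 3", "locus 4"]},
}
print(multistage_sample(sites, n_sites=1, n_contexts=2, rng=rng))
```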
Another important type is Probability Propor-
tional to Size, or PPS sampling (Orton 2000:34).
This involves randomly or systematically placed
dots or lines over a region, site, thin section,
pollen slide, or other area. Only sites, artifacts,
or mineral or pollen grains that the dots or lines
intersect are included in the sample. Because
the dots or lines are more likely to intersect
large items than small ones, it is necessary to cor-
rect for this effect to avoid bias.
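Under the simple assumption that an item's chance of being intersected is proportional to its size, the correction weights each detected item by the inverse of its inclusion probability (a Horvitz-Thompson-style estimate; the diameters and transect spacing below are hypothetical, not from the article):

```python
def inclusion_probability(diameter_m, transect_spacing_m):
    # A site whose diameter equals or exceeds the transect spacing is
    # always crossed by some transect; smaller sites are intersected
    # in proportion to their size.
    return min(1.0, diameter_m / transect_spacing_m)

def estimated_site_count(detected_diameters_m, transect_spacing_m):
    # Weight each detected site by 1/p to correct the size bias
    # against small sites.
    return sum(1.0 / inclusion_probability(d, transect_spacing_m)
               for d in detected_diameters_m)

# Three sites detected with 100 m transect spacing:
print(estimated_site_count([50, 100, 25], 100))  # -> 7.0 sites estimated
```

Each 25 m site detected stands in for four such sites (inclusion probability 0.25), which is how the estimate corrects for the sample's bias toward large items.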
In general, probability sampling is preferable
to convenience or purposive sampling whenever
we should be concerned whether or not the sample is representative of a population and, consequently, suitable for making valid inferences about it. Convenience sampling is acceptable for some clearly defined purposes when
probability sampling is impossible or impracti-
cal, and purposive sampling can be preferable
when we have very specific hypotheses whose efficient evaluation requires targeted, rather
than randomized, observations.
Figure 1. Hypothetical examples of some spatial sampling designs (after Haggett 1965:Figure 7.4) that were repeated in
dozens of later archaeological publications: (a) simple random, (b) stratified random, (c) systematic, and (d) systematic
unaligned.
The Rise of Archaeological Probability
Sampling
What was it that once made sampling theory
appeal to archaeologists? Its perception as "scientific," no doubt, was a contributing factor.
As the previous section suggests, a better incen-
tive was that, by controlling sources of bias,
it permits valid conclusions about populations
when observing entire populations is impossible,
undesirable, wasteful, or unethical. Sampling
allows us to evaluate the strength of claims
about populations with less worry that results
are due to chance or, worse, our own preconceptions (Drennan 2010:80–82; Orton 2000:69).
Some of the earliest attention to sampling in
archaeology concerned sample size. Phillips
and colleagues (1951) made frequent reference
to samples of sites and pottery, and especially
the adequacy of sample sizes for seriation. In
one instance, they even drew a random sample
of sherds (1951:77). They did not, however,
employ formal methods to decide what consti-
tuted an adequate sample size, and sampling
did not attract much explicit attention from
archaeologists until the 1960s (Rootenberg
1964; Vescelius 1960).
Binford (1964) was particularly influential in archaeologists' adoption of probability sampling, presenting it as a key element of
research design. He summarized the main sam-
pling strategies reviewed in the last section and
identified different kinds of populations and the role of depositional history in their definition.
He also recognized that confounding factors, such as vegetation, construction, land use, and accessibility, could complicate sampling designs and inferences from spatial samples.

Figure 2. Some hypothetical examples of nonspatial samples, with selected elements in gray: (a) simple random sample of pottery sherds (the twelfth sherd selected twice), (b) 25% systematic sample of projectile points arranged in arbitrary order, and (c) stratified random sample of sediment volumes for flotation.
He also inadvertently fostered some miscon-
ceptions. Despite advocating nuanced decisions
on sample size earlier in the article, Binford dismissed attention to sample size as "quite complicated," and continued with, "for purposes of argument, . . . we will assume that a 20% areal coverage within each sampling stratum has been judged sufficient" (1964:434). Although this
was just a simplifying example, later archaeolo-
gists often took 20% as a recommended sampling
fraction. Similarly, some archaeologists seem to
have taken his mention of soil type as grounds for stratification as received wisdom, and they
used soil maps to stratify spatial samples whether
or not this made sense. Despite his assertion that probability sampling "should occur on all levels of data collection" (1964:440), both this article
and much of the literature it inspired strongly
privilege sampling in regional surveys, with less
attention to sampling sites, assemblages, or arti-
facts (but see Orton 2000).
Soon, sampling appeared in texts used to edu-
cate the next generation of archaeologists (Ragir
1967; Watson et al. 1971). Most focused on the
basic spatial sampling designs. Generally lacking
was discussion of when probability sampling
was appropriate and how to define populations or plan effective stratified or cluster samples.
An outpouring of literature on sampling
in regional survey (e.g., Cowgill 1970, 1975;
Judge et al. 1975; Lovis 1976; Williams et al.
1973), surface collection (Redman and Watson
1970), excavation (Hill 1970), zooarchaeology
(Ambrose 1967), and artifact analysis (Cowgill
1964) also appeared. There were more general
reviews (Mueller 1975b; O'Brien and Lewarch
1979; Redman 1974) and desktop simulations
of sampling designs (Mueller 1974; Plog
1976). Sampling soon saw application outside
North America (e.g., Cherry et al. 1978; MacDonald et al. 1979; Redman and Watson 1970),
and the number of articles in American Antiquity
that discussed or used probability sampling grew
rapidly until 1980 (Figure 3).
Then, the literature shifted to more focused
topics, such as determining sample sizes
(Dunnell 1984; Leonard 1987; McManamon
1981; Nance 1981), ensuring that absences of
certain classes are not due only to sampling
error (Nance 1981), or sampling shell middens
(Campbell 1981).
During the 1970s, much research still used
purposive sampling or ignored this sampling
wave. Even authors who did not embrace sam-
pling, however, tended to be somewhat apolo-
getic, offering caveats about their samples' usefulness or describing their research as "preliminary."
The Fall of Probability Sampling in
Archaeology
Articles published in American Antiquity since
1960 and Journal of Field Archaeology, two
long-lived journals that regularly publish articles
on archaeological methods (Figure 3 and Supple-
mental Text 1), show that substantive discussion
or mentions of sampling in the statistical sense
peaked about 1980 in the former and 1990 in
the latter, then declined, albeit with some recov-
ery in the late 1990s and again in the last few
years, never returning to pre-1990 levels. What
these graphs do not reveal is that articles prior
to 1985 tend to be about sampling, while most
after 1990 just mention having used some kind
of random or systematic sample, usually without
presenting any details. Those few about sam-
pling in the later period usually pertain to
sample-size issues in zooarchaeology and paleo-
ethnobotany rather than to research design more
generally. It is hard to imagine that there was
nothing further to say about probability sampling
or that it had become too routine to warrant com-
ment, especially as research in statistics devel-
oped considerably after 1970 (e.g., Orton
2000:11; Thompson and Seber 1996).
Remarkably, a common claim of the 1990s
is that some pattern is highly unlikely to result
from sampling error or random chance (e.g.,
Falconer 1995:405), despite relying on small
or non-probabilistic samples. Other authors
acknowledge bias in their data but go on to
analyze them as though they are unbiased, or
describe sampling designs expected to pro-
vide representative samples by standardizing
sampling elements without reference to probabil-
ity sampling (e.g., Bayman 1996:407; Walsh
1998:582).
A blistering critique (Hole 1980) that exposed flaws in then-recent archaeological sampling (including arbitrary sampling fractions, the suppression of standard error through impractically small sample elements, and the ignoring of prior information) foreshadowed this decline.
Hole (1980:232), however, was not criticizing
sampling per se, just its misapplications. Even
though some authors judged her critique as extreme (Nance and Ball 1986; Scheps 1982), the fact that none of the 17 articles that cited it from 1981 to 1999, according to Google Scholar, advocated abandoning probability sampling suggests that it had little or no role in decreasing interest or expertise in sampling. The following sections consider more likely candidates.

Figure 3. The frequency of articles with substantive discussion of sampling or based at least partly on explicit probability samples in American Antiquity (1960–2019) and Journal of Field Archaeology (1974–2019). Note that there was an interruption in Journal of Field Archaeology from 2002 until early 2004 and that 2018–2019 have five articles (10 per four years).
The Post-Processual Critique
During the 1980s, attacks on sometimes pseudoscientific or dehumanizing examples of New Archaeology engendered an anti-science rhetoric
that may have made probability sampling a vic-
tim. Shanks and Tilley led this attack by arguing
that mathematical approaches entail assumptions
that theory is value free, and that categories of
analysis are necessarily designed to enable
certain calculations to be made (1987:57).
McAnany and Rowe (2015) explicitly connect
rejection of probability sampling with the post-
processual paradigm. More recently, Sørensen
(2017) argues against a new scientific turn that devalues the humanities and fetishizes scientific facts. What he criticizes explicitly, however, is not really the use of samples but the use
of inadequate ones (Sørensen 2017:106). This is
not a problem with science; it just underscores
the need for better sampling.
Furthermore, adherents of the interpretive
paradigm still base inferences on samples and
analyze data as though they represent something
more than the sample itself. Even Shanks and
Tilley (1987:173–174) explicitly used stratified sampling in their analysis of beer cans and
based bar graphs and principal components anal-
ysis on this sample (Shanks and Tilley
1987:173–189; see also Cowgill 1993; VanPool
and VanPool 1999; Watson 1990).
Similarly, Shanks (1999) relies on the quanti-
tative distribution of motifs in a sample of 2,000
Korinthian pots (1999:40). This is an opportun-
istic sample of all complete pots known to
Shanks, but he expects them to represent popula-
tions of artifacts and the people who made them,
such as the pottery of archaic Korinth (Shanks
1999:2, 9, 10, 151). He further generalizes about "the early city state," an even more abstract population (Shanks 1999:210–213), and wonders if his sample is somewhat biased for some purposes (1999:41).
Apparently, formal sampling and generaliza-
tion from sample to population are not
incompatible with interpretive archaeology. Just
as atheoretical archaeologists inescapably use
theory (Johnson 1999:xiv, 68), avowedly anti-
science archaeologists still use statistical reason-
ing and sampling. Their anti-science rhetoric,
however, may still have had a chilling influence on explicit archaeological sampling.
The Full-Coverage Program
When Fish and Kowalewski (1990) published
The Archaeology of Regions, full-coverage
survey had already begun to trend. Its premise
that small samples are an inadequate basis for
some kinds of research is undeniable (Banning
2002:155): a small sample can never capture all
nuances of a settlement system and suffers from the "Teotihuacan effect," the risk of omitting key sites in a settlement system (Flannery
1976:159). The solution, according to most
authors in this volume, is to survey an entire land-
scape at somewhat consistent intensity.
Its classic example is the Valley of Mexico
survey. Rather than only examining a subset, sur-
veyors examined every accessible space within
a universe of spatial units with pedestrian tran-
sect intervals ranging from 15 to more than 75 m,
but typically 40–50 m (Parsons 1990:11; Sanders et al. 1979:24).
Kowalewski (1990) identifies the main advantages of full coverage, claiming that it captures greater variability and larger datasets, facilitates analysis of spatial structure, and is better at representing rare observations. He also highlights its flexibility of scale in that it does not force researchers to "lock in" to an analytical unit size (cf. Ebert 1992), and he correctly
notes that much archaeological research is not
about parameter estimation.
The discussants who close out The Archae-
ology of Regions, however, were not as con-
vinced that full coverage was better than
sampling. With surveyor intervals as large as
100 m, most or all full-coverage surveys were
actually still sampling, potentially having missed
even easily detectable sites with horizontal
dimensions less than the transect interval.
These are really systematic transect samples,
and PPS samples of sites, whose main virtue is
even and somewhat consistent coverage (Cowgill 1990:254; Kintigh 1990:238). "Full coverage" does not mean anything close to 100% coverage
unless transect intervals are extremely small and
visibility and preservation are excellent (Given et al. 1999:22; Sundstrom 1993).
Fred Plog (1990) specifically rebuts many of Kowalewski's claims, noting that well-designed stratified samples capture variability well,
whereas volume of data is correlated with survey
effort, irrespective of method. The claim that
full-coverage surveys perform better at capturing
rare sites is true only for large, obtrusive ones,
not ones that are small or unobtrusive. As most
full-coverage surveys really use transects as spatial units, they are also "locked in" to their transect spacings.
Most full-coverage surveys are systematic PPS samples, so they yield biased estimates of some parameters (such as mean site size, the proportion of small sites, and the rank-size statistic) because they underrepresent small, unobtrusive sites and artifact densities unless their practitioners correct for this (Cowgill 1990:252–258). None of the surveys in The Archaeology of Regions did so, however.
In the aftermath of this book and a session decrying sampling at the 1993 Theoretical Archaeology Group conference, "almost everybody was against sampling" (Kamermans 1995:123). The brief flirtation with survey sampling led to "consensus . . . that best practice involves so-called full-coverage survey" (Opitz et al. 2015:524). Despite its focus on spatial sampling, this probably influenced attitudes to sampling more generally.
Misunderstanding Sampling
Certain misconceptions also discouraged interest
in sampling. Binford (1964) had proclaimed that
sampling requires populations of equal-sized
spatial units. Many archaeologists found that
arbitrary grids, especially of squares, were rarely
practical or useful because their boundaries did
not correspond with meaningful variation on
the ground. Sampling universes, however, do not have to consist of any particular kind of spatial unit (Banning 2002:86–88; Wobst 1983).
Even as probability sampling was in its early
decline, some projects successfully used sample
elements that conformed to geomorphological
landforms, field boundaries, or urban architec-
tural blocks (e.g., Banning 1996; Kuna 1998;
Wallace-Hadrill 1990).
Some archaeologists worried that fixed samples fail to include rare items or represent
diversity accurately. Others, however, found
solutions. One is to supplement a probability
sample with a purposive one (Leonard 1987;
Peacock 1978); it is not appropriate to combine
the two kinds of samples to calculate statistics,
but researchers can use the probabilistic data to
make parameter estimates for common things
and the purposive sample to characterize rare
phenomena or establish detection limits on
their abundance. Another is to use sequential
sampling instead of a fixed sample size (Dunnell
1984; Leonard 1987; Nance 1981; Ullah et al.
2015). This involves increasing sample size
until some criterion is met, such as a leveling
off in diversity or relative error.
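One way to sketch sequential sampling in Python (the stopping rule here, a leveling-off in richness, and the class labels are illustrative assumptions, not a procedure from the cited sources):

```python
import random

def sequential_sample(population, batch_size, rng):
    # Enlarge the sample batch by batch until richness (the number of
    # distinct classes observed) stops growing between batches.
    pool = list(population)
    rng.shuffle(pool)
    sample = []
    prev_richness = -1
    while pool and len(set(sample)) > prev_richness:
        prev_richness = len(set(sample))
        sample.extend(pool[:batch_size])
        del pool[:batch_size]
    return sample

rng = random.Random(3)
assemblage = ["bowl"] * 40 + ["jar"] * 40 + ["lamp"] * 5
sampled = sequential_sample(assemblage, 10, rng)
print(len(sampled), sorted(set(sampled)))
```

Other stopping criteria, such as a target relative error for an estimate, drop into the same loop in place of the richness check.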
Finally, some archaeologists have the mis-
taken idea that sampling is a way to find sites.
Spatial sampling is actually rather poor at site
discovery, but this does not discount its suitability for making inferences about populations
(Shott 1985; Welch 2013). That sampling does
not ensure site discovery is not a good reason
to abandon it.
Opportunity and Exchangeability
The ubiquity of opportunistic populations and
samples in archaeology may also discourage
interest in formal sampling. In heritage manage-
ment, for example, the population often corre-
sponds to a project area that depends on
development plans rather than archaeological
criteria. In a corridor survey for a pipeline, a pro-
ject area could intersect a large site, yielding a
sample of cultural remains that may or may not
be representative, with little opportunity even to
determine the site's size or boundaries, except in jurisdictions that offer some flexibility (e.g.,
Florida Division of Historical Resources
2016:14; and see below).
But this is not unique to cultural resource
management (CRM). Archaeologists frequently
treat an existing collection as a population, and
they either sample it or study it in its entirety.
Biases could result from the nature of these
opportunistic samples, but that does not mean
we cannot evaluate their effects (Drennan
2010:92). Accompanying documentation might even facilitate an effective stratified sample.
Bayesian theory potentially offers some respite through its concept of exchangeability (Buck et al. 1996:72–78). An opportunistic sam-
ple may be adequate for certain purposes, as long
as relevant observations on the sample are not
biased with respect to those purposes, even if
we can expect bias with regard to other kinds
of observations. A collection of Puebloan pottery
formed in the 1930s, or acquired by collectors,
might include more large, decorated sherds or
higher decorative diversity than would the popu-
lation of sherds or pots in the site of origin
because of the collectors' predispositions. Such a sample would provide biased estimates of the
proportion of decorated pottery, but it might be
acceptable for estimating the proportions of tem-
per recipes in pottery fabrics, for example.
However, there is no reason to think that
Bayesian exchangeability has any role in archaeologists' attitudes to probability sampling. Few
texts on archaeological analysis even mention
exchangeability (Banning 2000:88; Buck et al. 1996:72–74; Orton 2000:21). One other does, but without naming it (Drennan 2010:88–92).
In archaeological research literature, I was only able to find a single example (Collins-Elliott 2017). Clearly, those who have eschewed probability sampling have not been aware of this concept.
Undergraduate Statistical Training
Another candidate cause for the decline is ar-
chaeological training (cf. Cowgill 2015; Thomas
1978:235, 242). A publication on teaching
archaeology in the twenty-first century (Bender
and Smith 2000) ignores sampling design, or
even statistics more generally, except for a single
mention of sampling as a useful skill (Schulden-
rein and Altschul 2000:63). A proposed area for reform of archaeological curriculum, "Fundamental Archaeological Skills," is silent on
research design, sampling, and statistics except
to list statistics as a basic skill in graduate pro-
grams (Bender 2000:33, 42). At least one article
on archaeological pedagogy mentions, but does not develop, the role of sampling in field training
(Berggren and Hodder 2003).
A review of undergraduate textbooks leads
to some interesting observations. The selection
(Supplemental Text 2) includes all of the
English-language textbooks I could find that
cover archaeological methods, but it limits
multiple-edition books to the earliest and latest
editions I could access.
Renfrew and Bahn's (2008:80–81) explication of major spatial sampling strategies is typical. After briefly mentioning non-probabilistic sampling, they describe random, stratified random, systematic, and systematic unaligned sampling designs, all in spatial application.
They illustrate these with the same maps (from
Haggett 1965:Figure 7.4) as has virtually every
archaeology text that describes sampling since
Stephen Plog (1976:137) used them in The Early Mesoamerican Village (Figure 1). For
stratified sampling, they do not mention the rationale for strata or the need to verify that strata are statistically different. As usual, sampling's justification is that archaeologists "cannot usually afford the time and money necessary to investigate fully the whole of a large site or all sites in a given region" (Renfrew and Bahn 2008:80), while they say that probability sampling allows generalizations about a site or region.
Most texts that do not specialize in quantita-
tive methods give, at best, perfunctory attention
to sampling. Not one of 54 introductory texts
in my list mentions sequential sampling (Dunnell
1984) or the newer development, adaptive sam-
pling (Thompson and Seber 1996), and only five make any mention of sample size, sampling
error, or nonspatial sampling. All but two only
present sampling in regional survey, and only
four mention alternatives to geometrical sample
elements. A few misrepresent sampling as a means to find things, especially sites, rather
than to estimate parameters or test hypotheses
(e.g., Muckle 2014:99; Thomas 1999:127).
Seventeen more specialized texts provide a fuller
discussion of sampling, but they probably reach
smaller, more advanced audiences.
Turning to curriculum, explicit statistical
requirements are far from universal. Although
variations in how to describe undergraduate programs make comparison difficult, I was able to find information online for 24 of the 25
most highly ranked undergraduate programs
internationally (Supplemental Text 3; Quacquarelli Symonds 2019). Of these, at least five (21%) require study in statistics, and eight (33%) have
in-house statistical or data-analysis courses.
Some may include statistics in other courses,
such as science or laboratory courses. At least
ve have courses on research design (four have
courses that may include some research design),
but it is unclear whether these cover sampling.
Many programs have an honors thesis or
capstone course that could include sampling.
Although some archaeology programs emphasize quantitative methods, one of which even says that "quantitative skills and computing ability are indispensable" (Stanford University 2019), the overall impression is that knowledge of statistics or sampling is optional. The only article I found that explicitly addresses sampling education in archaeology (Richardson and Gajewski 2002) is not by archaeologists but by statisticians in a journal on statistics pedagogy.
Publication and Peer Review
One might also ask why the peer-review process
does not lead to better explication of sampling
designs. As one anonymous reviewer of this arti-
cle pointed out, this is likely due to a lack of sam-
pling and statistical expertise among a significant
proportion of journal editors and manuscript
reviewers who, after all, probably received train-
ing in programs much like the ones reviewed in
the last section.
Archaeological Sampling in the Twenty-First
Century
As the histograms in Figure 3 indicate, some
twenty-first-century articles in American
Antiquity and Journal of Field Archaeology do
mention sampling or samples, and there has
even been an encouraging uptick in the last
few years, but rarely do they describe these expli-
citly as probability samples. In a disproportionate
stratified random cluster sample of all research
articles in American Antiquity, Journal of
Archaeological Science (JAS), Journal of Field
Archaeology (JFA), and Journal of Archaeo-
logical Method and Theory (JAMT) from 2000
to 2019 (Supplemental Text 4), 24 ± 1.1% of arti-
cles mention some kind of sample without speci-
fying what kind of sample it is. Furthermore,
some that explicitly use probability samples
after 2000 involve samples collected in the
1980s (e.g., Varien et al. 2007) rather than pre-
senting any new sample. Few acknowledge use
of convenience samples (0.9 ± 0.2%), but it
seems likely that most of the unspecified samples
were also of this type.
A few sampling-related articles in these
and other journals show originality or new
approaches (e.g., Arakawa et al. 2013; Burger
et al. 2004; Perreault 2011; Prasciunas 2011).
We also find random sampling in simulations
(e.g., Deller et al. 2009). Despite a few bright
spots, however, most of the articles in this period make no use of sampling theory, do not explicitly identify the population they sampled, and do not account for cluster sampling in their statistics (if they provide sampling errors at all). Some in my sample use "sampling" as a synonym for "collecting" (1.3 ± 0.25% overall but almost 3% in American Antiquity and JFA) and "systematic sampling" in a nonstatistical sense (0.4 ± 0.1% but almost 2% in JAMT), or they use "sample" as a synonym for "specimen" (e.g., individual lithics 3.0 ± 0.5%, bones or bone fragments 10.7 ± 0.8%, most of the latter in JAS).
Many authors who mention samples actu-
ally base analyses on all available evidence,
such as all the pottery excavated at a site,
or all known Clovis points from a region
(8.6 ± 0.7%). These are only samples in the
sense of convenience samples, and they are argu-
ably populations in the present.
One of the most common practices is to use "sample" only in the sense of a small amount of material, such as some carbon for dat-
ing or a few milligrams of an artifact removed
for archaeometric analysis (carbon samples
15.3 ± 0.7%, pottery 5.3 ± 0.74%, other 20.5 ±
1.1%), selected without sampling theory. Ameri-
can Antiquity was most likely to refer to carbon
specimens as samples. Many studies use "flotation sample" to refer to individual flotation volumes (4.5 ± 0.4%, mainly in American
52 [Vol. 86, No. 1, 2021AMERICAN ANTIQUITY
Antiquity and JFA) rather than the entire sample
from a site or context, and we see similar usage of "soil sample" even more often (16 ± 0.8%, most often in JFA).
Articles on regional survey after 2000
often mention sampling with no reference to
sampling theory (14.5 ± 8.6%). Some claim
full coverage but employ systematic transect
samples (0.3 ± 0.16%). Some articles not in this
sample claim to use stratified sampling but actually selected tracts within strata purposively or
by convenience (e.g., Tankosić and Chidiroglou
2010:13; Tartaron 2003:30). Many of these may
have been effective in achieving their goals, but it is unclear whether stratification was effective or if they controlled biases in estimates. Among the encouraging exceptions, Parcero-Oubiña and colleagues (2017) use a stratified sample of agricultural plots in Chile, having estimated the sample size they would need in each stratum to achieve desired confidence intervals, and PPS random point sampling in each stratum to select plots.
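The stratum-by-stratum workflow just described can be sketched in a few lines. In this illustration the stratum names, sizes, and guessed proportions are invented, not taken from Parcero-Oubiña and colleagues' study; it simply shows how a target confidence-interval half-width, with a finite population correction, determines each stratum's sample size:

```python
import math

# Hypothetical sketch: per-stratum sample size so that a 95% confidence
# interval on a proportion has the desired half-width. All numbers invented.

Z95 = 1.96  # standard normal quantile for 95% confidence

def stratum_sample_size(N, p_guess, half_width):
    """Sample size to estimate a proportion within +/- half_width at 95%."""
    n0 = (Z95 ** 2) * p_guess * (1 - p_guess) / half_width ** 2
    # finite population correction for a stratum of N sampling units
    n = n0 / (1 + (n0 - 1) / N)
    return math.ceil(n)

strata = {                 # N plots, guessed proportion with the attribute
    "valley floor": (1200, 0.30),
    "terraces":     (800,  0.50),
    "slopes":       (400,  0.10),
}

for name, (N, p) in strata.items():
    n = stratum_sample_size(N, p, half_width=0.10)
    print(f"{name}: sample {n} of {N} plots")
```

Within each stratum, the n plots would then be chosen by a randomizing device (such as the PPS random points mentioned above), not by convenience.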
In site excavation, purposive sampling is typ-
ical, while selection of excavated contexts for
detailed analysis often occurs with little or no
explanation (e.g., Douglass et al. 2008). Excava-
tion directors understandably use expertise and
experience or, at times, deposit models (some-
times based on purposive auger samples) to
decide which parts of sites might best provide
evidence relevant to their research questions
(Carey et al. 2019). However, at least one study
outside my sample used spatial sampling to esti-
mate the number of features (Welch 2013).
Justifiably, purposive sampling dominates
best practice in radiocarbon dating (Calabrisotto
et al. 2017), whereas sampling for micromor-
phology, pollen, and plant macroremains tends
to be systematic within vertical columns, and
selection of column locations is purposive, if
described at all (e.g., Pop et al. 2015). Alterna-
tively, sampling protocols for plant remains
may involve a type of cluster sample with a sin-
gle, standardized sediment volume (e.g., 10 L)
from every stratigraphic context or feature in an
excavation (e.g., Gremillion et al. 2008). At
least one case involves flotation of all contexts
in their entirety (Mrozowski et al. 2008), not a
sample at all. One article explicitly calculates
sample sizes needed at desired levels of error
and confidence for comparing Korean assem-
blages of plant remains (Lee 2012), while His-
cock (2001) explicitly addresses sample size in
artifact analyses.
In heritage management, our evidence comes
more from regulatory frameworks and guidelines
(Supplemental Text 5) than from the articles
reviewed for Supplemental Texts 1 and
4. Although much of this work looks like sam-
pling, the main purpose of regional CRM inven-
tory surveys in most cases is to detect and
document archaeological resources, not just sam-
ple them (e.g., MTCS 2011:74). Many North
American guidelines specify systematic survey
by pedestrian transects or shovel tests, but their
purpose is primarily site discovery, not parameter
estimation (see Shott 1985, 1987, 1989), and
selection of in-site areas for excavation tends to
be purposive (Neumann and Sanford 2010:174).
Some jurisdictions do offer flexibility, however.
The Bureau of Land Management (BLM) does
not discuss how to design class II surveys because
"the methods and theories of sampling are continually being refined" (BLM 2004:21B4). Meanwhile, Wisconsin's standards explicitly address
spatial probability sampling in research design
(Kolb and Stevenson 1997:34). A memorandum
of agreement among stakeholders in the Permian
Basin has oil and gas developers pay into a pool
that funds archaeological research and manage-
ment in this part of New Mexico without tying
it to project areas (Larralde et al. 2016; Schlanger
et al. 2013). This approach allows more flexible
research designs (Shott 1992), including, where
warranted, probability sampling.
It may not seem obvious that sampling is rele-
vant to experimental archaeology but, for almost
a century, probability has had a role in ensuring
that confounding factors, such as differential
skill among flintknappers or variations in bone
geometry, do not compromise the results of
experiments (Fisher 1935). In a way, experimen-
ters sample from all possible combinations of
treatments. Yet, sampling theory has had little
impact on experimental archaeology. In introduc-
ing a volume on this topic, Outram (2008) makes no mention of statistical sampling or randomiza-
tion, nor do any of the articles in that volume.
Some articles in Ferguson (2010) and Khreisheh
and colleagues (2013) discuss experimental con-
trols or confounding factors, but none highlights
randomization, arguably the most important
protocol. Only Harry (2010:35) employs ran-
domization but draws no attention to its import-
ance. Of the articles in Supplemental Text 4,
8.6 ± 0.7% described experiments that made no
use of randomization. Most of these were in
JAS. Some encouraging exceptions highlight
validity, confounding variables, and use of
randomness (Lin et al. 2018 and articles cited
there). Excluding randomness from experimental
protocols risks confusing variables of interest
with such variables as experimenter fatigue or
the order in which a flintknapper selects cores
(cf. Daniels 1978).
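As a sketch of what randomization contributes to such experiments, the following assigns cores to knappers at random and randomizes the run order, so that fatigue or raw-material quality cannot line up systematically with any one treatment. The labels and factors here are invented for illustration:

```python
import random

# Minimal sketch of randomization as an experimental control.
# All names and factors are hypothetical.

random.seed(42)  # record the seed so the design is reproducible

cores = [f"core_{i:02d}" for i in range(12)]
knappers = ["A", "B", "C"]

# Random assignment of cores to knappers (treatments), four cores each
shuffled = random.sample(cores, k=len(cores))
assignment = {k: shuffled[i::len(knappers)] for i, k in enumerate(knappers)}

# Randomized run order, so early vs. late trials (fatigue) do not
# coincide with a single knapper or raw-material batch
run_order = random.sample(cores, k=len(cores))

for k, assigned in assignment.items():
    print(k, sorted(assigned))
print("run order:", run_order)
```

Recording the seed keeps the design auditable: another analyst can reproduce exactly which core went to which treatment.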
More generally, claims for random samples
often have no supporting description (e.g.,
Benedict 2009:157). It is difficult to assess whether these were true probability samples or just haphazard ("grab-bag") convenience samples; in Supplemental Text 1, I give them the benefit of the doubt. As noted, 24% of articles
in Supplemental Text 4 use samples without stat-
ing their sampling methods (31 ± 5% of articles
in American Antiquity). Some researchers mix a
random sample with a judgment sample without
providing data that would allow us to disentangle
them (e.g., Vaughn and Neff 2000).
Sometimes we find such puzzling claims as "although they were collected from a single unit . . . , these bones are a fairly representative sample of the faunal assemblage" (Flad 2005:239) or "although -inch screens can cause significant biases . . . exceptional preservation . . . along with the dearth of very small fish . . . suggest that our samples are relatively representative" (Rick et al. 2001:599). One article admits that three houses constitute a small sample but claims it is "a reasonable representative sample" of some unidentified population
(Hunter et al. 2014:716). Baseless assertions
that samples are representative occur in 2.8 ±
0.4% of articles in Supplemental Text 4, but
they are especially prevalent in JFA (6.8 ±
2.3%). Other authors assume that a large sample
size is enough to make their samples representa-
tive. It is possible that some of these projects did
employ probability sampling, but if so, they did
not describe it (e.g., Spencer et al. 2008).
At least one study, apparently based on a convenience sample, claims that relatively consistent artifact densities across a site indicate that patterns identified "did not result from sampling bias" (Loendorf et al. 2013:272). Another suggests that a pattern it identifies "does not stem from vagaries in sampling" (Yasur-Landau et al. 2015:612) without describing any sampling design that would have assured this. Yet another asserts that five . . . fragments "selected nonrandomly" and another five . . . "indiscriminately" . . . "comprise a random sample" (Schneider 2015:519).
Other authors proudly ignore statisticians' warning that "errors introduced by using simple-random-sampling formulas for . . . cluster samples can be extremely serious" (Blalock 1979:571). Campbell acknowledges use of a cluster sample but implies that the statistically informed are being pedantic when he claims that "conventional statistics . . . have been shown to work well" (Campbell 2017:15). Poteate and Fitzpatrick (2013) similarly use the statistics for simple random element sampling on simulated cluster samples, yielding incorrect confidence intervals on such measures as NISP. They also ignore that there is no reason to expect a small sample to yield the same MNI or taxonomic richness as a whole population, since these are very different levels of aggregation, and call to mind Hole's (1980:226) disparagement of many such simulations.
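Blalock's warning can be illustrated with a small simulation (all numbers invented): when items within excavation units resemble one another, pooling the items as if they were a simple random sample badly understates the standard error of the mean. The sampling unit is the cluster, so the variance must come from cluster-level statistics:

```python
import random
import statistics

# Simulated population: 200 excavation units (clusters) whose items are
# similar WITHIN a unit but differ BETWEEN units. Purely illustrative.

random.seed(1)

population = []
for _ in range(200):
    unit_mean = random.gauss(50, 10)                     # units differ a lot
    population.append([random.gauss(unit_mean, 2) for _ in range(25)])

sampled_units = random.sample(population, k=12)          # cluster sample
items = [x for unit in sampled_units for x in unit]      # 300 items

# Naive SE: pretend the 300 items are a simple random sample of items
se_naive = statistics.stdev(items) / len(items) ** 0.5

# Cluster SE: treat the 12 unit means as the observations
unit_means = [statistics.mean(u) for u in sampled_units]
se_cluster = statistics.stdev(unit_means) / len(unit_means) ** 0.5

print(f"naive SRS standard error: {se_naive:.2f}")
print(f"cluster standard error:   {se_cluster:.2f}")
```

With strong within-unit similarity, the cluster-based standard error comes out several times the naive one, so confidence intervals computed with the simple-random-sampling formula would be far too narrow.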
These examples suggest a field that has mostly given up on sampling theory, notwithstanding Figure 3's slight uptick in the last few
years and the presence of some very good excep-
tions to the trend. Not all archaeological research
should employ probability sampling, and we all
make mistakes, but some statements like those
just mentioned pose serious concerns. Too
many articles mention samples with no indica-
tion of whether they were probabilistic or not
but treat them as representative. Many use "sample" simply to refer to specimens, selections, or fragments, and "number of samples" to mean sample size. Finally, authors of some studies
that did not employ probability sampling are
not shy about blaming the failure of results to
meet expectations on probable sampling error
rather than on incorrect hypotheses or methods.
This poses grave challenges to the validity of the articles' conclusions.
Can We Revive Probability Sampling?
Not all of archaeology benefits from probability sampling. We are not always interested in the typical or average, but rather in targeting significant anomalies, optimal preservation, or evidence relevant to specific hypotheses. In some
contexts, however, inattention to sample quality
has real consequences.
Sampling theory remains important whenever
we want to generalize about or compare popula-
tions without observing them in their entirety,
such as estimating total roofed area in an
Iroquoian village without excavating an entire
site (cf. Shott 1987). Inattention to sampling could lead to the erroneous inference of significant difference between sites, or change over time, when there is none, or, conversely, to failure to identify significant differences or changes that actually did occur.
It also has a role in experimental archaeology.
Experimenters must demonstrate that they are
measuring what they purport to be measuring
by controlling for confounding variables, such
as skill differences among participants or quality
differences in materials. One tool for this is ran-
domization, much like using a probability sample
from a population of potential experimental configurations.
So, how might we encourage more serious
attention to and more widespread use of thought-
ful sampling in archaeology?
In place of cookbook descriptions of sam-
pling, textbooks could contextualize sampling
within problem-oriented research design. They
could encourage students to think about certain
situations in which sampling would be helpful,
and other situations in which more targeted
research would be more useful. The key is to
ensure validity of observations and conclusions
(Daniels 1978).
Course curricula could include courses that
prepare students to understand sampling as a
practical aspect of research design, not just
regional or site survey. Rather than teach text-
book sample designs, we could encourage stu-
dents to think critically about preventing their
own preconceptions or vagaries of research
from yielding biased characterizations of sites,
artifacts, or assemblages, or inferences of dubi-
ous validity.
We could also be more precise with terminology. Should we restrict the word "sample" to subsets of observations from a larger population? And should we replace "flotation sample," "NAA sample," and the like with "flotation volume," "NAA specimen," and so on? Even "carbon sample" deserves a better term to indicate that its selection has nothing to do with sampling theory. Oxymorons such as "sample population" and "sample parameter" have no place in our literature. We need to clarify whether a "stratified sample" is stratified in the statistical sense or just a specimen from a stratified deposit, and "systematic," in the context of sampling, should not just be a synonym for "methodical."
Conclusions
I suggest the following sampling takeaways:
(1) Probability sampling is not always appropriate but, when generalizing about or comparing populations on the basis of limited observations, failure to employ probability sampling may threaten the validity of results.
(2) Sampling is not only for spatial situations but also for assemblages of artifacts, faunal and plant remains, temper or chemical evidence in pottery or lithics, and many kinds of experiments.
(3) Cluster sampling, ubiquitous in archae-
ology, requires appropriate statistics for esti-
mating variance. Ignoring this affects the
outcomes of statistical tests.
(4) Some archaeological samples are PPS sam-
ples, also requiring appropriate statistics to
avoid bias.
(5) Stratified sampling requires relevant prior information and follow-up evaluation to ensure that criteria for stratification were effective.
(6) Straightforward methods are available to
ensure that sample sizes are adequate; arbi-
trary sampling fractions are worthless.
(7) When in doubt, talk to a statistician.
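As a minimal illustration of point (6), a standard sample-size formula ties n to the variability of the measure and the tolerable error, not to an arbitrary sampling fraction such as "5% of the site." The pilot values below are hypothetical:

```python
import math

# Sketch: sample size needed to estimate a mean within a given tolerance
# at 95% confidence. Pilot standard deviation and tolerances are invented.

Z95 = 1.96

def n_for_mean(sd_guess, tolerance):
    """Sample size to estimate a mean within +/- tolerance at 95% confidence."""
    return math.ceil((Z95 * sd_guess / tolerance) ** 2)

# e.g., estimating mean sherd thickness (mm) from a pilot sd of 2.4 mm
print(n_for_mean(sd_guess=2.4, tolerance=0.5))   # -> 89
print(n_for_mean(sd_guess=2.4, tolerance=0.25))  # halving tolerance roughly quadruples n
```

Note that the required n is the same whether the population holds a thousand sherds or a million, which is why fixed sampling fractions have no statistical rationale.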
Sampling theory has had a rocky ride in archaeology. Negative perceptions of scientism, promotion of full-coverage survey, and flaws in past sampling-based research probably discouraged archaeologists' interest in formal sampling methods.
Yet the need for valid inferences persists, per-
haps all the more as we increasingly mine Big
Data from legacy projects. Probability sampling
has the potential, in conjunction with well-
conceived purposive selection, to contribute to
archaeological research designs that are thought-
ful, efficient, and able to yield valid inferences.
We should not let misconceptions of the 1970s
or 1990s deter us from taking full advantage of
its well-established methods.
Acknowledgments. I am grateful to R. Lee Lyman and sev-
eral anonymous reviewers for their insightful, constructive,
and extremely useful comments on previous versions of this
article. I would also like to thank Gary Lock, Piraye Hacıgü-
zeller, and Mark Gillings for the invitation that unexpectedly
led me to write it, and Sophia Arts for editorial work and
assistance in compiling statistics on published articles.
Data Availability Statement. No original data were used in
this article.
Supplemental Materials. For supplemental material accom-
panying this article, visit https://doi.org/10.1017/aaq.2020.
39.
Supplemental Text 1. List of publications in American Antiquity (1960–2019) and Journal of Field Archaeology (1974–2019) used for the histograms in Figure 3. Note that
there was an interruption in Journal of Field Archaeology
from 2002 until early 2004.
Supplemental Text 2. Distribution of sampling topics
covered in a selection of introductory undergraduate texts,
as well as some more specialized ones, over several decades.
The list excludes texts that only cover culture history and, for
texts with many editions, only includes one early and one
recent edition.
Supplemental Text 3. The 25 highest-ranked international
undergraduate programs in archaeology (Quacquarelli
Symonds Limited 2019), listed alphabetically, and their
2018–2019 or 2019–2020 requirements relevant to sampling,
according to program websites. Where universities had mul-
tiple archaeology programs, the table reflects the one related
to anthropology or prehistory. Information that was not avail-
able publicly online is marked by (?). As no information on
archaeological programs at the Sorbonne was available
online, the sample size is n = 24.
Supplemental Text 4. Disproportionate stratified random cluster sample of the population of research articles and
reports in American Antiquity, Journal of Archaeological Sci-
ence, Journal of Field Archaeology, and Journal of
Archaeological Method and Theory from January 2000 to
December 2019, summarizing the proportions of articles
that use "sample" or "sampling" in various ways, along with
evaluation of the stratification's effectiveness.
Supplemental Text 5. Examples of standards and guide-
lines for archaeological fieldwork in the heritage (CRM)
industry.
References Cited
Ambrose, Wal R.
1967 Archaeology and Shell Middens. Archaeology and Physical Anthropology in Oceania 2:169–187.
Arakawa, Fumiyasu, Christopher Nicholson, and Jeff Rasic
2013 The Consequences of Social Processes: Aggregate Populations, Projectile Point Accumulation, and Subsistence Patterns in the American Southwest. American Antiquity 78:147–165.
Banning, Edward B.
1996 Highlands and Lowlands: Problems and Survey Frameworks for Rural Archaeology in the Near East. Bulletin of the American Schools of Oriental Research 301:25–45.
2000 The Archaeologist's Laboratory: The Analysis of Archaeological Data. Kluwer Academic/Plenum Publishing, New York.
2002 Archaeological Survey. Kluwer Academic/Plenum, New York.
2020 Spatial Sampling. In Archaeological Spatial Analysis: A Methodological Guide, edited by Mark Gillings, Piraye Hacıgüzeller, and Gary Lock, pp. 41–59. Routledge, Abingdon, UK.
Bayman, James M.
1996 Shell Ornament Consumption in a Classic Hohokam Platform Mound Community Center. Journal of Field Archaeology 23:403–420.
Bender, Susan J.
2000 A Proposal to Guide Curricular Reform for the Twenty-First Century. In Teaching Archaeology in the Twenty-First Century, edited by Susan J. Bender and George S. Smith, pp. 31–48. SAA Press, Washington, DC.
Bender, Susan J., and George S. Smith (editors)
2000 Teaching Archaeology in the Twenty-First Century. SAA Press, Washington, DC.
Benedict, James B.
2009 A Review of Lichenometric Dating and Its Applications to Archaeology. American Antiquity 74:143–172.
Berggren, Åsa, and Ian Hodder
2003 Social Practice, Method, and Some Problems of Field Archaeology. American Antiquity 68:421–434.
Binford, Lewis R.
1964 A Consideration of Archaeological Research Design. American Antiquity 29:425–441.
Blalock, Hubert M., Jr.
1979 Social Statistics. 2nd ed. McGraw-Hill, New York.
BLM (Bureau of Land Management)
2004 8110 Identifying and Evaluating Cultural Resources. United States Department of the Interior. Electronic document, https://www.blm.gov/sites/blm.gov/files/uploads/mediacenter_blmpolicymanual8110_0.pdf, accessed January 23, 2020.
Buck, Caitlin E., William G. Cavanagh, and C. Litton
1996 Bayesian Approach to Interpreting Archaeological
Data. John Wiley & Sons, London.
Burger, Oskar, Lawrence C. Todd, Paul Burnett, Thomas
J. Stohlgren, and Doug Stephens
2004 Multi-Scale and Nested-Intensity Sampling Techniques for Archaeological Survey. Journal of Field Archaeology 29:409–423.
Calabrisotto, C. Scirè, Marialucia Amadio, Mariaelena Fedi, Lucia Liccioli, and Luca Bombardieri
2017 Strategies for Sampling Difficult Archaeological Contexts and Improving the Quality of Radiocarbon Data: The Case of Erimi Laonin tou Porakou, Cyprus. Radiocarbon 59:1919–1930.
Campbell, Sarah K.
1981 The Duwamish No. 1 Site, a Lower Puget Sound Shell Midden. Office of Public Archaeology Research Report 1. University of Washington, Seattle.
Campbell, Greg
2017 What Do I Do with All These Shells? Basic Guidance for the Recovery, Processing, and Retention of Archaeological Marine Shells. Quaternary International 42:713–720.
Carey, Chris, Andy J. Howard, Jane Corcoran, David Knight, and Jen Heathcote
2019 Deposit Modeling for Archaeological Projects: Methods, Practice, and Future Developments. Geoarchaeology 34:495–505.
Cherry, John F., Clive Gamble, and Stephen Shennan
(editors)
1978 Sampling in Contemporary British Archaeology.
BAR British Series 50. British Archaeological Reports,
Oxford.
Collins-Elliott, Stephen A.
2017 Bayesian Inference with Monte Carlo Approximation: Measuring Regional Differentiation in Ceramic and Glass Vessel Assemblages in Republican Italy, ca. 200 BCE–20 CE. Journal of Archaeological Science 80:37–49.
Cowgill, George L.
1964 The Selection of Samples from Large Sherd Collections. American Antiquity 29:467–473.
1970 Some Sampling and Reliability Problems in Archaeology. In Archéologie et Calculateurs, edited by Jean-Claude Gardin, pp. 161–175. CNRS, Paris.
1975 A Selection of Samplers: Comments on Archaeo-Statistics. In Sampling in Archaeology, edited by James Mueller, pp. 258–274. University of Arizona Press, Tucson.
1990 Toward Refining Concepts of Full-Coverage Survey. In The Archaeology of Regions: The Case for Full-Coverage Survey, edited by Suzanne K. Fish and Stephen A. Kowalewski, pp. 249–259. Smithsonian Institution, Washington, DC.
1993 Distinguished Lecture in Archaeology: Beyond Criticizing New Archaeology. American Anthropologist 95:551–573.
2015 Some Things I Hope You Will Find Useful Even If Statistics Isn't Your Thing. Annual Review of Anthropology 44:1–14.
Daniels, Stephen G. H.
1978 Implications of Error: Research Design and the Structure of Archaeology. World Archaeology 19:29–35.
Deller, D. Brian, Christopher J. Ellis, and James R. Keron
2009 Understanding Cache Variability: A Deliberately Burned Early Paleoindian Tool Assemblage from the Crowfield Site, Southwestern Ontario, Canada. American Antiquity 74:371–397.
Douglass, Matthew J., Simon J. Holdaway, Patricia C. Fanning, and Justin I. Shiner
2008 An Assessment and Archaeological Application of Cortex Measurement in Lithic Assemblages. American Antiquity 73:513–526.
Drennan, Robert D.
2010 Statistics for Archaeologists: A Common Sense
Approach. Springer, New York.
Dunnell, Robert C.
1984 The Ethics of Archaeological Significance Decisions. In Ethics and Values in Archaeology, edited by Ernestene L. Green, pp. 62–74. Free Press, New York.
Ebert, James I.
1992 Distributional Archaeology. University of New Mexico Press, Albuquerque.
Falconer, Steven E.
1995 Rural Responses to Early Urbanism: Bronze Age Household and Village Economy at Tell el-Hayyat, Jordan. Journal of Field Archaeology 22:399–419.
Ferguson, Jeffry R. (editor)
2010 Designing Experimental Research in Archaeology:
Examining Technology through Production and Use.
University Press of Colorado, Boulder.
Fish, Suzanne K., and Stephen A. Kowalewski (editors)
1990 The Archaeology of Regions: The Case for Full-
Coverage Survey. Smithsonian Institution, Washington,
DC.
Fisher, Sir Ronald A.
1935 The Design of Experiments. Oliver and Boyd,
Edinburgh.
Flad, Rowan K.
2005 Evaluating Fish and Meat Salting at Prehistoric Zhongba, China. Journal of Field Archaeology 30:231–253.
Flannery, Kent V.
1976 The Trouble with Regional Sampling. In The Early Mesoamerican Village, edited by Kent V. Flannery, pp. 159–160. Academic Press, New York.
Florida Division of Historical Resources
2016 Module Three, Guidelines for Use by Historic Preservation Professionals. Division of Historical Resources, Florida Department of State, Tallahassee. Electronic document, https://dos.myflorida.com/media/31394/module3.pdf, accessed January 23, 2020.
Given, Michael, A. Bernard Knapp, Nathan Meyer, Timothy E. Gregory, Vasiliki Kassianidou, Jay Stratton Noller, Lisa Wells, Neil Urwin, and Haddon Wright
1999 The Sydney Cyprus Survey Project: An Interdisciplinary Investigation of Long-Term Change in the North Central Troodos, Cyprus. Journal of Field Archaeology 26:19–39.
Gremillion, Kristen, Jason Windingstad, and Sarah C. Sherwood
2008 Forest Opening, Habitat Use, and Food Production on the Cumberland Plateau, Kentucky: Adaptive Flexibility in Marginal Settings. American Antiquity 73:387–411.
Haggett, Peter
1965 Locational Analysis in Human Geography. Edward Arnold, London.
Harry, Karen G.
2010 Understanding Ceramic Manufacturing Technology: The Role of Experimental Archaeology. In Designing Experimental Research in Archaeology, edited by J. R. Ferguson, pp. 13–45. University Press of Colorado, Boulder.
Hill, James N.
1970 Broken K Pueblo: Prehistoric Social Organization
in the American Southwest. University of Arizona
Press, Tucson.
Hiscock, Peter
2001 Sizing Up Prehistory: Sample Size and Composition of Artefact Assemblages. Australian Aboriginal Studies 1:48–62.
Hole, Bonnie L.
1980 Sampling in Archaeology: A Critique. Annual Review of Anthropology 9:217–234.
Holtzman, Richard C.
1979 Maximum Likelihood Estimation of Fossil Assemblage Composition. Paleobiology 5:77–89.
Hunter, Ryan, Stephen W. Silliman, and David B. Landon
2014 Shellfish Collection and Community Connections in Eighteenth-Century Native New England. American Antiquity 79:712–729.
Johnson, Matthew
1999 Archaeological Theory: An Introduction. Blackwell, Oxford.
Judge, W. James, James I. Ebert, and Robert K. Hitchcock
1975 Sampling in Regional Archaeological Survey. In Sampling in Archaeology, edited by James W. Mueller, pp. 82–123. University of Arizona Press, Tucson.
Kamermans, Hans
1995 Survey Sampling, Right or Wrong. In Computer Applications and Quantitative Methods in Archaeology 1994, edited by Jeremy Huggett and Nick Ryan, pp. 123–126. BAR International Series 600. British Archaeological Reports, Oxford.
Khreisheh, Nada N., Danielle Davies, and Bruce A. Bradley
2013 Extending Experimental Control: The Use of Porcelain in Flaked Stone Experimentation. Advances in Archaeological Practice 1:38–46.
Kintigh, Keith W.
1990 Comments on the Case for Full-Coverage Survey. In The Archaeology of Regions: The Case for Full-Coverage Survey, edited by Suzanne K. Fish and Stephen A. Kowalewski, pp. 237–242. Smithsonian Institution, Washington, DC.
Kolb, Jennifer L., and Katherine Stevenson (editors)
1997 Guidelines for Public Archaeology in Wisconsin. Wisconsin Archaeological Survey. Electronic document, http://www4.uwm.edu/Org/was/WASurvey/WAS_Guidlines_files/WAS_Guidelines_DOC.pdf, accessed January 23, 2020.
Kowalewski, Stephen A.
1990 Merits of Full-Coverage Survey: Examples from the Valley of Oaxaca, Mexico. In The Archaeology of Regions: A Case for Full-Coverage Survey, edited by Suzanne K. Fish and Stephen A. Kowalewski, pp. 33–85. Smithsonian Institution, Washington, DC.
Kuna, Martin
1998 Method of Surface Artefact Survey. In Space in Prehistoric Bohemia, edited by Evžen Neustupný, pp. 77–83. Institute of Archaeology, Czech Academy of Sciences, Prague.
Larralde, Signa, Martin Stein, and Sarah H. Schlanger
2016 The Permian Basin Programmatic Agreement after Seven Years of Implementation. Advances in Archaeological Practice 4:149–160.
Lee, Gyoung-ah
2012 Taphonomy and Sample Size Estimation in Paleoethnobotany. Journal of Archaeological Science 39:648–655. DOI:10.1016/j.jas.2011.10.025.
Leonard, Robert D.
1987 Incremental Sampling in Artifact Analysis. Journal
of Field Archaeology 14:498500.
Lin, Sam C., Zeljko Rezek, and Harold L. Dibble
2018 Experimental Design and Experimental Inference in
Stone Artifact Archaeology. Journal of Archaeological
Method and Theory 25:663688.
Loendorf, Chris R., Craig M. Fertelmes, and Barnaby
V. Lewis
2013 Hohokam to Akimel Oodham: Obsidian Acquisi-
tion at the Historic Period Sacate Site (GR-909), Gila
River Indian Community, Arizona. American Antiquity
78:266284.
Lovis, William A.
1976 Quarter Sections and Forests: An Example of Prob-
ability Sampling in the Northeastern Woodlands. Ameri-
can Antiquity 41:364372.
MacDonald, Burton, Lawrence A. Pavlish, and Edward
B. Banning
1979 The Wâdî al-Hasâ Survey 1979: A Preliminary
Report. Annual of the Department of Antiquities of Jor-
dan 24:169183.
McAnany, Patricia A., and Sarah M. Rowe
2015 Re-Visiting the Field: Collaborative Archaeology as
Paradigm Shift. Journal of Field Archaeology 40:499
507.
McManamon, Francis P.
1981 Probability Sampling and Archaeological Survey in
the Northeast: An Estimation Approach. In Foundations
of Northeast Archaeology, edited by Dean R. Snow, pp.
195–227. Academic Press, New York.
Meadow, Richard H.
1980 Animal Bones: Problems for the Archaeologist
Together with Some Possible Solutions. Paléorient
6:65–77.
Mrozowski, Stephen A., Maria Franklin, and Leslie Hunt
2008 Archaeobotanical Analysis and Interpretation
of Enslaved Virginian Plant Use at Rich Neck
Plantation (44WB52). American Antiquity 73:699–
728.
MTCS (Ministry of Tourism, Culture and Sport, Ontario)
2011 Standards and Guidelines for Consultant Archaeolo-
gists. Queen's Printer for Ontario, Toronto. Electronic
document, http://www.mtc.gov.on.ca/en/publications/
SG_2010.pdf, accessed January 23, 2020.
Muckle, Robert J.
2014 Introducing Archaeology. 2nd ed. University of
Toronto Press, Toronto.
Mueller, James W.
1974 The Use of Sampling in Archaeological Survey.
Memoirs of the Society for American Archaeology 28.
American Antiquity 39(2, Pt. 2):1–91.
Mueller, James W.
1975a Archaeological Research as Cluster Sampling. In
Sampling in Archaeology, edited by James W. Mueller,
pp. 33–41. University of Arizona Press, Tucson.
Mueller, James W. (editor)
1975b Sampling in Archaeology. University of Arizona
Press, Tucson.
Nance, Jack D.
1981 Statistical Fact and Archaeological Faith: Two Mod-
els in Small-Sites Sampling. Journal of Field Archae-
ology 8:151–165.
Nance, Jack D., and Bruce F. Ball
1986 No Surprises? The Reliability and Validity of Test
Pit Sampling. American Antiquity 51:457–483.
58 [Vol. 86, No. 1, 2021] AMERICAN ANTIQUITY
https://doi.org/10.1017/aaq.2020.39 Published online by Cambridge University Press
Neumann, Thomas W., and Robert M. Sanford
2010 Practicing Archaeology: An Introduction to Cul-
tural Resources Archaeology, 2nd ed. AltaMira, Lan-
ham, Maryland.
O'Brien, Michael J., and Dennis E. Lewarch (editors)
1979 Recent Approaches to Surface Data and Sampling.
Special issue, Western Canadian Journal of Anthropol-
ogy 8(3).
Opitz, Rachel S., Krysta Ryzewski, John F. Cherry, and
Brenna Moloney
2015 Using Airborne LIDAR Survey to Explore
Historic-Era Archaeological Landscapes of Montserrat
in the Eastern Caribbean. Journal of Field Archaeology
40:523–541.
Orton, Clive
2000 Sampling in Archaeology. Cambridge University
Press, Cambridge.
Outram, Alan K.
2008 Introduction to Experimental Archaeology. World
Archaeology 40:1–6.
Parcero-Oubiña, César, Pastor Fábrega-Álvarez, Diego
Salazar, Andrés Troncoso, Frances Hayashida,
Mariela Pino, César Borie, and Ester Echenique
2017 Ground to Air and Back Again: Archaeological
Prospection to Characterize Prehispanic Agricultural
Practices in the High-Altitude Atacama (Chile). Quater-
nary International 435B:98–113.
Parsons, Jeffrey R.
1990 Critical Reflections on a Decade of Full-Coverage
Regional Survey in the Valley of Mexico. In The
Archaeology of Regions: The Case for Full-Coverage
Survey, edited by Suzanne K. Fish and Stephen A.
Kowalewski, pp. 7–31. Smithsonian Institution, Wash-
ington, DC.
Peacock, William R. B.
1978 Probabilistic Sampling in Shell Middens: A Case
Study from Oronsay, Inner Hebrides. In Sampling
in Contemporary British Archaeology, edited by
John Cherry, Clive Gamble, and Stephen Shennan, pp.
177–190. British Archaeological Reports, Oxford.
Perreault, Charles
2011 The Impact of Site Sample Size on the Reconstruc-
tion of Culture Histories. American Antiquity 76:
547–572.
Phillips, Philip, James A. Ford, and James B. Griffin
1951 Archaeological Survey in the Lower Mississippi
Alluvial Valley, 1940–1947. Papers of the Peabody
Museum of Archaeology and Ethnology 25. Harvard
University, Cambridge, Massachusetts.
Plog, Stephen
1976 Relative Efficiencies of Sampling Techniques for
Archaeological Surveys. In The Early Mesoamerican
Village, edited by Kent V. Flannery, pp. 136–158.
Academic Press, New York.
Plog, Fred
1990 Some Thoughts on Full-Coverage Surveys. In The
Archaeology of Regions: The Case for Full-Coverage
Survey, edited by Suzanne K. Fish and Stephen A.
Kowalewski, pp. 243–248. Smithsonian Institution,
Washington, DC.
Pop, Eduard, Corrie Bakels, Wim Kuijper, Herman Mücher,
and Madeleine van Dijk
2015 The Dynamics of Small Postglacial Lake Basins and
the Nature of Their Archaeological Record: A Case
Study of the Middle Palaeolithic Site Neumark-Nord
2, Germany. Geoarchaeology 30:393–413.
Poteate, Aaron S., and Scott M. Fitzpatrick
2013 Testing the Efficacy and Reliability of Common
Zooarchaeological Sampling Strategies: A Case Study
from the Caribbean. Journal of Archaeological Science
40:3693–3705.
Prasciunas, Mary M.
2011 Mapping Clovis: Projectile Points, Behavior, and
Bias. American Antiquity 76:107–126.
Quacquarelli Symonds Limited
2019 QS World University Rankings, Archaeology.
Electronic document, https://www.topuniversities.com/
university-rankings/university-subject-rankings/2019/
archaeology, accessed May 25, 2019.
Ragir, Sonia
1967 A Review of Techniques for Archaeological Sam-
pling. In A Guide to Field Methods in Archaeology:
Approaches to the Anthropology of the Dead, edited
by Robert F. Heizer and John A. Graham, pp. 181–
197. National Press, Palo Alto, California.
Redman, Charles L.
1974 Archaeological Sampling Strategies. Addison-
Wesley Module in Anthropology 55. Addison-Wesley,
Reading, Massachusetts.
Redman, Charles L., and Patty Jo Watson
1970 Systematic Intensive Surface Collection. American
Antiquity 35:279–291.
Renfrew, Colin, and Paul Bahn
2008 Archaeology: Theories, Methods, and Practice. 5th
ed. Thames & Hudson, London.
Richardson, Mary, and Byron Gajewski
2002 Archaeological Sampling Strategies. Journal of
Statistics Education 10. DOI:10.1080/10691898.2003.
11910693.
Rick, Torben C., Jon M. Erlandson, and René L. Vellanoweth
2001 Paleocoastal Marine Fishing on the Pacific Coast of
the Americas: Perspectives from Daisy Cave, California.
American Antiquity 66:595–613.
Rootenberg, Sheldon
1964 Archaeological Field Sampling. American Antiquity
30:181–188.
Sanders, William T., Jeffrey R. Parsons, and Robert
S. Santley
1979 The Basin of Mexico: Ecological Processes in the
Evolution of a Civilization. Academic Press, New York.
Scheps, Sheldon
1982 Statistical Blight. American Antiquity 47:836–
851.
Schlanger, Sarah, George MacDonell, Signa Larralde, and
Martin Sheen
2013 Going Big: The Permian Basin Memorandum
of Agreement as a Fundamental Shift in Section 106
Compliance. Advances in Archaeological Practice
1:13–23.
Schneider, Tsim D.
2015 Envisioning Colonial Landscapes Using Mission
Registers, Radiocarbon, and Stable Isotopes: An Experi-
mental Approach from San Francisco Bay. American
Antiquity 80:511–529.
Schuldenrein, Joseph, and Jeffrey H. Altschul
2000 Archaeological Education and Private-Sector
Employment. In Teaching Archaeology in the
Twenty-First Century, edited by Susan J. Bender and
George S. Smith, pp. 59–64. SAA Press, Washington,
DC.
Shanks, Michael
1999 Art and the Early Greek State: An Interpretive
Archaeology. Cambridge University Press, Cambridge.
Shanks, Michael, and Christopher Tilley
1987 Re-Constructing Archaeology: Theory and Practice.
Cambridge University Press, Cambridge.
Shott, Michael L.
1985 Shovel-Test Sampling as a Site Discovery Tech-
nique: A Case Study from Michigan. Journal of Field
Archaeology 12:457–468.
1987 Feature Discovery and the Sampling Requirements
of Archaeological Evaluations. Journal of Field Archae-
ology 14:359–371.
1989 Shovel-Test Sampling in Archaeological Survey:
Comments on Nance and Ball, and Lightfoot. American
Antiquity 54:396–404.
1992 Commerce or Service? Models of Practice in
Archaeology. In Quandaries and Quests: Visions of
Archaeology's Future, edited by LuAnn Wandsnider,
pp. 9–24. Center for Archaeological Investigations
Occasional Paper 20. Southern Illinois University
Press, Carbondale.
Sørensen, Tim Flohr
2017 The Two Cultures and a World Apart: Archaeology
and Science at a New Crossroads. Norwegian Archaeo-
logical Review 50:101–115.
Spencer, Charles S., Elsa M. Redmond, and Christina
M. Elson
2008 Ceramic Microtypology and the Territorial Expan-
sion of the Early Monte Albán State in Oaxaca, Mexico.
Journal of Field Archaeology 33:321–341.
Stanford University
2019 Major, Bachelor of Arts in Archaeology. Stanford
Archaeology Center, School of Humanities and
Sciences. Electronic document, https://archaeology.
stanford.edu/academics/undergraduate-program/major,
accessed May 25, 2019.
Sundstrom, Linea
1993 A Simple Mathematical Procedure for Estimating
the Adequacy of Site Survey Strategies. Journal of
Field Archaeology 20:91–96.
Tankosić, Žarko, and Maria Chidiroglou
2010 The Karystian Kampos Survey Project: Methods and
Preliminary Results. Mediterranean Archaeology and
Archaeometry 10(3):11–17.
Tartaron, Thomas F.
2003 The Archaeological Survey: Sampling Strategies
and Field Methods. In Landscape Archaeology in
Southern Epirus, Greece, edited by James Wiseman
and Konstantinos Zachos, pp. 23–45. Hesperia Supple-
ments 32. American School of Classical Studies at Ath-
ens, Athens.
Thomas, David H.
1978 The Awful Truth about Statistics in Archaeology.
American Antiquity 43:231–244.
1999 Archaeology: Down to Earth. Nelson Thomson
Learning, Toronto.
Thompson, Steven K., and George Seber
1996 Adaptive Sampling. John Wiley & Sons, New York.
Ullah, Isaac, Paul Duffy, and Edward B. Banning
2015 Modernizing Spatial Micro-Refuse Analysis: New
Methods for Collecting, Analyzing, and Interpreting
the Spatial Patterning of Micro-Refuse from House-
Floor Contexts. Journal of Archaeological Method
and Theory 22:1238–1262. DOI:10.1007/s10816-014-
9223-x.
VanPool, Christine S., and Todd L. VanPool
1999 The Scientific Nature of Postprocessualism. Ameri-
can Antiquity 64:33–53.
Varien, Mark D., Scott G. Ortman, Timothy A. Kohler,
Donna M. Glowacki, and C. David Johnson
2007 Historical Ecology in the Mesa Verde Region:
Results from the Village Ecodynamics Project. Ameri-
can Antiquity 72:273–299.
Vaughn, Kevin J., and Hector Neff
2000 Moving beyond Iconography: Neutron Activation
Analysis of Ceramics from Marcaya, Peru, an Early
Nasca Domestic Site. Journal of Field Archaeology
27:75–90.
Vescelius, Gary S.
1960 Archaeological Sampling: A Problem in Statistical
Inference. In Essays in the Science of Culture, in
Honor of Leslie A. White, edited by Gertrude E. Dole
and Robert L. Carneiro, pp. 457–470. Crowell,
New York.
Wallace-Hadrill, Andrew
1990 The Social Spread of Roman Luxury: Sampling
Pompeii and Herculaneum. Papers of the British School
at Rome 58:145–192.
Walsh, Michael R.
1998 Lines in the Sand: Competition and Stone Selection
on the Pajarito Plateau, New Mexico. American
Antiquity 63:573–593.
Watson, Richard A.
1990 Ozymandias, King of Kings: Postprocessual Radical
Archaeology as Critique. American Antiquity 55:673–
689.
Watson, Patty Jo, Stephen LeBlanc, and Charles L. Redman
1971 Explanation in Archeology: An Explicitly Scientific
Approach. Columbia University Press, New York.
Welch, Paul
2013 Designing a Sample of Cores to Estimate the Num-
ber of Features at a Site. Advances in Archaeological
Practice 1:4758.
Williams, Leonard, David H. Thomas, and Robert Bettinger
1973 Notions to Numbers: Great Basin Settlements as
Polythetic Sets. In Research and Theory in Current
Archaeology, edited by Charles L. Redman, pp. 215–
237. John Wiley & Sons, New York.
Wobst, Martin H.
1983 We Can't See the Forest for the Trees: Sampling and
the Shapes of Archaeological Distributions. In Archaeo-
logical Hammers and Theories, edited by James
A. Moore and Arthur S. Keene, pp. 37–85. Academic
Press, New York.
Yasur-Landau, Assaf, Eric H. Cline, Andrew J. Koh,
David Ben-Shlomo, Nimrod Marom, Alexandra
Ratzlaff, and Inbal Samet
2015 Rethinking Canaanite Palaces? The Palatial Econ-
omy of Tel Kabri during the Middle Bronze Age. Jour-
nal of Field Archaeology 40:607–625.
Submitted July 29, 2019; Revised April 13, 2020; Accepted
April 15, 2020