Tutorials in Quantitative Methods for Psychology
2007, vol. 3(2), p. 43-50.

Understanding Power and Rules of Thumb for Determining Sample Sizes

Carmen R. Wilson VanVoorhis and Betsy L. Morgan
University of Wisconsin-La Crosse

This article addresses the definition of power and its relationship to Type I and Type II errors. We discuss the relationship of sample size and power. Finally, we offer statistical rules of thumb guiding the selection of sample sizes large enough for sufficient power to detect differences, associations, chi-square, and factor analyses.

Portions of this article were published in Psi Chi Journal of Undergraduate Research.
As researchers, it is disheartening to pour time and intellectual energy into a research project, analyze the data, and find that the elusive .05 significance level was not met. If the null hypothesis is genuinely true, then the findings are robust. But what if the null hypothesis is false and the results failed to detect the difference at a high enough level? It is a missed opportunity. Power refers to the probability of rejecting a false null hypothesis. Attending to power during the design phase protects both researchers and respondents. In recent years, some Institutional Review Boards for the protection of human respondents have rejected or altered protocols due to design concerns (Resnick, 2006). They argue that an "underpowered" study may not yield useful results and consequently unnecessarily puts respondents at risk. Overall, researchers can and should attend to power. This article defines power in accessible ways, provides guidelines for increasing power, and finally offers "rules of thumb" for numbers of respondents needed for common statistical procedures.
What is power?
Beginning social science researchers learn about Type I and Type II errors. Type I errors (represented by α) are made when the data result in a rejection of the null hypothesis, but in reality the null hypothesis is true (Neyman & Pearson, 1928/1967). Type II errors (represented by β) are made when the data do not support a rejection of the null hypothesis, but in reality the null hypothesis is false (Neyman & Pearson, 1928/1967). However, as shown in Figure 1, in every study there are four possible outcomes. In addition to Type I and Type II errors, two other outcomes are possible. First, the data may not support a rejection of the null hypothesis when, in reality, the null hypothesis is true. Second, the data may result in a rejection of the null hypothesis when, in reality, the null hypothesis is false (see Figure 1). This final outcome represents statistical power.
Researchers tend to over-attend to Type I errors (e.g., Wolins, 1982), in part due to statistical packages that rarely include estimates of the other probabilities. Post hoc analyses of published articles often yield the finding that Type II errors are common events in published articles (e.g., Strasak, Zaman, Pfeiffer, Goebel, & Ulmer, 2007; Williams, Hathaway, Kloster, & Layne, 1997).
When a .05 or lower significance is obtained, researchers are fairly confident that the results are "real," in other words, not due to chance factors alone. In fact, with a significance level of .05, researchers can be 95% confident the results represent a non-chance finding (Aron & Aron, 1999). Researchers should continue to strive to reduce the probability of Type I errors; however, they also need to increase their attention to power.
Every statistic has a corresponding sampling distribution. A sampling distribution is created, in theory, via the following steps (Kerlinger & Lee, 2000):

1. Select a sample of a given n under the null hypothesis.
2. Calculate the specific statistic.
3. Repeat steps 1 and 2 an "infinite" number of times.
4. Plot the given statistic by frequency of value.

Figure 1. Possible outcomes of decisions based on statistical results.

                                        "TRUTH" OR "REALITY"
Decision based on                  Null correct        Null wrong
statistical result:
  Fail to reject                   Correct decision    Type II (β)
  Reject                           Type I (α)          Correct decision (Power)
For instance, the following steps could be used to create a sampling distribution for the independent samples t-test (based on Fisher, 1925/1990; Pearson, 1990).

1. Select two samples of a given size from a single population. The two samples are selected from a single population because the sampling distribution is constructed given the null hypothesis is true (i.e., the sample means are not statistically different).
2. Calculate the independent samples t-test statistic based on the two samples.
3. Complete steps 1 and 2 an "infinite" number of times. In other words, select two samples from the same population and calculate the independent samples t-test statistic repeatedly.
4. Plot the obtained independent samples t-test values by frequency. Given the independent samples t-test is based on the difference between the means of the two samples, most of the values will hover around zero, as the samples both were drawn from the same population (i.e., both sample means are estimating the same population mean). Sometimes, however, one or both of the sample means will be poor estimates of the population mean and differ widely from each other, yielding the bell-shaped curve characteristic of the independent samples t-test sampling distribution.
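These steps translate directly into a short simulation. In the sketch below, the "infinite" repetition is approximated with 10,000 draws from a standard normal population; the population parameters and sample size are our own arbitrary illustrative choices, not values from the article.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
n_reps, n_per_sample = 10_000, 15  # arbitrary illustrative values

t_values = []
for _ in range(n_reps):
    # Steps 1-2: draw two samples from ONE population (the null is true
    # by construction) and compute the independent samples t statistic.
    a = rng.normal(loc=0, scale=1, size=n_per_sample)
    b = rng.normal(loc=0, scale=1, size=n_per_sample)
    t, _ = stats.ttest_ind(a, b)
    t_values.append(t)

# Step 4: plotted by frequency, these values form the bell-shaped t
# sampling distribution centered on zero; here we just summarize it.
print(f"mean t = {np.mean(t_values):.3f}")                 # near 0
print(f"97.5th percentile ~ {np.percentile(t_values, 97.5):.2f}")
```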
When a researcher analyzes data and calculates a statistic, the obtained value is compared against this sampling distribution. Depending on the location of the obtained value along the sampling distribution, one can determine the probability of achieving that particular value given the null hypothesis is true. If the probability is sufficiently small, the researcher rejects the null hypothesis. Of course, the possibility remains, albeit unlikely, that the null hypothesis is true and the researcher has made a Type I error.
Estimating power depends upon a different distribution (Cohen, 1992). The simplest example is the z-test, in which the mean of a sample is compared to the mean of the population to determine if the sample comes from the population (P1). Power assumes that the sample, in fact, comes from a different population (P2). Therefore, the sampling distribution of P2 will be different than the sampling distribution of P1 (see Figure 2). Power assumes that the null hypothesis is incorrect.
The goal is to obtain a z-test value sufficiently extreme to reject the null hypothesis. Usually, however, the two distributions overlap. The greater the overlap, the more values P1 and P2 share, and the less likely it is that the obtained test value will result in the rejection of the null hypothesis. Reducing this overlap increases the power. As the overlap decreases, the proportion of values under P2 which fall within the rejection range (indicated by the shaded area under P2) increases.
Figure 2. Sampling distributions of means for the z-test assuming the null hypothesis is false. P1 represents the sampling distribution of means of the original population; P2 represents the sampling distribution of means from which the sample was drawn. The shaded area under P2 represents power, i.e., the probability of correctly rejecting a false null hypothesis.
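The shaded area in Figure 2 can be computed directly. The sketch below uses hypothetical values of our own (μ1 = 100, μ2 = 106, σ = 15, n = 25, two-tailed α = .05, upper rejection region only), not numbers from the article: power is the proportion of P2's sampling distribution falling beyond the critical value set under P1.

```python
import math
from scipy.stats import norm

mu1, mu2, sigma, n = 100.0, 106.0, 15.0, 25   # hypothetical values
se = sigma / math.sqrt(n)                      # SD of the sampling distribution

# Critical sample mean under P1 for a two-tailed alpha = .05 (upper tail).
crit = norm.ppf(0.975, loc=mu1, scale=se)

# Power: area of P2's sampling distribution beyond the critical value.
power = norm.sf(crit, loc=mu2, scale=se)
print(f"critical mean = {crit:.2f}, power = {power:.2f}")
```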
45
Table 1: Sample Data Set

Person   X        Person   X
1        5.50     11       7.50
2        6.00     12       7.50
3        6.00     13       8.00
4        6.50     14       8.00
5        6.50     15       8.00
6        7.00     16       8.50
7        7.00     17       8.50
8        7.00     18       9.00
9        7.50     19       9.00
10       7.50     20       9.50
Manipulating Power

Sample Sizes and Effect Sizes
As argued earlier, a reduction of the overlap of the distributions of two samples increases power. Two strategies exist for minimizing the overlap between distributions. The first, and the one a researcher can most easily control, is to increase the sample size (e.g., Cohen, 1990; Cohen, 1992). Larger samples result in increased power. The second, discussed later, is to increase the effect size.

Larger samples more accurately represent the characteristics of the populations from which they are derived (Cronbach, Gleser, Nanda, & Rajaratnam, 1972; Marcoulides, 1993). In an oversimplified example, imagine a population of 20 people with the scores on some measure (X) as listed in Table 1.
The mean of this "population" is 7.5 (σ = 1.08). Imagine researchers are unable to know the exact mean of the population and wanted to estimate it via a sample mean. If they drew a random sample, n = 3, it could be possible to select three low or three high scores which would be rather poor estimates of the "population" mean. Alternatively, if they drew samples, n = 10, even the ten lowest or ten highest scores would better estimate the population mean than the sample of three. For example, using this "population" we drew ten samples of three and ten samples of ten (see Table 2 for the sample means). The overall mean of the sample means based on three people is 7.57 and the standard deviation is .45. The overall mean of the sample means based on ten people is 7.49 and the standard deviation is .20. The sample means based on ten people were, on average, closer to the population mean (μ = 7.50) than the sample means based on three people.
The standard error of measurement estimates the average difference between a sample statistic and the population statistic. In general, the standard error of measurement is the standard deviation of the sampling distribution. In the above example, we created two miniature sampling distributions of means. The sampling distribution of the z-test (used to compare a sample mean to a population mean) is a sampling distribution of means (although it includes an "infinite" number of sample means). As indicated by the standard deviations of the means (i.e., the standard errors of measurement), the average difference between the sample means and the population mean is smaller when we drew samples of 10 than when we drew samples of 3. In other words, the sampling distribution based on samples of size 10 is "narrower" than the sampling distribution based on samples of size 3. Applied to power, given the population means remain static, "narrower" distributions will overlap less than "wider" distributions (see Figure 3).

Figure 3. The relationship between standard error of measurement and power. As the standard error of measurement decreases, the proportion of the P2 distribution above the z-critical value (see shaded area under P2) increases, therefore increasing the power. The distributions at the top of the figure have smaller standard errors of measurement and therefore less overlap, while the distributions at the bottom have larger standard errors of measurement and therefore more overlap, decreasing the power.
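The n = 3 versus n = 10 comparison is easy to replicate. The sketch below draws repeated random samples from the 20-score "population" in Table 1 (sampling without replacement, and 10,000 repetitions rather than the article's ten; those are our choices for illustration) and reports the standard deviation of the sample means, i.e., the standard error, for each sample size.

```python
import numpy as np

population = np.array([5.5, 6.0, 6.0, 6.5, 6.5, 7.0, 7.0, 7.0, 7.5, 7.5,
                       7.5, 7.5, 8.0, 8.0, 8.0, 8.5, 8.5, 9.0, 9.0, 9.5])
rng = np.random.default_rng(seed=1)

for n in (3, 10):
    means = [rng.choice(population, size=n, replace=False).mean()
             for _ in range(10_000)]
    # The SD of the sample means estimates the standard error: it
    # shrinks as n grows, "narrowing" the sampling distribution.
    print(f"n = {n:2d}: mean of means = {np.mean(means):.2f}, "
          f"SD of means (standard error) = {np.std(means):.2f}")
```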
Consequently, larger sample sizes increase power and decrease estimation error. However, the practical realities of conducting research, such as time, access to samples, and financial costs, restrict the size of samples for most researchers. The balance is generating a sample large enough to provide sufficient power while allowing for the ability to actually garner the sample. Later in this article, we provide some "rules of thumb" for some common statistical tests aimed at obtaining this balance between resources and ideal sample sizes.
The second way to minimize the overlap between distributions is to increase the effect size (Cohen, 1988). Effect size represents the actual difference between the two populations; often effect sizes are reported in some standard unit (Howell, 1997). Again, the simplest example is the z-test. Assuming the null hypothesis is false (as power does), the effect size (d) is the difference between μ1 and μ2 in standard deviation units. Specifically,

d = (M - μ1) / σ

where M is the sample mean derived from μ2 (remember, power assumes the null hypothesis is false; therefore, the sample is drawn from a different population than μ1). If the effect size is .50, then μ1 and μ2 differ by one-half of a standard deviation. The more disparate the population means, the less overlap between the distributions (see Figure 4). Researchers can increase power by increasing the effect size.
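Holding n fixed, power rises as d grows. A minimal sketch using a normal approximation for the one-sample z-test (the choice of n = 25 and one-tailed α = .05 is our illustrative assumption): in d units, power ≈ 1 - Φ(z_crit - d·√n), the area of P2 beyond the critical value.

```python
import math
from scipy.stats import norm

n, alpha = 25, 0.05                 # illustrative assumptions
z_crit = norm.ppf(1 - alpha)        # one-tailed critical z

for d in (0.2, 0.5, 0.8):           # Cohen's small/medium/large benchmarks
    power = norm.sf(z_crit - d * math.sqrt(n))
    print(f"d = {d:.1f}: power = {power:.2f}")
```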
Manipulating effect size is not nearly as straightforward as increasing the sample size. At times, researchers can attempt to maximize effect size by maximizing the difference between or among independent variable levels. For example, suppose a particular study involved examining the effect of caffeine on performance. Likely, differences in performance, if they exist, will be more apparent if the researcher compares individuals who ingest widely different amounts of caffeine (e.g., 450 mg vs. 0 mg) than if she compares individuals who ingest more similar amounts of caffeine (e.g., 25 mg vs. 0 mg). If the independent variable is a measured subject variable, for example, ability level, effect size can be increased by including groups who are "extreme" in ability level. For example, rather than comparing people who are above the mean in ability level with those who are below the mean, the researcher might compare people who score at least one standard deviation above the mean with those who score at least one standard deviation below the mean. Other times, the effect size is simply out of the researcher's control. In those instances, the best a researcher can do is to be sure the dependent variable measure is as reliable as possible to minimize any error due to the measurement (which would serve to "widen" the distribution).

Table 2: Sample Means Presented by Magnitude

Sample   M (n = 3)   M (n = 10)
1        6.83        7.25
2        7.17        7.30
3        7.33        7.30
4        7.33        7.35
5        7.50        7.40
6        7.67        7.50
7        7.67        7.60
8        7.83        7.70
9        7.83        7.75
10       8.50        7.75

Figure 4. The relationship between effect size and power. As the effect size increases, the proportion of the P2 distribution above the z-critical value (see shaded area under P2) increases, therefore increasing the power. The distributions at the top of the figure represent populations with means that differ to a larger degree (i.e., a larger effect size) than the distributions at the bottom. The larger difference between the population means results in less overlap between the distributions, increasing power.

Figure 5. The relationship between α and power. As α increases, as in a one-tailed test, the proportion of the P2 distribution above the z-critical value (see shaded area under P2) increases. The distributions at the top of the figure represent a two-tailed test in which the α level is split between the two tails; the distributions at the bottom of the figure represent a one-tailed test in which the α level is included in only one tail.
Error Variance and Power
Error variance, or variance due to factors other than the independent variable, decreases the likelihood of detecting differences or relationships that actually exist, i.e., decreases power (Cohen, 1988). Differences in dependent variable scores can be due to many factors other than the effects of the independent variable. For example, scores on measures with low reliability can vary dependent upon the items included in the measure, the conditions of testing, or the time of testing. A participant might be talented in the task or, alternatively, be tired and unmotivated. Dependent samples control for error variance due to such participant characteristics.
Each participant's dependent variable score (X) can be characterized as

X = μ + tx + p + e

where μ is the population mean, tx is the effects of treatment, p is the participant characteristics, and e is random error.
In the true dependent samples design, each participant experiences each level of the independent variable. Any participant characteristics which impact the dependent variable score at one level will similarly affect the dependent variable score at other levels of the independent variable. Different statistics use different methods to separate variance due to participant characteristics from error variance. The simplest example is a dependent samples t-test design, in which there are two levels of an independent variable. The formula for the dependent samples t-test is

t = MD / SEMD

where MD is the mean of the difference scores and SEMD is the standard error of the mean difference. Difference scores are created for each participant by subtracting the score under one level of the independent variable from the score under the other level of the independent variable. The actual magnitude of the scores, then, is eliminated, leaving a difference that is due, to a larger degree, to the treatment and, to a lesser degree, to participant characteristics. The differences due to treatment, then, are easier to detect. In other words, such a design increases power (Cohen, 2001; Cohen, 1988).
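The variance decomposition and the difference score logic can be made concrete in a few lines. The sketch below generates paired scores from X = μ + tx + p + e with made-up parameter values (ours, for illustration only), then computes t = MD / SEMD from the difference scores; subtracting the paired scores cancels the participant term p.

```python
import math
import numpy as np

rng = np.random.default_rng(seed=1)
n, mu, tx = 30, 50.0, 3.0            # made-up values for illustration

p = rng.normal(0, 10, size=n)        # stable participant characteristics
x_control = mu + 0 + p + rng.normal(0, 2, size=n)   # no treatment effect
x_treated = mu + tx + p + rng.normal(0, 2, size=n)  # treatment adds tx

d_scores = x_treated - x_control     # p cancels; tx and random error remain
md = d_scores.mean()
se_md = d_scores.std(ddof=1) / math.sqrt(n)
print(f"t = MD / SEMD = {md / se_md:.2f}")
```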
Type I Errors and Power

Finally, power is related to α, or the probability of making a Type I error. As α increases, power increases (see Figure 5). The reality is that few researchers or reviewers are willing to trust in results where the probability of rejecting a true null hypothesis is greater than .05. Nonetheless, this relationship does explain why one-tailed tests are more powerful than two-tailed tests. Assuming an α level of .05, in a two-tailed test, the total α level must be split between the tails, i.e., .025 is assigned to each tail. In a one-tailed test, the entire α level is assigned to one of the tails. It is as if the α level has increased from .025 to .05.
Rules of Thumb

The remaining articles in this edition discuss specific power estimates for various statistics. While we certainly advocate for a full understanding of and attention to power estimates, at times such concepts are beyond the scope of a particular researcher's training (for example, in undergraduate research). In those instances, power need not be ignored totally, but rather can be attended to via certain rules of thumb based on the principles regarding power. Table 3 provides an overview of the sample size rules of thumb discussed below.
48
Table 3: Sample size rules of thumb

Relationship                          Reasonable sample size
Measuring group differences           Cell size of 30 for 80% power; if decreased,
(e.g., t-test, ANOVA)                 no lower than 7 per cell.
Relationships                         ~50
(e.g., correlations, regression)
Chi-square                            At least 20 overall; no cell smaller than 5.
Factor analysis                       ~300 is "good"
Number of Participants: Cell size for statistics used to detect differences.

The independent samples t-test, matched sample t-test, ANOVA (one-way or factorial), and MANOVA are all statistics designed to detect differences between or among groups. How many participants are needed to maintain adequate power when using statistics designed to detect differences? Given a medium to large effect size, 30 participants per cell should lead to about 80% power (the minimum suggested power for an ordinary study) (Cohen, 1988). Cohen's conventions suggest an effect size of .20 is small, .50 is medium, and .80 is large. If, for some reason, minimizing the number of participants is critical, 7 participants per cell, given at least three cells, will yield power of approximately 50% when the effect size is .50. Fourteen participants per cell, given at least three cells and an effect size of .50, will yield power of approximately 80% (Kraemer & Thiemann, 1987).
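Rules of thumb like these are easy to sanity-check by simulation. The sketch below empirically estimates power for two independent groups of 30 at Cohen's three benchmark effect sizes (the simulation settings are ours, not from the article); it illustrates that the ~80% figure for 30 per cell holds toward the large end of the medium-to-large range.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(seed=1)
n_per_cell, n_reps = 30, 5_000

for d in (0.2, 0.5, 0.8):  # Cohen's small, medium, large
    rejections = 0
    for _ in range(n_reps):
        a = rng.normal(0.0, 1.0, n_per_cell)  # control group
        b = rng.normal(d, 1.0, n_per_cell)    # group shifted by d SDs
        if stats.ttest_ind(a, b).pvalue < .05:
            rejections += 1
    print(f"d = {d:.1f}: empirical power ~ {rejections / n_reps:.2f}")
```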
Caveats. First, comparisons of fewer groups (i.e., cells) require more participants to maintain adequate power. Second, lower expected effect sizes require more participants to maintain adequate power (Aron & Aron, 1999). Third, when using MANOVA, it is important to have more cases than dependent variables (DVs) in every cell (Tabachnick & Fidell, 1996).
Number of participants: Statistics used to examine relationships.

Although there are more complex formulae, the general rule of thumb is no less than 50 participants for a correlation or regression, with the number increasing with larger numbers of independent variables (IVs). Green (1991) provides a comprehensive overview of the procedures used to determine regression sample sizes. He suggests N > 50 + 8m (where m is the number of IVs) for testing the multiple correlation and N > 104 + m for testing individual predictors (assuming a medium-sized relationship). If testing both, use the larger sample size.
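Green's (1991) two inequalities are simple to encode. A minimal helper (the function name and structure are our own):

```python
def green_min_n(m: int) -> int:
    """Minimum N per Green (1991) for m predictors, assuming a
    medium-sized relationship: N > 50 + 8m for testing the multiple
    correlation, N > 104 + m for testing individual predictors.
    When testing both, use the larger of the two."""
    return max(50 + 8 * m, 104 + m)

# e.g., 3 predictors: max(74, 107) -> 107 participants
print(green_min_n(3))
```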
Although Green's (1991) formula is more comprehensive, there are two other rules of thumb that could be used. With five or fewer predictors (this number would include correlations), a researcher can use Harris's (1985) formula for yielding the absolute minimum number of participants. Harris suggests that the number of participants should exceed the number of predictors by at least 50 (i.e., total number of participants equals the number of predictor variables plus 50), a formula much the same as Green's mentioned above. For regression equations using six or more predictors, an absolute minimum of 10 participants per predictor variable is appropriate. However, if the circumstances allow, a researcher would have better power to detect a small effect size with approximately 30 participants per variable. For instance, Cohen and Cohen (1975) demonstrate that with a single predictor that in the population correlates with the DV at .30, 124 participants are needed to maintain 80% power. With five predictors and a population correlation of .30, 187 participants would be needed to achieve 80% power.
Caveats. Larger samples are needed when the DV is skewed, the effect size expected is small, there is substantial measurement error, or stepwise regression is being used (Tabachnick & Fidell, 1996).
Number of participants: Chi-square.

The chi-square statistic is used to test the independence of categorical variables. While this is obvious, sometimes the implications are not. The primary implication is that all observations must be independent. In other words, no one individual can contribute more than one observation. The degrees of freedom are based on the number of variables and their possible levels, not on the number of observations. Increasing the number of observations, then, has no impact on the critical value needed to reject the null hypothesis. The number of observations still impacts the power, however. Specifically, small expected frequencies in one or more cells limit power considerably. Small expected frequencies can also slightly inflate the Type I error rate; however, for total sample sizes of at least 20, the alpha rarely rises above .06 (Howell, 1997). A conservative rule is that no expected frequency should drop below 5.
Caveat.If the expected effect size is large, lower power
canbetoleratedandtotalsamplesizescanincludeasfewas
8observationswithoutinflatingthealpharate.
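Because the rule concerns expected (not observed) frequencies, it helps to inspect them directly. A small check using scipy's chi2_contingency, with a made-up 2x2 table of counts (our example, not the article's):

```python
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[12, 8],   # made-up 2x2 contingency table;
                     [6, 14]])  # each person contributes one count

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, df = {dof}")

# The rule of thumb applies to the EXPECTED frequencies:
if expected.min() < 5:
    print("warning: an expected frequency falls below 5")
print(expected)
```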
Number of Participants: Factor analysis.

A good general rule of thumb for factor analysis is 300 cases (Tabachnick & Fidell, 1996) or the more lenient 50 participants per factor (Pedhazur & Schmelkin, 1991). Comrey and Lee (1992; see Tabachnick & Fidell, 1996) give the following guide for sample sizes: 50 as very poor, 100 as poor, 200 as fair, 300 as good, 500 as very good, and 1000 as excellent.

Caveat. Guadagnoli and Velicer (1988) have shown that solutions with several high-loading marker variables (> .80) do not require as many cases.
Conclusion

This article addresses the definition of power and its relationship to Type I and Type II errors. Researchers can manipulate power with sample size. Not only does proper sample selection improve the probability of detecting a difference or association, researchers are increasingly called upon to provide information on sample size in their human respondent protocols and manuscripts (including effect sizes and power calculations). The provision of this level of analysis regarding sample size is a strong recommendation of the Task Force on Statistical Inference (Wilkinson, 1999), and is now more fully elaborated in the discussion of "what to include in the Results section" of the new fifth edition of the American Psychological Association's (APA) publication manual (APA, 2001). Finally, researchers who do not have access to large samples should be alert to the resources available for minimizing this problem (e.g., Hoyle, 1999).
References

American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Aron, A., & Aron, E. N. (1999). Statistics for psychology (2nd ed.). Upper Saddle River, NJ: Prentice Hall.
Cohen, B. H. (2001). Explaining psychological statistics (2nd ed.). New York, NY: John Wiley & Sons, Inc.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Erlbaum.
Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45, 1304-1312.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cohen, J., & Cohen, P. (1975). Applied multiple regression/correlation analysis for the behavioral sciences. Hillsdale, NJ: Erlbaum.
Comrey, A. L., & Lee, H. B. (1992). A first course in factor analysis (2nd ed.). Hillsdale, NJ: Erlbaum.
Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of behavioral measurements: Theory of generalizability for scores and profiles. New York: Wiley.
Fisher, R. A. (1925/1990). Statistical methods for research workers. Oxford, England: Oxford University Press.
Green, S. B. (1991). How many subjects does it take to do a regression analysis? Multivariate Behavioral Research, 26, 499-510.
Guadagnoli, E., & Velicer, W. F. (1988). Relation of sample size to the stability of component patterns. Psychological Bulletin, 103, 265-275.
Harris, R. J. (1985). A primer of multivariate statistics (2nd ed.). New York: Academic Press.
Howell, D. C. (1997). Statistical methods for psychology (4th ed.). Belmont, CA: Wadsworth.
Hoyle, R. H. (Ed.). (1999). Statistical strategies for small sample research. Thousand Oaks, CA: Sage.
Kerlinger, F., & Lee, H. (2000). Foundations of behavioral research. New York: International Thomson Publishing.
Kraemer, H. C., & Thiemann, S. (1987). How many subjects? Statistical power analysis in research. Newbury Park, CA: Sage.
Marcoulides, G. A. (1993). Maximizing power in generalizability studies under budget constraints. Journal of Educational Statistics, 18(2), 197-206.
Neyman, J., & Pearson, E. S. (1928/1967). On the use and interpretation of certain test criteria for purposes of statistical inference, Part I. Joint Statistical Papers. London: Cambridge University Press.
Pearson, E. S. (1990). 'Student': A statistical biography of William Sealy Gosset. Oxford, England: Oxford University Press.
Pedhazur, E. J., & Schmelkin, L. P. (1991). Measurement, design, and analysis: An integrated approach. Hillsdale, NJ: Erlbaum.
Resnick, D. B. (2006, Spring). Bioethics bulletin. Retrieved September 22, 2006, from http://dir.niehs.nih.gov/ethics/news/2006spring.doc. Washington, DC: National Institute of Environmental Health Sciences.
Strasak, A. M., Zaman, Q., Pfeiffer, K. P., Goebel, G., & Ulmer, H. (2007). Statistical errors in medical research: A review of common pitfalls. Swiss Medical Weekly, 137, 44-49.
Tabachnick, B. G., & Fidell, L. S. (1996). Using multivariate statistics (3rd ed.). New York: HarperCollins.
Wilkinson, L., & Task Force on Statistical Inference, APA Board of Scientific Affairs. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.
Williams, J. L., Hathaway, C. A., Kloster, K. L., & Layne, B. H. (1997). Low power, type II errors, and other statistical problems in recent cardiovascular research. Heart and Circulatory Physiology, 273(1), 487-493.
Wolins, L. (1982). Research mistakes in the social and behavioral sciences. Ames: Iowa State University Press.
Manuscript received October 21, 2006
Manuscript accepted November 5, 2007