by Kim Smith
01 May 1999
This article deals with the audit methodologies of audit sampling and other selective
testing procedures from both theoretical and practical perspectives. The paper 6, Audit
Framework Teaching Guide includes:
This is assumed knowledge at paper 10, Accounting and Audit Practice, where students
should be able to “explain the use of audit sampling in the conduct of an audit”.
On the theoretical side, the professional guidance referred to is SAS 430, Audit
Sampling (UK Stream) and ISA 530, Audit Sampling and Other Selective Testing
Procedures (International Stream).
On the practical side, audit efficiency relies on obtaining the minimum audit evidence,
sufficient to form the audit opinion, as cost effectively as possible. To this end,
formalised audit sampling procedures have been developed and become commonplace
in the majority of audit firms. Using audit sampling on all audit assignments offers
auditors many benefits. These include:
Nature of sampling
Audit evidence is obtained by carrying out audit tests which may be classified as ‘non-
sampling’ or ‘sampling’.
A definition — Audit sampling is the testing of less than 100% of the items within a population to
obtain and evaluate evidence about some characteristic of that population, in order to form a
conclusion concerning the population.
Both statements of standards concern audit sampling in general i.e., both statistical and
non-statistical sampling. (It is a common misconception that these standards concern
only statistical sampling.) However, the ISA considers selective testing other than audit
sampling in more detail than the SAS.
The ISA definition of audit sampling specifically states that “all sampling units have a
chance of selection”. Thus, 100% examination and selecting specific items are clearly
non-sampling procedures.
A definition — Sampling risk is the risk that the sample is not representative of the population
from which it is drawn and thus the auditor’s conclusion is different to that which would be
reached if the whole population was examined.
(a) ‘the risk of incorrect rejection’ (also called Alpha risk) which arises when the sample
indicates a higher level of errors than is actually the case. This situation is usually
resolved by additional audit work being performed. This risk affects audit efficiency but
should not affect the validity of the resulting audit conclusion;
(b) ‘the risk of incorrect acceptance’ (also called Beta risk) which arises when material error is not
detected in a population because the sample failed to select sufficient items containing
errors. This risk, which affects audit effectiveness, can be quantified using statistical
sampling techniques. Although it is possible that an unqualified auditors’ report could be
issued inappropriately, such errors should be detected by other complementary audit
procedures (assuming that the sample size is appropriate to the level of detection risk).
Non-sampling risk is the component of detection risk that is not due to examining only a
portion of the data. Examples of sources of non-sampling risk include:
Selective testing which does not constitute audit sampling (e.g., selection of risk-prone
items) is also subject to non-sampling risk. ISA 530 defines non-sampling risk as the
risk which “arises from factors that cause the auditor to reach an erroneous conclusion
for any reason not related to the size of the sample”. Thus non-sampling risk can also
arise, for example, if the auditor fails to recognise an error in an individual item in a
sample. The auditor seeks to minimise the risk of erroneous conclusions by proper
planning, supervision and review.
A confidence level is the degree of assurance that material error does not exist; it is the
converse of risk.
Reliability (R-) factors are derived from the Poisson sampling distribution (a distribution
of ‘rare events’) and are related to risk percentages as shown in Figure 1. Note the
‘inverse’ nature of the relationship between R-factors and risk and that a confidence
level is the mathematical complement of risk.
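The relationship between risk, confidence and R-factors can be sketched in a few lines of Python. This is a minimal illustration rather than a reproduction of Figure 1. It assumes that no errors are expected in the sample, in which case the Poisson probability of finding no errors is e^(-R), so the R-factor is the negative natural logarithm of the risk. The function name r_factor is hypothetical.

```python
import math

def r_factor(risk):
    """Zero-expected-error reliability factor for a sampling risk (0 < risk < 1).

    Assumption: with no errors expected, risk = exp(-R), so R = -ln(risk).
    """
    if not 0 < risk < 1:
        raise ValueError("risk must be strictly between 0 and 1")
    return -math.log(risk)

# Lower risk (higher confidence) gives a higher R-factor: the 'inverse' relationship.
for risk in (0.01, 0.05, 0.10, 0.14, 0.37):
    confidence = 1 - risk  # confidence is the complement of risk
    print(f"risk {risk:.0%}  confidence {confidence:.0%}  R-factor {r_factor(risk):.2f}")
```

On this assumption a 5% risk (95% confidence) corresponds to an R-factor of about 3.0 and a 14% risk to an R-factor of about 2.0.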
The use of R-factors (and related methods) is popular. It makes determination of sample
size easy, avoids the need to carry statistical tables and is compatible with the Audit
Risk Model as illustrated in Figure 2. In this illustration, the risk of errors arising
(inherent risk) is high, but assurance is planned to be obtained from tests of control. The
auditor’s tests of detail will therefore be planned at a level corresponding to sampling
risk of not more than 14% (see Figure 4 later).
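The arithmetic behind such an illustration can be sketched as follows, assuming the usual multiplicative form of the Audit Risk Model (audit risk = inherent risk x control risk x detection risk). The input figures are assumptions chosen for illustration and are not taken from Figure 2; detection risk is treated here as if it were entirely sampling risk.

```python
import math

def detection_risk(audit_risk, inherent_risk, control_risk):
    """Solve AR = IR x CR x DR for the detection risk DR (capped at 100%)."""
    if inherent_risk <= 0 or control_risk <= 0:
        raise ValueError("inherent and control risk must be positive")
    return min(audit_risk / (inherent_risk * control_risk), 1.0)

# Assumed inputs: 5% acceptable audit risk, inherent risk assessed as high (100%),
# and tests of control supporting a control risk of 37%.
ar, ir, cr = 0.05, 1.00, 0.37
dr = detection_risk(ar, ir, cr)

print(f"detection risk for tests of detail: {dr:.1%}")
# Equivalent zero-error R-factor, assuming risk = exp(-R)
print(f"corresponding R-factor: {-math.log(dr):.2f}")
```

Because the risks multiply, the corresponding R-factors add: an overall factor of about 3.0 less about 1.0 obtained from tests of control leaves about 2.0 to be obtained from tests of detail.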
Deciding to sample
When planning the audit procedures to be adopted, the decision to sample account
balances and transactions is influenced by:
Sample design, which may be set out in a sample plan, includes consideration of:
Population
A definition – Population is the entire set of data from which a sample is selected and about
which the auditor wishes to draw a conclusion.
A population may be ‘stratified’, that is, divided up into sub-populations. Each
sub-population is a group of sampling units having similar characteristics. For example,
debtors may be stratified into current debts, debts 1–3 months overdue, debts 3–6 months
overdue and debts more than 6 months overdue.
For tests of control the population must have the same control characteristics. So, for
example, suppliers’ invoices for raw materials will be distinct from suppliers’ invoices
for services because the former should evidence the receipt of goods.
For substantive procedures the population could be a list of ledger balances or debit
entries to individual ledger accounts or all transactions of a particular type (e.g., weekly
wages).
Sampling unit
A sampling unit may be, for example, an account balance, a debit (or credit) entry on a
bank statement, a goods received note or a monetary unit (i.e., £ or $).
Sample size
Sample size is determined by:
• assurance required;
• tolerable and expected error (or deviation rate); and
• stratification.
Absolute assurance cannot be achieved through sampling procedures. The lower the
assurance required, the smaller the required sample size. The tolerable error (or
deviation rate) is also called precision. It is the maximum error (or deviation rate) that
can be accepted to conclude that the audit objective has been achieved. (The combined
tolerable error for all audit tests is sometimes called gauge.)
For substantive tests, precision may be expressed as a monetary amount (which is less
than overall materiality) or a percentage of population value. For tests of control,
precision is the maximum rate of failure of an internal control that can be accepted in
order to place reliance on it (and is therefore likely to be small).
Errors increase the imprecision of results from sampling. Therefore, if they are expected,
a larger sample size is required.
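How assurance and tolerable error drive sample size can be sketched as below. The formula used (population value multiplied by the R-factor and divided by tolerable error) is a commonly used approach for monetary unit samples and is an assumption here, since the sample plan in Figure 3 is not reproduced. The figures are hypothetical and expected errors are not modelled.

```python
import math

def sample_size(population_value, tolerable_error, r_factor):
    """Assumed formula: n = population value x R-factor / tolerable error, rounded up."""
    if tolerable_error <= 0:
        raise ValueError("tolerable error must be positive")
    return math.ceil(population_value * r_factor / tolerable_error)

population_value = 1_000_000  # hypothetical ledger total
tolerable_error = 50_000      # hypothetical tolerable error (precision)

# A lower R-factor (less assurance required) gives a smaller sample.
for r in (3.0, 2.0, 1.0):
    print(f"R-factor {r}: sample size {sample_size(population_value, tolerable_error, r)}")

# Halving the tolerable error doubles the sample at the same level of assurance.
print(sample_size(population_value, 25_000, 3.0))
```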
Stratification
To conclude this section, a sample plan for a substantive test is set out in Figure 3.
Figure 4 illustrates a sample plan for a test of controls.
2 Sample selection
The aim of audit sampling is to form a conclusion about the population from which a
sample is obtained. It is, therefore, necessary to ensure that the method of sample
selection can be expected to produce a representative sample with each item in the
population having a chance of being selected.
The distinction between statistical and non-statistical sampling should be made clear
before considering the methods of selection.
Statistical sampling
The two main types of statistical sampling are attribute sampling and variables
sampling.
Attribute sampling is concerned with testing items which can have only two possible
values (e.g., 0 or 1) or attributes (e.g., correct or incorrect). It is used to provide
information about rates of occurrence of events or characteristics. It is most widely used
in tests of control (to determine rates of non-compliance within control procedures) and
Monetary Unit Sampling (see later).
Variables sampling is concerned with testing items which can take any value within a
continuous range and is therefore used in substantive tests of details.
Non-statistical sampling
Non-statistical sampling is any approach to sampling which does not fulfil all the
characteristics set out above for statistical sampling. Such approaches are often
referred to as judgement sampling.
However, as statistical sampling
• random number;
• systematic;
• haphazard.
Random number selection — every item in a population has the same statistical
probability of being selected as every other item. Random numbers are selected using a
computer program or random number tables.
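Where the population is numbered sequentially, a computer-generated selection might look like the following minimal sketch. The function name and the document numbering are hypothetical. Python's standard random module is used, and a fixed seed is an optional assumption that allows the selection to be re-performed by a reviewer.

```python
import random

def random_selection(population_size, sample_size, seed=None):
    """Select item numbers 1..population_size without replacement.

    Every item has the same probability of being selected.
    """
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# For example, 25 goods received notes from a hypothetical sequence of 500
selection = random_selection(population_size=500, sample_size=25, seed=1999)
print(selection)
```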
Haphazard selection — this method attempts to give all items in a population a chance
of selection by choosing items haphazardly. To avoid conscious bias, the auditor must
guard against favouring middle items, ignoring first and last items, selecting unusual
items, etc. Sometimes it is the only practical method (in terms of time and cost) of selecting a
sample from a population which cannot be accessed using a numerical sequence.
Though sometimes used for non-statistical sampling, it is not sufficiently rigorous for
statistical sampling.
Value weighted selection — this systematic selection method uses currency unit values,
rather than the items, as the sampling population. Each individual pound is given an
equal chance of selection. Since an individual pound cannot be examined in isolation, the
item in which the selected pound lies is tested. Using this method, high-value items have a greater chance of
being selected. Random number selection could be employed but usually the method
involves cumulative totalling of currency values (which can be time consuming unless
computer-assisted). This is illustrated in Figure 6. The determination of Cumulative
Monetary Amounts (CMA) is frequently used in Monetary Unit Sampling (MUS). MUS
is a statistical sampling technique used for substantive testing which tests each $1 (or
£1) to see if it is correct or incorrect (i.e., a form of attribute sampling).
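The cumulative totalling described above can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Figure 6: the balances, the sampling interval and the starting point are hypothetical, all amounts are assumed to be positive whole currency units, and the function name mus_select is invented for the example.

```python
import random
from itertools import accumulate

def mus_select(amounts, interval, start=None, seed=None):
    """Return the 0-based positions of the items containing each selected monetary unit.

    amounts  : recorded values of the items (assumed positive whole units)
    interval : sampling interval in currency units
    start    : first monetary unit selected; picked at random within the
               first interval when not supplied
    """
    if start is None:
        start = random.Random(seed).randint(1, interval)
    selected = []
    target = start
    for position, cma in enumerate(accumulate(amounts)):
        # cma is the cumulative monetary amount up to and including this item
        while target <= cma:
            if not selected or selected[-1] != position:
                selected.append(position)  # an item hit twice is still tested once
            target += interval
    return selected

amounts = [400, 1200, 250, 3100, 90, 760]  # hypothetical ledger balances
picked = mus_select(amounts, interval=1000, start=300)
print([amounts[i] for i in picked])
```

Any item at least as large as the sampling interval is certain to be selected, which is how the method weights selection towards high-value items.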
Value weighted selection can also be used in non-statistical sampling, for example by
stratifying the population and selecting more items from a particular stratum. An example
of this is the ‘two strata’ sampling method which combines 100% selection of high-
value (key) items with the sampling of items from two strata in the remaining
population. The boundary between the two strata may be calculated as (for example)
twice the average population value. The sample selected from the two strata is then
weighted towards the higher value stratum items.
Block sampling — consists of the selection ‘en bloc’ of adjacent transactions or items.
Block selection has two main drawbacks:
(i) a block selected may not be typical of the characteristics in the population as a whole;
and
(ii) relatively few blocks may be selected. It is, therefore, unlikely that a reasonable
conclusion can be drawn.
Nevertheless, significant savings in audit time can result and practical considerations
may dictate its use. For example, when conducting an audit over numerous branches, it
may be appropriate to select just one week’s payroll or one month’s postings to the
general ledger or all customer accounts beginning with the same letter.
The errors or deviations detected should be analysed and used to estimate the total error
or deviation rate in the population. The risk that the actual error or deviation rate may
exceed the tolerable error should be assessed.
When analysing the errors or deviations (as defined when planning the sample), their
nature, cause and possible impact on other audit areas and the financial statements as a
whole should be considered. If they have a common and potentially significant feature, a
sub-population of items possessing that feature may be identified for further testing.
For substantive tests there are two quantitative methods of error projection. Their use
depends on whether or not the error relates closely (i.e., is proportional) to the size of the
item.
Ratio method
This method is used if errors relate closely to the size of the items (i.e., small errors in
small balances, large errors in large balances). An example would be sales invoices for
the first week of January all being priced at December prices.
The projected error is estimated as:
$$\text{Error found in sample} \times \frac{\text{Population value}}{\text{Sample value}}$$
This is illustrated in Figure 7. To this must be added the actual errors in items examined
100% (if any) to give a total estimate of error. (Refinements of this method, for example,
using ‘error taintings’, ‘rankings’ and ‘precision-gap widening’ are beyond the scope of
this article, the professional guidance and the ACCA’s examination syllabuses!)
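A minimal sketch of the ratio method follows, assuming the usual formulation in which the error found in the sample is scaled up by the ratio of population value to sample value. The figures are hypothetical and are not those of Figure 7. The population value is taken to exclude items examined 100%, whose actual errors are added separately.

```python
def ratio_projection(sample_error, sample_value, population_value, key_item_errors=0.0):
    """Project sample error in proportion to value, then add errors in 100%-tested items."""
    if sample_value <= 0:
        raise ValueError("sample value must be positive")
    projected = sample_error * population_value / sample_value
    return projected + key_item_errors

# Hypothetical: errors of 600 found in a sample worth 30,000,
# drawn from a sampled population worth 500,000
projected = ratio_projection(600, 30_000, 500_000)
print(f"projected error: {projected:,.0f}")

# Adding actual errors of 1,500 found in items examined 100%
total = ratio_projection(600, 30_000, 500_000, key_item_errors=1_500)
print(f"total estimate of error: {total:,.0f}")
```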
Difference method
This method is used if errors do not relate closely to the size of items but are relatively
constant for all items. A simple example would be if a credit card company charged a
renewal fee of $21 per account instead of $12 per account.
Such errors can be projected by multiplying the average difference between audited (i.e.,
correct) and recorded (i.e., incorrect) amounts (i.e., $9 in the preceding example) by the
total number of items in the population. This amounts to calculating:
$$\text{Error found in sample} \times \frac{\text{Number of items in population}}{\text{Number of items in sample}}$$
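The difference method can be sketched using the renewal fee example. The $21 and $12 fees come from the example above; the sample size and the number of accounts in the population are hypothetical, and every sampled account is assumed to carry the same overcharge.

```python
def difference_projection(sample_error, sample_items, population_items):
    """Error found in sample x number of items in population / number of items in sample."""
    if sample_items <= 0:
        raise ValueError("sample must contain at least one item")
    return sample_error * population_items / sample_items

recorded_fee, correct_fee = 21, 12  # from the renewal fee example
sample_items = 50                   # hypothetical sample size
population_items = 10_000           # hypothetical number of accounts

# Each sampled account is overcharged by the same difference of $9
sample_error = (recorded_fee - correct_fee) * sample_items
projected = difference_projection(sample_error, sample_items, population_items)
print(f"projected error: ${projected:,.0f}")
```

The result is the same as multiplying the average difference of $9 by the 10,000 accounts in the population.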
For tests of control the number of observed deviations divided by the sample size is the
best estimate of the deviation rate in the population from which it was selected, as
illustrated in Figure 8.
Projected errors are not precise measures of the actual errors present in the population.
When using statistical sampling, confidence intervals may be calculated to indicate the
likely range of possible error. Alternatively, and more commonly, judgement is used to
draw a reasonable conclusion.
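For a test of control, the best estimate of the deviation rate and an upper limit on it can be sketched as follows. The upper limit is one possible approach, assumed here for illustration: it uses the Poisson distribution to find the reliability factor for the number of deviations observed and divides it by the sample size. The function names and figures are hypothetical.

```python
import math

def poisson_cdf(k, mean):
    """Probability of observing k or fewer events when `mean` are expected."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k + 1))

def upper_limit_factor(deviations, risk):
    """Find, by bisection, the Poisson mean at which `deviations` or fewer
    would be observed with probability equal to `risk`."""
    low, high = 0.0, 100.0
    for _ in range(100):
        mid = (low + high) / 2
        if poisson_cdf(deviations, mid) > risk:
            low = mid   # mean too small: observing so few is still too likely
        else:
            high = mid
    return (low + high) / 2

def deviation_rates(deviations, sample_size, risk):
    """Return (best estimate, upper limit) of the population deviation rate."""
    estimate = deviations / sample_size
    upper = min(upper_limit_factor(deviations, risk) / sample_size, 1.0)
    return estimate, upper

# Hypothetical: 1 deviation found in a sample of 60, evaluated at 5% risk
estimate, upper = deviation_rates(deviations=1, sample_size=60, risk=0.05)
print(f"best estimate {estimate:.1%}, upper limit {upper:.1%}")
```

If the upper limit exceeds the tolerable deviation rate, the sample does not support the planned reliance on the control at that level of risk, even though the best estimate may be below it.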