Feature Article—Changes to the accuracy of labour statistics
Recently the Australian Bureau of Statistics (ABS) has made changes to the way it
collects labour market data. These changes have reduced the accuracy of results
published as the ABS labour force statistics.
Labour Force Survey
Decision makers need timely and accurate social and economic indicators of
what is happening in the economy. Among the key indicators are labour market
data. The ABS has been producing labour force data on a monthly basis for more
than 30 years.
It is not practical to ask every person in Australia about their labour market
status every month. Hence the ABS conducts a sample survey, the Labour Force
Survey (LFS). The LFS randomly selects a number of households from the
population and questions the occupants on their labour force status and
experience. About 30 000 households are surveyed each month.
All people aged 15 years and over in these households make up the sample; in
June 2008 this was 54 900 people. This sample is used to calculate all LFS
estimates, including such important and widely reported statistics as the
number of people employed, the number unemployed, the unemployment rate, and
the labour force participation rate.
Sample size and accuracy
Ultimately the size of the sample largely determines the accuracy of the
statistics calculated from that sample. In general terms a small sample has
lower statistical accuracy and a large sample has higher accuracy. High
accuracy is clearly desirable. However, surveying a large number of people
costs more than surveying a few people. So the sample size must be chosen to
achieve an accurate outcome while at the same time minimising costs.
Whenever a statistic such as the level of unemployment is calculated from a
single random sample, we do not know whether the statistic is high, low or just
about right. This is because there is a very large number of samples that we
could have taken and we have only one of them. Each sample will produce a
slightly different result, so some samples will produce a low estimate of the
statistic being estimated and other samples will produce a high estimate.
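The variation between samples can be illustrated with a small simulation. The sketch below is hypothetical throughout: the 5 per cent rate, the sample size of 2 000 and the simple random draws are assumptions chosen for illustration, not ABS figures or the actual LFS design.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

TRUE_RATE = 0.05     # hypothetical "true" unemployment rate in the population
SAMPLE_SIZE = 2_000  # hypothetical number of people in each sample


def estimate_rate(sample_size: int, true_rate: float) -> float:
    """Draw one simple random sample and return the share found unemployed."""
    unemployed = sum(random.random() < true_rate for _ in range(sample_size))
    return unemployed / sample_size


# Ten different samples from the same population give ten different estimates.
estimates = [estimate_rate(SAMPLE_SIZE, TRUE_RATE) for _ in range(10)]
print(f"lowest estimate:  {min(estimates):.4f}")
print(f"highest estimate: {max(estimates):.4f}")
```

Some of the ten estimates fall below the assumed true rate and some above it, even though nothing about the population has changed.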
Statisticians measure accuracy (how close the statistic is likely to be to the
true value) using a statistic called the standard error of the estimate. There
is about a 68 per cent chance that the true value lies within one standard
error of the sample statistic and about a 95 per cent chance that the true
value is within two standard errors of the sample statistic.
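The 68 and 95 per cent figures follow from the usual assumption that sample estimates are approximately normally distributed around the true value. A minimal check of that rule of thumb, using only the standard normal distribution (the function name is illustrative):

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normally distributed estimate lies within
    k standard errors of the true value."""
    return math.erf(k / math.sqrt(2))


one_se = normal_coverage(1)  # roughly 0.68
two_se = normal_coverage(2)  # roughly 0.95
print(f"within one standard error:  {one_se:.1%}")
print(f"within two standard errors: {two_se:.1%}")
```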
This goes to the heart of the changes now made to the way that the ABS collects
data to produce a picture of the labour force. According to the ABS, its tight
2008–09 budget has forced reductions in its statistical work program. As a
result the ABS has decided to reduce the LFS sample size across Australia by
about 24 per cent, i.e. from 54 900 in June 2008 to around 41 900 in July 2008.
With this has come an inevitable reduction in accuracy, which shows up as an
increase in the standard errors associated with each statistic.
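As a rough guide to the size of the effect, the sketch below checks the 24 per cent figure and applies the textbook rule that, under simple random sampling, the standard error is proportional to one over the square root of the sample size. That rule is an assumption here: the LFS uses a more complex household-based design, so the published standard errors need not change by exactly this factor.

```python
import math

n_june = 54_900  # people in the sample, June 2008
n_july = 41_900  # people in the sample, July 2008 (approximate)

# Proportional cut in the sample size.
reduction = 1 - n_july / n_june

# Expected growth in standard errors if SE is proportional to 1/sqrt(n)
# (simple random sampling assumption, not the actual LFS design).
se_ratio = math.sqrt(n_june / n_july)

print(f"sample reduced by {reduction:.1%}")
print(f"standard errors expected to grow by a factor of about {se_ratio:.2f}")
```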
This loss of accuracy can be shown in two ways. One is to calculate what is
called the relative standard error (RSE), which is the ratio of the standard
error to the value of the statistic, expressed as a percentage. Thus between
June and July 2008, for the level of unemployment, the relative standard error
increased from 2.7 per cent to 3.3 per cent. The other way to show the loss of
accuracy is simply to use the standard errors, which are conceptually easier to
grasp than RSEs and are published by the ABS.
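The RSE definition can be written as a one-line calculation. The function name and the example values below (an estimate of 1 000 with a standard error of 27) are hypothetical and chosen only to make the arithmetic easy to follow; they are not ABS figures.

```python
def relative_standard_error(standard_error: float, estimate: float) -> float:
    """Return the standard error as a percentage of the estimate."""
    return 100 * standard_error / estimate


# Hypothetical example: an estimate of 1 000 with a standard error of 27.
rse = relative_standard_error(27, 1_000)
print(f"RSE = {rse:.1f} per cent")
```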
Effect on estimates
Every statistic calculated from a survey such as the LFS has an associated
standard error. For the level of unemployment, for example, the standard error
was 12 400 in June 2008. This meant that in June 2008 we could be 68 per cent
confident that the true value for unemployment lay in the range from 12 400
below the reported statistic to 12 400 above it, a range two times 12 400, or
24 800 people, wide. At the 95 per cent level the true value lay in a range
from twice 12 400 below the reported statistic to twice 12 400 above it; this
made the range 49 600 people wide at the 95 per cent level.
In July 2008, the published standard error had grown by 2200, from the 12 400
of June 2008 to 14 600. This means that the range had widened to 29 200 people
at the 68 per cent level and 58 400 at the 95 per cent level.
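The range widths quoted above follow directly from the published standard errors. A minimal sketch of the arithmetic (the function name is illustrative; the standard errors are those given in the text):

```python
def interval_width(standard_error: int, k: int) -> int:
    """Width of a range running from k standard errors below the
    reported statistic to k standard errors above it."""
    return 2 * k * standard_error


# Published standard errors for the level of unemployment.
published = {"June 2008": 12_400, "July 2008": 14_600}

for month, se in published.items():
    width_68 = interval_width(se, 1)  # one standard error each side
    width_95 = interval_width(se, 2)  # two standard errors each side
    print(f"{month}: {width_68} wide at 68 per cent, {width_95} wide at 95 per cent")
```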
Not only is it possible to calculate standard errors for the levels of the
various LFS statistics, it is also possible to calculate them for changes in
levels, i.e. for movements. Movements are often quoted in the media, for
example when the level of unemployment rises or falls. In July 2008 the
published standard error on the change in the level of unemployment was 15 300,
an increase of 2000 on the June 2008 figure of 13 300. This makes the range for
one standard error 4000 people wider and the range for two standard errors 8000
people wider.
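The widening of the range for movements depends only on the increase in the standard error. A short sketch of that calculation, using the figures from the text (the function name is illustrative):

```python
def widening(se_before: int, se_after: int, k: int) -> int:
    """Extra width of the range of k standard errors either side of the
    estimate when the standard error rises from se_before to se_after."""
    return 2 * k * (se_after - se_before)


se_june = 13_300  # standard error on the monthly change, June 2008
se_july = 15_300  # standard error on the monthly change, July 2008

print(widening(se_june, se_july, 1))  # extra width at one standard error
print(widening(se_june, se_july, 2))  # extra width at two standard errors
```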
The bottom line is that the accuracy of the estimates produced by the LFS has
declined. Depending on the certainty needed, the widths of the confidence
intervals on levels and on movements have grown: in the case of unemployment by
as much as 8800, of employment by as much as 28 800, and of the labour force by
as much as 29 200.
Monthly statistical bulletin tables 1.1 to 1.5
ABS LFS data appears in the Monthly statistical
bulletin tables 1.1 to 1.5. All data in these tables from and including
July 2008 are now less accurate than they have been.
Greg Baker
Statistics and Mapping Section
