Method of reducing the Curve of Probability to the scale of any Curve of Frequency.
Criteria by which the homogeneity of a curve is to be judged.
Two main elements in any series.
Appendix, Note II, p. 125). If this standard is generally used it will be necessary to alter
the units of measurement in each Curve of Frequency, so that if the observations did fit
the Law of Errors the Curves of Frequency and Probability would coincide.
Such a method has the advantage that the form of the curve is always the same
and can be easily retained in the pictorial imagination. The calculations involved are,
however, so exceedingly laborious, that this method of procedure is impracticable when
the number of series is as large as in the present case.
An equally efficacious and much less laborious method is that of reducing the Curve
of Probability to the standard of the Frequency Curve instead of vice versa. For the
method by which this is done see Appendix (Note III, p. 126).
This is the procedure which has been adopted throughout in the diagrams
representing Frequency Curves and Probability Curves, which are reproduced in the
supplement to this volume. Every separate series consequently has its own particular
Curve of Probability, and it is from the gradient of that curve alone that any information
can be obtained.
The criteria by which the homogeneity of the series must be judged are of two kinds,
viz.:—
1. Those depending directly on graphic representation; viz. the extent of the
coincidence or divergence of the two curves, that of Frequency and that of Probability,
and the place or places where the divergences occur.
2. The more subtle characteristics of the curves, which are not readily appreciated
by the eye and are therefore more conveniently expressed by mathematical symbols and
formulae.
It is now necessary to give a brief description of these symbols for the sake of those
readers who are not practised in this style of work.
A, M, R, r, y, c are the symbols which are used in the diagrams given in the Supplement.
Of these A is the area of either the Curve of Probability or that of Frequency, i.e.
it is the number of specimens in the series under consideration.
M is the arithmetical mean of the series. R is the measure of the oscillation or
instability of that mean, r is the ‘probable error’ of the series, y is the average error
of the specimens.
c is the modulus of the curve and serves to determine its shape in conjunction with
$\frac{A}{c\sqrt{\pi}}$, which represents the number of specimens which in the Probability Curve would
occur at the mean.
In the Appendix (Note IV, p. 128) is given an example of the working out of these
values for a particular series which will show exactly how each of them is obtained. We
may now proceed to show their application and incidentally to illustrate how they are
obtained.
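The part played by A, M and c in fixing the Curve of Probability may be sketched as follows. This is only an illustration under stated assumptions: the curve is taken in its usual modulus form, with the ordinate at the mean equal to A/(c√π); the modulus is derived from the probable error by the standard constant 0.476936, which is not quoted in this volume; and the function names and the figures for A, M and r are hypothetical.

```python
import math

# Ratio of the probable error to the modulus under the Law of Errors.
# Assumption: a standard constant, not taken from this volume.
PROBABLE_ERROR_PER_MODULUS = 0.476936


def modulus_from_probable_error(r):
    """Modulus c of the curve, assuming r = 0.476936 * c."""
    return r / PROBABLE_ERROR_PER_MODULUS


def probability_ordinate(x, A, M, c):
    """Specimens per unit of measurement expected at x on the Probability Curve."""
    peak = A / (c * math.sqrt(math.pi))  # ordinate at the mean M
    return peak * math.exp(-((x - M) / c) ** 2)


# Hypothetical series: 200 specimens, mean 140, probable error 3.
A, M, r = 200, 140.0, 3.0
c = modulus_from_probable_error(r)
peak = probability_ordinate(M, A, M, c)
```

On these assumptions the area under the curve is A, and the half of it lying between M - r and M + r corresponds to the definition of the probable error.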
In judging of the nature of any given series for which a Probability Curve has
been constructed there are two main elements of which the stability must be estimated,
viz.:—
1. The series itself in relation to its own standard or type.
2. The standard or type itself.
1. The consistency of the series in relation to its standard or type may be judged
according to either of two methods, the important symbols of which are named respectively
the ‘error of mean square’ (or the ‘standard deviation’ as it is generally called by
biometricians) and the ‘probable error’; the latter only is used in this volume*.
The probable error, r, gives the units within which half the examples in the
whole series occur. That is to say, r is the quantity which, if added to M on the one side
and subtracted from M on the other side, accounts for half the individuals in the series. It
is evident therefore that the largeness or smallness of the value of r gives immediately an
approximate idea of the compactness of a series or its reverse. Consequently it is most
useful in cases of comparison. Suppose, for example, that M and r have been determined
for a series made up of several smaller series, then, if in any one of the component series
it is found that the proportion of examples lying between the limits of M ± r is not
nearly an exact half, the amount of the variation will show the extent to which the elements
forming that component series differ from the predominant type in the compound
series.
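The comparison just described may be sketched in a few lines. The function name, the values of M and r, and the component series are all hypothetical; the sketch merely counts what proportion of a component series falls between M - r and M + r and reports its departure from an exact half.

```python
def proportion_within(values, M, r):
    """Fraction of values lying between M - r and M + r (limits included)."""
    inside = sum(1 for v in values if M - r <= v <= M + r)
    return inside / len(values)


# Hypothetical M and r of a compound series, and one of its component series.
M, r = 140.0, 3.0
component = [135, 137, 138, 139, 140, 141, 142, 144, 146, 148]

share = proportion_within(component, M, r)  # proportion inside the limits
departure = share - 0.5                     # variation from an exact half
```

A departure near zero suggests that the component series agrees with the predominant type; how large a departure is to be regarded as significant is left open here.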
2. The second element of which the stability must be estimated is that of the type
itself, the standard from which the other individuals deviate. This is the pivot on which
the whole turns, as a balance upon its axis. If the axis itself is unsteady, the balance
itself must be doubly unsteady as it adds the elements of its own instability to those of
the axis. Suppose that the type itself cannot be fixed at all closely, but varies within the
limits of, say, several units along the base of the Frequency Curve; then, as a natural
consequence, the measurement of the deviation of any individual observation from the
type is uncertain to exactly the same degree. The type, as previously stated, is given by
the apex of the curve. In a pure curve, as in the Curve of Probability itself, the apex is
unmistakably defined, and is in the same position as the arithmetical mean. In an
impure curve, however, the apex is ill-defined or perhaps there may be more than one
apparent apex. The difficulty then arises as to the point at which the type may really be
said to be found, e.g. in the impure curve in Fig. 14 it might be said that the type was at
140 or at 146 or even at some point between them.
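The difficulty of an impure curve with more than one apparent apex may be illustrated by a small sketch. The counts below are invented for the purpose and are not the data of Fig. 14; the function name is hypothetical, and an apex is here taken simply as a measurement whose count exceeds that of both its neighbours, so that flat tops are ignored.

```python
def apexes(frequencies):
    """Return the measurements whose count exceeds both neighbouring counts.

    `frequencies` maps each measurement to its number of specimens.
    """
    xs = sorted(frequencies)
    found = []
    for i in range(1, len(xs) - 1):
        here = frequencies[xs[i]]
        if here > frequencies[xs[i - 1]] and here > frequencies[xs[i + 1]]:
            found.append(xs[i])
    return found


# Invented counts for illustration only (not taken from Fig. 14).
impure = {136: 2, 138: 6, 140: 11, 142: 8, 144: 9, 146: 12, 148: 5, 150: 1}
peaks = apexes(impure)  # two apparent apexes: 140 and 146
```

With two apexes returned, the type cannot be placed at a single point without further judgment, which is the situation described above.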
The stability of the type itself is judged by the nearness of its approach to identity
with the arithmetical mean. As a matter of fact the outline of the Frequency Curve
always shows fairly clearly whether the type varies within wide limits or not. For purposes
of comparison, however, it is useful to have a mathematical expression for this variation.
This may be obtained in the following way. The arithmetical mean of the series under
consideration is first calculated: this is, of course, obtained by adding together all the
individual values and dividing the result by their number: this is symbolized by M. We
next calculate y, which, as we have seen, is done by adding together all the differences of
the individual values from M, leaving out of account the plus or minus sign of these
differences: dividing this result by the number of examples in the series we arrive at
the formula $y = \frac{S(|\delta|)}{n}$, where $\delta$ is the difference of an individual value from M, $n$ the
number of examples, and $S$ denotes summation. From this we obtain the probable error r by
multiplying y by the constant number 0.8453.
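The steps just described, from M to y and from y to r, may be set out as a short sketch. The function names and the series of measurements are hypothetical; the constant 0.8453 is the one given in the text.

```python
def mean(values):
    """Arithmetical mean M: the sum of the values divided by their number."""
    return sum(values) / len(values)


def average_error(values):
    """Average error y: mean of the differences from M, signs disregarded."""
    M = mean(values)
    return sum(abs(v - M) for v in values) / len(values)


def probable_error(values):
    """Probable error r, obtained by multiplying y by the constant 0.8453."""
    return 0.8453 * average_error(values)


# Hypothetical series of eight measurements.
series = [136, 138, 139, 140, 140, 141, 142, 144]
M = mean(series)            # 140.0
y = average_error(series)   # 14 / 8 = 1.75
r = probable_error(series)  # 0.8453 * 1.75
```

For this invented series the differences from M sum to 14 without regard to sign, so that y is 1.75 and r is a little under 1.48.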
We now proceed further to the application of this work to determine the stability
of the type. We divide the value of r so found by the square root of the number of
* The error of mean square, or the standard deviation, symbolized by $\epsilon$, is obtained as follows. Suppose
there are $n$ observations in a series, let M stand for the arithmetical mean, $\delta$ for the difference of any individual
from the mean. Then, $S$ denoting summation for all the individual cases, $\epsilon$ is defined by the formula $\epsilon^2 = \frac{S(\delta^2)}{n}$.