The variance (σ²) is a measure of how far the values in a data set are from the mean.
Here is how it is calculated:
- Subtract the mean from each value in the data. This gives you a measure of the distance of each value from the mean.
- Square each of these distances (so that none of them is negative), and add all of the squares together.
- Divide the sum of the squares by the number of values in the data set.
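The three steps above can be sketched in Python as follows. This is only an illustration: the function name `population_variance` and the sample list `[2, 4, 6]` are made up for this example and do not come from the post.

```python
def population_variance(values):
    """Variance following the three steps above (divides by the number of values)."""
    n = len(values)
    mean = sum(values) / n
    # Step 1: distance of each value from the mean
    deviations = [x - mean for x in values]
    # Step 2: square each distance and add the squares together
    sum_of_squares = sum(d ** 2 for d in deviations)
    # Step 3: divide the sum of the squares by the number of values
    return sum_of_squares / n


# Hypothetical data, chosen only so the arithmetic is easy to check by hand:
# mean = 4, deviations = -2, 0, 2, sum of squares = 8, variance = 8 / 3
print(population_variance([2, 4, 6]))
```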
I guess most of you are already familiar with the average (mean), the variance and the standard deviation, the three most basic measures in statistics. This post is more about how to teach variance. Let's say there are 8 test scores, the average is 46 and the variance is 16.
What if each test score is doubled? The average? Sure, that is still easy: it will be 92. How about the variance, or the standard deviation? I am not sure how many can answer this question right away.
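For readers who want to check their answer, here is a short derivation, assuming the variance is defined as above (sum of squared distances from the mean, divided by the number of values). The symbols X_i (original scores) and Y_i (doubled scores) are introduced here only for illustration.

```latex
\begin{align*}
Y_i &= 2X_i \quad\Rightarrow\quad \mu_Y = 2\mu_X = 2 \times 46 = 92 \\
\sigma_Y^2 &= \frac{1}{N}\sum_{i=1}^{N}\left(2X_i - 2\mu_X\right)^2
            = 4 \cdot \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu_X\right)^2
            = 4\,\sigma_X^2 = 4 \times 16 = 64 \\
\sigma_Y &= \sqrt{64} = 8 = 2\,\sigma_X
\end{align*}
```

So doubling every score doubles the mean and the standard deviation, but multiplies the variance by four.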
In order to write the equation that defines the variance, it is simplest to use the summation operator, "Σ". The summation operator is just a shorthand way to write, "Take the sum of a set of numbers."
Data  | X1 | X2 | X3 | X4 | X5 | X6 | X7
Value | 3  | 4  | 9  | 13 | 17 | 22 | 23
Think of the variable (X) as the measured quantity from your experiment and think of the subscript as indicating the trial number (1-7). To calculate the average, first we have to add up the values from each of the seven trials. Using the summation operator, we will write it like this:
ΣX = X1 + X2 + X3 + X4 + X5 + X6 + X7
or:
ΣX = 3 + 4 + 9 + 13 + 17 + 22 + 23
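In code, the summation operator corresponds to a plain sum over a list. A minimal Python sketch using the seven values from the table (the variable names `values` and `total` are chosen here for illustration):

```python
values = [3, 4, 9, 13, 17, 22, 23]  # X1 ... X7 from the table

# The summation operator: X1 + X2 + ... + X7
total = sum(values)
print(total)  # 91
```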
Defining Variance:
Now that you know how the summation operator works, you can understand the equation that defines the variance:
The variance (σ²) is defined as the sum of the squared distances of each term in the distribution from the mean (μ), divided by the number of terms in the distribution (N).
In other words, you subtract the mean from each term, square the results, add the squares together, and divide by the number of terms in the distribution (N).
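Written with the summation operator, the definition above can be expressed as the following equation, where the index i (an addition here, used only to label the terms) runs over the N terms of the distribution:

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - \mu\right)^2
```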
How to do the calculation:
1) First, add your data points together:
3 + 4 + 9 + 13 + 17 + 22 + 23 = 91
Next, divide your answer by the number of data points: 91 ÷ 7 = 13.
Sample mean, x̅ = 13.
*You can think of the mean as the "centre-point" of the data. If the data clusters around the mean, variance is low. If it is spread out far from the mean, variance is high.
2) Subtract the mean from each data point. Each answer tells you that number's deviation from the mean, or in plain language, how far away it is from the mean.
X - x̅ = 3 - 13 = -10
X - x̅ = 4 - 13 = -9
X - x̅ = 9 - 13 = -4
X - x̅ = 13 - 13 = 0
X - x̅ = 17 - 13 = 4
X - x̅ = 22 - 13 = 9
X - x̅ = 23 - 13 = 10
3) The deviations always add up to zero (here -10 - 9 - 4 + 0 + 4 + 9 + 10 = 0), so they cannot simply be added together to measure the spread. To solve this problem, find the square of each deviation. This makes every number positive or zero, so the negative and positive values no longer cancel out.
(-10)² = 100
(-9)² = 81
(-4)² = 16
0² = 0
4² = 16
9² = 81
10² = 100
4) Find the sum of the squared values. Now calculate the entire numerator of the formula, ∑(X - x̅)². The upper-case sigma, "∑", tells you to sum the value of the following term for each value of X. You've already calculated (X - x̅)² for each value of X in your sample, so all you need to do is add the results together.
100 + 81 + 16 + 0 + 16 + 81 + 100 = 394.
5) Divide by n - 1, where n is the number of data points. As it turns out, dividing by "n - 1" instead of "n" gives you a better estimate of the variance of the larger population that the sample was drawn from.
There are seven data points in the sample, so n = 7. Variance of the sample (usually written s², to distinguish it from the population variance σ²) = 394 ÷ 6 ≈ 65.67
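The five steps can be checked with a short Python sketch. The variable names are chosen here for illustration. The cross-check uses the standard library's `statistics.variance` (which divides by n - 1) and `statistics.pvariance` (which divides by n, as in the definition given earlier).

```python
import math
import statistics

data = [3, 4, 9, 13, 17, 22, 23]
n = len(data)

mean = sum(data) / n                       # step 1: 91 / 7
deviations = [x - mean for x in data]      # step 2: X minus the mean
squared = [d ** 2 for d in deviations]     # step 3: square each deviation
sum_sq = sum(squared)                      # step 4: add the squares
sample_var = sum_sq / (n - 1)              # step 5: divide by n - 1
population_var = sum_sq / n                # dividing by n instead

print(round(sample_var, 2))       # 65.67
print(round(population_var, 2))   # 56.29

# Cross-check against the standard library
assert math.isclose(sample_var, statistics.variance(data))
assert math.isclose(population_var, statistics.pvariance(data))
```

The two results differ because of the divisor only: the sum of squares, 394, is the same in both cases.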
As an example, consider the following two data sets:
Data set 1: 3, 4, 4, 5, 6, 8, 10.
Data set 2: 1, 2, 4, 5, 7, 9, 11.
What is the variance of each data set above?
First, try to follow the steps above to find each variance, or construct a table to organize the calculations. You can use the same steps for the results from your own experiments.
(Answer: Data set 1: 6.24 and Data set 2: 13.29, both calculated by dividing by n - 1)
*Although both data sets have similar means (about 5.71 and 5.57), the variance of the second data set, 13.29, is a little more than two times the variance of the first data set, 6.24.
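To check these answers, here is a minimal Python sketch using the standard library's `statistics` module. The variable names `data_set_1` and `data_set_2` are chosen here for illustration, and the sketch also prints each mean.

```python
import statistics

data_set_1 = [3, 4, 4, 5, 6, 8, 10]
data_set_2 = [1, 2, 4, 5, 7, 9, 11]

for name, data in (("Data set 1", data_set_1), ("Data set 2", data_set_2)):
    mean = statistics.mean(data)
    var = statistics.variance(data)  # sample variance: divides by n - 1
    print(f"{name}: mean = {mean:.2f}, variance = {var:.2f}")

# Data set 1: mean = 5.71, variance = 6.24
# Data set 2: mean = 5.57, variance = 13.29
```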
It might be easy for you to memorize, but not for your students. Any questions? Post them in the comments. We will be happy to answer any statistics question and will try to help you solve the problem.
By: Nur Fariza