
Wednesday 14 June 2017

Chi Square

What is Chi Square?
The chi square statistic is a way of showing whether a relationship exists between two categorical variables. In statistics, variables are of two types: numerical and non-numerical (categorical). The chi square statistic is a single number that tells you how much difference exists between your observed counts and the counts you would expect if there were no relationship at all in the population.

chi square formula

There are a few variations on the chi square statistic. However, all of the variations use the same idea, which is that you are comparing your expected values with the values you actually collect. One of the most common forms can be used for contingency tables.
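For reference, the form most often used with count data is Pearson's chi square statistic, written here in LaTeX. The symbols are notation chosen for this sketch and are not taken from the original figure: O_i is the observed count in category i, E_i is the expected count in that category, and k is the number of categories (or cells in a contingency table).

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
```

A large value means the observed counts are far from the expected ones; a value near zero means they agree closely.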

There are two types of chi square test: the chi square goodness of fit test and the chi square test for independence. In this section, we will focus on the chi square goodness of fit test. The purpose of this test is to determine whether sample data match a population. It is applied when there is one categorical variable from a single population, and it is used to determine whether the sample data are consistent with a hypothesized distribution.

example


When to use it?
This test is appropriate when the following conditions are met:
- The sampling method is simple random sampling.
- The variable under study is categorical.
- The expected number of sample observations in each level of the variable is at least 5.

Steps in Chi Square

1) State the hypothesis
2) Formulate an analysis plan
3) Analyze sample data
4) Interpret results
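
The four steps above can be sketched in Python for a goodness of fit test. Everything specific here is an assumption made for illustration: the die-roll counts are made up, the significance level of 0.05 is a common but arbitrary choice, and the critical value 11.070 is the standard table value of the chi square distribution for 5 degrees of freedom at that level.

```python
# Hypothetical data: counts of each face in 60 rolls of a die.
observed = [8, 9, 12, 11, 6, 14]

# 1) State the hypothesis.
# H0: every face has probability 1/6. H1: at least one probability differs.
hypothesized = [1 / 6] * 6

# 2) Formulate an analysis plan.
alpha = 0.05             # assumed significance level
critical_value = 11.070  # chi square table value for df = 5, alpha = 0.05

# 3) Analyze sample data.
n = sum(observed)
expected = [n * p for p in hypothesized]
if any(e < 5 for e in expected):
    raise ValueError("each expected count should be at least 5")
df = len(observed) - 1
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 4) Interpret results.
reject_h0 = chi_square > critical_value
print(f"chi square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

With these made-up counts the statistic falls below the critical value, so the sample is consistent with a fair die at the 0.05 level.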

Homogeneity of Proportions
This test is applied to a single categorical variable from two or more different populations. It is used to determine whether frequency counts are distributed identically across the different populations.
This test is appropriate when:
- For each population, the sampling method is simple random sampling.
- The variable is categorical.
- When the data are displayed in a contingency table, the expected frequency count for each cell is at least 5.

The steps and procedures are also the same as the usual chi square test.
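
As a rough illustration of the homogeneity test, the sketch below computes expected cell counts as (row total x column total) / grand total and then the chi square statistic over all cells. The table of counts is invented for the example, the 0.05 significance level is an assumption, and 5.991 is the standard table value for 2 degrees of freedom at that level.

```python
# Hypothetical counts: each row is a population, each column a category.
table = [
    [30, 20, 10],  # population A
    [20, 20, 20],  # population B
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell if all populations share one distribution.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]
if any(e < 5 for row in expected for e in row):
    raise ValueError("each expected cell count should be at least 5")

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(table, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(table) - 1) * (len(table[0]) - 1)

critical_value = 5.991  # chi square table value for df = 2, alpha = 0.05
reject_h0 = chi_square > critical_value
print(f"chi square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

Here the degrees of freedom are (rows - 1) x (columns - 1), which is what distinguishes the contingency-table form from the single-variable goodness of fit test.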

By: Hanis Jefry

References:
http://www.statisticshowto.com/probability-and-statistics/chi-square/
http://math.hws.edu/javamath/ryan/ChiSquare.html
http://www.statisticssolutions.com/non-parametric-analysis-chi-square/
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm

Correlation & Regression Analysis

Definition & Objective

The objective of correlation analysis is to see whether two measurement variables covary, and to quantify the strength of the relationship between them. Regression, on the other hand, expresses the relationship in the form of an equation.


For example, for a group of students taking a maths test and an English test, we could use correlation to determine whether students who are good at maths tend to be good at English as well, and regression to determine whether the marks in English can be predicted from given marks in maths.

These methods have three main objectives:
- to test hypotheses about cause and effect relationships.
- to see whether two variables are associated, without necessarily inferring a cause and effect relationship.
- to estimate the value of one variable corresponding to a particular value of the other variable.

Uses of Correlation

A correlation coefficient, such as the Pearson Product Moment Correlation Coefficient, can be used to test whether there is a linear relationship between the variables. To quantify the strength of the relationship, we calculate the correlation coefficient (r). Its numerical value ranges from -1.0 to +1.0: r > 0 indicates a positive linear relationship, r < 0 indicates a negative linear relationship, and r = 0 indicates no linear relationship.
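
To make the coefficient concrete, here is a minimal Python sketch of the Pearson correlation computed directly from its definition (covariation of the two variables divided by the product of their spreads). The function name `pearson_r` and the maths and English marks are hypothetical, chosen only to mirror the student example above.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical marks for five students.
maths = [40, 55, 60, 70, 85]
english = [45, 50, 65, 68, 80]

r = pearson_r(maths, english)
print(f"r = {r:.3f}")
```

A value of r close to +1 for these marks would suggest that students who score well in maths also tend to score well in English, without saying anything about cause and effect.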


Uses of Regression

In regression analysis, the problem of interest is the nature of the relationship between the dependent variable and the independent variables.
The analysis consists of choosing and fitting an appropriate model by the method of least squares, so that the relationship between the variables can be used to estimate the expected response for a given value of the independent variable. For example, if we are interested in the effect of age on height, then by fitting a regression line, we can predict the height for a given age.
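
The age and height example can be sketched in Python as a simple least squares line with one independent variable. The function name `fit_line` and the ages and heights are invented for illustration; the slope and intercept formulas are the standard least squares estimates for a straight line.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: age in years and height in centimetres.
ages = [2, 4, 6, 8, 10]
heights = [86, 102, 115, 128, 139]

slope, intercept = fit_line(ages, heights)

# Predict the height for a given age from the fitted line.
age = 7
predicted_height = intercept + slope * age
print(f"height = {intercept:.1f} + {slope:.1f} * age")
print(f"predicted height at age {age}: {predicted_height:.1f} cm")
```

Predictions are most trustworthy for ages inside the range of the data used to fit the line.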



By: Hanis Jefry

References:

http://abyss.uoregon.edu/~js/glossary/correlation.html
http://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_multivariable/bs704_multivariable5.html
http://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/11-correlation-and-regression
http://keydifferences.com/difference-between-correlation-and-regression.html