Thursday 27 July 2017

Post-hoc Tests for ANOVA

Post hoc analyses are usually concerned with finding patterns and/or relationships between subgroups of sampled populations that would otherwise remain undetected and undiscovered if the scientific community relied strictly on a priori statistical methods.

Tests
Fisher's least significant difference (LSD)
Bonferroni procedure
Holm–Bonferroni method
Newman–Keuls method
Duncan's new multiple range test (MRT)
Rodger's method
Scheffé's method
Tukey's procedure
Dunnett's correction
Sidák's inequality
Benjamini–Hochberg (BH) procedure

Post hoc procedures vs. a priori comparisons (we look at only post hoc)

Comparisonwise error rate - alpha for each comparison
Experimentwise error rate - the probability of making a Type I error for the set of all possible comparisons

alpha_E = 1 - (1 - alpha)^c, where c is the number of comparisons
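The experimentwise error rate formula is easy to check with a couple of lines of Python (a sketch assuming independent comparisons, nothing SPSS-specific):

```python
# Experimentwise (familywise) Type I error rate for c comparisons,
# each tested at alpha, assuming the comparisons are independent.
alpha = 0.05
for c in (1, 3, 6, 10):
    alpha_e = 1 - (1 - alpha) ** c
    print(f"c = {c:2d}: experimentwise alpha = {alpha_e:.3f}")
```

With 6 comparisons at alpha = .05 the experimentwise rate already exceeds .26, which is exactly why post hoc procedures adjust for multiple testing.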

SPSS One-Way ANOVA with Post Hoc Tests


A hospital wants to know how a homeopathic medicine for depression performs in comparison
to alternatives. They administered 4 treatments to 100 patients for 2 weeks and then measured
their depression levels. The data, part of which are shown above, are in depression.sav.

ANOVA - Main Assumptions

  • Independent observations often holds if each case (row of cells in SPSS) represents a unique person or other statistical unit. That is, we usually don't want more than one row of data per person, which holds for our data;
  • Normally distributed variables in the population seem reasonable if we look at the histograms we inspected earlier. Besides, violation of the normality assumption is no real issue for larger sample sizes due to the central limit theorem;
  • Homogeneity means that the population variances of BDI in each medicine group are all equal, reflected in roughly equal sample variances. Again, our split histogram suggests this is the case, but we'll confirm it by including Levene's test when running our ANOVA.



There are many ways to run the exact same ANOVA in SPSS. Today, we'll go for General Linear
Model because it provides partial eta squared as an estimate of the effect size of our model.
We'll briefly jump into Post Hoc and Options before pasting our syntax.


The post hoc test we'll run is Tukey’s HSD (Honestly Significant Difference), denoted as
“Tukey”. We'll explain how it works when we discuss the output.


“Estimates of effect size” refers to partial eta squared. “Homogeneity tests” includes Levene’s
test for equal variances in our output.

SPSS ANOVA Output - Levene’s Test



Levene’s Test checks whether the population variances of BDI for the four medicine groups
are all equal, which is a requirement for ANOVA. “Sig.” = 0.949, so if the population variances
are equal there is a 94.9% probability of finding variances at least as different as those in our
sample. This sample outcome is very likely under the null hypothesis of homoscedasticity, so we
satisfy this assumption for our ANOVA.
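Outside SPSS, the same check can be run with scipy's `levene` function. The BDI scores below are invented for illustration, not the actual depression.sav data:

```python
from scipy import stats

# Invented BDI scores for the four conditions (not the actual depression.sav data).
none    = [20, 22, 19, 24, 21]
placebo = [15, 14, 16, 13, 17]
med_a   = [10, 12, 11,  9, 13]
med_b   = [ 8,  7,  9, 10,  6]

stat, p = stats.levene(none, placebo, med_a, med_b)
print(f"Levene W = {stat:.3f}, p = {p:.3f}")
# A large p (> .05) means no evidence against equal variances (homoscedasticity).
```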

SPSS ANOVA Output - Between Subjects Effects


 If our population means were really equal, there would be a virtually 0% chance (“Sig.” = .000)
of finding the sample differences we observed. We reject the null hypothesis of equal population means.
 The different medicines administered account for some 39% of the variance in the BDI
scores. This is the effect size as indicated by partial eta squared.
 Partial eta squared is the sums of squares for medicine divided by the corrected total
sums of squares (2780 / 7071 = 0.39).
 Sums of squares error represents the variance in BDI scores not accounted for by medicine. Note that SS(medicine) + SS(error) = SS(corrected total).
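The arithmetic behind that effect size is simple enough to verify by hand, using the sums of squares from the ANOVA table above:

```python
# Partial eta squared from the reported sums of squares.
ss_medicine = 2780                     # effect sums of squares
ss_total    = 7071                     # corrected total sums of squares
ss_error    = ss_total - ss_medicine   # variance not accounted for by medicine

# With a single effect, this equals ss_medicine / ss_total.
partial_eta_sq = ss_medicine / (ss_medicine + ss_error)
print(f"partial eta squared = {partial_eta_sq:.2f}")  # → 0.39
```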



Comparing 4 means results in (4 - 1) x 4 x 0.5 = 6 distinct comparisons, each of which is listed
twice in this table. There are three ways to tell which means are likely to be different:
 Statistically significant mean differences are flagged with an asterisk (*). For instance, the
very first line tells us that “None” has a mean BDI score 6.7 points higher than the placebo,
which is quite a lot actually, since BDI scores can range from 0 through 63.
 As a rule of thumb, “Sig.” < 0.05 indicates a statistically significant difference between
two means.
 A confidence interval not including zero means that a zero difference between these means
in the population is unlikely.
Obviously, these approaches result in the same conclusions.
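scipy also ships a Tukey HSD implementation (`scipy.stats.tukey_hsd`, available since SciPy 1.7), so the same kind of pairwise table can be reproduced outside SPSS. The scores below are invented for illustration, not the actual depression.sav data:

```python
from scipy import stats

# Invented BDI scores for the four conditions (not the actual depression.sav data).
none    = [21, 23, 20, 25, 22, 24]
placebo = [15, 14, 16, 13, 17, 15]
med_a   = [10, 12, 11,  9, 13, 11]
med_b   = [ 9,  8, 10, 11,  7,  9]

res = stats.tukey_hsd(none, placebo, med_a, med_b)
print(res)  # table of pairwise mean differences with adjusted p-values and CIs
```

As in the SPSS output, a pair whose adjusted p-value is below .05 (equivalently, whose confidence interval excludes zero) is declared significantly different.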


by : Fatin Farhana bt Marzuki

References

https://www.spss-tutorials.com/spss-one-way-anova-with-post-hoc-tests-example/
https://statistics.laerd.com/statistical-guides/one-way-anova-statistical-guide-4.php

Thursday 13 July 2017

REPEATED MEASURE

WHAT IS REPEATED MEASURE?

The same entities take part in all conditions of an experiment

Variable
§One independent variable (categorical) 
(e.g. Time 1/ Time 2/ Time 3)
§One dependent variable (continuous)
(e.g. scores on the Confidence  in Coping with Statistics Test).

STEP BY STEP REPEATED MEASURE

#1. Your dependent variable should be measured at the continuous level (i.e., it is an interval or ratio variable).
#2. Your independent variable should consist of at least two categorical, "related groups" or "matched pairs". "Related groups" indicates that the same subjects are present in both groups.
#3. There should be no significant outliers in the related groups. The problem with outliers is that they can have a negative effect on the repeated measures ANOVA, distorting the differences between the related groups, and can reduce the accuracy of your results.
#4. The distribution of the dependent variable in the two or more related groups should be approximately normally distributed. You can test for normality using the Shapiro-Wilk test of normality.
#5. Known as sphericity, the variances of the differences between all combinations of related groups must be equal. Unfortunately, repeated measures ANOVAs are particularly susceptible to violating the assumption of sphericity, which causes the test to become too liberal.
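The F statistic for a one-way repeated measures ANOVA can be computed directly from the sums of squares, which makes the partitioning into time, subject and error components explicit. The scores below are made up for illustration:

```python
import numpy as np

# Made-up Confidence in Coping with Statistics scores:
# rows = 5 subjects, columns = Time 1, Time 2, Time 3.
scores = np.array([
    [18, 25, 29],
    [20, 26, 30],
    [15, 22, 26],
    [22, 28, 31],
    [19, 24, 28],
], dtype=float)

n, k = scores.shape
grand = scores.mean()

# Partition the total variability into time, subject and error components.
ss_time    = n * ((scores.mean(axis=0) - grand) ** 2).sum()
ss_subject = k * ((scores.mean(axis=1) - grand) ** 2).sum()
ss_total   = ((scores - grand) ** 2).sum()
ss_error   = ss_total - ss_time - ss_subject

df_time, df_error = k - 1, (k - 1) * (n - 1)
F = (ss_time / df_time) / (ss_error / df_error)
print(f"F({df_time}, {df_error}) = {F:.2f}")
```

Removing the subject sums of squares from the error term is what distinguishes this from a between-subjects ANOVA; in practice, statsmodels' `AnovaRM` class performs the same computation.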

HOW TO REPORT THE RESULTS OF A REPEATED MEASURES ANOVA?

§A one-way repeated measures ANOVA was conducted to compare scores on the Confidence in Coping with Statistics test at Time 1 (prior to the intervention), Time 2 (following the intervention) and Time 3 (three-month follow-up) . The means and standard deviations are presented in Table XX. There was a significant effect for time [Wilks’ Lambda=.25, F(2, 28)=41.17, p<.0005, multivariate partial eta squared=.75.]

MANOVA

MANOVA (Multivariate Analysis of Variance) 


MANOVA is an extension of analysis of variance (ANOVA): where ANOVA uses only a single
dependent variable, MANOVA handles more than one dependent variable.
MANOVA also provides univariate results, which can be shown separately to compare the groups
and to indicate significant differences between group means.
It reduces the chance of a Type I error compared with running ANOVA repeatedly to perform the
same analysis.

To reduce the Type I error when the dependent variables are then examined separately, the alpha value (usually .05) is divided by the number of dependent variables; with 3 DVs, .05 ÷ 3 ≈ .017.
• This adjusted cut-off value is used to decide whether a difference between the groups is statistically significant.

Assumptions

1.Sample Size 
The sample size in each group needs to be larger than the number of dependent variables.
2.Normality
Normality can be divided into two types: multivariate and univariate.
Usually MANOVA uses the multivariate check, which is based on Mahalanobis distances.
The univariate checks are typically used when the study has a small sample size.
Checking this assumption also helps to identify any outliers in the data.
3.Outliers
Outliers can be described as strange or unusual values or combinations of values in the data, such as numbers that are very high or very low.
4.Homogeneity of regression
Only important if you intend to perform a stepdown analysis.
•This approach is used when you have some theoretical or conceptual reason for ordering your dependent variables.
•It is a complex procedure.
5.Multicollinearity and singularity
MANOVA works best when the DVs are moderately correlated.
•So you need to check the correlations!
•Consider removing one of any strongly correlated pair of DVs.
•DVs count as strongly correlated when correlations reach 0.8 to 0.9.
6.Homogeneity of variance-covariance matrices
Generated as part of MANOVA output
•Test used is Box’s M Test of Equality of Covariance Matrices
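To see what the multivariate test statistic actually is, Wilks' Lambda can be computed by hand from the between-groups (H) and within-groups (E) SSCP matrices. The data below are randomly generated for illustration; for a real analysis, statsmodels' `MANOVA` class computes this together with the F approximation:

```python
import numpy as np

# Randomly generated scores on two DVs for two groups (illustration only).
rng = np.random.default_rng(0)
males   = rng.normal(loc=[25.0, 30.0], scale=5.0, size=(40, 2))
females = rng.normal(loc=[28.0, 31.0], scale=5.0, size=(40, 2))

groups = [males, females]
grand = np.vstack(groups).mean(axis=0)

# Between-groups (H) and within-groups (E) SSCP matrices.
H = sum(len(g) * np.outer(g.mean(axis=0) - grand, g.mean(axis=0) - grand)
        for g in groups)
E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)

# Wilks' Lambda: values near 1 mean little separation, small values mean strong separation.
wilks = np.linalg.det(E) / np.linalg.det(H + E)
print(f"Wilks' Lambda = {wilks:.3f}")
```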

How to write the results 
There was a statistically significant difference between males and females on the combined dependent variables, F(3, 428) = 3.57, p = .014; Wilks’ Lambda = .98; partial eta squared = .02.
•Considering the dependent variables separately, the only difference to reach statistical significance, using a Bonferroni-adjusted alpha level of .017, was perceived stress, F(1, 430) = 8.34, p = .004, partial eta squared = .02.
•Inspection of the mean scores showed that females reported slightly higher levels of perceived stress (M = 27.42, SD = 6.08) than males (M = 25.79, SD = 5.41).

BY:
Nurulain Parlan

Thursday 6 July 2017

ANOVA

What is ANOVA?

- ANOVA applies when the independent variable has more than 2 groups or categories.
- If there is only 1 classifying variable, we have a one-way ANOVA, but if 2 classifying variables are present, we have a two-way ANOVA.

One-way ANOVA is conducted to assess whether population means differ significantly among groups. If the overall ANOVA test is significant, then pairwise comparison tests should be conducted to investigate which two population means differ significantly.

For example:

We have results of a treatment for three different races, coded Malay1, Chinese2 and Indian3, and we are interested in testing whether these population means differ.


Step 1: Generate the hypothesis
Ho: μ1 = μ2 = μ3
Ha: At least two of the treatment groups are different.

Step 2: Set the significance level
α = 0.05

Step 3: Checking the assumptions
Assumptions for ANOVA
- Random samples from each population
- The observations are independent: each observation refers to a different subject
- The distribution in each group is normal

To check normality, select Analyze => Descriptive Statistics => Explore
      •Insert treatment in the Dependent List box.
      •Insert group in the Factor List box.
      •In the Explore: Plots dialog, click Normality Plots with Tests and Histogram.

From the statistical tests:
*p value of Shapiro-Wilk for Malay1 = 0.441 (>0.05), not significant (normal)
*p value of Shapiro-Wilk for Chinese2 = 0.611 (>0.05), not significant (normal)
*p value of Shapiro-Wilk for Indian3 = 0.222 (>0.05), not significant (normal)
Graphically, all graphs show a normal distribution.
So, the assumption is met: the distribution is normal.
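The same Shapiro-Wilk checks can be reproduced with scipy. The scores below are invented (the blog's actual data are not shown), so the p values will differ from the 0.441 / 0.611 / 0.222 reported above:

```python
from scipy import stats

# Invented treatment scores per group (not the blog's actual data).
samples = {
    "Malay1":   [52, 48, 50, 55, 47, 51, 49],
    "Chinese2": [60, 58, 62, 57, 61, 59, 63],
    "Indian3":  [45, 44, 47, 43, 46, 48, 42],
}

pvalues = {}
for name, sample in samples.items():
    stat, p = stats.shapiro(sample)
    pvalues[name] = p
    print(f"{name}: W = {stat:.3f}, p = {p:.3f}")
# p > .05 for a group means no evidence against normality in that group.
```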

Step 4 :Test statistics using SPSS:

        »Analyze => Compare Means => One-way ANOVA.
        »Insert Treatment in Dependent List box.
        »Insert Group in Factor box.
        »Select Post Hoc and click Bonferroni and continue.
        »Select Option and click Descriptive and Homogeneity of variance test and continue.
        »Select Ok.


Step 5: Interpretation
        •p value is 0.001, so reject Ho.
        •At least two treatment groups are significantly different.
        •Use the Post Hoc test (Bonferroni) to check which groups differ significantly.
    
Step 6: Conclusion
  At the 5% level of significance, at least two treatment groups are statistically significantly different (p value = 0.001).
  By using the Post Hoc Test, there is a statistically significant difference between races (p value = 0.001).
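The whole analysis, the overall F test followed by Bonferroni-style pairwise t tests judged against alpha ÷ 3, can be sketched in scipy with made-up data:

```python
from itertools import combinations
from scipy import stats

# Invented treatment scores for the three race groups.
groups = {
    "Malay1":   [52, 48, 50, 55, 47, 51, 49],
    "Chinese2": [60, 58, 62, 57, 61, 59, 63],
    "Indian3":  [45, 44, 47, 43, 46, 48, 42],
}

# Step 4: overall one-way ANOVA.
F, p = stats.f_oneway(*groups.values())
print(f"F = {F:.2f}, p = {p:.4f}")

# Step 5: Bonferroni post hoc - pairwise t tests compared against alpha / 3.
alpha_adj = 0.05 / 3
for (n1, g1), (n2, g2) in combinations(groups.items(), 2):
    t, p_pair = stats.ttest_ind(g1, g2)
    flag = "significant" if p_pair < alpha_adj else "not significant"
    print(f"{n1} vs {n2}: p = {p_pair:.4f} ({flag} at alpha = {alpha_adj:.3f})")
```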


By: Nur Fariza


References :

- http://www.statisticssolutions.com/manova-analysis-anova/

- http://support.minitab.com/en-us/minitab/17/topic-library/modeling-statistics/anova/basics/what-is-anova/

- http://www.statisticssolutions.com/Conduct-and-Interpret-One-Way-ANOVA/

Wednesday 14 June 2017

Chi Square

What is Chi Square?
The chi square statistic is a method of showing the relationship between two categorical variables. In statistics, there are two types of variables: numerical and non-numerical (categorical) variables. The chi square statistic is a single number that tells you how much difference exists between your observed counts and the counts you would expect if there were no relationship at all in the population.

The chi square formula: χ² = Σ (O - E)² / E, where O is an observed count and E the corresponding expected count.

There are a few variations on the chi square statistic. However, all of the variations use the same idea, which is that you are comparing your expected values with the values you actually collect. One of the most common forms can be used for contingency tables.

The chi square test comes in two types: the chi square goodness of fit test and the chi square test for independence. In this section, we will focus on the chi square goodness of fit test. The purpose of this test is to determine whether sample data match a population. It is applied when there is one categorical variable from a single population, and it is used to determine whether the sample data are consistent with a hypothesized distribution.



 When to use it?
This test is appropriate when these conditions are met:
-               The sampling method is simple random sampling.
-               The variable under study is categorical.
-               The expected number of sample observations in each level of the variable is at least 5.

Step in Chi Square

1)         State the hypothesis
2)         Formulate an analysis plan
3)         Analyze sample data
4)         Interpret results
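Those four steps map directly onto scipy's `chisquare` function. A made-up goodness-of-fit example (120 die rolls against a fair-die null hypothesis):

```python
from scipy import stats

# Made-up example: 120 die rolls, fair-die null hypothesis (20 rolls per face).
observed = [25, 17, 15, 23, 24, 16]
expected = [20, 20, 20, 20, 20, 20]

chi2, p = stats.chisquare(observed, f_exp=expected)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
# Here chi2 = sum((O - E)^2 / E) = 100 / 20 = 5.0 with df = 5,
# so p > .05 and we cannot reject the fair-die hypothesis.
```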

Homogeneity of Proportions
This test is used with a single categorical variable from two or more different populations. It is used to determine whether frequency counts are distributed identically across the different populations.
This test is used when:
-               The sampling method is simple random sampling.
-               The variable is categorical.
-               The data are displayed in a contingency table and the expected frequency count for each cell is at least 5.

The steps and procedures are also the same as the usual chi square test.
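For the homogeneity (and independence) version, scipy's `chi2_contingency` takes the observed contingency table and returns the statistic, p value, degrees of freedom and expected counts. The counts below are made up:

```python
from scipy import stats

# Made-up contingency table: rows = two populations, columns = three response categories.
table = [[30, 20, 10],
         [25, 25, 10]]

chi2, p, dof, expected = stats.chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, df = {dof}, p = {p:.3f}")
# df = (rows - 1) * (columns - 1) = 1 * 2 = 2
```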

By : Hanis Jefry

References :
http://www.statisticshowto.com/probability-and-statistics/chi-square/
http://math.hws.edu/javamath/ryan/ChiSquare.html
http://www.statisticssolutions.com/non-parametric-analysis-chi-square/
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm