Report of Sta2
I. Scenario
In recent years, along with the increasing demand for human resources, a growing number of universities have planned to open new faculties and to increase student admissions in these high-demand fields. However, it is undeniable that a mismatch between student enrollment and the number of teachers/lecturers has a large effect on the quality of education and training. Aware of this important issue, our group decided to find out whether the number of students per teacher differed between 2005 and 2009 (specifically in 2005, 2007 and 2009) by using a statistical technique (two-way ANOVA). The available data are blocked into six main regions of Vietnam. After conducting the test, the results show that during this period, despite changes in the numbers of both students and teachers, the number of students per teacher stayed nearly the same, which leads to our conclusion that there is no difference among the three years.
Methodology
Data collection
As the problem objective is to test whether the number of students per teacher has changed in recent years in Vietnam, we conduct the test over three specific years: 2005, 2007, and 2009. Because the data are quantitative, we decided to use analysis of variance. The data were collected from the Vietnam General Statistics Office website (shown in Appendix E).
However, many other factors may affect the result of our test, so the variability within the samples might be large. To reduce the variation within each year, we organized the data into blocks before running the test. We therefore took a random sample of six regions (Red River delta, Northern midlands & mountainous, North Central and Central Coast, Highlands, South East, and Mekong River delta) to test the changes in the number of students per teacher in those areas over the three years. Because it was too difficult to collect data for each whole region, we then used Excel to randomly select one province in each region to represent it. The six provinces selected were Hai Phong, Son La, Da Nang, Kon Tum, Dong Nai, and Kien Giang. Thus, the test has six blocks (the six regions) and three treatments (the three years). The experimental design is a randomized block design in which the treatments are the years 2005, 2007, and 2009.
After collecting the data, the following table of the number of students per teacher was produced:
Region                             2005          2007          2009
Red River delta                    23.04452467   28.10416667   28.43558606
Northern midlands & mountainous    21.81818182   31.32592593   10.34782609
North Central and Central Coast    45.16666667   33.19047619   27.26348748
Highlands                          24.68253968   12.05464481   38.86703383
South East                         19.89583333   25.53491436   37.99269006
Mekong River delta                 14.95890411   8.356495468   11.10789474
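To make the randomized block computation concrete, the sums of squares for the table above can be sketched in plain Python. This is only an illustration, not the Excel procedure used in the report: the function name randomized_block_anova and the dictionary layout are hypothetical, the column order (2005, 2007, 2009) is an assumption, and the critical value 4.10 is the tabulated F value for alpha = 0.05 with 2 and 10 degrees of freedom.

```python
# Students-per-teacher values copied from the table above.
# Rows are the six regional blocks; columns are the three years (treatments).
# Assumption: the columns are in the order 2005, 2007, 2009. The treatment
# F statistic itself does not depend on how the columns are labelled.
ratios = {
    "Red River delta": [23.04452467, 28.10416667, 28.43558606],
    "Northern midlands & mountainous": [21.81818182, 31.32592593, 10.34782609],
    "North Central and Central Coast": [45.16666667, 33.19047619, 27.26348748],
    "Highlands": [24.68253968, 12.05464481, 38.86703383],
    "South East": [19.89583333, 25.53491436, 37.99269006],
    "Mekong River delta": [14.95890411, 8.356495468, 11.10789474],
}


def randomized_block_anova(table):
    """Two-way ANOVA without replication (randomized block design)."""
    rows = list(table.values())
    b = len(rows)        # number of blocks (regions)
    k = len(rows[0])     # number of treatments (years)
    grand_mean = sum(sum(r) for r in rows) / (b * k)

    treatment_means = [sum(r[j] for r in rows) / b for j in range(k)]
    block_means = [sum(r) / k for r in rows]

    # Partition the total variation into treatments, blocks and error.
    ss_total = sum((x - grand_mean) ** 2 for r in rows for x in r)
    ss_treat = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_block = k * sum((m - grand_mean) ** 2 for m in block_means)
    ss_error = ss_total - ss_treat - ss_block

    df_treat, df_block = k - 1, b - 1
    df_error = df_treat * df_block
    ms_error = ss_error / df_error

    return {
        "SS_total": ss_total,
        "SS_treatments": ss_treat,
        "SS_blocks": ss_block,
        "SS_error": ss_error,
        "df_treatments": df_treat,
        "df_blocks": df_block,
        "df_error": df_error,
        "F_treatments": (ss_treat / df_treat) / ms_error,
        "F_blocks": (ss_block / df_block) / ms_error,
    }


result = randomized_block_anova(ratios)

# Tabulated critical value of F for alpha = 0.05 with (2, 10) degrees of freedom.
F_CRIT_TREATMENTS = 4.10
reject_equal_year_means = result["F_treatments"] > F_CRIT_TREATMENTS

print("F (years):", round(result["F_treatments"], 3))
print("F (regions):", round(result["F_blocks"], 3))
print("Reject equal means across years:", reject_equal_year_means)
```

If the treatment F statistic falls below the critical value, the null hypothesis that the mean number of students per teacher is the same in the three years is not rejected.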
Approach
To determine whether the number of students per teacher differs across the three years, it is first necessary to check the required conditions for the F-test of two-way ANOVA: the random variable is normally distributed, and the population variances are equal. We check each condition in turn.
Analysis and discussion
Checking the required conditions
Normality
As the histograms in Appendix D show, the three populations do not appear to be normal. In order to use two-way ANOVA, we nevertheless assume that all of them are normally distributed.
Equality of variances
Since the best estimator of a population variance is the sample variance, we applied the F-test to compare the variability of two populations (the largest versus the smallest, shown in Appendix B). With α = 5%, the p-values of the three tests are higher than 0.05. Therefore, there is not enough evidence to conclude that the variances differ, and we treat them as equal.
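As a rough illustration of the variance comparison, the ratio of the largest to the smallest sample variance can be computed with Python's standard library. This is a sketch under stated assumptions and not the Appendix B output: the year labels assume the table columns are ordered 2005, 2007, 2009, the variable names are hypothetical, and 7.15 is the tabulated upper 2.5% point of the F distribution with 5 and 5 degrees of freedom (a two-tailed test at alpha = 0.05).

```python
from statistics import variance

# Students-per-teacher values for the six regions, grouped by year.
# Assumption: the table columns are in the order 2005, 2007, 2009.
by_year = {
    "2005": [23.04452467, 21.81818182, 45.16666667,
             24.68253968, 19.89583333, 14.95890411],
    "2007": [28.10416667, 31.32592593, 33.19047619,
             12.05464481, 25.53491436, 8.356495468],
    "2009": [28.43558606, 10.34782609, 27.26348748,
             38.86703383, 37.99269006, 11.10789474],
}

# Sample variances (n - 1 in the denominator) estimate the population variances.
sample_var = {year: variance(values) for year, values in by_year.items()}

largest = max(sample_var, key=sample_var.get)
smallest = min(sample_var, key=sample_var.get)

# Putting the larger variance in the numerator keeps the ratio at or above 1,
# so only the upper critical value is needed.
f_ratio = sample_var[largest] / sample_var[smallest]

# Tabulated upper 2.5% point of F with (5, 5) degrees of freedom.
F_CRIT = 7.15
equal_variances_plausible = f_ratio < F_CRIT

print("Sample variances:", {y: round(v, 2) for y, v in sample_var.items()})
print("F ratio (" + largest + " / " + smallest + "):", round(f_ratio, 3))
print("Equal variances not rejected:", equal_variances_plausible)
```

Because the largest-versus-smallest pair gives the most extreme ratio, a ratio below the critical value here means no other pair of years would exceed it either.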
As for its applicability, two-way ANOVA is a procedure that tests whether differences exist among two or more population means. It makes it possible to measure how much variation is attributable to differences among populations and how much is attributable to differences within populations. By designing a randomized block design experiment, it reduces the