Over the last week or so, I wrote a web-based tool that automatically generates datasets and worked-out solutions. For each problem, it creates and displays a dataset, a completed solution, and the results of most intermediate computational steps. It is freely available online for instructors and students: http://rlanders.net/datasets.php
As an example, if you select “Paired Samples t” with “n = 10” and an outcome type of “Survey”, it will output:
- Instructions on what to do with the generated dataset
- A dataset with Time 1 and Time 2 “Survey” variables (min = 1, max = 5, mean of roughly 3, sd of roughly 1)
- A “completed” dataset containing calculated difference scores and squared difference scores
- Sum of d, Mean of d, Sum of d^2, (Sum of d)^2, sd of d, se of d, critical t statistic, degrees of freedom, observed t, CI lower and upper bounds
- A box with all important “final” output: the research question, null and alternative hypotheses, alpha, critical value, journal-type reporting of CI and t with precise p-value, unstandardized effect size (difference score), standardized effect size (Cohen’s d) and a NARRATIVE CONCLUSION including interpretation of the effect size. As an example: “Conclusion: Reject the null and accept the alternative. The difference is statistically significant. The Survey variable decreases over time. If we assume this sample to represent the population, we would expect 95% of sample means to fall between -1.68 and 0.08. On average, the Survey variable was 0.80 lower at Time 2. The difference over time was medium.”
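To make the intermediate quantities in the list above concrete, here is a minimal Python sketch of the same paired-samples computation. It is not the tool's own code. The function name `paired_t_steps`, the sample data, and the choice to compute differences as Time 2 minus Time 1 are all assumptions for illustration, and Cohen's d is taken as the mean difference divided by the sd of the differences, which is one common convention for paired designs. The critical t is passed in by hand (2.262 is the two-tailed value for df = 9 at alpha = .05) rather than looked up.

```python
import math


def paired_t_steps(time1, time2, t_critical):
    """Return the intermediate quantities of a paired-samples t-test.

    t_critical comes from the caller (for example, a t table) and is
    used only to build the confidence interval.
    """
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("need two equal-length samples with n >= 2")

    n = len(time1)
    # Difference scores, assumed here to be Time 2 minus Time 1.
    d = [after - before for before, after in zip(time1, time2)]

    sum_d = sum(d)                      # Sum of d
    sum_d_sq = sum(x * x for x in d)    # Sum of d^2
    sq_sum_d = sum_d ** 2               # (Sum of d)^2
    mean_d = sum_d / n                  # Mean of d

    # Computational formula for the sample variance of d.
    var_d = (sum_d_sq - sq_sum_d / n) / (n - 1)
    sd_d = math.sqrt(var_d)             # sd of d
    se_d = sd_d / math.sqrt(n)          # se of d
    if se_d == 0:
        raise ValueError("all differences are identical; t is undefined")

    return {
        "d": d,
        "sum_d": sum_d,
        "sum_d_sq": sum_d_sq,
        "sq_sum_d": sq_sum_d,
        "mean_d": mean_d,
        "sd_d": sd_d,
        "se_d": se_d,
        "df": n - 1,
        "t_obs": mean_d / se_d,
        "ci_lower": mean_d - t_critical * se_d,
        "ci_upper": mean_d + t_critical * se_d,
        "cohens_d": mean_d / sd_d,
    }


# Made-up example data on a 1-5 survey scale, n = 10.
time1 = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
time2 = [2, 3, 2, 4, 3, 3, 2, 2, 3, 2]
result = paired_t_steps(time1, time2, t_critical=2.262)

for name, value in result.items():
    print(name, value)
```

Comparing the absolute value of `t_obs` with the critical value gives the reject or retain decision for a non-directional test; a directional test would use a one-tailed critical value instead.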
The output is customized to each test. The program will also randomly choose between directional and non-directional tests when directional tests are plausible.
I have tried to tune the generation algorithms so that you get statistically significant results about 50% of the time. As you might expect, larger sample sizes will produce significant results more often and smaller sample sizes less often. But if you leave it set to n = 10, it should be about even.
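For readers curious how data like this might be produced, the following is a rough Python sketch of one possible approach. It is purely illustrative and not the algorithm the tool uses. The function `generate_survey_pairs` and its `shift` and `noise_sd` parameters are hypothetical: `shift` sets the true average change between Time 1 and Time 2, and nudging it up or down is one way a generator could steer how often results come out significant at a given n.

```python
import random


def generate_survey_pairs(n, shift=-0.5, mean=3.0, sd=1.0, noise_sd=0.5, seed=None):
    """Generate paired 1-5 survey scores (hypothetical generator).

    Each Time 1 score is drawn from a normal distribution with the given
    mean and sd. The matching Time 2 score adds a fixed shift plus some
    random noise. Both are rounded and clamped to the 1-5 scale.
    """
    rng = random.Random(seed)

    def to_scale(x):
        # Round to the nearest whole number, then keep it within 1..5.
        return max(1, min(5, round(x)))

    time1, time2 = [], []
    for _ in range(n):
        base = rng.gauss(mean, sd)
        time1.append(to_scale(base))
        time2.append(to_scale(base + shift + rng.gauss(0, noise_sd)))
    return time1, time2


# Passing a seed makes the "random" dataset reproducible.
t1, t2 = generate_survey_pairs(10, seed=1)
print("Time 1:", t1)
print("Time 2:", t2)
```

Because rounding and clamping distort the scores slightly, the realized mean and sd only approximate the requested values, which matches the "roughly 3" and "roughly 1" description of the Survey variable.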
The tool includes fully worked-out problems for: central tendency and variability statistics, z-score calculations, confidence intervals, z-tests, one-sample t-tests, paired-samples t-tests, independent-samples t-tests, one-way ANOVA, chi-squared goodness of fit and test of independence, and correlation/regression.
If you’re wondering why I wrote this, it is because I wrote a one-semester undergraduate introduction to statistics for business students, which will be published in 2013. You don’t need the book to use the dataset generator (just ignore the references to “Chapters”). The generator is customized to the statistical method I teach in that text, though, so if you like it, I’d ask you to consider my textbook when it is published.
Also, if you decide to adopt this or provide it to your students, please leave a note in the comments saying so. Feedback is appreciated!