Project 1 Feedback Examples

1. Please include the age groups and their percentages, and cite the source that you used to find the demographic information.

2. Please include the exact words that were used in the survey question. Note that the survey question has nothing to do with age. The survey question is about the time spent on the Internet. If I am wrong and age was the focus of the study, then you did not use stratified sampling. If I am right and age was used to stratify the sample and your survey question was about time online, then all of your interpretations about "age" need to be changed to "time online".

2. You say that there were biases. Biases involve skewing the mean either on the high side or low side. Discuss whether you suspect that each of the biases that you acknowledged is likely to skew the mean higher or lower. If it is neither higher nor lower, then it is not a bias.

3. I am pretty sure you are reading your histogram incorrectly. The x-axis displays the number of hours spent online and the y-axis represents the number of individuals who spend that interval of time. Age is not part of your data. It was only used to help prevent bias. The main purpose of a histogram is to display the shape (uniform, skewed left, skewed right, normal, symmetric, unimodal, bimodal, etc.). Be sure to write about the shape of the distribution and what that means about cell phone use. How might the cell company use that information?

4. For the stem-and-leaf plot, you seem to think that it shows ages, but the data of interest is internet use. If the data that you entered into the computer spreadsheet is ages, then that changes the purpose of the project, but in your introduction you stated that the goal was to determine internet use. Other than the discussion of the sampling technique, the rest of the paper should not even mention age. One feature of the stem-and-leaf plot is that you can quickly see the max, min, and mode. This is a good place to discuss these and explain how the cell company will use the information.

5. The main point of the box plot is the box. This shows where the middle 50% of people lie. If the company wants to focus on the center and not be a niche company for the top or bottom, then this will help the company. Please write about this in your paper.

6. The standard deviation does tell you the spread of the data. Based on the number 10.5 (I still don't know whether it refers to age width or internet time width), give the company a recommendation on its marketing or production. Similarly, with the quartiles you say that the company will have a better idea what to do. Now go one step further and provide a recommendation on what to do based on these numbers.

7. For the z-score, the value of x being 13 is just the last number that I put in to give an example for the class. You should find the z-score for the min and max to see if they are outliers. Then explain how the company can use the outlier information. If there are any other interesting values of x (time online?) then find and interpret their z-scores too.

8. Where do you get 68%? This seems to come from the Empirical Rule, which can only be used with an approximately normal distribution. Does the histogram show normality? If not, what does Chebyshev's theorem tell you about the data?
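
Comments 7 and 8 both measure distance from the mean in standard deviations, so a short Python sketch may make the two calculations concrete. The hours below are invented for illustration, and the cutoff of 2 for flagging a z-score is an assumed convention (some courses use 3); neither comes from the student's data.

```python
from statistics import mean, stdev

def z_score(x, data):
    """Number of sample standard deviations that x lies from the mean of data."""
    return (x - mean(data)) / stdev(data)

def chebyshev_lower_bound(k):
    """Minimum share of ANY data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem says nothing useful unless k > 1")
    return 1 - 1 / k**2

# Hypothetical hours online, invented for illustration only.
hours = [2, 5, 7, 8, 9, 10, 11, 12, 13, 40]

# Assumed convention: |z| > 2 marks a value as a possible outlier.
CUTOFF = 2
z_min = z_score(min(hours), hours)
z_max = z_score(max(hours), hours)
for label, z in (("min", z_min), ("max", z_max)):
    verdict = "possible outlier" if abs(z) > CUTOFF else "not unusual"
    print(f"{label}: z = {z:.2f} ({verdict})")

# Chebyshev holds for any shape; the Empirical Rule needs approximate normality.
for k in (2, 3):
    print(f"within {k} SD: at least {chebyshev_lower_bound(k):.1%} of the data")
```

With these invented numbers the maximum has a z-score of about 2.7 and is flagged, while the minimum is not. Chebyshev's theorem guarantees at least 75% of the data within 2 standard deviations and about 88.9% within 3, compared with roughly 95% and 99.7% under the Empirical Rule when the histogram looks approximately normal.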

1. You write that you used stratified sampling and note the numbers for certain majors. Next you note that you don't know the percentage of each major at the college. Part of stratified sampling involves knowing the percentage of each group in the population. Without that information it is not stratified sampling. If you used stratified sampling, please include the population percentage of each group in your writeup. Then you can write the corresponding sample size for each group, which should be the percentage from the population times 34. If you did not use stratified sampling, explain how you selected your sample and what biases might have occurred due to not stratifying. Note that bias leads to either a lower sample mean or a higher one compared to the population mean.

2. The splitting of the sample into majors should only be used to help lower the bias in the study. The analysis of the data should not be about the groups. It should be only about the population as a whole. This is not a comparison study.

3. The main purpose of the histogram is to look at the shape of the distribution. The histogram does not show the mean or median. It tells you, for example, whether the distribution is approximately normal, unimodal, bimodal, symmetric, skewed, or some other shape.

4. It is the "standard deviation" not "standard of deviation". A standard deviation cannot be skewed, only a distribution can. It is a number so it can be large or small.
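
The allocation described in the first comment of this set (population percentage times 34) can be sketched in Python as follows. The majors and their shares are hypothetical placeholders, since the real percentages must come from a cited source, and the largest-remainder rounding is one reasonable choice for making the group sizes add up to 34 rather than a required method.

```python
def allocate(population_shares, sample_size):
    """Split sample_size across strata in proportion to their population shares.

    Uses largest-remainder rounding so the counts add up to sample_size.
    """
    exact = {g: share * sample_size for g, share in population_shares.items()}
    counts = {g: int(v) for g, v in exact.items()}  # round each group down first
    leftover = sample_size - sum(counts.values())
    # Hand the remaining slots to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda g: exact[g] - counts[g], reverse=True)
    for g in by_remainder[:leftover]:
        counts[g] += 1
    return counts

# Hypothetical majors and shares, invented for illustration only.
shares = {"Business": 0.40, "Nursing": 0.25, "Biology": 0.20, "Art": 0.15}
sizes = allocate(shares, 34)
print(sizes)
```

Here the exact products are 13.6, 8.5, 6.8, and 5.1, so the two leftover slots go to Biology and Business, giving group sizes of 14, 8, 7, and 5.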

1. The term "random" is very specific in statistics. You did not go to "random" classmates. This can only be done by having a computer select names from a list.

2. Your use of cluster sampling was flawed. Cluster sampling involves surveying all students (not just 10-15) within each cluster (class). It is too late to change how you did your sampling, but you should address this in your paper and explain that further studies could fix this issue. This issue certainly causes bias since humans are not capable of selecting "random" samples. Studies have shown that people tend to select others most like themselves. Explain whether this bias likely made your sample mean higher or lower than the population mean. Proper use of cluster sampling avoids this bias.

3. In your listing of the statistics (mean, standard deviation, Q1, Q3, etc.), you should also explain what specific decisions you recommend NetTutor make based on them. For example, the interquartile range runs from 14.25 (Q1) to 22 (Q3). After looking at these specific numbers, what might NetTutor do to help students and to make a better profit?

4. You should put your charts directly into the paper. If you are having difficulty, you can use the Snipping Tool that comes with Windows or a similar program that comes with the Mac.

5. I would like to see a discussion of each chart: histogram, box plot, stem-and-leaf. Be sure to explain what each shows and how NetTutor will use it (see comment #3).

6. How might NetTutor react to the knowledge that there is an outlier on the high side but not on the low side?

7. You have the Empirical Rule and normal distribution argument backward. The Empirical Rule is not used to determine whether there is a normal distribution. Rather, you first check whether the distribution is normal, and if it is, you can use the Empirical Rule. You check for normality by looking at the histogram and stem-and-leaf plot to see whether the data is roughly normal shaped (not very skewed, no extreme outliers, not bimodal). If you cannot use the Empirical Rule, you have to resort to the weaker Chebyshev's theorem.

8. There was not a "large amount of factors that went into the survey". Instead, there was a diverse population.

9. You talk about the mathematics that you did to find the standard deviation, but didn't you just use the spreadsheet program which automatically found the standard deviation? Better would be to discuss how NetTutor could use the fact that the standard deviation is 4.79 to make its business decisions. How will that information help NetTutor help students and help its bottom line?

10. At the end you discuss that the histogram gives you the standard deviation and the quartiles. The histogram does not give you this information. They are in the spreadsheet's statistics but not shown in the histogram. The boxplot does show the quartiles and you need to explore this.

11. You need to have a conclusion in your paper that discusses the highlights that you discovered and also how you could gather additional information if you had more time to delve deeper into the study and conduct another, longer survey.
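
For the sampling issues raised in comments 1 and 2 of this set, the following Python sketch shows one way a computer could do the selecting. The class names and rosters are invented for illustration, and the fixed seed is there only to make the sketch repeatable; a follow-up study would start from the real class lists.

```python
import random

def simple_random_sample(roster, n, rng):
    """Let the computer pick n distinct names; every subset is equally likely."""
    return rng.sample(roster, n)

def cluster_sample(classes, n_classes, rng):
    """Randomly pick n_classes whole classes and survey EVERY student in them."""
    picked = rng.sample(sorted(classes), n_classes)
    return {name: list(classes[name]) for name in picked}

# Hypothetical classes and rosters, invented for illustration only.
classes = {
    "Class A": ["Ana", "Ben", "Cy"],
    "Class B": ["Dee", "Eli", "Fay", "Gus"],
    "Class C": ["Hal", "Ivy"],
    "Class D": ["Jo", "Kim", "Lee"],
}
rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# Simple random sample: names drawn from one combined list.
roster = [s for students in classes.values() for s in students]
print(simple_random_sample(roster, 5, rng))

# Cluster sample: the randomness is in WHICH classes are chosen, not which students.
print(cluster_sample(classes, 2, rng))
```

The design point is that the person running the survey never chooses individuals. In the cluster version, once a class is drawn, all of its students are surveyed, which is what removes the tendency to pick people similar to oneself.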