# R Project (Deadline: Tuesday, November 5th, 2013, by 11:59 a.m.)

Budget: $35 USD

Project Description:

1. Problem 1. Commercial eggs produced from different housing systems. In the production of commercial eggs in Europe, four different types of housing systems for the chickens are used: cage, free range, barn, and organic. The characteristics of eggs produced from the four housing systems were investigated in Food Chemistry (Vol. 106, 2008). Twenty-eight commercial grade A eggs were randomly selected from supermarkets: 10 produced in cages, 6 in barns, 6 with free range, and 6 organic. A number of quantitative characteristics (response variables) were measured for each egg, including shell thickness (in millimeters), whipping capacity (percent overrun), and penetration strength (in newtons). The data are saved in the [url removed, login to view] file. For each characteristic, the researchers compared the means of the four housing systems.

Note: In this problem, HOUSING is the grouping variable for an ANOVA test on each of the response variables THICKNESS, OVERRUN, and STRENGTH. The variable HOUSING has four levels: CAGE, FREE, BARN, and ORGANIC.

a. Import the data set [url removed, login to view] from Blackboard > Course Documents > Data Sets to RStudio.

b. Run the ANOVA procedure (using the aov() function) for each of the response variables. That is, use the aov() function to test the null hypothesis of equality of the four population means for each of the response variables. Then use the summary() function to obtain the ANOVA table in each case. Write a paragraph on the interpretation of the results.

c. For each response variable, use the [url removed, login to view]() function to make pairwise comparisons of the population means. Use the Bonferroni adjustment. Discuss the results.
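As a rough illustration of parts b and c, the sketch below runs a one-way ANOVA and Bonferroni-adjusted pairwise comparisons for one response variable. The data are simulated stand-ins (the real file name is elided in the posting), the data frame name `eggs` is hypothetical, and the use of `pairwise.t.test()` is an assumption about which function part c intends.

```r
# Simulated stand-in for the egg data: 10 cage, 6 barn, 6 free-range, 6 organic eggs.
# The THICKNESS values are made up for illustration only.
set.seed(1)
eggs <- data.frame(
  HOUSING   = factor(rep(c("CAGE", "BARN", "FREE", "ORGANIC"),
                         times = c(10, 6, 6, 6))),
  THICKNESS = rnorm(28, mean = 0.40, sd = 0.03)
)

# Part b: one-way ANOVA. H0: the four housing-system means are equal.
fit <- aov(THICKNESS ~ HOUSING, data = eggs)
anova_table <- summary(fit)
print(anova_table)

# Part c: pairwise t-tests with Bonferroni-adjusted p-values
# (assumed to be the function the assignment refers to).
pw <- pairwise.t.test(eggs$THICKNESS, eggs$HOUSING,
                      p.adjust.method = "bonferroni")
print(pw)
```

The same two calls would be repeated for the other two response variables once the real data set is imported, for example with `read.csv()` or `read.table()` depending on the file type.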

2. Problem 2. Income of Mexican street vendors. Interviews were conducted in the city of Puebla, Mexico, in order to study the factors influencing vendors’ incomes. The researchers collected data on annual earnings, age, and hours worked per day from a random sample of 15 vendors. The data are saved in the [url removed, login to view] file, which can be found in Blackboard. Notice that, in this data set, the variable Earnings is the response variable and the variables Age and Hours are the predictors.

a. Import the data set [url removed, login to view] from Blackboard > Course Documents > Data Sets to RStudio.

b. Use the pairs() function to make a scatter plot matrix.

c. Fit a multiple regression model using the variable Earnings as the response variable and the variables Age and Hours as the predictor variables.

d. Obtain the regression coefficients and t-tests of significance of the predictors. Use the results to discuss the significance of the predictors. Explain.

e. Obtain 90% confidence intervals for the regression parameters of the model.

f. Get the predicted values and the residuals for the model. Then make a data frame (using the [url removed, login to view]() function) that includes the variables Earnings, Age, Hours, predicted values, and residuals.

Skills required:

Analytics, Statistics
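For Problem 2, parts b through f can be sketched roughly as follows. The data are simulated placeholders (the real file name is elided in the posting), the names `vendors` and `results` are hypothetical, and `data.frame()` is assumed to be the function part f refers to.

```r
# Simulated stand-in for the 15 street vendors; all values are made up.
set.seed(2)
vendors <- data.frame(
  Age   = round(runif(15, 20, 65)),
  Hours = round(runif(15, 4, 12))
)
vendors$Earnings <- 500 + 10 * vendors$Age + 200 * vendors$Hours +
  rnorm(15, sd = 300)

# Part b: scatter plot matrix of all three variables.
pairs(vendors)

# Part c: multiple regression of Earnings on Age and Hours.
fit <- lm(Earnings ~ Age + Hours, data = vendors)

# Part d: coefficient estimates with standard errors, t statistics, and p-values.
coef_table <- summary(fit)$coefficients
print(coef_table)

# Part e: 90% confidence intervals for the regression parameters.
ci90 <- confint(fit, level = 0.90)
print(ci90)

# Part f: combine the data with predicted values and residuals.
results <- data.frame(
  Earnings  = vendors$Earnings,
  Age       = vendors$Age,
  Hours     = vendors$Hours,
  Predicted = fitted(fit),
  Residual  = residuals(fit)
)
print(results)
```

Each row of `results` satisfies Earnings = Predicted + Residual, which is a quick sanity check on the fitted model.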