Bowie State University Regression Analysis for Health Prediction Report
Question Description
Question 1
The accompanying data file contains survey information on 120 American adults who rated their health (Health) and social connections (Social) on a scale of 1 to 100. The file also contains information on their household income (Income, in $1,000s) and college education (College equals 1 if they have completed a bachelor's degree, 0 otherwise).
- A) Build a linear regression model (Model 1) for Health using Social, Income, and College as the predictor variables in the training set. Provide the regression output and comment on the overall model fit in the training set; back up your answer with values from the regression output.
- B) Use Model 1 to predict Health in the validation set and compute RMSE.
- C) Build a linear regression model (Model 2) for Health using Social, Income, and College, as well as the interactions of Social with Income and Social with College, as the predictor variables in the training set. Provide the regression output and comment on the overall model fit in the training set; back up your answer with values from the regression output.
- D) Use Model 2 to predict Health in the validation set and compute RMSE.
- E) Which model would you consider a better predictor? Explain your answer.
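The comparison in parts B, D, and E comes down to computing RMSE for each model on the same validation rows and preferring the smaller value. The following is a minimal sketch of that arithmetic, not a prescribed solution: the helper names `rmse` and `add_interactions` are hypothetical, and the numbers are made up for illustration rather than taken from the assignment's data file.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def add_interactions(row):
    """Extend a (Social, Income, College) row with the two Model 2
    interaction terms: Social*Income and Social*College."""
    social, income, college = row
    return [social, income, college, social * income, social * college]

# Made-up validation values, for illustration only.
health_actual = [70.0, 55.0, 82.0, 64.0]
model1_pred = [68.0, 58.0, 80.0, 60.0]  # hypothetical Model 1 predictions
model2_pred = [69.0, 56.0, 83.0, 63.0]  # hypothetical Model 2 predictions

rmse1 = rmse(health_actual, model1_pred)
rmse2 = rmse(health_actual, model2_pred)

# The model with the lower validation RMSE predicts better out of sample.
better = "Model 1" if rmse1 < rmse2 else "Model 2"
print(f"Model 1 RMSE: {rmse1:.3f}, Model 2 RMSE: {rmse2:.3f}, better: {better}")
```

A higher R-squared in the training set does not by itself settle part E, because adding interaction terms never lowers training R-squared; the validation RMSE is the fairer basis for the comparison.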
Question 2
Annabel, a retail analyst, has been following Under Armour, Inc., the pioneer in the compression-gear market. Compression garments are meant to keep moisture away from a wearer’s body during athletic activities in warm and cold weather. Annabel believes that the Under Armour brand attracts a younger customer, whereas the more established companies, Nike and Adidas, draw an older clientele. To test her belief, she collects data on the age of the customers and whether or not they purchase Under Armour (Purchase; 1 for purchase, 0 otherwise). The data are shown in the accompanying file.
- A) Estimate the logistic regression model where the Under Armour purchase depends on age.
- B) Compute the predicted probability of an Under Armour purchase for a 20-year-old customer and a 30-year-old customer.
- C) Is Annabel correct that the Under Armour brand attracts a younger customer? Explain.
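Once the logistic model in part A is estimated, the predicted probability in part B follows from the logistic function applied to the fitted linear predictor. Here is a minimal sketch, assuming placeholder coefficients: `b0` and `b1` below are invented for illustration and are not estimates from the assignment's data, and `logistic_probability` is a hypothetical helper name.

```python
import math

def logistic_probability(intercept, slope, x):
    """P(y = 1 | x) = 1 / (1 + exp(-(intercept + slope * x)))."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))

# Placeholder coefficients, NOT fitted values from the data file.
b0, b1 = 4.0, -0.15

p20 = logistic_probability(b0, b1, 20)  # 20-year-old customer
p30 = logistic_probability(b0, b1, 30)  # 30-year-old customer

print(f"P(purchase | age 20) = {p20:.4f}")
print(f"P(purchase | age 30) = {p30:.4f}")
```

For part C, the sign and statistical significance of the age coefficient are what matter: a negative, significant slope would mean the purchase probability falls with age, which would support Annabel's belief.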
Question 3
According to the 2017 census, just over 90% of Americans have health insurance (CNBC, May 22, 2018). However, a higher percentage of Americans on the lower end of the economic spectrum are still without coverage. Consider a portion of data in the following table relating to insurance coverage (1 for coverage, 0 for no coverage) for 30 working individuals in Atlanta, Georgia. Also included in the table are the percentage of the premium paid by the employer and the individual’s income (in $1,000s).
- A) Estimate the logistic regression model where insurance coverage depends on premium percentage and income in the training set. Provide the regression output and comment on the overall model performance in the training set; back up your answer with values from the regression output.
- B) Use your model to predict insurance coverage in the validation set. What percentage of observations is correctly classified in the validation set?
- C) Consider an individual with an income of $60,000. What is the probability that she has insurance coverage if her employer contributes 50% of the premium? What if her employer contributes 75% of the premium?
- D) In your opinion, what is the quality of the predictions in part C? Clearly support your opinion with values from your analysis.
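Parts B and C can be sketched as two small computations: the share of validation observations classified correctly at a probability cutoff, and the predicted coverage probability for given predictor values. The sketch below assumes a 0.5 cutoff and uses placeholder coefficients and made-up validation values; none of these numbers come from the assignment's data, and the function names are hypothetical.

```python
import math

def coverage_probability(b0, b_premium, b_income, premium_pct, income_k):
    """Logistic probability of coverage given the employer's premium
    percentage and the individual's income in $1,000s."""
    z = b0 + b_premium * premium_pct + b_income * income_k
    return 1.0 / (1.0 + math.exp(-z))

def percent_correct(actual, probabilities, cutoff=0.5):
    """Percentage of observations whose predicted class matches the actual
    class, classifying as 1 when the probability is at or above the cutoff."""
    predicted = [1 if p >= cutoff else 0 for p in probabilities]
    correct = sum(1 for a, y in zip(actual, predicted) if a == y)
    return 100.0 * correct / len(actual)

# Placeholder coefficients, NOT fitted values from the data file.
b0, b_premium, b_income = -5.0, 0.05, 0.04

# Part C: income of $60,000 is entered as 60 because income is in $1,000s.
p50 = coverage_probability(b0, b_premium, b_income, 50, 60)
p75 = coverage_probability(b0, b_premium, b_income, 75, 60)

# Part B: made-up validation outcomes and predicted probabilities.
validation_actual = [1, 0, 1, 1]
validation_probs = [0.9, 0.2, 0.4, 0.7]
accuracy_pct = percent_correct(validation_actual, validation_probs)

print(f"P(coverage | 50% premium) = {p50:.4f}")
print(f"P(coverage | 75% premium) = {p75:.4f}")
print(f"Correctly classified: {accuracy_pct:.1f}%")
```

The validation accuracy from part B is one natural value to cite in part D, since it indicates how far the probabilities from part C can be trusted on new observations.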
Note: The Excel files for all 3 questions will be uploaded.