Due date: 11:59pm, Tuesday, February 23
You MUST submit both your .Rmd and .pdf files to the course site on Gradescope here: https://www.gradescope.com/courses/224715/assignments. Do NOT create a zipped folder with the two documents but instead, upload them as two separate documents. Also, be sure to knit to pdf and NOT html; ask the TA about knitting to pdf if you cannot figure it out. Be sure to submit under the right assignment entry. Finally, when submitting your files, please select the corresponding pages for each exercise.
This is an individual lab and each student must submit a separate solution by the due date.
This lab is based on Exercise 6 of Section 12.11 of Data Analysis Using Regression and Multilevel/Hierarchical Models by Gelman A., and Hill, J. The data is from the following article:
Hamermesh, D. S. and Parker, A. (2005), “Beauty in the classroom: instructors’ pulchritude and putative pedagogical productivity”, Economics of Education Review, v. 24 (4), pp. 369-376.
The data contains information about student evaluations of instructor’s beauty and teaching quality for several courses at the University of Texas from 2000 to 2002. Evaluations were carried out during the last 3 weeks of the 15-week semester. Students administer the evaluation instrument while the instructor is absent from the classroom. The beauty judgements were made later using each instructor’s pictures by six undergraduate students (3 women and 3 men) who had not attended the classes and were not aware of the course evaluations. The sample contains a total of 94 professors across 463 classes, with the number of classes taught by each professor ranging from 1 to 13. Underlying the 463 observations are 16,957 completed evaluations from 25,547 registered students. The data you will use for this exercise excludes some variables in the original dataset.
Read the article (available via Duke library) for more information about the problem.
Download the data here. The code book is given below:
Variable | Description |
---|---|
profnumber | Id for each professor |
beauty | Average of 6 standardized beauty ratings |
eval | Average course rating |
CourseID | Id for 30 individual courses. The remaining classes were not identified in the original data, so that they have value 0. |
tenured | Is the professor tenured? 1 = yes, 0 = no |
minority | Is the professor from a minority group? 1 = yes, 0 = no |
age | Professor’s age |
didevaluation | Number of students filling out evaluations |
female | 1 = female, 0 = male |
formal | Did the instructor dress formally (that is, wears tie–jacket/blouse) in the pictures used for the beauty ratings? 1 = yes, 0 = no |
lower | Lower division course? 1 = yes, 0 = no |
multipleclass | 1 = more than one professor teaching sections in course in sample, 0 = otherwise |
nonenglish | Did the Professor receive an undergraduate education from a non-English speaking country? 1 = yes, 0 = no |
onecredit | 1 = one-credit course, 0 = otherwise |
percentevaluating float | didevaluation/students |
profevaluation | Average instructor rating |
students | Class enrollment |
tenuretrack | Is the professor tenure track faculty? 1 = yes, 0 = no |
You do not need to write a full report for the exercises below. You can simply answer/address each question directly.
Treat the variable eval
as the response variable and the other variables as potential predictors.
Is the distribution of eval
normal? If not, try the log transformation. Does that look more “normal”? If you think the log transformation does not help, what other transformation(s) do you think might work? Examine and describe the distribution(s) for the transformation(s).
Describe the overall relationship between eval
(or the transformed version) and beauty
.
Examine the relationship between eval
(or the transformed version) and beauty
, by CourseID
(think EDA for interactions). Are there any courses for which the relationship between eval
and beauty
looks potentially different than others?
There are 31 levels of CourseID
in all, which might be tough to explore graphically, so you should probably take a look at a random sample of say 7 classes plus class CourseID
== 0, making a total of 8 classes. Note that level CourseID
== 0 actually includes so many other classes which were not identified in the data. For the purposes of this lab, we will treat that class as one single huge class.
Do you think it is meaningful to fit a multilevel linear model that includes varying slopes for beauty
by profnumber
? Why or why not?
Note that we should not include profevaluation
as a predictor for eval
. Why is that?
Fit a random intercepts model for eval
(or the transformed version) grouped by profnumber
, with beauty
as the only predictor. Interpret the results in the context of the question.
Using the model from question 6 above, how does the variation in average ratings across professors compare to the variation in ratings for the same professor? Which is higher, and what does that mean to you?
Extend the model from question 6 by also allowing for random slopes for beauty
by CourseID
. Compare the two models using a metric or test of your choice. What can you conclude from your comparison? If you find that including the random slopes is necessary, interpret the results of the new model: the fixed effect estimates and the estimated standard deviations.
DO NOT INCLUDE R CODE OR OUTPUT IN YOUR SUBMISSION. All R outputs should be converted to nicely formatted tables. Feel free to use R packages such as kable, xtable, stargazer, etc.