DA Homework 4

Task 1

Download civil-rights-act.csv from the Data section (it might make sense to look at the description file as well). Using the data, answer the questions below by running regressions.

  • Which of the two parties (Democrats or Republicans) was more supportive of the Civil Rights Act? First, look at a scatter plot. It might help to experiment with jitter to see the points better. (Hint: look at this.)

  • Run a regression of the voting dummy on the party dummy (note that R automatically treats a character variable in a regression as a dummy). What share of each party voted for the act?

  • Which states (northern or southern) were more supportive of the act? What share of the representatives in each of the two groups of states voted for the act?

  • When controlling for the state, which party was more supportive of the act? How does this compare to what you found in task b)? How could you explain the difference?
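The mechanics behind the questions above can be sketched on made-up data. This is only an illustration, not the assignment's solution: the data frame `votes` and its columns `party`, `region`, and `vote` are hypothetical names with invented values, and the real column names in civil-rights-act.csv may differ. The sketch shows how `jitter()` separates overlapping 0/1 points and why regressing one dummy on another recovers group shares: the intercept is the share in the reference group, and the intercept plus the slope is the share in the other group.

```r
# Synthetic stand-in data. The column names (party, region, vote) and all
# values are hypothetical; they are NOT taken from civil-rights-act.csv.
votes <- data.frame(
  party  = rep(c("democrat", "republican"), each = 10),
  region = rep(c("north", "south", "north", "south"), times = c(6, 4, 9, 1)),
  vote   = c(1, 1, 1, 1, 1, 1, 0, 0, 0, 0,
             1, 1, 1, 1, 1, 1, 1, 1, 0, 0)
)

# With two 0/1 variables every point lands on one of four spots, so a small
# amount of random noise (jitter) makes the density of each spot visible.
party_num <- as.numeric(votes$party == "republican")
if (interactive()) {
  plot(jitter(party_num, amount = 0.1), jitter(votes$vote, amount = 0.1),
       xlab = "party (0 = democrat, 1 = republican)", ylab = "voted yes")
}

# R turns the character variable into a dummy; the alphabetically first
# level ("democrat") becomes the reference group.
fit <- lm(vote ~ party, data = votes)

# Intercept = share of yes votes in the reference group;
# intercept + slope = share of yes votes in the other group.
share_dem <- coef(fit)[["(Intercept)"]]
share_rep <- share_dem + coef(fit)[["partyrepublican"]]

# The same shares computed directly as group means, for comparison.
group_means <- tapply(votes$vote, votes$party, mean)

# Adding a second variable to the formula is how one "controls" for it.
fit_ctrl <- lm(vote ~ party + region, data = votes)
print(coef(fit_ctrl))
```

Comparing `share_dem` and `share_rep` with `group_means` is a quick check that the regression and the plain group averages agree.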

Task 2

Use easyshare_sample.csv for this task. This is a sample from the easySHARE project (you can read more about this here). The Survey of Health, Ageing and Retirement in Europe (SHARE) is a multidisciplinary panel survey targeting individuals over 50 years of age. Hungary participated once, in the 4th wave. The data cover the Hungarian sample with only a few variables: lm_status was originally called ep005_, and mbirth was originally called dn002_mod. You can read about the variables here. The recall variables contain scores from a simple memory test: 10 simple words are read to the survey participants, who are asked to repeat them once immediately (recall_1) and once after some delay (recall_2). The sum of these two scores (ranging from 0 to 20) forms a good measure of the memory of the elderly.

  • Know your data. Look at summaries, check for strange values, and try to clean them. (Hint: negative values usually mean missing values; turn them into NAs.)

  • Create a new dummy variable which takes the value one for those who do not work (those for whom lm_status is not equal to 2). Create a new variable which is the total word recall.

  • Examine whether the memory of those who are not working is worse than that of those who are. Run a regression which answers this question and interpret the coefficients.

  • Is there any other variable you may want to include in the regression to get closer to answering the question of whether working preserves one's memory? Include it in the regression and interpret your results.
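The cleaning and variable-construction steps above can be sketched as follows. This is a minimal illustration on invented numbers, not the real easyshare_sample.csv: the column names lm_status, recall_1, and recall_2 come from the task description, while the data frame name `share` and the new column names `not_working` and `recall_total` are assumptions made for the example.

```r
# Synthetic stand-in for easyshare_sample.csv; all values are invented.
share <- data.frame(
  lm_status = c(1, 2, 2, 1, -12, 2, 1, 3),
  recall_1  = c(5, 7, 6, -2, 4, 8, 3, 5),
  recall_2  = c(3, 6, 5, 2, 3, 7, -1, 4)
)

# A first look at the data: negative minimums stand out in the summary.
print(summary(share))

# Negative codes mark missing answers, so replace them with NA everywhere.
share[share < 0] <- NA

# Dummy: 1 for those who do not work (lm_status not equal to 2), 0 otherwise.
# A missing lm_status stays NA instead of being silently counted as 0 or 1.
share$not_working <- as.numeric(share$lm_status != 2)

# Total word recall: immediate plus delayed score (NA if either part is NA).
share$recall_total <- share$recall_1 + share$recall_2

# lm() drops rows with missing values by default. The intercept is the mean
# total recall of those who work; the slope is the difference in means
# between those who do not work and those who do.
fit <- lm(recall_total ~ not_working, data = share)
print(coef(fit))
```

Keeping missing values as NA, rather than recoding them to zero, is a deliberate choice here: a recoded zero would be treated as a real score and would bias both the summaries and the regression.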

Task 3

Collect the data you would like to use for your term project. Make a plot, or run a regression using that data, and interpret your results.

Task +1

Watch this video and collect 3 positive (or negative) points about the presentation.