Religion_2SR – Exploring Further the Relationship between Religion and Attitudes toward Same-Sex Marriage

Note to the Instructor: This is the second in a series of two exercises that focus on the relationship between religion and attitudes about same-sex marriage. In these exercises we're going to analyze data from the Pew 2014 Religious Landscape Survey conducted by the Pew Research Center. We're going to use SPSS to analyze the data. This exercise uses frequency distributions, three-variable tables, Chi Square, and measures of association as our statistical tools. A weight variable is automatically applied to the data set so it better represents the population from which the sample was selected. You have permission to use this exercise and to revise it to fit your needs. Please send a copy of any revision to the author so I can see how people are using the exercises. Please contact the author for additional information.

Goal of Exercise

The goal of this exercise is to introduce three-variable (i.e., multivariate) data analysis. In the previous exercise (Religion_1SR) we explored the relationship between religiosity and how people felt about same-sex marriage. We discovered that those who were more religious were more likely to oppose same-sex marriage. In this exercise we'll consider whether this relationship might be spurious. We'll use three-variable crosstabulations, percentages, Chi Square, and measures of association as our statistical tools. In the next exercise (Religion_3ER) we'll explore the relationship between religion and how people feel about the environment.

Part I – The Data Set We'll be Using

The Pew Research Center has conducted a number of surveys that deal with religion. Two of these surveys are the Religious Landscape Surveys conducted in 2007 and then repeated in 2014. They were very large telephone surveys of about 35,000 adults in the United States. For more information about the surveys, go to their website. 

We'll be using a subset of the 2014 survey in this exercise which I have named Pew_2014_Religious_Landscape_Survey_subset_for_classes.sav. For the purposes of these exercises I selected a subset of variables from the complete data set. I recoded some of the variables, created a few new variables, and renamed the variables to make them easier for students to use. There is a weight variable which should always be used so that the sample will better represent the population from which the sample was selected. To open the data set in SPSS, just double click on the file name.[1] Your instructor will tell you where the file is located.

Part II – Same-Sex Marriage

The Pew survey asked respondents "do you strongly favor, favor, oppose, or strongly oppose allowing gays and lesbians to marry legally?" Let's start by finding out how respondents answered this question. If you haven't opened the data set yet, open it now. Run a frequency distribution for SS1, which is the name of this variable in the data set. The variable name starts with the letters SS which tells you that this variable describes how people feel about same-sex marriage. Some of you have used SPSS, the statistical package we're using, and know how to get a frequency distribution. Others of you are new to SPSS. There is a tutorial that you can use to learn how to get a frequency distribution. The tutorial is freely available on the Social Science Research and Instructional Center's website. Chapter 1 of the tutorial gives you a basic overview of SPSS and frequency distributions are covered in Chapter 4. 

It's easy to get frequency distributions. Once you have opened the data set in SPSS, look on the menu bar at the top and click on "Analyze." This will open a drop-down menu. Click on "Descriptive Statistics" and then on "Frequencies." Notice that the list of all variables is in the pane on the left. Select SS1 by clicking on it and then click on the arrow pointing to the right. This will move SS1 into the "Variable(s)" box. Now all you have to do is click on "OK" to get your frequency distribution.[2] 

The frequency distribution tells you how respondents answered this question. The difference between the percents and valid percents in the table is important. Percents are based on everyone in the sample while valid percents are based on only those who gave a valid answer. For various reasons, some respondents have missing data. Missing data for this variable refers to respondents who said they didn't know or refused to answer the question. For this variable, these respondents were given the value of "9". Valid percents are computed by removing these respondents from the base for the percent. To make sure you understand the difference between percents and valid percents, answer the following questions.
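The arithmetic behind the two kinds of percents can be sketched in a few lines of Python. This is only an illustration: the counts below are invented and are not the Pew results, the function name percents is hypothetical, and the sketch ignores the weight variable that SPSS applies.

```python
# Hypothetical counts, invented for illustration; they are NOT the Pew results.
counts = {
    "Strongly favor": 300,
    "Favor": 250,
    "Oppose": 200,
    "Strongly oppose": 150,
    "Don't know/refused (9)": 100,
}
MISSING = {"Don't know/refused (9)"}


def percents(counts, missing):
    """Return {category: (percent, valid_percent)}.

    Percent uses everyone in the sample as the base. Valid percent uses
    only respondents with a valid answer, so it is None for missing data.
    """
    total = sum(counts.values())
    valid_total = sum(n for cat, n in counts.items() if cat not in missing)
    result = {}
    for cat, n in counts.items():
        pct = 100 * n / total
        valid_pct = None if cat in missing else 100 * n / valid_total
        result[cat] = (pct, valid_pct)
    return result


table = percents(counts, MISSING)
for cat, (pct, valid_pct) in table.items():
    valid_text = "missing" if valid_pct is None else f"{valid_pct:.1f}"
    print(f"{cat:<25}{pct:>8.1f}{valid_text:>10}")
```

With these invented counts the percent for "Strongly favor" is based on 1,000 respondents, while the valid percent is based on the 900 who gave a valid answer, so the valid percent is the larger of the two.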

  • What is the percent for those who strongly favor same-sex marriage? What does this mean?
  • What is the valid percent for this category? What does this mean?
  • Why aren't the percents and valid percents the same?

Part III – Religiosity and Attitudes toward Same-Sex Marriage

In the previous exercise (Religion_1SR) we explored the relationship between religiosity and how people felt about same-sex marriage. Religiosity refers to the strength of a person's attachment to their religious preference. In other words, it describes how religious a person is. There are three commonly used measures of religiosity – how often a person attends religious services, how important they say religion is to them, and how often they pray. We're going to use the respondent's self-identified importance of religion in this exercise. The Pew survey asked, "How important is religion in your life – very important, somewhat important, not too important, or not at all important?" This is called REL2 in the data set.

Run a crosstabulation showing the relationship between REL2 and SS1. (See Chapter 5, Cross Tabulations, in the online SPSS book cited on page 1 of this exercise.) You're going to put your two variables (i.e., REL2 and SS1) in the "Row(s)" and "Column(s)" boxes by clicking on the variable in the left-hand pane to select it and then clicking on the arrow that points to the right. When you do that, the arrow will change so it points left. If you click on it again, it will move the variable back to the left-hand pane. That way you can correct the error if you select the wrong variable.

But which variable goes in which box? Typically, we put the independent variable in the column box and the dependent variable in the row box. So we're going to put SS1 in the row box and REL2 in the column box. We're also going to click on the "Cells" box and check the box for the "Column" percents. If your independent variable is in the columns, then you want to use the column percents. If it is in the rows, then you want to use the row percents. To get the table, click on "Continue" and then on "OK." 

There are two numbers in each cell of the table. The top number is the number of cases in each cell and the bottom number is the column percent. Notice that the column percents add down by column to 100. Since the percents sum down to 100, you want to compare the percents straight across. Always compare the percents in the direction opposite to the way they sum to 100. 
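To see why column percents sum down to 100 and are compared across, here is a minimal Python sketch. It assumes a table collapsed to two categories per variable to keep it short, uses invented counts that are not the Pew results, ignores the weight variable, and the function name column_percents is hypothetical.

```python
# Hypothetical cell counts (NOT the Pew results). Rows are the dependent
# variable (favor/oppose); columns are the independent variable
# (importance of religion), collapsed to two categories for brevity.
row_labels = ["Favor", "Oppose"]
col_labels = ["Very important", "Not important"]
counts = [
    [200, 300],  # Favor
    [300, 100],  # Oppose
]


def column_percents(counts):
    """Percentage each cell within its column, so every column sums to 100."""
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [100 * cell / col_totals[j] for j, cell in enumerate(row)]
        for row in counts
    ]


col_pct = column_percents(counts)
print(" " * 8, col_labels)
for label, row in zip(row_labels, col_pct):
    print(f"{label:<8}", [f"{p:.1f}" for p in row])
```

Each column of the result sums to 100, so the comparison that matters is across a row: in this invented table, 40.0 percent of the "Very important" column favor compared to 75.0 percent of the "Not important" column.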

We're going to use Chi Square to help us interpret the table. Chi Square is a test of significance that tests the null hypothesis that the two variables are unrelated to each other. In statistical speak, we would say that the null hypothesis is that the variables are statistically independent. Chi Square tests this null hypothesis and tells you whether you should reject it or not reject it. If you can reject it, then you have evidence that the two variables are related to each other. If you can't reject it, then you don't have any evidence of a relationship. 

We're also going to use a measure of association. A measure of association is a statistic that measures the strength of the relationship. The Chi Square test doesn't tell you anything about the strength of the relationship. You need a measure of association to do that. There are many different measures of association. Kendall's tau-b is used when both of your variables are ordinal. Ordinal means that the categories have an inherent order to them. In other words, they are ordered from high to low or from low to high.

To get Chi Square and Kendall's tau-b click on the "Statistics" button and then click on the boxes for both Chi Square and Kendall's tau-b. Click on "Continue" and then on "OK" to get the table. 

Now the question is how to interpret Chi Square and Kendall's tau-b. To interpret Chi Square look at the first row for "Pearson Chi Square" and the column for "Asymptotic Significance." In your output, it should read ".000". This is the probability of finding a relationship at least this strong in a sample if the null hypothesis were true, so you can think of it as the risk of being wrong if you reject the null hypothesis. It's actually not 0, but rather it is less than (<) .0005 since it's a rounded value. That tells you that it's very unlikely that this is a chance relationship. There probably is some relationship between these two variables. Our rule is to reject the null hypothesis when the significance value is < .05, in other words, when the risk of being wrong is less than five out of one hundred.
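For readers who want to see what the Pearson Chi Square statistic is built from, the following Python sketch computes it for a small table. The counts are invented (not the Pew results), the table is collapsed to two categories per variable, the weight variable is ignored, and the function name chi_square is hypothetical. SPSS reports the significance value directly; this sketch instead compares the statistic with 3.841, the standard critical value for one degree of freedom at the .05 level.

```python
def chi_square(observed):
    """Pearson Chi Square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical counts (NOT the Pew results): rows are favor/oppose,
# columns are religion very important/not important.
observed = [
    [200, 300],
    [300, 100],
]

CRITICAL_VALUE_DF1 = 3.841  # Chi Square critical value for df = 1 at the .05 level

statistic, df = chi_square(observed)
reject_null = statistic > CRITICAL_VALUE_DF1
print(f"Chi Square = {statistic:.2f}, df = {df}, reject null: {reject_null}")
```

The larger the gap between the observed counts and the counts expected under independence, the larger the statistic and the smaller the significance value.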

To interpret Kendall's tau-b look at the value in your table. Tau-b varies from -1 to +1. The sign tells you the direction of the relationship and the size of the number tells you its strength. For this exercise ignore the sign when you interpret tau-b and think of a continuum from 0 (no relationship) to 1 (strongest possible relationship). Measures of association are useful when comparing tables to see which table has the stronger or weaker relationship. 
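The idea behind tau-b is to compare pairs of respondents: pairs ordered the same way on both variables (concordant) count toward a positive value and pairs ordered in opposite ways (discordant) count toward a negative value, with a correction for ties. A minimal Python sketch of that calculation follows. The counts are invented (not the Pew results), the weight variable is ignored, and the function name kendall_tau_b is hypothetical.

```python
import math


def kendall_tau_b(table):
    """Kendall's tau-b for a table of counts whose rows and columns are both ordered."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a later row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]
    n = sum(sum(row) for row in table)
    total_pairs = n * (n - 1) / 2
    # Pairs tied on the row variable and pairs tied on the column variable.
    row_ties = sum(t * (t - 1) / 2 for t in (sum(row) for row in table))
    col_ties = sum(t * (t - 1) / 2 for t in (sum(col) for col in zip(*table)))
    denominator = math.sqrt((total_pairs - row_ties) * (total_pairs - col_ties))
    return (concordant - discordant) / denominator


# Hypothetical counts (NOT the Pew results).
example = [
    [200, 300],
    [300, 100],
]
perfect = [
    [10, 0],
    [0, 10],
]

tau_example = kendall_tau_b(example)
tau_perfect = kendall_tau_b(perfect)
print(f"tau-b for the example table: {tau_example:.2f}")
print(f"tau-b for a perfect relationship: {tau_perfect:.2f}")
```

Ignoring the sign, the example table sits about a third of the way along the continuum from 0 to 1, while the table with every case on the diagonal reaches the maximum.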

Write a paragraph that summarizes the relationship between religiosity and attitudes toward same-sex marriage. Be sure to answer the following questions and to use the column percents.

  • Were people who felt that religion was important to them more or less likely to favor same-sex marriage than those who felt religion was not important? Use the column percents to illustrate your answer.
  • What does the Chi Square test tell you about this relationship?
  • What does Kendall's tau-b tell you about the relationship?

Part IV – Spuriousness due to Sex

At this point we have only considered two variables. We need to consider other variables that might be related to religiosity and attitudes toward same-sex marriage. For example, sex may be related to both these variables. Women may be more likely to favor same-sex marriage and women may also be more likely to feel that their religion is important to them. This raises the possibility that the relationship between self-reported strength of religion and how one feels about same-sex marriage might be due to sex. In other words, it may be spurious due to sex.

Let’s check to see if sex is related to both our independent and dependent variables. This is important because the relationship can only be spurious if the third variable (sex) is related to both your independent and dependent variables. Use CROSSTABS to get two tables – one table should cross tabulate D14 (sex) and REL2 and the other table should cross tabulate D14 and SS1. Be sure to get the percents, Chi Square, and Kendall's tau-b. If sex is related to both variables, then we need to check further to see if the original relationship between religiosity and how people feel about same-sex marriage is spurious as a result of sex.

Write a paragraph describing the relationship between sex and your independent and dependent variables. Remember to use the percents, Chi Square, and Kendall's tau-b in your answer.

Since sex is related to both variables we need to check on the possibility that the relationship between strength of religion and how people feel about same-sex marriage is due to the effect of sex on that relationship. What we can do is to separate males and females into two tables and look at the relationship between strength of religion and attitudes toward same-sex marriage separately for men and for women. In effect, we are holding sex constant. We can do that in SPSS by getting a crosstab with REL2 in the column (our independent variable), SS1 in the row (our dependent variable), and D14 in the third box down in SPSS. (See Chapter 8, Multivariate Analysis, in the online SPSS book mentioned on page 1 of this exercise.) In this case, sex is the variable we are holding constant and is often called the control variable. 

Check to see what happens to the relationship between importance of religion and opinion on same-sex marriage when we hold sex constant. If the original relationship is spurious, then it either ought to go away or to decrease substantially for both males and females. So look carefully at the two tables (i.e., one table for males and the other table for females). But how can we tell if the relationship goes away or decreases for both males and females? One clue will be the percent differences. Compare the percent differences between those who are more religious and those who are less religious for males and then for females with the percent differences in the original two-variable table.[3] Did the percent differences stay about the same or did they decrease substantially? Another clue is your measure of association. Did Kendall's tau-b stay about the same or did it decrease substantially from that in the original two-variable table?
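The comparison of percent differences can be sketched in Python as follows. Everything here is an assumption made for illustration: the counts are invented (not the Pew results), both variables are collapsed to two categories, the weight variable is ignored, and the function names percent_favor, percent_difference, and combine are hypothetical.

```python
def percent_favor(cell):
    """Percent favoring, given a (favor, oppose) pair of counts."""
    favor, oppose = cell
    return 100 * favor / (favor + oppose)


def percent_difference(table):
    """Percent favoring among 'Not important' minus percent favoring among 'Important'."""
    return percent_favor(table["Not important"]) - percent_favor(table["Important"])


def combine(partials):
    """Add the partial tables together to rebuild the original two-variable table."""
    combined = {}
    for table in partials.values():
        for category, (favor, oppose) in table.items():
            f, o = combined.get(category, (0, 0))
            combined[category] = (f + favor, o + oppose)
    return combined


# Hypothetical (favor, oppose) counts for each category of the control variable.
partial_tables = {
    "Male": {"Important": (80, 120), "Not important": (150, 50)},
    "Female": {"Important": (120, 180), "Not important": (150, 50)},
}

original_diff = percent_difference(combine(partial_tables))
partial_diffs = {sex: percent_difference(t) for sex, t in partial_tables.items()}

print(f"Original table: {original_diff:.1f} point difference")
for sex, diff in partial_diffs.items():
    print(f"{sex}: {diff:.1f} point difference")
```

In this invented example the percent difference is the same for males, for females, and in the original table, which is the pattern you would expect when the relationship is not spurious due to the control variable. If the differences in both partial tables had shrunk toward zero, that would point toward spuriousness.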

If the relationship had been due to sex, then the relationship between importance of religion and opinion on same-sex marriage would have disappeared or decreased substantially for both males and females when we took out the effect of sex by holding it constant. In other words, the relationship would be spurious. Spurious means that there is a statistical relationship, but not a causal relationship. It is important to note that just because a relationship is not spurious due to sex doesn’t mean that it is not spurious at all. It might be spurious due to some other variable such as age.

Write a paragraph describing the relationship between religiosity and attitudes toward same-sex marriage for males and then write a second paragraph describing the relationship for females. Now write a third paragraph discussing whether this relationship is spurious due to sex. Be sure to describe how you came to your conclusion. Remember to use the percents, Chi Square, and Kendall's tau-b in your answer.

Part V – Spuriousness due to Age

Now let's see if the relationship is spurious due to age. Run a frequency distribution for D6 which is the name of the variable for age. Notice that there are a large number of categories. To reduce the number of categories I recoded D6 into two new variables – D6R1 and D6R2.[4] Both of these variables recode age into different sets of four categories. We're going to use D6R2 in this part of the exercise. 

Follow the same steps that you used in Part IV.

  • Crosstabulate D6R2 and SS1 and then run another crosstabulation for D6R2 and REL2 to see if your control variable (i.e., recoded age) is related to both your independent and dependent variables.
  • Run a three-variable table with REL2 as your independent variable, SS1 as your dependent variable, and D6R2 as your control variable.
  • Write a paragraph describing the relationship between religiosity and attitudes toward same-sex marriage for each category of your control variable (age). Since there are four categories for age, that means you will have four paragraphs. 
  • Now write a fifth paragraph discussing whether this relationship is spurious due to age. Be sure to describe how you came to your conclusion. Remember to use the percents, Chi Square, and Kendall's tau-b in your answer.

Part VI – Conclusions

Summarize what you learned in this exercise. Was the relationship spurious when you controlled for sex? Was it spurious when you controlled for age? What does it mean to say a relationship is spurious?


[1] This assumes that the proper associations have been set up on your computer so the computer knows that .sav files are SPSS data files.

[2] SPSS allows you to change the way your output is displayed.  You can change these preferences by clicking on "Edit" in the menu bar at the top of the screen and then clicking on "Options" and finally on the "Output" tab.  Under "Variables in item labels shown as" select "Names and Labels" and then under "Variable values in item labels shown as" select "Values and Labels."  Then click on "OK."  You can also try out other options.

[3] The percent difference refers to the difference between the percents for those who say religion is important and those who say it's not important.  For example, subtract the percent of those for whom religion is important that favor same-sex marriage from the percent of those for whom it's not important that favor same-sex marriage.  It doesn't matter which percent you subtract from the other percent as long as you are consistent.

[4] The "R" indicates that it is a recoded or composite variable and the "1" indicates that it is the first recoded or composite variable.  The second recoded or composite variable would be "R2".