Emilee Rader, Kelley Cotter and Janghee Cho. “Explanations as Mechanisms for Supporting Algorithmic Transparency” CHI 2018. Montreal, Quebec, Canada. April 2018. doi: 10.1145/3173574.3173677

1 Data Collection Overview

1.1 When did the study take place?

Data collection took place from 2017-08-10 through 2017-08-24. Qualtrics recruited participants through its panel service, targeting US Facebook users aged 18 or older and a sample that was 52% female and 35% over 55 years old.

1.2 How many responses were collected?

The survey was started by 6842 potential participants. After data cleaning, there are 681 participants in the dataset. The number of participants in each condition is reported here:

1.3 How long did it take to complete the study?

The study took an average of 21.57 minutes for participants to complete, after they had finished the consent and screening questions. The maximum completion time (capped at 60 minutes by a limit placed in the survey which participants were informed about in the consent form and instructions) was 59.14 minutes, and the minimum was 6.07 minutes.

2 Survey Questions and Descriptive Statistics

2.1 Screening Questions

  • Do you use Facebook? [Yes/No]

  • What is your age in years? (variable name: age)

  • How many Facebook friends do you have? (variable name: fb_friends)

  • How long have you been a user of Facebook? (variable name: fb_how_long) [Less than one month / A couple of months / 6 months or so / About a year / 1-2 years / 3-5 years / More than 5 years]

  • How often do you usually visit Facebook? (variable name: fb_visit) [Several times per day / About once per day / A few times per week / About once per week / A few times per month / About once per month / Less than once per month / Never]

  • Do you now, or have you ever managed a Page on Facebook? [Yes/No]

  • Do you now, or have you ever worked in a job where your responsibilities included: ‘posting content on social media’, ‘communicating with clients, customers, etc. via social media’, or ‘working on the social media strategy for your organization’? [Yes/No]

  • Do you now, or have you ever worked in a job where your responsibilities included computer programming, quality assurance and testing, IT security, or network administration? [Yes/No]

  • What is your gender? (variable name: gender) [Woman / Man / Fill in the blank / Prefer not to answer]

2.2 Control Variable Questions

2.2.1 Trust Propensity

A series of 6 questions was designed to measure the tendency or propensity of each participant to trust in “social media websites”. The questions were:

Please respond to the following statements by answering how much you agree or disagree with each one:

  • I feel that most social media websites act in people’s best interests.
  • Most social media websites are helpful.
  • In general, social media websites are well-managed.
  • Most social media websites are able to meet users’ needs.
  • I feel fine using social media websites since they are generally reliable.
  • I always feel confident that social media websites will do what I want them to do.

The mean of this composite variable was 4.78 (SD = 0.98, median = 4.83, Cronbach’s alpha: 0.87). (variable name: trust_propensity)

Here are the descriptive statistics for each of those trust propensity questions:

2.2.2 Internet Literacy

A series of 8 questions was designed to measure participants’ internet literacy by asking how familiar they were with internet-related terms. The questions were:

How familiar are you with the following Internet-related terms? Please rate your understanding of each term below from None (no understanding) to Full (full understanding)

  • Wiki.
  • Meme.
  • Phishing.
  • Bookmark.
  • Cache.
  • SSL.
  • AJAX.
  • RSS.

A composite variable was created for each participant by averaging the responses for each participant across the eight internet literacy items. The mean of this composite variable was 2.69 (SD = 0.82, median = 2.75, Cronbach’s alpha: 0.81). (variable name: internet_literacy)
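
The averaging step and the reliability statistic reported above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names and the toy responses are hypothetical, and the standard textbook formula for Cronbach's alpha is assumed.

```python
from statistics import mean, variance


def composite_scores(responses):
    """Average each participant's item responses into one composite score.

    `responses` is a list of rows, one row per participant, one value per item.
    """
    return [mean(row) for row in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha using sample variances (rows = participants)."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical toy data: 4 participants x 4 items (not from the study).
toy = [
    [1, 2, 1, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
]
scores = composite_scores(toy)
alpha = cronbach_alpha(toy)
```

The same two functions would apply to any of the composite variables in this document, given the item-level responses.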

Here are the descriptive statistics for each of those questions:

2.3 Questions about Facebook Usage and Behaviors

2.3.1 Years on Facebook

What was the date of your very first activity on Facebook? (variable name: year_diff)

Note: In this question, we asked participants to look up the date of their first activity on Facebook, on their Activity Log page. Then we calculated the number of years they’ve been using Facebook by subtracting that date from the date they participated in the experiment.
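
As a rough sketch of the date subtraction described in this note: the function name is hypothetical, and dividing by 365.25 days per year is an assumption, since the source does not say how days were converted to years.

```python
from datetime import date


def years_between(first_activity, participation):
    """Elapsed time in years between two dates.

    Assumes an average year length of 365.25 days (an illustrative choice).
    """
    return (participation - first_activity).days / 365.25


# Hypothetical participant whose first activity was exactly seven years
# before a participation date inside the data collection window.
example_years = years_between(date(2010, 8, 10), date(2017, 8, 10))
```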

2.3.2 Posted Story in Past Week

How many stories did you post on Facebook in the past week? (variable name: stories_postedYes)

Note: Because this variable has a long-tailed distribution, we converted it into a binary variable: “No” means 0 stories, and “Yes” means 1 or more.
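
The recoding rule in this note amounts to a one-line threshold. A minimal sketch, with a hypothetical function name:

```python
def posted_any_story(stories_posted):
    """Recode a weekly story count as "No" (0 stories) or "Yes" (1 or more)."""
    if stories_posted < 0:
        raise ValueError("story count cannot be negative")
    return "Yes" if stories_posted >= 1 else "No"
```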

2.3.3 Routine Facebook Behavior

A series of 4 questions was designed to measure the habitual/routine nature of participants’ Facebook use. The questions were:

  • Facebook has become part of my daily routine.
  • I feel out of touch when I haven’t logged onto Facebook for a while.
  • I have difficulty controlling the amount of time I spend on Facebook.
  • I think about deleting my Facebook profile. [reverse-coded]

The mean of the composite variable created from these four questions is 4.3 (SD = 1.2, median = 4.5, Cronbach’s alpha: 0.63). (variable name: fb_routine)
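
Reverse-coding flips an item so that high values point in the same direction as the other items before averaging. The sketch below assumes a 1-7 agreement scale, which the source does not state explicitly, and uses hypothetical function names.

```python
from statistics import mean


def reverse_code(score, scale_min=1, scale_max=7):
    """Flip a response on a bounded scale (the 1-7 range is an assumption)."""
    if not scale_min <= score <= scale_max:
        raise ValueError("score outside the scale range")
    return scale_min + scale_max - score


def routine_composite(items, reverse_index=3):
    """Average the four routine items after reverse-coding the last one."""
    adjusted = list(items)
    adjusted[reverse_index] = reverse_code(adjusted[reverse_index])
    return mean(adjusted)


# Hypothetical participant: agrees with the first three items and
# disagrees (2) with "I think about deleting my Facebook profile."
example_routine = routine_composite([6, 5, 4, 2])
```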

Here are the descriptive statistics for each of those questions:

2.3.4 Facebook Satisfaction

A series of 4 questions was designed to measure users’ overall Facebook satisfaction. The questions were:

  • How do you feel about your overall experience with the Facebook News Feed? Very Dissatisfied — Very Satisfied ( M = 4.71, SD = 1.47 )
  • How do you feel about your overall experience with the Facebook News Feed? Very Displeased — Very Pleased ( M = 4.69, SD = 1.43 )
  • How do you feel about your overall experience with the Facebook News Feed? Very Frustrated — Very Contented ( M = 4.64, SD = 1.49 )
  • How do you feel about your overall experience with the Facebook News Feed? Absolutely Terrible — Absolutely Delighted ( M = 4.54, SD = 1.29 )

The mean of the composite variable created from these four questions was 4.64 (SD = 1.3, median = 4.75, Cronbach’s alpha: 0.94). (variable name: fb_satisfaction)

2.3.5 Diversity of Friend Network

By placing your Facebook friends into a single category, what percentage of all of your Facebook friends belong to each category below? Type a number in the blank between 0 and 100 for each category; the numbers you enter must add up to 100. (variable name: network_diversity)

  • Family members
  • Friends
  • Classmates
  • Work colleagues or co-workers
  • Other

We calculated Simpson’s D, based on Beam et al. (2017). This measure ranges from 0 to 80, and lower numbers indicate greater proportions in just one or two categories:
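
One common form of Simpson's D is one minus the sum of squared category proportions. The sketch below scales that value by 100, an assumption chosen because it yields the stated 0 to 80 range when there are five categories; the exact computation used in the study or in Beam et al. (2017) may differ, and the function name is hypothetical.

```python
def simpsons_d(percentages):
    """Simpson's diversity index scaled to 0-100 (scaling is an assumption).

    `percentages` holds the share of friends in each category and must sum
    to 100, as required by the survey question.
    """
    if abs(sum(percentages) - 100) > 1e-9:
        raise ValueError("percentages must add up to 100")
    return 100 * (1 - sum((p / 100) ** 2 for p in percentages))


# All friends in one category gives 0; an even split across the five
# categories gives the maximum of 80.
concentrated = simpsons_d([100, 0, 0, 0, 0])
even_split = simpsons_d([20, 20, 20, 20, 20])
```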

2.4 Questions about Prior Knowledge

2.4.1 Prior Knowledge Q1 (Before Manipulation)

Before the experiment manipulation, we asked participants two questions about their knowledge about the Facebook News Feed algorithm:

  • Facebook shows me every story created by my Facebook friends and the Pages I’ve “liked”. [Strongly disagree - Strongly agree]

2.4.2 Prior Knowledge Follow-Up Questions

Follow-up question 1: You answered your answer (e.g., “Agree”) to the question “Facebook shows me every story created by my Facebook friends and the Pages I’ve ‘liked’”. How sure or unsure are you about your answer? Please indicate your answer below on a scale from 0-100, where 0 means COMPLETELY UNSURE and 100 means COMPLETELY SURE. M = 66.7, SD = 25.43. (variable name: evpst_before_sure)

Follow-up question 2 (only asked BEFORE the manipulation): Please explain why you chose your answer (e.g., “Agree”) to the question “Facebook shows me every story created by my Facebook friends and the Pages I’ve ‘liked’”. Your answer must be at least 150 characters long, which is about three sentences.

2.4.3 Prior Knowledge Q2

(variable name: aware_info_new)

2.4.4 Prior Knowledge Q3

(variable name: aware_info_surprise)

2.5 Questions about Awareness

2.5.1 Awareness Q1

(variable name: aware_evpst_aft)

After the experiment manipulation, we asked participants the same question about their awareness of the algorithm as we had asked before the manipulation:

  • Facebook shows me every story created by my Facebook friends and the Pages I’ve “liked”. [Strongly disagree - Strongly agree]

Follow-up question: You answered your answer (e.g., “Agree”) to the question “Facebook shows me every story created by my Facebook friends and the Pages I’ve ‘liked’”. How sure or unsure are you about your answer? Please indicate your answer below on a scale from 0-100, where 0 means COMPLETELY UNSURE and 100 means COMPLETELY SURE - M = 76.27, SD = 22.17. (variable name: aware_evpst_aft_sure)

2.5.2 Awareness Q2

A series of 14 questions was designed to measure awareness with 5 different factor variables. The questions were:

Below are possible reasons why you might not see every story posted by your Facebook friends and Pages you’ve “liked” in your News Feed. Please consider the text you read about Facebook, and indicate how much you agree or disagree with each reason.

  • System Agency (variable name: system_reasons) - M = 4.14, SD = 1, Cronbach’s alpha: 0.7.
    • Facebook thinks I don’t want to see stories from some of my friends.
    • Some of my friends are not popular enough on Facebook for me to see all of the stories they post.
    • Facebook doesn’t show me certain types of stories in my News Feed.
    • Facebook thinks that I’m not very close to some of my Facebook friends.
    • If I am not tagged in stories posted by certain friends, I won’t see those stories.
  • User Agency (variable name: user_reasons) - M = 4.28, SD = 1.3, Cronbach’s alpha: 0.71.
    • I don’t spend a lot of time going through my News Feed.
    • I don’t scroll down to see older stories in my News Feed.
    • I don’t visit Facebook often enough to see all of the stories my friends post.
    • I don’t interact (comment, like, share) with stories from certain friends very often.
  • Hide Reason (variable name: hide_reasons) - M = 3.86, SD = 1.46, Cronbach’s alpha: 0.4.
    • I’ve previously used the News Feed settings to hide stories from some of my friends.
    • My friends may have hidden some of their stories from me.
  • Too Many Reason (variable name: toomany_reasons) - M = 5.58, SD = 1.1, Cronbach’s alpha: 0.55.
    • I don’t always read every story when I browse my News Feed.
    • There are so many stories in my News Feed that I don’t always see every one.
  • Too New Reason (variable name: aware_reasons_too_new) - M = 3.93, SD = 1.5.
    • Some posts from my friends are too new for me to have seen them yet.

Here are the descriptive statistics for each of those questions:

2.6 Questions about Correctness

2.6.1 Correctness Q1

(variable name: correct_expect)

Follow-up question: You answered your answer (e.g., “Agree”) to the question. How sure or unsure are you about your answer? Please indicate your answer below on a scale from 0-100, where 0 means COMPLETELY UNSURE and 100 means COMPLETELY SURE - M = 76.37, SD = 20.71. (variable name: correct_expect_sure)

2.6.2 Correctness Q2

(variable name: correct_missed)

2.6.3 Correctness Q3

A series of 13 questions was designed to measure correctness with 5 different factor variables. The questions were:

Below are different kinds of stories that might appear in your News Feed. Please consider the text you read about Facebook, and indicate whether you feel like you see each kind of story about as often as you should, not often enough, or too often. Choose your answer on a scale from 0-100, where 0 means NOT OFTEN ENOUGH and 100 means TOO OFTEN.

  • Wanted Stories (variable name: want_frequency) - M = 55.28, SD = 15.05, Cronbach’s alpha: 0.61.
    • stories from people I want to keep in touch with ( M = 52.82, SD = 23.52 )
    • stories that I think are interesting ( M = 54.51, SD = 22.47 )
    • stories that are informative ( M = 52.48, SD = 22.17 )
    • stories that are too similar to other stories I’ve already seen ( M = 61.31, SD = 20.17 )
  • Unwanted Stories (variable name: dont_want_frequency) - M = 54.66, SD = 19.14, Cronbach’s alpha: 0.72.
    • stories from people I don’t want to hear from ( M = 52.67, SD = 23.69 )
    • stories that I don’t want to see ( M = 58.31, SD = 23.19 )
    • stories posted by people I don’t know ( M = 53, SD = 25.01 )
  • Spam Stories (variable name: spam_frequency) - M = 63.97, SD = 19.36, Cronbach’s alpha: 0.7.
    • stories about false content ( M = 59.43, SD = 25.29 )
    • stories that I think are spam ( M = 61.04, SD = 26.25 )
    • stories that are “sponsored” or “suggested” ( M = 71.44, SD = 21.93 )
  • Offensive Stories (variable name: offensive_frequency) - M = 52.18, SD = 20.91, Cronbach’s alpha: 0.74.
    • stories I think are offensive ( M = 49.66, SD = 23.63 )
    • stories that are upsetting or irritating ( M = 54.7, SD = 23.39 )
  • Correct Stories (variable name: correct_frequency_liked) - M = 60.06, SD = 21.01.
    • stories that my friends have “liked” ( M = 60.06, SD = 21.01 )

2.7 Questions about Interpretability

2.7.1 Interpretability Q1

(variable name: interpret_rsns)

Follow-up question: You answered your answer (e.g., “Agree”) to the question. How sure or unsure are you about your answer? Please indicate your answer below on a scale from 0-100, where 0 means COMPLETELY UNSURE and 100 means COMPLETELY SURE - M = 78.44, SD = 19.01. (variable name: interpret_rsns_sure)

2.7.2 Interpretability Q2

(variable name: interpret_order)

2.7.3 Interpretability Q3

A series of 8 questions was designed to measure interpretability with 2 different factor variables. The questions were:

Below are some goals that different people can have for using the News Feed. Please consider the text you read about Facebook, and indicate how consistent or random the News Feed seems like it is about showing you stories that would help you achieve each goal, on a scale of 0-100 where 0 means COMPLETELY RANDOM and 100 means COMPLETELY CONSISTENT.

  • Interpersonal Goals (variable name: entertained_goals) - M = 67.29, SD = 16.7, Cronbach’s alpha: 0.79.
    • helping me keep up with friends and family ( M = 73.74, SD = 21.29 )
    • showing me a variety of content ( M = 65.8, SD = 20.09 )
    • understanding my interests and preferences ( M = 60.97, SD = 22.35 )
    • keeping me entertained ( M = 68.63, SD = 21.69 )
  • Information Goals (variable name: informed_goal) - M = 56.14, SD = 17.22, Cronbach’s alpha: 0.75.
    • keeping me well-informed ( M = 61.76, SD = 21.28 )
    • helping me catch up on the news ( M = 61.31, SD = 23.14 )
    • letting me know about future events ( M = 63.07, SD = 21.59 )
    • helping me network and find job opportunities ( M = 38.43, SD = 24.66 )

2.8 Questions about Accountability

2.8.1 Accountability Q1

(variable name: acct_influence)

Follow-up question: You answered your answer (e.g., “Agree”) to the question. How sure or unsure are you about your answer? Please indicate your answer below on a scale from 0-100, where 0 means COMPLETELY UNSURE and 100 means COMPLETELY SURE - M = 8.71, SD = 2.1. (variable name: acct_influence_sure)

2.8.2 Accountability Q2

A series of 3 questions was designed to measure the concept of accountability/fairness in News Feed. The questions were:

  • The News Feed behaves the same for everyone who uses it.
  • The News Feed is unbiased about which stories it shows to people.
  • The News Feed acts in the best interest of the people who use it.

The mean of the composite variable created from these three questions is 3.86 and Cronbach’s alpha is 0.64. (variable name: acct_fairness)

Here are the descriptive statistics for each of those questions:

2.8.3 Accountability Q3

A series of 10 questions was designed to measure accountability with 4 different factor variables. The questions were:

Below are some actions related to Facebook. How likely do you think it is that if you do these actions in the future, it will affect which stories appear in your News Feed? Please consider the text you read about Facebook, and indicate your answer for each item below on a scale from 0-100, where 0 means UNLIKELY, and 100 means LIKELY.

  • Content Action (variable name: content_actions) - M = 72.01, SD = 17.14, Cronbach’s alpha: 0.75.
    • “Like” a story ( M = 74.74, SD = 19.55 )
    • Comment on a story ( M = 69.27, SD = 21.89 )
    • Follow a person or a Page ( M = 72.03, SD = 21.48 )
  • UI Control Action (variable name: ui_controls_actions) - M = 67.55, SD = 18.72, Cronbach’s alpha: 0.66.
    • Prioritize who to “see first” ( M = 68.25, SD = 24.78 )
    • Sort by “Most Recent” stories ( M = 67.69, SD = 23.54 )
    • Add a person to your “Close Friends” list ( M = 66.71, SD = 24.53 )
  • Visit Action (variable name: visit_actions) - M = 49.57, SD = 21.69, Cronbach’s alpha: 0.56.
    • Visit Facebook less often ( M = 44.04, SD = 26.36 )
    • Visit Facebook more often ( M = 55.11, SD = 25.72 )
  • Hide Action (variable name: hiding_actions) - M = 60.01, SD = 24, Cronbach’s alpha: 0.64.
    • Hide a story ( M = 57.65, SD = 26.66 )
    • Unfollow a person or a Page ( M = 62.36, SD = 29.16 )

2.9 Demographics Questions

  • What is the last grade or class you completed in school? (variable name: education) [None, or grades 1-8 / Some high school / High school graduate or GED certificate / Technical, trade, or vocational school AFTER high school / Some college, no 4-year degree / 4-year college degree / Some postgraduate or professional schooling, no postgraduate degree / Postgraduate or professional degree, including master’s, doctorate, medical or law degree]

  • What was your total household income before taxes during the past 12 months? (variable name: income) [Less than $25,000 / $25,000 to $34,999 / $35,000 to $49,999 / $50,000 to $74,999 / $75,000 to $99,999 / $100,000 to $149,999 / $150,000 to $199,999 / $200,000 or more]

  • Which categories below best describe you? Select all that apply: (variable name: ethnicity) [White / Hispanic, Latino or Spanish / Black or African American / Asian / American Indian or Alaska Native / Middle Eastern or North African / Native Hawaiian or Other Pacific Islander / Some Other Race, Ethnicity or Origin (please specify)]

  • Which region of the country do you live in? (variable name: region) [New England - Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont / Middle Atlantic - New Jersey, New York, Pennsylvania / East North Central - Illinois, Indiana, Michigan, Ohio, Wisconsin / West North Central - Iowa, Kansas, Minnesota, Missouri, Nebraska, North Dakota, South Dakota / South Atlantic - Delaware, District of Columbia, Florida, Georgia, Maryland, North Carolina, South Carolina, Virginia, West Virginia / East South Central - Alabama, Kentucky, Mississippi, Tennessee / West South Central - Arkansas, Louisiana, Oklahoma, Texas / Mountain - Arizona, Colorado, Idaho, Montana, Nevada, New Mexico, Utah, Wyoming / Pacific - Alaska, California, Hawaii, Oregon, Washington / Other region (please specify)]

3 Ineligible & Excluded Participants

3.1 How many participants were ineligible vs. excluded?

The email that the Qualtrics project manager sends out to possible participants does not contain a description of the study. Instead, the consent form serves as the information provided to participants so they can decide whether they want to participate. 6842 started the study by clicking on the link in the recruiting email and viewing the consent form.

285 participants did not consent to participate. Another 5056 were screened out for being ineligible. 820 participants were eligible, but were excluded before they completed the survey, for other reasons (e.g., failing the attention checks or manipulation check, providing poor quality responses, etc.). After data cleaning, there were 681 participants in the final dataset for analysis.
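
The counts reported in this section can be checked against each other with simple arithmetic. The sketch below only restates the numbers given above; the variable names are illustrative.

```python
# Participant flow, using the counts reported in this section.
started = 6842          # viewed the consent form
not_consented = 285     # declined to participate
ineligible = 5056       # screened out
excluded = 820          # eligible but excluded before completing the survey

# Everyone who remains after the three removal steps is in the final dataset.
final_sample = started - not_consented - ineligible - excluded
```

The result matches the 681 participants reported in the final dataset.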

3.2 For what reasons were participants ineligible?

This only includes participants that consented to the study, but were determined to be ineligible in the screening. The survey was configured to direct participants to the end of the survey after each screening question, if they gave an ineligible response to that question.

The table is sorted by column so that the number in the “n” column is how many people were ineligible for the reason in the right-most column having “TRUE” or “No”. (TRUE means they entered an ineligible response.)

The two largest categories of ineligible respondents were those who said they had managed a Page on Facebook, and those who said they had fewer than 50 Facebook friends.

3.3 For what reasons were eligible participants excluded?

The reasons participants were excluded after consenting and being determined eligible to participate were:

  • failed attention check: The study had four attention check questions. Three of them were of the form “To help us monitor the quality of our data, please select from the choices below.” Participants who chose a different response than the one they were directed to select were sent to the end of the survey. The fourth was a fake word in the block of internet-related terms. Participants who said they had “Good” or “Full” familiarity with the fake word were sent to the end of the survey.
  • failed manipulation check: The study included three questions in each condition designed to ensure that participants had read and paid attention to the explanation text. Participants were given two chances to pass all three questions. If they answered any of the questions incorrectly on the first attempt, they were directed back to read the explanation a second time. If they again failed to answer all of the questions correctly, they were directed to the end of the survey.
  • finished too slowly: Participants were required to finish the study within 60 minutes of finishing the consent and screening questions. If they did not, they were excluded.
  • incomplete: These are partial responses where a participant started but did not finish the survey.
  • poor quality response: We examined the responses for four symptoms of low-quality responses, including 1) answers to the open-ended question that were unrelated to the question or contained repeated characters or other text intended to fill up space that did not answer the question; 2) answers that were too similar to other answers, indicating possible repeat participation; 3) very low variation in answers (“straight down” answers) across 4 different sections of the survey; and 4) a mismatch between a multiple-choice question asking people how long they had been Facebook users, and a question that asked them to enter the date (month, day and year) of their first activity on Facebook, by looking it up on their Facebook Activity Log page.
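
Two of the low-quality symptoms listed above, “straight down” answers and filler text made of repeated characters, lend themselves to simple automated flags. The sketch below is illustrative only: the function names and the run-length threshold are hypothetical, it flags only blocks with zero variation (a simplification of “very low variation”), and the source does not say whether these checks were automated or done by hand.

```python
import re


def is_straight_lined(answers):
    """True when every answer in a block of questions is identical."""
    return len(answers) > 1 and len(set(answers)) == 1


def has_filler_text(text, min_run=6):
    """True when any character repeats `min_run` or more times in a row.

    The threshold of 6 is an arbitrary illustrative value.
    """
    pattern = r"(.)\1{%d,}" % (min_run - 1)
    return re.search(pattern, text) is not None


# Hypothetical responses, not taken from the dataset.
flag_block = is_straight_lined([4, 4, 4, 4])
flag_text = has_filler_text("I agree because aaaaaaaaaaaaaaaa")
```

Flags like these would normally be a starting point for manual review rather than an automatic exclusion rule.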

Here is the number of responses excluded for each reason (this does not include participants who were ineligible or declined consent):

4 Participant Demographics

Age
mean: 43.4
sd: 16.31
range: 18, 88

Gender
Women: 357 (52.42%)
Men: 324 (47.58%)

Ethnicity
White: 541 (79.44%)
Hispanic, Latino or Spanish: 34 (4.99%)
Black or African American: 34 (4.99%)
Asian: 31 (4.55%)
American Indian or Alaska Native: 3 (0.44%)
Native Hawaiian / Pacific Islander: 1 (0.15%)
Multiracial / Other: 37 (5.43%)

Note: ethnicity category counts are adjusted to account for the participants who reported being multi-racial. Otherwise, approximately 35 people would be counted in more than one category.

Education
No high school degree: 14 (2.06%)
High school grad, or GED: 130 (19.09%)
Some college, or vocational: 278 (40.82%)
4-year college degree: 166 (24.38%)
Some grad school, or grad degree: 93 (13.66%)

Income
Less than $25,000: 151 (22.17%)
$25,000 to $34,999: 89 (13.07%)
$35,000 to $49,999: 111 (16.3%)
$50,000 to $74,999: 148 (21.73%)
$75,000 to $99,999: 88 (12.92%)
$100,000 to $149,999: 70 (10.28%)
$150,000 or more: 24 (3.52%)

Number of Facebook Friends
mean: 338.61
sd: 455.55
range: 50, 4958

Frequency of Visiting Facebook
Several times per day: 487 (71.51%)
About once per day: 116 (17.03%)
A few times per week: 53 (7.78%)
Once per week or less: 25 (3.67%)

Posted story in the past week?
yes: 420 (61.67%)
no: 261 (38.33%)

Years Since Creating Facebook Account
mean: 7.58
sd: 2.58
range: 0.12, 13.62

5 Experiment Manipulation

Note: Each explanation has three different manipulation check questions and the correct answers to each question are in bold type.

Pilots of the explanations: We ran initial think-aloud pilots of the explanations and manipulation check questions with members of our lab, and revised the explanations. We then ran two rounds of piloting using Amazon Mechanical Turk workers as participants, and revised the explanations and questions between the two rounds. We asked participants to rate how new the information was to them, the tone of the text, the clarity of the text, and the believability of the text, and compared means across conditions. Difficulty was based on time to read the explanations, and the Flesch-Kincaid reading level calculation available in MS Word.
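
For reference, the standard Flesch-Kincaid Grade Level formula is shown below. How MS Word counts sentences and syllables is not described here, so a hand calculation may differ slightly from the grade levels reported for each explanation.

```latex
\mathrm{FKGL} = 0.39 \cdot \frac{\text{total words}}{\text{total sentences}}
              + 11.8 \cdot \frac{\text{total syllables}}{\text{total words}}
              - 15.59
```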

5.1 “What” Explanation

(198 words; Flesch-Kincaid Grade Level 10.6)
The stories people see in their Facebook News Feeds are only some of the stories created by their Facebook friends, people they “follow”, Pages they “like”, and groups they belong to. The News Feed does not show people every possible story. Instead, there is an algorithm (a formula or set of rules) that makes a guess about which stories people want to see, and the order that they’ll want to see them in. Each person’s News Feed is a unique list of stories created specifically for them. This means that the stories in one person’s News Feed may be completely different from the stories in another person’s News Feed.

The News Feed does not show stories in the order that they were created, unless people change a setting that causes the stories to be shown in “Most Recent” order for a little while. Instead, the stories at the top of the News Feed are the stories the algorithm has automatically chosen to show first. This means that people are more likely to see stories that the algorithm has decided to put higher up in the News Feed, and may miss stories it has decided to put lower down.

5.1.1 “What” condition manipulation check questions

  1. The text I just read discussed…
    • The types of ads Facebook allows under its advertising policy.
    • Facebook’s algorithm that personalizes the News Feed.
    • The use of Facebook for organizing events.
    • The different kinds of Facebook apps that are available.
    • None of the above.
  2. Choose the information from the list below that was in the text that you read:
    • The stories in the News Feed are all created by Facebook employees.
    • Stories about kittens are more popular than all other types of stories in the News Feed.
    • The News Feed was first available to users on September 6, 2006.
    • The stories in the News Feed are not presented in the same order that they were created.
    • None of the above.
  3. Which of the following statements is true?
    • Facebook was first created at Stanford University in California.
    • English (Pirate) and Klingon are two of the language options offered by Facebook.
    • The News Feed has a “Most Recent” setting.
    • Facebook’s main user interface color is blue, because blue is Mark Zuckerberg’s favorite color.
    • None of the above.

5.2 “How” Explanation

(194 words; Flesch-Kincaid Grade Level 10.7)
When a person visits Facebook, thousands of pieces of information called “signals” are collected about their actions while they view stories in their News Feeds. Signals provide clues about the kinds of stories people want to see. Facebook uses an algorithm (a formula or set of rules) to calculate a score for all the possible stories that could appear in a person’s News Feed. The score is based on the signals that have been collected, and it is used to put the stories in order. The stories that are higher up in the list are the ones the algorithm has guessed the person will like the most, so it shows those stories first.

For example, if a person always “likes” stories posted by a certain friend, that signal tells the algorithm that it should choose more stories from that friend to show higher up in the person’s News Feed. Another signal is the total number of “likes” a particular story has received from everyone on Facebook who saw the story. If the story has only a few “likes”, the algorithm assumes the story isn’t very popular, and most people won’t want to see it.

5.2.1 “How” condition manipulation check questions

  1. The text I just read discussed…
    • The types of ads Facebook allows under its advertising policy.
    • The signals Facebook’s News Feed algorithm considers.
    • The use of Facebook for organizing events.
    • The different kinds of Facebook apps that are available.
    • None of the above.
  2. Choose the information from the list below that was in the text that you read:
    • The stories in the News Feed are all created by Facebook employees.
    • Stories about kittens are more popular than all other types of stories in the News Feed.
    • The News Feed was first available to users on September 6, 2006.
    • The order of the stories in the News Feed is based on a score calculated by the algorithm.
    • None of the above.
  3. Which of the following statements is true?
    • Facebook was first created at Stanford University in California.
    • English (Pirate) and Klingon are two of the language options offered by Facebook.
    • “Likes” on stories indicate people’s preferences.
    • Facebook’s main user interface color is blue, because blue is Mark Zuckerberg’s favorite color.
    • None of the above.

5.3 “Why” Explanation

(202 words; Flesch-Kincaid Grade Level 10.7)
Each time a person visits Facebook, there are thousands of stories posted by the people, Pages and groups they’re connected to that could possibly be shown to them in their News Feeds. But, this is too many stories for most people to be able to see them all. Also, some of the stories may be ones that people would prefer not to see, while other stories may be more important to them. Facebook’s goal for the News Feed is to show people stories that are interesting and relevant.

To accomplish this goal, the News Feed can’t show every possible story. Instead, Facebook uses an algorithm (a formula or set of rules) to automatically choose some of the available stories for the News Feed. The algorithm chooses stories it expects to be high quality and that people may “like” or comment on or share, and not stories it has determined to be outdated, spam, or misleading. The algorithm also decides the order the stories are shown in for each person. This means that the algorithm may choose different stories to show to different people even if they have the same Facebook friends, depending on its guess about what matters most to each person.

5.3.1 “Why” condition manipulation check questions

  1. The text I just read discussed…
    • The types of ads Facebook allows under its advertising policy.
    • The goals of the Facebook News Feed algorithm.
    • The use of Facebook for organizing events.
    • The different kinds of Facebook apps that are available.
    • None of the above.
  2. Choose the information from the list below that was in the text that you read:
    • The stories in the News Feed are all created by Facebook employees.
    • Stories about kittens are more popular than all other types of stories in the News Feed.
    • The News Feed was first available to users on September 6, 2006.
    • The algorithm chooses stories that are high quality and engaging to appear in the News Feed.
    • None of the above.
  3. Which of the following statements is true?
    • Facebook was first created at Stanford University in California.
    • English (Pirate) and Klingon are two of the language options offered by Facebook.
    • There are thousands of possible stories available on every visit to Facebook.
    • Facebook’s main user interface color is blue, because blue is Mark Zuckerberg’s favorite color.
    • None of the above.

5.4 “Objective” Explanation

(207 words; Flesch-Kincaid Grade Level 10.6)
Sometimes when people visit Facebook they see stories in their News Feeds they don’t want to see, or they miss stories that are important to them. This can happen when there is a problem with how the algorithm (a formula or set of rules) guesses which stories it should put higher up in the News Feed. Facebook collects information about how people interact with stories in their News Feeds, and feedback from people about what kinds of stories they prefer. Facebook’s engineers use this information to understand which stories people are most interested in, and why. Then, the engineers make changes to improve how the algorithm chooses stories.

Facebook tests the changes to the algorithm with a small number of people to see if it is working better for them. One way Facebook can tell if things are better is by whether people spend more time looking at stories in their News Feeds after the changes, compared with before the changes. Facebook also asks people questions about how much they wanted to see certain stories. All of the information about people’s actions, preferences and opinions is used to determine whether the change is an improvement or not. The algorithm is updated like this on a regular basis.

5.4.1 “Objective” condition manipulation check questions

  1. The text I just read discussed…
    • The types of ads Facebook allows under its advertising policy.
    • The process of evaluating the Facebook News Feed algorithm.
    • The use of Facebook for organizing events.
    • The different kinds of Facebook apps that are available.
    • None of the above.
  2. Choose the information from the list below that was in the text that you read:
    • The stories in the News Feed are all created by Facebook employees.
    • Stories about kittens are more popular than all other types of stories in the News Feed.
    • The News Feed was first available to users on September 6, 2006.
    • Facebook tests changes to the News Feed algorithm with a small number of users.
    • None of the above.
  3. Which of the following statements is true?
    • Facebook was first created at Stanford University in California.
    • English (Pirate) and Klingon are two of the language options offered by Facebook.
    • Engineers regularly update the News Feed algorithm.
    • Facebook’s main user interface color is blue, because blue is Mark Zuckerberg’s favorite color.
    • None of the above.

5.5 Control Text

(189 words; Flesch-Kincaid Grade Level 10.8)
Facebook is an American company and online social media website. It is based in Menlo Park, California and has offices in 15 countries. Facebook was launched on February 4, 2004, by Mark Zuckerberg and several other Harvard University students. At first, membership was limited to Harvard students. Since 2006, anyone who is age 13 and older and has a valid email address can become a registered user.

Facebook is available in multiple languages. It may be accessed over the Internet and mobile networks using desktops, laptops, tablet computers, and smartphones. After creating an account, people typically fill out a user profile with information like their name, occupation and email address. They then use Facebook to add other users as friends, exchange messages, create events, and post stories. People interact with others by commenting on or “liking” stories. They can also join common interest groups and share photos, videos and links. People receive notifications when others they are connected to update their user profiles or post stories. Stories that have been posted by people, Pages and groups appear in the News Feed, which is a constantly updating list of stories.

5.5.1 Control condition manipulation check questions

  1. The text that I just read discussed…
    • The types of ads Facebook allows under its advertising policy.
    • How Facebook first got started.
    • The use of Facebook for organizing events.
    • The different kinds of Facebook apps that are available.
    • None of the above.
  2. Choose the information from the list below that was in the text that you read:
    • The stories in the News Feed are all created by Facebook employees.
    • Stories about kittens are more popular than all other types of stories in the News Feed.
    • The News Feed was first available to users on September 6, 2006.
    • The News Feed shows stories from people, Pages and groups.
    • None of the above.
  3. Which of the following statements is true?
    • Facebook was first created at Stanford University in California.
    • English (Pirate) and Klingon are two of the language options offered by Facebook.
    • A valid email address is required to sign up for Facebook.
    • Facebook’s main user interface color is blue, because blue is Mark Zuckerberg’s favorite color.
    • None of the above.
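
The word counts and Flesch-Kincaid Grade Levels reported for the texts in this section can be approximated with the standard formula, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch below is illustrative only: the tool actually used to score the texts is not stated here, the function names are hypothetical, and the syllable counter is a rough vowel-group heuristic, so its output will not match the reported values exactly.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("like"), but keep "-le" endings ("people").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Standard Flesch-Kincaid Grade Level formula."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def grade_for_text(text: str) -> float:
    """Naive tokenization: sentences end at . ! or ?, words are runs of letters."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:[’'][A-Za-z]+)*", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_kincaid_grade(len(words), len(sentences), syllables)
```

Calling `grade_for_text` on one of the explanation texts gives a rough estimate to compare against the reported grade level; abbreviations and unusual words are the main sources of disagreement.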

6 Correlations Between the Variables

7 Regression Models

7.1 Knowledge


====================================================================
                                      Dependent variable:           
                           -----------------------------------------
                                       evpst_before_num             
--------------------------------------------------------------------
age.centered                            -0.006 (0.004)              
genderWoman                             -0.130 (0.120)              
internet_literacy.centered              -0.140• (0.075)             
trust_propensity.centered               0.220** (0.075)             
fb_satisfaction.centered               0.250*** (0.055)             
fb_routine.centered                      0.019 (0.056)              
stories_postedYes                       -0.027 (0.130)              
Constant                               4.500*** (0.120)             
--------------------------------------------------------------------
Observations                                  681                   
R2                                           0.097                  
Adjusted R2                                  0.088                  
Residual Std. Error                    1.500 (df = 673)             
F Statistic                         10.000*** (df = 7; 673)         
====================================================================
Note:                      • p<0.1; * p<0.05; ** p<0.01; *** p<0.001
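
Each cell in the table reports a coefficient, its significance marker, and its standard error in parentheses. As a reading aid, the sketch below recovers an approximate marker from a coefficient and its standard error. It assumes a two-sided test and uses a normal approximation to the t distribution, which is reasonable with more than 600 residual degrees of freedom. The function names are hypothetical, and because the table values are rounded, borderline cases may not reproduce exactly.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for coef / se under a normal approximation."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2))


def marker(p: float) -> str:
    """Significance marker using the thresholds in the table note."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "•"
    return ""


# Rounded values copied from the table above.
rows = {
    "internet_literacy.centered": (-0.140, 0.075),
    "trust_propensity.centered": (0.220, 0.075),
    "fb_satisfaction.centered": (0.250, 0.055),
}
for term, (coef, se) in rows.items():
    print(term, marker(two_sided_p(coef, se)))
```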

=========================================================================
                                          Dependent variable:            
                               ------------------------------------------
                               aware_info_new_num aware_info_surprise_num
                                      (1)                   (2)          
-------------------------------------------------------------------------
conditionwhat                   1.100*** (0.210)     1.100*** (0.200)    
conditionhow                    1.100*** (0.210)     0.800*** (0.200)    
conditionwhy                    1.300*** (0.210)     1.100*** (0.200)    
conditionobjective              1.500*** (0.210)     1.000*** (0.200)    
evpst_before.centered           0.270*** (0.043)     0.250*** (0.041)    
age.centered                     0.006 (0.004)        0.010* (0.004)     
genderWoman                     -0.270• (0.140)       -0.200 (0.130)     
internet_literacy.centered     -0.380*** (0.085)     -0.290*** (0.080)   
trust_propensity.centered        0.070 (0.084)        -0.086 (0.079)     
fb_satisfaction.centered         -0.008 (0.062)        0.059 (0.059)     
fb_routine.centered              -0.031 (0.063)       0.150* (0.060)     
stories_postedYes                0.230 (0.140)        0.300* (0.140)     
Constant                        3.500*** (0.190)     2.700*** (0.170)    
-------------------------------------------------------------------------
Observations                          681                   681          
R2                                   0.180                 0.170         
Adjusted R2                          0.160                 0.160         
Residual Std. Error (df = 668)       1.700                 1.600         
F Statistic (df = 12; 668)         12.000***             11.000***       
=========================================================================
Note:                           • p<0.1; * p<0.05; ** p<0.01; *** p<0.001
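
The `condition…` terms are indicator variables, so each coefficient is a difference from an omitted reference condition. The sketch below shows how a predicted value follows from the table, using the rounded coefficients for `aware_info_new_num`. It assumes that the control condition is the reference level (inferred from the absence of a control term, not stated in this section) and that all centered covariates are at their means with the other factors at their reference levels. The dictionary and function names are hypothetical.

```python
# Rounded coefficients from column (1), aware_info_new_num.
COEF_NEW_INFO = {
    "Constant": 3.5,
    "conditionwhat": 1.1,
    "conditionhow": 1.1,
    "conditionwhy": 1.3,
    "conditionobjective": 1.5,
}


def predicted_new_info(condition: str) -> float:
    """Predicted outcome for a reference-group participant at the covariate means.

    Assumes "control" is the omitted reference condition.
    """
    if condition == "control":
        effect = 0.0
    else:
        effect = COEF_NEW_INFO["condition" + condition]
    return COEF_NEW_INFO["Constant"] + effect


for name in ["control", "what", "how", "why", "objective"]:
    print(name, round(predicted_new_info(name), 1))
```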

7.2 Awareness


======================================================================================
                                                 Dependent variable:                  
                               -------------------------------------------------------
                               aware_evpst_aft_num  system_reasons     user_reasons   
                                       (1)                (2)               (3)       
--------------------------------------------------------------------------------------
conditionwhat                   -0.870*** (0.180)  0.490*** (0.120)   -0.090 (0.140)  
conditionhow                     -0.300• (0.180)   0.480*** (0.120)    0.140 (0.140)  
conditionwhy                    -0.490** (0.180)    0.330** (0.120)    0.120 (0.140)  
conditionobjective                0.210 (0.180)     0.260* (0.120)     0.062 (0.140)  
evpst_before.centered           0.460*** (0.037)   -0.110*** (0.024)  0.067* (0.029)  
age.centered                     -0.008* (0.004)   -0.007** (0.002)    0.003 (0.003)  
genderWoman                      -0.240* (0.120)    -0.140• (0.076)   -0.099 (0.093)  
internet_literacy.centered       -0.092 (0.073)     0.130** (0.047)   -0.140* (0.057) 
trust_propensity.centered        -0.100 (0.072)    -0.130** (0.047)  -0.150** (0.057) 
fb_satisfaction.centered          0.087 (0.054)     -0.015 (0.034)    -0.072• (0.042) 
fb_routine.centered               0.014 (0.054)      0.035 (0.035)   -0.340*** (0.042)
stories_postedYes                 0.040 (0.120)      0.025 (0.079)   -0.340*** (0.096)
Constant                        3.800*** (0.160)   3.900*** (0.100)  4.500*** (0.120) 
--------------------------------------------------------------------------------------
Observations                           681                681               681       
R2                                    0.260              0.120             0.230      
Adjusted R2                           0.250              0.100             0.220      
Residual Std. Error (df = 668)        1.500              0.950             1.200      
F Statistic (df = 12; 668)          20.000***          7.500***          17.000***    
======================================================================================
Note:                                        • p<0.1; * p<0.05; ** p<0.01; *** p<0.001
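
The fit statistics at the bottom of each table are related by standard formulas, which gives a quick consistency check. The sketch below applies the usual definitions of adjusted R2 and the overall F statistic to the rounded values for `aware_evpst_aft_num` (R2 = 0.260, 681 observations, 12 predictors). The function names are hypothetical, and because the inputs are rounded the results only approximately match the reported values.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R2 for n observations and k predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2: float, n: int, k: int) -> float:
    """Overall F statistic with (k, n - k - 1) degrees of freedom."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


n, k = 681, 12  # residual df = 681 - 12 - 1 = 668, as in the table
print(round(adjusted_r2(0.260, n, k), 3))
print(round(f_statistic(0.260, n, k), 1))
```

Both results are consistent with the reported 0.250 and 20.000 once rounding is taken into account.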

7.3 Correctness


=======================================================================================
                                                 Dependent variable:                   
                               --------------------------------------------------------
                               correct_missed_num  want_frequency   dont_want_frequency
                                      (1)                (2)                (3)        
---------------------------------------------------------------------------------------
conditionwhat                    0.410* (0.200)    -3.600* (1.800)     3.300 (2.300)   
conditionhow                     0.160 (0.200)     -3.200• (1.800)     0.330 (2.300)   
conditionwhy                     0.270 (0.200)     -2.700 (1.800)      0.830 (2.300)   
conditionobjective               0.240 (0.200)     -2.600 (1.800)      1.100 (2.300)   
evpst_before.centered           -0.100* (0.040)   1.300*** (0.360)    -0.420 (0.460)   
age.centered                    -0.013** (0.004)   -0.036 (0.036)      0.036 (0.046)   
genderWoman                      0.069 (0.130)      0.590 (1.100)     -1.900 (1.500)   
internet_literacy.centered      0.270*** (0.079)    0.011 (0.710)     -0.280 (0.910)   
trust_propensity.centered        0.190* (0.079)     0.330 (0.700)     -1.200 (0.910)   
fb_satisfaction.centered         0.006 (0.058)    1.800*** (0.520)   -2.700*** (0.670) 
fb_routine.centered             0.260*** (0.059)   1.400** (0.530)    -1.200• (0.680)  
stories_postedYes                0.260• (0.130)     1.200 (1.200)     -0.630 (1.500)   
Constant                        4.000*** (0.170)  57.000*** (1.500)  55.000*** (2.000) 
---------------------------------------------------------------------------------------
Observations                          681                681                681        
R2                                   0.130              0.120              0.089       
Adjusted R2                          0.120              0.100              0.072       
Residual Std. Error (df = 668)       1.600             14.000             18.000       
F Statistic (df = 12; 668)          8.500***          7.300***           5.400***      
=======================================================================================
Note:                                         • p<0.1; * p<0.05; ** p<0.01; *** p<0.001

7.4 Interpretability


=====================================================================================
                                                Dependent variable:                  
                               ------------------------------------------------------
                               interpret_rsns_num entertained_goals  informed_goals  
                                      (1)                (2)               (3)       
-------------------------------------------------------------------------------------
conditionwhat                   0.620*** (0.150)   -2.000 (1.800)    -2.000 (1.800)  
conditionhow                    0.830*** (0.160)   -3.800* (1.800)  -5.400** (1.900) 
conditionwhy                    0.600*** (0.160)   -1.400 (1.800)    -2.200 (1.900)  
conditionobjective              0.540*** (0.160)   -2.600 (1.800)    -3.600• (1.900) 
evpst_before.centered            0.051 (0.032)    1.800*** (0.370)  1.900*** (0.380) 
age.centered                   -0.015*** (0.003)    0.033 (0.037)    -0.025 (0.038)  
genderWoman                      0.058 (0.100)      1.700 (1.200)    -0.870 (1.200)  
internet_literacy.centered       0.140* (0.062)     0.320 (0.720)    -0.740 (0.740)  
trust_propensity.centered        0.069 (0.062)    3.100*** (0.710)  4.000*** (0.740) 
fb_satisfaction.centered        0.170*** (0.046)  2.600*** (0.530)  2.000*** (0.550) 
fb_routine.centered              0.092* (0.046)    1.600** (0.540)   1.300* (0.550)  
stories_postedYes                -0.066 (0.110)     1.900 (1.200)    2.900* (1.300)  
Constant                        4.500*** (0.140)  67.000*** (1.600) 57.000*** (1.600)
-------------------------------------------------------------------------------------
Observations                          681                681               681       
R2                                   0.160              0.250             0.250      
Adjusted R2                          0.150              0.240             0.240      
Residual Std. Error (df = 668)       1.300             15.000            15.000      
F Statistic (df = 12; 668)         11.000***          19.000***         19.000***    
=====================================================================================
Note:                                       • p<0.1; * p<0.05; ** p<0.01; *** p<0.001

7.5 Accountability


======================================================================================
                                                 Dependent variable:                  
                               -------------------------------------------------------
                               acct_fairness_num  content_actions  ui_controls_actions
                                      (1)               (2)                (3)        
--------------------------------------------------------------------------------------
conditionwhat                  -0.500*** (0.140)  -0.700 (2.000)      3.200 (2.100)   
conditionhow                    -0.180 (0.140)    -0.280 (2.000)     -5.300* (2.100)  
conditionwhy                    -0.140 (0.140)    -4.100* (2.100)    -3.000 (2.200)   
conditionobjective              -0.240• (0.140)   -1.600 (2.000)     -2.000 (2.100)   
evpst_before.centered          0.130*** (0.029)    0.480 (0.420)     -0.710 (0.430)   
age.centered                    -0.000 (0.003)    -0.007 (0.042)    -0.140*** (0.043) 
genderWoman                     -0.079 (0.091)     1.900 (1.300)     2.500• (1.400)   
internet_literacy.centered     -0.240*** (0.056) 4.000*** (0.820)   5.800*** (0.850)  
trust_propensity.centered       0.170** (0.056)   1.500• (0.810)     1.600• (0.850)   
fb_satisfaction.centered       0.210*** (0.041)    0.280 (0.600)      0.260 (0.630)   
fb_routine.centered             -0.035 (0.042)     0.680 (0.610)     1.300* (0.630)   
stories_postedYes                0.120 (0.095)    2.800* (1.400)     2.500• (1.400)   
Constant                       4.000*** (0.120)  71.000*** (1.800)  66.000*** (1.900) 
--------------------------------------------------------------------------------------
Observations                          681               681                681        
R2                                   0.200             0.080              0.170       
Adjusted R2                          0.180             0.063              0.150       
Residual Std. Error (df = 668)       1.100            17.000             17.000       
F Statistic (df = 12; 668)         14.000***         4.800***           11.000***     
======================================================================================
Note:                                        • p<0.1; * p<0.05; ** p<0.01; *** p<0.001

7.6 Heatmap of Regression Coefficients

The figure below is a heatmap of the coefficients with p-values less than 0.05 in each model, covering both the experimental conditions and the control variables. Outcome variables were standardized so that the coefficients are directly comparable across models. Awareness variables = Knowledge After, System Agency, User Agency. Correctness variables = Missed Stories, Wanted Stories, Unwanted Stories. Interpretability variables = Understand Why, Interpersonal and Informational Goals. Accountability variables = Fairness, Content and User Actions.
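
The two preparation steps described above, standardizing each outcome and keeping only coefficients with p < 0.05, can be sketched as follows. This is a minimal illustration, not the analysis code used for the figure: the function names are hypothetical, the use of the sample standard deviation is an assumption, and the numbers in the usage example are made up.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-score a list of outcome values (sample standard deviation assumed)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def significant_cells(coefs, alpha=0.05):
    """Keep the heatmap cells whose p-value is below alpha.

    coefs maps (outcome, predictor) to (estimate, p_value).
    """
    return {key: est for key, (est, p) in coefs.items() if p < alpha}


# Hypothetical inputs, for illustration only.
print(standardize([1, 2, 3]))
cells = {
    ("outcome_a", "predictor_x"): (0.40, 0.001),
    ("outcome_a", "predictor_y"): (0.05, 0.300),
}
print(significant_cells(cells))
```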

8 Variable Names, and How the Paper Refers to Them

8.1 Control Variables

Demographics

  • age: Age
  • gender: Gender
  • internet_literacy: Internet Literacy
  • trust_propensity: Trust Propensity [towards social media]

Facebook Use

  • fb_satisfaction: FB Satisfaction
  • fb_routine: Routine FB Behavior
  • stories_posted: Posted Last Week

8.2 Outcome Variables

Prior Knowledge

  • evpst_before: Knowledge Before [about the News Feed]
  • aware_info_new: New Info [in the explanations]
  • aware_info_surprise: Surprising Info [in the explanations]

Awareness

  • aware_evpst_aft: Knowledge After [about the News Feed]
  • system_reasons: System Agency [controlling what users see]
  • user_reasons: User Agency [controlling what users see]

Correctness

  • correct_missed: Missed Stories [in the News Feed]
  • want_frequency: Wanted Stories [in the News Feed]
  • dont_want_frequency: Unwanted Stories [in the News Feed]

Interpretability

  • interpret_rsns: Understand Why [stories appear in the News Feed]
  • entertained_goals: Interpersonal Goals [that the News Feed meets]
  • informed_goals: Informational Goals [that the News Feed meets]

Accountability

  • acct_fairness: Fairness [of the News Feed]
  • content_actions: Content Actions [affect which stories appear]
  • ui_controls_actions: UI Controls [affect which stories appear]
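
When reading the regression tables alongside the paper, the lists above can be treated as a lookup table. The sketch below maps a term as printed in a table (for example `internet_literacy.centered` or `aware_evpst_aft_num`) to the paper's label. The suffix handling is an assumption based on the names that appear in the tables, and the `paper_label` helper is hypothetical.

```python
import re

PAPER_LABELS = {
    "age": "Age",
    "gender": "Gender",
    "internet_literacy": "Internet Literacy",
    "trust_propensity": "Trust Propensity",
    "fb_satisfaction": "FB Satisfaction",
    "fb_routine": "Routine FB Behavior",
    "stories_posted": "Posted Last Week",
    "evpst_before": "Knowledge Before",
    "aware_info_new": "New Info",
    "aware_info_surprise": "Surprising Info",
    "aware_evpst_aft": "Knowledge After",
    "system_reasons": "System Agency",
    "user_reasons": "User Agency",
    "correct_missed": "Missed Stories",
    "want_frequency": "Wanted Stories",
    "dont_want_frequency": "Unwanted Stories",
    "interpret_rsns": "Understand Why",
    "entertained_goals": "Interpersonal Goals",
    "informed_goals": "Informational Goals",
    "acct_fairness": "Fairness",
    "content_actions": "Content Actions",
    "ui_controls_actions": "UI Controls",
}


def paper_label(term: str) -> str:
    """Map a regression-table term to the paper's label, if one is listed."""
    # Drop the ".centered" and "_num" suffixes seen in the tables.
    base = re.sub(r"(\.centered|_num)$", "", term)
    if base in PAPER_LABELS:
        return PAPER_LABELS[base]
    # Factor terms carry their level as a suffix, e.g. genderWoman.
    for name in sorted(PAPER_LABELS, key=len, reverse=True):
        if base.startswith(name):
            return PAPER_LABELS[name]
    return term  # e.g. condition terms, which have no entry above


print(paper_label("stories_postedYes"))
```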
