Supplementary Information for
A 61-Million-Person Experiment in Social Influence and Political Mobilization

Robert M. Bond1, Christopher J. Fariss1, Jason J. Jones2, Adam D. I. Kramer3, Cameron Marlow3, Jaime E. Settle1, and James H. Fowler1,4*

1 Political Science Department, University of California, San Diego, La Jolla, CA 92093, USA
2 Psychology Department, University of California, San Diego, La Jolla, CA 92093, USA
3 Data Science, Facebook, Inc., Menlo Park, CA 94025, USA
4 Medical Genetics Division, University of California, San Diego, La Jolla, CA 92093, USA

* To whom correspondence should be addressed, email: [email protected].

TABLE OF CONTENTS

Research Design
Distribution of Key Variables and Balance Testing
Matching to Voting Records
Overreporting and Underreporting of Voting Behaviour
Determination of "Close" Friends
Correlation in Behaviour Between Friends
Analysis of Direct and Indirect Effects
Variation in the Treatment Effect by Number of Friends Shown
Recency of Contact
Average Per-Friend Treatment Effect vs. Number of Friends
Monte Carlo Tests of the Network Permutation Method
Monte Carlo Code in R
Tables
Figures
SI References


Research Design

The research design for this study was reviewed and approved by the University of California, San Diego Human Research Protections Program (protocol #101273). All registered Facebook users over the age of 18 who logged in to their Facebook account on November 2, 2010, were automatically included in the experiment. Random assignment to one of the three treatment groups was done using a random number generator. In total, 61,279,316 Facebook users participated in the study. Most participants (98%) were exposed to the "social message" condition (60,055,176). Half of the remaining participants were exposed to the "informational message" condition (613,096) and the rest were in the control ("no message") group (611,044).

Ideally, we would have designed the experiment with equal-sized treatment and control groups to maximize power. However, Facebook wanted to encourage all users to participate in the 2010 US Congressional Election, and they therefore asked us to limit the size of the groups that did not receive the standard "get out the vote" (GOTV) message. As a result, 98% of users were exposed to the social message, while only 1% received the alternative informational message and another 1% received no message. Fortunately, the large number of users means there were still over 600,000 people in each of the 1% groups.

When Facebook users log into their account, they are normally greeted with their "News Feed," which includes informational content posted about or by themselves and the friends to whom they are connected. This standard view constituted our control condition. The two treatment conditions included a message from Facebook placed at the beginning of the "News Feed" (see Fig. 1 in the main text). This message provided users a link to information about how to find their polling place. The cumulative total number of Facebook users who had reported voting was shown in the upper right corner.
In the middle of the box, users could press a button that read "I Voted" to post a message on their profile indicating that they had voted. The two treatment conditions were differentiated by the presence of information about voting behaviour among the user's social network on Facebook. In the "social message" condition, the bottom of the box showed small profile pictures of up to six of the user's friends who had already reported voting. The names of two friends and the total number of the user's friends who had reported voting were included with the pictures. Participants who had six or fewer friends that had voted saw the photos of all of their voting friends; participants who had more than six friends who had voted saw a randomly selected six (the number six was arbitrarily chosen due to space constraints).

Distribution of Key Variables and Balance Testing

Table S1 shows summary statistics for age, sex, and the following variables:

• Identity as a Partisan. Respondents can choose to identify their partisanship. Particular party variables (Democrat and Republican) were coded as 1 when the name of the party appeared in the user's political views and 0 otherwise.

• Ideology. Facebook users can write in their political ideology in an open-ended response box. Particular ideology variables (Liberal and Conservative) were coded as 1 when the ideological label appeared in the user's political views and 0 otherwise.

• Expressed Voting. For respondents in the two treatment conditions, the site recorded when the respondent clicked the "I Voted" button.

• Polling Place Search. For respondents in the two treatment conditions, the site recorded when the respondent clicked the "Find Your Polling Place" link.

• Validated Vote. Respondents who had the same first name, last name, and birthdate as a record in their state's voter file were matched at the group level to allow statistical analysis of the relationship between the treatment and real-world behaviour (see below).

Table S2 shows balance tests for the demographic variables. There were no significant differences (all pairwise two-tailed t tests indicated p > 0.05) between the treatment and control groups on any of these variables, suggesting that random assignment was successful. Tables S3-S5 show additional balance tests for the demographic variables of friends, close friends, and close friends of close friends. These results show that the user treatment is uncorrelated with the attributes of the people the user is connected to, suggesting that any difference we find between friends of those who received the treatment and friends of those in the control group is either due to sampling variation or due to a causal effect of the user treatments on the friends. Figure S1 shows a few of several possibilities for how the user treatment might generate a change in a friend's or friend's friend's behaviour. This lack of correlation is important because it means that even if people who have more friends who vote are more likely to see a message with many friends, the number of friends shown is not correlated with the treatment of either the user or their friends. Variation in the number of friends shown therefore cannot drive a spurious relationship between treatment and behaviour.

Matching to Voting Records

To choose which states to validate, we identified those states that provided (for research purposes) first names, last names, and full birthdates in publicly available voting records. From these, we chose a set that minimized cost per population but still allowed us to detect a 0.5% effect with 80% power, given a treatment rate of 98% and a turnout rate of 40% based on rough estimates. The cost of state records varied from $0 to $1,500 per state. We excluded records from Texas because they had systematically excluded some individuals from their voting records (specifically, they did not report on the voting behaviour of people who had abstained in the four prior elections).
The resulting list of states included Arkansas, California, Connecticut, Florida, Kansas, Kentucky, Missouri, Nevada, New Jersey, New York, Oklahoma, Pennsylvania, and Rhode Island. These states account for about 40% of all registered voters in the U.S., and their records yielded 6,338,882 matched observations of voters and abstainers that we could use to compare to treatment categories from the experiment.

About 1 in 3 users were successfully matched to voter records (success depends on many factors, including voting eligibility, rates of registration, and so on). It is important to note that the match rate for our study is lower than the match rates in many other GOTV studies, in which more than 50% of users are matched1. The primary reason for the low match rate is the age distribution of Facebook users; because the population of Facebook users shows positive skew relative to the country in general (i.e., Facebook users are younger), and young people are less likely to be registered voters, we were able to match fewer records. In Figure S2 we show the distribution of users by age in the 13 states used for matching voters and the match rate by age in those states. Additionally, as in other studies in which individuals self-enter data2, matches are more difficult due to a lack of consistency in name conventions between the voter file and Facebook (for instance, a voter may be listed as "Lucille" in the voter record and "Lucy" on Facebook). All information was discarded after we finished the data analysis.

In order to match information in Facebook to public voting records, we relied on the "Yahtzee" method3. This method is a group-level matching procedure that preserves the privacy of individual actions while still allowing statistical analysis to be conducted at the individual level. We matched users to individuals on the registration list in the same state by first name, last name, and date of birth (dropping all instances that had duplicates) and set the level of error in individual assignments to 5%. This means that a matched user identified as a voter had a 5% chance of being classified as an abstainer, and vice versa.
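The privacy-preserving step above can be illustrated with a minimal Python sketch. This is not the actual "Yahtzee" procedure (which operates at the group level); it only shows the symmetric 5% label noise and its predictable effect on the observed turnout rate. All names and data below are hypothetical.

```python
import random

def add_symmetric_noise(vote_labels, error_rate=0.05, seed=2010):
    """Flip each binary voter/abstainer label with probability
    `error_rate`, mimicking the symmetric 5% classification error
    injected to protect individual privacy."""
    rng = random.Random(seed)
    return [1 - v if rng.random() < error_rate else v for v in vote_labels]

# Symmetric noise shifts the observed turnout rate toward 50% by a known
# amount, which can be corrected for:  E[observed] = true*(1-e) + (1-true)*e
true_labels = [1] * 600 + [0] * 400          # toy sample with 60% true turnout
noisy = add_symmetric_noise(true_labels)
observed_rate = sum(noisy) / len(noisy)
expected_rate = 0.6 * 0.95 + 0.4 * 0.05      # 0.59
```

Because the error is symmetric and its rate is known, analyses of group-level rates remain feasible even though any individual's recorded status is deniable.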
For the validated vote results, we assume that the states in which we matched users are a representative sample of all states. Since these states represent about 40% of the U.S. population, we divide the total number of matched users by 0.40 to estimate the total we would have matched if we had acquired voter records in all states, and we use this value for estimating total effects. Note that even with this adjustment, our estimate is probably conservative because it assumes that the treatment effect on unmatched users who are actually in the voter record (as in the Lucille/Lucy example above) is zero.

Overreporting and Underreporting of Voting Behaviour

Since we collected information about both online self-reported voting and real-world voting validated by government records, we can compare these two measures to learn more about truthful reporting and the effect the experiment had on it. A comparison of the two measures shows that 3.8% of those in the matched sample self-reported voting when the validated record shows they abstained (an "overreport"), while 50.1% declined to report voting when they actually voted (an "underreport"). The Pearson's φ correlation between the two measures is 0.46 (SE 0.03, p < 0.01), which is somewhat lower than the correlation found in most survey research because our self-reported voting measure is not forced-response.

In addition to measuring the effect of the treatment on validated voting (described in the main text), we also analysed the effect of the treatment on overreporting and underreporting. The results show that users who received the social message were 0.99% (SE 0.14%, p < 0.01) more likely to overreport voting and 4.19% (SE 0.27%, p < 0.01) less likely to underreport voting than those who received the informational message. Thus, the social message appears to have affected both the desire to vote and the desire to be perceived as a voter.

Determination of "Close" Friends

We wished to characterize the strength of ties between pairs of Facebook users beyond the mere existence (or not) of a friendship tie. It has been frequently observed that strong ties engage in "media multiplexity": for example, if two people communicate often by phone, it is likely they also communicate often through email. Boase et al.4 summarize their findings by saying, "People who communicate frequently use multiple media to do so. The more contact by one medium, the more contact by others" (p. 23).

We used the frequency with which users interacted with each other on Facebook to estimate the overall closeness of their social tie. On Facebook, people can interact by sending messages, uploading and tagging photos, commenting on posts by friends, posting a "like" on another user's post to show approval, or in a number of other ways. To identify which Facebook friendships represented close ties, we began with the set of friends who interacted with each other at least once during the three months prior to the election. As individuals vary in the degree to which they use the Facebook website, we normed this level of interaction by dividing the total number of interactions with a specific friend by the total number of interactions a user had with all friends. This gives us a measure of the percentage of a user's interactions accounted for by each friend (for example, a user may interact 1% of the time with one friend and 20% of the time with another). We then categorized all friendships in our sample by decile, ranking them from lowest to highest percentage of interactions. Each decile is a subset of the previous decile.
For example, decile 5 contains all friends at the 40th percentile of interaction or higher while decile 6 contains all friends at the 50th percentile of interaction or higher, meaning that decile 6 is a subset of decile 5.

We validated this measure of tie strength with a survey. We fielded four surveys to Facebook users asking them to name some number of their close friends (1, 3, 5, or 10). Each survey began with the following prompt:

"Think of the people with whom you have spent time in your life, friends with whom you have a close relationship. These friends might also be family members, neighbors, coworkers, classmates, and so on. Who are your closest friends?"

We tested the hypothesis that counting interactions would be a good predictor of named closest friends. We constructed a list of closest friends by pairing each survey respondent with the first friend named in response to the prompt. Thus, closest friends were defined as friendships including Person A (the survey-taker) and Person B (the first name generated by the survey-taker when prompted to name his/her closest friends).


The surveys were completed between October 2010 and January 2011. We obtained 1,656 responses. We then counted the number of times respondents interacted with each of their friends over the three months prior to the user taking the survey, and divided that number by the total number of interactions that the user had with all friends over the same three-month period. We split the percentages of interaction into deciles (see Table S3). This is the same procedure we used to create the deciles of interaction for users in the political mobilization experiment.

In Table S6 and Figure 2a of the main text we show the probability that one of the friendships in each decile is the closest friendship identified by the survey-taker. The results show that as the decile of interaction increases, the probability that a friendship is the user's closest friend increases. This finding is consistent with the hypothesis that the closer a social tie between two people, the more frequently they will interact, regardless of medium. In this case, frequency of Facebook interaction is a good predictor of being named a close friend. Tables S7, S8, and S9, and Figures 2b, c, and d in the main text also show that as the number of interactions increases, so does the effect of the user's treatment on his or her friend's behaviour.

To simplify analysis, we arbitrarily labelled any friend in the 9th decile (80th percentile or above) a "close friend." All other friends are labelled "friends." To measure the person-to-person-to-person effect of a treatment, we labelled people who were not friends or close friends but who shared a close friend in common as "close friends of close friends" (2 degrees of separation). Figure S3 shows the cumulative distribution of the number of users who have a certain number of each of these types of relationships.
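The interaction-share and decile construction described above can be sketched in a few lines. This is an illustrative Python re-implementation with hypothetical toy data, not the production code; it assigns each friendship its interaction-share decile, from which the paper's cumulative deciles (decile k = all friendships at or above the (k−1)·10th percentile) follow directly.

```python
from collections import defaultdict

def interaction_deciles(interactions):
    """interactions: list of (user, friend, n_interactions) tuples.
    Returns a dict mapping (user, friend) -> decile rank, where 1 is the
    lowest 10% of interaction share and 10 the highest. A friendship's
    share is the friend's fraction of all of that user's interactions.
    Friendships with decile >= 9 (80th percentile or above) would be
    labelled "close friends" under the rule in the text."""
    totals = defaultdict(int)
    for user, friend, n in interactions:
        totals[user] += n
    shares = {(u, f): n / totals[u] for u, f, n in interactions}
    ranked = sorted(shares, key=shares.get)   # lowest share first
    deciles = {}
    for i, pair in enumerate(ranked):
        deciles[pair] = min(10, 1 + (i * 10) // len(ranked))
    return deciles

# Toy example: user "a" interacts far more with "b" than with "c".
toy = [("a", "b", 90), ("a", "c", 10), ("d", "e", 50), ("d", "f", 50)]
d = interaction_deciles(toy)
```

Normalizing by each user's total interactions is what lets heavy and light Facebook users be compared on the same scale.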
We therefore studied three mutually exclusive sets of relationships: 5.9 billion "friends," 3 billion "close friends," and 4.6 billion "close friends of close friends" (see Figure S3).

Correlation in Behaviour Between Friends

The town of Abilene, Texas, was selected for illustrative purposes in Figure S4. It features 868 users in the largest connected component of the close-friend network who list Abilene, Texas, as their current city in their profile, and shows those who clicked on the "I Voted" button. The graph was generated using the Kamada-Kawai algorithm5, which is implemented in the igraph library6 in R7, and visualized using Pajek8.

In the whole network, expressed voting is correlated between friends (Pearson's φ = 0.05) and even more correlated between close friends (φ = 0.12, see Fig. 3), consistent with other observational studies9 that found somewhat higher correlations in a smaller set of closer friends. However, these observational associations might result from homophily (the tendency to choose friends who are similar) or from exposure to shared environments, rather than from a process of social influence. Our experimental design allows us to isolate the effect of influence from these alternative explanations, since the treatment is uncorrelated with any attribute of the users, their friends, or their friends' friends, and we measure the relationship between user treatment and friend's behaviour.
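The friend-to-friend correlation reported here is a Pearson φ computed over the two endpoints of each friendship tie. A minimal Python sketch (the graph and behaviour values are toy data, not the Facebook network):

```python
import math

def edge_phi(edges, behaviour):
    """Pearson correlation of a binary behaviour across the endpoints of
    each undirected edge. Each edge is counted in both orientations so
    the measure is symmetric in the two endpoints."""
    xs, ys = [], []
    for a, b in edges:
        xs += [behaviour[a], behaviour[b]]
        ys += [behaviour[b], behaviour[a]]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)
```

For two binary variables, this is identical to the ordinary Pearson correlation of the paired endpoint values, which is why the text can quote it as a φ coefficient.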


Analysis of Direct and Indirect Effects

For direct effects, we used t tests to compare the percentage of individuals who exhibited a given political behaviour (clicking on the "I Voted" button, clicking on the polling place link, or validated vote) between the treatment and control conditions or between the two treatment conditions.

For indirect effects, we want to estimate the relationship between a user's political behaviour (expressed vote, validated vote, and polling place search) and the experimental condition to which their friend was exposed. To estimate the influence that the friend has on the user whose behaviour is being studied, we must be sure that any relationship between the user's behaviour and the friend's experimental condition is not due to chance. Standard techniques like ordinary least squares regression assume independence of observations, which is not the case here due to the complex interdependencies in the network.

To take the network into account, we measure the empirical probability of observing a behaviour by a friend, conditional on a user's treatment (see Figure S1 for examples of how the user treatment might cause changes in a friend's or friend's friend's behaviour). A single user will be connected to many friends, so we conduct this analysis on a per-friend basis. For example, looking across all friendships, we may find that 6 of 10 users connected to a friend in the treatment group vote (a rate of 60%) while only 5 of 10 of those connected to a friend in the control group vote (a rate of 50%), suggesting a per-friend average treatment effect of 10%.

To compare this observed value to what is possible due to chance, we keep the network topology fixed but randomly permute the assignment to treatment for each user and once again measure the per-friend treatment effect. We repeat this procedure 1,000 times. The simulated values generate the null distribution we would expect due to chance when there is no treatment effect.
We then compare the observed value to the simulated null distribution to evaluate significance. We obtain confidence intervals for the null distribution by sorting the results and taking the appropriate percentiles (in our case, we are interested in the 95% confidence interval, so we use the 25th and 975th values). The random permutation method overcomes the problem of non-independent observations by taking the specific network structure into account when the null distribution is generated.

The results shown in Tables S7, S8, and S9 are for the effect of Social Message vs. Control, and they include estimates of the null distribution as described here. Results are also summarized in Figures 2b, c, and d and Figure 3 of the main text. For each of the three behaviours we studied (expressed vote in Tables S7 and S10, validated vote in Tables S8 and S11, and polling place search in Tables S9 and S12) we used the same procedure. We analysed each friendship in the sample, first calculating the mean rate of behaviour for each user conditional on their friend's experimental condition (see Figure S1 for some examples of how the treatment might affect a user and spread through the network). We then subtracted the rate of behaviour of the users whose friends were in the control condition from the rate of behaviour of the users whose friends were in the treatment condition to calculate the per-friend treatment effect.
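The per-friend permutation test just described can be sketched as follows. This is an illustrative Python version with hypothetical names and toy data (the authors' actual implementation is in R and ran over billions of friendships):

```python
import random

def per_friend_effect(edges, treated, behaved):
    """Across all (user, friend) pairs, subtract the behaviour rate of
    users whose friend was in the control group from the rate of users
    whose friend was in the treatment group."""
    t_hits = t_n = c_hits = c_n = 0
    for user, friend in edges:
        if treated[friend]:
            t_hits += behaved.get(user, 0)
            t_n += 1
        else:
            c_hits += behaved.get(user, 0)
            c_n += 1
    return t_hits / t_n - c_hits / c_n

def permutation_null(edges, treated, behaved, reps=1000, seed=1):
    """Hold the network topology fixed, randomly permute the treatment
    labels across users, and recompute the per-friend effect each time.
    Returns the null 95% CI (25th and 975th of the sorted draws when
    reps = 1000)."""
    rng = random.Random(seed)
    users = list(treated)
    labels = [treated[u] for u in users]
    null = []
    for _ in range(reps):
        rng.shuffle(labels)
        null.append(per_friend_effect(edges, dict(zip(users, labels)), behaved))
    null.sort()
    return null[round(reps * 0.025) - 1], null[round(reps * 0.975) - 1]
```

Because only the labels are shuffled while every tie stays in place, the null distribution inherits the dependence structure of the real network, which is exactly what an OLS standard error would miss.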


We used the permutation procedure to calculate the 95% confidence interval of the null distribution of treatment effects that we would expect due to chance. When the observed effect size falls outside of the confidence interval we consider the result to be statistically significant. We also calculate the average treatment effect per user by multiplying the per-friend treatment effect by the average number of friends per user. To calculate the null distribution of the per-user effects we repeat this calculation on each of the simulated networks generated by the permutation procedure. Finally, for significant treatment effects that fall outside the 95% confidence interval of the null distribution, we calculate the total effect by multiplying the per-user effect by the number of users. To calculate the null distribution of the total effects, we again repeat this calculation on each of the simulated networks generated by the permutation procedure.

Notice that for expressed voting, the treatment effects were strong enough to be detectable at two degrees of separation. For each close friend of a close friend who saw the social message, an individual was 0.022% (null 95% CI –0.011% to 0.012%) more likely to express voting. And given the large number of such connections, the number of people affected was also large. We estimate that the per-user effect was +1.7% (null 95% CI –0.8% to 0.9%), which means the treatment caused 1,025,000 close friends of close friends (2 degrees of separation) to express voting. For validated voting and information seeking we did not find significant effects for close friends of close friends, but it is important to note that these null results may be due to limited power, since validated voting was measured in a sample one-tenth the size, and the direct effects on information seeking were also approximately one-tenth the size of those for expressed voting.
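The scaling from a per-friend effect to per-user and total effects is simple arithmetic. The sketch below uses the expressed-voting figures at two degrees of separation quoted above; the average number of connections and the user count are rough, hypothetical round numbers chosen only to make the arithmetic reproduce the reported magnitudes:

```python
per_friend = 0.00022    # +0.022% per treated close friend of a close friend
avg_connections = 77    # assumed average close friends of close friends per user
n_users = 60_000_000    # approximate number of users shown the social message

per_user = per_friend * avg_connections   # about 0.017, i.e. the +1.7% per-user effect
total = per_user * n_users                # on the order of one million expressed votes
```

The same three-step multiplication (per-friend → per-user → total) underlies each of the aggregate effect sizes reported in the tables.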
Using the Permutation Method to Calculate Direct Effects

For consistency, we also re-calculated the direct effects using the permutation procedure. For clicking the "I Voted" button and for clicking the polling place link, we calculated the observed difference in means by comparing the social message group and the informational message group. We then randomly re-assigned treatment status to the subjects, re-calculated the difference in means 1,000 times, and calculated the 95% confidence interval of the null distribution by taking the 25th and 975th values from this simulation. For validated voting, we used the same procedure to simulate the null distribution of the difference in means between the social message group and the informational message group, as well as the difference in means between the social message group and the control group.

This procedure provided similar evidence of the statistical significance of the direct effects. For the "I Voted" comparison, we observed a difference of means of 2.08% (null CI –0.10% to 0.10%). For clicking on the polling place link we observed a difference of means of 0.26% (null CI –0.04% to 0.04%). For validated voting we observed a difference of means between the social message group and the control group of 0.39% (null CI –0.39% to 0.37%) and a difference of means between the social message group and the informational message group of 0.39% (null CI –0.41% to 0.37%).
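For direct effects there is no network to respect, so the procedure reduces to an ordinary difference-in-means permutation test. A hedged Python sketch with made-up data (names and values are illustrative only):

```python
import random

def diff_in_means(outcome, group):
    """Mean outcome among treatment subjects (group == 1) minus the
    mean outcome among control subjects (group == 0)."""
    t = [y for y, g in zip(outcome, group) if g == 1]
    c = [y for y, g in zip(outcome, group) if g == 0]
    return sum(t) / len(t) - sum(c) / len(c)

def null_ci(outcome, group, reps=1000, seed=7):
    """Randomly re-assign treatment status, re-calculate the difference
    in means `reps` times, and take the 25th and 975th sorted values as
    the null 95% CI (for reps = 1000)."""
    rng = random.Random(seed)
    g = list(group)
    draws = []
    for _ in range(reps):
        rng.shuffle(g)
        draws.append(diff_in_means(outcome, g))
    draws.sort()
    return draws[round(reps * 0.025) - 1], draws[round(reps * 0.975) - 1]
```

Shuffling the labels preserves the group sizes, so each permuted draw is a difference in means that could have arisen under the null of no treatment effect.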


Variation in the Treatment Effect by Number of Friends Shown

We were interested in the possibility that the treatment effect varied with the number of friends shown. For the social message group, up to six friends who had previously clicked the "I Voted" button were shown in the message as friends who had voted. For users who had fewer than six friends who had previously reported voting, all previously-voting friends were shown. We recorded the number of friends shown for 1% of the social message group. We then calculated the difference in means between those exposed to the social message with a specified number of friends shown on the initial login and the informational message group for expressed voting and polling place search (see Table S13), and between that group and both the informational message group and the control group for validated voting (see Table S14).

The results show no variation in treatment effect for polling place search or validated voting. However, for expressed voting, the treatment effect increases as more faces are shown on the initial login. Note that this pattern shows that the treatment varies in strength depending on how many faces are shown for expressed voting, but it does not affect the interpretation of the overall average treatment effect.

Recency of Contact

For indirect effects, we were curious whether the recency of contact (rather than the frequency of contact) could also predict the extent to which friends are influential. We took the same set of interacting friends, but instead of dividing them into deciles by the number of interactions they had in the three months prior to the election, we divided them into deciles based on the number of days prior to the election that the friends had most recently interacted on Facebook. The correlation between the number of interactions and the number of days prior to the election that friends had interacted was 0.25.
Using this measure of the closeness of friendship, Tables S15-S17 show that no groups (other than the full set of interacting friends for expressed vote) had significant indirect effects on any of the three dependent variables (expressed voting, polling place search and validated voting). Users who interacted more recently did not show any signs of larger treatment effects. This suggests that the quantity of interaction rather than its recency is important for influence. Average Per-Friend Treatment Effect vs. Number of Friends As with any experiment that estimates an average treatment effect, our experimental design may obscure important differences in the marginal effect. In our case, the first friend who received the treatment may have a much different effect on the user than the 100th friend. One possibility is that the effect declines with the number of friends, as people pay less attention to a message or behaviour the more often they see it. Another possibility is that the effect increases with the number of friends receiving the treatment, as reinforcement can sometimes induce “complex contagion”10. In order to examine differences in the effect of treatment of friends on behaviour we divided the sample of users into deciles by the number of close friends (from lowest to highest) and measured the treatment effect for each decile (see Figure S5). The null distributions are not 9

shown since the observed values in each decile all fall well within the 95% confidence interval (due to dividing the sample by 10). Note that the per-friend effect sizes for vote and expressed vote do tilt slightly downward, but the rate of decrease is probably too small relative to the sampling error in the treatment effect estimates to claim that the difference is significant. Monte Carlo Tests of the Network Permutation Method The network permutation method described here has been used in several other publications11-18. However, in those applications the goal was to measure the likelihood that a correlation in observed behaviour between connected individuals in the network was due to chance. Here we use the network permutation method to evaluate an observed correlation between a treatment variable and a resulting behaviour in the treated individual, the treated individual’s friends, and the treated individual’s friends of friends. To evaluate whether this procedure yields accurate estimates of causal treatment effects, we have written a computer program in R that 1) generates a network, 2) endows individuals within the network with an initial likelihood of a behaviour, 3) randomly assigns them to treatment and control groups, 4) updates their likelihood of the behaviour according to treatment effects that we can assign (the “true” effects), and then 5) uses these probabilities to determine which individuals exhibit the behaviour. Specifically, we assume ! = ! + !! !! + !! !! + !! !! where y is the total probability a user engages in a specific behaviour (e.g. expressed vote, validated vote, poll search), ! is the baseline probability of the behaviour, !! is a variable taking the value 1 if the user was in the treatment group and 0 otherwise, and !! is the direct effect of the treatment. In addition, !! and !! are the number of friends at one and two degrees of separation who are in the treatment group, and !! and !! 
are the per-friend and per-friend-offriend effects, respectively. We can then test our permutation procedure to see whether or not there is bias in the estimated treatment effects and the rate at which our estimation procedure produces false positives. The computation resources necessary to run these Monte Carlo tests necessitated the use of the Gordon super computer19 at the San Diego Super Computer Center because we simulate a random, 5 million node network 1000 times for each of the scenarios described below. We tested a variety of scenarios with various parameter combinations: All Scenarios: We created a network with n = 5 million users, with an average of 10 friends_per_person, and assumed the control group made up 1% of this population (control_proportion). We repeated each scenario 1,000 times, each time setting the “true” effect sizes to be similar to those observed in the real data. Specifically, we let the direct_effect of the message on the behaviour be drawn from a uniform distribution with range !! ∈ [0.0%, 0.8%], and we let the per_friend_effect be drawn from a uniform distribution with range !! ∈ [0.00%, 0.34%] to test effect sizes similar to those


we estimated for validated vote. We also let the per-friend-of-friend effect (per_friend_of_friend_effect) be drawn from a uniform distribution with range β_ff ∈ [0.000%, 0.044%] to test effect sizes similar to those we estimated for expressed vote at two degrees of separation.

Scenario A (baseline): We assumed a Watts–Strogatz (small-world) network20 (network_type = 1) and set the rewiring parameter in this model to yield network transitivity of about 0.2 (similar to the transitivity of the Facebook close-friend network). The initial probability of voting was set to 0.5 for everyone (initial_behaviour_type = 1). We also assumed the treatment effect was linear in the number of friends (update_behaviour_type = 1).

Scenario B: Same as Scenario A, except the initial probability of the behaviour was drawn from a uniform distribution for each user (initial_behaviour_type = 2). This allows us to test whether heterogeneity in initial behaviour interferes with estimates of treatment effects.

Scenario C: Same as Scenario A, except the initial probability of the behaviour was assigned to each user based on his or her index id (initial_behaviour_type = 3). Since users are initially placed on a lattice in order, this causes the initial probability of the behaviour to be very highly correlated between connected users (Pearson's φ = 0.20 in our simulations, higher than in the observed data). This allows us to test whether homophily on initial behaviour interferes with estimates of treatment effects.

Scenario D: Same as Scenario A, except we assumed an Erdős–Rényi random network21 (network_type = 2). This allows us to test whether assumptions about the network structure interfere with estimates of treatment effects.

Scenario E: Same as Scenario B, except we assumed an Erdős–Rényi random network (network_type = 2).

Scenario F: Same as Scenario C, except we assumed an Erdős–Rényi random network (network_type = 2).
Scenario G: Same as Scenario A, except we assumed a Barabási–Albert "scale-free" network22 (network_type = 3). This kind of network generates extreme skewness in the degree distribution (most users have only a few friends, but a small number have very many), similar to that observed in the Facebook data. This allows us to test whether assumptions about the network structure interfere with estimates of treatment effects.

Scenario H: Same as Scenario B, except we assumed a Barabási–Albert "scale-free" network (network_type = 3).

Scenario I: Same as Scenario C, except we assumed a Barabási–Albert "scale-free" network (network_type = 3).
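The three network_type settings correspond to standard random-graph models that igraph (the library loaded in the authors' code) can generate directly. The following sketch is illustrative rather than the authors' exact code: the toy size and the rewiring probability p = 0.1 are assumptions (the paper states only that p was tuned to give transitivity near 0.2).

```r
library(igraph)

n <- 10000  # toy size; the actual simulations use 5 million nodes

# network_type = 1: Watts-Strogatz small world; nei = 5 gives ~10 friends each,
# and the rewiring probability p would be tuned until transitivity(g1) is ~0.2
g1 <- sample_smallworld(dim = 1, size = n, nei = 5, p = 0.1)

# network_type = 2: Erdos-Renyi random graph with the same expected degree
g2 <- sample_gnm(n = n, m = 5 * n)

# network_type = 3: Barabasi-Albert preferential attachment ("scale free"),
# producing the heavy-tailed degree distribution described for Scenario G
g3 <- sample_pa(n = n, m = 5, directed = FALSE)

c(transitivity(g1), transitivity(g2), transitivity(g3))
```

In practice the small-world graph has high transitivity, while the Erdős–Rényi and Barabási–Albert graphs of this density have transitivity near zero, which is precisely the structural variation the scenarios are designed to probe.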


The results from all scenarios are presented in Figure S6. There are 9 panels for each treatment effect (the direct effect, the effect on friends, and the effect on friends of friends). Each panel shows results from one of the scenarios described above (labelled by the letter of the scenario), and each point in a plot is one simulation. The dotted line is the theoretical relationship between the "true" values we set and the values estimated by our method that one would expect if there were no bias in the procedure, and the solid line is the actual relationship estimated by ordinary linear regression. Notice that in all cases the solid line lies very close to the dotted line. In Table S18 we report the intercept and slope for each of these lines, and note that all intercepts are near zero (no bias) and all slopes are near one (bias does not emerge as effect sizes increase). These results suggest that our estimates of direct, per-friend, and per-friend-of-friend treatment effects are not overstated.

In Table S19 we report the results of conducting the same analysis for each scenario, but setting the "true" effect sizes to 0 for all 1,000 simulations. In each simulation, we also sample the null distribution 1,000 times and calculate its 95% confidence interval. We then count the number of times our estimate of the treatment effect falls outside this interval, suggesting there is a treatment effect when one does not exist (this is the false positive rate). Notice that all scenarios generate false positive rates of about 5%, consistent with what one would expect at this level of confidence.

Finally, note that in more limited tests with larger networks, it appears that the standard deviation of the estimates decreases with the square root of the number of users in the network. This means that the dispersion in estimates will be lower in the real data than that shown in Figure S6.
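The confidence-interval check can be illustrated in base R. Note this is a simplified stand-in: the authors' method permutes the network itself to build the null distribution, whereas the sketch below simply permutes the treated-friend counts, and all data here are synthetic with no true effect, so the observed estimate should usually fall inside the null 95% interval.

```r
set.seed(2)
n <- 2000
# synthetic data with no true treatment effect
friends_treated <- rbinom(n, 10, 0.99)   # treated friends out of ~10
behaviour <- rbinom(n, 1, 0.5)           # behaviour unrelated to treatment

# observed per-friend estimate from a linear regression
observed <- coef(lm(behaviour ~ friends_treated))[2]

# null distribution: 1,000 permutations of the treated-friend counts
null_est <- replicate(1000, {
  perm <- sample(friends_treated)
  coef(lm(behaviour ~ perm))[2]
})
ci <- quantile(null_est, c(0.025, 0.975))

# a "false positive" occurs when the observed estimate falls outside the CI;
# across many repetitions this should happen about 5% of the time
false_positive <- observed < ci[1] || observed > ci[2]
```

Repeating this whole procedure 1,000 times and counting how often `false_positive` is TRUE is what Table S19 reports for each scenario.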
For our estimates based on 6 million users (validated vote), the variance decreases by a factor of 1.2 relative to the 5-million-node simulations, and for our estimates based on 60 million users (expressed vote, poll search) it decreases by a factor of 12. In all cases, power to detect effects of the same size as those we observe in the real data exceeds 0.8.
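Because the standard deviation of the estimates scales as 1/√n, the variance scales as 1/n, so the stated factors follow directly from the ratio of the real sample sizes to the 5-million-node simulations:

```r
n_sim <- 5e6
# variance reduction factors relative to the simulated network
c(6e6 / n_sim, 60e6 / n_sim)          # 1.2 and 12
# corresponding standard-deviation reduction factors
sqrt(c(6e6 / n_sim, 60e6 / n_sim))
```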


Monte Carlo Code in R

# load a network library
library(igraph)

# population size
n