Rumor detection on twitter


Conference Paper · November 2012 · DOI: 10.1109/SCIS-ISIS.2012.6505254



Rumor detection on twitter Nobuyuki Igata Fujitsu Laboratories LTD. 4-1-1 Kamikodanaka, Kawasaki 211-88, JAPAN Email: [email protected]


Abstract—Twitter is useful in a disaster situation for communication, announcements, requests for rescue, and so on. On the other hand, it causes a negative by-product: the spreading of rumors. This paper describes how rumors spread after an earthquake disaster and discusses how we can deal with them. We first investigated actual instances of rumors after the disaster and then attempted to identify the characteristics of those rumors. Based on the investigation, we developed a system which detects candidate rumors on Twitter, and then evaluated it. The experimental results show that the proposed algorithm can find rumors with acceptable accuracy.


Tetsuro Takahashi Fujitsu Laboratories LTD. 4-1-1 Kamikodanaka, Kawasaki 211-88, JAPAN Email: [email protected]


I. INTRODUCTION

Twitter¹ is a widely used microblogging service. The purpose of Twitter varies from person to person. As Zhao et al. [1] described, Twitter has been used as a communication channel and as an information source. Twitter also works quickly and effectively in disaster situations [2].

For several years, people have used Twitter to spread information about disasters such as earthquakes. While traditional communication channels such as telecommunication networks have suffered outages in disaster situations, the Internet has been relatively robust, and some social networking services such as Twitter have kept working [3].

However, Twitter does not always give us only benefits. It also brings a negative by-product: rumors. The more widely Twitter is used, the wider and faster rumors travel through people, and the rise in the number of Twitter users boosts the chances that new rumors are generated. Those rumors may confuse people with wrong information and hinder the primary aid required in an emergency. If many rumors occupy Twitter, it becomes difficult to believe information on Twitter, which means we lose this promising new medium. This motivated us to tackle the problem of rumors on Twitter. We first investigated actual instances of rumors generated after a disaster and then attempted to identify the characteristics of those rumors. Based on the investigation, we developed a system which detects candidate rumors on Twitter. In this paper, Section II introduces two instances of actual rumors, Section III reports the results of our investigation of those rumors, and experiments on detecting rumors using these results are presented in Section IV.

Fig. 1. Number of tweets (Rumor1)

Fig. 2. Number of tweets (Rumor2)

II. RUMORS ON TWITTER

A. Instances of rumor

After the earthquake and tsunami that occurred in Japan on March 11th, Twitter was widely used for communication [4]. On the other hand, some rumors were generated and spread among Twitter users. We introduce two sets of rumor tweets here. Neither of these tweets is true.²

1) Rumor1: Geek house Asakusa (Original in Japanese):

¹http://twitter.com/
²The original tweets are no longer accessible because they were deleted within a few days.

Fig. 3. Rumors and Corrections (Rumor1)


地震が起きた時、社内サーバールームにいたのだが、ラックが倒壊した。腹部を潰され、血が流れている。痛い、誰か助けてくれ。ドアが変形し、安定した情報が流れるまでは誰も動いてはならない旨が館内放送で流れている。それでは遅すぎる。腕しか動かない、呼吸ができない...

Rumor1: Geek house Asakusa (Translated in English): When the earthquake occurred, I was in our server room. A rack has collapsed on me. My stomach is crushed and I am bleeding. It hurts. Somebody help me. The doors are deformed, and an announcement says that no one should move until reliable information arrives. That will be too late. I can move only my arm. It is hard to breathe.

2) Rumor2: Cosmo Oil (Original in Japanese):

【拡散希望】千葉市近辺に在住の方! コスモ石油の爆発により有害物質が雲などに付着し、雨などといっしょに降るので外出の際は傘かカッパなどを持ち歩き、身体が雨に接触しないようにして下さい

Rumor2: Cosmo Oil (Translated in English): "Request for spreading" People living in the Chiba-city area must use an umbrella or rain coat to avoid contact with rain. Due to the explosion at the Cosmo Oil factory, harmful material will fall with the rain.


B. Data of analysis

The Twitter data we used for the investigation was obtained through the streaming API provided by Twitter, Inc. This API gives a 1% sample of all tweets. Using this API with a language constraint (Japanese), we obtained about 6 million tweets posted in March 2011. From this data set we extracted tweets containing the keywords "server room OR Geek house" (for Rumor1) and "Cosmo Oil" (for Rumor2), and then checked them manually to eliminate irrelevant tweets. As a result, 1,135 tweets for Rumor1 and 2,042 tweets for Rumor2 were obtained. We used these data for the analysis described below. Figure 1 and Figure 2 show the number of tweets about the two rumors over time.

C. How rumors spread

The tweets of Rumor1 and Rumor2 were posted by 1,127 and 2,023 users respectively; only 0.70% (Rumor1) and 0.93% (Rumor2) of the tweets were repeat posts by the same users. This means that these events were spread by many different people rather than by a few specific people posting repeatedly. On Twitter, users can have a relationship called "follow" with other users: if userA follows userB, userB's tweets are shown on userA's page.³ Twitter also provides a function, "retweet", by which users can spread a tweet to their followers. People retweet when they read a tweet they consider valuable enough to spread.

³This page is called the timeline on Twitter.
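The repeat-post ratio above can be checked with a few lines of code; this is a minimal sketch, assuming tweets are available as (user_id, text) pairs, which is a hypothetical stand-in for records pulled from the streaming API.

```python
from collections import Counter

def repeat_post_ratio(tweets):
    """Fraction of tweets that are repeat posts by an already-seen user.

    `tweets`: list of (user_id, text) pairs -- a hypothetical record
    format standing in for the streaming-API data.
    """
    per_user = Counter(user for user, _ in tweets)
    repeats = sum(count - 1 for count in per_user.values())
    return repeats / len(tweets)

# Toy example: user "a" posts twice -> 1 repeat out of 5 tweets.
sample = [("a", "t1"), ("b", "t2"), ("c", "t3"), ("a", "t4"), ("d", "t5")]
print(round(repeat_post_ratio(sample), 2))  # 0.2
```

With this definition, (1135 - 1127)/1135 ≈ 0.70% and (2042 - 2023)/2042 ≈ 0.93%, consistent with the figures reported above.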

Fig. 4. Rumors and Corrections (Rumor2)

Usually, people read tweets posted by the users they follow, which gives some guarantee of the reliability of information on Twitter. In the case of a retweet, however, people read a tweet written by a stranger. When people retweet an already-retweeted tweet, it is more likely to be a rumor than tweets from the users they follow. Especially in a disaster situation, people try to spread urgent information with good intent, and once a rumor starts spreading, it is difficult to stop.

D. Rumor Corrections

Among the tweets which mentioned these two rumors there are two types: the rumor itself and corrections of the rumor. Figure 3 and Figure 4 show the numbers of each, and they give us hope for dispelling rumors: a number of tweets attempted to correct the rumors, and they worked. The charts show that after correcting tweets were posted, the rumor tweets died down quickly, and the rumors never came up again. Shirai et al. [5] reported that 14.7% of people corrected their own rumor tweets.⁴ They also reported that corrections spread about two times faster than rumors.

⁴Because our data is a 1% sample, it was difficult to determine in our data how many people who spread rumors later corrected them.
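Time series like those in Figures 3 and 4 can be tallied directly from the extracted tweets. A minimal sketch, assuming tweets are available as (timestamp, text) pairs (a hypothetical format) and that a correction is identified by a keyword; treating every non-correction tweet as a rumor tweet is a simplification for illustration.

```python
from collections import Counter
from datetime import datetime

def hourly_counts(tweets, correction_marker="false rumor"):
    """Bucket tweets about one rumor into (hour, kind) counts.

    A tweet containing `correction_marker` is counted as a correction;
    every other tweet is counted as a rumor tweet.
    """
    counts = Counter()
    for ts, text in tweets:
        kind = "correct" if correction_marker in text else "rumor"
        counts[(ts.strftime("%H:00"), kind)] += 1
    return counts

tweets = [
    (datetime(2011, 3, 11, 15, 10), "help, a rack collapsed on me"),
    (datetime(2011, 3, 11, 15, 40), "RT help, a rack collapsed on me"),
    (datetime(2011, 3, 11, 16, 5), "that report is a false rumor"),
]
counts = hourly_counts(tweets)
print(counts)
```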

Fig. 5. Retweet ratio (Rumor1)

III. INVESTIGATION OF RUMORS ON TWITTER

A. Objective

From our analysis and previous work, we can say that rumors spread quickly on Twitter; however, once tweets for correction are posted, they spread even more quickly and can dispel the rumors. People are not willing to spread rumors.

The fundamental approach to the problem of rumors is to stop their generation; however, this is difficult, especially on Twitter, where a vast number of people can freely post any message. As we described in the previous section, a correction on Twitter works effectively against rumors. Once a person or organization mentioned in a rumor (we call them the "target" hereafter) finds the rumor, it is not hard to correct it by posting a tweet with an appropriate explanation. In order to prevent the spreading of rumors, we propose to post correction tweets. Of course, there are other proactive approaches to preventing rumors, such as moral education; the proposed strategy is complementary to those. To realize a quick correction, we have to notice a rumor as soon as possible. We investigated the two sets of tweets and found some characteristics with potential for detecting rumors.

B. Clues for Detecting Rumors

1) Burst: The number of tweets for both rumors burst suddenly. We can find bursts by applying standard techniques; however, they do not always work well for finding rumors. In Figure 1 and Figure 2, the peaks consist of different types of tweets: while the biggest burst in Rumor1 consisted of rumor tweets, that in Rumor2 consisted of correction tweets. Burst detection can find the rumor in Rumor1 because the number of rumor tweets burst suddenly and then went down quickly. On the other hand, the rumor in Rumor2 spread slowly over 30 hours and then died out; burst detection cannot work in this case. We must keep in mind that burst detection does not always detect rumors.

2) Retweet ratio: While the general retweet ratio is 8.03%, the retweet ratios of the two cases were 96.2% (Rumor1) and 70.7% (Rumor2). This high value can be a clue for finding rumors. Figure 5 and Figure 6 show the retweet ratio among all tweets about Rumor1 and Rumor2 respectively. We can see that the ratio rose suddenly and stayed high for a while. This is highly abnormal compared with the general retweet ratio (8.03%). Using this characteristic, we can estimate that a rumor is spreading.

3) Difference of word distribution: Since the words in rumor tweets seem to differ from those in correction tweets, this difference may help detect rumors. We investigated the difference in word distribution between rumor and correction tweets in the two instances. For the investigation, we extracted a list of content words from the tweets by applying morphological analysis. We then gave a score (score(w)) to every content word

Fig. 6. Retweet ratio (Rumor2)

(w). We used the ratio of word occurrence in correction tweets (num_in_correction(w)) over that in rumor tweets (num_in_rumor(w)):

score(w) = num_in_correction(w) / num_in_rumor(w)    (1)

If rumor tweets did not contain a word at all, 0.1 was used for num_in_rumor(w). Table I shows the result of the score calculation: a list of the words with scores higher than 1.0. In Rumor1 there was only one such word, "false rumor". In Rumor2 there were many high-scoring words, but the only word common to both rumors was "false rumor". Even when a word has a high score, it may depend on a specific topic.

C. Rumor recognition by content analysis

If we rely only on statistics, we may catch other topics such as gossip or campaigns instead of rumors. Many companies now use Twitter for campaigns, which cause buzz and bursts on Twitter. We need content analysis of tweets in order to distinguish rumors from gossip or campaigns; however, it is still hard to do this automatically. It may
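Equation (1) with the 0.1 smoothing can be sketched directly. The toy word lists below are illustrative only, not the paper's data; in practice the tokens would come from morphological analysis of the rumor and correction tweets.

```python
from collections import Counter

def word_scores(rumor_words, correction_words, smoothing=0.1):
    """score(w) = num_in_correction(w) / num_in_rumor(w)  -- Eq. (1).

    `smoothing` replaces a zero rumor-side count, as in the paper.
    Inputs are content-word token lists from rumor tweets and
    correction tweets respectively.
    """
    in_rumor = Counter(rumor_words)
    in_correction = Counter(correction_words)
    return {w: in_correction[w] / (in_rumor[w] or smoothing)
            for w in in_correction}

# Toy tokens (illustrative only): "false-rumor" never occurs in the
# rumor tweets, so it receives a very high score.
rumor = ["rain", "harmful", "explosion", "rain"]
correction = ["false-rumor", "false-rumor", "rain"]
scores = word_scores(rumor, correction)
print(scores["false-rumor"])  # 2 / 0.1: very high
print(scores["rain"])         # 1 / 2 = 0.5
```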

TABLE I
WORD DISTRIBUTION IN TWEETS

Rumor1:
word: デマ (false rumor), score: 630.00

Rumor2:
word: 確認 (check), score: 5690.00
word: 虚偽 (lie), score: 2910.00
word: 消防 (fire fighting), score: 2630.00
word: 送る (send), score: 2580.00
word: 否定 (negative), score: 1650.00
word: 事実 (truth), score: 1400.00
:
word: デマ (false rumor), score: 259.00

Fig. 7. Diagram of process flow: Tweet → (1) Target Detection → Target List → (2) Scoring Retweet Ratio → Target List w/ high Retweet Ratio → (3) Scoring Clue Word → Rumor Candidates

TABLE II
EXTRACTED KEYWORDS FOR RUMOR2

word | TFIDF | Frequency
コスモ石油 (Cosmo Oil) | 17248.70 | 2116
有害 (harmful) | 10977.46 | 1405
雨 (rain) | 10898.68 | 1847
物質 (material) | 8777.04 | 1385
爆発 (explosion) | 6862.35 | 1129
降る (fall) | 6852.59 | 1102
付着 (attached) | 5668.96 | 646
傘 (umbrella) | 5415.50 | 703
外出 (go out) | 4962.68 | 681
カッパ (rain coat) | 4861.93 | 552
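A keyword table like Table II could be produced by a simple TF-IDF weighting. The sketch below uses one common variant (raw term frequency times a smoothed inverse document frequency), which is an assumption; the paper only cites TFIDF [6] generically, and the toy tokens are illustrative stand-ins for morphologically analyzed Japanese tweets.

```python
import math
from collections import Counter

def tfidf_keywords(target_tweets, background, top_n=10):
    """Rank words in a set of tweets by a simple TF-IDF weight.

    `target_tweets` and `background` are lists of pre-tokenized tweets
    (lists of words).  TF is counted over the target tweets; document
    frequency over the background corpus.
    """
    tf = Counter(w for tweet in target_tweets for w in tweet)
    n_docs = len(background)
    df = Counter(w for tweet in background for w in set(tweet))
    scores = {w: count * math.log(n_docs / (1 + df[w]))
              for w, count in tf.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

rumor2 = [["cosmo-oil", "explosion", "rain"],
          ["cosmo-oil", "harmful", "rain"]]
corpus = rumor2 + [["weather", "rain"], ["lunch", "today"]] * 50
keywords = tfidf_keywords(rumor2, corpus, top_n=3)
print(keywords)  # "cosmo-oil" ranks first; the common word "rain" drops out
```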

require deep semantic recognition of text, especially of short texts limited to 140 characters. Furthermore, it requires background knowledge to judge whether an event is a rumor or not, and this background knowledge may need to be built from other tweets in real time. As a compromise, summarization by a tag cloud consisting of a weighted keyword list can help humans grasp the meaning of a set of tweets. TFIDF [6] is a well-known term weighting method, and it works to extract keywords that summarize a set of tweets. Table II shows a list of keywords extracted from the tweets of Rumor2. These keywords can help a company recognize whether a burst is a rumor or the result of its own campaign.

D. Target detection

If every potential target watched the tweets that mention them, they could detect candidate rumors about themselves by using the clues described in Section III-B. But this is not realistic for all targets of rumors. Some of the clues described in Section III-B only work on a specific subset of tweets that mention a target; burst and retweet ratio computed over all tweets reveal nothing about a particular rumor. If specific keywords cannot be prepared in advance, we need another way to obtain such a subset of tweets. We propose to apply named entity extraction for this purpose. Since rumor tweets like Rumor2 contain the name of the target (e.g. "Cosmo Oil"), the target can be extracted from the rumor tweet itself. To extract the name of the target, ordinary named entity extraction [7] works: many algorithms have been proposed so far and have achieved high performance by using not only a dictionary but also contextual information. While several types of named entities, such as DATE, LOCATION, PERSON, and ORGANIZATION, have been extracted in this research field, for our purpose we should extract PERSON, ORGANIZATION, and LOCATION as targets of rumors. We can apply named entity extraction to all sampled tweets and obtain a list of named entities, which can then be used as keywords to obtain a subset of tweets for rumor detection.

IV. EXPERIMENTS

Aiming to detect rumors on Twitter, we built a system based on the findings explained in Section III. The diagram of the process flow is shown in Figure 7. The target data of these experiments is the same as in the investigation described in Section III.

A. Named Entity Extraction

The system first applied named entity recognition to all tweets and extracted named entities which occurred 30 or more times on at least one day. The threshold of 30 is arbitrary; however, no word can become a big topic if it does not capture at least that much attention. We used an implementation proposed by Iwakura [8] for the named entity recognition. After this process, 1,976 unique named entities were obtained. The average number of named entities per day was 255, which means that many words appear on multiple days; 1,264 named entities were extracted on more than one of the 31 days. The extracted named entities are used as targets in the following process.

B. Filtering by Retweet Ratio

In the next step, the system calculated the retweet ratio on each day of the month for every target. Because a rumor spreads within a short period, as shown in Figure 8, the average retweet ratio does not work to find rumors. Figure 8 shows the retweet ratio over one month for the 7 targets that were finally identified as rumors by the system. The system selected targets whose retweet ratio was 0.80 or higher on at least one day; 1,015 targets remained after this selection. Most of these targets were topics that spread widely; however, not all of them were rumors. The following tweet is an instance of a spread topic.

Fig. 8. Retweet ratio distribution (retweet ratio over one month for 7 targets: Kashima, GeekHouse, Tsukiji, Oda, Radiological Inst., Hitler, Takashimaya)

TABLE III
EXTRACTED 10 CANDIDATES OF RUMOR

Index | Label in Figure 8 | Target (in English)
1 | Kashima | 茨城県鹿島コンビナート (Kashima complex in Ibaraki)
2 | GeekHouse | ギークハウス浅草 (GeekHouse Asakusa)
3 | (same as 2) | 東京都台東区花川戸 (Hanakawado, Taito ward, Tokyo)
4 | Tsukiji | 築地市場 (Tsukiji market)
5 | Oda | 尾田 (Oda)
6 | (same as 5) | 尾田栄一郎 (Eiichiro Oda)
7 | Radiological Inst. | 放射線医学総合研究所 (National Institute of Radiological Sciences)
8 | Hitler | ヒトラー (Hitler)
9 | (same as 1) | 鹿島 (Kashima)
10 | Takashimaya | 新宿高島屋 (Shinjuku Takashimaya)
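Steps (2) and (3) of the process flow (the retweet-ratio filter and the clue-keyword filter) can be sketched as below, assuming tweets are already grouped per target and per day. The 0.80 retweet-ratio threshold, the keyword "false rumor", and the 10% clue threshold are taken from the paper; the in-memory record format is hypothetical.

```python
def detect_rumor_candidates(tweets_by_target, rt_threshold=0.80,
                            clue="false rumor", clue_threshold=0.10):
    """`tweets_by_target` maps a target entity to a list of days,
    each day being a list of (is_retweet, text) tweet records."""
    # Step (2): targets whose retweet ratio reaches the threshold on
    # at least one day are considered to be spreading widely.
    spreading = [
        target for target, days in tweets_by_target.items()
        if any(day and sum(rt for rt, _ in day) / len(day) >= rt_threshold
               for day in days)
    ]
    # Step (3): among those, keep targets where the clue keyword
    # appears in more than `clue_threshold` of some day's tweets.
    return [
        target for target in spreading
        if any(day and
               sum(clue in text for _, text in day) / len(day) > clue_threshold
               for day in tweets_by_target[target])
    ]

data = {
    "CosmoOil": [[(True, "harmful rain, stay inside"),
                  (True, "RT harmful rain, stay inside"),
                  (True, "bring an umbrella"),
                  (True, "that is a false rumor"),
                  (True, "false rumor, please stop spreading")]],
    "Campaign": [[(True, "retweet to win!"), (True, "retweet to win!"),
                  (True, "retweet to win!"), (False, "nice prize")]],
}
print(detect_rumor_candidates(data))  # ['CosmoOil']
```

Here "Campaign" survives neither filter: its retweet ratio (0.75) stays below the threshold, illustrating why widely shared but non-rumor topics still need the clue-keyword step.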

Tokyo International Forum is now opening its lobby to the public for those who cannot go home due to the earthquake.

People tried to spread this message to help others. This is not a rumor of the kind we target in this paper. Indeed, there are many other spreading topics, such as witty jokes, gossip about celebrities, weather forecasts, and so on. These topics are generated and spread widely and quickly through Twitter users, and the retweet ratio shows no difference between such tweets and rumors. Hence we need another clue to detect rumors.

C. Filtering by Clue Keyword

In the third step, we used the keyword "false rumor" ("デマ" in Japanese) to detect rumors among the candidates. This keyword appeared in 19.0% (Rumor1) and 12.5% (Rumor2) of the tweets. Based on this investigation, we used 10% as the threshold in order to cover both instances: if more than 10% of the tweets about a target posted in one day contain the keyword "false rumor", the system judges the topic to be a rumor. As a result of this filtering, the system found the 10 candidate rumors shown in Table III, including the two instances introduced in the first half of this paper. Six candidates (2, 3, 4, 5, 6, 10 in Table III) were actually rumors, and four (1, 7, 8, 9) were wrong.

The following tweet is an example of a wrong one (index 1 in Table III).

We need food. Please help me. This is not a false rumor.

To distinguish such tweets from rumors, we can apply a document classification technique; the example above contains a negation expression ("is not a false rumor") which can easily be detected by text analysis. We used only one keyword, "false rumor", derived from the two instances of rumor. If more instances of rumor were available, we could obtain more keywords and could then weight them, enabling a more flexible calculation in the filtering.

If we relied on text search alone, we would have many more false candidates. We applied text search for each target in the list of 1,976 targets with the keyword "false rumor". As a result, 667 targets had at least one tweet containing both the target word and the keyword "false rumor", and 302 targets had at least five such tweets. This shows that text search alone does not work to find rumors. Compared with text search, our proposed method gave 10 candidates, a number feasible for a human to check by reading the original tweets or summaries generated by the method described in Section III-C.

While the evaluation shows the accuracy of the system output, it tells nothing about coverage (how many rumors the system found out of all rumors in the target data set). Evaluating coverage is not an easy task, but it should be done with techniques similar to those used in the field of Information Retrieval [9].

V. RELATED WORK

Shirai et al. [5] attempted to model the diffusion of rumors. They found some characteristics of rumors on Twitter by using the model; however, they did not mention how to detect rumors. Umejima et al. [10] investigated rumor tweets focusing on the content of retweeted tweets. While they found some characteristics in the content of rumor tweets, they did not mention how to detect rumor tweets either. Sakaki et al. [11] tried to detect earthquakes from Twitter. That work focused on a single kind of event, namely earthquakes. The biggest difference from our work is that any event can become a rumor; we cannot prepare for unexpected types of events in advance.

VI. CONCLUSION

In the theme of disaster mitigation, not only natural disasters themselves but also information disasters have taken place. After a disaster, we cannot undo the disaster itself, but we can mitigate the side-effect damage caused by information. In this paper we investigated two sets of rumor tweets and found several characteristics of rumors. Based on the investigation, we proposed a sequence of processes to detect rumors. First, a target list has to be generated, for which we can apply named entity extraction techniques. Second, we have to detect topics which are spreading widely. A burst in the number of target-word occurrences seems promising, but it is not, because a rumor may spread slowly while a general topic can burst. Our investigation showed that the retweet ratio works for capturing spreading topics. Third, we have to detect negative rumors among the spreading topics. We extracted a clue keyword from the instances of rumor and confirmed that the keyword works to find other rumors as well.

While we have focused only on Twitter in this work, people get information from multiple channels, such as other social networking services, TV, and online and offline chat, in which they may also generate, spread, or correct rumors. As future work, we have to incorporate these other data sources to achieve high coverage.

REFERENCES

[1] D. Zhao and M. B. Rosson, "How and why people twitter: the role that micro-blogging plays in informal communication at work," in Proceedings of the ACM 2009 International Conference on Supporting Group Work (GROUP '09). New York, NY, USA: ACM, 2009, pp. 243–252.
[2] T. Hossmann, P. Carta, D. Schatzmann, F. Legendre, P. Gunningberg, and C. Rohner, "Twitter in disaster mode: security architecture," in Proceedings of the Special Workshop on Internet and Disasters (SWID '11). New York, NY, USA: ACM, 2011, pp. 7:1–7:8.
[3] A. Hermida, "From tv to twitter: How ambient news became ambient journalism," Media/Culture Journal, vol. 13, no. 2, 2010.
[4] White Paper 2011, Information and Communications in Japan. Ministry of Internal Affairs and Communications, Japan, 2011.
[5] T. Shirai, T. Sakaki, F. Toriumi, K. Shinoda, K. Kazama, I. Noda, M. Numao, and S. Kurihara, "Estimation of false rumor diffusion model and estimation of prevention model of false rumor diffusion on twitter (in Japanese)," in The 26th Annual Conference of the Japanese Society for Artificial Intelligence, 2012.
[6] K. S. Jones, "A statistical interpretation of term specificity and its application in retrieval," Journal of Documentation, vol. 28, no. 1, pp. 11–21, 1972.
[7] E. F. Tjong Kim Sang and F. De Meulder, "Introduction to the CoNLL-2003 shared task: language-independent named entity recognition," in Proceedings of the Seventh Conference on Natural Language Learning at HLT-NAACL 2003 (CONLL '03). Stroudsburg, PA, USA: Association for Computational Linguistics, 2003, pp. 142–147.
[8] T. Iwakura, "A named entity recognition method using rules acquired from unlabeled data," in RANLP, 2011, pp. 170–177.
[9] R. A. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval. Boston, MA, USA: Addison-Wesley Longman Publishing Co., Inc., 1999.
[10] A. Umejima, M. Miyabe, E. Aramaki, and A. Nadamoto, "Tendency of rumor and correction re-tweet on the twitter during disasters," IPSJ SIG Notes, vol. 2011, no. 4, pp. 1–6, 2011.
[11] T. Sakaki, M. Okazaki, and Y. Matsuo, "Earthquake shakes twitter users: real-time event detection by social sensors," in Proceedings of the 19th International Conference on World Wide Web (WWW '10). New York, NY, USA: ACM, 2010, pp. 851–860.
