NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks Shashank Pandit, Duen Horng Chau, Samuel Wang, Christos Faloutsos ∗ Carnegie Mellon University Pittsburgh, PA 15213, USA {shashank,
dchau, samuelwang, christos}@cs.cmu.edu
ABSTRACT Given a large online network of online auction users and their histories of transactions, how can we spot anomalies and auction fraud? This paper describes the design and implementation of NetProbe, a system that we propose for solving this problem. NetProbe models auction users and transactions as a Markov Random Field tuned to detect the suspicious patterns that fraudsters create, and employs a Belief Propagation mechanism to detect likely fraudsters. Our experiments show that NetProbe is both efficient and effective for fraud detection. We report experiments on synthetic graphs with as many as 7,000 nodes and 30,000 edges, where NetProbe was able to spot fraudulent nodes with over 90% precision and recall, within a matter of seconds. We also report experiments on a real dataset crawled from eBay, with nearly 700,000 transactions between more than 66,000 users, where NetProbe was highly effective at unearthing hidden networks of fraudsters, within a realistic response time of about 6 minutes. For scenarios where the underlying data is dynamic in nature, we propose Incremental NetProbe, which is an approximate, but fast, variant of NetProbe. Our experiments prove that Incremental NetProbe executes nearly doubly fast as compared to NetProbe, while retaining over 99% of its accuracy.
Categories and Subject Descriptors H.4.m [Information Systems]: Miscellaneous
General Terms Algorithms
Keywords Fraud detection, Bipartite cores, Markov random fields, Belief propagation ∗ This material is based upon work supported by the National Science Foundation under Grants No. IIS-0326322 IIS-0534205. This work is also supported in part by the Pennsylvania Infrastructure Technology Alliance (PITA), an IBM Faculty Award, a Yahoo Research Alliance Gift, with additional funding from Intel, NTT and Hewlett-Packard. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation, or other funding parties.
Copyright is held by the International World Wide Web Conference Committee (IW3C2). Distribution of these papers is limited to classroom use, and personal use by others. WWW 2007, May 8–12, 2007, Banff, Alberta, Canada. ACM 978-1-59593-654-7/07/0005.
Online Auction Site Auction data modelled as graph Nodes: users Edges: transactions
...
...
...
Crawler Agents 2-tier parallelizable. Multiple agents with multiple threads to download auction data.
...
Data Master
NetProbe
Maintain centralized queue to avoid redundant crawlling.
Application Server Runs algorithms to spot suspicious patterns in auction graph.
XM
L
User Queries Trustworthiness of "Alisher" User enters the user ID "Alisher" into a Java applet that talks to the server, which sends assessment results in an XML file. The applet interprets and visualizes suspicious networks.
Figure 1: Overview of the NetProbe system
1. INTRODUCTION Online auctions have been thriving as a business over the past decade. People from all over the world trade goods worth millions of dollars every day using these virtual marketplaces. EBay1 , the world’s largest auction site, reported a third quarter revenue of $1,449 billion, with over 212 million registered users [6]. These figures represent a 31% growth in revenue and 26% growth in the number of registered users over the previous year. Unfortunately, rapid commercial success has made auction sites a lucrative medium for committing fraud. For more than half a decade, auction fraud has been the most preval