ePrints@IIScePrints@IISc Home | About | Browse | Latest Additions | Advanced Search | Contact | Help

A Shapley Value-Based Approach to Discover Influential Nodes in Social Networks

Narayanam, Ramasuri and Narahari, Yadati (2011) A Shapley Value-Based Approach to Discover Influential Nodes in Social Networks. In: IEEE Transactions on Automation Science and Engineering, 8 (1). pp. 130-147.

[img] PDF
Shaply.pdf - Published Version
Restricted to Registered users only

Download (810Kb) | Request a copy
Official URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumb...

Abstract

Our study concerns an important current problem, that of diffusion of information in social networks. This problem has received significant attention from the Internet research community in the recent times, driven by many potential applications such as viral marketing and sales promotions. In this paper, we focus on the target set selection problem, which involves discovering a small subset of influential players in a given social network, to perform a certain task of information diffusion. The target set selection problem manifests in two forms: 1) top-k nodes problem and 2) lambda-coverage problem. In the top-k nodes problem, we are required to find a set of k key nodes that would maximize the number of nodes being influenced in the network. The lambda-coverage problem is concerned with finding a set of k key nodes having minimal size that can influence a given percentage lambda of the nodes in the entire network. We propose a new way of solving these problems using the concept of Shapley value which is a well known solution concept in cooperative game theory. Our approach leads to algorithms which we call the ShaPley value-based Influential Nodes (SPINs) algorithms for solving the top-k nodes problem and the lambda-coverage problem. We compare the performance of the proposed SPIN algorithms with well known algorithms in the literature. Through extensive experimentation on four synthetically generated random graphs and six real-world data sets (Celegans, Jazz, NIPS coauthorship data set, Netscience data set, High-Energy Physics data set, and Political Books data set), we show that the proposed SPIN approach is more powerful and computationally efficient. Note to Practitioners-In recent times, social networks have received a high level of attention due to their proven ability in improving the performance of web search, recommendations in collaborative filtering systems, spreading a technology in the market using viral marketing techniques, etc. It is well known that the interpersonal relationships (or ties or links) between individuals cause change or improvement in the social system because the decisions made by individuals are influenced heavily by the behavior of their neighbors. An interesting and key problem in social networks is to discover the most influential nodes in the social network which can influence other nodes in the social network in a strong and deep way. This problem is called the target set selection problem and has two variants: 1) the top-k nodes problem, where we are required to identify a set of k influential nodes that maximize the number of nodes being influenced in the network and 2) the lambda-coverage problem which involves finding a set of influential nodes having minimum size that can influence a given percentage lambda of the nodes in the entire network. There are many existing algorithms in the literature for solving these problems. In this paper, we propose a new algorithm which is based on a novel interpretation of information diffusion in a social network as a cooperative game. Using this analogy, we develop an algorithm based on the Shapley value of the underlying cooperative game. The proposed algorithm outperforms the existing algorithms in terms of generality or computational complexity or both. Our results are validated through extensive experimentation on both synthetically generated and real-world data sets.

Item Type: Journal Article
Additional Information: Copyright 2011 IEEE. Personal use of this material is permitted.However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Keywords: Diffusion of information; influential nodes; Shapley value; social networks; target set selection; top-k nodes; lambda-coverage
Department/Centre: Division of Electrical Sciences > Computer Science & Automation (Formerly, School of Automation)
Date Deposited: 01 Mar 2011 11:22
Last Modified: 01 Mar 2011 11:22
URI: http://eprints.iisc.ernet.in/id/eprint/35821

Actions (login required)

View Item View Item