The Network Completion Problem: Inferring Missing Nodes and Edges in Networks

PDF Version Also Available for Download.

Description

Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying ... continued below

Physical Description

PDF-file: 18 pages; size: 0.4 Mbytes

Creation Information

Kim, M & Leskovec, J November 14, 2011.

Context

This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided by UNT Libraries Government Documents Department to Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 13 times , with 4 in the last month . More information about this article can be viewed below.

Who

People and organizations associated with either the creation of this article or its content.

Publisher

Provided By

UNT Libraries Government Documents Department

Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.

Contact Us

What

Descriptive information to help identify this article. Follow the links below to find similar items on the Digital Library.

Description

Network structures, such as social networks, web graphs and networks from systems biology, play important roles in many areas of science and our everyday lives. In order to study the networks one needs to first collect reliable large scale network data. While the social and information networks have become ubiquitous, the challenge of collecting complete network data still persists. Many times the collected network data is incomplete with nodes and edges missing. Commonly, only a part of the network can be observed and we would like to infer the unobserved part of the network. We address this issue by studying the Network Completion Problem: Given a network with missing nodes and edges, can we complete the missing part? We cast the problem in the Expectation Maximization (EM) framework where we use the observed part of the network to fit a model of network structure, and then we estimate the missing part of the network using the model, re-estimate the parameters and so on. We combine the EM with the Kronecker graphs model and design a scalable Metropolized Gibbs sampling approach that allows for the estimation of the model parameters as well as the inference about missing nodes and edges of the network. Experiments on synthetic and several real-world networks show that our approach can effectively recover the network even when about half of the nodes in the network are missing. Our algorithm outperforms not only classical link-prediction approaches but also the state of the art Stochastic block modeling approach. Furthermore, our algorithm easily scales to networks with tens of thousands of nodes.

Physical Description

PDF-file: 18 pages; size: 0.4 Mbytes

Source

  • Presented at: SIAM International Conference on Data Mining (SDM), Mesa, AZ, United States, Apr 28 - Apr 30, 2011

Language

Item Type

Identifier

Unique identifying numbers for this article in the Digital Library or other systems.

  • Report No.: LLNL-CONF-513839
  • Grant Number: W-7405-ENG-48
  • Office of Scientific & Technical Information Report Number: 1035599
  • Archival Resource Key: ark:/67531/metadc837635

Collections

This article is part of the following collection of related materials.

Office of Scientific & Technical Information Technical Reports

What responsibilities do I have when using this article?

When

Dates and time periods associated with this article.

Creation Date

  • November 14, 2011

Added to The UNT Digital Library

  • May 19, 2016, 3:16 p.m.

Description Last Updated

  • Dec. 6, 2016, 1:52 p.m.

Usage Statistics

When was this article last used?

Yesterday: 0
Past 30 days: 4
Total Uses: 13

Interact With This Article

Here are some suggestions for what to do next.

Start Reading

PDF Version Also Available for Download.

Citations, Rights, Re-Use

Kim, M & Leskovec, J. The Network Completion Problem: Inferring Missing Nodes and Edges in Networks, article, November 14, 2011; Livermore, California. (digital.library.unt.edu/ark:/67531/metadc837635/: accessed August 17, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.