k -core: Theories and applications

https://doi.org/10.1016/j.physrep.2019.10.004 Get rights and content

Abstract

With the rapid development of science and technology, the world is becoming increasingly connected. The following dire need for understanding both the relationships amongst individuals and the global structural characteristics brings forward the study of network sciences and many interdisciplinary subjects in recent years. As a result, it is crucial to have methods and algorithms that help us to unveil the structural properties of a network. Over the past few decades, many essential algorithms have been developed by scientists from many different fields. This review will focus on one of the most widely used methods called the k -core decomposition. The k -core decomposition is to find the largest subgraph of a network, in which each node has at least k neighbors in the subgraph. The most commonly used algorithm to perform k -core decomposition is a pruning process that to recursively remove the nodes that have degrees less than k . The algorithm was firstly proposed by Seidman in 1983 and soon became one of the most popular algorithms to detect the network structure due to its simplicity and broad applicability. This algorithm is widely adopted to find the densest part of a network across a broad range of scientific subjects including biology, network science, computer science, ecology, economics, social sciences, etc., so to achieve the vital knowledge under different contexts. Besides, a few physicists find that an exciting phase transition emerges with various critical behaviors during the pruning process. This review aims at filling the gap by making a comprehensive review of the theoretical advances on k -core decomposition problem, along with a review of a few applications of the k -core decomposition from many interdisciplinary perspectives.

Introduction

In the recent years, complex networks are widely used to model the real-world systems that are composed of interacting individuals [1], [2], [3], [4], [5], [6], [7], [8]. By exploring the structural characteristics of the network, we are able to have an in-depth understanding of the properties of the real-world systems. k -core decomposition, is one of the most widely accepted algorithms due to its linear time complexity [9] and intuitive characteristics. In general, the k -core of a network is the maximal subgraph in which each node has at least k connections to other nodes in the subgraph, despite how many links we have outside the subgraph. The idea of k -core can be traced back to Erdős and Hajnal [10] in 1966, they proposed that the coloring number of a graph G to be the least k for which there exists an ordering of the nodes of G in which each node has fewer than k neighbors that are earlier in the ordering. An equivalent expression called the degeneracy was later defined by Lick and White in 1970 [11], they defined that the degeneracy of a graph G is the least k such that for each induced subgraph of G , at least one node in the subgraph has k or fewer neighbors.

The commonly accepted concept of k -core was first proposed by Seidman [12] and Seidman also derived an algorithm called the k -core pruning process to obtain the k -core of a given network, which is to remove the nodes that have degree less than k recursively. Here we show a simple illustration of the concepts in Fig. 1.

With this method, the densely connected area can be identified and in order to be included in the k -core, a node must have at least k links to other nodes in the k -core, regardless of how many other nodes they are connected to outside the k -core. Another closely related concept is called the k -shell [13], which is defined as the group of nodes that belong to k -core but not belong to ( k + 1 ) -core. Also, a very similar concept that is widely used, the coreness of a node [14]. The statement that the coreness of a node equals k , is equivalent to the statement that the node is in the k -shell of the network. As the k -core pruning process is both simple (linear time complexity) and intuitive, later on, the concept of k -core has become surprisingly popular, and widely been applied in many scientific fields.

In the researches of the k -core, people come to notice the emergence of criticality of the k -core pruning process. The only two outcomes after the pruning process are either the network disappears and no node survives the pruning, or no nodes can be further removed so that the network finally has a fix-sized k -core. Researchers [15], [16], [17], [18], [19], [20] find that whether or not the network has a k -core remaining after the pruning process very much depends on the density of the network, and there exists a specific criterion of the initial density of the network that controls the existence of the k -core. Many mathematicians and physicists have made essential contributions in solving the problem theoretically in the past decades [15], [18], [19], [21], [22], [23]. Recently, it is reported that the exact analytical result has been obtained [20], [24], [25]. The k -core pruning process is among the few examples that the precise critical behavior is presented in detail and solved analytically. The analytical solution of k -core pruning process provides new insights to study the critical phenomena that attract so many scientists. In this review we give a summary of the related researches in this direction, along with a survey of some important applications of k -core on various research fields.

The review article will be organized as follows. Section 2 will review the theoretical researches of the critical behavior in k -core decomposition, with special emphasis on the theoretical framework to solve the problem and the exact analytical solution that describes the entire pruning process. In Section 3, we will review the applications of k -core decomposition in many different scientific fields, to show how this method is used to uncover the underlying information in various kinds of real systems, as well as multiple variations of the standard k -core decomposition that have been extended to many different types of networks, including the weighted networks, directed networks, multi-layer networks, dynamic networks and also. Many of the research papers that will be covered in this section already made a great impact in their own disciplines by introducing k -core decomposition as a tool to unveil the underlying information. To conclude, Section 4 will summarize the most important messages of this review, and outline the future challenges related to this research topic.

Section snippets

Background concepts

Before we start to survey the researches, we will briefly introduce some fundamental concepts that we will frequently use in the numerous theoretical studies of the k -core, to facilitate the reading throughout the theoretical section in this review.

Applications of k -core decomposition

Above we have reviewed the theoretical advances in analyzing the k -core problem. Recent studies in the critical behavior of the k -core pruning process have shown significant developments. Meanwhile, due to the simplicity and the effectiveness of the k -core decomposition, extensive applications of k -core, k -shell, or coreness have been widely seen in a variety of scientific fields, including biology, ecology, computer sciences, information spreading, geology and so on. In the following, we will

Summary

In recent years, the complex networks are found to be particularly useful to model the real-world systems [133], [134], [135], [136]. To understand the characteristics of the systems that we desire to know, many algorithms have been developed by researchers with different backgrounds. One of the most important algorithms is k -core pruning process, also called k -core decomposition. It serves as a commonly used algorithm and has been widely used in many research areas like detecting community

Acknowledgment

We thank Prof. Jianyang Zhao and Prof. Bolun Chen for helpful discussions. This work is supported by the Swiss National Science Foundation