3.3 Complexity control in overlapping stochastic block models

2 janvier 2017
Durée : 00:54:07
Nombre de vues 10
Nombre d’ajouts dans une liste de lecture 0
Nombre de favoris 0

Networks are highly used to represent complex systems as sets of interactions between units of interest. For instance, regulatory networks can describe the regulation of genes with transcriptional factors while metabolic networks focus on representing pathways of biochemical reactions. In social sciences, networks are commonly used to represent re- lational ties between actors. Numerous graph clustering algorithms have been proposed since the earlier work of Moreno [2]. Most of them partition the vertices into disjoint clus- ters depending on their con- nection profiles. However, recent studies showed that these techniques were too restrictive since most existing networks contained overlapping clus- ters. To tackle this issue, we proposed the Overlapping Stochastic Block Model (OSBM) in [1]. This approach allows the vertices of a network to belong to multiple classes and can be seen as a generalization of the stochastic block model [3]. In [1], we developed a variational method to cluster the vertices of networks and showed that the algorithm had good clustering performances on both simulated and real data. However, no criterion was proposed to estimate the number of classes from the data, which is a major issue in practice. Here, we tackle this limit using a Bayesian framework. Thus, we introduce some priors over the model parameters and consider variational Bayes methods to approximate the full posterior distribution. We show how a model selection criterion can be obtained in order to estimate the number of (overlapping) clusters in a network. On both simulated and real data, we compare our work with other approaches.

 Informations