go back

Volume 18, No. 4

Most Similar Biclique Search at Scale

Authors:
Deming Chu, Zhizhi Gao, Fan Zhang, Wenjie Zhang, Xuemin Lin, Zhihong Tian

Abstract

The biclique is a fundamental model of bipartite cohesive subgraphs. To analyze a bipartite graph, many existing works seek the maximum biclique, that is, the biclique with the largest number of edges. However, our finding is that the most similar biclique (i.e., the biclique whose vertices are the most similar to each other) can be a good alternative for understanding the network. Using the model, we can detect meaningful communities with high similarity and avoid unnecessary searches based on vertex similarity. In particular, we aim to find (i) local most similar biclique : the biclique that contains a query node 𝑞 and the similarity between vertices is the highest, and (ii) global most similar biclique : the biclique with the highest similarity between vertices. Despite the NP-hardness of the problems, this paper presents two efficient algorithms, Mosib and Mosib-GloApp . Specifically, our Mosib is an exact algorithm for the most similar biclique search. The algorithm incorporates three novel graph reduction rules that can reduce the size of the bipartite graph while preserving the most similar biclique, as well as two similarity-first search rules that can prioritize the bicliques with high similarity in the search. These techniques can significantly improve the practical efficiency of the algorithm. Meanwhile, our Mosib-GloApp is an approximate algorithm that adopts a novel MinHash-based dividing method, and it can further improve the efficiency of the global most similar biclique search. We experimentally evaluate our algorithms on realworld networks, and show that the most similar biclique models can find meaningful results while being computed efficiently.

PVLDB is part of the VLDB Endowment Inc.

Privacy Policy