Complex Network Hierarchical Sampling Method Combining Node Neighborhood Clustering Coefficient with Random Walk

Fiumara, Giacomo; De Meo, P
2022-01-01

Abstract

To address the over-sampling of high-degree and low-degree nodes in current sampling algorithms, a node Neighborhood Clustering coefficient Hierarchical Random Walk (NCHRW) sampling method is proposed. First, the ideas of hierarchy and degree distribution are adopted, and the k-means clustering algorithm is used to determine the number of layers. Second, the boundary values between the hierarchical layers are determined using the accuracy of the degree distribution. Third, within each layer, sampling takes into account not only the degree of the current node and the number of common neighbors between the current node and its neighbors, but also the clustering coefficients of those neighbors. Finally, on eight real networks and one synthetic network, NCHRW is compared with existing algorithms in six respects: degree distribution, density, average degree, average clustering coefficient, transitivity, and visualization of the sampled network. The results show that NCHRW is significantly better than nine traditional sampling algorithms in terms of degree distribution, density, and average degree, and that it preserves the topological properties of the network very well.
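
The abstract does not give the transition rule, so the following is only a rough sketch of a single-layer, neighborhood-aware random walk together with four of the evaluation metrics it names (density, average degree, average clustering coefficient, transitivity). The function names, the dictionary-of-sets graph representation, and above all the score in `step_weight` are assumptions made for illustration, not the authors' formula; the k-means layering and boundary-selection steps are omitted. The graph is assumed to be undirected, connected, and free of isolated nodes.

```python
import random


def clustering(adj, v):
    """Local clustering coefficient: fraction of neighbor pairs of v that are linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


def step_weight(adj, u, v):
    """HYPOTHETICAL score for stepping u -> v (an assumption, not the paper's rule).

    It combines the three quantities named in the abstract: the degree of the
    current node u, the common neighbors of u and v, and the clustering
    coefficient of the neighbor v.
    """
    common = len(adj[u] & adj[v])
    return (1 + common) / len(adj[u]) + clustering(adj, v)


def sample_walk(adj, start, n_nodes, seed=0, max_steps=100000):
    """Walk until n_nodes distinct nodes are visited or max_steps is reached."""
    rng = random.Random(seed)
    visited, cur = {start}, start
    for _ in range(max_steps):
        if len(visited) >= n_nodes:
            break
        nbrs = sorted(adj[cur])  # sorted for reproducibility with a fixed seed
        weights = [step_weight(adj, cur, v) for v in nbrs]
        cur = rng.choices(nbrs, weights=weights, k=1)[0]
        visited.add(cur)
    return visited


def induced(adj, nodes):
    """Subgraph induced by the sampled node set."""
    return {v: adj[v] & nodes for v in nodes}


def metrics(adj):
    """Density, average degree, average clustering coefficient, transitivity."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    # Each triangle is counted once per corner, so `closed` equals 3 * triangles.
    closed = sum(
        sum(1 for a in adj[v] for b in adj[v] if a < b and b in adj[a])
        for v in adj
    )
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    return {
        "density": 2.0 * m / (n * (n - 1)) if n > 1 else 0.0,
        "average_degree": 2.0 * m / n if n else 0.0,
        "average_clustering": sum(clustering(adj, v) for v in adj) / n if n else 0.0,
        "transitivity": closed / triples if triples else 0.0,
    }


if __name__ == "__main__":
    # Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
    graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    sample = sample_walk(graph, start=0, n_nodes=3)
    print("original:", metrics(graph))
    print("sampled: ", metrics(induced(graph, sample)))
```

Comparing the two metric dictionaries gives a crude sense of how well a sample preserves the original topology, which is the kind of comparison the abstract reports; a faithful reproduction would need the scoring and layering details from the full paper.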

Use this identifier to cite or link to this document: https://hdl.handle.net/11570/3240190