CS224W: Analysis of Networks Lecture Notes PDF

Stanford University

Jure Leskovec

Summary

These lecture notes from Stanford University's CS224W course detail network analysis concepts, focusing on roles and communities within networks. The notes explain the concept of structural roles and describe how RolX is used to discover them. The content also includes examples and applications of these techniques.

Full Transcript

CS224W: Analysis of Networks
Jure Leskovec, Stanford University
http://cs224w.stanford.edu

10/11/18 Jure Leskovec, Stanford CS224W: Analysis of Networks

Roles: RolX (Henderson, et al., KDD 2012). Nodes with different structural roles (connector node, bridge node, etc.)
Communities: Fast Modularity (Clauset, et al., Phys. Rev. E 2004). Nodes belonging to the same cluster/community

Plan for Today:
- Structural role discovery in networks
- Community detection via Modularity optimization

- Roles are "functions" of nodes in a network:
  - Roles of species in ecosystems
  - Roles of individuals in companies
- Roles are measured by structural behaviors:
  - Centers of stars
  - Members of cliques
  - Peripheral nodes, etc.

[Figure: Network science co-authorship network [Newman 2006], showing centers of stars, members of cliques, and peripheral nodes.]

- Role: A collection of nodes which have similar positions in a network:
- Roles are based on the similarity of ties among subsets of nodes
  - Different from community (or cohesive subgroup)
  - Group is formed based on adjacency, proximity or reachability
  - This is typically adopted in current data mining
Nodes with the same role need not be in direct, or even indirect, interaction with each other.

- Roles:
  - A group of nodes with similar structural properties
- Communities:
  - A group of nodes that are well-connected to each other
- Roles and communities are complementary
- Consider the social network of a CS Dept:
  - Roles: Faculty, Staff, Students
  - Communities: AI Lab, Info Lab, Theory Lab

- Structural equivalence: Nodes $u$
and $v$ are structurally equivalent if they have the same relationships to all other nodes [Lorrain & White 1971]
  - Structurally equivalent nodes are likely to be similar in other ways, i.e., friendships in social networks

[Figure: example graph on nodes a, b, c, d, e, u, v.]

- Nodes $u$ and $v$ are structurally equivalent:
  - For all the other nodes $k$, node $u$ has a tie to $k$ iff node $v$ has a tie to $k$
- Example (adjacency matrix of a graph on nodes 1 to 5):

        1  2  3  4  5
    1   -  0  1  1  0
    2   0  -  1  1  0
    3   0  0  -  0  1
    4   0  0  0  -  1
    5   0  0  0  0  -

- E.g., nodes 3 and 4 are structurally equivalent

Task: Example Application
- Role query: Identify individuals with similar behavior to a known target
- Role outliers: Identify individuals with unusual behavior
- Role dynamics: Identify unusual changes in behavior
- Identity resolution: Identify/de-anonymize individuals in a new network
- Role transfer: Use knowledge of one network to make predictions in another
- Network comparison: Compute similarity of networks, determine compatibility for knowledge transfer

- RolX: Automatic discovery of nodes' structural roles in networks [Henderson, et al. 2011b]
  - Unsupervised learning approach
  - No prior knowledge required
  - Assigns a mixed-membership of roles to each node
  - Scales linearly in #(edges)

[Figure: Role discovery, from input network to output roles. Highlights: automated discovery, behavioral roles, roles generalize.]

[Figure: RolX pipeline. Input: Node × Node adjacency matrix → Recursive feature extraction → Node × Feature matrix → Role extraction → Output: Node × Role matrix and Role × Feature matrix. Example features: degree, mean weight, # of edges in ego-network, mean clustering coefficient of neighbors, etc.]

- Recursive feature extraction [Henderson, et al.
2011a] turns network connectivity into structural features

[Figure: Recursive feature extraction (ReFeX) produces a nodes × features matrix. Feature labels: Neighborhood (Local, Egonet) and Regional (Recursive).]

- Neighborhood features: What is a node's connectivity pattern?
- Recursive features: To what kinds of nodes is a node connected?

- Idea: Aggregate features of a node and use them to generate new recursive features
- Base set of a node's neighborhood features:
  - Local features: All measures of the node degree:
    - If network is directed, include in- and out-degree, total degree
    - If network is weighted, include weighted feature versions
  - Egonetwork features: Computed on the node's egonet:
    - Egonet includes the node, its neighbors, and any edges in the induced subgraph on these nodes
    - #(within-egonet edges), #(edges entering/leaving egonet)

[Figure: Egonet for red node.]

- Start with the base set of node features
- Use the set of current node features to generate additional features:
  - Two types of aggregate functions: means and sums
  - E.g., mean value of "unweighted degree" feature among all neighbors of a node
  - Compute means and sums over all current features, including other recursive features
  - Repeat
- The number of possible recursive features grows exponentially with each recursive iteration:
  - Reduce the number of features using a pruning technique:
    - Look for pairs of features that are highly correlated
    - Eliminate one of the features whenever two features are correlated above a user-defined threshold

[Figure: Input → Recursively extract features → Output (nodes × features matrix).]

1) Can compare nodes based on their structural similarity
2) Can cluster nodes to identify different structural roles, e.g., RolX uses a clustering technique called non-negative matrix factorization

- Task: Cluster nodes based on their structural similarity
- Two networks:
  - Network science co-authorship network:
    - Nodes: Network scientists; Edges: The number of co-authored papers
  - Political books co-purchasing network:
    - Nodes: Political books on Amazon; Edges: Frequent co-purchasing of books by the same buyers
- Setup: For each network:
  - Use RolX to assign each node a distribution over the set of discovered structural roles
  - Determine similarity between nodes by comparing their role distributions

[Figure caption fragment: IP traffic classes are well-separated in "role space" with as few as 3 roles.]
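The recursive feature construction described in these notes (base degree and egonet features, then means and sums over neighbors, then correlation-based pruning) could be sketched roughly as below. This is a minimal illustration under assumptions, not the ReFeX or RolX implementation: the function names, the toy graph, the use of Pearson correlation, and the 0.95 threshold are all hypothetical choices, and the non-negative matrix factorization step is omitted.

```python
from statistics import mean

def base_features(adj):
    """Base features per node: degree, #(within-egonet edges), #(edges leaving egonet)."""
    feats = {}
    for v, nbrs in adj.items():
        ego = nbrs | {v}
        # Count each undirected edge with both endpoints inside the egonet once.
        within = sum(1 for a in ego for b in adj[a] if b in ego and a < b)
        # Count edges with exactly one endpoint inside the egonet.
        boundary = sum(1 for a in ego for b in adj[a] if b not in ego)
        feats[v] = [float(len(nbrs)), float(within), float(boundary)]
    return feats

def recurse_once(adj, feats):
    """Append the mean and the sum of every current feature over each node's neighbors."""
    out = {}
    for v, nbrs in adj.items():
        k = len(feats[v])
        means = [mean(feats[u][f] for u in nbrs) if nbrs else 0.0 for f in range(k)]
        sums = [float(sum(feats[u][f] for u in nbrs)) for f in range(k)]
        out[v] = feats[v] + means + sums
    return out

def pearson(xs, ys):
    """Pearson correlation; constant columns are treated as uncorrelated (an assumption)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5

def prune(feats, threshold=0.95):
    """Keep a feature column only if it is not highly correlated with one already kept."""
    nodes = sorted(feats)
    columns = list(zip(*(feats[v] for v in nodes)))
    kept = []
    for col in columns:
        if all(abs(pearson(col, other)) <= threshold for other in kept):
            kept.append(col)
    return {v: [col[i] for col in kept] for i, v in enumerate(nodes)}

# Toy undirected graph (hypothetical): a triangle 0-1-2 with a pendant node 3 attached to 0.
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
feats = base_features(adj)       # 3 features per node
rec = recurse_once(adj, feats)   # 3 + 3 means + 3 sums = 9 features per node
pruned = prune(rec)              # drops near-duplicate columns
```

Each call to `recurse_once` triples the feature count, which is why a pruning step is applied between iterations; the pruned node × feature matrix is what a role-extraction step would then factorize.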
[Figure: RolX roles in the network science co-authorship graph. (a) Role-colored graph: each node is colored by the primary role that RolX finds. (b) Role affinity heat map.]

Making sense of roles:
- Blue circle: Tightly knit, nodes that participate in tightly-coupled groups
- Red diamond: Bridge nodes, that connect groups of nodes
- Gray rectangle: Main-stream, most of nodes, neither a clique nor a chain
- Green triangle: Pathy, nodes that belong to elongated clusters

- We often think of networks "looking" like this: [Figure]
- What led to such a conceptual picture?

- How does information flow through the network?
  - What structurally distinct roles do nodes play?
  - What roles do different links ("short" vs. "long") play?
- How do people find out about new jobs?
  - Mark Granovetter, part of his PhD in 1960s
  - People find the information through personal contacts
- But: Contacts were often acquaintances rather than close friends
  - This is surprising: One would expect your friends to help you out more than casual acquaintances
- Why is it that acquaintances are most helpful?
[Granovetter '73]
- Two perspectives on friendships:
  - Structural: Friendships span different parts of the network
  - Interpersonal: Friendship between two people is either strong or weak
- Structural role: Triadic Closure
  - If two people in a network have a friend in common, then there is an increased likelihood they will become friends themselves.

[Figure: nodes a, b, c. Which edge is more likely, a-b or a-c?]

- Granovetter makes a connection between social and structural role of an edge
- First point: Structure
  - Structurally embedded edges are also socially strong
  - Long-range edges spanning different parts of the network are socially weak
- Second point: Information
  - Long-range edges allow you to gather information from different parts of the network and get a job
  - Structurally embedded edges are heavily redundant in terms of information access

[Figure: edges around nodes a and b labeled strong (S) or weak (W).]

- Triadic closure == High clustering coefficient
Reasons for triadic closure:
- If $B$ and $C$ have a friend $A$ in common, then:
  - $B$ is more likely to meet $C$ (since they both spend time with $A$)
  - $B$ and $C$ trust each other (since they have a friend in common)
  - $A$ has incentive to bring $B$ and $C$ together (since it is hard for $A$ to maintain two disjoint relationships)
- Empirical study by Bearman and Moody:
  - Teenage girls with low clustering coefficient are more likely to contemplate suicide

- For many years Granovetter's theory was not tested
- But, today we have large who-talks-to-whom graphs:
  - Email, Messenger, Cell phones, Facebook
- Onnela et al.
2007:
  - Cell-phone network of 20% of country's population
  - Edge strength: # phone calls

- Edge overlap: $O_{ij} = \frac{|N(i) \cap N(j)|}{|N(i) \cup N(j)|}$
  - $N(i)$ … the set of neighbors of node $i$
- Overlap = 0 when an edge is a local bridge

- Cell-phone network
- Observation:
  - Highly used links have high overlap!
- Legend:
  - True: The data
  - Permuted strengths: Keep the network structure but randomly reassign edge strengths

[Figure: neighborhood overlap vs. edge strength (#calls), for true and permuted strengths.]

- Real edge strengths in mobile call graph
  - Strong ties are more embedded (have higher overlap)
- Same network, same set of edge strengths but now strengths are randomly shuffled

- Removing links by strength (#calls)
  - Low to high
  - High to low

[Figure: size of largest component vs. fraction of removed links. Annotation: "Low disconnects the network sooner". Inset: conceptual picture of network structure.]

- Removing links based on overlap
  - Low to high
  - High to low

[Figure: size of largest component vs. fraction of removed links. Annotation: "Low disconnects the network sooner". Inset: conceptual picture of network structure.]

- Granovetter's theory leads to the following conceptual picture of networks

[Figure: strong ties within groups, weak ties between groups.]

- Granovetter's theory suggests that networks are composed of tightly connected sets of nodes
- Network communities (also: communities, clusters, groups, modules):
  - Sets of nodes with lots of internal connections and few external ones (to the rest of the network).
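The edge overlap measure defined in these notes, the number of shared neighbors of an edge's endpoints divided by the size of the union of their neighborhoods, can be computed along the following lines. This is a small sketch assuming an undirected graph stored as a dictionary of neighbor sets; the helper names and the toy graph (two triangles joined by one edge) are illustrative and do not come from the lecture.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbors} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def edge_overlap(adj, i, j):
    """O_ij = |N(i) ∩ N(j)| / |N(i) ∪ N(j)| for the endpoints of an edge (i, j)."""
    ni, nj = adj[i], adj[j]
    union = ni | nj
    return len(ni & nj) / len(union) if union else 0.0

# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
adj = build_adjacency(edges)

bridge_overlap = edge_overlap(adj, 2, 3)    # endpoints share no neighbors
triangle_overlap = edge_overlap(adj, 0, 1)  # endpoints share neighbor 2
```

In this toy graph the edge (2, 3) is a local bridge, so its overlap is zero, while an edge inside a triangle has positive overlap because its endpoints share a neighbor.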
- How to automatically find such densely connected groups of nodes?
- Ideally such automatically detected clusters would then correspond to real groups
- For example: [Figure: communities, clusters, groups, modules.]

- Zachary's Karate club network:
  - Observe social ties and rivalries in a university karate club
  - During his observation, conflicts led the group to split
  - Split could be explained by a minimum cut in the network

Find micro-markets by partitioning the "query-to-advertiser" graph in web search:
Nodes: advertisers and queries/keywords; Edges: Advertiser advertising on a keyword.

Can we identify node groups? (communities, modules, clusters)
Nodes: Teams; Edges: Games played
Answer: NCAA conferences

Can we identify social communities?
Nodes: Users; Edges: Friendships
Answer: Social communities such as High school, Company, Stanford (Basketball), Stanford (Squash)

Can we identify functional modules?
Nodes: Proteins; Edges: Interactions
Answer: Functional modules

- Communities: sets of tightly connected nodes
- Define: Modularity $Q$
  - A measure of how well a network is partitioned into communities
  - Given a partitioning of the network into groups $s \in S$:
    $Q \propto \sum_{s \in S} [\, (\#\text{ edges within group } s) - (\text{expected } \#\text{ edges within group } s) \,]$
Need a null model!
- Given real $G$ on $n$ nodes and $m$ edges, construct rewired network $G'$
  - Same degree distribution but random connections
  - Consider $G'$ as a multigraph
  - The expected number of edges between nodes $i$ and $j$ of degrees $k_i$ and $k_j$ equals: $k_i \cdot \frac{k_j}{2m} = \frac{k_i k_j}{2m}$
  - The expected number of edges in (multigraph) $G'$:
    $\frac{1}{2} \sum_{i \in N} \sum_{j \in N} \frac{k_i k_j}{2m} = \frac{1}{2} \cdot \frac{1}{2m} \sum_{i \in N} k_i \sum_{j \in N} k_j = \frac{1}{4m} \cdot 2m \cdot 2m = m$
  - Note: $\sum_{u \in N} k_u = 2m$

- Modularity of partitioning $S$ of graph $G$:
  - $Q \propto \sum_{s \in S} [\, (\#\text{ edges within group } s) - (\text{expected } \#\text{ edges within group } s) \,]$
  - $Q(G, S) = \frac{1}{2m} \sum_{s \in S} \sum_{i \in s} \sum_{j \in s} \left( A_{ij} - \frac{k_i k_j}{2m} \right)$
  - $A_{ij} = 1$ if $i \to j$, 0 otherwise
  - Normalizing const.: -1
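As a worked check of the null model and the modularity formula in these notes, the sketch below evaluates $Q(G, S)$ directly from the triple sum and confirms that the expected edge count under the rewired null model equals $m$. It assumes an undirected simple graph given as an edge list; the function names and the toy graph (two triangles joined by one edge) are hypothetical, and the direct double loop over each group is chosen for clarity rather than speed.

```python
def degrees(edges):
    """Degree k_i of every node in an undirected edge list."""
    k = {}
    for u, v in edges:
        k[u] = k.get(u, 0) + 1
        k[v] = k.get(v, 0) + 1
    return k

def expected_total_edges(edges):
    """(1/2) * sum_i sum_j k_i k_j / (2m); the derivation above says this equals m."""
    m = len(edges)
    k = degrees(edges)
    return 0.5 * sum(k[i] * k[j] / (2 * m) for i in k for j in k)

def modularity(edges, communities):
    """Q(G, S) = 1/(2m) * sum over groups s, and i, j in s, of (A_ij - k_i k_j / (2m))."""
    m = len(edges)
    k = degrees(edges)
    # Store both directions so that A_ij is symmetric for an undirected graph.
    adjacent = set()
    for u, v in edges:
        adjacent.add((u, v))
        adjacent.add((v, u))
    q = 0.0
    for group in communities:
        for i in group:
            for j in group:
                a_ij = 1.0 if (i, j) in adjacent else 0.0
                q += a_ij - k.get(i, 0) * k.get(j, 0) / (2 * m)
    return q / (2 * m)

# Two triangles, {0, 1, 2} and {3, 4, 5}, joined by the single edge (2, 3); m = 7.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
q_two_groups = modularity(edges, [{0, 1, 2}, {3, 4, 5}])
q_one_group = modularity(edges, [{0, 1, 2, 3, 4, 5}])
null_edges = expected_total_edges(edges)
```

Splitting the toy graph into its two triangles gives $Q = 5/14$, while placing every node in one group gives $Q = 0$, since the observed and expected edge counts then cancel exactly.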
