Social Network Analysis (SNA) - Introduction PDF

Summary

This document provides an introduction to Social Network Analysis (SNA), covering its concepts, terminologies, historical context, and real-world applications. It delves into the analysis of social networks, graph theory methods, and practical examples to understand network structures, relationships, and the behavior of individuals at the micro and macro levels. You'll learn about the various SNA terminologies, methods like VADER sentiment analysis, and TF-IDF.

Full Transcript

Social Network Analysis: Module 1

Module 1
✓ Understanding Social Networks
✓ Attributes of networks
✓ Nodes and Edges
✓ Relationships in networks
✓ Betweenness and centrality
✓ Visualizing social network data

Choice overload
Whom we want to talk to in the network
On what channel we need to talk
Consequences: wrong choices, right choices

The Seven Bridges of Königsberg
The Seven Bridges of Königsberg is a historically notable problem in mathematics. Its negative resolution by Leonhard Euler in 1736 laid the foundations of graph theory and prefigured the idea of topology. The city of Königsberg in Prussia (now Kaliningrad, Russia) was set on both sides of the Pregel River and included two large islands, Kneiphof and Lomse, which were connected to each other and to the two mainland portions of the city by seven bridges.

The problem was to devise a walk through the city that would cross each of those bridges once and only once. Put as a question: is it possible for anyone to cross each of the seven bridges only a single time and come back to the starting point, without swimming across the river, if we begin from any of the four land areas A, B, C, and D?

Historical
Social network analysis (SNA) applications have had three main, parallel influences beginning in the 1930s. The first was sociometric analysis, which used graph theory methods. The second was a mathematical approach taken up first by Kurt Lewin and later by Harvard researchers, which laid the foundation for the analysis of social networks. The Harvard analysis introduced the notion of cliques, which operationalised social structures; network analysis was no longer merely descriptive. The third influence came from the Manchester anthropologists, who looked at the structure of community relations in villages. All three traditions were brought together, again at Harvard, in the 1960s and 1970s, when contemporary SNA was developed (Kilduff and Tsai, 2003).

Jacob L. Moreno, a social scientist working in the 1930s, created simple diagrams to visualize relationships within small groups of people. This diagram shows friendships among boys (triangles) and girls (circles) in a group of fourth graders. The diagram reveals how strongly the two groups are separated by gender: only one boy crossed the divide, and not a single girl did. (See Linton C. Freeman, 'Visualizing Social Networks,' 2000.)

Milgram's small-world experiment
1. Though the experiment went through several variations, Milgram typically chose individuals in the U.S. cities of Omaha, Nebraska, and Wichita, Kansas, to be the starting points and Boston, Massachusetts, to be the end point of a chain of correspondence. These cities were selected because they were thought to represent a great distance in the United States, both socially and geographically.
2. Information packets were initially sent to "randomly" selected individuals in Omaha or Wichita. They included letters, which detailed the study's purpose, and basic information about a target contact person in Boston. Each packet additionally contained a roster on which recipients could write their own name, as well as business reply cards that were pre-addressed to Harvard.
3. Upon receiving the invitation to participate, the recipient was asked whether he or she personally knew the contact person described in the letter. If so, the recipient was to forward the letter directly to that person. For the purposes of this study, knowing someone "personally" was defined as knowing them on a first-name basis.
4. In the more likely case that the recipient did not personally know the target, the recipient was to think of a friend or relative who was more likely to know the target. Recipients were then directed to sign their name on the roster and forward the packet to that person. A postcard was also mailed to the researchers at Harvard so that they could track the chain's progression toward the target.
5. When and if the package eventually reached the contact person in Boston, the researchers could examine the roster to count the number of times it had been forwarded from person to person. Additionally, for packages that never reached the destination, the incoming postcards helped identify the break point in the chain.

Real World Cases-SNA
1. Supply Chain Management: A supply chain can be modeled as a network of supplier/consumer relations. Nodes include retailers, suppliers, warehouses, transporters, and regulatory agencies. Network analysis of the supply chain helps improve operational efficiency by identifying and eliminating less important nodes (suppliers/warehouses). It can also identify crucial nodes in the network so that a standby can be created for crises or emergencies. SNA applications can help manufacturers identify the more operationally critical nodes and find potential sources to increase the number of connections to suppliers. This can also help identify bottlenecks in the supply process and in inventory management.
2. Human Resources: HRM often strives to identify critical resources and understand their contribution to organizational flow, collaboration, participation, and information flow. By following Organizational Network Analysis (ONA), an organization can optimize talent connections, productivity, and utilization. ONA also helps identify the reach of an individual, identify accelerators of growth and poorly connected resources, and decide whom to give more opportunity.
3. Transmission of Infectious Diseases: SNA could help identify and isolate individuals and groups with high betweenness and out-degree centrality (transmitters of disease) and support sound contact tracing activities to mitigate the impact.

Apart from contact tracing, SNA can also identify dominant themes and relations between keywords, and identify the sentiment.
Figure: the connection between the top 10 words for COVID-19 themes.

Why Networks?
Networks are an invisible thread connecting all the dots, despite the digital growth happening every day. In other words, we are part of a network at all stages of our lives, be it a social network of friends or family, or an organizational network such as an educational institution or workplace. The networks we are part of also include social media networks, where we connect with people across the world, and even consumer networks, as users of various brands. Thus, networks are all around us.

The concept of networks and extracting information from them has untapped potential, be it in a social setting, consumer behavior, health management, education, or politics. Though intellectuals have started seeing the benefits of identifying social groups for various applications, this concept has not become mainstream in the business world.

What is SNA?
Social Network Analysis (SNA), also known as network science, is the general study of social networks using network and graph theory concepts. It explores the behavior of individuals at the micro level, their relationships (social structure) at the macro level, and the connection between the two.

SNA uses several methods and tools to study the relationships, interactions, and communications in a network. The basic entities required for building a network are nodes and the edges connecting the nodes. Let us try to understand this with the help of one of the most common applications of SNA, the Internet. Web pages often link to other pages, whether on the same site or on others. In SNA language, these pages are nodes, and the links between the pages are the edges. In this way, we can interpret the entire internet as one large graph.

Most Used SNA Terminologies
As established earlier, nodes and edges are the building blocks for SNA. A few characteristics of the edges that define the features of a network are described below. The edges connect the nodes.
The direction of connections determines the edge type.

SNA Terminologies
1.a Directed Edge: The nodes connected by this edge are ordered; that is, the connection between the nodes is one-way. For example, Twitter and Instagram are predominantly directed-edge networks: you can follow someone without them following you back.
1.b Undirected Edge: The relationship between the nodes connected by this edge is mutual, i.e., the connection applies both ways. E.g., befriending a person on Facebook or LinkedIn automatically creates a two-way connection.
2. Weight: In a weighted network, an edge between two nodes carries a label (weight). Different applications can have their own definition of weight. In social media analysis, a weight can be the number of mutual connections between the nodes connected by that edge. In Figure 2, John and Frank have two mutual friends, Rose and Amy. Thus, the edge connecting John and Frank carries a weight of 2.
3. Density: The ratio of the number of existing connections in a network to the number of all possible connections in the network:
Density = (number of existing connections) / (number of possible connections)
For an undirected network with n nodes, the number of possible connections is n(n-1)/2.

Centrality Measures
a) Degree Centrality: Measures the number of direct ties to a node; this indicates the most connected node in the group. Let's consider the network in Figure 4. The degree centrality score of a node is the number of edges connected to that node. For Node 1, the degree centrality is 1, and for Nodes 3 and 5, the score is 3. The standardized score is calculated by dividing the score by (n-1), where n is the number of nodes in the network. We can see that nodes 3 and 5 have a high standardized degree centrality of 0.5, i.e., they are the most well-connected nodes in the network.
b) Closeness Centrality: Closeness measures how close a node is to the rest of the network. It is the ability of the node to reach the other nodes in the network.
Closeness is calculated as the inverse of the sum of the distances between a node and all other nodes in the network. Let us take node 1 from Figure 4; the sum of distances from node 1 to all other nodes is 16. Hence the closeness score for node 1 is 1/16. The standardized score is calculated by multiplying the score by (n-1). We can conclude that node 4 is the closest/most central node in the network, with the highest standardized closeness score of 0.6.
d) Eigenvector Centrality: A relative measure of the importance of a node in the network. Each node is assigned a value or score depending upon the number of other prominent/high-scoring nodes it is connected to. Why do we need such a relative measure? Consider the network in Figure 5. Here 'd' represents the degree centrality score. Nodes A and B are connected to 4 nodes each, and hence both have a degree centrality score of 4. But when we look at their neighbors, we can see that node B is connected to nodes with a higher degree. Hence, node B can be preferred over node A when we have to choose based on connectivity.

Sentiment Analysis
Consider the following phrases:
"Titanic is a great movie."
"Titanic is not a great movie."
"Titanic is a movie."
The first phrase denotes positive sentiment about the film Titanic, while the second one treats the movie as not so great (negative sentiment). Take a look at the third one more closely: it expresses no opinion either way (neutral sentiment).

Sentiment analysis (SA) is the process of 'computationally' determining whether a piece of writing is positive, negative, or neutral. It is also known as opinion mining: deriving the opinion or attitude of a speaker.

Why sentiment analysis?
Business: In marketing, companies use it to develop their strategies, to understand customers' feelings towards products or brands, to see how people respond to their campaigns or product launches, and to learn why consumers don't buy some products.
Politics: In the political field, it is used to keep track of political views and to detect consistency and inconsistency between statements and actions at the government level. It can be used to predict election results as well.
Public Actions: Sentiment analysis is also used to monitor and analyse social phenomena, to spot potentially dangerous situations, and to determine the general mood of the blogosphere.

Process of Sentiment Analysis

VADER Sentiment Analysis
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon and rule-based sentiment analysis tool that is specifically attuned to sentiments expressed in social media. VADER relies on a sentiment lexicon, a list of lexical features (e.g., words) that are generally labeled according to their semantic orientation as either positive or negative. VADER not only reports whether a sentiment is positive or negative but also tells us how positive or negative it is.

Examples of Sentiment Scores
The VADER library returns 4 values:
pos: the proportion of the text rated as positive
neu: the proportion of the text rated as neutral
neg: the proportion of the text rated as negative
compound: the sum of all lexicon ratings, normalized to lie between -1 and 1
Notice that the pos, neu and neg values add up to 1. The compound score is a very useful metric when we want a single measure of sentiment. Typical threshold values are the following:
positive: compound score >= 0.05
neutral: compound score between -0.05 and 0.05
negative: compound score <= -0.05
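The threshold rule for the compound score can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name `classify_compound` and the sample scores are hypothetical, and the negative cutoff of -0.05 is assumed to mirror the positive one. In practice the compound value would come from a VADER implementation rather than being typed in by hand.

```python
def classify_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label.

    Assumption: the negative cutoff mirrors the positive one (-threshold).
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores, chosen only to exercise each branch.
# They are NOT real VADER output for any particular sentence.
sample_scores = {
    "clearly favourable text": 0.62,
    "clearly unfavourable text": -0.45,
    "purely factual text": 0.0,
}

for description, score in sample_scores.items():
    print(f"{description}: {score:+.2f} -> {classify_compound(score)}")
```

Keeping the cutoff as a parameter makes it easy to widen or narrow the neutral band for a given dataset.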

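The network measures introduced earlier (density, degree centrality, and closeness centrality) can also be sketched in plain Python. The five-node graph below is a made-up example, not the network from Figure 4, and the helper names are illustrative. The sketch assumes an undirected, connected network with at least two nodes.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def density(graph):
    """Existing connections divided by all possible connections."""
    n = len(graph)
    existing = sum(len(nbrs) for nbrs in graph.values()) // 2
    possible = n * (n - 1) // 2
    return existing / possible


def degree_centrality(graph, node):
    """Number of direct ties, standardized by dividing by (n - 1)."""
    return len(graph[node]) / (len(graph) - 1)


def closeness_centrality(graph, node):
    """Inverse of the sum of shortest-path distances, multiplied by (n - 1)."""
    distances = {node: 0}
    queue = deque([node])
    while queue:  # breadth-first search gives hop counts
        current = queue.popleft()
        for nbr in graph[current]:
            if nbr not in distances:
                distances[nbr] = distances[current] + 1
                queue.append(nbr)
    return (len(graph) - 1) / sum(distances.values())


# Hypothetical example network: a triangle A-B-C with a tail C-D-E.
graph = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")])

print("density:", density(graph))
for name in sorted(graph):
    print(name, degree_centrality(graph, name), closeness_centrality(graph, name))
```

In this example node C has both the most direct ties and the shortest total distance to the other nodes, so it ranks highest on both measures.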