Social Networking: Social Network Analysis? What's that?

Brief Introduction

Social network analysis (SNA) is a set of methods for analyzing social networks and social structures, and it has emerged as a key technique in modern sociology. It views social relationships in terms of network theory, as networks made up of nodes and ties.

Nodes are the individual actors within the networks, and ties are the relationships between the actors. Nodes are tied by one or more specific types of interdependency, such as friendship, kinship, common interest, financial exchange, dislike, sexual relationships, or relationships of beliefs, knowledge or prestige.

Case Study

Here is an example that analyzes the social network among a few nodes.

This undirected sociogram describes a small social network of five social actors (Alice, Bob, Carol, David and Eva) joined by six ties: Alice-Bob, Alice-Carol, Alice-David, Bob-David, Carol-David and David-Eva. Here we consider only the one-mode network.

1. General parameters

Degree

Density

Geodesic Distances

The degree of a node n_i, denoted d(n_i), is the number of nodes adjacent to it. In a directed network the degree splits into out-degree (the number of links pointing out of the node) and in-degree (the number of links pointing into the node); in an undirected network such as this one, each node has a single degree.

Density measures how closely knit a network is and indicates the general level of connectedness of the graph. For an undirected graph with g nodes and L ties, it is the share of possible ties that are actually present: L / (g(g-1)/2).

The geodesic distance between two nodes i and j, written d(i, j), is the length of the shortest (geodesic) path between them.

In this example, the degree of each node is as follows:

Node	Degree
Alice	3
Bob	2
Carol	2
David	4
Eva	1

The density of this undirected graph is 0.6 (6 of the 10 possible ties are present).

The geodesic distances between each pair of nodes are shown below:
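The degree and density figures can be checked with a few lines of Python. This is only a minimal sketch: it assumes the six ties implied by the adjacency matrix in section 4, and the names `EDGES`, `degrees` and `density` are illustrative rather than part of the original post.

```python
# Ties of the example network (assumed from the adjacency matrix in section 4).
EDGES = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "David"),
    ("Bob", "David"), ("Carol", "David"), ("David", "Eva"),
]

def degrees(edges):
    """Count how many ties touch each node of an undirected graph."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def density(edges, n_nodes):
    """Ties present divided by the n*(n-1)/2 ties that are possible."""
    return len(edges) / (n_nodes * (n_nodes - 1) / 2)

deg = degrees(EDGES)
for name, d in deg.items():
    print(name, d)
print("density:", density(EDGES, len(deg)))
```

Counting ties per node from an edge list avoids building the full matrix, which is convenient when the network is sparse.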

	Alice	Bob	Carol	David	Eva
Alice	—	1	1	1	2
Bob	1	—	2	1	2
Carol	1	2	—	1	2
David	1	1	1	—	1
Eva	2	2	2	1	—
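The distance table can be reproduced with a breadth-first search from each node. The sketch below assumes the six ties implied by the adjacency matrix in section 4; `build_adjacency` and `geodesic_distances` are hypothetical helper names, not something the original post defines.

```python
from collections import deque

# Ties of the example network (assumed from the adjacency matrix in section 4).
EDGES = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "David"),
    ("Bob", "David"), ("Carol", "David"), ("David", "Eva"),
]

def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def geodesic_distances(adj, source):
    """Breadth-first search: shortest path length from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

ADJ = build_adjacency(EDGES)
for name in sorted(ADJ):
    row = geodesic_distances(ADJ, name)
    print(name, [row[other] for other in sorted(ADJ)])
```

Each printed row lists the distances to Alice, Bob, Carol, David and Eva in that order, with 0 on the diagonal where the table shows a dash.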

In addition, {Alice, Bob, David} and {Alice, Carol, David} are cliques: within each group, every actor is tied to every other.

2. Centrality

To identify which nodes are at the center of the network, we consider three standard centrality measures that capture different notions of “importance” in the network:

Degree Centrality

Closeness Centrality

Betweenness Centrality

Historically first and conceptually simplest is degree centrality, which is defined as the number of links incident upon a node (i.e., the number of ties that a node has). The degree can be interpreted in terms of the immediate risk of a node for catching whatever is flowing through the network (such as a virus, or some information).

In graphs there is a natural distance metric between all pairs of nodes, defined by the length of their shortest paths. The farness of a node s is defined as the sum of its distances to all other nodes, and its closeness is defined as the inverse of the farness. Thus, the lower a node's total distance to all other nodes, the more central it is. Closeness can be regarded as a measure of how long it will take to spread information from s to all other nodes sequentially.

Betweenness is a centrality measure of a vertex within a graph (there is also edge betweenness, which is not discussed here). Linton Freeman introduced it as a measure for quantifying the control a person has over the communication between other people in a social network. In his conception, vertices that have a high probability of occurring on a randomly chosen shortest path between two randomly chosen nodes have a high betweenness.

In this example, the centrality measures of each node are as follows:

Node	Degree Centrality	Closeness Centrality	Betweenness Centrality
Alice	0.75	0.8	0.08
Bob	0.5	0.67	0
Carol	0.5	0.67	0
David	1	1	0.58
Eva	0.25	0.57	0

(the results above have been normalized)

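Related formulas (g is the number of actors in the network):

(a) Degree Centrality: C’_D(n_i) = d(n_i) / (g-1)

(b) Closeness Centrality: C’_C(n_i) = (g-1) / Σ_j d(n_i, n_j), where the sum runs over all actors j ≠ i

(c) Betweenness Centrality: C’_B(n_i) = [ Σ_{j<k} g_jk(n_i) / g_jk ] / [ (g-1)(g-2)/2 ]

where g_jk is the number of geodesics connecting actors j and k, and g_jk(n_i) is the number of those geodesics that pass through actor i (with i different from j and k).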
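The three normalized measures can be sketched in Python as follows. This is an illustration under stated assumptions, not the method the post itself used: it takes the six ties implied by the adjacency matrix in section 4, assumes the network is connected, divides degree by g-1 as in formula (a), and counts geodesics with a breadth-first search. All function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Ties of the example network (assumed from the adjacency matrix in section 4).
EDGES = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "David"),
    ("Bob", "David"), ("Carol", "David"), ("David", "Eva"),
]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs(adj, source):
    """Return (dist, sigma): geodesic distances and number of geodesics from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:              # first time we reach nb
                dist[nb] = dist[node] + 1
                sigma[nb] = 0
                queue.append(nb)
            if dist[nb] == dist[node] + 1:  # node lies on a geodesic to nb
                sigma[nb] += sigma[node]
    return dist, sigma

def centralities(adj):
    """Normalized (degree, closeness, betweenness) per node; assumes a connected graph."""
    g = len(adj)
    info = {n: bfs(adj, n) for n in adj}
    result = {}
    for i in adj:
        dist_i, sigma_i = info[i]
        degree = len(adj[i]) / (g - 1)
        closeness = (g - 1) / sum(dist_i.values())
        between = 0.0
        for j, k in combinations([n for n in adj if n != i], 2):
            dist_j, sigma_j = info[j]
            if dist_j[i] + dist_i[k] == dist_j[k]:  # i sits on a j-k geodesic
                between += sigma_j[i] * sigma_i[k] / sigma_j[k]
        result[i] = (degree, closeness, between / ((g - 1) * (g - 2) / 2))
    return result

for name, (deg_c, clo_c, bet_c) in sorted(centralities(build_adjacency(EDGES)).items()):
    print(f"{name:<6} {deg_c:.2f} {clo_c:.2f} {bet_c:.2f}")
```

Running one search per node is enough for a five-actor network; for large networks a dedicated graph library would be the more practical choice.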

3. Influence Range

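Another measure, the influence range, is the set of actors who are reachable from a given node. It leads to a refined closeness centrality, which can be computed as

C’_C(n_i) = [ J_i / (g-1) ] / [ Σ_j d(n_i, n_j) / J_i ]

where J_i is the number of actors in the influence range of actor i (excluding i itself), g is the number of actors in the network, and the sum runs over the actors j in that influence range.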

The computed results are:

Node	Closeness Centrality (refined)
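Alice	0.8
Bob	0.67
Carol	0.67
David	1
Eva	0.57

This index is the ratio of the fraction of actors in the group who are reachable to the average distance of those actors from n_i. Because the example network is connected, every actor reaches all four others (J_i = 4), so here the index coincides with the normalized closeness centrality.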
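The ratio described above can be sketched in Python. The code is an assumption-laden illustration: it uses the six ties implied by the adjacency matrix in section 4, finds the influence range with a breadth-first search, and returns 0 for an actor who reaches nobody (a convention chosen here, not taken from the post). The names `reachable_distances` and `refined_closeness` are hypothetical.

```python
from collections import deque

# Ties of the example network (assumed from the adjacency matrix in section 4).
EDGES = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "David"),
    ("Bob", "David"), ("Carol", "David"), ("David", "Eva"),
]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def reachable_distances(adj, source):
    """Distances from source to every actor in its influence range (source excluded)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    del dist[source]
    return dist

def refined_closeness(adj, node):
    """Fraction of actors reachable, divided by their average distance from node."""
    g = len(adj)
    dist = reachable_distances(adj, node)
    j_i = len(dist)                 # size of the influence range
    if j_i == 0:
        return 0.0                  # assumed convention for an actor who reaches nobody
    return (j_i / (g - 1)) / (sum(dist.values()) / j_i)

ADJ = build_adjacency(EDGES)
for name in sorted(ADJ):
    print(name, round(refined_closeness(ADJ, name), 2))
```

The index only departs from ordinary closeness when some actors cannot be reached, which is the case it is designed for.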

4. Matrices for SNA

Matrices are an important tool in SNA. The primary one is the adjacency matrix, or sociomatrix, in which the entry for a pair of actors is 1 if they are tied and 0 otherwise.

For this example:

	Alice	Bob	Carol	David	Eva
Alice	—	1	1	1	0
Bob	1	—	0	1	0
Carol	1	0	—	1	0
David	1	1	1	—	1
Eva	0	0	0	1	—

With the actors labeled n1 (Alice) through n5 (Eva), the same matrix reads:

	n1	n2	n3	n4	n5
n1	—	1	1	1	0
n2	1	—	0	1	0
n3	1	0	—	1	0
n4	1	1	1	—	1
n5	0	0	0	1	—
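As a rough illustration, the sociomatrix can be built from a list of ties, and its row sums give each actor's degree. The sketch assumes the six ties shown in the matrix above and stores 0 on the diagonal where the post shows a dash; `NAMES`, `EDGES` and `adjacency_matrix` are illustrative names.

```python
NAMES = ["Alice", "Bob", "Carol", "David", "Eva"]

# Ties read off the matrix above.
EDGES = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Alice", "David"),
    ("Bob", "David"), ("Carol", "David"), ("David", "Eva"),
]

def adjacency_matrix(names, edges):
    """Build a symmetric 0/1 matrix; the diagonal is left at 0."""
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected: mirror the tie
    return matrix

A = adjacency_matrix(NAMES, EDGES)
for name, row in zip(NAMES, A):
    print(f"{name:<6}", row, "degree =", sum(row))
```

Because the network is undirected, the matrix is symmetric, so row sums and column sums agree.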

Case conclusion:

According to the computed results, David is at the “center” of the network: he scores highest on all three centrality measures, which makes him the key player and the most influential node.

What we can learn from this example:

Social Network Analysis is not just about graphs and data. Once a graph is drawn, you can measure it. Social network metrics reveal much about the nodes, and the clusters they form. Who knows what is going on? Who wields power or influence? Who is a key connector? Who is in the "thick of things" in this conspiracy? In this example, our calculations reveal that David is the most important node in the network.

The common wisdom is that only big business and government use social network analysis. Yet, there are many individuals and groups that are learning the craft, and solving local problems. Although social network analysis cannot be learned by reading a book, it does not require a PhD either. Any intelligent person, under the right guidance, and with the proper tools, can apply the methodology to an appropriate problem and gain enormous insight into what was previously hidden.


6 comments:

Xiang LIU, March 15, 2012, 8:07 PM
You and I have different views about the picture, and according to your advice the distances in the picture do make sense. Before you do all the calculations, I would like to see a detailed explanation of the results for someone who never took the class, not only for us to read.
Fengyiming, March 15, 2012, 10:04 PM
What an excellent job you did! In your blog, we can see formulas, graphs and principles. Besides, I got the same results as you.
However, I am afraid that in reality we have huge networks and very complicated relationships between users, forming a system that is always dynamic. So, can you give me some ideas about how to calculate efficiently?
Michael, April 18, 2012, 3:12 AM
I agree with you! There are indeed many individuals and groups that are using this kind of analysis to make a profit.
Chenyu, April 24, 2012, 3:35 AM
Thank you for your detailed explanation of the related definitions and the explicit methods of calculation. I'm pleased to see the result for betweenness centrality, because I didn't quite get the detailed calculation steps for it. After reading your description of betweenness, I both understand it and am able to work with it. Vertices that have a high probability of occurring on a randomly chosen shortest path between two randomly chosen nodes have a high betweenness, and this explains David's status in the network well.

Monday, March 12, 2012
