
The strength of ties (revisited)


Some of you have seen the very interesting article from the Facebook Data Team about the echo chamber of social networks. Slate magazine has a nice review of this recent paper by Bakshy and Adamic. They come to an illustrative conclusion for the case where you have 100 weak ties and 10 strong ties:

“The amount of information spread due to weak and strong ties would be 100*0.15 = 15, and 10*0.50 = 5 respectively, so in total, people would end up sharing more from their weak tie friends.”
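
To make the arithmetic in that quote concrete, here is a minimal sketch. The tie counts (100 and 10) and the sharing probabilities (0.15 and 0.50) are taken from the quote; the function name is my own and only there for illustration.

```python
def expected_shares(n_ties, p_share):
    """Expected number of shared items when each of n_ties ties
    passes an item on with probability p_share."""
    return n_ties * p_share

weak = expected_shares(100, 0.15)   # many weak ties, low probability each
strong = expected_shares(10, 0.50)  # few strong ties, high probability each

# Weak ties win in total even though each single weak tie is less likely to carry an item.
print(weak, strong, weak > strong)
```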

So I thought it would be interesting to revisit this on Twitter. Being quite new to NetworkX, I also thought this might make an interesting research example.

Having computed communities of topological specialists (see last post), I use them as my data set. So I have a community of 100 people who have very often been tagged with the word “publishing”. The edgelists were precomputed in Ruby by analyzing the 3200 tweets of each person (and their retweets) and saved to disk in an edgelist format.

import networkx as nx  # project_name1 (the file name prefix) is defined elsewhere

AT = nx.read_edgelist('%s_AT.edgelist' % project_name1, nodetype=str, data=(('weight',float),),create_using=nx.DiGraph())
RT = nx.read_edgelist('%s_RT.edgelist' % project_name1, nodetype=str, data=(('weight',float),),create_using=nx.DiGraph())

The AT network is the network where edges between Twitter users correspond to @replies that are not retweets. It serves as a proxy for tie strength: if a person addresses another person 5 times, their tie strength is 5.

The RT network is the network where edges between Twitter users correspond to retweets of a user's material. If I retweet somebody 10 times (timing is not important here), this tie has the bandwidth, or simply strength, of 10.
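
To make the two networks concrete, here is a tiny made-up example. The user names and weights are invented, and I am assuming the edgelist files hold one "source target weight" triple per line, which is the layout the read_edgelist calls above are set up to parse. nx.parse_edgelist reads the same layout from a list of strings, so no files are needed for the sketch.

```python
import networkx as nx

# Invented toy data: "source target weight", one directed edge per line.
at_lines = ["alice bob 5", "bob carol 1"]      # @replies (not retweets)
rt_lines = ["alice bob 10", "carol alice 2"]   # retweets

def parse(lines):
    """Parse weighted edge lines into a directed graph."""
    return nx.parse_edgelist(lines, nodetype=str,
                             data=(("weight", float),),
                             create_using=nx.DiGraph())

AT_toy = parse(at_lines)
RT_toy = parse(rt_lines)

at_weight = AT_toy["alice"]["bob"]["weight"]   # alice addressed bob 5 times
rt_weight = RT_toy["alice"]["bob"]["weight"]   # alice retweeted bob 10 times
reverse_exists = AT_toy.has_edge("bob", "alice")  # edges are directed
print(at_weight, rt_weight, reverse_exists)
```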

So the question is: if I look at all of the AT ties that have strength 1, how many retweets were transmitted through those ties? If you do this consistently for every tie strength, you end up with a chart of how many ties have a certain strength and how much was transferred over those ties.
The code that does this in NetworkX is the following:

import math

# How much information do the ties carry according to their strength?
result = []
# Tie strengths 1..20
thresholds = list(range(1, 21))
for threshold in thresholds:
    at_edges = []
    # NetworkX 2.0+; on NetworkX 1.x use AT.adjacency_iter() instead
    for n, nbrs in AT.adjacency():
        for nbr, eattr in nbrs.items():
            data = eattr['weight']
            if data == threshold:  # the tie has exactly this strength
                at_edges.append((n, nbr, data))  # tuple of (from_node, to_node, strength)

    rt_edges = []
    for edge in at_edges:
        try:
            # Look up the same pair of nodes in the RT graph and capture how many retweets were exchanged
            value = RT[edge[0]][edge[1]]['weight']
        except KeyError:
            value = 0
        if value > 0:
            rt_edges.append((edge[0], edge[1], value))  # retweets were exchanged between these actors
    # Sum up the retweets and save [number of AT ties, number of retweets, strength]
    result.append([len(at_edges), math.fsum(x[2] for x in rt_edges), threshold])

You end up with an array holding, for each tie strength, the total number of AT ties and the total number of retweets. You can plot this using matplotlib:

import matplotlib.pyplot as plt

plt.plot(thresholds, [at[0] for at in result],'b-', label='# of AT ties with strength x')
plt.plot(thresholds, [at[1] for at in result], 'g-', label='# of retweets flowing through these ties')
plt.legend()
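
The counting loop above can also be wrapped in a small function, which makes it easier to run on many communities. This is only a sketch: the function name and the toy graphs are my own, and it assumes NetworkX 2.0 or later (for edges(data="weight")) and non-negative retweet weights.

```python
import math
import networkx as nx

def retweets_by_tie_strength(AT, RT, strengths):
    """For each strength s return [number of AT ties with weight s,
    retweets carried by the same directed pairs in RT, s]."""
    rows = []
    for s in strengths:
        pairs = [(u, v) for u, v, w in AT.edges(data="weight") if w == s]
        retweets = math.fsum(RT[u][v]["weight"]
                             for u, v in pairs if RT.has_edge(u, v))
        rows.append([len(pairs), retweets, s])
    return rows

# Invented toy graphs, just to exercise the function.
AT_toy = nx.DiGraph()
AT_toy.add_weighted_edges_from([("a", "b", 1), ("b", "c", 1), ("c", "a", 2)])
RT_toy = nx.DiGraph()
RT_toy.add_weighted_edges_from([("a", "b", 3), ("c", "a", 4), ("b", "a", 7)])

rows = retweets_by_tie_strength(AT_toy, RT_toy, [1, 2, 3])
print(rows)
```

Note that the retweets from b to a are not counted anywhere, because there is no AT tie in that direction.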

So if it is true that

“The amount of information spread due to weak and strong ties would be 100*0.15 = 15, and 10*0.50 = 5 respectively, so in total, people would end up sharing more from their weak tie friends.”

then, as expected, the stronger the ties get, the fewer of them there are.
But we should also see that the majority of retweets are transmitted over ties of strength 1. Yet it turns out that the majority of retweets are transmitted through ties of strength 2 and 3.

I think this is interesting. I will now run this on 100 of those communities and plot the average. What do you think about this graph and the approach? Is it valid?

Bonus Update

I also thought there needs to be a tie strength lower than 1, which for me is a retweet happening without any prior AT interaction.


import numpy as np

# Get the "0"-strength tie: a retweet happened although there is no AT tie (in either direction)
AT_undir = nx.Graph(AT)  # undirected, so we find @replies in both directions
rt_0_edges = []
for n, nbrs in RT.adjacency():  # NetworkX 2.0+; on NetworkX 1.x use RT.adjacency_iter()
    for nbr, eattr in nbrs.items():
        data = eattr['weight']
        if not AT_undir.has_edge(n, nbr):
            rt_0_edges.append((n, nbr, data))

# Insert our results into the array as the first data point
result.insert(0, [0, math.fsum(x[2] for x in rt_0_edges), 0])
thresholds.insert(0, 0)
# Retweets per AT tie, as floats for the division.
# The strength-0 entry has 0 AT ties, so its ratio is a division by zero (inf or nan, with a warning).
percentages = np.array([rt[1] for rt in result], dtype="float32") / np.array([at[0] for at in result])

#Plot it
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(thresholds, [at[0] for at in result],'b-', label='# of AT ties with strength x')
ax1.plot(thresholds, [at[1] for at in result], 'g-', label='# of retweets flowing through these ties')
ax1.legend(loc=2)
ax2 = ax1.twinx()
ax2.plot(thresholds, percentages, 'r-', label='% of #RT/#AT')
ax2.legend()

So if we add this to the graphic, we end up with a different picture:

So what we see right now looks more like the “strength of NO ties”. (Would make a nice paper title :)) People retweet each other even if they have had no interaction before. It will be interesting to explore whether this assumption holds for all of the other 100 datasets.
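
A note on the red ratio line in the plotting code above: for tie strength 0 the number of AT ties is 0 by construction, so the number of retweets per AT tie is not defined at that point. A minimal sketch of one way to handle this is to mark the value as missing instead of dividing by zero. The helper name and the numbers are invented for illustration; the row layout follows the result array used above.

```python
import math

def retweets_per_tie(rows):
    """rows holds [n_at_ties, n_retweets, strength] entries.
    Returns retweets per AT tie, or NaN where there are no AT ties."""
    return [rt / at if at > 0 else math.nan for at, rt, _ in rows]

rows_toy = [[0, 12.0, 0], [4, 2.0, 1], [2, 3.0, 2]]  # invented numbers
ratios = retweets_per_tie(rows_toy)
print(ratios)
```

matplotlib leaves a gap for NaN values, so the strength-0 point would simply not be drawn on the ratio axis.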

P.S. I thought about adding another tie strength, e.g. 0.5. This would correspond to a person FOLLOWING the other person, which is definitely less than writing an @reply to this person but more than nothing. (We all know following each other on Twitter is cheap.)
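
A rough sketch of how such a 0.5 tie could be looked up. Everything here is hypothetical: I have not collected a follower graph, so FOLLOW_toy, the function name, and the choice to check only whether the first user follows the second are assumptions for illustration.

```python
import networkx as nx

def tie_strength(u, v, at_undirected, follow):
    """@reply weight if the pair exchanged @replies (either direction),
    0.5 if u merely follows v, otherwise 0."""
    if at_undirected.has_edge(u, v):
        return at_undirected[u][v]["weight"]
    if follow.has_edge(u, v):
        return 0.5
    return 0

# Invented toy data
AT_toy = nx.Graph()                 # undirected @reply ties
AT_toy.add_edge("a", "b", weight=3)
FOLLOW_toy = nx.DiGraph()           # hypothetical follower graph: u -> v means u follows v
FOLLOW_toy.add_edge("c", "a")
FOLLOW_toy.add_edge("a", "b")

s_reply = tie_strength("a", "b", AT_toy, FOLLOW_toy)   # @reply tie takes precedence
s_follow = tie_strength("c", "a", AT_toy, FOLLOW_toy)  # follow only
s_none = tie_strength("a", "c", AT_toy, FOLLOW_toy)    # a does not follow c
print(s_reply, s_follow, s_none)
```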

Cheers
Thomas

About plotti2k1

Thomas Plotkowiak works at the MCM Institute in the Social Media and Mobile Communication group, which belongs to the University of St. Gallen. His PhD research in social media examines how the structure of social networks like Facebook and Twitter influences the diffusion of information. His main focus is Twitter, since it allows public access (and has a nice API). Make sure to also have a look at his recent publications. Thomas majored in Computer Science and Economics at the University of Mannheim in 2008 and was involved at the computer science institutes for software development and multimedia technology: SWT and PI4. During his studies he focused on Artificial Intelligence, Multimedia Technology, Logistics and Business Informatics. In his diploma/master thesis he developed an ad hoc p2p audio engine for 3D games. Thomas was also a researcher for a year at the University of Waterloo in Canada and at Macquarie University in Sydney. He was part of the CSIRO ICT research group. In his free time Thomas likes to swim in his local lake (Drei Weiher), run, and hike in the Appenzell region. Otherwise you will find him coding ideas he recently had or enjoying a beer with colleagues in the MeetingPoint or Schwarzer Engel.

Discussion

4 thoughts on “The strength of ties (revisited)”

  1. Dear Thomas. Nice post on this idea.
    Actually, we have just published a paper in which we went a little bit further, in the direction you suggest at the end. The key question we wanted to answer was: where (in the network) do RTs happen? We found, as expected, that RTs happen with higher probability between communities, and thus RTs resemble useful information going from one group to another through bridges (weak links).
    You can find the details here http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0029358
    I would be glad to know your opinion.
    Best

    Posted by emoro | April 18, 2012, 11:54 am
    • Great Paper! Your insight that “links with mentions are more abundant as internal links than the baseline follower relations for groups of size up to 150 users” in regard to the Dunbar number is also something I stumbled upon when collecting lists on particular topics. I found that creating topic groups between 100-200 people works the best because after that the attention towards others that contribute to this topic drops off (I will try to write a blog entry on that).

      What I find somewhat puzzling is that retweets appear with higher probability in links between groups with a low overlap. Although retweets propagate information, what motivation would someone have to retweet something from a remote group to his followers (I mean besides his own motivation, which is being a broker and caring for a number of groups)? Will his followers really care about it? I think if we were to count how many retweets people got from other groups, this proportion should be quite small.

      Posted by plotti2k1 | April 18, 2012, 3:01 pm
      • Thanks for your comment. Regarding your question about the motivation to retweet something, I guess the real question is why we share information at all. There are many reasons, but I found the following list quite comprehensive:

        1. Altruism: We share to bring valuable and entertaining content to others. We think about what our friends want to know, and try to help them out.
        2. Self-definition: We share to define ourselves to others. Perhaps this notion is better phrased as, “you are what you share.” People consciously shape their online persona by the types of things they share.
        3. Empathy: We share to strengthen and nourish our relationships. Sharing shows someone else we’re thinking about them and we care.
        4. Connectedness: We share to get credit and feedback for being a good sharer, to feel valuable in the eyes of others.
        5. Evangelism: We share to spread the word about a cause or brand we believe in.

        (this list was taken from “The Psychology of sharing”, The New York Times Insights http://nytmarketing.whsites.net/mediakit/pos/)

        I guess what we measured on Twitter was number 1. People want to bring valuable information to their groups, and that information most of the time comes from another group.

        Posted by emoro | April 18, 2012, 8:08 pm

Trackbacks/Pingbacks

  1. Pingback: On the weakness of weak ties « Twitter Research - August 28, 2012
