Some of you have seen the very interesting article from the Facebook Data team about the echo chamber of social networks. Slate magazine has a nice review of this recent Bakshy and Adamic paper. They come to an illustrative conclusion for a person with 100 weak ties and 10 strong ties:

**“The amount of information spread due to weak and strong ties would be 100*0.15 = 15, and 10*0.50 = 5 respectively, so in total, people would end up sharing more from their weak tie friends.”**
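The arithmetic in that quote can be spelled out in a few lines. This is only a sketch of the illustrative numbers quoted above (100 weak ties with a sharing probability of 0.15, 10 strong ties with 0.50); the function name `expected_shares` is my own and does not come from the paper.

```python
def expected_shares(num_ties, share_probability):
    """Expected number of items shared via a group of ties (illustrative only)."""
    return num_ties * share_probability

weak = expected_shares(100, 0.15)   # many ties, low sharing probability each
strong = expected_shares(10, 0.50)  # few ties, high sharing probability each

# Weak ties win in total simply because there are so many more of them
print("weak:", weak, "strong:", strong, "weak > strong:", weak > strong)
```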

So I thought that would be an interesting thing to revisit on Twitter. Being quite new to NetworkX, I thought this might also be an interesting research example.

Having computed communities of topological specialists (see last post), I use them as my data set. So I have a community of 100 people who have very often been tagged with the word “publishing”. The edgelists were precomputed in Ruby by analyzing the 3200 tweets of each person (and their retweets) and saved to disk in an edgelist format.

```python
import networkx as nx

# project_name1 holds the file prefix of this community's edgelists
AT = nx.read_edgelist('%s_AT.edgelist' % project_name1, nodetype=str,
                      data=(('weight', float),), create_using=nx.DiGraph())
RT = nx.read_edgelist('%s_RT.edgelist' % project_name1, nodetype=str,
                      data=(('weight', float),), create_using=nx.DiGraph())
```

The AT network is the network where edges between Twitter users correspond to @replies that are not retweets. It serves as a proxy for tie strength. So if a person addresses another person 5 times, their tie strength is 5.

The RT network is the network where edges between Twitter users correspond to retweets of a user's material. If I retweet somebody 10 times (time is not important here), this tie has a bandwidth, or simply a strength, of 10.
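To make the two definitions concrete, here is a minimal Python sketch of how such weights could be counted. The actual preprocessing was done in Ruby and is not shown in this post, so everything here is an assumption: the tweet records are invented and heavily simplified, and I assume edges point from the author to the person being mentioned or retweeted.

```python
from collections import Counter

# Hypothetical, simplified tweet records; real Twitter data has many more fields
tweets = [
    {"author": "alice", "mentions": ["bob"], "retweet_of": None},
    {"author": "alice", "mentions": ["bob"], "retweet_of": None},
    {"author": "alice", "mentions": [], "retweet_of": "carol"},
    {"author": "bob", "mentions": ["alice"], "retweet_of": "alice"},
]

at_weights = Counter()  # (from_user, to_user) -> number of non-retweet @mentions
rt_weights = Counter()  # (from_user, to_user) -> number of retweets

for tweet in tweets:
    if tweet["retweet_of"] is not None:
        # Retweets only count towards the RT network
        rt_weights[(tweet["author"], tweet["retweet_of"])] += 1
    else:
        for mentioned in tweet["mentions"]:
            at_weights[(tweet["author"], mentioned)] += 1

# One "source target weight" line per edge, the layout that nx.read_edgelist
# parses when called with data=(('weight', float),)
at_lines = ["%s %s %d" % (src, dst, w) for (src, dst), w in sorted(at_weights.items())]
print(at_lines)
```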

So the question is: if I look at all of the AT ties that have the strength 1, how many retweets were transmitted through those ties? If you do this for every tie strength in turn, you end up with a chart of how many ties of a certain strength are out there, and how much was transferred over those ties.
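Before going to the real graphs, the counting can be illustrated on a toy example with plain dictionaries. The data below is invented and the helper name `retweets_by_tie_strength` is hypothetical; it is just a sketch of the idea, not the code I ran.

```python
# Toy data, invented for illustration: (from_user, to_user) -> weight
at_ties = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 2, ("c", "a"): 3}
rt_ties = {("a", "b"): 4, ("b", "c"): 5, ("d", "a"): 2}

def retweets_by_tie_strength(at_ties, rt_ties):
    """Map tie strength -> [number of AT ties, retweets sent over those ties]."""
    summary = {}
    for pair, strength in at_ties.items():
        bucket = summary.setdefault(strength, [0, 0])
        bucket[0] += 1                     # one more AT tie of this strength
        bucket[1] += rt_ties.get(pair, 0)  # retweets over the same directed pair
    return summary

summary = retweets_by_tie_strength(at_ties, rt_ties)
for strength in sorted(summary):
    print(strength, summary[strength])
```

Note that the retweets from "d" to "a" are not counted anywhere, because there is no AT tie between those two users.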

The code that does this in NetworkX is the following:

```python
import math

# How much information do the ties carry according to their strength?
result = []
thresholds = list(range(1, 21))  # tie strengths 1..20

for threshold in thresholds:
    # Collect all AT ties that have exactly this strength
    at_edges = []
    for n, nbrs in AT.adjacency():  # this was adjacency_iter() in NetworkX 1.x
        for nbr, eattr in nbrs.items():
            data = eattr['weight']
            if data == threshold:
                at_edges.append((n, nbr, data))  # (from_node, to_node, strength)

    # Look up the same pair of nodes in the RT graph
    rt_edges = []
    for edge in at_edges:
        try:
            value = RT[edge[0]][edge[1]]['weight']  # retweets exchanged over this tie
        except KeyError:
            value = 0
        if value > 0:
            rt_edges.append((edge[0], edge[1], value))

    # Number of AT ties, sum of the retweets over them, and the tie strength
    result.append([len(at_edges), math.fsum(x[2] for x in rt_edges), threshold])
```

You end up with an array for each tie strength, holding the total number of AT ties and the total number of retweets. You can plot this using matplotlib:

```python
import matplotlib.pyplot as plt

plt.plot(thresholds, [at[0] for at in result], 'b-', label='# of AT ties with strength x')
plt.plot(thresholds, [at[1] for at in result], 'g-', label='# of retweets flowing through these ties')
plt.legend()
plt.show()
```

So if it is true that

**“The amount of information spread due to weak and strong ties would be 100*0.15 = 15, and 10*0.50 = 5 respectively, so in total, people would end up sharing more from their weak tie friends.”**

It is indeed the case that the stronger the ties get, the fewer of them we have.

But then we should also see that the majority of retweets are transmitted over ties with a strength of 1. Yet the majority of retweets get transmitted through ties with a strength of 2 and 3.

I think this is interesting. I will now run this on 100 of those communities and plot the average. What do you think about this graph and the approach? Is it valid?
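Averaging over many communities could look roughly like this. It is a sketch under assumptions: each community yields a `result` list in the same `[number of AT ties, number of retweets, threshold]` layout as above, all lists cover the same thresholds in the same order, and Python 3 division is used. The function name `average_results` and the numbers are made up.

```python
def average_results(results_per_community):
    """Average [n_at_ties, n_retweets, threshold] rows across communities."""
    n = len(results_per_community)
    averaged = []
    for rows in zip(*results_per_community):  # rows belonging to the same threshold
        threshold = rows[0][2]
        mean_ties = sum(row[0] for row in rows) / n
        mean_rts = sum(row[1] for row in rows) / n
        averaged.append([mean_ties, mean_rts, threshold])
    return averaged

# Two made-up communities, thresholds 1 and 2 only
community_a = [[10, 4.0, 1], [2, 6.0, 2]]
community_b = [[20, 8.0, 1], [4, 2.0, 2]]
averaged = average_results([community_a, community_b])
print(averaged)
```

A plain mean like this gives every community the same weight; whether that is the right choice depends on how much the communities differ in activity.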

## Bonus Update

I also thought there needs to be a tie strength lower than 1, which for me is a retweet happening without any AT interaction between the two users.

```python
import numpy as np

# Get the "0"-strength tie: a retweet happened although there is no AT tie (in either direction)
AT_undir = nx.Graph(AT)  # undirected, so we find @replies in both directions
rt_0_edges = []
for n, nbrs in RT.adjacency():  # this was adjacency_iter() in NetworkX 1.x
    for nbr, eattr in nbrs.items():
        if not AT_undir.has_edge(n, nbr):
            rt_0_edges.append((n, nbr, eattr['weight']))

# Insert our results into the array as the first data point
result.insert(0, [0, math.fsum(x[2] for x in rt_0_edges), 0])
thresholds.insert(0, 0)

# Retweets per AT tie; strengths without any AT tie (including 0) become NaN and are not drawn
rt_counts = np.array([rt[1] for rt in result], dtype=float)
at_counts = np.array([at[0] for at in result], dtype=float)
percentages = np.divide(rt_counts, at_counts,
                        out=np.full_like(rt_counts, np.nan), where=at_counts > 0)

# Plot it
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(thresholds, [at[0] for at in result], 'b-', label='# of AT ties with strength x')
ax1.plot(thresholds, [at[1] for at in result], 'g-', label='# of retweets flowing through these ties')
ax1.legend(loc=2)
ax2 = ax1.twinx()  # second y-axis for the ratio
ax2.plot(thresholds, percentages, 'r-', label='% of #RT/#AT')
ax2.legend()
```

So if we add this to the chart, we end up with a different picture:

So what we see right now looks more like the “strength of NO ties”. (Would make a nice paper title :)) People retweet each other even if they have had no interaction before. It will be interesting to explore if this assumption holds for all of the other 100 datasets.

P.S. I thought about adding another tie strength, e.g. 0.5. This would correspond to a person FOLLOWING the other person, which is definitely less than writing an @reply to this person but more than nothing. (We all know following each other on Twitter is cheap.)
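As a rough sketch of that idea (the function name `tie_strength` and the `is_following` flag are hypothetical, and 0.5 is just the example value from above), the combined tie strength could be defined like this:

```python
def tie_strength(at_count, is_following, follow_strength=0.5):
    """@reply count if there is any, else follow_strength for a follow-only tie, else 0."""
    if at_count > 0:
        return at_count
    return follow_strength if is_following else 0

print(tie_strength(3, True))   # @replies dominate
print(tie_strength(0, True))   # follow only
print(tie_strength(0, False))  # no tie at all
```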

Cheers

Thomas

Dear Thomas. Nice post on this idea.

Actually, we have just published a paper in which we went a little bit further, in the direction you suggest at the end. The key question we wanted to answer was: where (in the network) do RTs happen? We found, as expected, that RTs happen with higher probability between communities, and thus RTs resemble useful information going from one group to another through bridges (weak links).

You can find the details here http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0029358

I would be glad to know your opinion.

Best

Great paper! Your insight that “links with mentions are more abundant as internal links than the baseline follower relations for groups of size up to 150 users”, in regard to the Dunbar number, is also something I stumbled upon when collecting lists on particular topics. I found that creating topic groups of 100 to 200 people works best, because beyond that the attention towards others who contribute to this topic drops off (I will try to write a blog entry on that).

What I find somewhat puzzling is that retweets appear with higher probability in links between groups with a low overlap. Although retweets propagate information, what motivation would someone have to retweet something from a remote group to his followers (I mean besides his own motivation, which is being a broker and caring for a number of groups)? Will his followers really care about it? I think if we were to count how many retweets people got from other groups, this proportion should be quite small.

Thanks for your comment. Regarding your question about the motivation to retweet something, I guess the real question is: why are we sharing information? There are many reasons, but I found the following list quite comprehensive:

1. Altruism: We share to bring valuable and entertaining content to others. We think about what our friends want to know, and try to help them out.

2. Self-definition: We share to define ourselves to others. Perhaps this notion is better phrased as, “you are what you share.” People consciously shape their online persona by the types of things they share.

3. Empathy: We share to strengthen and nourish our relationships. Sharing shows someone else we’re thinking about them and we care.

4. Connectedness: We share to get credit and feedback for being a good sharer, to feel valuable in the eyes of others.

5. Evangelism: We share to spread the word about a cause or brand we believe in.

(this list was taken from “The Psychology of sharing”, The New York Times Insights http://nytmarketing.whsites.net/mediakit/pos/)

I guess what we measured on Twitter was number 1. People want to bring valuable information to their groups, and that information most of the time comes from another group.