Is it true that the more strong ties a group has, the more information gets diffused between its members? According to Granovetter, information tends to diffuse faster among people with strong ties, so this should be the case. But if we want to study this phenomenon on Twitter, we first have to define what a strong tie is. I have come up with at least four definitions, using the follow relationship and the @reply relationship.
- Weak ties: A follows B, B does not follow A. This is the cheapest tie on Twitter, where a person simply follows another person.
- Weak-Strong ties: A follows B, B does not follow A but @replies A. In this case A followed B and B replied to A, so B at least acknowledged A’s existence.
- Strong ties: A follows B, B follows A. This tie is reciprocated, so it would satisfy some definitions of a strong tie.
- Strongest ties: A @replies B, B @replies A. These are the strongest ties, since both interact at least once with each other.
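The four definitions above can be sketched as a small lookup over two directed graphs. This is only an illustration: the function name `classify_tie` and the edge conventions (an edge x → y in FF means x follows y, an edge x → y in AT means x @replied to y) are my assumptions, not something fixed by the data.

```python
import networkx as nx


def classify_tie(FF, AT, a, b):
    """Return the strongest label that applies to the ordered pair (a, b).

    Assumed conventions (hypothetical, adjust to your data):
      FF edge (x, y): x follows y
      AT edge (x, y): x @replied to y
    Returns None if a does not follow b and there is no mutual @reply.
    """
    # Strongest: both sides @replied to each other at least once.
    if AT.has_edge(a, b) and AT.has_edge(b, a):
        return "strongest"
    # Strong: the follow relationship is reciprocated.
    if FF.has_edge(a, b) and FF.has_edge(b, a):
        return "strong"
    # Weak-strong: a follows b, b does not follow back but @replied to a.
    if FF.has_edge(a, b) and AT.has_edge(b, a):
        return "weak-strong"
    # Weak: a simply follows b.
    if FF.has_edge(a, b):
        return "weak"
    return None
```

The checks run from strongest to weakest, so a pair that satisfies several definitions gets the strongest label.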
Beyond that, we can compute an “average tie strength” for a group: take the strength of each tie, counted as the number of @replies exchanged, add the strengths up and divide by the number of ties. For example, if the group consists of three people A, B and C, where A @replies B 3 times and B @replies C 4 times, the average is (3 + 4) / 2 = 3.5 (strengths added up / number of ties). Doing this with networkX is pretty easy, given a graph (D) that holds the tie strengths in “weight”. The measure is somewhat problematic, though: I can imagine a group of 100 people in which no one talks to anyone else apart from two persons who exchange 100 @replies. Divided by the 100 group members, the “average” tie strength would be 1, and divided by the number of ties it would be even higher, although 98 people never interact at all.
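A minimal sketch of this computation, assuming a directed graph whose edges carry the number of @replies in a “weight” attribute. The name `total_edge_weight` is the one used later in this post; its body here and the helper `average_tie_strength` are my reconstruction, not the original code.

```python
import networkx as nx


def total_edge_weight(G, weight="weight"):
    """Sum the weights of all edges; edges without a weight count as 1."""
    return sum(w for _, _, w in G.edges(data=weight, default=1))


def average_tie_strength(G, weight="weight"):
    """Total tie strength divided by the number of ties (edges)."""
    n_ties = G.number_of_edges()
    if n_ties == 0:
        return 0.0  # no ties at all: define the average as 0
    return total_edge_weight(G, weight) / n_ties


# The three-person example: A @replies B 3 times, B @replies C 4 times.
D = nx.DiGraph()
D.add_edge("A", "B", weight=3)
D.add_edge("B", "C", weight=4)
print(average_tie_strength(D))  # (3 + 4) / 2 = 3.5
```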
If we want to be rather conservative about tie strength, we can use the reciprocated ties definition. Computing it with networkX is similarly easy. We can call this measure reciprocity, as it measures the proportion of reciprocated ties among all ties. See http://www.faculty.ucr.edu/~hanneman/nettext/C8_Embedding.html for details. UCINET has a similar routine under Network>Network Properties>Reciprocity. At the time of writing, networkX did not seem to offer one.
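One way to write such a measure by hand is to count, for every directed edge, whether the reverse edge also exists. This is a sketch under the assumption that the graph is a `DiGraph` without self-loops; the function name is mine. Recent networkX releases also provide `nx.overall_reciprocity`, which should give the same number for such graphs.

```python
import networkx as nx


def reciprocity(G):
    """Proportion of directed edges (u, v) whose reverse (v, u) also exists.

    Assumes a directed graph without self-loops; an empty graph yields 0.
    """
    n_edges = G.number_of_edges()
    if n_edges == 0:
        return 0.0
    reciprocated = sum(1 for u, v in G.edges() if G.has_edge(v, u))
    return reciprocated / n_edges


# A and B follow each other, A also follows C: 2 of 3 edges are reciprocated.
G = nx.DiGraph([("A", "B"), ("B", "A"), ("A", "C")])
print(reciprocity(G))
```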
Given that we have three different graphs:
- The FF graph, holding the follow relationships
- The AT (@) graph, holding the interaction relationships
- And the RT graph, holding the actual diffusion between people
If we input the FF graph into reciprocity, we find out how many of the follower relationships are reciprocated (see “strong ties” above). If we input the AT graph, we get the “strongest ties” (see above). So we end up with at least three operationalizable definitions of strong ties: reciprocated ties in FF, reciprocated ties in AT, and average tie strength, measured by the average interaction inside the group. We will now see whether, across 100 groups, the more strong ties a group has, the more information is diffused inside the group.
To measure the information diffusion I will use two measures. The first is the density of the RT network: the higher the density, the more information diffusion ties there are between those people. The second is the total volume of information exchanged in the group. We compute it by adding up all the retweet ties with their respective weights for a group and then dividing by the number of people in the group, that is, total_edge_weight(RT) / len(RT.nodes()), using the total_edge_weight method from the average tie strength above. For example, if our network had only two nodes A and B, and A retweeted B 3 times, the total volume of information exchanged would be 3 / 2 = 1.5, and the density would be 1 / 2 = 0.5, since one of the two possible directed ties is present.
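Both diffusion measures can be sketched in a few lines, assuming RT is a directed graph whose edges carry the retweet count in a “weight” attribute. The helper name `diffusion_volume` is hypothetical; `nx.density` is the standard networkX function.

```python
import networkx as nx


def diffusion_volume(RT, weight="weight"):
    """Total retweet weight divided by the number of people (nodes)."""
    n_people = RT.number_of_nodes()
    if n_people == 0:
        return 0.0
    total = sum(w for _, _, w in RT.edges(data=weight, default=1))
    return total / n_people


# The two-node example: A retweeted B 3 times.
RT = nx.DiGraph()
RT.add_edge("A", "B", weight=3)
print(nx.density(RT))        # 1 of 2 possible directed ties -> 0.5
print(diffusion_volume(RT))  # 3 / 2 = 1.5
```

Note that the volume is normalized by the number of people, not by the number of ties, which is what makes the two-node example come out at 1.5.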
Using simple regression you get the following results:
The covariance between FF_reciprocity and AT_reciprocity is not shown.
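For readers who want to reproduce this kind of analysis, here is a rough sketch of an ordinary least squares fit that returns the share of explained variance (R²). It assumes one row per group with the predictors FF_reciprocity and AT_reciprocity and the diffusion measure as the outcome. The function name `ols_r_squared` is hypothetical, it uses NumPy rather than whatever tool produced the results in this post, and it contains no data from the 100 groups.

```python
import numpy as np


def ols_r_squared(X, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, R^2).

    X: one row per group, one column per predictor
       (e.g. FF_reciprocity, AT_reciprocity).
    y: one diffusion value per group (e.g. RT density); must not be constant.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the model has an intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)         # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total variation
    return coef, 1.0 - ss_res / ss_tot
```

An R² of 0.22 from such a fit corresponds to “explaining about 22% of the variance”.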
So in general, by counting how many strong ties a group has, we can explain about 22% of the variance in the diffusion as measured by density. If we run the same regression with diffusion measured by volume, we get the following result:
So the strong ties defined by AT_reciprocity do not seem to contribute to explaining the volume, and we can explain barely 8% of the variance. I may have to rethink my measure of information diffusion by volume. It might suffer from the fact that the average volume can look high for a group while actually being produced by a small number of people who retweet each other all the time. I will create some histograms of the RT volume for each group to see what is going on.