My blog on my research on Twitter

A few months ago I’ve made a blog post (https://twitterresearcher.wordpress.com/2012/01/17/the-strength-of-ties-revisited/) investigating tie strenghts on Twitter and their influence on retweets. Well it turns out that my analysis was lacking a lot of detail, so I re-did it again considering more aspects than before. So lets get started.

The data that I am using for this analysis is the following: Each group of people consists of 100 people that have been highly listed for a given topic in Twitter e.g. snowboarding or comedy or any other topical interest that people have on Twitter. There are 170 of such groups, each consisting of exactly 100 members (You can read how I created such groups in my recent blog posts here https://twitterresearcher.wordpress.com/2012/06/08/how-to-generate-interest-based-communities-part-1/ and here https://twitterresearcher.wordpress.com/2012/06/12/how-to-generate-interest-based-communities-part-2/). In an abstract way you can imagine the structure of the network to looks something like this:

The graphic above indicates that we only have the friend-follower ties on Twitter between those people. But indeed there are quite a few more ties between people, resulting in a multiplex network between them. This network consists of three layers:

- The friend-follower ties
- The @interaction ties (whenever a user mentions another user this corresponds to a tie)
- And finally the retweet ties (whenever a user retweets another user this corresponds to a tie)

Schematically this looks something like this:

Now when we think about ties between those people especially in regard to tie-strengths we can come up with a couple of different definitions of ties ( I mentioned a couple of those in my blog post here https://twitterresearcher.wordpress.com/2012/05/24/tie-strength-in-twitter/)

**Non-valued-ties:**

- No Tie: Neither in the Friend and Follower network, nor in the @interaction network there are any ties between those people.
- Non-reciprocated-friend-follower-tie: Person A follows a person B in the friend and follower network. Person B does not follow person A.
- Reciprocated-friend-follower-tie: Person A follows person B. Person B follows person A.
- Non-reciprocated-@-interaction-tie: Person A mentions person B EXACTLY one time. Person B does not mention person A.
- Reciprocated-@-interaction-tie: Person A mentions person B EXACTLY one time. Person B mentions person A at least one time.

**Valued ties:**

- Interaction tie with strength x: Person A mentions person B EXACTLY X times. (e.g. tie of strength 10 would mean person A has mentioned person B 10 times)

**Bridging vs. bonding ties:**

- Bridging ties: We call bridging ties all of those ties that are BETWEEN groups (see schematic network graphic above the ties in red)
- Bonding ties: We call bonding ties all of those ties that are INSIDE groups (see schematic network graphic above the ties in black)
- Notice that our definition of bridging and bonding ties might differ a bit from the pure network perspective, where maybe by definition bonding ties would have to have a certain strength, reciprocity and so on. Here we rather take the underlying groups, that we created artificially, but which represent nicely users that strongly share a certain interest.

Having all those definitions of ties we can now come up with a number of observations regarding the information diffusion between those people. The information diffusion is captured in the retweet network (see third layer in the schematic graphic) and the corresponding ties. In generall we want to look at how the different tie types affect the information diffused (retweets) between those people.

To get an overview over the data I will first have a look how many retweets have in total have been exchanged between the analyzed groups. I count how many retweets took place inside the group (blue) and between the groups (red). Each of the 170 groups is shown below:

Approximately a total of 214.000 retweets took place between groups (red) and 414.000 retweets that took place inside the groups (blue). In the graphic above we can clearly see the differences between the different interest groups. I’ve ordered the groups ascending to retweets inside the community and which makes us see that there are some groups that focus mostly on retweets inside the group (e.g. tennis or astronomy_physics) while other groups rather get mostly retweets from outside of their own group and do not retweet each other so much inside the group (e.g.poltics_news or liberal). Although we cannot clearly say that the group has an influence if it gets retweeted from outside the group, we can say that the members of the group at least have the choice to retweet other members of the group. If these members do not retweet each other it might have a reason about which you are free to speculate (or I will try to answer in the next blog post)

Given the different types of ties described above we can now ask the most important question:

**How do the different non-valued bridging ties differ from the bonding ties in regard to their influence on the information diffused through those ties?**

What do I mean by that? Having all retweets between the persons in the sample I want to find out through which ties these retweets have flown. So for example given that A has retweeted B three times , I ask the question which ties (that A and B already have in the friend and follower network or the interaction network) were “responsible” for this flow of information between those actors?

EXAMPLE: If two people have mentioned each other at least once, I will assume (according to the definition above) that they hold a reciprocated interaction tie. I will then assume that this tie was “responsible” for the retweet between them. NOTICE: This is a simplifying assumption because I assume that if there is a stronger tie it is always was responsible for the retweet and not the maybe underlying weaker tie (as in form of a friend and follower tie).

**The assumption that I make here is therefore:**

- > means this connection is supposed to be stronger
- AT_reciprocated_tie > AT_directed_tie_with_strength_1
- AT_directed_tie_with_strength_1 > FF_reciprocated_tie
- FF_reciprocated_tie > FF_non_reciprocated_tie
- FF_non_reciprocated_tie > No Tie

In order to compute which kind of ties were most successful of transmitting retweets, I compute the ratio of ties that had retweets that have flown through this TYPE of tie (e.g. ff_reciprocated_ties) and divide it through the amount of the same ties that no had no retweets (e.g. ff_reciprocated_ties between people where no retweet was exchanged between those persons). So if I have a total of 10.000 reciprocated ties and over 2000 a retweet took place while over the remaining 8000 no retweets have been transmitted the ratio for this type of tie is 0.25.

I have summarized the results in the table below. The std. deviation reports the deviation in the different retweet ties that belong to a certain edge type. (In the case of no_tie we have no data for no retweets because here we would have to count all the ties that are not present, which seems a bit unrealistic, given the structure of social networks)

As you can see in the table I have first of all differentiated if a tie belongs to a bridging tie or a bonding tie. Remember that bonding ties are between people who hold the same interest while bridging ties are between people who belong to different groups and thus share different interests.

**No ties**

As you can see first of all there are a couple of retweets that have taken place between people despite those people actually holding any ties. In the case of bridging ties we a bit more retweets than in the case of bonding ties. Yet regarding the total of almost 660.000 retweets, the approximately 73.000 retweets that took place without a tie are more or less only 10% of the total information diffusion. (So my appologies for the blog post on the importance of no ties was overstating their importance, given this new interpretation)

**Friend and follower ties**

What is more interesting are the friend and follower ties. We can see that in both cases holding a reciprocated tie with a person, results in a higher chance of getting retweeted by this person. Although when we look at the bonding ties this chance is almost 4 times as high, while in the bridging ties our chances improve only by less than 10%. When we compare the bonding with the bridging ties we clearly see that the reciprocated bonding ties have a magnitude of 10 higher chance of leading to a retweet than the bridging ties. This is very interesting. So despite the fact that of course bridging ties are important because they lead to a diffusion of information outside of the interest group, they are much more difficult to activate than ties between people who share the same interest. So from my point of view this fact shows exactly the weakness of weak ties. When I mean weak ties I refer to the bridging ties that link different topic interest communities together. We see that not only the weaker the tie the lower the chance of it carrying a retweet but also if the tie is a bridging tie the chances drop significantly.

Additionally we can also see that the reciprocated friend and follower ties correspond to the majority of the bandwidth of information exchanged. This is also an interesting fact since the stronger the ties get the higher the chance of obtaining a retweet through this tie, but at the same time the total amount of retweets flowing through these ties drops dramatically (we will also see this when we take a look at the valued at-interaction ties). Just by adding up the numbers we see that almost 3/4ths of all retweets inside the group have flown through the reciprocated friend and follower ties. So although those ties have only a ratio of 0.8 of retweets / no retweets they are the ties that are mostly responsible for the whole information diffusion inside the group.

**Interaction ties**

When we analyze the interaction ties we find a similar pattern. We see that the bonding ties have a much higher chance of resulting in a retweet than their bridging counterparts, although the difference is not as dramatic. In general we also notice that the reciprocated at_ties have the higher chance of leading to retweets. Actually the ratio is higher than one in the reciprocated bonding ties. This means that per tie we obtain more than one retweet. From tie “maintainance perspective” it would seem smart to maintain such ties with your followers because on average they lead to the highest “earnings” or retweets. We shouldn’t jump the gun too early here, because up till now we have analyzed the rather “weak” ties. Why weak? Well having had a reciprocated conversation with a person is great but having had received 10 or 50 @ replies from that person is definitely a stronger tie, and might lead to a higher chance of getting retweeted by this person.

If we look at the valued ties we could replicate the table above and go through each tie strength separately, but its more fun to do this in a graphical way. I have therefore plotted the tie strength between two persons on the X-axis and the ratio (ties that had retweets flow through this type of tie / same type of ties that had no retweet) on the Y axis (make sure to click on the graphic to see it in full resolution)

So what do we see? Well first of all the red line marks the ratio of 1, which is receiving more retweets through this type of tie than not receiving retweets. Anything above one is awesome ;). You also notice that there is quite a lot of variance in the retweets, which is indicated by the error bars (std deviation). As the ties get stronger I would say that the standard deviation also gets higher (due to higher and less values in the retweets)

**Bridging ties vs. bonding ties**

What we notice is that both the bridging and bonding ties have a tendency to result in a higher chance of retweets flowing through this tie, the stronger they get. I would say this holds up to a certain point maybe the strength of 40? After this the curve starts to fluctuate so much that we can’t really tell if this behavior looks like this simply by chance (notice the high error bars). What we also see is that clearly the bridging ties have a lower chance of resulting in retweets than their bonding counterparts (comare green curve with the blue one). This is an observation that we have also noticed before. So again here it is, the weakness of weak ties. Weaker ties lead to a lower chance of resulting in retweets and the typical weak bridging ties also are much harder to activate than their bonding counterparts. What is not shown in this graph is the total number of retweets that have flown through those strong ties. Those are ~ 29000 retweets for bridging ties and ~ 37000 for bonding ties. Compared to the other tie types this is only a fraction of the total of exchanged retweets. Yet these strong ties in comparison have a very high chance leading to retweets, having sometimes ratios higher than 3 (i.e. there are thee times more retweets than flowing through this type of tie than no retweets flowing through this tie).

Well that was it for today. I will update this blog post with the reverse direction of ties tomorrow where Iwill have a look on the influence of outgoing ties on the incoming retweets. But don’t expect any surprises ;). Plus I will post the code that I used to generate this type of analysis.

Cheers

Thomas