
Tuesday, September 11, 2012

Using Python To Graph Twitter Activity

Did you ever want to know when your twitter followers are most active, or when the users you follow are most likely to post updates?  Using data gathered from the tool developed in my last post, along with my new program, tweet_graph, I'll show you how to create graphs of twitter activity that allow you to identify usage patterns over different time periods.

tweet_graph.py & tools

First of all, you'll need to run contact_grab.py to get a list of your contacts, then run tweet_grab.py to get their status updates over a time span.  In my case I chose 42 days, an even 6 weeks.  You can create tailored graphs of the results by supplying tweet_graph with different arguments.  It simply steps through each status update, checks whether it meets the parameters that you've set, and then, based on the display period, places it into a bin to plot a histogram.
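The binning step described above can be sketched roughly like this.  It's a minimal illustration, not the actual tweet_graph.py source: the function names bin_key and histogram, the one-bin-per-hour resolution, and the use of Python 3 are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_key(created_utc, display_period, utc_offset):
    """Return the histogram bin for one status update.

    created_utc is a naive datetime in UTC, utc_offset is in hours.
    (Hypothetical helper, not taken from tweet_graph.py.)
    """
    local = created_utc + timedelta(hours=utc_offset)
    if display_period == "d":
        # daily: one bin per hour of the day (0-23)
        return local.hour
    if display_period == "w":
        # weekly: one bin per hour of the week (Monday 00:00 is bin 0)
        return local.weekday() * 24 + local.hour
    # all: one bin per calendar day in the downloaded period
    return local.date()


def histogram(timestamps, display_period, utc_offset):
    """Count how many status updates fall into each bin."""
    return Counter(bin_key(t, display_period, utc_offset) for t in timestamps)
```

Filtering by user or relationship would happen before the timestamps reach histogram, so the counting itself stays the same for every kind of graph.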

I'd rather show you some of the cool things you can discover using the tool, so I won't get too bogged down in the details of how tweet_graph.py works, but the following is a list of the arguments the program takes as input.


usage: tweet_graph.py user display_period utc_offset [-h] [--rel_user USER_NAME] [--followee] [--follower] [--peers]

positional arguments:
  user                  twitter user to analyse
  display_period        (a) all, (d) daily, (w) weekly
  utc_offset            UTC offset for current location in hours

optional arguments:
  -h, --help            show this help message and exit
  --rel_user USER_NAME  related user to analyse
  --followee            show users that are followed by the user to analyse
                        but don't follow back
  --follower            show users that follow the user to analyse but aren't
                        followed back
  --peers               show users that follow the user to analyse and are
                        followed back
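For anyone wanting to build a similar command line, the interface above could be declared with Python's argparse module along these lines.  This is only a sketch based on the usage text: the build_parser name is hypothetical, and parsing utc_offset as a float (to allow half-hour offsets) is an assumption rather than something the original program is known to do.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the usage text (hypothetical sketch)."""
    parser = argparse.ArgumentParser(prog="tweet_graph.py")
    parser.add_argument("user", help="twitter user to analyse")
    parser.add_argument("display_period", choices=["a", "d", "w"],
                        help="(a) all, (d) daily, (w) weekly")
    # float is an assumption; it lets offsets such as 9.5 through
    parser.add_argument("utc_offset", type=float,
                        help="UTC offset for current location in hours")
    parser.add_argument("--rel_user", metavar="USER_NAME",
                        help="related user to analyse")
    parser.add_argument("--followee", action="store_true",
                        help="show users that are followed by the user "
                             "to analyse but don't follow back")
    parser.add_argument("--follower", action="store_true",
                        help="show users that follow the user to analyse "
                             "but aren't followed back")
    parser.add_argument("--peers", action="store_true",
                        help="show users that follow the user to analyse "
                             "and are followed back")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Because none of the options look like negative numbers, argparse accepts a negative offset such as -6 as the third positional argument.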


First up, we'll display the twitter activity of all my contacts for the entire period downloaded, in this case 6 weeks.

command line graph of twitter activity
my contacts' twitter activity over 6 weeks

You can already start to see a pattern here.  At this point we can assume that the 6 troughs are low activity on the weekends, but we can plot the same data over a weekly time-frame to get a better understanding of what's going on.

command line graph of weekly twitter activity
the weekly trend of my contacts' twitter activity

This confirms the idea that there's less activity on weekends, but it also shows another trend, one that's expected.  There ain't much happening on twitter in the middle of the night.  Of course this depends on who your contacts are.  I'd guess that 80% of my twitter contacts are local, with almost all the others in the US, so most of the activity I see will be during the day.  If we plot this data over the course of a day we can get a better representation of the activity at different times.

command line graph of daily twitter activity
the daily trend of my contacts' twitter activity

This clearly shows that most of my twitter activity happens while I'm awake, with a quiet period from midnight through to about 5 am.  There are also 2 minor peaks during the day that coincide with the commute to and from work.  I'm reasonably sure that they are due to a couple of accounts I follow tweeting about things like traffic and public transport.

Just to show some other capabilities of tweet_graph I'll display information about people that follow me but aren't followed back.

command line graph of twitter follower activity
the daily trend of twitter activity for users that follow me and aren't followed back

It shows that these users are most active around 3 or 4 in the afternoon.  But what about people I follow who don't follow me?

command line graph of users followed on twitter
the daily trend of twitter activity for users that I follow but who don't follow me back

From the graph you can see that these accounts are most active around 9 in the morning, with further peaks at 3 in the afternoon and 9 at night.  Generally, though, there is a downward trend until about 1 am.

What about people I follow who follow me back?  I call them peers.

command line graph of users that follow back on twitter
the daily trend of twitter activity for users that follow me and are followed back

Here we can once again see the peaks around 8 am and 5 pm.
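The three relationship groups used in these graphs (followee, follower and peers) boil down to simple set operations on two lists of names.  Here's a small sketch of the idea; classify_contacts and the example names are hypothetical, and the real program may well organise this differently.

```python
def classify_contacts(following, followers):
    """Split contacts into the three groups used by the command line flags.

    following: names the analysed user follows
    followers: names that follow the analysed user
    (Hypothetical helper, not taken from tweet_graph.py.)
    """
    following, followers = set(following), set(followers)
    return {
        # followed by the user, but don't follow back
        "followee": following - followers,
        # follow the user, but aren't followed back
        "follower": followers - following,
        # follow the user and are followed back
        "peers": following & followers,
    }


if __name__ == "__main__":
    groups = classify_contacts(["alice", "bob", "carol"],
                               ["bob", "carol", "dave"])
    for name, members in groups.items():
        print(name, sorted(members))
```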

We can also get some more in-depth information about a specific user by using the --rel_user option.  In this case I'm having a look at the activity over a week for queenslandrail, the people responsible for passenger trains in Brisbane, Australia.

graph of Queensland Rail twitter activity
the weekly trend of twitter activity for Queensland Rail

From this graph we can see that there isn't much activity on the weekend, which is what you'd expect from a public transport provider.  Let's take a closer look by viewing the daily trend.

graph of Queensland Rail twitter activity
the daily trend of twitter activity for Queensland Rail

Once again, exactly what you'd expect from someone like Queensland Rail.  A lot of activity around peak hour times, most likely service updates and responding to commuter queries, a little throughout the day and not much from 7 pm to 5 am.

For a bit of a change let's have a look at the daily trend for an overseas user, sparkfun.

graph of Spark Fun twitter activity
the daily trend of twitter activity for Spark Fun with an incorrect UTC offset

At first it looks like they're creatures of the night, starting at midnight and finishing at 9 am.  However, if you look at the command line you'll see that the UTC offset is still set at +10 for Brisbane, Australia, whereas it needs to be -6 for daylight saving time in Boulder, Colorado.

graph of Spark Fun twitter activity
the daily trend of twitter activity for Spark Fun with a correct UTC offset

Once adjusted, we get a more accurate picture of what's going on.  Because we are looking at a business user and not a personal user, they are more likely to respond only during work hours, which is what we see here.
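To see why the offset matters so much, here's a small sketch that converts one UTC timestamp to the local hour for both offsets.  The local_hour function and the example timestamp are made up for illustration, and this isn't necessarily how tweet_graph.py applies the offset internally.

```python
from datetime import datetime, timedelta, timezone


def local_hour(created_utc, utc_offset):
    """Return the hour of day (0-23) at the given UTC offset in hours."""
    tz = timezone(timedelta(hours=utc_offset))
    return created_utc.astimezone(tz).hour


# hypothetical status update posted at 14:00 UTC
tweet = datetime(2012, 9, 10, 14, 0, tzinfo=timezone.utc)

print(local_hour(tweet, 10))  # 0, midnight in Brisbane (UTC+10)
print(local_hour(tweet, -6))  # 8, 8 am in Boulder (UTC-6)
```

The 16 hour gap between the two offsets is exactly why a normal working day in Boulder shows up as midnight to 9 am on the Brisbane clock.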

We can also have a look and see what their weekly trend is.

graph of Spark Fun twitter activity
the weekly trend of twitter activity for Spark Fun

Wow, that's some awesome customer service.  They're posting regular status updates 7 days a week.

I find this kind of thing fascinating.  Using something as basic as twitter status updates, you can build up a picture of a user or group of users.  Then again, maybe it's just me.  I think graphs are awesome.
