In this short tutorial you will learn how to collect tweets using ruby and only two gems.
It is part of a series where I will show you what fantastic things you can do with twitter these days, if you love mining
The first gem I would like to introduce is sequel. It is a lightweight ORM layer that allows to to intterface a couple of of a
databases in ruby without pain. It works great with mysql or sqlite. We will use sqlite today.I have been using mysql in combination wit rails and the nice activerecord ORM, but for the most tasks it is a bit too bulky. The problem with Sqlite can be though that it does not provide multitasking capabilities. But we will bump into that later…
To get you started have a visit on http://sequel.rubyforge.org/
and have a look on the example. They are pretty straight forward. I can also recommend the cheatsheet under: http://sequel.rubyforge.org/rdoc/files/doc/cheat_sheet_rdoc.html
Install the sequel gem by and you are ready to go.
sudo gem install sequel
Let us set up a little database to hold the tweets. If you are familiar with activerecord, you have probably used migrations before. So sequel works the same way. You write migration files and then simply run them. So here is mine to get you started with a very easy table. Its important to save it as a 01_migration_name.rb file the number is important otherwise sequel wont recognize which migration to run first. I saved it as 01_create_table.rbclass CreateTweetTable < Sequel::Migration
class CreateTweetTable < Sequel::Migration def up create_table :tweets do primary_key :id String :text String :username Time :created_at end end def down drop_table(:tweets) end end
Run the first migration. You will find a great tutorial on migrations on http://steamcode.blogspot.com/2009/03/sequel-migrations.htmlsequel -m . -M 1 sqlite://tweets.db
If you are getting a “URI::InvalidURIError: the scheme sqlite does not accept registry part: …” then your database name probably contains some characters it shouldnt. Just try to use only letters and numbers.
So now you should have a sqlite database for the very basic needs of your tweets. But maybe you need a little bit more information on what you are capturing. So lets´write our second migration. In addition to just storing the text and the
username, I want to store the guid of the tweet and the timezone and the language used.class AddLangAndGuid < Sequel::Migration def up alter_table :tweets do add_column :guid, Integer add_column :lang, String add_column :time_zone, String end end def down alter_table :tweets do drop_column :guid drop_column :lang drop_column :time_zone end end end
After runningsequel -m . -M 2 sqlite://tweets.db
you have created a a nice database that will hold your tweets.
Lets see how it worked. To use sequel in your scripts you have to require rubygems and the seqel gem. What we want to do is to
connect to the database. Just fire up your irb and get us started:require 'rubygems' require 'sequel' DB = Sequel.sqlite("tweets.rb") tweets = DB[:tweets]
In those few lines you loaded up your database and now have a tweets collection that holds your data. I think that is really convenient.In part 2 I will show you how to collect them. Enjoy.