Rails on the Run

Rails experiments by Matt Aimonetti

In Part 1, I quickly explained what I had to do, my constraints, and a potential solution for connecting to a legacy database.

In this post I’ll try to go through setting up a plugin for migration and start using RSpec for developing the migration plugin.

What we want is to migrate sites using the legacy application to our new Rails application. That means that new users won’t be migrated. It therefore makes sense not to add the migration logic to the main application but to create a plugin. (If you are not familiar with Rails plugins, check this blog post from Geoffrey Grosenbach about plugins.)

Let’s create our plugin

./script/generate plugin legacy_migration

Rails should have generated something like this:

legacy_migration
|-- init.rb
|-- install.rb
|-- uninstall.rb
|-- Rakefile
|-- README
|-- lib/
|   |-- legacy_migration.rb
|-- tasks/
|   |-- legacy_migration_tasks.rake
|-- test
|   |-- legacy_migration_test.rb

Since we are going to use RSpec, we can remove the test folder and create a spec folder. In our spec folder, let’s add some subfolders to organize our files: a fixtures folder to hold our fixture data, a helpers folder, a migrate folder (we’ll use that to migrate our legacy database) and finally, a models folder.

Our plugin folder should now look more or less like this:

legacy_migration
|-- init.rb
|-- install.rb
|-- uninstall.rb
|-- Rakefile
|-- README
|-- lib/
|   |-- legacy_migration.rb
|-- tasks/
|   |-- legacy_migration_tasks.rake
|-- spec
|   |-- fixtures
|   |-- helpers
|   |-- migrate
|   |-- models
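
If you would rather script that reshuffle than do it by hand, a minimal sketch using Ruby's standard FileUtils could look like the following. The prepare_spec_folders helper and the plugin_root argument are names made up for this illustration, and the demo runs against a temporary directory rather than a real plugin.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: drop the generated test folder and create the
# spec subfolders described above. plugin_root is the plugin's directory.
def prepare_spec_folders(plugin_root)
  FileUtils.rm_rf(File.join(plugin_root, 'test'))
  %w[fixtures helpers migrate models].map do |name|
    path = File.join(plugin_root, 'spec', name)
    FileUtils.mkdir_p(path)
    path
  end
end

# Try it against a throwaway directory that mimics a freshly generated plugin.
DEMO_ROOT = Dir.mktmpdir('legacy_migration')
FileUtils.mkdir_p(File.join(DEMO_ROOT, 'test'))
CREATED = prepare_spec_folders(DEMO_ROOT)
puts CREATED
```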

Great, let’s get started and let’s create our first spec. We should probably start by migrating users so I’ll create a new spec in the spec/models folder called legacy_user_spec.rb and add the following code:

require File.dirname(__FILE__) + '/../helpers/legacy_user_helper'
require File.dirname(__FILE__) + '/../helpers/spec_helper'

describe "a connection to the legacy application" do
  setup do
    @connection_status = LEGACY.connect
  end

  it "should be connected to the legacy database" do
    @connection_status.current_database.should == ActiveRecord::Base.configurations['legacy']['database']
  end
end

Note that I’m using RSpec trunk/edge and I use “describe” instead of “context” and “it” instead of “specify”. For more information on how to run RSpec edge with TextMate read this previous post.

If we look at the code above, we start by requiring two helpers: a general helper called spec_helper and a helper just for this spec called legacy_user_helper (we will obviously need to create them, otherwise our spec will fail).

Then we start our first spec by describing a connection to the legacy application and we specify that it should be connected to the legacy database.

Here is our setup code:

setup do
  @connection_status = LEGACY.connect
end

What I want is to retrieve a connection status after I connect to our legacy application. To manage the connection to the legacy application we will create a LEGACY module. We will need to connect to many legacy applications/sites and our module should help us do that.

Then we can read that our spec checks that we are connected to the legacy database.

it "should be connected to the legacy database" do
  @connection_status.current_database.should == ActiveRecord::Base.configurations['legacy']['database']
end

That means we want to compare the name of the database we are connected to with the database name configured for the ‘legacy’ environment in the database.yml file.

We now need to get this spec to pass.

Let’s get started by adding a legacy environment to our database.yml and creating our LEGACY module in our legacy_migration.rb file.

Add the following to your database.yml

legacy:
  adapter: mysql
  database: legacy
  username: root
  password:
  host: localhost
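
In the Rails version used here, each top-level key of database.yml ends up as a string-keyed hash inside ActiveRecord::Base.configurations. As a rough illustration of what the spec will compare against, the sketch below lets plain YAML parsing stand in for Rails; the DATABASE_YML and CONFIGURATIONS constants are made up for this example.

```ruby
require 'yaml'

# Stand-in for config/database.yml; only the legacy entry is shown.
DATABASE_YML = <<~YAML
  legacy:
    adapter: mysql
    database: legacy
    username: root
    password:
    host: localhost
YAML

# Assumption: this plays the role of ActiveRecord::Base.configurations.
CONFIGURATIONS = YAML.load(DATABASE_YML)

legacy = CONFIGURATIONS['legacy']
puts legacy['database']          # the name our spec compares against
puts legacy['password'].inspect  # an empty YAML value loads as nil
```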

Now, let’s create the LEGACY module in our legacy_migration.rb file


module LEGACY

  # Connect to a legacy database.
  # Usage:
  #   Manual connection: LEGACY.connect(:database => 'legacy-database', :adapter => 'mysql', :username => 'root', :password => '', :host => 'localhost')
  #   Auto connection to the legacy DB defined in database.yml: LEGACY.connect
  #   Connection to any database available in database.yml: LEGACY.connect('legacy_test')
  #   Connect a class to a specific database: LEGACY.connect(LegacyInstaller, ActiveRecord::Base.configurations['legacy_installer'])
  def self.connect(spec = nil, opt_env = nil)
    case spec
    # Automatically connect to the legacy environment database defined in database.yml
    when nil
      # `defined?` is always truthy for a method call, so test the value itself
      raise 'Legacy Database not defined' unless ActiveRecord::Base.configurations['legacy']
      LegacyActiveRecord.establish_connection(ActiveRecord::Base.configurations['legacy'])
      # Return the connection status
      LegacyActiveRecord.connection
    # A connection name from database.yml can be passed
    when Symbol, String
      if configuration = ActiveRecord::Base.configurations[spec.to_s]
        LegacyActiveRecord.establish_connection(configuration)
      else
        raise "#{spec} database is not configured"
      end
    # Connect a model to a specific database
    when Class
      raise 'Environment connection not provided or nil' if opt_env.nil? || opt_env['database'].nil?
      if spec.connection.current_database == opt_env['database']
        spec.connection
      else
        spec.establish_connection(opt_env)
      end
    # A hash of connection settings can be passed to establish the connection
    else
      spec = spec.symbolize_keys
      raise 'database configuration does not specify adapter' unless spec.key?(:adapter)
      LegacyActiveRecord.establish_connection(spec)
    end
  end
end

There we go, we have a really cool connect function. We can easily connect to the default legacy environment defined in the database.yml file, specify the connection settings by hand, connect to another environment’s database, and even connect one specific class/model to a specific environment (that will be useful since we have many databases). If I had followed the TDD rules, I shouldn’t have written so much code in one go… As a matter of fact, when I worked on this project I did not, but since I don’t have much time, I won’t go through the refactoring steps here.
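
The dispatching in connect relies on Ruby's case/when, which matches with ===, so a class listed in a when clause matches its instances, while Class matches a class object itself. Here is a stripped-down sketch of just that dispatch, with no database involved. The connection_kind method and its return values are invented for this illustration, and LegacyInstaller is only a placeholder class.

```ruby
# Hypothetical helper: reports which kind of argument was passed,
# mirroring the branches of LEGACY.connect. It never touches a database.
def connection_kind(spec = nil)
  case spec
  when nil            then :default_legacy_entry
  when Symbol, String then :named_entry
  when Class          then :model_class
  when Hash           then :manual_settings
  else raise ArgumentError, "cannot connect with #{spec.inspect}"
  end
end

class LegacyInstaller; end  # placeholder for a real model class

puts connection_kind
puts connection_kind('legacy_test')
puts connection_kind(LegacyInstaller)
puts connection_kind(:adapter => 'mysql')
```

The connect method above lets everything else fall through to its else branch and treats it as a settings hash; naming Hash explicitly, as in this sketch, is one way to reject unexpected arguments early.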

One thing you might have noticed is that we establish a connection between LegacyActiveRecord and the legacy database, instead of connecting ActiveRecord::Base itself to the legacy database.

LegacyActiveRecord.establish_connection(ActiveRecord::Base.configurations['legacy'])

The problem is that we haven’t created the LegacyActiveRecord model yet. Let’s do that right away. Add a new folder called models in our lib folder, create a legacy_active_record.rb file in it, and add the following code:

class LegacyActiveRecord < ActiveRecord::Base
  self.abstract_class = true
end

Cool, now let’s have fun with our new module: fire up the console (./script/console) and try LEGACY.connect.
Here is what we get back:


>> LEGACY.connect
=> #nil, :database=>"legacy", :allow_concurrency=>false, :host=>"localhost", :username=>"root", :adapter=>"mysql"}, @connection_options=["localhost", "root", "", "legacy", nil, nil], [...]

Let’s try to get the database we connected to:

>> LEGACY.connect.current_database
=> "legacy"

Awesome, we can now get our LegacyActiveRecord model to connect to our legacy database and return the connection status. Let’s run our specs… they pass.

Sweet, we set up our migration plugin, wrote our first spec, and added the code needed to get the spec to pass. I think we are done with PART 2 :)

I recently started working on migrating an old PHP-based application to a new sexy Rails app. The old model was kind of messy: the usual case of a bloated app suffering from feature creep, written by many people without any standard conventions.

Anyway, I have to migrate users and content from many sites using the legacy app.

Data Structure

Instead of having 1 code base and 1 database per site, we now only have 1 code base and 1 database for all sites on a server.

The legacy application had a minimum of 3 databases:
1. installer database (keeps track of site versions, manages upgrades, lists sites on a server)
2. central server database (keeps all the sites details such as billing address, location, contacts, licenses… )
3. site database (contains users, content, settings)

Our new application, on the other hand, is as mentioned earlier, very sexy and has a nice Model. It is also developed following the Behaviour-Driven Development principles using RSpec. The best way for me to migrate the legacy data is probably to develop a Rails Plugin with a set of Rake Tasks that I could use to migrate my sites automatically.

I therefore decided to start working on my migration plugin… using BDD, with RSpec edge….

The first things I started thinking about were: what should I test/spec? How am I going to access the pile of unorganized content from the legacy sites? How am I going to connect nicely to each of the legacy sites?

Model Connection

First question by order of priority: how could I leverage the legacy Model by using ActiveRecord?

I could create a Legacy Model for each table and connect each of them directly to the legacy database. Sounds OK, but that means a lot of connections, and every time I needed to access another site, I would have to reconnect all my legacy classes… I was sure there was a better solution… and I was right. After googling for a solution, I noticed that Dave Thomas had already found one a few months earlier:

class LegacyBase < ActiveRecord::Base
  establish_connection ...
end

class LegacyUser < LegacyBase
  ...
end

class LegacyContent < LegacyBase
  ...
end

And as Dave explains in his post:

“It turns out that Rails does just about everything lazily. That includes connecting to databases and reflecting on tables to extract the schema (needed to build the internals of the models). This improves performance, but it also makes this hack possible. In general, you’d expect the LegacyBase class to map to a database table called legacy_base. It would, if we ever tried to use it to access data. But because we don’t, and because Rails only reflects on the table the first time a data access occurs, we can safely create an ActiveRecord class with no underlying database table. This scheme lets me specify the legacy connection once, and share that connection between all my legacy models. It’s tidy, expressive, and saves resources.”

Great, I had a technical solution to nicely connect to a legacy database.
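
To see why a single establish_connection on the parent class is enough, here is a toy model of the idea in plain Ruby. It is not how ActiveRecord is implemented; ToyRecord and its connection_settings method are made up for this sketch. Each class stores its own settings and, when it has none, lazily falls back to its superclass.

```ruby
# Toy stand-in for ActiveRecord::Base: connection settings are stored per
# class and looked up lazily through the inheritance chain.
class ToyRecord
  class << self
    def establish_connection(settings)
      @settings = settings
    end

    # Walk up the class hierarchy until some ancestor has settings.
    def connection_settings
      return @settings if @settings
      return superclass.connection_settings if superclass.respond_to?(:connection_settings)
      raise 'no connection configured'
    end
  end
end

class ToyLegacyBase < ToyRecord
  establish_connection(:database => 'legacy')
end

class ToyLegacyUser < ToyLegacyBase; end
class ToyLegacyContent < ToyLegacyBase; end

puts ToyLegacyUser.connection_settings[:database]
puts ToyLegacyContent.connection_settings.equal?(ToyLegacyBase.connection_settings)
```

In this toy version, pointing ToyLegacyBase at another site's database switches every subclass at once, which is the property I need for migrating many sites.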

Go to Part 2