Keeping Rails Migrations Rolling

Migrations let Rails developers incorporate schema changes as they develop and enhance their web application or service. Rails comes with a set of default rake tasks for migrations, which find and load a set of files ordered by timestamp. By convention, each file name maps to a class name, and the migrator invokes the "up" or "down" class method on that class as appropriate.

A migration is thus Ruby code, which gives the developer a lot of flexibility in adjusting the schema and existing database information as requirements evolve. For non-trivial applications, it is common to have many hundreds of migrations, added over time by different developers.
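To make the file-name convention concrete, here is a minimal sketch in plain Ruby (no Rails required). The helper name parse_migration_filename and the sample file name are made up for illustration, and the capitalize-and-join step only approximates what Rails does internally for simple lowercase names.

```ruby
# Split a migration file name into its version number and the class name
# that, by convention, is expected to be defined inside that file.
# Hypothetical helper for illustration; this is not the Rails implementation.
def parse_migration_filename(filename)
  match = File.basename(filename).match(/\A(\d+)_([a-z0-9_]+)\.rb\z/)
  return nil unless match

  {
    version: match[1].to_i,
    # "add_email_to_users" becomes "AddEmailToUsers"
    class_name: match[2].split("_").map(&:capitalize).join
  }
end

info = parse_migration_filename("db/migrate/20100412093000_add_email_to_users.rb")
puts "#{info[:version]} defines #{info[:class_name]}"
```

A file that does not follow the pattern returns nil here, which mirrors the idea that the migrator only picks up files it can map to a version and a class.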

Migrating Cleanly

As a general guideline, a new developer or a developer cloning the app on a new machine should always be able to migrate cleanly. Thus, either of the two below should work without a hitch.

rake db:drop; rake db:create; rake db:migrate;
rake db:reset

The latter has the same effect as the former, except that db:reset applies the pre-existing schema.rb while the former loads and runs all the migrations individually, creating a new schema.rb file. 

In my experience there are some creeping code and environment problems which can break migrations, especially for those that come after you to work on the project. Here is a list of common mistakes or gotchas and how to avoid them to keep your migrations rolling:

0) Keeping db/schema.rb out of source control

In many projects, db/schema.rb is kept out of the source repository. The rationale is that it can always be recreated by running migrations. The perceived danger is that someone down-migrates for testing and then checks in the stale schema.rb. However, this is less of a worry, since you are trusting your developers to have better sense and run the latest unit tests before committing. (You do have tests, right?!)

The advantage of having a schema.rb is that a new developer can quickly view the database schema, and just run db:reset to get started. 

1) Assumptions about the DB and existing columns 

This is most likely in projects that deal with a pre-existing database, where migrations start only from the point at which Rails was brought in to enhance the product with new features or a semi-independent sub-system. This needlessly splits schema knowledge between the legacy database and the migrations.

When dealing with legacy databases, you should take the time to use rake db:schema:dump or some manual process to create a 00001_initial_db_schema.rb. You will thank yourself for taking an afternoon out to do it.

You should also use rescues judiciously to deal with legacy databases, especially if there are multiple copies which may have drifted from each other (e.g. no indexes on staging/test legacy server) over time.
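As a rough sketch of what a judicious rescue could look like, the helper below wraps a single schema step, logs the failure and carries on. The name attempt and the table and column in the comment are hypothetical, and whether a failed step is really safe to skip depends on your database.

```ruby
# Run one schema step and report whether it applied. A failure is logged
# and swallowed so that the remaining steps in the migration still run.
def attempt(description)
  yield
  true
rescue StandardError => e
  warn "Skipped #{description}: #{e.message}"
  false
end

# Inside a migration's "up" this might read (names are made up):
#   attempt("index on legacy_orders.customer_id") do
#     add_index :legacy_orders, :customer_id
#   end
```

Keeping each rescue scoped to one step, rather than wrapping the whole migration, makes it obvious from the log which copy of the legacy database had drifted and where.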

2) Migrations used to populate data. 

Yes, migrations can have any Ruby code, and it is possible to include some nifty programmatic ways to pre-populate admin users, basic settings and other fun stuff. 

But migrations are not the place for it. Instead, seed data using db/seeds.rb and run rake db:seed. Consider gems such as populator, or create rake tasks for common or periodically imported data. Some apps might have an admin interface which can be used for introducing necessary data.

3) Using Model functionality in migrations 

Models are most commonly used in migrations to "fix-up" data when modifying column semantics or changing table relationships. This is necessary, and allowed. In fact, Rails even provides the #reset_column_information method on ActiveRecord models to reflect the latest migration. 

Models can introduce circular or irreconcilable dependencies amongst your migrations, especially in conjunction with other mistakes.  

Let's say you add a column to the database in a later migration, and change the model to add a corresponding validation on that field. You might still be able to run all migrations, unless an earlier migration creates dummy model objects or does fixups which involve saves or updates. More commonly, you renamed fields, and the validations on the renamed fields fail in the migrations that precede the renaming.

You mitigate this either by using SQL statements with "execute" in the case of data fixups, or by using stripped down "mock-models" as nested classes inside your migration. 

4) Using ActiveRecord models in environment.rb, or in initializers

It might be tempting to use your nifty model (e.g. Setting, Tag) as part of environment, configuration or initialization. Terrible mistake, and usually indicative of teams which never run tests, continuous integration or other deployment mechanisms. Why? Because the Rails environment is initialized not just by the app server, but also by rake tasks, including migrations. Any model used in initializers means that migrations on an empty database can never run.
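If a value really does live in the database, one possible way out (my suggestion, not something Rails provides) is to defer the lookup until the value is first read, so that nothing touches a table at boot time. LazyValue and the Setting call in the comment are hypothetical names used only for this sketch.

```ruby
# Wrap a lookup so that it runs on first read rather than at boot.
class LazyValue
  def initialize(&loader)
    @loader = loader
    @loaded = false
  end

  # Runs the loader once and remembers the result, even if it is nil.
  def value
    unless @loaded
      @value = @loader.call
      @loaded = true
    end
    @value
  end
end

# In an initializer this might read (Setting is a hypothetical model):
#   SITE_NAME = LazyValue.new { Setting.find_by_key("site_name").value }
# Nothing queries the database until SITE_NAME.value is called, so
# rake db:migrate can still boot against an empty database.
site_name = LazyValue.new { "Example Site" } # stand-in for a database read
puts site_name.value
```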

5) Renaming database tables

If you have to rename database tables, you might want to ensure that you are not using any models (with the old name) in your earlier migrations. If you are, and have to rename, consider 3) above. Or you might want to consolidate earlier migrations and do some cleanups. 

6) Inserting migrations or adjusting order

Sometimes, you need to insert a migration other than at the end. Other times, you may want to ensure certain migrations are always run last. For older Rails projects, you might have migration files with simple numerical order, e.g. 022_migration_name.rb. Newer migrations would have a time-stamp. 

Be consistent with migration numerals, especially the number of digits (e.g. 0001_early_fixup, or 9999_must_be_last_migration). If using timestamps, create a timestamp pattern that stands out and is used consistently to indicate a fix-up (e.g. ending in 99 or 77).
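It helps to see how a mixed set of names would actually be ordered. The sketch below assumes the migrator compares the leading digits as integers rather than as strings, which is my understanding of how Rails behaves; the helper migration_order and the file names are made up.

```ruby
# Order migration file names by the integer value of their leading digits.
def migration_order(filenames)
  filenames.sort_by { |name| name[/\A\d+/].to_i }
end

files = [
  "20100412093000_add_email_to_users.rb",
  "9999_must_be_last_migration.rb",
  "022_create_accounts.rb",
  "0001_early_fixup.rb"
]

puts migration_order(files)
```

Under that assumption the order is 0001, 022, 9999 and then the timestamped file, so a 9999 prefix only stays last while every other migration uses short numerals. Once timestamps are in the mix, a "must be last" migration needs a timestamp-style version of its own.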

7) Dividing migrations into multiple directories. 

Ideally, all migrations should be in the db/migrate directory. A well-written plugin or gem would have generate scripts for including its required schema into the application. If your app was componentized and those modules developed in parallel, consider re-syncing schema by adding and checkpointing new migrations. 

8) Short-circuiting old or unused migrations

Let's say a particular migration is no longer necessary or required. Take the time to delete it, including other dependent migrations. Don't just "return" or - even worse - raise "Error" from migrations you don't want. This introduces cognitive and processing overload for a new developer, while possibly hiding bugs. 

By taking these steps and avoiding the mistakes above, you can enjoy all the benefits of Rails migrations and make life easier for you, your team and future contributors.  

Rogue Viruses - With Robot Armies


"It could adjust motors, conveyor belts, pumps. It could stop a factory. With right modifications, it could cause things to explode."

Stuxnet exploits vulnerabilities in Windows PCs to make a home and spread itself. But it does not stop there. It is looking for Siemens micro-controllers that run industrial systems, so that it can actually do something.

Cheaper and easier than sending the Mission Impossible team.

http://www.f-secure.com/weblog/archives/00002040.html

Mixing in MongoDB

I have started using MongoDB for specific use cases in production. I don't see the world in black and white, and neither should you. My goal in writing this post is to note down the rationale and list some gotchas when using it with MongoMapper.


MongoDB provides document-oriented storage, with replication, indexing and rich query support. It blends aspects of distributed key-value stores by keeping most of the data in memory, pulls in fine-grained indexing and query language reminiscent of relational databases, provides map-reduce support for data processing similar to Big Table, while optimizing for fast in-place modifications. In the near term, it will provide horizontal scaling and sharding.

The tradeoff with MongoDB is its lack of transactions; it is not fully ACID. This means MongoDB does not offer single-server durability. Data is only eventually consistent. Due to filesystem operational choices (fsync vs. append-only writes, commit log, etc.), MongoDB can lose data during hard server loss.

Realizing all these aspects, why did I choose to go with MongoDB?

Read the rest of this post »

Twilio on Rails

A few weeks ago, I looked up Twilio in conjunction with a telephony project. I was intrigued by the promise of a simple REST web API for building SMS and voice based interactions within web apps. In prototyping, I ended up developing a Rails plugin called Twilioflow which allows simplified development and testing of telephony apps.

Having spent some time at AT&T/Lucent Bell Labs, I had heard my share of war stories from the programmers and managers who worked on the venerable Audix system. From that perspective, Twilio is yet another example of creative destruction made possible by improvements in the general purpose computing platform coupled with an extensible network protocol (HTTP). Operationally, Twilio goes even further by harnessing Amazon's EC2 infrastructure and we can be sure they have numerous other tricks up their sleeve.

Read the rest of this post »

Rebalancing and Culling

Two years ago, over the summer, I put out several social games on Facebook, Hi5 and Orkut.

It was fun to brainstorm ideas, get something out, and watch the rush of users hitting your servers. The whole process of maintaining, updating and expanding your apps and increasing user engagement was instructive. Another bonus was understanding the intricacies of building on top of other platforms and the headaches of dealing with ever-changing APIs and policies. In short, it was a great experience.

Experience is what you get, when you didn't get what you wanted.

Read the rest of this post »

AirDB can has_many_through

We now have a better, declarative and richer mechanism of specifying model relationships. In the process, AirDB now has support for has_many_through associations. As discussed earlier for join table attributes, the cleanest way to express the association is a has_many_through, where two models have a many to many relationship through an intermediate model. Join table attributes are a nice hack for someone who does not want to rewrite their application code to introduce the intermediate Model class. In some cases, such as self-referential many-many associations, or when you are in the early stages of designing the schema and the application, you may find it appealing to manage the join table as a first-class Model. If so, then has_many_through will help you.

Read the rest of this post »

iAd - Applevertising


As expected, Apple CEO Steve Jobs announced the iAd Mobile Advertising platform from Apple. Apple wants to enable highly interactive and emotional advertising on iPhones. 

According to Jobs, unlike the desktop, "where search is everything", on the iPhone, everyone is interacting with apps. And there is an app for everything, with tons of developers making them free. So if developers have to use advertising to make money, Apple wants to ensure that the ads also benefit from the famous Apple design sensibility.

There are several technical and economic aspects of the iAd platform, and their implications on overall web advertising, that we will see in the weeks and months to follow. But, Apple's approach is an indication of two key aspects: 

  1. People want free stuff, developers give it to them, look to advertising to make money. 
  2. Advertising is being redesigned for new interaction and engagement modalities. 

Shiny and exciting times ahead.

Bootstrapping Stories - TiE Panel 2

Just three months into the new year, the TiE Oregon SW & Internet group has organized two great events for entrepreneurs, gathering together six successful veterans who shared their stories and recounted lessons learned, mistakes made and opportunities taken. Both events were packed with an engaged audience, resulting in informative and honest Q&A.

 

Sudhir Bhagwan, Matt Compton, Ryan Buchanan, Taizoon Doctor, Nitin Khanna

 

The first event in February featured Nitin Khanna, former Chairman and CEO of Saber Corp, Sudhir Bhagwan, former Chairman and CEO of SnapNames and Matthew Compton, then venture partner at Madrona and now CEO at ShopIgniter. That discussion was about getting big and growing fast with venture capital, and was covered by the Oregon Business magazine. The second event, held yesterday, focused on bootstrapping your startup into generating solid profits, and included Mona Westhaver, President of Inspiration Software, Taizoon Doctor, former CEO of Xovix and Ryan Buchanan, CEO of eROI. Both of these events were hosted by Brent Bullock at Perkins Coie, who did a splendid job of moderating and directing the flow of discussion, as well as elaborating on some of the legal aspects of issues faced by the panelists.

Starting Out

For those wondering about starting out, there were three diverse beginnings, each with a unique mindset.

Read the rest of this post »

iPad Strategy

 

This weekend, tens of thousands will get their hands on the iPad. Several people have been playing with it already. John Doerr has been widely quoted with his lusty lines:

“I’ve touched it, I’ve caressed it and I hope to sleep with it this Saturday night. It’s not a big iPhone. It’s the future.”

That, coupled with an announced doubling of the iFund, has a lot of developers and companies considering the iPad as a must-have checkbox platform and a potential moneymaker.

Read the rest of this post »