Inheriting Attributes in a Tree Data Structure

I just published a new gem, inherited-attributes, for Active Record that works with the Ancestry gem to let nodes in a tree data structure inherit attributes from their parent node or from farther up the tree.  We’ve been using this technique for a long time to support configuring our multi-tenant application.

Once the gem is installed, it’s very simple to configure:

ActiveRecord::Schema.define do
  create_table :nodes, :force => true do |t|
    t.string :name
    t.string :value
    t.string :ancestry, :index => true
  end
end

class Node < ActiveRecord::Base
  has_ancestry

  inherited_attribute :value
end

From there, you can access the effective attributes, which look up the tree ancestry to find a value to inherit.

root       = Node.create!
child      = Node.create!(:parent => root, :value => 12)
grandchild = Node.create!(:parent => child)

root.effective_value       # nil
child.effective_value      # 12
grandchild.effective_value # 12 -- inherited from child

There are more options and examples in the gem, including has-one relationships, default values and support for enumerations.
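
To make the lookup concrete, here is a minimal plain-Ruby sketch of the idea. It is not the gem's implementation: SketchNode is a hypothetical class with no Active Record or Ancestry behind it, and it only shows how a value can be found by walking up the parents.

```ruby
# Plain-Ruby sketch of the lookup idea; this is not the gem's implementation.
class SketchNode
  attr_reader :parent, :value

  def initialize(parent: nil, value: nil)
    @parent = parent
    @value = value
  end

  # Return this node's own value if set, otherwise the nearest ancestor's.
  def effective_value
    node = self
    node = node.parent while node && node.value.nil?
    node && node.value
  end
end

root       = SketchNode.new
child      = SketchNode.new(parent: root, value: 12)
grandchild = SketchNode.new(parent: child)

grandchild.effective_value # 12, found on child
```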

We’ve found it helpful and writing a gem made this code much easier to test.  What code do you have that would be easier to test as a gem or would be useful to others?

Capistrano Deploys without Swap

I work on a Ruby on Rails application that is deployed with Capistrano 3 to Amazon Web Services.  We monitor our site performance with New Relic.  About a month ago, we noticed that our deploys were causing a delay in request processing and a drop in Apdex.

Here is an example of what we saw during a deploy.  The blue vertical lines are the deploy times and the green bar is how long a request spent waiting to be processed.

When we dug into it, we found that our servers were going into memory swap during the deploys.  When we deploy with Capistrano, a new Rails process is started to pre-compile the assets.  This pushes the memory usage over the physical memory limits.   Here is the key graphic from New Relic.  Note the swap usage just after 3:00 pm and the disk I/O at the same time.

You can see in the graph above how memory usage drops after the deploy, so the solution was pretty straightforward: restart the servers first.

We’re using Puma as our web server, so we added these lines to our deploy file.  This causes a phased-restart to be sent before the deploy, freeing memory and allowing the asset compilation to have enough memory to run without using swap.  Since Capistrano is based on Rake, it’s important to re-enable the phased-restart task after it has run; otherwise it will only be run once.
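
The re-enable step can be seen with plain Rake, outside Capistrano. This is a minimal sketch under assumptions: sketch_phased_restart is a hypothetical stand-in for the real Puma restart task, and the actual task name and deploy hook from our deploy file are not reproduced here.

```ruby
require "rake"

# Hypothetical stand-in for the real restart task; it only records each run.
RESTART_RUNS = []
restart = Rake::Task.define_task(:sketch_phased_restart) do
  RESTART_RUNS << Time.now
end

restart.invoke   # runs the task body
restart.invoke   # no-op: Rake remembers the task was already invoked
restart.reenable # clear the "already invoked" flag
restart.invoke   # runs the task body again

puts RESTART_RUNS.size # 2
```

Without the reenable call, the restart that normally happens at the end of the deploy would be skipped because the same task had already been invoked at the start.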

Now our deploys run without causing requests to be queued.  What tricks do you have for zero-impact deployments?

Labeling Rails Enums

Rails 4.1 introduced ActiveRecord Enums, a handy little feature that lets you store an integer in a database column, but use symbols or strings in your code.  Until today, I’ve been presenting these in select boxes with code like this:

This generates a perfectly acceptable select box if you don’t need internationalization and the enum names you’ve picked are good enough to present to the user.  If that’s not the case, or you want to change the presentation without changing the code, a new helper method leveraging Ruby’s internationalization (i18n) features can be a good approach.

Keep reading for how I changed the presentation of enumerations in a drop down without changing my models or renaming the enumeration names.


Here is how we can provide alternate labels for some of our drop down values.

First, create a new helper that will pass each enumeration name through i18n to perform a label conversion.

Second, change your form to call the helper.

Third, provide the translations.

Before translation:

After translation:

A few notes:
  • If the translation is missing, the code’s default is to fall back to the titleized text.  If you want to omit options where the translation is missing, you can modify the methods to call enum_to_translated_option with a default of blank, and the blank options will be removed from the select.
  • enum_to_translated_option is useful in a show template.  
  • It may be useful to sort the options after translation.  That is an easy addition to the helper.
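
The fallback, omit, and sort behaviors from these notes can be sketched in plain Ruby. This is an illustration under assumptions, not the helper from the post: SKETCH_TRANSLATIONS is a hypothetical hash standing in for a locale file and the i18n lookup, the status names are made up, and all method names are invented for the example.

```ruby
# Hypothetical translation table standing in for a locale file and i18n lookup.
SKETCH_TRANSLATIONS = {
  "in_progress" => "Being worked on",
  "wont_fix"    => "Closed without a fix"
}.freeze

# Rough equivalent of titleize: "on_hold" becomes "On Hold".
def sketch_titleize(name)
  name.to_s.split("_").map(&:capitalize).join(" ")
end

# Look up a label for one enum name, falling back to the titleized name.
def sketch_translated_option(name, default: sketch_titleize(name))
  SKETCH_TRANSLATIONS.fetch(name.to_s, default)
end

# Build [label, value] pairs for a select box, sorted by label.
# With omit_missing, names without a translation get a blank label and are dropped.
def sketch_translated_options(names, omit_missing: false)
  pairs = names.map do |name|
    label = omit_missing ? sketch_translated_option(name, default: "") : sketch_translated_option(name)
    [label, name.to_s]
  end
  pairs.reject { |label, _value| label.empty? }.sort_by(&:first)
end

sketch_translated_options(%w[open in_progress wont_fix])
# [["Being worked on", "in_progress"], ["Closed without a fix", "wont_fix"], ["Open", "open"]]
```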

Is this useful for you?  Is there a better way?  Let me know!

Removing milliseconds in JSON under ActiveRecord 4.0

Rails 4.0 introduced a small bug in JSON generation with this pull request. The output format for times (ActiveSupport::TimeWithZone) in JSON changed to include milliseconds. Sounds good, right? Well, not if your API clients crash trying to parse milliseconds. Unfortunately, Rails 4.0 didn’t provide a configuration option for the timestamp precision in JSON output. What is a programmer under the gun to do? Upgrade to Rails 4.1 or get out your monkey and your patch and get to work?


First, let’s see the bug in action. This test passes in Rails 3.2.15, but fails in Rails 4.0.4:

Simple enough. This is easier to see if you’ve got a Rails app:

Rails 4.1 adds a configuration option to default the time precision. In that version, you can set ActiveSupport::JSON::Encoding.time_precision = 0. However, if you are on Rails 4.0 for a bit, you can monkey patch ActiveSupport::TimeWithZone to go back to the Rails 3.2 implementation:
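
The difference between the two formats is easy to see with Ruby’s standard library alone. This is a small sketch, not the monkey patch itself: sketch_json_time is a hypothetical helper, and it only shows what a precision of 3 versus 0 does to an ISO 8601 timestamp.

```ruby
require "time"

# Format a time as ISO 8601 in UTC with the given number of fractional-second digits.
def sketch_json_time(time, precision)
  time.getutc.xmlschema(precision)
end

stamp = Time.utc(2014, 3, 12, 20, 59, 48, 314_000)

sketch_json_time(stamp, 3) # "2014-03-12T20:59:48.314Z" (with milliseconds)
sketch_json_time(stamp, 0) # "2014-03-12T20:59:48Z"     (without)
```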

Normally, it seems the Rails team does a better job than this. Fortunately, with Ruby we’ve got the openness to go back to the old behavior in a pinch.

Model is a poor scope name in Ruby on Rails 4

Upgrading from Rails 3.2 to Rails 4.0 is not a trivial task. Sure, there is a guide, but when you upgrade you’ll probably be upgrading a lot of your gems, maybe your jQuery and jQuery-UI versions, and then there are all the undocumented, unintentional changes that can cause you grief. If you’ve got a good test suite (you have one, right?) you’ll catch a lot of these, but some will leave you scratching your head. This one tripped me up for a while: we had a very simple scope stop working in Rails 4.

The scope model below works in Rails 3.2, but fails in Rails 4 when used with an association:


Here is the failure:

In the second query, post.comments.model.count, the query has lost the comments scope as well as the condition defined in our model scope, so it counts all comments in the system.

The fix is quite easy: just change the name of the scope to something like not_index.

This seems to be a change in associations that is most clearly illustrated by this:

I could never quite figure out where in Active Record this changed, but model was not a great choice for a scope name. This executable gist has everything in this post if you want to explore this more.
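
One plausible way a scope name can get lost like this is a name collision: if the association object has its own method called model, a scope with the same name is never reached. That is an assumption on my part, not something I confirmed in the Rails source. The plain-Ruby sketch below only illustrates the collision; SketchComment and SketchProxy are hypothetical stand-ins, not Active Record classes.

```ruby
# Hypothetical stand-in for a model class that registers named scopes.
class SketchComment
  ALL = [
    { post_id: 1, kind: "index" },
    { post_id: 1, kind: "reply" },
    { post_id: 2, kind: "reply" }
  ].freeze

  # Two scopes with identical conditions; only the names differ.
  SCOPES = {
    model:     ->(rows) { rows.reject { |row| row[:kind] == "index" } },
    not_index: ->(rows) { rows.reject { |row| row[:kind] == "index" } }
  }.freeze

  def self.count
    ALL.size
  end
end

# Hypothetical stand-in for an association proxy.
class SketchProxy
  def initialize(klass, rows)
    @klass = klass
    @rows = rows
  end

  # A built-in reader that happens to share its name with the scope.
  def model
    @klass
  end

  def count
    @rows.size
  end

  # Scopes are only reached when the proxy has no method of that name.
  def method_missing(name, *args, &block)
    scope = @klass::SCOPES[name]
    return super unless scope
    self.class.new(@klass, scope.call(@rows))
  end

  def respond_to_missing?(name, include_private = false)
    @klass::SCOPES.key?(name) || super
  end
end

def sketch_comments_for(post_id)
  rows = SketchComment::ALL.select { |row| row[:post_id] == post_id }
  SketchProxy.new(SketchComment, rows)
end

sketch_comments_for(1).not_index.count # 1: scoped to the post, condition applied
sketch_comments_for(1).model.count     # 3: returns the class, so every comment is counted
```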

What surprises have you found with Rails 4?

Rails: updating an association through nested attributes does not touch the owner

I was a little surprised to discover that when you update an association through nested attributes, it won’t touch the parent record.  It makes sense when you consider that Rails is optimizing by not writing records that have not changed, but if you’re using updated_at on the model for caching you may be surprised.

For instance:

With these models:

This code will not change updated_at on the post record:

As shown in the log:

The fix is very simple. Just touch the post record from the comment if you need updated_at to change on posts.
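
In Active Record, this is commonly done with the touch: true option on belongs_to. The sketch below does not use Active Record at all: TouchPost and TouchComment are hypothetical plain-Ruby stand-ins that only mimic the behavior, so you can see the owner’s timestamp move when a changed child is saved and stay put when nothing changed.

```ruby
# Plain-Ruby stand-ins, not Active Record models.
class TouchPost
  attr_reader :updated_at

  def initialize(now)
    @updated_at = now
  end

  def touch(now)
    @updated_at = now
  end
end

class TouchComment
  attr_reader :body, :updated_at

  def initialize(post, body, now)
    @post = post
    @body = body
    @updated_at = now
  end

  # Saving a changed comment also bumps the owning post's timestamp.
  def update(body, now)
    return false if body == @body # unchanged records are not written
    @body = body
    @updated_at = now
    @post.touch(now)
    true
  end
end

post    = TouchPost.new(Time.at(0))
comment = TouchComment.new(post, "first", Time.at(0))
comment.update("edited", Time.at(60))
post.updated_at == Time.at(60) # true
```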

And now our post is updated:

Of course there are other ways to do this…

The code for this is on GitHub.

Caching user records when using Authlogic

Today, I ran across another form of this issue with Authlogic. In short, every request in a Rails application using Authlogic with a User model that includes a last_request_at column will cause the updated_at column to change. This can break caching that is based on the last time a user was updated.

last_request_at is one of Authlogic’s magic columns and is useful if you want to track users that might be logged in. However, by including this column, every controller request will touch the user record and change updated_at.

If you add last_request_at to your User model, you’ll start seeing this in your application logs:

UPDATE "users" SET "last_request_at" = '2014-03-12 20:59:48.314863', "updated_at" = '2014-03-12 20:59:48.320481'

However, if you’ve added e-tags to a controller based on the updated_at timestamp, you’ll now have broken your cache control.  For instance, this will no longer work:

To fix caching based on the User’s updated_at timestamp with Authlogic, you can add your own timestamp and maintain it whenever the User record changes. Here is what I came up with:

  1. Add a new column, user_updated_at, to the user model
  2. Whenever a relevant column changes on user, set user_updated_at to Time.now
  3. Change the cache control to be based on user_updated_at instead of updated_at
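
The three steps can be sketched in plain Ruby without Rails or Authlogic. This is a rough illustration under assumptions: SketchUser is a hypothetical class, the RELEVANT column list is made up, and the etag method only stands in for whatever cache key the controller builds.

```ruby
require "digest"

class SketchUser
  # Hypothetical list of columns whose changes should invalidate caches.
  RELEVANT = %i[name email].freeze

  attr_reader :attributes, :updated_at, :user_updated_at

  def initialize(attributes, now)
    @attributes = attributes.dup
    @updated_at = now
    @user_updated_at = now
  end

  # Apply changes; updated_at always moves, user_updated_at only for relevant columns.
  def update(changes, now)
    changed = changes.select { |key, value| @attributes[key] != value }.keys
    return if changed.empty?
    @attributes.merge!(changes)
    @updated_at = now
    @user_updated_at = now if (changed & RELEVANT).any?
  end

  # Cache key built from the timestamp that ignores last_request_at churn.
  def etag
    Digest::MD5.hexdigest("user-#{@user_updated_at.to_f}")
  end
end

user = SketchUser.new({ name: "Ann", last_request_at: nil }, Time.at(0))
before = user.etag
user.update({ last_request_at: Time.at(10) }, Time.at(10))
user.etag == before # true: a request alone does not invalidate the cache
```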

Here is what that looks like in the User model:


And in the controller:


This made my controller request caching work as expected.  How would you handle this?

Find the most popular capitalization of a word or phrase

Today, I needed to find the most popular capitalization of user-entered product brands. Previously, I’d just used lowercase brands, something like this:

However, this was returning “whole foods” instead of “Whole Foods”, which was considered more desirable.

Using window functions and common table expressions, this is what I came up with:

Given this data:

Here is how it works:

The first part of the query gives us the total count for each brand, grouped by lower case, and for each unique capitalization of that brand provides a count and row number.

In this query, the rows with row_number == 1 are all the unique capitalizations.

Of those, we want the ones with the highest count. Again, we partition with another common table expression, this time ordering by the count per unique capitalization to ensure that the first row for each group is the most popular.

From here, we just pick the first row for each brand and get the most popular capitalization.
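
For comparison, the same selection logic can be written in plain Ruby. This is not the SQL from the post and it would not scale the way a database query does; it is a sketch that shows the steps (group by lowercase, count each capitalization, keep the most popular one). The method name and the sample brands are made up for illustration.

```ruby
# Map each lowercased brand to its most frequently used capitalization.
def most_popular_capitalizations(brands)
  brands.group_by(&:downcase).map do |lowered, variants|
    # Count how often each exact capitalization appears within the group.
    counts = variants.each_with_object(Hash.new(0)) { |variant, tally| tally[variant] += 1 }
    winner, = counts.max_by { |_variant, count| count }
    [lowered, winner]
  end.to_h
end

# Hypothetical sample data.
brands = ["Whole Foods", "whole foods", "Whole Foods", "WHOLE FOODS", "Acme", "ACME", "ACME"]

most_popular_capitalizations(brands)
# {"whole foods"=>"Whole Foods", "acme"=>"ACME"}
```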

Simple enough, eh?

Messy rails code demonstrating this and generating the tables in this post

More about window functions

More about common table expressions

Adding Process ID and timestamps to the Rails Logger

For the last few days, I’ve been trying to debug a race condition between several Delayed::Job workers.  After looking at log files for many hours, it became very frustrating to not know which one of my 3 workers was writing which statement.  I finally found this post which almost did what I wanted.

To add process ids and timestamps to your Rails logs, you can add this as a Rails initializer (config/initializers/log_formatting.rb for instance) and restart your application / workers:
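
The idea can be tried with Ruby’s standard Logger before touching a Rails app. This is a minimal sketch, not the initializer from this post: PidTimestampFormatter is a hypothetical class name, the output layout is my own choice, and assigning it to Rails.logger.formatter in an initializer is an assumption about how you would wire it in.

```ruby
require "logger"
require "stringio"

# Formatter that prefixes each line with a timestamp and the process ID.
class PidTimestampFormatter < Logger::Formatter
  def call(severity, time, _progname, msg)
    stamp = time.strftime("%Y-%m-%d %H:%M:%S.%L")
    "[#{stamp} ##{Process.pid}] #{severity}: #{msg2str(msg)}\n"
  end
end

# Demonstrate against an in-memory logger; in a Rails initializer the formatter
# would presumably be assigned to Rails.logger.formatter instead (assumption).
output = StringIO.new
logger = Logger.new(output)
logger.formatter = PidTimestampFormatter.new
logger.info("job 42 locked")

puts output.string
```

With several workers writing to the same file, the process ID in each line tells you which worker wrote it, and the millisecond timestamp helps order events that race each other.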

config/initializers/log_formatting.rb:

Before:

[Screenshot: log format before adding process IDs and timestamps]

After:

[Screenshot: log format after adding process IDs and timestamps]


How to write a simple Rails gem

Today, I wrote my first Ruby on Rails gem.  It was a very simple refactoring of our code that I undertook when I needed to add the same functionality to a new model and I decided to do it as a gem instead of keeping it within our project.  I wanted to see how it worked and this is what I ended up with.  You can follow the Rails Guide, but it didn’t cover everything I wanted to do.

Create a repository for your gem.

You can host it on GitHub, as I did for my gem, keep it local, or put it somewhere else.

Create a skeleton gem

The Rails guide for creating a plugin covers this, but the details in the guide are thin.

Rails 3.1 ships with a rails plugin new command which creates a skeleton for developing any kind of Rails extension with the ability to run integration tests using a dummy Rails application.

I used this to create a skeleton for my plugin:

This creates a skeleton gem with a dummy rails application you can use for testing your gem.

Switch to rspec

I’ve used rspec for all my rails testing and I wanted to use it with my new gem as well.  However, the generator creates a test-unit dummy application out of the box.  This StackOverflow question had a good answer that I used to switch from test-unit to rspec.

  1. Add rspec as a development dependency in your gemspec
  2. Run bundle install
  3. Convert from test-unit to rspec
  4. Modify spec_helper.rb with code taken from test_helper.rb
  5. Run the tests
  6. Commit the skeleton gem to source control
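
Step 1 can be sketched as a gemspec fragment. The gem name and metadata are hypothetical, and choosing rspec-rails rather than plain rspec is an assumption; use whichever matches your gem.

```ruby
require "rubygems"

# Hypothetical gem name and metadata; replace with your own.
SKETCH_SPEC = Gem::Specification.new do |spec|
  spec.name    = "my_sketch_gem"
  spec.version = "0.0.1"
  spec.summary = "Example gemspec for the rspec switch"
  spec.authors = ["Example Author"]
  spec.files   = []

  # rspec-rails is an assumption; the steps above only say "rspec".
  spec.add_development_dependency "rspec-rails"
end
```

A development dependency is installed when you work on the gem itself but is not pulled into applications that use it.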

Author your gem

At this point, you have a skeleton gem that you can use to write your code.  The gem I wrote added some behavior to ActiveRecord models, so I started out by generating some models in the dummy rails application located in spec/dummy and using test-driven development to build my gem.  

Try it with a real project

The tests you author along with your gem are very helpful, but a time will come when you want to try your gem on your local file system with a real project.  You can include a gem from the local file system with this in your Gemfile:
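
A local gem is typically referenced with Bundler’s path option. The gem name and relative path below are hypothetical placeholders, not the ones from my project.

```ruby
# Gemfile of the real project. The gem name and relative path are hypothetical.
gem "my_sketch_gem", path: "../my_sketch_gem"
```

Remember to switch back to a released version or a git source before you commit the Gemfile, since the path only exists on your machine.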


Squash your commits

When you’ve got your gem working and you are ready to publish it, you may want to squash all your commits to the repository into a single commit.

Take a look at http://gitready.com/advanced/2009/02/10/squashing-commits-with-rebase.html or search the web on how to squash multiple commits to a single commit, or try it with git rebase -i