Rails memory usage testing

Posted by Janos on March 11, 2010

We need to write a test to detect possible memory leaks in some Rails code. The test itself is written in Ruby.

Testing GETs

I simply put the URLs in an array:

uris = %w(/admin
/users)

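Then I make GET calls using the Net::HTTP API that comes with Ruby. Net::HTTP.get needs a full URI, so the host is prepended here:

require 'net/http'

uris.each do |uri|
  Net::HTTP.get(URI.parse("http://localhost:3000#{uri}"))
end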

To determine the virtual memory used by Rails, execute:

memory_check = `ps -eo vsz,args | grep script/server`
m = memory_check.match(/(\d+)/)
virtual = m[0].to_i

If you don’t manually start the server, replace the string after grep with
the name of your process.
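
One caveat with the pipeline above: the grep process itself has script/server in its arguments, so it can show up in the ps output and its number may be the one that gets matched. A minimal sketch that does the filtering in Ruby instead is shown here. The helper name vsz_for and the sample ps output are made up for illustration, and the pattern assumes the server was started as "ruby script/server".

```ruby
# Return the VSZ (in KB) of the first process whose command line matches
# pattern, or nil when no process matches.
def vsz_for(ps_output, pattern)
  ps_output.each_line do |line|
    vsz, args = line.strip.split(/\s+/, 2)
    next unless vsz =~ /\A\d+\z/        # skips the "VSZ COMMAND" header line
    next unless args && args =~ pattern
    return vsz.to_i
  end
  nil
end

# Made-up sample of what `ps -eo vsz,args` might print.
SAMPLE_PS = <<EOS
   VSZ COMMAND
  4120 /sbin/init
182340 ruby script/server
  3044 grep script/server
EOS

# Anchoring the pattern at the start of the command keeps the grep line out.
virtual = vsz_for(SAMPLE_PS, /\Aruby\b.*script\/server/)
puts virtual
```

Against a live system, the first argument would be the output of the ps command without the grep part.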

Testing POSTs

Operations that modify data are usually more involved to set up because the POST
data is sent in the body of the request and also because there may be constraints
in the database. We, therefore, modify the data structure to include the POST data
separately from the URI:

uris = ["/admin",
        ["/admin/users", {"client_id" => "387", "email"=>"test@mycompany.com", "permissions_select"=>"standard" },
        Proc.new { |hash| hash["email"] = "test#{Time.now.to_i.to_s}@mycompany.com"} ]]

A POST entry is now an Array itself. Its first element is still the URI. The second element contains
the submit data, Rails style. Finally, we specify a Proc object to modify the data
in a way that lets us send the request multiple times without getting constraint or
validation errors. The example Proc above makes each email unique by using
the current time to build it.
Then in the code we inspect the type of each entry to detect whether it’s a GET
or a POST:

uris.each do |uri|
  post = false
  case uri
    when String
      full_uri = uri
    when Array
      # POST with a hash of values
      full_uri = uri[0]
      post = true
  end
  # test code here
end

Before submitting, we check the post flag and act accordingly. We invoke the supplied
Proc object passing in the Hash containing the POST data for modification:

url = URI.parse full_uri
if post
  # customize the POST data before each submission
  uri[2].call(uri[1])
  Net::HTTP.post_form(url, uri[1])
else
  Net::HTTP.get(url)
end
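
The GET/POST detection can also be tried out without a running server. The following is a small sketch, not the code used in the test: describe_request is a hypothetical helper, and the customizer uses a counter rather than the current time, on the assumption that several hits may land within the same second (Time.now.to_i only changes once per second).

```ruby
# Turn one entry of the uris list into a small request description.
def describe_request(entry)
  case entry
  when String
    { :method => :get, :path => entry }
  when Array
    path, form_data, customizer = entry
    customizer.call(form_data) if customizer   # mutates the POST data in place
    { :method => :post, :path => path, :data => form_data }
  else
    raise ArgumentError, "unsupported entry: #{entry.inspect}"
  end
end

counter = 0
entries = ["/admin",
           ["/admin/users", { "email" => "test@mycompany.com" },
            Proc.new { |hash| hash["email"] = "test#{counter += 1}@mycompany.com" }]]

requests = entries.map { |entry| describe_request(entry) }
requests.each { |request| puts request.inspect }
```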

Hitting each URL multiple times

The memory test is only realistic with many hits. For this, we introduce a variable
called @how_many into our mini-framework that the caller can set.

require 'net/http'

uris = [ "/admin" ]
module Net
  class HTTP
    class << self

      attr_accessor :prefix
      attr_accessor :how_many

      def fetch(uris)
        data = []

        uris.each do |uri|
          post = false
          case uri
            when String
              full_uri = prefix + uri
            when Array
              # POST with a hash of values
              full_uri = prefix + uri[0]
              post = true
          end

          url = URI.parse full_uri
          @how_many.times do
            Net::HTTP.get(url) if !post
            if post
              # let's customize
              uri[2].call(uri[1])
              Net::HTTP.post_form(url, uri[1])
            end

            memory_check = `ps -eo vsz,args | grep script/server`
            m = memory_check.match(/(\d+)/)
            virtual = m[0].to_i
          end
        end
      end
    end
  end
end
Net::HTTP.how_many = 50
Net::HTTP.prefix = 'http://localhost:3000'
Net::HTTP.fetch(uris)

We’ve also added a prefix to the URI so that we can test against any host/port
combination.

Data collection and graph

We use an array to store the memory used after each hit.

def fetch(uris)
  data = []
  # processing goes here
  memory_check = `ps -eo vsz,args | grep script/server`
  m = memory_check.match(/(\d+)/)
  virtual = m[0].to_i
  data << virtual
  # more processing
end

Once we have looped through all URIs and hit them as many times as specified, we have an array that we
can use to draw a graph. gruff is well-suited for this purpose. First we need to
install ImageMagick, then we implement a module to encapsulate the graphing code:

require 'gruff'

module Net
  module Grapher

    def draw_graph(title, url_count, times, data)
      puts 'drawing graph!'
      g = Gruff::Line.new
      g.title = title

      g.data("#{url_count.to_s}/#{times.to_s}", data)

      g.labels = {0 => '0', 40 => '40', data.size => "#{data.size}"}

      file_name = "#{title.gsub(/\//, '')}_#{url_count.to_s}_#{times.to_s}.png"
      File.delete(file_name) if File.exist? file_name
      g.write(file_name)
    end
  end
  class HTTP
    extend Grapher
    # more code here
  end
end

The draw_graph method is available now. It takes a title and some statistical information.
The last argument is the Array containing the data. Since the number of hits
and the number of URIs are important for evaluating the graph, we use that information
to construct the file name:

file_name = "#{title.gsub(/\//, '')}_#{url_count.to_s}_#{times.to_s}.png"

Running the test

Let’s repeat the code in its entirety:

#!/usr/local/bin/ruby
require 'net/http'
require 'gruff'

uris = ["/admin",
        ["/admin/projects", {"admin_project[change_id]"=>"0",
                              "admin_project[new_project_support_contact_attributes][][work_phone]"=>"",
                              "admin_project[new_project_support_contact_attributes][][last_name]"=>"",
                              "admin_project[new_project_support_contact_attributes][][first_name]"=>"",
                              "admin_project[new_project_support_contact_attributes][][mobile_phone]"=>"209-346-4294",
                              "admin_project[client_id]"=>"387",
                              "admin_project[description]"=>"",
                              "admin_project[project_name]"=>"This is test4448",
                              "commit"=>"Save",
                              "client_id"=>"387"}, Proc.new { |hash| hash["admin_project[project_name]"] = "This is test#{Time.now.to_i.to_s}" } ]
]

module Net
  module Grapher

    def draw_graph(title, url_count, times, data)
      puts 'drawing graph!'
      g = Gruff::Line.new
      g.title = title

      g.data("#{url_count.to_s}/#{times.to_s}", data)

      g.labels = {0 => '0', 40 => '40', data.size => "#{data.size}"}

      file_name = "#{title.gsub(/\//, '')}_#{url_count.to_s}_#{times.to_s}.png"
      File.delete(file_name) if File.exist? file_name
      g.write(file_name)
    end
  end
  class HTTP
    extend Grapher
    class << self

      attr_accessor :prefix
      attr_accessor :how_many

      def fetch(uris)
        data = []

        uris.each do |uri|
          post = false
          case uri
            when String
              full_uri = prefix + uri
            when Array
              # POST with a hash of values
              full_uri = prefix + uri[0]
              post = true
          end

          url = URI.parse full_uri
          @how_many.times do
            Net::HTTP.get(url) if !post
            if post
              # let's customize
              uri[2].call(uri[1])
              Net::HTTP.post_form(url, uri[1])
            end

            memory_check = `ps -eo vsz,args | grep script/server`
            m = memory_check.match(/(\d+)/)
            virtual = m[0].to_i
            data << virtual
          end
        end

        draw_graph(uris[0], uris.size, @how_many, data)
      end
    end
  end
end

Net::HTTP.how_many = 50
Net::HTTP.prefix = 'http://localhost:3000'
Net::HTTP.fetch(uris)

Make sure the file has the x flag set so it’s executable. If we save the code to
a file called test_me.rb and run ./test_me.rb, we get a nice graph like this:

Evaluation

This test was run on a Linux machine with 4 GB of memory. We hit 5 URLs, 100 times each. The difference between the highest
and lowest memory usage is less than 4K, so we see a nice garbage collection pattern.
No memory leak so far!
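
The same evaluation can be done on the collected numbers without looking at the graph. This is only a rough sketch: summarize is a hypothetical helper, the sample values are invented, and comparing the average of the second half of the samples with the first half is one simple heuristic for spotting steady growth, not a definitive leak detector.

```ruby
# Summarize memory samples: min, max, spread, and the drift between the
# average of the second half and the average of the first half.
def summarize(samples)
  raise ArgumentError, "need at least two samples" if samples.size < 2
  half = samples.size / 2
  mean = lambda { |xs| xs.inject(0) { |sum, x| sum + x }.to_f / xs.size }
  { :min    => samples.min,
    :max    => samples.max,
    :spread => samples.max - samples.min,
    :drift  => mean.call(samples[half..-1]) - mean.call(samples[0...half]) }
end

steady  = summarize([1000, 1004, 1001, 1003])   # invented numbers
growing = summarize([1000, 1010, 1020, 1030])   # invented numbers
puts steady.inspect
puts growing.inspect
```

A small spread with a drift near zero matches the garbage collection pattern described above, while a drift that keeps growing with the number of hits would be a reason to look closer.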

How to write a test for a Rails filter

Posted by Janos on March 05, 2010

How do we test a filter in ApplicationController? We want to bypass all the logic in the controllers that subclass it,
but we still want a realistic test with a request and a response. Such tests are
called functional tests in the Rails world.

Setting up the test

Our ApplicationController that has the filter is typical, with a before_filter:

class ApplicationController < ActionController::Base
  before_filter :check_logged_on
  # ...
end

We create a test in test/functional/application_controller_test.rb:

require File.dirname(__FILE__) + '/../test_helper'

class FooControllerTest < ActionController::TestCase
  def setup
  end

  def test_before_filter
  end
end

We do need a real controller for a real test, but it cannot be any of the existing ones. So let’s create a controller on the fly:

class FooController < ApplicationController
  def index
    render :text => "index called"
  end
end

For simplicity, we just render text, so we don’t have to create a view.
Now we can write the setup method:

class FooControllerTest < ActionController::TestCase
  def setup
    @controller = FooController.new
    @request    = ActionController::TestRequest.new
    @response   = ActionController::TestResponse.new
    ActionController::Routing::Routes.draw do |map|
      map.resources :foo
    end
  end
end

We needed to add the routing code so that we can hit the controller. Now we are ready to test something.
Let’s say the filter renders an error if the user is not logged in:

class ApplicationController < ActionController::Base
  def check_logged_on
    unless @logged_on
      render :status => 500
    end
  end
  # more code here
end

Of course, in the real world, this method would be more complicated, but this example illustrates the point. Our first test looks like this:

class FooControllerTest < ActionController::TestCase
  def setup
    @controller = FooController.new
    @request    = ActionController::TestRequest.new
    @response   = ActionController::TestResponse.new
    ActionController::Routing::Routes.draw do |map|
      map.resources :foo
    end
  end

  def test_bypass
    @controller.logged_on = true
    get :index
    assert_response :success
  end
end

For this to work, we need an accessor in FooController:

class FooController < ApplicationController
  prepend_before_filter :check_logged_on
  attr_accessor :logged_on

  def index
    render :text => "index called"
  end
end

If you run this, the test passes because FooController#index has been called and the default status code is 200, which maps to :success.
The second test will run the full code in the filter:

def test_render_error
  @controller.logged_on = false
  get :index
  assert_response :error
end

In this case the code under unless will execute and FooController#index will not be called. The returned status code will correspond to :error.

Running the test

To run the test, type the following from the command line and you should get no errors:

janos@janos-laptop:/work/myapp$ rake test:functionals TEST=test/functional/application_controller_test.rb
(in /work/myapp)
Loaded suite /usr/lib/ruby/gems/1.8/gems/rake-0.8.7/lib/rake/rake_test_loader
Started
..
Finished in 0.23019 seconds.

2 tests, 2 assertions, 0 failures, 0 errors

Simple caching in Rails

Posted by Janos on February 04, 2010

We have seen the documentation on Rails caching. The real question about caching, then, is where to put the code. Caching is a crosscutting concern: it can occur anywhere in the code, because anything can be cached. It does not mix well with other modules, so caching code should stay separate.

Almost all queries end up in ActiveRecord::Base. There are also first and last, but let’s focus on find for now. These queries go through Base.find, a class method.
We need to alias it so we can do more. The new method will still be called find, but first the old method is renamed to ‘old_find’.

alias old_find find

Callers will not know about the new find method, and that’s the right thing to do. When it’s called, it checks the cache to see whether the requested item is there. If it is, it returns the value and never calls the original find. On a miss, the new find method calls old_find. If old_find returns a value, it caches that value.

value=Rails.cache.read(key)
if !value
  value=old_find(*args)
  Rails.cache.write(key, value, :expires_in => 1.hours) if value
end
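
The alias-and-wrap idea can be tried outside Rails. The sketch below is plain Ruby under stated assumptions: Finder is a made-up class standing in for a model, the CACHE Hash stands in for Rails.cache (so there is no expiry), and the key format is just a placeholder.

```ruby
# A made-up class with an "expensive" finder that counts its own calls.
class Finder
  class << self
    attr_reader :calls

    def find(id)
      @calls = (@calls || 0) + 1
      "record-#{id}"
    end
  end
end

CACHE = {}   # stand-in for Rails.cache

class Finder
  class << self
    alias old_find find          # keep the original under a new name

    def find(*args)
      key = "#{name}:#{args.inspect}"   # placeholder key format
      CACHE.fetch(key) do
        value = old_find(*args)         # cache miss: call the original
        CACHE[key] = value if value
        value
      end
    end
  end
end

first  = Finder.find(94)
second = Finder.find(94)
puts first, second, Finder.calls
```

The second call is served from the Hash, so the original finder runs only once.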

How do I generate the key? If we assume that all queries return different results, then one way to go is to use the method arguments. I will write a method like this:

def generate_key(*args)
    key="#{self.name}#{args.join}"
    key
end

I made it two lines for illustration so that you can put a breakpoint there and inspect the value in the debugger. I also use the name of the class since it’s possible to have a Video with id=94 and the Customer with id=94.
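
One thing to watch with joining the arguments is that different argument lists can produce the same string. The sketch below illustrates this in plain Ruby. Both helper names are hypothetical, and using inspect is only one possible way to keep the arguments apart.

```ruby
# Key built the same way as generate_key: class name plus joined arguments.
def joined_key(class_name, *args)
  "#{class_name}#{args.join}"
end

# Alternative sketch: inspect keeps the boundaries between arguments.
def inspected_key(class_name, *args)
  "#{class_name}:#{args.inspect}"
end

puts joined_key("Video", 1, 23)
puts joined_key("Video", 12, 3)
puts inspected_key("Video", 1, 23)
puts inspected_key("Video", 12, 3)
```

With join, find(1, 23) and find(12, 3) would share a cache entry, while the inspect-based keys stay distinct.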

Preloading

You might be out of luck when referencing a cached object’s collections. This is because the connection to the database may be closed since the ‘parent’ object was cached. So if your videos have actors, for example, you might just want to call actors on your video to force the proxy to load the actors. Since finders either return a single object or multiple objects in an Array, I need to handle that:

def preload(result)
  if result.class==Video
    result.actors
    result.images
  elsif result.class==Array
    result.each do |video|
      video.actors
      video.images
    end
  end
end

Notice how you can do this for multiple collections.
Let’s put this all together:

if ENV['RAILS_ENV']=='development'
  RAILS_DEFAULT_LOGGER.debug "environment.rb: Enabling finder interceptor for Videos."
  class ActiveRecord::Base
     class << self
        alias old_find find
        def find(*args)
          begin
            #if defined?(@cache_finds) and @cache_finds
            if self.name=='Video'
              key=generate_key(*args)
              value=Rails.cache.read(key)
              if value
                logger.debug ">>>>>>>>> Returning "<<key<<" from cache."
              end
              if !value
                value=old_find(*args)
                preload value
                Rails.cache.write(key, value, :expires_in => 1.hours) if value
              end
            else
              value=old_find(*args)
            end
            value
          rescue Exception => e
              logger.error e
          end
        end

        def generate_key(*args)
          key="#{self.name}#{args.join}"
          key
        end

        def preload(result)
          if result.class==Video
            result.actors
            result.images
          elsif result.class==Array
            result.each do |video|
              video.actors
              video.images
            end
          end
        end
      end
  end
end

Put this code in an appropriate location in Rails.
The beauty of the solution is that it keeps all other parts of the code clean – it is non-invasive. You can remove it, without affecting other places in the code. You can use any caching technology such as Redis, if you like. Or just use the MemoryStore that comes with Rails.

Ajax Pagination

Posted by Janos on January 16, 2010

It’s often necessary to paginate in Ajax applications. For example, let’s say we have a tool for editors that lets them search for videos. The search is performed in one area, and in another area the user selects the page he wants to populate. Then he can just drag the chosen videos over the selected page. A full-page reload on pagination would be a problem, because the user would have to find the page he wants to edit again. We need to be able to paginate within the search results while leaving the rest of the page intact.

I did not want to implement pagination on the back end myself, so I took advantage of the mislav-will_paginate gem.
The first task was to capture the pagination links. We render the links into a local variable like this:

@videos=Video.paginate :page=>params[:page]
pagination=ActionView::InlineTemplate.new("<%= will_paginate @videos %>", nil).render(@template, {})

The code to query the database would typically be more complicated. There would be chained named scopes, for example, and the call to ‘paginate’ would come last:

@videos=Video.most_popular.paginate :page=>params[:page]

We use JSON to communicate between the browser and the server. The video results are in an array generated by ActiveRecord. Let’s add the pagination string in the first position in the array:

@videos.unshift pagination # let's stick the pagination information in the first position
render({ :json => @videos.to_json(..) })

This does not disrupt the JSON encoding: the links stay a string while the rest of the array’s members get transformed according to the rules given in the to_json method.
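
The round trip can be checked in plain Ruby with the json library. This is only an illustration: the pagination markup and the video hashes are invented stand-ins for what will_paginate and ActiveRecord would produce.

```ruby
require 'json'

# Invented stand-ins for the rendered links and the query results.
pagination = '<div class="pagination"><a href="/videos?page=2">2</a></div>'
videos = [{ "id" => 94, "title" => "First" },
          { "id" => 95, "title" => "Second" }]

# Put the links in the first position, then encode the whole array.
payload = videos.dup.unshift(pagination).to_json

# What the client side does: take the first element off again.
decoded = JSON.parse(payload)
links = decoded.shift

puts links
puts decoded.inspect
```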

On the JavaScript side, assuming that I have a div somewhere that looks like this:

<div id="pagination">
</div>

I can take the pagination information and insert it into the div:

function onSearchResultsReady(results) {
  var pagination=results.shift();
  //processing here
  // ..
  // replace the old links so they do not pile up on every page change
  $('#pagination').html(pagination);
  ajaxifyLinks();
}

The final step is to prevent the browser from reloading the page so that it instead makes an Ajax call when the user clicks on a page link. Notice the call to ajaxifyLinks. That function overrides the links’ default behavior:

function ajaxifyLinks() {
  $('#pagination a').click(function(){
    $.getJSON(this.href,
      {},
      onSearchResultsReady);
    return false; // don't follow the link
  });
}

With a minimal amount of work, we made the gem do all the heavy lifting and got our ajaxified pagination as well:

[Screenshot: video_editor]

Passenger on Ubuntu

Posted by Janos on November 26, 2009

These notes describe how to install Passenger in a typical environment.
First get the gem:

sudo gem install  passenger

Run the installer, following its instructions:

sudo apt-get install apache2-prefork-dev
sudo passenger-install-apache2-module

The installer will now build and install the Apache module.

Once that’s done, add the snippet given by the installer to /etc/apache2/apache2.conf. For example, it might look like this:

LoadModule passenger_module /usr/lib/ruby/gems/1.8/gems/passenger-2.2.7/ext/apache2/mod_passenger.so
PassengerRoot /usr/lib/ruby/gems/1.8/gems/passenger-2.2.7
PassengerRuby /usr/bin/ruby1.8

Let’s say we want to make the application available on port 3000. Add the port to /etc/apache2/ports.conf:

NameVirtualHost *:3000
Listen 3000

Create a file in /etc/apache2/sites-available choosing any name for it. In this example, let’s call it rails:

<VirtualHost *:3000>
  ServerName myserver.thruhere.net:3000
  CustomLog /var/www/ruby/log/access_log common
  ErrorLog /var/www/ruby/log/error_log
  DocumentRoot /somewhere/myrailsapp/public
</VirtualHost>

Notice how the port matches the entry in ports.conf. Enable the site with sudo a2ensite rails.
Now go to /somewhere/ and type rails myrailsapp. Once the application is generated, restart Apache:

sudo /etc/init.d/apache2 restart

Now if you go to the url specified in the file rails, you should see the Rails welcome page:


http://myserver.thruhere.net:3000

First looks at RubyMine

Posted by Janos on April 15, 2009

I paid the $50 to get the beta version of RubyMine 1.0. (It will be $99 or so when it’s officially out.) It took me a little while to customize the fonts and colors so that the IDE does not look ugly. You might have to download fonts, especially on Ubuntu, until you find one that looks good in RubyMine.
I exported the preferences to a jar file so that I can share the settings between my MacBook Pro and Ubuntu machines.
The IDE seems a little sluggish.
The integration with Git is good. It’s easier to use than the command line Git commands, especially if you are getting started with Git. If you want to learn Git, you can look at the command-line commands RubyMine executes.
I like the ability to debug rake tasks.
The biggest selling point is code completion and code lookup. I can hold down the Control and Command keys over an expression and RubyMine navigates to the definition. If there is more than one match, you can select which one you want to go to.
Overall, RubyMine is a good tool for big tasks and exploration. For routine coding and for experienced Rubyists, the ‘favorite’ text editor will suffice.

[Screenshot: rubymine]

Install Ruby and Rails on Ubuntu 9.10

Posted by Janos on November 26, 2008

The following steps are required to install Ruby and Rails on a clean Ubuntu server. Your environment may be different, so some steps may be unnecessary or may already have been performed.

sudo apt-get install ruby1.8
sudo apt-get install irb1.8
sudo apt-get install rdoc

Add a symbolic link to each binary you are using:

sudo ln -s /usr/bin/ruby1.8 /usr/bin/ruby
sudo ln -s /usr/bin/irb1.8 /usr/bin/irb

From the /tmp directory, execute this:

wget http://rubyforge.org/frs/download.php/60718/rubygems-1.3.5.tgz
tar xvzf rubygems-1.3.5.tgz
cd rubygems-1.3.5/
sudo ruby setup.rb

Before we can start installing the gems, we need to get the following libraries to ensure that gems that build native code can find their dependencies:

sudo apt-get install build-essential libopenssl-ruby libmysql-ruby libplrpc-perl libreadline-ruby1.8 libxslt1-dev

To access MySQL from Ruby, run the following:

sudo apt-get install libmysqlclient-dev

Now the gem should build without errors:

sudo gem install mysql --no-rdoc --no-ri

Install the Rails gems:

sudo gem install rails --no-rdoc --no-ri

This step may take a few minutes.

It is just a matter of time before you’ll need a gem from GitHub, so you might as well add it to the gem sources:

sudo gem sources -a http://gems.github.com

As a sanity check, run ‘gem environment’ and you should see github listed in the remote sources.
To see your gems, list the installation directory given from gem environment and the Rails and mysql gems should be there:

ls -la /usr/lib/ruby/gems/1.8/gems

If you are like me, you will test your code, so install RSpec and Cucumber. Any serious test suite also needs a way to build test data. Factory Girl is a very good library for that. So you end up with three steps:

sudo gem install rspec-rails --no-rdoc --no-ri
sudo gem install cucumber --no-rdoc --no-ri
sudo gem install thoughtbot-factory_girl --no-rdoc --no-ri

A useful gem for background jobs is bj. You can install it too:

sudo gem install bj --no-rdoc --no-ri

The proof of the pudding is in the eating, so cd to /tmp and run:

rails myapp

cd to myapp and start the server. Since we did not install Mongrel or Passenger, the server will start with WEBrick:

./script/server
=> Booting WEBrick
=> Rails 2.3.4 application starting on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
[2008-11-26 14:44:15] INFO  WEBrick 1.3.1
[2008-11-26 14:44:15] INFO  ruby 1.8.7 (2009-06-12) [x86_64-linux]
[2008-11-26 14:44:15] INFO  WEBrick::HTTPServer#start: pid=4940 port=3000

If you want the least amount of work to verify this, make sure you have curl

sudo apt-get install curl

and run

curl http://localhost:3000

The raw html from the response will appear in your console.
Or, you can direct your browser to http://localhost:3000 and you should see the Rails welcome page. Enjoy!

Object-oriented code with closures

Posted by Janos on January 09, 2008

We want to write a program that prints the day of the week based on an index. To find out what day was yesterday, we pass -1 to the program, to find out what day it will be the day after tomorrow, we give 2. You get the idea. For simplicity’s sake let’s say that today is Friday. (In a real program you would use a Ruby Time class, etc. to know the days.) Our first attempt is like this:

class Calendar

    def print_day(index)
      h=calculate_days
      h[index]
    end

  private

    def calculate_days
      h=Hash[-3, 'Tue',
      -2, 'Wed',
      -1, 'Thu',
      0, 'Fri',
      1, 'Sat',
      2, 'Sun',
      3, 'Mon']
    end
end

c=Calendar.new
puts c.print_day(ARGV[0].to_i).to_s

If you save this snippet to a file called closure.rb and execute it passing the argument -2, you will see:

ruby closure.rb -2
Wed

The main problem with the code is that the calculation depends on some initial condition (what’s today) which is evaluated every time print_day is invoked. This is not very performant, since h is built again and again. The program can be rewritten like this:

class Calendar
    def initialize
      # some calendar function populates the hash
      h=calculate_days
      @day_calculator=Proc.new { |index| h[index] }
    end

    def print_day(index)
      @day_calculator.call(index)
    end

  private
    def calculate_days
      h=Hash[-3, 'Tue',
      -2, 'Wed',
      -1, 'Thu',
      0, 'Fri',
      1, 'Sat',
      2, 'Sun',
      3, 'Mon']
    end
end

c=Calendar.new
puts c.print_day(ARGV[0].to_i).to_s

Let’s save this code to a file called closure2.rb and run it with the argument -2:

ruby closure2.rb -2
Wed

We get the same result. In the initialize method, the code determines the day names corresponding to the indexes and creates a Proc object. The Proc object keeps a reference to the local variable h created by calculate_days. The Proc object is then assigned to a member variable. Since @day_calculator points to h, it is said to be enclosing it – it is a closure. h will not be garbage collected as long as the Calendar instance is in memory, because its member references it.
Now every time print_day is invoked, the @day_calculator uses the same unchanging object to look up the day.
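
To see that the hash really is built only once, the lookup table construction can be counted. This is a reduced sketch of the class above: CountingCalendar, build_days and the builds counter are names made up for the illustration, and the table is shortened to three days.

```ruby
class CountingCalendar
  attr_reader :builds

  def initialize
    @builds = 0
    days = build_days                                    # built exactly once
    @day_calculator = Proc.new { |index| days[index] }   # closes over days
  end

  def print_day(index)
    @day_calculator.call(index)
  end

  private

  def build_days
    @builds += 1
    { -1 => 'Thu', 0 => 'Fri', 1 => 'Sat' }
  end
end

calendar = CountingCalendar.new
results = [-1, 0, 1].map { |index| calendar.print_day(index) }
puts results.inspect
puts calendar.builds
```

However many times print_day is called, the counter stays at one, because the Proc keeps using the local variable it captured in initialize.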

As classes inevitably grow, this kind of refactoring is necessary for performance reasons. From an object-oriented standpoint, the code using closures is cleaner, because it does not store the initial calculations in member variables, but builds functions using those values. The class concerns itself with the creation of the right methods. This style borrows from functional programming.

On multiprocessor architectures, closures can be sent out to another processor for execution. In a certain way, they enable a specific kind of optimization.