Madking's Musings: May 2011

Monday, May 23, 2011

Moore's Law Doesn't Catch Imagination

Back in 1992 I had an opportunity to use an iPSC supercomputer. If I recall correctly, the particular computer I used had a total of 192 MB of RAM. This was an outrageously huge amount at the time, enough that I didn't pay much attention to how much memory my program was using. When it was suggested that my program might be failing because it used too much memory, I was incredulous. But upon further inspection, sure enough, I was trying to allocate arrays requiring a few hundred megabytes of memory.

Using a generalization of Moore's Law that says that processing power should double every 18 months (I know, that's not the real law), and given that approximately twelve 18-month periods have passed since then, computers should be roughly 4,000 times as powerful (2^12 = 4,096). And sure enough, you can find supercomputers with terabytes of RAM, and the laptop I just bought has 8 GB of RAM, which is about 4,000 times what was common then. I am actually kind of amazed at how accurate this rule of thumb is.
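As a rough check of that arithmetic, here is a small Ruby sketch. The method name growth_factor and the choice to count only whole 18-month periods are my own assumptions for illustration; it models the rule of thumb above, not the real Moore's Law.

```ruby
# Rule-of-thumb estimate: capacity doubles once per doubling period.
# growth_factor is a hypothetical helper name, not part of any library.
def growth_factor(start_year, end_year, months_per_doubling = 18)
  elapsed_months = (end_year - start_year) * 12
  whole_periods = elapsed_months / months_per_doubling # integer division drops the partial period
  2**whole_periods
end

factor = growth_factor(1992, 2011)   # 228 months => 12 whole periods
ram_then_mb = (8 * 1024) / factor    # an 8 GB laptop scaled back to 1992, in MB
puts "growth factor: #{factor}, equivalent 1992 RAM: #{ram_then_mb} MB"
```

Twelve doublings give a factor of 4,096, which puts the 1992 equivalent of an 8 GB laptop at about 2 MB.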

However, my programming goals apparently grow just as fast. Just the other day, as I was practicing for programming contests, I tried to allocate a 4 terabyte array. Needless to say, my program failed. And for some reason, instead of just crashing, it ground my computer to a halt, eventually requiring a reboot. I suspect in another two decades I'll be crashing programs by trying to allocate exabyte arrays.

Conclusion
For many business or web apps, network or database latency is going to be your biggest bottleneck. Because of this we often treat memory as infinite and processing time as zero. The emphasis on writing readable, maintainable code rather than the most efficient code is typically the right trade-off. However, memory isn't infinite, and processing time isn't zero. Before coding a module that requires pulling your entire database into memory, think about whether this is a couple of megabytes of memory or hundreds of terabytes. Modern computers are powerful, but they still aren't as powerful as we'd all like.
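A back-of-the-envelope check like the one below would have saved me a reboot. It is only a sketch: the method names, the 4-byte element size, and the 8 GB budget are assumptions chosen for illustration, and a real program's footprint will be larger than the raw array size.

```ruby
BYTES_PER_GB = 1024**3

# Raw bytes needed for a dense array with the given dimensions.
def array_bytes(dimensions, bytes_per_element)
  dimensions.reduce(1) { |product, size| product * size } * bytes_per_element
end

# True when the raw array size is within the given memory budget.
def fits_in_memory?(dimensions, bytes_per_element, budget_bytes)
  array_bytes(dimensions, bytes_per_element) <= budget_bytes
end

budget = 8 * BYTES_PER_GB # e.g. a laptop with 8 GB of RAM

# A 1,000 x 1,000 grid of 4-byte ints is about 4 MB.
puts fits_in_memory?([1_000, 1_000], 4, budget)

# A 1,000,000 x 1,000,000 grid of 4-byte ints is about 4 TB.
puts fits_in_memory?([1_000_000, 1_000_000], 4, budget)
```

The first call reports that the array fits and the second that it does not, which is the whole point: the multiplication takes seconds, the failed allocation took my machine down.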

Monday, May 16, 2011

Rails form_for_hash

Ruby on Rails has some nice form building helpers if you are building an HTML form for populating an ActiveRecord model. But what if you want your front end forms to not exactly map to your back end models? Maybe you are presenting a logical view that corresponds to parts of different models. Well, Rails still provides helpers that let you roll your own form. However, now you have to manually handle reading and writing values to the form. It would be nice if you could get the form_for functionality with just a simple Hash.

Well, it turns out that with a little bit of work, you can. Based on an idea I found at http://pullmonkey.com/2008/1/6/convert-a-ruby-hash-into-a-class-object/, I created a simple Object which wraps a Hash and behaves like an ActiveRecord. This class looks like:

class HashObject
  def initialize(hash={})
    @hash = hash
  end

  def method_missing(sym, *args, &block)
    @hash[sym]
  end
end

Then to make it easy to use, I put a helper function in app/helpers/application_helper.rb:

def form_for_hash(hash, name, url, html_options={}, &proc)
  object = HashObject.new(hash)
  form_for object, :as=>name, :url=>url, :html=>html_options, &proc
end

Now I can use it in the view as such: (e.g. new.html.haml)

= form_for_hash @person, :person, special_people_path do |f|

  = f.label :first_name, 'First Name'
  = f.text_field :first_name

  = f.label :last_name, 'Last Name'
  = f.text_field :last_name
  %br
  = f.label :phone_number, 'Phone Number'
  = f.text_field :phone_number
  -# etc etc etc

Then in your SpecialPeopleController you would have methods like

def new
  @person = {:first_name=>'Jenny', :phone_number=>'867-5309'}
end

def create
  @person = params[:person]
  # logic for populating/updating models based on hash goes here
end

And that is all you need to do to be able to use a Hash to populate your forms.
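To see what the wrapper does on its own, here is a plain-Ruby sketch that runs outside of Rails. It repeats the HashObject class from above without changes, and the sample data is the Hash from the new action. One thing it makes visible: because method_missing answers every call, a key that isn't in the Hash returns nil instead of raising NoMethodError, which is presumably why the empty last_name field renders without complaint.

```ruby
# Same class as in the post: any unknown method becomes a Hash lookup.
class HashObject
  def initialize(hash = {})
    @hash = hash
  end

  def method_missing(sym, *args, &block)
    @hash[sym]
  end
end

person = HashObject.new(:first_name => 'Jenny', :phone_number => '867-5309')

puts person.first_name.inspect # a key that is present in the Hash
puts person.last_name.inspect  # a key that is absent from the Hash
```

The flip side of that convenience is that a typo in a field name also quietly returns nil, so it is worth double-checking the names you pass to f.text_field.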

Monday, May 9, 2011

Programming Contests and Languages

This past weekend was the qualifying round for the Google Code Jam. Google allows you to use any language you would like in this contest. This year I have decided to use Ruby as my language of choice. In the past I have always used Java, as that is the language that I currently know the best. However, I think Ruby will allow me to write cleaner, more compact code which will let me code faster and hopefully write fewer bugs.

Simplicity

Why do I think this? Let me give you a simple example. A common task in these contests is to be able to read in a line of space separated integers and convert it into an array for later processing. Here is how I accomplish this task in Java:

BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
String[] parts = reader.readLine().split(" ");
int[] array = new int[parts.length];
for (int i = 0; i < array.length; i++) {
  array[i] = Integer.parseInt(parts[i]);
}

Here is how I do the same thing in Ruby:

array = gets.split.map { |i| i.to_i}
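If you want to try the Ruby one-liner without typing input by hand, a sketch like the following wraps it in a method that reads from any IO object. The name read_ints and the use of StringIO to stand in for standard input are my additions for illustration.

```ruby
require 'stringio'

# Reads one line from io and converts its whitespace-separated tokens to integers.
def read_ints(io = $stdin)
  io.gets.split.map { |token| token.to_i }
end

# StringIO stands in for standard input so the example runs unattended.
sample = StringIO.new("3 1 4 1 5\n2 7\n")
first  = read_ints(sample)
second = read_ints(sample)

puts first.inspect
puts second.inspect
```

Because split with no argument splits on any run of whitespace, this also tolerates extra spaces or tabs between the numbers.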

Bugs
Imagine I now want to loop through the array and determine how many array elements are the same as their position. I.e. if I have the array [0, 2, 1, 3, 5, 4], elements 0 and 3 are the same as their index and so I'd want to get a value of 2. Here is Java code that accomplishes this task:

int count = 0;
for (int i = 0; i < array.length; i++) {
  if (array[i] == i) count++;
}

Here's Ruby code that does the same thing:

count = 0
array.each_index {|i| count += 1 if array[i] == i }

Now these two code samples are almost identical. Some people would argue that it's just a matter of style which you prefer. However, as someone who has written countless loops like this under the time pressure of programming contests, I can tell you there is one crucial difference. In the Java code I am explicitly incrementing the index variable i in the for loop and checking it against the length of the array. In Ruby, the method each_index does this for me. Why does this matter? Imagine that you have nested loops. It can be really easy to accidentally write your inner loop like:

for (int j = 0; j < array2.length; i++)

and now you suddenly have an infinite loop. More insidious, you could write:

int[][] map = new int[ROWS][COLS];
for (int r = 0; r < map.length; r++)
   for (int c = 0; c < map.length; c++)
     map[r][c] = r + c;

when you meant to write the column loop:

for (int c = 0; c < map[r].length; c++)

and now you have code that works whenever ROWS == COLS, but not otherwise.
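For comparison, here is a sketch of how the same grid could be filled in Ruby without writing any loop bounds by hand. The method name build_map and the sample dimensions are assumptions for illustration. The point is only that Array.new with a block and each_with_index take their bounds from the arrays themselves, so there is no map.length versus map[r].length to mix up.

```ruby
# Builds a rows x cols grid where each cell holds row index + column index.
def build_map(rows, cols)
  Array.new(rows) { |r| Array.new(cols) { |c| r + c } }
end

map = build_map(2, 3) # deliberately not square

# Walk the grid without ever mentioning a length.
map.each_with_index do |row, r|
  row.each_with_index do |value, c|
    puts "map[#{r}][#{c}] = #{value}"
  end
end
```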

Having had many conversations with other competitors at contests, I can tell you that I am not the only one who has wasted large amounts of time tracking down dumb bugs just like these. Nor am I the only one who has unknowingly submitted a buggy solution because I only tested on square maps, and so failed a problem because of such a silly bug. My hope is that by using Ruby I will spend less time dealing with bugs like this, allowing me more time to actually solve the problem at hand.

Performance
Here's the catch. The dynamic aspect of Ruby comes at a cost. Many problems in programming contests like the Google Code Jam are very computationally intensive, and there is always a time limit for how long your code can run. In fact GCJ's FAQ warns that Python's slowness may cause trouble on some problems. My assumption is that Ruby has similar performance characteristics to Python, so I decided to do some simple comparisons between Ruby and Java performance on some programming contest type tasks.

Test 1
As my first test, I came up with a solution for GCJ's 2008 Round 1B's Problem C. Mousetrap. My Ruby solution takes about 40 seconds to correctly process Google's large input set. The same algorithm in Java takes about 4 seconds to process the same data set.

Test 2
My second test was based on a suggestion from a friend who was inspired by Problem D. GoroSort in this year's qualification round. I wrote a program which creates every permutation of N integers, and then for each permutation counts how many numbers are the same as their index. In addition to Java and Ruby, I dusted off my C and wrote a C implementation as well, for comparison. Here are the results I get on my MacBook Pro (all values are in seconds):

N	Ruby	Java	C
6	0.015	0.20	0.003
7	0.028	0.21	0.004
8	0.15	0.22	0.007
9	1.3	0.24	0.021
10	14	0.55	0.16
11	160	3.9	2.4
12	1,980	43	36
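For reference, the Ruby side of a test like this could look roughly like the sketch below. It is not the program that produced the numbers in the table; the method name and the use of the standard Benchmark module are my assumptions, and timings will vary by machine and Ruby version.

```ruby
require 'benchmark'

# Generates every permutation of 0...n and counts, across all of them,
# how many elements sit at their own index.
def total_fixed_points(n)
  total = 0
  (0...n).to_a.permutation do |perm|
    perm.each_index { |i| total += 1 if perm[i] == i }
  end
  total
end

n = 8
elapsed = Benchmark.realtime { total_fixed_points(n) }
puts format('N=%d took %.3f seconds', n, elapsed)
```

A handy sanity check: summed over all permutations, the count always comes out to N!, since each position holds its own index in (N-1)! of the permutations.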

Conclusion
There is a definite and meaningful difference in performance between Ruby and the other languages for larger data sets. Still, for the majority of problems Ruby will be fast enough. I will just need to make sure I do a big-O analysis of my algorithms before I implement them, and if there is any doubt as to performance, write it in Java rather than Ruby. For any problems where speed doesn't look to be an issue, I think I will try to use Ruby. Hopefully this will work for me.

Monday, May 2, 2011

Session Timeouts on Rails

Session timeouts are a perfect example of the conflict between usability and security. From a security standpoint you don't want to leave a user logged in for an extended time period, because someone else may get access to their browser and then act as that person. As an example, I have a Facebook friend who frequently has posts made in her name by friends who get hold of her phone. On the other hand, as a user, it is really frustrating when I have to keep logging back into a site.

In general, be only as secure as you need to be. I am ok with my bank logging me out after a short period of inactivity. However, I am willing to risk the occasional prank post on Facebook to not have to keep logging back in. As long as the social sites require a re-entering of a password before allowing a password change or any other personal information changes, they are secure enough from my point of view.

Session Expiration Tasks
If you want to expire sessions, you need to do a few things:

Keep track of the time of the user's last action
Show the timeout on the client side, by having a javascript function running on the browser which will load the login page after the session has timed out
Enforce the timeout on the server side by having the server redirect to a login page if a different page is requested after a timeout
Make sure the redirect also happens on Ajax updates

Rails Implementation
For purposes of this code, I am assuming that you have a login system like the one shown in Michael Hartl's "Ruby on Rails Tutorial".

Tracking Timeout
Following the example I found at http://snippets.dzone.com/posts/show/7400 I made my ApplicationController look like:

class ApplicationController < ActionController::Base
  protect_from_forgery
  before_filter :session_expiry
  before_filter :update_activity_time

  include SessionsHelper

  def session_expiry
    get_session_time_left
    unless @session_time_left > 0
      sign_out
      deny_access 'Your session has timed out. Please log back in.'
    end
  end

  def update_activity_time
    session[:expires_at] = 30.minutes.from_now
  end

  private

  def get_session_time_left
    expire_time = session[:expires_at] || Time.now
    @session_time_left = (expire_time - Time.now).to_i
  end

end

In this code we store an expiration time in the session scope, and we make sure the session hasn't expired before performing any actions. update_activity_time accomplishes the goal of tracking the user's last action and session_expiry enforces the timeout.
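The expiry arithmetic itself is easy to try outside of Rails. In the sketch below a plain Hash stands in for the Rails session, and the method names and the explicit now parameter are hypothetical, added so the logic can be exercised without waiting 30 minutes.

```ruby
SESSION_LIFETIME = 30 * 60 # seconds, matching 30.minutes in the controller

# Mirrors update_activity_time: push the expiry out from the current action.
def touch_session(session, now = Time.now)
  session[:expires_at] = now + SESSION_LIFETIME
end

# Mirrors get_session_time_left: whole seconds remaining, zero if never set.
def session_time_left(session, now = Time.now)
  expire_time = session[:expires_at] || now
  (expire_time - now).to_i
end

session = {}
start = Time.now
touch_session(session, start)

puts session_time_left(session, start + 600)  # 10 minutes after the last action
puts session_time_left(session, start + 1800) # exactly at the expiry time
```

Note that a session with no expires_at reports zero seconds left, which the session_expiry filter treats the same as a session that has timed out.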

Support Ajax Calls
A complication comes up if your web application has Ajax calls. If an Ajax request comes in after the session times out you want to do a full page redirect to the login page, not just a partial update. To accomplish this I modified the deny_access method in the SessionsHelper module to look like:

def deny_access(msg = nil)
  msg ||= "Please sign in to access this page."
  flash[:notice] ||= msg
  respond_to do |format|
    format.html {
      store_location
      redirect_to signin_url
    }
    format.js {
      store_location request.referer
      render 'sessions/redirect_to_login', :layout=>false
    }
  end
end

As you can see, on an Ajax call we store the referrer, i.e. the page that the user was on when they made the Ajax call, and then render a javascript fragment. The file 'views/sessions/redirect_to_login.js.haml' just consists of the one line of javascript:

window.location.replace( '#{escape_javascript signin_url}' );

which tells the browser to redirect to the login page.

Client Side Timeout
It can be a frustrating user experience to perform an action on a page, only to find out that you have timed out. Since we've already decided that we will time out the user, we can't prevent that behavior, but we can at least let the user know they have timed out by automatically redirecting them to the login page when it happens. To accomplish this we add the following javascript lines to our public/javascripts/application.js file:

function checkSessionTimeout(url) {
 $.getScript(url);
}
function setSessionTimeout(url, seconds) {
 setTimeout(function() { checkSessionTimeout(url); }, seconds*1000 + 15);
}

checkSessionTimeout is just a method which makes an Ajax call (using jQuery) to a URL and executes the returned javascript. setSessionTimeout sets a javascript timeout to call checkSessionTimeout after the specified delay.

To call these functions we add the following to our views/layouts/application.html.haml:

-if @session_time_left
  :javascript
    $(function() {
      setSessionTimeout('#{check_session_alive_sessions_path}', #{@session_time_left})
    });

Basically, if the @session_time_left variable is set, we use jQuery to call the setSessionTimeout method once the page has loaded. We pass the url that corresponds to the check_session_alive action on the sessions controller, which looks like:

def check_session_alive

  get_session_time_left
  if @session_time_left > 0
    render 'sessions/check_session_alive', :layout=>false
  else
    sign_out
    deny_access 'Your session has timed out. Please log back in.'
  end
end

If the session has timed out, we just call deny_access which we showed above and already handles the redirect. If the session hasn't timed out, we render sessions/check_session_alive.js.haml which just calls setSessionTimeout again, with the new timeout value:

setSessionTimeout('#{check_session_alive_sessions_path}', #{@session_time_left})

One Catch
We don't want to enforce that the session is valid on the session actions like login, logout, and check_session_alive. To fix this we add the following to the top of our SessionsController:

class SessionsController < ApplicationController

  skip_before_filter :session_expiry
  skip_before_filter :update_activity_time

Conclusion
With that you have a bare-bones session expiration system on your Rails app. I will leave it as an exercise to the reader to add extra features, like giving a user a 5-minute warning that they are about to be logged out.
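As a starting point for that exercise, the server could compute how long to wait before showing the warning and pass it to the page alongside @session_time_left. This is only a sketch; the constant and method name are hypothetical, and nothing here is wired into the code above.

```ruby
WARNING_LEAD_TIME = 5 * 60 # seconds of notice before the session expires

# Seconds to wait before showing the warning; zero means warn right away.
def seconds_until_warning(session_time_left, lead_time = WARNING_LEAD_TIME)
  [session_time_left - lead_time, 0].max
end

puts seconds_until_warning(30 * 60) # a fresh 30-minute session
puts seconds_until_warning(2 * 60)  # already inside the warning window
```

Clamping at zero covers pages that load when less than five minutes remain, so the warning appears immediately instead of being scheduled in the past.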