If you need to run scheduled jobs in your Rails app (!= background jobs as in Resque), you should use the excellent whenever gem.

However, one detail that is really easy to miss is which shell environment the cron job actually runs under.

bundle: command not found

And an easy way to confuse yourself even further is by trying to debug issues in this area by:

  1. Logging into your  server as the deployment user
  2. Doing crontab -l to see which entries whenever installed
  3. Testing an entry by cut’n’pasting it to the shell prompt
The outcome will not be reliable, because you’re running it from a login shell and cron is not. So for example, your rbenv shims may be in the path while testing, but cron does not load your .bashrc or .bash_profile files.

On a project I work on, the staging and production machines are currently quite different. In production we use RVM (globally installed) while staging has rbenv locally installed for the deployment user.

So in staging I need to assume that no binaries are available and use the full path to bundler, in my case /home/passenger/.rbenv/versions/1.9.2-p290/bin/bundle.

(I know, symlinking this to eg. /usr/local/bin/bundle would probably be more elegant, but I actually like the reminder of where my ruby is actually installed. I don’t spend much time on the servers, so I tend to forget.)

Running stuff only on certain stages

EDIT: See Markham’s comment for a better way to do achieve the same result.

Another issue I struggled with for a while is how to schedule jobs only in production. For example, I don’t want to run S3 backups from staging.

I use capistrano/multistage, so I needed to inspect the value of the current “stage” from schedule.rb. It turns out that by including this:

require 'capistrano/ext/multistage'

set :whenever_environment, defer { stage }
require "whenever/capistrano"

…the stage is available in schedule.rb as @environment. Very simple, but it took some googling to find.

Example schedule.rb

Below is my anonymized schedule.rb. It should be pretty self explanatory, post questions in the comments if not.

def production?
 @environment == 'production'
end

set :bundler, production? ? "gp_bundle" : "/home/passenger/.rbenv/versions/1.9.2-p290/bin/bundle"

job_type :gp_rake, "cd :path && RAILS_ENV=:environment :bundler exec rake :task --silent :output"
job_type :gp_runner, "cd :path && :bundler exec rails runner -e :environment ':task' :output"
job_type :gp_bundle_exec, "cd :path && RAILS_ENV=:environment :bundler exec :task"

every 5.minutes do
 gp_rake "thinking_sphinx:index"
end

every 30.minutes do
 gp_runner "FooBar.update"
end

every 1.day, :at => "4:00am" do
 gp_runner "Baz.purge_inactive"
end

if production?
 every 1.day, :at => "4:00am" do
   gp_bundle_exec 'backup perform --trigger my_backup --config-file config/backup.rb --log-path log'
 end
end

EDIT: Jakob Skjerning posted this snippet in the comments:

job_type :rake, “cd /var/www/appname/#{environment}/current && /home/appname/.rvm/gems/ree-1.8.7-2010.02@appname-#{environment}/bin/bundle exec /home/appname/.rvm/wrappers/ree-1.8.7-2010.02@appname-#{environment}/rake –silent RAILS_ENV=#{environment} :task :output”

A few weeks ago I attended a great Geek night at Trifork. The speaker was @drkrab, and he was talking about “Actor oriented programming”.

Knowing Kresten’s work in recent years, it is obvious that this talk was very inspired by Erlang. In fact, much of it was about Erlang, but he only briefly touched on his Erjang project (which was nice for those of us who saw his Erjang talk at last year’s GOTOcon).

Erlang

Erlang is “designed for reliability in the presence of errors”. Others (notably @drkrab) can explain what this means better than me, but the following is a quick tour of some of the highlights as I understood them from the talk.

An actor is an “active object” (ie. an object containing its own thread) plus a state machine. In Erlang, actors are implemented as processes which are designed to be very lightweight. Kresten mentioned that the size of a process in Erlang is comparable to that of a Hashtable in Java, so literally millions of processes can coexist in the same Erlang VM. Processes can be hooked up to monitor each other, so that if one process dies, it can be restarted by its monitor process. Error handling is done very differently than in other languages: In general, don’t try to cover all possible error cases by coding defensively. Instead, let stuff crash and make sure that the system as a whole is resilient.

Processes communicate with each other by passing messages in the form of immutable data structures. Often, communication will simply be one-way, or it may be indirect in the sense that process A receives a message from process C in response to a message it sent to process B. In this scenario, process B will typically have spawned process C as a worker for handling that particular request. The interface of an actor is called a protocol. It can be thought of as an API except that it may change with the internal state of the actor.

Revolution

The basic premise of Kresten’s talk was that we as developers are headed towards a revolution. Just like object orientation was a revolution about 15 years ago, the internet age will be forcing to change horses again soon.

This is a very interesting theory which sounds very plausible to me. I’ve heard the topic mentioned many times before in different forms, but this was the first time I had heard anyone systematically explore what may be going on right now, and I found that really interesting.

Kresten started by exploring the anatomy of a scientific revolution, using the classic example of the geocentric vs. the heliocentric model of our solar system. Assuming that the earth is at the centre of the universe obviously made the orbits of our neighboring planets look very strange. Also, I’m sure there was a multitude of other problems I know nothing about.

Revolution needed

So there were a lot of observations that were hard to fit into the model, and this is a sign that a revolution is needed. And indeed, a very painful revolution was needed in which lots of scientific effort was suddenly rendered obsolete, and lots of frustration had to be endured.

Software professionals saw the same thing happen with the advent of object orientation. There were lots of misconceptions and resistance initially, but as history has shown, OO turned out to be a very useful paradigm.

However, we may be on the verge of the next revolution. The internet age is profoundly changing the way we build software – everything is now much more decoupled than is used to be. We often build new systems by integrating lots of existing services, each of which is outside our control. So increasingly, we need to design for reliability in the presence of errors. Sounds familiar?

My thoughts

I have thought about this talk a couple of times over the last few weeks, and I am pretty convinced that Kresten is on to something. And when I try to put these ideas in the context of my day to day work, there’s at least a partial match.

Much of what I do is Ruby on Rails programming, and in that community we do a lot of integration work. Not integration in the SOA sense, but integration in the form of consuming services provided by other parties. More and more frequently, we use web services to supply functionality that is just easier to buy than to develop from scratch. A good example is mailing lists. I don’t know anyone who would still opt for a DIY mailing list system when we have services like Mailchimp which are feature-rich, affordable and very solid. It has just become cheaper to buy services like this. Error reporting is another good example. I pay $5 per month for my Hoptoad account – this is so cheap that I wouldn’t dream of rolling my own solution. It just wouldn’t be cost effective because my time is much better spent working for clients.

These external services may be down from time to time, and we need to be prepared for that. Were we to implement them in our own software, they would be down from time to time as well, but that downtime would often coincide with the downtime of the client software (which is especially scary in the case of error reporting – related, check Avdi Grimm’s war story on the latest PragProg podcast).

Apart from RoR I do a lot of Android programming, and some of the ideas apply here as well, but I think putting them into practice is further off into the future. Mobile devices certainly need to handle errors well (loss of network connectivity is a very real issue), but changing the programming paradigm to something like Erlang doesn’t seem to be right around the corner.

However, the Android app I’m working on right now is written in Scala, so maybe I should try to sneak in a few actors ;-)

 

One of the things that bothers me the most in software development is when you come across a problem for which there simply is no nice solution to be found. You come up with some ugly hack and get the job done, but the nagging feeling of having fallen short lingers for days.

Problem in question: A project I’m working on sells stuff and uses a payment gateway for this. API access is expensive so we’re opting for the proxy method, meaning that the page that reads credit card information is actually served by the payment provider. It looks something like this:

http://epay.dk/proxy.cgi/http://myshop.com/payment

This means that our /payment page is a very special case. All asset paths must be relative, ie. images/foo.png instead of /images/foo.png. In contrast, links must be absolute, http://shop.com/about instead of /about.

I want the rest of the site to follow normal Rails conventions, so I need to special case this very page. This in itself is a challenge, because as it turns out my ugly hack will affect the application layout as well.

I first tried a few different routes:

  1. Override url_for and set :path_only => false. This solves only half of the problem (asset paths would still be wrong) and turned out to be painful because of the two different url_for implementations. Figuring out why is left as an exercise for the reader because frankly, I can’t be bothered to explain why. Leave a comment if you want to know.
  2. Define some default_url_options. For some reason, I never got this to work reliably, and after some time of fruitless digging trough the Rails source, I abandoned the idea.
  3. Create middleware that kicks in only if controller == Xxx and action == Yyy. It would then process the HTML and change href=”/…” instances to be relative etc. You may object that this is a terrible way of abusing the middleware system, but if it had worked, it would actually have been the least intrusive option. The reason it broke down was unexpected and enlightening: I was able to rewrite the HTML, but afterwards the Apache module pagespeed rewrote it again to optimize CSS includes etc. Obviously this only happened in our production environment, so it took me a couple of hours to track down.
So it would seem that my quest to solve this (somewhat) elegantly had failed. So be it. Time to resort to the nasty hacks.

The solution

Warning: You will be appalled.

What I ended up doing was:

  • In views/layouts/application.html.haml, I only include JS and CSS if we’re not on the payment page.
  • In the payment view, I include all JS/CSS like this (notice that the path does not begin with a slash): %script{ :src => “javascripts/jquery.min.js”, :type => “text/javascript” }
  • Also in the payment view, I added some JS that absolutifies all links. It looks like this:
  $('a[href^="/"]').each(function (idx) {
    fullUrl = "http://myshop.com" + $(this).attr("href");
    $(this).attr("href", fullUrl);
  });

Conclusion

I have created some code that works, but which I’m really, really unhappy about. I hope that some day I’ll come up with a better idea and fix this, but I doubt it. Suggestions are obviously welcome – please leave a comment.

On a project, I have a “Revision” resource and URL’s like

http://foobar.dev/revisions/1

Now suddenly the client wants to rename “revision” to “edition”, so all URLs should be changed to the format

http://foobar.dev/editions/1

Unless I’m missing something in the Rails docs, there is no option to change the base part of the path without renaming the resource itself. Actions can be renamed, eg. “nuevo” instead of “new”, but I don’t think the path segment for the resource name can be changed (please correct me if I’m wrong).

I guess the proper way to go would be to rename everything: The AR model, all helper calls, controllers, tests etc. Even with lots of automation, this would be a really annoying task and I’m not inclined to spend time on this right now. Who knows if the client will change his mind again?

So instead, I ended up changes the resource from

  resources :revisions

to

  resources :editions, :controller => "revisions", :as => "revisions"

I’m changing the name of the resource, but the :controller option makes sure the right controller is still called, and the :as option controls the generation of path helpers (so I can keep using eg. edit_revision_path).

Remember to make the same change for nested resources.

EDIT: As Rasmus and Jakob point out, the :path option sets the actual path, not a prefix. I have made some minor updates to the Rails docs to correct this (most instances were already fixed).

Today, I found a bug in Rails.

On a project, I need to mail PDF files to users, and I would like to use PDFKit for that. The contents of the PDF are generated by rendering a view, but I don’t really want PDFKit to hit the server to get the content. Both because this breaks down in unit tests where the server may not be running, but also because there’s just no reason to go through the entire stack.

So I decided to generate the HTML with render_to_string, then convert it to PDF with PDFkit, and finally mail it to the user.

However, it turned out that this doesn’t work. Here is a minimal example:

class NoteMailer < ActionMailer::Base
  default :from => "from@example.com"

  def foo
    @notes = []
    render_to_string(:template => "notes/index.html.erb")
    # self.instance_variable_set(:@lookup_context, nil) # <-- Workaround
    mail(:to => "example@example.com")  
  end
end

It turns out that the call to render_to_string initializes the @lookup_context of the controller (== mailer in this case). The lookup context is the object that is responsible for locating templates to render, so it contains (among other things) a list of template extensions to look for.

The call to mail() renders the actual mail, but the mailer now reuses the lookup context that was created by render_to_string. This prevents it from finding the mail template, which is (in this example) named “foo.text.erb”, because the lookup context doesn’t recognize the “.text.erb” extension.

A workaround is to manually clear the lookup context on the mailer after the call to render_to_string, but obviously this is a really bad solution.

Although I basically agree with the Pragmatic Programmers’ old advice about learning a new programming language each year, the time has come for me to slow down.

I have been programming for many years and I know a lot of different languages. Many of them are boring and similar (eg. Java, C#, C++), some are not boring but similar (Ruby, Python) and some are completely different (Clojure, Common Lisp, JavaScript).

I’m well aware that I could learn a lot by studying Haskell or diving deeper into Javascript, but I’m going to take this year off.

Why? Very simple. I still have so much to learn from Clojure. I could write at length about the virtues of functional programming or about how Clojure takes advantage of the JVM. I could go on and on about the REPL, Java libraries, infinite lazy sequences (list of all primes, anyone?), the exceptionally smart community members and so on. But others have already done this much better than I could. For a great introduction to the language itself, go watch the intro videos by Clojure creator Rich Hickey. For all the surrounding stuff, there are lots of interesting blogs to read.

Likewise, although I have a lot of experience with both Ruby and Rails, I’m not done learning here either. Especially, I need to tune my Emacs to achieve better code navigation and debugging support.

Two languages that I really like. This year, I’m going to focus, not diversify.

Git autopush

January 10, 2011

Some of my Git repos are very, very simple. The commits are all on master, no branches in play at all. And the workflow is just as simple as back in the bad old Subversion days: Pull from the repo, commit some stuff, push it.

A good example is the repo I use to store my Project Euler solutions. It is only used by myself and I always want to push after committing.

So I set out to create (and complete within 10 minutes) my least ambitious project so far: A bash script to add “git push” as a post-commit hook in the current Git repo. You can find it on Github, but the business part of it is right here:

#!/bin/sh

HOOKS_FOLDER=.git/hooks
POST_COMMIT=$HOOKS_FOLDER/post-commit

if [ -d $HOOKS_FOLDER ]; then
    if [ -f $POST_COMMIT ]; then
        echo "Post commit hook already exits, please add 'git push' manually in .git/hooks/post-commit"
        exit -1
    fi
    echo "git push" > $POST_COMMIT
    chmod 755 $POST_COMMIT
    exit 0
else
    echo "This command must be run in the root of a Git repository."
    exit -1
fi

Suggestions welcome for improving the script. I think this is the first multiline Bash script I ever wrote, so I expect it to be flawed.

Ah yes, and before you comment: I know that this is not the shiniest of Git workflows, I know that branches are cheap etc. I love all the cool features of Git but sometimes I just don’t need them.

However I do not know if Git already has this functionality built in via some configuration option?

JRuby/Clojure integration

December 21, 2010

I recently did a demo at an aarhus.rb meeting, showing how to call Clojure code from a Rails app. Specifically, I wanted to try using the very impressive Incanter library which is written in Clojure. This is not just a contrived example – Incanter is something you’d want to integrate with if you needed to do something statistics related.

This post is a walkthrough of how I got a very simple demo app running. It displays a histogram and allows the user to enter the size of the normal distribution sample.

NOTE: The finished Rails app can be found on Github.

1. Set up the basics

As with any other project, I am using RVM for managing multiple (J)Ruby implementations and keeping gemsets project-specific.

So first, create a directory for the Rails app. Inside that directory, create a .rvmrc file containing the following line:

rvm --create jruby-1.5.2@jruby-clojure-demo

Now do a “cd .” to have RVM pick up this file. This switches to jruby-1.5.2 (which must be installed, see RVM docs) and a newly created “jruby-clojure-demo” gemset.

We obviously need Rails for this demo:

gem install rails

This installs Rails 3.0.x as of this writing. Since this takes a little while to complete, fire up another terminal and let’s move on.

2. Fetch all the JARs we need with Leiningen

Leiningen is a build/dependency management tool for Clojure. I’m not using it to build anything for this demo, but it is useful for pulling down all the JARs we need.

Install Leningen, then create a new project with “lein new”, eg. in ~/tmp. Modify the project.clj file to look something like this:

(defproject incanter-test "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :repositories {"incanter" "http://repo.incanter.org"}
  :dependencies [[org.clojure/clojure "1.1.0"]
                 [org.clojure/clojure-contrib "1.1.0"]
                 [org.incanter/incanter-full "1.0.0"]]
  :dev-dependencies [[swank-clojure "1.2.1"]])

Finally run “lein deps” to have Leiningen download all required JAR files into lib/. This also takes a while, but now our Rails gem is installed and we can move on.

3. Create the Rails app

We use -O to skip activerecord:

rails new . -O

Next, add jrclj to the Gemfile. This is a JRuby/Clojure bridge created by Kyle Burton, and it’s the heart of this integration.

gem 'jrclj'

Do a “bundle install” to install all gems required by our new Rails project.

4. Enabling Clojure integration in the Rails project

To allow us to call Clojure code from the Rails project, we need to load all the JARs we downloaded in step 2.

First, create a “deps” folder inside the Rails app and copy all the JARs into it. Then, insert the following before the Bundler#require call in application.rb:

require 'java' 
Dir["#{File.dirname(__FILE__)}/../deps/*.jar"].each do |jar|
  require jar
end

This code needs to be executed before the Bundler call because the imported JARs are needed for the JRClj bridge to initialize properly.

5. Creating the histogram view

It’s time to call some Clojure code to render our view. I’m not showing how to create the actual Rails controller, the view or the simple form that collects the size attribute. See the Github repo for a working example.

To create the histogram, we use the method incanter.charts/histogram. The input to this method is the result from incanter.stats/sample-normal, and we save its output with incanter.core/save.

The output of incanter.core/save is a stream of PNG data. It should have been possible to simply save this data into a new ByteArrayOutputStream (yes, with JRuby you can just instantiate that class), but because of what looks like a glitch in the JRuby/Clojure bridge, this did not work. It looks like the return types may be wrong in some cases, but I haven’t found time to investigate yet.

So I resorted to a dirty hack: Dropping the PNG data into a file in public/images and simply inserting an image_tag into the view to render it. Thread-safe? I don’t think so :-)

    clj = JRClj.new
    %w{core stats charts}.each { |l| clj._import("incanter.#{l}") }
    clj.eval clj.read_string "(incanter.core/save \
                                (incanter.charts/histogram \
                                   (incanter.stats/sample-normal #{@size.to_i})) \
                                 \"#{::Rails.root.to_s}/public/images/out.png\")"

So what I’m doing here shows one of the two ways you can use JRClj: Evaluating a string containing Clojure code. The clj.eval call above evaluates to the value of the call to incanter.core/save, ie. the outermost sexp in the string.

The other, arguably cleaner and more Ruby-like mode to go, is to instantiate a JRClj object (like above), then call the Clojure functions as if they were Ruby function, with the difference that dashes in method names are replaced with underscores:

jruby-1.5.2 > clj = JRClj.new
=> (truncated...)
jruby-1.5.2 > clj._import("incanter.stats")
=> (truncated...)
jruby-1.5.2 > clj.sample_normal(1000).take(5)  
=> [-1.67885083887929, 0.332869879174951, -0.178234653916897, -0.382876843067349, 1.16325259920326]

Appendix: A Clojure-enabled IRB

I think most Ruby programmers agree that the IRB is a very useful thing, and it becomes even more useful when playing with stuff like JRClj.

Included in the JRClj documentation is a small script that loads the necessary JAR files and fires up an IRB session. I used this while working on the demo, and the script can be found as “jrepl” in the root of the Github repo.

I have decided to build the one thing I know the world doesn’t need: Another Twitter client!!

There must be hundreds out there already, and dozens of them very good. However, after trying out quite a few of them, I’m not really pleased. A quick roundup of the ones I know of:

  • Twitterific is the best so far, but it displays ads every hour (unless you pay), and I can’t really get used to the overlay-style interface. Plus, it often reports communication errors for no reason.
  • TwitterPod is actually OK, but very wierd…! It features a completely useless “QC” mode, where tweets flicker by, intertwined with user mugshots and psychedelic colorized patterns (but of course, you don’t have to use that mode). Actually, I can’t really remember why I stopped using TwitterPod. Probably just because it is wierd and Japanese and alien. It didn’t really have a cool feel to it.
  • The web-based interface (nah)
  • All the AIR-based clients: No, I don’t want AIR on my Mac right now. Maybe some day when I run out of more interesting stuff to play with.
  • All the Android/iPhone/Windows Mobile based ones: No, I want it here on my laptop

So there you have it, nothing is good enough right now. But what would be good?

Here are my thoughts so far about the client I want to build:

  • Open source. I can’t be bothered to do it all myself.
  • Runs on my Mac. If someone wants to port it, great.
  • I want an easy way to track “conversations” between me and some other dude (there probably a funky term in Twitterish for this).
  • When I add a tweet, I expect it to appear in my timeline, or what it’s called, immediately. It should not wait for the next batch of updates.
  • Builtin support for quickly searching for people to follow, using first-, last- or username.
  • Easy way to create all sorts of filters. Most simple method would be to show only the tweets of one followee.
  • It should be easy to mute and unmute friends. The name of a muted friends appears in red when others address him.
  • Every tweet should be colorcoded. Each friend should get a “lease” on a particular color. The lease should expire after some period of inactivity.
  • Fancy visualization: A large canvas (optinally full-screen) on which tweets are clustered by relationship. Ie. when they respond to each other, they form a little group of tweets with arrows showing response directionality. Each tweet should fade in color as it ages.

I will probably be writing the client in Ruby using RubyCocoa.

I am not fooling myself here. I know that someone else has probably covered the features mentioned above in a client much cooler than I could ever come up with. There must be hundreds of alternatives listed on this page alone.

However, there does seem to be a lack of decent open-source Twitter clients. At least ones that are not based on AIR.

Oh well, let’s just see when I get around to actually coding this. I have all sorts of half-baked ideas these days, but way too little spare time to realize any of them.

Drop me a comment if how have an idea for a killer feature to include.

Pre-publish update: OK, so I found one more client that almost discouraged me from writing my own: Lounge.

It looks very nice and IMO is easier to use than Twitterific because it uses a regular window instead of an overlayed one (or whatever it’s called). However, Lounge is an early beta so some features are incomplete or simply missing. For example Growl integration is pretty basic and the Vanity category doesn’t seem to work(?).

One feature that could have been great is the ability to display pictures from Twitpic inline. However, only thumbnails are shown, so you need to launch your webbrowser anyway. Pretty useless then…

Also, it’s closed source – in fact the beta expires on July 1, which makes me a bit suspicious…what then, will I have to buy it?

So in conclusion, it looks quite promising, but I still need to write my own client ;-)

Discovering Project Euler

December 14, 2008

A colleague of mine recently introduced me to Project Euler, and I am slowly becoming hooked.

Project Euler is a collection of mathematical challenges that usually require some programming to be solved. Some are relatively simple (eg. “Find the 10001st prime”) while others are complex (“Using up to one million tiles how many different “hollow” square laminae can be formed?”).

Since I did not study mathematics, most of these problems seem intimidating to me at first. However, a lot of them can be solved with programming skills, simply by doing a more or less brute force search for the answer.

Solving the problems in order, they become progressively harder. But the beauty of it is that a problem often builds on the skills acquired while working on a previous problem.

So if you’re a programmer and you like puzzles involving maths, I highly recommend that you check it out. You can use any programming language you like, only the answer matters, always expressed as some big number.

Follow

Get every new post delivered to your Inbox.