Chris Lamothe joins Cantina as User Experience Principal


Please join me in welcoming Chris Lamothe, the newest partner at Cantina Consulting, as a User Experience Principal.

Chris brings nearly 10 years of experience in a variety of aspects of user-centric design and interactive development, with specific areas of focus in e-learning and Rich Internet Application (RIA) design and development.

Both Matt and I have worked alongside Chris in the past, during consulting engagements at Molecular as well as on a variety of side projects over the years. We’re thrilled to have Chris as a partner as Cantina continues to grow.

Feel free to read Chris’ bio in the About section.

Gotchas in using BackgrounDRb in Ruby on Rails


I’ve been working on a bit of code to perform audio and video encoding for media files uploaded to one of our client’s sites, and I was thrilled to come across BackgrounDRb, a Rails plugin that allows developers to build scheduled background tasks, similar to OpenSymphony’s Quartz for Java. The plugin also lets user actions initiate long-running processes by spawning worker threads from controllers (or other places).

Of particular interest to me was the ability to spawn (fork) worker threads from user actions, or in this case, ActiveRecord callbacks which were called when the user action caused my model object to be saved. The basic order of operations is this:

  1. User uploads an audio file
  2. Uploaded file data is loaded into an attachment_fu model object
  3. Model object is saved
  4. Model object’s after_save callback is called
  5. The after_save callback determines whether the uploaded file requires encoding, and spawns a BackgrounDRb worker process if it does
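
The flow above can be sketched in plain Ruby. This is only an illustration under stated assumptions: FakeWorkerQueue stands in for the BackgrounDRb hand-off, MediaFile stands in for the attachment_fu model, and the extension list and the needs_encoding? helper are hypothetical names that do not come from the plugin or the original code.

```ruby
# Stand-in for the BackgrounDRb hand-off (hypothetical); a real app would
# ask the plugin to start a worker at this point.
class FakeWorkerQueue
  attr_reader :jobs

  def initialize
    @jobs = []
  end

  def enqueue(worker_name, record_id)
    @jobs << { worker: worker_name, id: record_id }
  end
end

# Plain-Ruby stand-in for the attachment_fu model object (hypothetical).
class MediaFile
  # Hypothetical list of formats that would need re-encoding.
  ENCODABLE_EXTENSIONS = %w[.wav .aiff .mov .avi].freeze

  attr_reader :id, :filename

  def initialize(id, filename, queue)
    @id = id
    @filename = filename
    @queue = queue
  end

  # Steps 3 and 4: saving the object fires the callback.
  def save
    after_save
    true
  end

  def needs_encoding?
    ENCODABLE_EXTENSIONS.include?(File.extname(filename).downcase)
  end

  private

  # Step 5: only uploads that need encoding are handed to a worker.
  def after_save
    @queue.enqueue(:encoder_worker, id) if needs_encoding?
  end
end

queue = FakeWorkerQueue.new
MediaFile.new(1, "interview.wav", queue).save
MediaFile.new(2, "cover.jpg", queue).save
queue.jobs # only the .wav upload has been queued
```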

In this process lay many issues. I’ve attempted to describe some of them below.

Close your connections

This may be obvious to many, but with all that ActiveRecord does for you, I assumed that if I did some work with ActiveRecord in a BackgrounDRb worker process, my database connections would be closed automatically. That was not the case. It can be quickly remedied, however, by adding the following after your code has finished its work:

ActiveRecord::Base.connection.disconnect!
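
One refinement worth considering is to put the disconnect in an ensure clause, so the connection is released even when the job raises. The sketch below shows the pattern only: FakeConnection and run_encoding_job are hypothetical stand-ins so the example runs without Rails, and in a real worker the ensure clause would hold the ActiveRecord::Base.connection.disconnect! call shown above.

```ruby
# Hypothetical stand-in for a database connection.
class FakeConnection
  def initialize
    @connected = true
  end

  def connected?
    @connected
  end

  def disconnect!
    @connected = false
  end
end

# Runs the given block and always disconnects afterwards, whether the
# block finishes normally or raises.
def run_encoding_job(connection)
  yield
ensure
  connection.disconnect!
end

conn = FakeConnection.new
begin
  run_encoding_job(conn) { raise "encoder crashed" }
rescue RuntimeError
  # The failure still propagates to the caller; the connection is closed anyway.
end
conn.connected? # => false
```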

Don’t try to pass entire ActiveRecord objects to workers from Rails

I started building a simple call to spawn a new BackgrounDRb worker from an ActiveRecord lifecycle callback method (see below), and started running into errors like these:

undefined method `[]' for #<DRb::DRbUnknown:0x2501bd4>

I found a number of postings online indicating that simply adding the following to my model objects would clear this up:

include DRbUndumped

Doing so did not fix my undefined method problems, so when I came across this on another blog, I decided to cut my losses and just pass the database ID.

When passing arguments from Rails to BackgrounDRb workers, don’t pass huge ActiveRecord objects. Its asking for trouble. You can easily circumvent the situation by passing id of AR objects.

From: BackgrounDRb best practises

One thing that I didn’t think I’d be dealing with in coming to Rails from a Java background is serialization issues, but there we have it.
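
To make the serialization point concrete: DRb passes arguments between processes using Ruby's Marshal, and an object whose class the receiving process cannot load arrives wrapped as DRb::DRbUnknown. The sketch below does not reproduce that exact error. It only illustrates, with a hypothetical Upload class, that a rich object can fail to marshal at all, while its integer ID always round-trips.

```ruby
# Hypothetical stand-in for a model object that carries more than plain data.
class Upload
  attr_reader :id

  def initialize(id, on_complete)
    @id = id
    @on_complete = on_complete # a Proc, which Marshal cannot dump
  end
end

# True when Marshal can serialize the value, false when it refuses.
def marshalable?(value)
  Marshal.dump(value)
  true
rescue TypeError
  false
end

upload = Upload.new(42, -> { puts "encoded" })

marshalable?(upload)    # => false
marshalable?(upload.id) # => true

# The ID survives a full round trip unchanged.
Marshal.load(Marshal.dump(upload.id)) # => 42
```

On the worker side, the ID is then used to look the record up again, which keeps the payload crossing the process boundary as small as possible.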

Don’t create new workers from new record callbacks

This may have been common knowledge, but I had to discover this the hard way. The after_save callback in ActiveRecord model objects is called before the save transaction has been committed. The implication this has is that if I pass an ID of an ActiveRecord object to BackgrounDRb for further processing, strange things can happen.

When my after_save callback is called on a newly created object, I found that, most of the time, the original object.save() transaction has not yet committed by the time the BackgrounDRb process starts working and tries to retrieve the object using the ID I passed. The result is an error indicating that the record could not be found, even though when I check the database no more than a second later the record is there and intact.
It seems that I’m not the only one that was looking for a post-commit callback.
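
Absent a post-commit hook, one possible workaround (an assumption here, not what the original code did) is to have the worker retry the lookup a few times before giving up. The sketch below is plain Ruby: RecordNotFound is a hypothetical stand-in for ActiveRecord::RecordNotFound, and find_with_retry, attempts and delay are illustrative names.

```ruby
# Hypothetical stand-in for ActiveRecord::RecordNotFound.
class RecordNotFound < StandardError; end

# Runs the block, retrying when the record is not visible yet.
# Re-raises the error once `attempts` tries have been used up.
def find_with_retry(attempts: 5, delay: 0.5)
  tries = 0
  begin
    tries += 1
    yield
  rescue RecordNotFound
    raise if tries >= attempts
    sleep(delay)
    retry
  end
end

# Simulate a row that only becomes visible on the third lookup.
calls = 0
record = find_with_retry(attempts: 3, delay: 0) do
  calls += 1
  raise RecordNotFound if calls < 3
  { id: 7 }
end
record # => {:id=>7} (newer Rubies print this as {id: 7})
```

Retrying narrows the race rather than removing it, so the attempt count and delay would need tuning against how long the save transaction typically takes to commit.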

Happenings


There’s been a bit of radio silence on our blog lately, and it’s for good reason!  We’ve been busy on several Rails- and Grails-based projects, which is why we’re very excited about the releases of both Rails 2.0.x and Grails 1.0, the first full release of the Grails framework.

The Grails release in particular is significant, as the framework has finally emerged from "in development" status, which removes one more barrier to adoption for web project teams and IT departments.  We’ve had success with the Grails framework particularly because it sits very well in the Java enterprise ecosystem, including the integration of existing Java-based Hibernate models.  This integration with Hibernate, which is a key factor in Grails’ ability to be a contender for new IT projects with teams that are Java EE and Spring/Hibernate-oriented, is a fairly straightforward process that involves dropping in your existing mapping XML and Java POJOs.  As if by some magic, you get all the benefits of GORM on your existing Java POJO data model, including dynamic search methods and the criteria builder DSL.

Amazon EC2: 1st Impressions Mounting S3


We’ve got a small internal project at Cantina that aims to make use of the Groovy on Grails framework and some of our Grails plugins, and we plan on using S3 as the persistent storage for the project.  We decided to test out a small EC2 instance at Amazon to use as an integration point for the project for a couple of reasons:

  • EC2 instances are very quick to set up
  • They are not charged for data transfer to S3
  • They are presumably the closest you can be to your S3 storage in terms of network latency

Setting up our initial EC2 instance was incredibly easy.  There’s a wealth of pre-built Amazon Machine Images, or AMIs, out there with various configurations for different application servers, including Apache, MySQL, JBoss, Tomcat, and, our new favourite, Red5.  I was able to get one up and running fairly quickly with the packages I needed (I love yum). 

Actually, I was floored by how fast I had a brand new server up and running, considering that requests I’ve submitted to fairly large enterprise-grade hosting companies for such intensely complicated things as, say, opening a new port in the firewall have taken over a week.  I guess the landscape is changing, but I digress…

In order to use Amazon S3 as the backing store for our new EC2 instance, we seemed to have a few options:

  1. Code our application to manage the transfer and synchronization of files to S3, perhaps via our Amazon S3 Grails Plugin
  2. Utilize an S3-aware file synchronization tool such as the jets3t Synchronize tool, or the Ruby s3sync tool
  3. Mount S3 as a filesystem in the EC2 instance

Since we’ve already been doing #1, and #2 isn’t exactly real-time, we decided to give #3 a go.  To do so, we enlisted the help of a couple of tools:

  • JungleDisk: A multi-platform tool that provides a local WebDAV interface to S3, suitable for mounting as a filesystem from the Mac OS X Finder, or from Linux using…
  • davfs2: Linux filesystem driver that will mount a WebDAV URL to a mount point in the local filesystem

JungleDisk is a great tool in general for interacting with an S3 account, and I’ve been using it for a little while now for my own backup purposes on my Mac development laptop.  The Linux version provides a standalone command line program (in addition to the GUI that comes on all platforms) which can be run as a daemon and scripted to start up on boot.

The setup was surprisingly simple.  To get JungleDisk running from the command line client, you need to provide a configuration file, commonly called jungledisk-settings.ini.  The documentation says that you should run the GUI first to generate the file before running the command line version, but I was able to copy over the file from my Mac laptop and update the values for the EC2 instance.  Here’s an example of the configuration file:

LoginUsername=
LoginPassword=PROTECTED:
AccessKeyID=XXXXXXXXXXXXXXXXXXXXX
SecretKey=PROTECTED:XXXXXXXXXXXXXXXXXXXXX
Bucket=default
CacheDirectory=/var/cache/jungledisk
ListenPort=2667
CacheCheckInterval=120
AsyncOperations=1
Encrypt=0
ProxyServer=
EncryptionKey=PROTECTED:
DecryptionKeys=PROTECTED:
MaxCacheSize=1000
MapDrive=
UseSSL=0
RetryCount=3
FastCopy=1
WebAccess=0
LogDuration=30
ArchiveFlag=5
ArchiveDuration=60
PasswordPrompt=0

I set up JungleDisk to start up on boot using a really handy /etc/init.d script from the JungleDisk forums found here.

Once JungleDisk was configured and running, I had a local WebDAV server running at http://localhost:2667.  This could be used in KDE to browse the filesystem, but since I’d like my web application to access it via the filesystem, the next step was to get davfs2 running.
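
Before moving on to davfs2, it can be worth confirming that something WebDAV-capable is really answering on that port. The sketch below sends an OPTIONS request and looks for the DAV response header that WebDAV servers advertise (RFC 4918). It assumes JungleDisk's local server answers OPTIONS in the standard way, which has not been verified here, and dav_header? and webdav_listening? are illustrative names.

```ruby
require "net/http"

# True when the response headers include a non-empty "DAV" header,
# which is how a WebDAV server advertises its compliance classes.
def dav_header?(headers)
  headers.any? do |name, value|
    name.to_s.casecmp("dav").zero? && !Array(value).join.strip.empty?
  end
end

# Sends OPTIONS / to the local endpoint. Returns false when nothing is
# listening, the host cannot be resolved, or the request times out.
def webdav_listening?(host = "localhost", port = 2667)
  response = Net::HTTP.start(host, port) { |http| http.options("/") }
  dav_header?(response.to_hash)
rescue SystemCallError, SocketError, Net::OpenTimeout, Net::ReadTimeout
  false
end

# Usage on the EC2 instance, once the daemon is running:
#   puts webdav_listening? ? "WebDAV is up" : "nothing WebDAV-like on 2667"
```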

The EC2 instance did not have davfs2 installed by default (at least not the AMI I chose, which was based on the default AMI), so I simply downloaded the source distribution and compiled locally on the EC2 instance.  The base AMI I was using did not have gcc or the neon development libraries (Neon is a WebDAV library that davfs2 uses for communicating with WebDAV servers).  Luckily yum was installed on the EC2 instance so getting these dependencies was pretty straightforward.

davfs2 can be configured to run by non-root users, or by root.  The configuration of davfs2 depends on your choice here, as some configuration options are intended only for a system-wide davfs2 configuration started from root.  I opted for the system-wide approach and set up my /usr/local/etc/davfs2.conf file with the following configuration options:

dav_user        mydavuser
dav_group       mydavgroup
kernel_fs       fuse
ask_auth        0

Since JungleDisk provides WebDAV access to localhost only, and does not require authentication, setting ask_auth to 0 is useful to prevent davfs2 from prompting for a password when mounting the WebDAV URL.  Last but not least, I added an entry to /etc/fstab to configure the mount:

http://localhost:2667 /mountpoint davfs nolocks,noaskauth,rw

Voila!  Now my S3 bucket is mounted to the local Linux filesystem on my EC2 instance.  I have not done any performance testing or cache tuning, but this article over at RightScale looks promising.  Once we have our application up and running, I’ll post back with some more information on how this configuration is working.
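
One caveat with a setup like this: if the mount is missing, an application writing to /mountpoint will quietly fill the local disk instead of S3. A small guard that checks the kernel's mount table before writing might look like the sketch below. The mounted? helper is an illustrative name, and the sample text is invented to show the /proc/mounts format; it is not output captured from this instance.

```ruby
# True when `mountpoint` appears as the second field of any line in
# /proc/mounts-style text (device, mount point, type, options, ...).
def mounted?(mounts_text, mountpoint)
  target = mountpoint.chomp("/")
  mounts_text.each_line.any? do |line|
    fields = line.split
    fields.length >= 2 && fields[1].chomp("/") == target
  end
end

# Invented sample in the /proc/mounts format, for illustration only.
sample = <<~MOUNTS
  /dev/sda1 / ext3 rw 0 0
  http://localhost:2667 /mountpoint fuse rw,nosuid,nodev 0 0
MOUNTS

mounted?(sample, "/mountpoint") # => true
mounted?(sample, "/elsewhere")  # => false

# On the instance itself:
#   mounted?(File.read("/proc/mounts"), "/mountpoint")
```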


Welcome to Cantina Consulting

We're a boutique web development firm that's passionate about next generation web development based on open source technologies.