>Unicode in Ruby with SQL Server

>I’ve been using Ruby on Rails a lot recently, and in one instance I found myself in a bit of a pickle: using UTF-8 fields with SQL Server 2005. After a bit of googling and a lot of playing around, I finally found a solution that seems to work; figured that someone out there might find it useful…

Here’s the environment in which I tried this out:

Here are a few things I tried beforehand and which didn’t work:

  • Installed mssqlclient (had high hopes for this one but I kept getting errors about the adapter not existing. If anyone’s got some insight here, please let me know and I’ll add your URL or text to the article :-D)
  • Tried altering the column types (biggest problem here was with the constraint. Yes, you can drop the constraint but I found that it beat the purpose of using rails’ scripts to generate the data model because every time you’d want to drop the column you’d have to edit your migration and plug in the constraint’s unique name… Yuck.)

You’ll also note that, according to rubyonrails’ wiki on using unicode strings (http://wiki.rubyonrails.org/rails/pages/HowToUseUnicodeStrings), you don’t need to change much of your application’s code to switch to UTF8. As a matter of fact, I simply performed the necessary tests from script/console and realized that $KCODE and ActiveSupport:MultiByte were already set / installed.

My problem actually occured when my migrate scripts were generating the data model: instead of generating the utf8 fields nvarchar and ntext, it was creating varchar and text. If your environment is already set up to support utf8, the only thing you need to do is modify the sqlserver_adapter to generate the correct fields.

This is how it’s done:

  • Navigate to C:rubylibrubygems1.8gemsactiverecord-sqlserver-adapter-1.0.0.9216libactive_recordconnection_adapters (assuming that you’re using the default installation paths)
  • Save a copy of sqlserver_adapter.rb (duh :-D)
  • Edit the sqlserver_adapter.rb file and locate the native_database_types method.
  • Change ‘varchar’ to ‘nvarchar’ and ‘text’ to ‘ntext’ – Save the file
  • Regenerate your data model (yup, you’ll need to delete your existing database). If you’ve already got production data in your database, you’re most certainly going to want to backup your data, restore it to another database and migrate data using ETL’s (hand-rolled or SSIS…) Best of luck!

A few last things:

  • In your database.yml file, make sure you’ve got this line in the profile:

encoding: utf8

  • In the ApplicationController class, you’ll need to define a before_filter method to set the correct headers; open up the application.rb file and set up a before filter like this:

before_filter :set_charset

def set_charset

#Sets the response headers to UTF-8. Alone, this doesn’t do squat: 

response.headers[“Content-Type”] ) “text/html; charset=utf-8”

#Sets the code page; very important…

WIN32OLE.codepage = WIN32OLE::CP_UTF8

end

Many thanks to Andres Mondano Pellegrini for the code. You’ll find his article, which saved my ass, here: www.gaugeus.com (if you’re having trouble finding it, perform a search on “Unicode Support in Ruby on Rails with MySQL and Microsoft SQL Server”)