I work on a project that is still locked into Rails 2.3 running on Ruby 1.8. Over the years, support for internationalization has steadily improved, and now that emoji are part of the Unicode standard and people are trying to use them in blog posts, comments and the like, I inevitably ran into the fiasco that is getting Ruby on Rails on MySQL to deal with them.
It’s been a mess.
In the end, I just opted for a hack on the String class, which gets called at the point where the model’s properties are assigned:
class String
  #
  # Converts multi-byte characters which use more than 2 bytes into HTML entities
  #
  def to_multibyte_html_entities
    each_char.map { |c| c.bytes.count > 2 ? "&#x#{c.multibyte_ord.to_s(16)};" : c }.join
  end

  #
  # Identical to #ord but properly supporting multibyte, like later versions
  # of Ruby
  #
  def multibyte_ord
    unpack('U')[0]
  end
end
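To show roughly what "used at the point that the model's properties are assigned" could look like, here is a minimal sketch. The `Comment` class and its `body=` writer are hypothetical stand-ins for a real model, not the project's actual code, and the two String methods are repeated only so the snippet runs on its own. On Ruby 1.8 it assumes `$KCODE` is set to `'u'` (which I believe Rails 2.3 does on startup) so that `each_char` walks characters rather than bytes.

```ruby
# The two String methods from the hack, repeated so this sketch is self-contained.
class String
  def to_multibyte_html_entities
    each_char.map { |c| c.bytes.count > 2 ? "&#x#{c.multibyte_ord.to_s(16)};" : c }.join
  end

  def multibyte_ord
    unpack('U')[0]
  end
end

# Hypothetical plain-Ruby stand-in for a model; a real ActiveRecord model
# would override its attribute writer in a similar way.
class Comment
  attr_reader :body

  # Convert wide characters to HTML entities before the value is stored.
  def body=(value)
    @body = value.nil? ? nil : value.to_multibyte_html_entities
  end
end

comment = Comment.new
# [0x1F600].pack('U') builds the UTF-8 bytes for U+1F600 (a four-byte emoji)
# without relying on "\u{...}" escapes, which Ruby 1.8 does not have.
comment.body = "Nice post " + [0x1F600].pack('U')
puts comment.body  # => Nice post &#x1f600;
```

Note that the `> 2` threshold also converts three-byte characters such as the euro sign (U+20AC becomes `&#x20ac;`), while two-byte characters like é pass through untouched. MySQL's legacy `utf8` character set can store up to three bytes per character, so raising the threshold to `> 3` would leave those alone and only rewrite the four-byte characters that actually cause trouble.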