
Rack/Ruby on Rails: ArgumentError: invalid byte sequence in UTF-8

# Quick test - just paste this: ?%28t%B3odei%29 at the end of your app URL - if the app crashes, you should read the stuff below ;)

If you're here, then you've probably encountered this weird issue:

ArgumentError: invalid byte sequence in UTF-8

you might even have a backtrace like this:

rack-1.5.2/lib/rack/utils.rb:104→ normalize_params
rack-1.5.2/lib/rack/utils.rb:96→ block in parse_nested_query
rack-1.5.2/lib/rack/utils.rb:93→ each
rack-1.5.2/lib/rack/utils.rb:93→ parse_nested_query
rack-1.5.2/lib/rack/request.rb:373→ parse_query
actionpack-4.0.4/lib/action_dispatch/http/request.rb:321→ parse_query
rack-1.5.2/lib/rack/request.rb:188→ GET
actionpack-4.0.4/lib/action_dispatch/http/request.rb:274→ GET
actionpack-4.0.4/lib/action_dispatch/http/parameters.rb:16→ parameters
actionpack-4.0.4/lib/action_dispatch/http/filter_parameters.rb:37→ filtered_parameters
activesupport-4.0.4/lib/active_support/cache/strategy/local_cache.rb:83→ call
rack-1.5.2/lib/rack/sendfile.rb:112→ call
railties-4.0.4/lib/rails/engine.rb:511→ call
railties-4.0.4/lib/rails/application.rb:97→ call
railties-4.0.4/lib/rails/railtie/configurable.rb:30→ method_missing
puma-2.7.1/lib/puma/configuration.rb:68→ call
puma-2.7.1/lib/puma/server.rb:486→ handle_request
puma-2.7.1/lib/puma/server.rb:357→ process_client
puma-2.7.1/lib/puma/server.rb:250→ block in run
puma-2.7.1/lib/puma/thread_pool.rb:92→ call
puma-2.7.1/lib/puma/thread_pool.rb:92→ block in spawn_thread

First of all, this issue is not super important, and it's not a security issue either. It's just an invalid byte sequence in your request URL. Either way, it would be good to fix it, even if only to get rid of it in your bug tracker.

But before we do anything about it, how can we determine whether our URL contains invalid UTF-8? We can use the URI.decode method for that:

# With an invalid byte sequence
url = 'http://senpuu.net/?techniki,Sawarabi_no_Mai_%28taniec_m%B3odej_paproci%29'
URI.decode(url).force_encoding('UTF-8').valid_encoding? #=> false

# and with a valid one
url = 'http://www.senpuu.net/aktualnosci'
URI.decode(url).force_encoding('UTF-8').valid_encoding? #=> true
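As far as I know, URI.decode was removed in Ruby 3.0, so the check above only works on older Rubies. Below is a minimal sketch of the same idea that decodes the percent-escapes by hand. The valid_utf8_url? helper is a hypothetical name of mine, not something from Rack or Rails:

```ruby
# Hypothetical helper (not part of Rack, Rails or the original snippet):
# percent-decodes a URL by hand and reports whether the resulting bytes
# form valid UTF-8.
def valid_utf8_url?(url)
  # Work on a binary copy, so the regexp never trips over broken bytes
  decoded = url.b.gsub(/%[0-9a-fA-F]{2}/) { |match| match[1, 2].hex.chr }
  decoded.force_encoding('UTF-8').valid_encoding?
end

invalid_url = 'http://senpuu.net/?techniki,Sawarabi_no_Mai_%28taniec_m%B3odej_paproci%29'
valid_url   = 'http://www.senpuu.net/aktualnosci'

puts valid_utf8_url?(invalid_url) # the URL with the %B3 byte
puts valid_utf8_url?(valid_url)   # the plain ASCII URL
```

The %B3 byte is what breaks the first URL: on its own it is a UTF-8 continuation byte without a leading byte.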

So, how can we handle this? Well, we need to catch it in middleware before anything else tries to process it. I think that in such cases we should just return a 400 error (bad request), since this is not something that we expect. Middleware like this can be really simple:

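class Utf8Sanitizer
  # Assumed list of Rack env keys that carry client-supplied URL data - adjust to your needs
  SANITIZE_ENV_KEYS = %w(HTTP_REFERER PATH_INFO REQUEST_URI REQUEST_PATH QUERY_STRING)

  def initialize(app)
    @app = app
  end

  def call(env)
    SANITIZE_ENV_KEYS.each do |key|
      string = env[key].to_s
      valid = URI.decode(string).force_encoding('UTF-8').valid_encoding?
      # Don't accept requests with invalid byte sequence
      return [ 400, { }, [ 'Bad request' ] ] unless valid
    end

    # Everything looks fine - pass the request down the stack
    @app.call(env)
  end
end
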

and after that you just put this into your config/application.rb:

  config.middleware.use Utf8Sanitizer

and you're resistant to this issue.
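If you want to poke at this without booting Rails, here is a self-contained sketch of such a middleware together with a tiny smoke test. Keep in mind that Utf8SanitizerSketch, its hand-rolled decoder and the list of env keys are my assumptions made for illustration, and the lambda is just a stand-in for a real app:

```ruby
# A sketch for experimenting outside Rails - not a drop-in, battle-tested middleware.
class Utf8SanitizerSketch
  # Assumed list of Rack env keys worth checking
  SANITIZE_ENV_KEYS = %w[HTTP_REFERER PATH_INFO REQUEST_URI REQUEST_PATH QUERY_STRING].freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    SANITIZE_ENV_KEYS.each do |key|
      # Reject the request as soon as one of the values is not valid UTF-8
      return [400, {}, ['Bad request']] unless valid_utf8?(env[key].to_s)
    end

    @app.call(env)
  end

  private

  # Percent-decodes by hand (no URI.decode needed) and validates the bytes
  def valid_utf8?(string)
    decoded = string.b.gsub(/%[0-9a-fA-F]{2}/) { |match| match[1, 2].hex.chr }
    decoded.force_encoding('UTF-8').valid_encoding?
  end
end

# Stand-in for the real application
inner_app = ->(_env) { [200, {}, ['OK']] }
stack = Utf8SanitizerSketch.new(inner_app)

bad_status,  = stack.call('QUERY_STRING' => '%28t%B3odei%29')
good_status, = stack.call('QUERY_STRING' => 'q=ok', 'PATH_INFO' => '/aktualnosci')

puts bad_status  # status for the quick-test query string from the top of this post
puts good_status # status for a harmless request
```

One more thing worth checking in your own app: config.middleware.use appends the middleware to the end of the stack. If something earlier in the stack already touches the params, config.middleware.insert_before 0, Utf8Sanitizer may be the better choice.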

## Update

It seems that there's a gem called utf8-cleaner that sanitizes non-UTF-8 strings. It has one issue: instead of raising a 400 error, it just removes the invalid bytes. Still, it's way better than nothing. If you just want to get rid of this problem, put this into your Gemfile:

gem 'utf8-cleaner'

## Update 2
I've got a response from the Rack guys, and it seems that it's more of a Rails issue than a Rack one.
Raggi stated that:

It is a web servers responsibility to translate IO to valid binary representations for the application layer. This isn't the whole picture though, in this case, the webserver has done that - the webserver does not know the encoding of the URI...

It is the responsibility of the IETF to define the validity of URI data in various encodings (not done), and so it is not entirely valid for web servers to make no assumptions for this field for the above...

Rack itself uses a binary regular expression here, which expects binary input strings. This is our response to the above subtleties. In normal operation (say, Webrick + Rack), this error is not raised...

The reason that this error is raised in your application is:

You have middleware in your stack that is forcing this string to UTF-8, even when it is not valid UTF-8. The code that is doing this is bugged.


s = "a=\xff"
# => "a=\xFF"
# => "a=\xFF"
# => true
# => {"a"=>"\xFF"}
# => "a=\xFF"
# => false
ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `split'
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/lib/ruby/gems/2.0.0/gems/rack-1.5.2/lib/rack/utils.rb:93:in `parse_nested_query'
        from (irb):21
        from /usr/local/google/home/raggi/.rbenv/versions/2.0.0-p247/bin/irb:12:in `<main>'

This is a rails bug. Calls to force_encoding should always assert that their output is valid.
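The difference Raggi describes can be tried out without Rack at all. This is only a sketch: SEPARATOR is my assumption, modelled on the binary separator regexp that Rack splits query strings on, and split_pairs is a hypothetical helper, not a Rack method:

```ruby
# Binary (/n) separator regexp, modelled on the one Rack uses for query strings (assumption)
SEPARATOR = /[&;] */n

# Hypothetical helper: splits a raw query string into "key=value" chunks
def split_pairs(query_string)
  query_string.split(SEPARATOR)
end

binary = "a=\xff&b=1".b
puts binary.valid_encoding?      # binary strings have no invalid byte sequences
puts split_pairs(binary).inspect # splitting works fine

utf8 = "a=\xff&b=1".dup.force_encoding('UTF-8')
puts utf8.valid_encoding?        # the same bytes, now claimed to be UTF-8

begin
  split_pairs(utf8)
rescue ArgumentError => e
  puts "#{e.class}: #{e.message}" # the error from the backtrace above
end
```

So the bytes are the same in both cases. The split only blows up once something has forced the string to UTF-8 without checking valid_encoding? afterwards.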

Learning Mongoid – Build scalable, efficient Rails web applications with Mongoid – Book review

First of all, I will point out one thing: I'm not a professional book reviewer. I don't do this too often, probably because I don't have enough time. However, I've decided to do a review of "Learning Mongoid" because I wanted to learn something new and Packt Publishing was kind enough to lend me a copy for this review. So here it is. I'll start with the things that I really liked. As usual, there were some things that could be corrected, but if you have Rails experience, this book will be really helpful for you.


Things I did like about this book

It's not extremely long

You may consider this an issue, but I've found this really helpful. Chapters aren't long, so getting through them is not painful. I bet you've sometimes wondered "what is the author getting at?". Not with this one. Chapters (and the book itself) are really consistent. You won't get bored reading this one or feel like giving up.

A lot of examples

I don't like theoretical texts and books without any examples of good practices. We're developers, we should be able to play around with the new stuff that we learn! And one of the things that I really liked about Learning Mongoid is that I was able to copy-paste almost every example and play around with it on my computer.

Field aliases

Even now I can recall times when I had to rename fields so that I could create an index for them :). I don't know why, but this is not something that is covered in tutorials or other books (at least not in those that I know). On the other hand, this is super useful. I was really surprised to see this one here. It made me realize one thing - this book was written by other guys who develop Rails-Mongoid software.

Geospatial searches and querying in general

When you do a lot of geolocation work, Mongo can be really helpful and can simplify a lot of things. All the basic geo-search options are covered in this book. In general, the whole querying chapter is well written and, together with the aggregation framework, it covers all the common cases that you may want to use.

Performance tuning and maintenance

Performance is really important. If you don't do it right, you might end up with a really slow application. This book covers the basics of both performance tuning and Mongoid maintenance, so after reading it you will be able to use some of the Mongo and Mongoid features to save a few seconds of your users' lives ;)

Things I didn't like about this book

A good book - but not sure whether it's for pros or beginners

Learning Mongoid by Packt Publishing is a solid book about Mongoid, although it lacks some information that would be super useful for beginners. I've got a feeling that it covers most of the "stuff you need to know to start working with Mongo and Mongoid", but as mentioned above, for people who want to start using Mongoid and know only a bit about Ruby, it can be harsh.

Install RVM - but do this on your own

I know that this book should be (and it is!) about Mongoid, but since we're talking about it, it is worth at least mentioning how to install RVM, especially because it is one of the prerequisites. 1-2 pages about RVM would be really helpful.

Need some config hints? Well not this time

The second thing that is lacking is a Mongoid setup instruction. Not even a word on what should or should not be in mongoid.yml, what the most important options are, etc. There is even a mention of it in the book:

There are entirely new options in mongoid.yml for database configuration

However, none of the changes are listed. There is no information about replica_set, allow_dynamic_fields, preload_models or any other important setup options. This is a must-have in any good Mongoid book.

Want to upgrade to most recent Mongoid version? We won't help you out :(

I've mentioned that above, but I will point it out again. The authors say that there are several differences between the new and the old Mongoid, although they don't list them (except for IdentityMap). I think they should.

Want to migrate your app to Mongoid?

Maybe you want to move your app from ActiveRecord to Mongoid (I did it a few times myself)? If so, "Learning Mongoid" will help you handle the Mongo part, but it won't help you with the migration process itself. Sodibee (the example book app) is a Mongoid-based app. Maybe the authors assumed that if you master ActiveRecord and Mongoid, you don't need any extra help to switch between them...


Would I recommend this book? Yes - I already have! It can be a solid Mongo and Mongoid starting point for beginners (apart from some issues that I've mentioned) and a "knowledge refresher" for people who have been using Mongoid for longer than a few weeks. It is well written and it has a lot of examples. Really a good one about Mongoid.

If you're interested in buying this book, you can get it here.

Copyright © 2023 Closer to Code
