Running with Ruby


Learning Mongoid – Build scalable, efficient Rails web applications with Mongoid – Book review

First of all, I should point out one thing: I’m not a professional book reviewer. I don’t do this too often, probably because I don’t have enough time. However, I’ve decided to review “Learning Mongoid” because I wanted to learn something new and Packt Publishing was kind enough to lend me a copy for this review. So here it is. I’ll start with the things that I really liked. As usual, there were some things that could be improved, but if you have Rails experience, this book will be really helpful for you.


Things I did like about this book

It’s not extremely long

You may consider this an issue, but I’ve found this really helpful. Chapters aren’t long, so getting through them is not painful. I bet you’ve sometimes wondered “what is the author getting at?”. Not with this one. Chapters (and the book itself) are really consistent. You won’t get bored reading this one or feel like giving up.

A lot of examples

I don’t like theoretical texts and books without any examples of good practices. We’re developers, so we should be able to play around with the new stuff that we learn! One of the things that I really liked about Learning Mongoid is that I was able to copy-paste almost every example and play around with it on my computer.

Field aliases

Even now I can recall times when I had to rename fields just so I could create an index for them :). I don’t know why, but this is not something that is covered in tutorials or other books (at least not in those that I know). And yet it is super useful. I was really surprised to see it here. It made me realize one thing – this book was written by people who actually develop Rails-Mongoid software.

Geospatial searches and querying in general

When doing a lot of geolocation work, Mongo can be really helpful and can simplify a lot of things. All the basic geo-search options are covered in this book. In general, the whole querying chapter is well written and, together with the aggregation framework, it covers all the common cases that you may want to use.

Performance tuning and maintenance

Performance is really important. If you don’t get it right, you might end up with a really slow application. This book covers the basics of both performance tuning and Mongoid maintenance, so after reading it you will be able to use some Mongo and Mongoid features to save a few seconds of your users’ lives ;)

Things I didn’t like about this book

A good book – but not sure whether it is for pros or beginners

Learning Mongoid by Packt Publishing is a solid book about Mongoid, although it lacks some information that would be super useful for beginners. I’ve got a feeling that it covers most of the “stuff you need to know to start working with Mongo and Mongoid”, but for people who want to start using Mongoid and know only a bit about Ruby, it can be harsh.

Install RVM – but do this on your own

I know that this book should be (and it is!) about Mongoid, but since we’re talking about it, it is worth at least mentioning how to install RVM, especially because it is one of the prerequisites. 1-2 pages about RVM would be really helpful.

Need some config hints? Well not this time

The second thing that is lacking is Mongoid setup instructions. There is not a word on what should or should not be in mongoid.yml, what the most important options are, etc. There is even a mention of it in the book:

There are entirely new options in mongoid.yml for database configuration

However, none of the changes are listed. There is no information about replica_set, allow_dynamic_fields, preload_models or any other important setup options. This is a must-have in any good Mongoid book.

Want to upgrade to most recent Mongoid version? We won’t help you out :(

I’ve mentioned this above, but I will point it out again. The authors say that there are several differences between the new and old Mongoid, although they don’t list them (except IdentityMap). I think they should.

Want to migrate your app to Mongoid?

Maybe you want to move your app from ActiveRecord to Mongoid (I did it a few times myself)? If so, “Learning Mongoid” will help you handle the Mongo part, but it won’t help you with the migration process itself. Sodibee (the example book app) is a Mongoid-based app. Maybe the authors assumed that if you master ActiveRecord and Mongoid, you don’t need any extra help to switch between them…

Summary

Would I recommend this book? Yes – I already have! It can be a solid Mongo and Mongoid starting point for beginners (apart from some issues that I’ve mentioned) and a “knowledge refresher” for people who have used Mongoid for longer than a few weeks. It is well written and it has a lot of examples. Really a good one about Mongoid.

If you’re interested in buying this book, you can get it here.

Ruby, Rails + objects serialization (Marshal), Mongoid and performance matters

Introduction

Sometimes we want to store our objects in files or a database directly (not ORmapped or DRmapped). We can achieve this with serialization. This process converts a Ruby object into a format that can be saved as a byte stream. You can read more about serialization here.

Serializing stuff with Ruby

Ruby ships with Marshal for serialization. It is quite easy to use. If you use ActiveRecord, you can use this simple class to store objects in any AR-supported database:

class PendingObject < ActiveRecord::Base

  # Iterate through all pending objects
  def self.each
    self.all.each do |el|
      yield el, el.restore
    end
  end

  # Marshal given object and store it on db
  def store(object)
    self.object = Marshal.dump(object)
    self.save!
  end

  # "Unmarshal" it and return
  def restore
    Marshal.load(self.object)
  end

end

Of course this is just a simple example of how to use serialization. Serialized data should be stored in a binary field:

      t.binary :object
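If you want to try Marshal outside of Rails first, here is a minimal sketch of the same round trip in plain Ruby. The `Point` struct is a hypothetical stand-in for whatever object you want to persist; it is not part of the example above.

```ruby
# Hypothetical value object used only for this illustration.
Point = Struct.new(:x, :y)

original = Point.new(3, 4)

# Marshal.dump turns the object into a binary String (a byte stream)...
bytes = Marshal.dump(original)

# ...and Marshal.load rebuilds an equal, but distinct, object from it.
restored = Marshal.load(bytes)

puts bytes.encoding == Encoding::BINARY # true: raw bytes rather than text
puts restored == original               # true: same member values
puts restored.equal?(original)          # false: it is a new object
```

This is exactly what `store` and `restore` do, minus the database column in between.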

Mongo, Mongoid and its issues with serialization

Unfortunately you can’t just copy-paste this ActiveRecord solution directly into Mongoid:

class PendingObject

  include Mongoid::Document
  include Mongoid::Timestamps

  field :object, :type => Binary

  # Iterate through all pending objects
  def self.each
    self.all.each do |el|
      yield el, el.restore
    end
  end

  def store(object)
    self.object = Marshal.dump(object)
    self.save!
  end

  def restore
    Marshal.load(self.object)
  end

end

It doesn’t matter whether you use Binary or String in the field type declaration. Either way you’ll get this as a result:

String not valid UTF-8

I can understand why this would happen with a String, but why when I set it as a binary value? It should just store whatever I put there…
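One plausible explanation, and it is only an assumption that I have not confirmed against the Mongoid or BSON sources, is that the marshalled data still reaches the driver as a regular Ruby String and gets validated as UTF-8 text. The sketch below does not touch Mongo at all; it only shows that Marshal output is, in general, not valid UTF-8. `Counter` is a hypothetical example class.

```ruby
# Hypothetical example object. The integer 255 is marshalled as the byte 0xFF,
# which can never appear in a valid UTF-8 string.
Counter = Struct.new(:value)

data = Marshal.dump(Counter.new(255))

# Marshal returns raw bytes tagged with the binary encoding.
puts data.encoding == Encoding::BINARY # true

# Reinterpreting the very same bytes as UTF-8 text yields an invalid string.
as_text = data.dup.force_encoding(Encoding::UTF_8)
puts as_text.valid_encoding? # false
```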

Base64 to the rescue

In order to fix this, I’ve decided to use Base64 to encode the serialized data. This has a significant impact on the size of each serialized object (30-35% more), but I can live with that. I was more concerned about performance, which is why I decided to test it. There are two cases that I wanted to check:

  • Serialization
  • Serialization and deserialization (reading serialized objects)
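The Base64 wrapping itself can be sketched as below. The `SafeMarshal` module and its method names are hypothetical helpers made up for this illustration; they are not part of Mongoid or of the benchmark. The idea is that `store` and `restore` would call them instead of calling `Marshal` directly.

```ruby
require 'base64'

# Hypothetical helper: Marshal plus Base64, so the stored value is plain ASCII.
module SafeMarshal
  # Serialize an object into an ASCII-only String.
  def self.dump(object)
    Base64.strict_encode64(Marshal.dump(object))
  end

  # Rebuild the object from a String produced by SafeMarshal.dump.
  def self.load(text)
    Marshal.load(Base64.strict_decode64(text))
  end
end

payload = { 'name' => 'test', 'values' => (1..300).to_a }

raw     = Marshal.dump(payload)
encoded = SafeMarshal.dump(payload)

puts encoded.ascii_only?                  # true
puts SafeMarshal.load(encoded) == payload # true

# Base64 emits 4 characters for every 3 input bytes, so expect roughly +33%.
puts format('%.1f%% larger', (encoded.bytesize.to_f / raw.bytesize - 1) * 100)
```

`Base64.encode64` would work as well; it inserts line breaks into the output, which makes the result slightly bigger.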

Here are the steps that I took:

  1. Create a simple Ruby object
  2. Serialize it 100 000 times in steps of 1000 (without Base64)
  3. Serialize it 100 000 times in steps of 1000 (with Base64)
  4. Benchmark the creation of simple Ruby objects (just as a reference point)
  5. Analyze all the data

Just to be sure (and to smooth out random CPU spikes), I ran the test cases 10 times and took the average values.

Benchmark

Benchmark code is really simple:

  1. Code responsible for preparing the iterations
  2. DummyObject – object that will be serialized
  3. PendingObject – object that will be used to store data in Mongo
  4. ResultStorer – object used to store time results (time taken)
  5. Benchmark – container for all the things
  6. Loops :)

You can download source code here (benchmark.rb).
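If you can’t grab the file, here is a much smaller sketch of the same idea. It is not the original benchmark.rb: it leaves MongoDB out entirely, and `DummyObject`’s fields, the `measure` helper and the iteration counts are assumptions chosen to keep the example short.

```ruby
require 'base64'

# Hypothetical stand-in for the object being serialized.
DummyObject = Struct.new(:id, :name, :tags)

# Run the block `runs` times and return the average wall-clock time in seconds.
def measure(runs)
  total = 0.0
  runs.times do
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
    total += Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  end
  total / runs
end

COUNT = 10_000 # the post goes up to 100 000; lowered here to keep the run short
RUNS  = 3      # the post averages 10 runs

build = ->(i) { DummyObject.new(i, "object-#{i}", %w[a b c]) }

# Reference point: object creation only.
init_time = measure(RUNS) { COUNT.times { |i| build.call(i) } }

# Creation plus Marshal serialization.
marshal_time = measure(RUNS) { COUNT.times { |i| Marshal.dump(build.call(i)) } }

# Creation plus Marshal serialization plus Base64 encoding.
base64_time = measure(RUNS) do
  COUNT.times { |i| Base64.strict_encode64(Marshal.dump(build.call(i))) }
end

puts format('init only:        %.4fs', init_time)
puts format('Marshal:          %.4fs', marshal_time)
puts format('Marshal + Base64: %.4fs', base64_time)
```

The absolute numbers will differ from machine to machine and between Ruby versions, so treat them as relative to the “init only” line.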

Results, charts, fancy data

First, the reference point – pure object initialization (without serialization). We can see that there’s no big decrease in performance, no matter how many objects we initialize. Initializing 100 000 objects takes around 0.25 seconds.

Now some more interesting data :) Object initialization and initialization with serialization (single direction and without Base64):

It is pretty clear that serialization isn’t the fastest way to go. It might slow down the whole process by around 10 times. But that’s still only about 2.5 seconds for 100 000 objects. Now let’s see what happens when we add Base64 to all of it (for reference we will leave the previous values on the chart as well):

It seems that Base64 conversion slows down the whole process by about 10-12% at most. It is still bearable (for 100 000 objects it’s around 2.7s).

Now it is time for the most interesting part: deserialization. By “deserialization” I mean the time that we need to convert a stream of bytes back into objects (serialization time is not taken into consideration here):

Results are quite predictable. Adding Base64 to the deserialization process increases the overall time required by around 12-14%. As before, it is an acceptable overhead – especially when you realize that even then, 100 000 objects can be deserialized in less than 2 seconds.

Let’s summarize all that we have (pure initialization, serialization, serialization with Base64, deserialization, deserialization with Base64, the serialization-deserialization process and the serialization-deserialization process with Base64):

Conclusions

Based on our calculations and benchmarks, we can see that the overall performance drop when serializing and deserializing using Base64 is around 23-26%. If you’re not planning to work with a huge number of objects at the same time, the whole process will still be extremely fast and you can use it.

Of course, if you can use, for example, MySQL with a binary column, there is no need to use Base64 with it. On the other hand, if you’re using MongoDB (with Mongoid) or any other database that has some issues with Binary and you still want to store serialized objects in it, this is the way to go. If you also consider the bigger size of Base64 data, the total performance loss should not exceed 35%.

So: if you don’t have time to look for a better solution and you are aware of the disadvantages of this one – you can use it ;)


Copyright © 2017 Running with Ruby
