
ActiveRecord count vs length vs size and what will happen if you use it the way you shouldn’t

One of the most common and most costly mistakes you can make with ActiveRecord is using length instead of count. You can repeat this warning as many times as you like, and you will still find someone who uses it the way it shouldn't be used.

So, first just to make it clear:

#count - collection.count

  • Counts the number of elements using an SQL query (SELECT COUNT(*) FROM...)
  • The #count result is not stored internally during the object's life cycle, which means that each time we invoke this method, the SQL query is performed again
  • count is really fast compared to length on a collection that is not loaded yet
2.1.2 :048 > collection = User.all; nil
 => nil
2.1.2 :049 > collection.count
   (0.7ms)  SELECT COUNT(*) FROM `users`
 => 16053
2.1.2 :050 > collection.count
 => 16053

#length - collection.length

  • Returns the length of a collection without performing additional queries... as long as the collection is already loaded
  • When the collection is lazily loaded, length will load the whole collection into memory and then return its length
  • Might use all of your memory when used in a bad way
  • Really fast when you have an eagerly loaded collection
2.1.2 :055 > collection = User.all; nil
 => nil
2.1.2 :056 > collection.length
  User Load (122.9ms)  SELECT `users`.* FROM `users`
 => 16053
2.1.2 :057 > collection = User.all; nil
 => nil
2.1.2 :058 > collection.to_a; nil
  User Load (140.9ms)  SELECT `users`.* FROM `users`
 => nil
2.1.2 :059 > collection.length
 => 16053
2.1.2 :060 > collection.length
 => 16053

#size - collection.size

  • Combines the abilities of both previous methods
  • If the collection is loaded, it will count its elements in memory (no additional query)
  • If the collection is not loaded, it will perform an additional COUNT query
2.1.2 :034 > collection = User.all; nil
 => nil 
2.1.2 :035 > collection.count
   (0.3ms)  SELECT COUNT(*) FROM `users`
 => 16053 
2.1.2 :036 > collection.count
   (0.3ms)  SELECT COUNT(*) FROM `users`
 => 16053 
2.1.2 :037 > collection.size
   (0.2ms)  SELECT COUNT(*) FROM `users`
 => 16053 
2.1.2 :038 > collection.to_a; nil
  User Load (64.2ms)  SELECT `users`.* FROM `users`
 => nil 
2.1.2 :039 > collection.size
 => 16053 
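
The three behaviours above can be sketched in plain Ruby. This is only a toy model, not ActiveRecord itself: ToyRelation, its in-memory "table" and its query log are hypothetical names made up for illustration, and the SQL strings are just labels.

```ruby
# Toy model of a lazily loaded relation; NOT ActiveRecord itself.
# ToyRelation and its query log are hypothetical, for illustration only.
class ToyRelation
  attr_reader :queries

  def initialize(rows)
    @rows = rows      # stands in for the database table
    @records = nil    # stays nil until the collection has been loaded
    @queries = []     # log of the "SQL" this object would have issued
  end

  def loaded?
    !@records.nil?
  end

  # Loads every row into memory once, like calling to_a on a relation.
  def to_a
    unless loaded?
      @queries << "SELECT users.* FROM users"
      @records = @rows.dup
    end
    @records
  end

  # Always asks the "database" and never caches the result.
  def count
    @queries << "SELECT COUNT(*) FROM users"
    @rows.size
  end

  # Always works on in-memory records, loading them first if needed.
  def length
    to_a.length
  end

  # Uses in-memory records when loaded, otherwise falls back to a COUNT.
  def size
    loaded? ? @records.length : count
  end
end

users = ToyRelation.new(Array.new(5) { |i| "user#{i}" })
users.size    # not loaded yet, so a COUNT query is logged
users.length  # loads all rows, so a SELECT * query is logged
users.size    # already loaded, so nothing is logged
users.count   # a COUNT query is logged again
puts users.queries
```

The log ends up with three entries: a COUNT, a full load, and another COUNT. The second size call adds nothing because the records are already in memory.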

Why would you even care?

Well, it might have a huge impact on your app's performance (and resource consumption). In general, if you don't want to care at all and you want to delegate this responsibility to someone else, use #size. If you do want to care, then play with it and understand how it works, otherwise you might end up doing something like this:

print "We have #{User.all.length} users!"

And this is the performance difference on my computer (with only 16k users):

             user     system      total        real
count    0.010000   0.000000   0.010000 (  0.002989)
length   0.730000   0.060000   0.790000 (  0.846671)

Nearly 1 second to perform such a simple task. This could have a serious impact on your web app! Keep that in mind.
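
A table in this format is what Ruby's Benchmark.bm prints. Below is a minimal sketch of such a measurement that runs without Rails or a database. ROWS, count_only and load_then_length are hypothetical stand-ins; in a real Rails console the two report blocks would presumably call User.count and User.all.length instead. The timings it prints depend on the machine and say nothing about a real database.

```ruby
require "benchmark"

# Hypothetical stand-in for a users table (same row count as in the post).
ROWS = Array.new(16_053) { |i| { id: i, name: "user#{i}" } }

# Mimics COUNT(*): one number comes back and no rows are copied.
def count_only
  ROWS.size
end

# Mimics User.all.length: every row is copied first, then counted.
def load_then_length
  ROWS.map { |row| row.dup }.length
end

# 7 is the width reserved for the labels in the printed table.
Benchmark.bm(7) do |x|
  x.report("count")  { 100.times { count_only } }
  x.report("length") { 100.times { load_then_length } }
end
```

Both methods return the same number. The only difference is how much work is done to get it, which is the whole point of preferring count (or size) for an unloaded collection.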

Using multiple MongoDB databases instead of one – performance check

I'm starting to develop a new application. I can't say what it is, but it fits MongoDB's document-oriented approach perfectly. Everything is great, except for one small detail: I don't want to store everything in one database. Of course I could use collections and embedded documents to organize the nested structures and keep each user's data separated, but that would make the source code much more complicated than it should be. Instead, I've decided to use one MongoDB database per user. That way I can separate the users' data and I don't need to worry about scoping it. There will be a gateway that authorizes incoming requests and routes them to the proper database.
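
The gateway idea can be sketched in plain Ruby. Everything here is an assumption made for illustration: the Gateway class, the token table and the "app_user_<id>" naming scheme are hypothetical, and no MongoDB driver is involved. It only shows the authorize-then-pick-a-database step.

```ruby
# Hypothetical gateway: maps an API token to a user id, then to the name
# of that user's database. The class, the token table and the naming
# scheme are all assumptions, not part of the original application.
class Gateway
  class Unauthorized < StandardError; end

  def initialize(tokens)
    @tokens = tokens # { "token" => user_id }
  end

  # Returns the database name for a known token, raises otherwise.
  def database_for(token)
    user_id = @tokens.fetch(token) { raise Unauthorized, "unknown token" }
    "app_user_#{user_id}"
  end
end

gateway = Gateway.new("secret-a" => 1, "secret-b" => 2)
puts gateway.database_for("secret-a") # prints "app_user_1"
```

Because each request resolves to exactly one database name, the rest of the code never has to add a per-user scope to its queries.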

[Figure: Gateway]

The schema is pretty straightforward, and the only thing that was bothering me was the performance of switching between multiple databases. I've decided to make a simple benchmark to test whether there's a difference between using one database and using many. The results are promising. It seems that there's only around 5% (5.3% exactly) performance loss when using many databases instead of one. 5% is a difference that I can easily accept. To be honest, I think I will gain much more, especially when I have a lot of data. Let's say I have 100 customers with 100 000 000 records in total. With one database, I would have to query all of it. With separate databases, and assuming the records are spread evenly across customers, I will have to query only 1% of it.
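
The 1% figure is simple arithmetic, spelled out here as a short Ruby sketch. It assumes the records are spread evenly across customers, which the real data may well not be; the variable names are made up for this example.

```ruby
# Numbers taken from the example above; an even split is an assumption.
customers = 100
total_records = 100_000_000

# Records that live in a single customer's database.
records_per_database = total_records / customers

# Share of the whole data set a single-database query has to touch.
share_percent = records_per_database * 100 / total_records

puts records_per_database # prints 1000000
puts share_percent        # prints 1
```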

Below you can see the performance difference when querying one vs. many MongoDB databases.

[Figure: Dbs]

I will definitely go with that approach and I will try to keep you posted.

Note: This is not a full-pro, extremely accurate, long-running test; it is more like a proof of concept. Keep that in mind ;)

Copyright © 2025 Closer to Code
