RSpec | Closer to Code

ActiveSupport#descendants can be slow. In a bigger system with layers of descendants, finding all of them can be time-consuming:

puts Benchmark.measure do
  100.times { Dispatchers::Base.descendants }
end

# 5.235370   0.015754   5.251124 (  5.251069)

In the code I've been working on, it meant that a single lookup was taking around 50ms. That is a lot, especially if used extensively.

To mitigate this, I've implemented a simple caching layer on top of the lookup that would make things fast:

module Mixins
  module CachedDescendants
    extend ActiveSupport::Concern

    cattr_accessor :descendants_map

    self.descendants_map = Concurrent::Hash.new

    class << self
      # Clears the descendants map cache - can be hooked to Rails reloader
      def reload!
        descendants_map.clear
      end
    end

    included do
      class << self
        # @return [Array<Class>] array with descendants classes
        def cached_descendants
          ::Mixins::CachedDescendants.descendants_map[self] ||= descendants
        end
      end
    end
  end
end

When included and used, it would give great results:

puts Benchmark.measure do
  100.times { Dispatchers::Base.cached_descendants }
end

# 0.000023   0.000001   0.000024 (  0.000024)

99,99956% faster!

Such code, like any other, deserves to be tested. I wrote some specs for it, including a relatively simple one:

  context 'when there are two independent bases' do
    let(:base1) do
      Class.new do
        include ::Mixins::CachedDescendants
      end
    end

    let(:base2) do
      Class.new do
        include ::Mixins::CachedDescendants
      end
    end

    before do
      Array.new(5) { Class.new(base1) }
      Array.new(5) { Class.new(base2) }
    end

    it 'expect for them not to interact' do
      expect(base1.cached_descendants.size).to eq(5)
      expect(base2.cached_descendants.size).to eq(5)
      expect(base1.cached_descendants & base2.cached_descendants).to be_empty
    end
  end

It would just ensure that the way we cache does not create collisions for independent descendants trees.

But once in a while, this code would randomly fail:

  1) Mixins::CachedDescendants when there are two independent bases expect for them not to interact
     Failure/Error: expect(base1.cached_descendants.size).to eq(4)
       expected: 5
            got: 4
       (compared using ==)

How can I create five classes and suddenly have only 4? I initially thought something was wrong with the descendants lookup for anonymous classes. However, this functionality is heavily used by many, including me, and it never created any problems. On top of that, why would it fail only once in a while?

When something fails randomly, it usually means that there's an external factor to it. One that operates under the hood. It wasn't different in this case.

After some investigation, I was able to reproduce it:

GC.disable

puts "Total classes before: #{ObjectSpace.count_objects[:T_CLASS]}" 
puts "String subclasses count before: #{String.subclasses.count}" 

100.times { Class.new(String) }

puts "Total classes after defining: #{ObjectSpace.count_objects[:T_CLASS]}" 
puts "String subclasses count after defining: #{String.subclasses.count}" 

GC.enable
GC.start

puts "Total classes after GC: #{ObjectSpace.count_objects[:T_CLASS]}" 
puts "String subclasses count after GC: #{String.subclasses.count}" 

# Total classes after defining: 1324
# String subclasses count after defining: 102
# Running GC...
# Total classes after GC: 1124
# String subclasses count after GC: 2

Boom! Anonymous classes are being garbage collected! Classes that are not referenced anywhere are subject to garbage collection like other objects, and this code was not memoizing them:

before do
  Array.new(5) { Class.new(base1) }
  Array.new(5) { Class.new(base2) }
end

Hence, the spec would fail if GC kicked in exactly between the classes definitions and the spec execution. This is why it would only fail once in a while.

Fixing such an issue required only minimal changes to the spec:

    let(:descendants) do
      # This needs to be memorized, otherwise Ruby GC may remove those in between assertions
      [
        Array.new(5) { Class.new(base1) },
        Array.new(5) { Class.new(base2) }
      ]
    end

    before { descendants }

That way, the anonymous classes would be referenced throughout the lifetime of this spec.

Summary (TL;DR)

Anonymous classes and modules are a subject of garbage collection like any other object. Unless you reference them, they may be gone before wanting to use them via #descendants or similar lookups. Always reference them in some way or face the unexpected.

Afterword

It was pointed to me by Jean Boussier, that there's an ActiveSupport::DescendantsTracker that can also be used to improve the lookup performance and that Ruby 3.1 has a #subclasses that also is much faster than iterating over ObjectSpace. While their performance is several times faster than the "old" lookup, it is still a magnitude slower than caching the descendants in dev but has similar performance in production.

0.002020 vs. 0.000023 in dev and 0.000039 vs. 0.000042 in production.

The best code is no code, so now I can deprecate my code. The strong class references, however, are still valid and worth keeping in mind.

Cover photo by Alper Orus on Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0). Image has been cropped.

After we've added RSpec and Cucumber (with PhantomJS) to our CI build process, it got really, really slow. Due to the application character, after each scenario (for Cucumber) we truncate and restore the whole database. 45 minutes for a single build is definitely not what we aimed to get. So, how to speed up tests execution?

First we thought, that we could run RSpec and Cucumber stuff in parallel (using parallel tests gem). We've got a much better machine on AWS to make sure that a single process has a single core to use. Unfortunately everything got... slower. We've decided to pinpoint a single RSpec spec and a single Cucumber scenario that would be representative and figure out what the hell. What we've discovered at the beginning, is that all the specs were running faster on the Ruby level. It all got significantly slower because of the database. Our tests were heavy in terms of DB communication and as I said before, due to it's character, it will probably stay that way.

So, what were our options?

We could get a much better hardware for our testing DBs. Bigger, faster, with SSD, however it would definitely make things more expensive
We could compromise data consistency. Since it is a testing cluster - in case of a system failure / crash /shutdown, we can just drop all the databases and repopulate them again

We've decided to try out the second approach and use fsync PostgreSQL flag to tweak this database a little bit.

What is fsync (quote from PostgreSQL documentation)?

If this parameter is on, the PostgreSQL server will try to make sure that updates are physically written to disk, by issuing fsync() system calls or various equivalent methods (see wal_sync_method). This ensures that the database cluster can recover to a consistent state after an operating system or hardware crash.

While turning off fsync is often a performance benefit, this can result in unrecoverable data corruption in the event of a power failure or system crash. Thus it is only advisable to turn off fsync if you can easily recreate your entire database from external data.

Examples of safe circumstances for turning off fsync include the initial loading of a new database cluster from a backup file, using a database cluster for processing a batch of data after which the database will be thrown away and recreated, or for a read-only database clone which gets recreated frequently and is not used for failover. High quality hardware alone is not a sufficient justification for turning off fsync.

Results were astonishing! Since we're no longer as much dependent on our HDDs performance for each operation, the database layer does not slow us down that much.

Overall thanks to this tweak and parallel execution, we've managed to get down from 45 minutes for a whole build, down to 12 minutes. That is 75% faster than before and this build time is acceptable for us.

Research done by: Adam Gwozdowski

Tag: RSpec

RSpec story about disappearing classes

Summary (TL;DR)

Afterword

Speeding up RSpec and Cucumber on the CI server with PostgreSQL fsync flag and parallel execution