Tag: Ruby 3

The hidden cost of a Ruby threads leakage

Bug hunting

Recently I’ve been working with one small application that would gradually become slower and slower. While there were many reasons for it to happen, I found one of them interesting.

To give you a bit of context: the application was a simple single topic legacy Kafka consumer. I rewrote it to Karafka, and all of the logic looks like this:

class EventsConsumer < Karafka::BaseConsumer
  def initialize(...)
    super
    @processor = Processor.new
  end

  def consume
    @processor.call(params_batch.payloads)
  end
end

And the processor looks like so (I removed all the irrelevant code):

class Processor
  def initialize
    @queue = Queue.new
    @worker = Thread.new { work }
  end

  def call(events)
    # do something with events
    results = complex_inline_computation(events)
    @queue << results
  end

  private

  def work
    while true
      result = @queue.pop
      # some sort of async storing operation should go here
      p result
    end
  end

  def complex_inline_computation(events)
    events.join('-')
  end
end

So, we have a Karafka consumer with a processor with one background thread supposed to flush the data async. Nothing special, and when putting aside potential thread crashes, all looks good.

However, there is a hidden problem in this code that can reveal itself by slowly degrading this consumer performance over time.

Karafka uses a single persistent consumer instance per topic partition. When we start processing a given partition of a given topic for the first time, a new consumer instance is created. This by itself means that the number of threads we end up with is directly defined by the number of topics and their partitions we're processing with a single Karafka process.

If that was all, I would say it's not that bad. While for a single topic consuming process, with 20 partitions, we do end up with additional 20 threads, upon reaching this number, the degradation should stop.

It did not.

There is one more case where our legacy consumer and Karafka would spin-up additional threads because of the processor re-creation: Kafka rebalance. When rebalancing happens, new consumer instances are initialized. That means that each time scaling occurred, whether it would add or remove instances, a new processor thread would be created.

Fixing the issue

Fixing this issue without a re-design is rather simple. As long as we can live with a singleton and we know that our code won't be executed by several threads in parallel, we can just make the processor into a singleton:

class Processor
  include Singleton
  
  # Remaining code
end

class EventsConsumer < Karafka::BaseConsumer
  def initialize(...)
    super
    @processor = Processor.instance
  end

  # Remaining code
end

While it is not the optimal solution, in my case it was sufficient.

Performance impact

One question remains: what was the performance impact of having stale threads that were doing nothing?

I'll try to answer that with a more straightforward case than mine:

require 'benchmark'

MAX_THREADS = 100
STEP = 10
ITERS = 50000000

(0..MAX_THREADS).step(STEP).each do |el|
  STEP.times do
    Thread.new do
      q = Queue.new
      q.pop
    end
  end unless el.zero?

  # Give it a bit of time to initialize the threads
  sleep 5

  # warmup for jruby - thanks Charles!
  5.times do
    ITERS.times do ; a = "1"; end
  end

  Benchmark.bm do |x|
    x.report { ITERS.times do ; a = "1"; end }
  end
end

I've run this code 100 times and used the average time to minimize the randomness impact of other tasks running on this machine.

Here are the results for Ruby 2.7.2, 3.0.0-preview2 (with and without JIT) and JRuby 9.2.13, all limited with time taskset -c 1, to make sure that JRuby is running under the same conditions (single core):

CRuby performance degradation is more or less linear. The more threads you have that do nothing, the slower the overall processing happens. This does not affect JRuby as JVM threads support is entirely different than the CRubys.

What worries me more, though, is that Ruby 3.0 seems to degrade more than 2.7.2. My wild guess here is that it's because of Ractors code's overhead and other changes that impact threads scheduler.

Below you can find the time comparison for all the variants of CRuby:

It is fascinating that 3.0 is slower than 2.7.2 in this case, and I will try to look into the reasons behind it in the upcoming months.

Note: I do not believe it's the best use-case for JIT, so please do not make strong claims about its performance based on the charts above.

Summary

The more complex applications you build, the bigger chances are that you will have to have threads at some point. If that happens, please be mindful of their impact on your applications' overall performance.

Also, keep in mind that the moment you introduce background threads, the moment you should introduce proper instrumentation around them.


Cover photo by Chad Cooper on Attribution 2.0 Generic (CC BY 2.0) license.

Building a Ractor based logger that will work with non-Ractor compatible code

Recently Mike Perham shared a tweet with this comment and a code sample on the Ruby 3.0 Ractors.

If this code doesn't work, how could Rails ever work? Ractor seems fundamentally incompatible with many heavily-used Rails APIs.

require 'logger'

class Rails
  def self.logger
    @logger ||= Logger.new(STDOUT)
  end
end

Ractor.new do
  Rails.logger.info "Hello"
end.take

During the weekend I've added support of Ractors in the Diffend.io, a free platform for an OSS supply chain security and management for Ruby and Rails, so I’m relatively fresh with the topic. Mike’s code illustrates one of the issues developers will face when making their code Ractors compatible.

When you try to run it, you will end up with an exception:

terminated with exception (report_on_exception is true):
`take': thrown by remote Ractor. (Ractor::RemoteError)
`logger': can not access instance variables of classes/modules
  from non-main Ractors (RuntimeError)

Is there any way to preserve the Rails#logger API and allow it to be used from any Ractor we want?

There is!

So, let's start by explaining why this code cannot work:

  def self.logger
    @logger ||= Logger.new(STDOUT)
  end

There are actually 2 problems with this code, though only one is visible immediately:

  1. You cannot access instance variables of the shareable objects from Ractors other than the main one.
  2. You cannot access STDOUT from the non-main Ractor (at least not that way).

The good news is said between the lines: while we cannot use shareable objects and cannot refer to instance variables, we can preserve the Rails.logger API!

class Rails
  def self.logger
    rand
  end
end

Ractor.new do
  Rails.logger
end.take.then { p _1 }

#=> 0.06450369439220172

But we want to share a logger, right? Well, not exactly. What we want is to be able to use the same API to log pieces of information. And that's the key point here.

We can bypass all of our problems quickly. We just need a separate Ractor that will run all the logging for our application with a standard logger compatible API.

What do we need to achieve this? Not much. We need to:

  1. Create a Ractor that will have the one and only application wide logger.
  2. Create API for logging.
  3. Connect the Ractor to the Rails#logger interface.

It all can be achieved with a few lines of code:

class Rogger < Ractor
  def self.new
    super do
      # STDOUT cannot be referenced but $stdout can
      logger = ::Logger.new($stdout)

      # Run the requested operations on our logger instance
      while data = recv
        logger.public_send(data[0], *data[1])
      end
    end
  end
 
  # Really cheap logger API :)
  def method_missing(m, *args, &_block)
    self << [m, *args]
  end
end

class Rails
  LOGGER = Rogger.new

  def self.logger
    LOGGER
  end
end

Ractor.new do
  Rails.logger.info "Hello"
end

and when we run it, we end up with a different challenge:

terminated with exception (report_on_exception is true):
ruby/3.0.0/logger/formatter.rb:15:in `call': can not access global variables $$ from non-main Ractors (RuntimeError)
  from ruby/3.0.0/logger.rb:586:in `format_message'
  from ruby/3.0.0/logger.rb:476:in `add'
  from ruby/3.0.0/logger.rb:529:in `info'
  from test.rb:23:in `public_send'
  from test.rb:23:in `block in new'

UPDATE: The pull request that I'm talking about below has been merged, so this monkey patch is no longer needed.

It turns out, the Ruby defaulf logging formatter is not Ractor-friendly. I've opened the pull request to fix this, so once that's merged, the basic Ruby logger formatter will work just fine. For the time being, we will monkey patch it:

class Logger::Formatter
  def call(severity, time, progname, msg)
    Format % [
      severity[0..0],
      format_datetime(time),
      Process.pid,
      severity,
      progname,
      msg2str(msg)
    ]
  end
end

With this, we can run our logging from any ractor we want:

require 'logger'

class Logger::Formatter
  def call(severity, time, progname, msg)
    Format % [
      severity[0..0],
      format_datetime(time),
      Process.pid,
      severity,
      progname,
      msg2str(msg)
    ]
  end
end

class Rogger < Ractor
  def self.new
    super do
      logger = ::Logger.new($stdout)

      while data = recv
        logger.public_send(data[0], *data[1])
      end
    end
  end

  def method_missing(m, *args, &_block)
    self << [m, *args]
  end
end

class Rails
  LOGGER = Rogger.new

  def self.logger
    LOGGER
  end
end

Ractor.new do
  Rails.logger.info "Hello"
end

sleep(1)
ruby test.rb

I, [2020-09-28T18:23:56.181512 #11519]  INFO -- : Hello

Summary

Providing the Ractor support in the things like Rails won't be easy. There are many challenges to tackle, but at the same time, I see it as an excellent opportunity to leverage new Ruby capabilities. It's also a great chance to get away from anti-patterns that are in Ruby and Rails for as long as I can remember. There's a whole new world of engineering that will be much easier to achieve thanks to Ractors.

This year, I want to also explore the possibility of running homogenous Docker containers with Ruby VM in which I could load balance services running in particular guilds. Theoretically, this could allow for sub-second mitigation of sudden traffic spikes without having many overprovisioned instances.


Cover photo by David Stanley on Attribution 2.0 Generic (CC BY 2.0) license.

Copyright © 2024 Closer to Code

Theme by Anders NorenUp ↑