transactions | Closer to Code

When you use Ruby on Rails with ActiveRecord, you can get used to having a separate transaction on each request. The same holds when using Trailblazer inside a request scope; however, Trailblazer on its own does not provide such functionality. This means that when you use it from the console or process things inside background workers, you no longer have an active transaction wrapped around an operation.

This behavior is good most of the time. Since background tasks can run for a long time, there is a risk of unintentionally locking a big part of your database. However, sometimes you just need transactions.

To provide this feature for each operation, we will use a concern that contains that logic. We will also make it configurable, so when we inherit from a given operation, we can still disable or enable transaction support based on the operation's requirements.

The code itself is pretty simple: it just wraps the operation's #run method in a transaction (as long as transaction support is enabled). Note that by default the transactional flag is set to false.

module Concerns
  module Transactional
    extend ActiveSupport::Concern

    included do
      class_attribute :transactional

      self.transactional = false
    end

    def run
      if self.class.transactional
        self.class.transaction do
          super
        end
      else
        super
      end
    end

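    class_methods do
      # Wraps the given block in a database transaction and returns the
      # block's value. No explicit `return` here: leaving a transaction
      # block with a non-local return is deprecated or treated as a
      # rollback on some Rails versions.
      def transaction(&block)
        ActiveRecord::Base.transaction(&block)
      end
    end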
  end
end

To use it, just include it in your operation:

class ApplicationOperation < Trailblazer::Operation
  # Including on its own won't turn transactions on
  include Concerns::Transactional
end

class DataOperation < ApplicationOperation
  # This operation will be wrapped in a single transaction
  self.transactional = true
end
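
To see how the flag behaves across an inheritance chain without booting Rails, here is a minimal plain-Ruby sketch. FakeOperation, FakeTransactional, ReportOperation and the LOG array are stand-ins invented for this illustration. The hand-rolled transactional accessor only approximates what class_attribute does, and the logged BEGIN/COMMIT entries stand in for a real ActiveRecord::Base.transaction.

```ruby
# Plain-Ruby stand-ins (hypothetical, not Trailblazer or ActiveRecord).
LOG = []

class FakeOperation
  def run
    LOG << "#{self.class.name}#run"
    :done
  end
end

module FakeTransactional
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_writer :transactional

    # Fall back to the parent's value when this class has not set its own,
    # roughly what class_attribute provides. Defaults to false.
    def transactional
      return @transactional if instance_variable_defined?(:@transactional)
      superclass.respond_to?(:transactional) ? superclass.transactional : false
    end

    # Stand-in for ActiveRecord::Base.transaction: it only logs.
    def transaction
      LOG << "BEGIN"
      result = yield
      LOG << "COMMIT"
      result
    end
  end

  def run
    return super unless self.class.transactional
    self.class.transaction { super }
  end
end

class ApplicationOperation < FakeOperation
  include FakeTransactional      # flag stays false here
end

class DataOperation < ApplicationOperation
  self.transactional = true      # opt in
end

class ReportOperation < DataOperation
  self.transactional = false     # opt out again further down the chain
end

DataOperation.new.run
ReportOperation.new.run
puts LOG.inspect
```

Only the DataOperation run is surrounded by BEGIN and COMMIT in the log; ReportOperation inherits from it but switches the flag back off.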

Background processing is a must-have in any bigger project. There's no way to process everything within the lifetime of a single request. It's so common that I venture to say each and every one of us has written at least one background worker. Today I would like to tell you a bit about reentrancy.

What is reentrancy

I will just quote Wikipedia, since their description is really nice:

In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution and then safely called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the re-entered invocation completes, the previous invocations will resume correct execution.

So basically it means that if our worker crashes, we can execute it again and everything will be just fine. No fixing of database state, no cleanups, no resetting: nothing. Just re-executing the worker task.

How many workers do you have like this? ;) I must admit: I've created non-reentrant workers many times, and many times I wished I hadn't.

Why our workers should be reentrant and why it's not an overhead to make them like this

Making workers reentrant will take you more time than creating "standard" ones, especially at the beginning. "This will create an overhead", you might think. In the short term that may be true, but in the long run it's not. If you have many workers that are constantly doing something and some of them crash, reentrancy will save you a lot of time. It allows you to just fix the issue and rerun the tasks, without having to worry about anything else. Without it, you would spend time fixing database state, cleaning things up, performing requests to external APIs from the production console, and doing other non-programming work. I guarantee that you will waste much more time doing this than writing your workers well in the first place.

How to make my workers reentrant?

It's not as hard as you might think, although sometimes it can be a bit tricky. Of course every worker is somewhat unique, but there are some general rules that you can use.

Transactional, non-API example test case

The easiest case is non-API, transaction-only workers that calculate and update some data:

class ScoreWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)
    user.update(status: 'calculating')
    # This can take a while...
    user.calculate_score!
    user.update(status: 'calculated')
  end
end

If something goes wrong while calculate_score! is running, we end up with a user stuck in the "calculating" state forever. The easiest way to fix this is to use an ActiveRecord::Base.transaction block:

class ScoreWorker
  include Sidekiq::Worker

  def perform(user_id)
    ActiveRecord::Base.transaction do
      user = User.find(user_id)
      user.update(status: 'calculating')
      # This can take a while...
      user.calculate_score!
      user.update(status: 'calculated')
    end
  end
end

If anything goes wrong, we will get back to where we started. Unfortunately this approach has one huge disadvantage: the user status is changed inside the transaction, so until it is committed, nobody else will see the 'calculating' status (unless your database allows dirty reads).

Non-transactional, non-API example test case

The approach presented below can also be used to improve the previous example. Let's imagine we don't have a transactional DB and that every operation is performed separately. We need to catch the exception, rewind everything, and then reraise the error:

class ScoreWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)
    # State machine is always nice :)
    user.calculating_score!
    # This can take a while...
    user.calculate_score!
    user.calculated_score!
  rescue StandardError
    # Reset everything so it can be processed again later.
    # The guard covers the case where User.find itself raised
    # and user was never assigned.
    user.reset_score! if user
    raise
  end
end

Of course this will not protect us from DB failures, but in my experience, workers tend to fail not because of database issues but because of problems in the app (worker) logic.
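
The rescue-and-reset flow can be exercised without Sidekiq or a database. In the sketch below, FakeUser, its fail_next switch and the score_user method are hypothetical stand-ins for the model and worker above. The point is only that a failed run leaves the record in its starting state, so running it again succeeds.

```ruby
# Hypothetical in-memory stand-in for the User model and its state machine.
class FakeUser
  attr_reader :status, :score
  attr_accessor :fail_next

  def initialize
    @status = 'pending'
    @score = nil
    @fail_next = false
  end

  def calculating_score!
    @status = 'calculating'
  end

  # Fails once when fail_next is set, to simulate a crash mid-calculation.
  def calculate_score!
    if fail_next
      self.fail_next = false
      raise 'scoring crashed'
    end
    @score = 42
  end

  def calculated_score!
    @status = 'calculated'
  end

  def reset_score!
    @status = 'pending'
    @score = nil
  end
end

# Same shape as ScoreWorker#perform, minus Sidekiq and the record lookup.
def score_user(user)
  user.calculating_score!
  user.calculate_score!
  user.calculated_score!
rescue StandardError
  user.reset_score!   # back to the starting point
  raise               # let the caller (or the job queue) see the failure
end

user = FakeUser.new
user.fail_next = true

begin
  score_user(user)
rescue RuntimeError => e
  puts "first run failed: #{e.message}, status is #{user.status}"
end

score_user(user)
puts "second run: status is #{user.status}"
```

After the failed first run the status is back to 'pending' rather than stuck at 'calculating', which is exactly what makes the second run safe.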

Non-transactional, API example test case

What about external API interactions? When we change something remote, we cannot simply "unchange" it. Let's say that we have a charging mechanism that makes a call (charge) and then sends an invoice to the user:

class PaymentWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)
    Payment::Gateway.charge!(user)
    user.send_invoice_confirmation!
  end
end

How can we provide reentrancy for a worker like that? What will happen when user.send_invoice_confirmation! fails? We cannot charge the user again for the same month, which means that we cannot simply execute this worker task again. We might check whether or not the user has already been charged:

class PaymentWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)
    Payment::Gateway.charge!(user) unless user.charged?
    user.send_invoice_confirmation!
  end
end

or we can delegate invoice sending to a separate worker:

class PaymentWorker
  include Sidekiq::Worker

  def perform(user_id)
    user = User.find(user_id)
    Payment::Gateway.charge!(user)
    EmailWorker.perform_async(user_id, :invoice)
  end
end

In this case, if sending the invoice fails, we just need to rerun the EmailWorker task, which will try to send the email again.
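
To make the rerun scenario concrete, here is a small plain-Ruby simulation of the guarded version (the one with the charged? check). FakeGateway, FakeAccount, mark_charged!, mailer_down and pay are hypothetical names made up for this sketch. It also assumes the charge is recorded on the account as soon as the gateway call succeeds, which is what makes the charged? check trustworthy on a rerun.

```ruby
# Hypothetical in-memory gateway that counts how often it is called.
class FakeGateway
  attr_reader :charges

  def initialize
    @charges = 0
  end

  def charge!(account)
    @charges += 1
    account.mark_charged!
  end
end

# Hypothetical stand-in for the user being billed.
class FakeAccount
  attr_accessor :mailer_down
  attr_reader :invoices_sent

  def initialize
    @charged = false
    @mailer_down = false
    @invoices_sent = 0
  end

  def charged?
    @charged
  end

  def mark_charged!
    @charged = true
  end

  def send_invoice_confirmation!
    raise 'mail server unavailable' if mailer_down
    @invoices_sent += 1
  end
end

# Same shape as the guarded PaymentWorker#perform.
def pay(gateway, account)
  gateway.charge!(account) unless account.charged?
  account.send_invoice_confirmation!
end

gateway = FakeGateway.new
account = FakeAccount.new
account.mailer_down = true

begin
  pay(gateway, account)          # charge succeeds, invoice fails
rescue RuntimeError
  account.mailer_down = false    # "fix the issue"
end

pay(gateway, account)            # rerun: the guard skips the second charge
puts "charges: #{gateway.charges}, invoices: #{account.invoices_sent}"
```

The rerun sends the invoice but does not touch the gateway again, so the user is charged exactly once.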

Summary

If you want to build systems that are easy to maintain and work with, reentrancy is a must;
Building reentrant workers will save you a lot of time during crashes;
It's not always about transactions. It's more about the state before the worker is executed: as long as we can provide the same starting point, we can be reentrant;
Reentrancy can be obtained by splitting workers into "atomic" operations that can be rerun;
It's much easier to introduce reentrancy if you use one of the finite-state machines available for Ruby;

Tag: transactions

Integrating Trailblazer and ActiveRecord transactions outside of a request lifecycle

Ruby (Rails, Sinatra) background processing – Reentrancy for your workers is a must be!