Tag: Ruby on Rails

RubyKaigi 2018 Review – conference in a nutshell

RubyKaigi 2018 has ended, but the excitement is still fresh. After 25 hours in planes, trains, buses, and cabs we’re finally home. I guess it’s a good time to summarize and review 4 days on the best Ruby conference in the world.

Castle.io support

First of all, I would like to express special thanks to Castle.io for backing me up, providing me possibility to go to RubyKaigi and for their ongoing Karafka support. Wouldn't happen without them.

What is RubyKaigi?

I’ll let RubyKaigi speak for itself:

RubyKaigi is the authoritative international conference on the Ruby programming language, attracting Ruby committers and Ruby programmers from around the world to Japan, the birthplace of Ruby. Held nearly every year since 2006, RubyKaigi is a truly international event.

RubyKaigi is the most prestigious Ruby conference in the world. It is probably the only conference where you can meet all the core Ruby team members at the same time. It's also one of few conferences, if not the only one, that is in both Japanese and English simultaneously.

What is amazing about this conference is the fact that despite being hosted in Japan, English is not a second-class citizen. In fact, a majority of talks had slides in English, and during Japanese presentations, there was always a real-time translation available.

RubyKaigi gathers people from all around the world, creating a great place for exchanging experience and spreading new ideas. It's also an amazing place if you want to meet some (really a lot!) new people and recharge your Ruby batteries for at least a year.

This year's RubyKaigi was hosted in Sendai.

Traveling to Japan

Traveling to Japan from Poland is not that hard. There's a direct flight from Warsaw to Tokyo Narita  Airport. It takes roughly 10 hours and 30 minutes. Sounds great as long as you're from Warsaw. Unfortunately, I live and work in Cracow. This and the fact, that Narita airport is located far away from the Tokyo city center, makes the whole journey a bit longer. It took us almost 24 hours to get from our apartment in Cracow to our hotel in Tokyo. Still not that bad for a 8658-kilometer distance.

Visiting Asakusa.rb

We've arrived in Japan a couple days before the conference. That was a great opportunity to visit Asakusa.rb, one of the most active and probably the most important Japanese local Ruby group. It's named after Asakusa - the district in Taitō that is famous for the Sensō-ji, a Buddhist temple dedicated to the bodhisattva Kannon.

Asakusa.rb about themself:

Asakusa.rb is probably the most active regional Rubyist group in Japan. We're having weekly meetup on every '''Ruby Tuesday''' somewhere in Tokyo.
If you have a chance to spend a day on Tuesday in Tokyo, come and join our meetup (or drink up)! We're always welcoming foreign guests :)

And to be honest, it's hard not to agree with that. They meet quite frequently, so if by any chance you end up in Tokyo, don't forget to visit them for a little hackathon.

Arriving in Sendai

Sendai is the capital city of Miyagi Prefecture, the largest city in the Tōhoku region, and the second-largest city north of Tokyo. Getting there from the Tokyo Station is straightforward and easy. All you have to do is take an Akita or Tohoku Shinkansen (bullet train). It takes less than two hours.

The city of Sendai is clean, big and friendly :) The only downside is the fact that the exchange rates for buying JPY with USD are much lower than in Tokyo (between 10 and 15 percent).

About the conference

Looking at how the conference was organized, hosted and managed by Akira Matsuda and the rest of the organizers, I can only say one thing: amazing job! It's really hard to point any flaws in the organization and the way everything happened.

I feel that I should also praise the Japanese-English translators. They did amazing job, especially since it was a very technical conference. The only moment one of them got confused was during the TRICK, but it  was understandable. Almost everyone got confused in a way during this part of the conference.

I can really complain only on one thing: inaccurate location descriptions in English. Every party was hosted somewhere else, and the English descriptions and/or addresses of the places were either incomplete or incorrect. It could be a problem for anyone who does not know Japanese and has to rely on Google Maps. Surprisingly, there were even differences in the English and Japanese addresses for some of the locations.

However, in the end, I was able to get to each of the parties (sometimes with a bit of delay), so let's say it was just an eliminating challenge ;).

Day 1 Speeches

Matz’s keynote

The conference was opened by no one else than Matz himself. To be honest, no one knew what he is going to talk about, as his topic was "TBD" till the last moment.

He talked about one of the hardest problems in programming, naming. He admitted that the yield_self name was not exactly as it should be, and that instead of saying what it does, it described how it does it. He also announced that he accepted the then alias name for this method.

He also touched a couple of other "future Ruby" topics like Guilds and JIT but without any technical details.

His talk was the least technical one that I attended during the whole conference, but overall it was interesting. He even admitted that his talk is not going to be technical, because when he starts talking technical stuff people either get bored or they don't understand what he's talking about. I guess many of us are still not there ;).

Karafka - Ruby Framework for Event Driven Architecture

Since it was my talk, it's not for me to judge. All I can say is that I had a lot of fun preparing and presenting this subject to the Ruby community. 40 minutes is definitely not enough to cover the complexity of Karafka, but I hope that the other attendees got at least a general overview of what this piece of software can do for them and how can it help them process a lot of data.

You can get the slides from my talk by clicking on the image below.

All about RuboCop

If you do Ruby for a living, there's a high probability that you know RuboCop. But what you may not be aware of is how it all began, how it evolved and where it will go in the future. BBatsov covered pretty much everything non-technical that you could be interested in reference to this awesome tool, as well as some technical details like deciding on a cross-platform parser and solving many problems in RuboCop development.

Day 2 Speeches

Guild prototype

If this were the only talk at this conference, I would still have come to Sendai. At least this is how I felt before the talk. I had high hopes about this topic and I (still) see guilds as one of the most important (if not the most important) components that will move the Ruby language and its community forward. I expected to get a lot of technical details on managing guilds from Koichi Sasada, plus some implementation details. Did I get what I expected? Well, not exactly...

After the talk I had mixed feelings, mostly because Koichi-san repeated many things that we already knew from his previous presentations. Although this is understandable as not everyone is into guilds as I am, I still was not satisfied. We've received only a glimpse of the whole API, some code examples and (really promising) benchmarks. That was good, but I feel that the earlier the API (an imperfect one, without full implementation) betcomes public, the faster the Ruby core team will get a feedback on it.

One thing that saved all of this was the Guilds initial source code release, right after the conference. I hope, that thanks to this initial code release, things get faster now.

You can download Koichi-san's presentation by clicking on the image below:

Firmware programming with mruby/c

In my humble opinion, this was the best talk during the whole RubyKaigi 2018.

Hasumi's talk included everything I love about Ruby and more:

  • Ruby and more Ruby;
  • mRuby/c that is not well known and not often used in Europe;
  • Non-standard business use case that was super interesting (sake brewery);
  • Hardware and how to interact from within mRuby with it;
  • IoT;
  • Sake.

All of it was wrapped in a really acceptable form with a newbie introduction to the subject. It was really easy to keep track with all the things he was talking about, even despite the fact, that it was in Japanese and some things could be distorted during the translation. He was also able to prove, that the ventilation at the venue was really good due to CO2 sensors running with Ruby!

You can download Hasumi Hitoshi's great presentation by clicking on the image below:

RubyData Workshop (1) Data Science in Ruby

It's always really interesting to see how Ruby is being used in this area. The workshop focused on introducing the audience to some of the data science libraries available for Ruby and going through example sets of data and trying to get some valuable information out of them.

Too bad that the second part was in Japanese. :)

Day 3 Speeches

Parallel and Thread-Safe Ruby at High-Speed with TruffleRuby

TruffleRuby is one of the most interesting projects in the Ruby world at the moment. It's not just a new version of JRuby, it's a high performance implementation of the Ruby programming language built on GraalVM with Oracle support. TruffleRuby aims to run not only the Ruby code fast, but also do that with support for all the native C libraries. This means that once finished, it should be interchangeable with cRuby.

Benoit Daloze's keynote was about bringing more concurrency into Ruby world. Really good, technical talk with a lot of low-level details you won't find anywhere else.

The Method JIT Compiler for Ruby 2.

K0kubun-san's work is always exceptional. His talk was about the second most important (in my opinion) thing that is being currently developed by Ruby core team: YARV-MJIT. The talk included a short introduction and not-so-short technical part during which we all went together through how things are being constructed during the JITing process. It's really hard to summarize this type of talk, without having to write a whole separate article on that matter, so hopefully the presentation and the video recording will be soon available.

The most interesting part for me, was the part about getting rid of C and being able to achieve better performance with Ruby than with C. Remarkable magic it is!

High Performance GPU computing with Ruby

GPU computing with Ruby? Well yes! The talk was about the ArrayFire-rb. It's a library that can be used for high-performance computing in Ruby like statistical analysis of big data, image processing, linear algebra, machine learning, and more.

Good introduction to the subject with a bit of algebra and statistics here and there.


TRICK (Transcendental Ruby Imbroglio Contest for RubyKaigi) is just something else. It's really hard to describe what it really is, but as a big simplification you can say that TRICK is the abstract painting using Ruby language as a brush. The goal of this contest is to create a small and eloquent Ruby program that will illustrate in an abstract way abilities of Ruby language.

This year's TRICK was the biggest, and by looking at the previous editions, I can clearly say that the most crazy. Many were following the patterns that you could see in the previous years. That is creating some obfuscated libraries that would build animations or render 3D models but the winning code really stood out! Being able to create a working Ruby code just from the reserved keywords is not only mind-blowing, is just... something else.

Before parties, parties, after parties and after-after parties - Omotenashi

Ruby is not just a language. It is also a community of super friendly, open-minded people that share the same passion. The same can be said about this conference. It's not only a conference, but it is also a place for exchanging ideas and concepts. And is there a better way to do that than over a bit of sake? I think not.

Every single conference day had a party. There was a party for some of us before the conference and an after party after the after party as well ;)

This part of each conference is, for me, as important as the talks. Being able to ask speakers about their work and hobbies and  to discuss various Ruby-related matters with the best Ruby programmers in the world is always a pleasure. So many great conversations happened during these days...

Here, I must send a shout out to Hiroshi Shibata, Akira Matsuda, Joker007, Satoshi "moris" Tagomori and the rest of the Asakusa.rb members for taking such a good care of me and making me feel like one of them :-) It was amazing experience, and as I mentioned already, you are always welcome in Cracow!


RubyKaigi is really special: A purely technical conference with an extreme Ruby focus.

Technical stuff and technical discussions is what drives me, and the only problem that I had is that the conference was multi-track. I was unable to attend all of the talks and making a choice was almost impossible :(.

I can recommend this conference to any Ruby and Japan fan. Whether you're a beginner or a pro, you will definitely find something for yourself during presentations, workshops and the parties.

Karafka framework 1.2.0 Release Notes (Ruby + Kafka)

Note: These release notes cover only the major changes. To learn about various bug fixes and changes, please refer to the change logs or check out the list of commits in the main Karafka repository on GitHub.

Note: 1.2 release is the last release that will require ActiveSupport to work.

Code quality

I will start with the same thing as with 1.1. We're constantly working on having a better and easier code base. Despite many changes to our code-base stack, we were able to maintain a pretty decent offenses distribution and trends.

It's worth pointing out, that we're now using more extensively many components of the Dry-Rb ecosystem and we love it!


This release brings significant performance improvements allowing you to consume around 40-50k messages per second per single topic. We could do a bit more (around 5-10%) by using symbols as defaults for metadata params key names, but this would bring up a lot of complexity and confusion since JSON parsing returns string keys. It would also introduce some problematic incompatibilities when using additional backend engines that serialize the whole params_batch and deserialize it back.

Karafka is a complex piece of software and benchmarking it can be tricky. There are many use-cases that need to be considered. Some of them single threaded, some of them multi-threaded, some with a non-parsed data rejections and some requiring multiple-thread interactions. That's why it is really hard to design a single benchmark that will be able to compare multiple Kafka + Ruby frameworks in a fair way.

We've decided not to go that way, but rather compare new releases with the previous once. Here are the results of running the same logic with 1.1 and 1.2 multiple times (the more the better):

For some edge cases, Karafka 1.2 can be up to 3x faster than 1.1.

If you are looking for some cross-framework benchmark results, they are available here.


Controllers are now Consumers

Initial versions of Karafka were built with an idea, that we could ignore the transportation layer when working with data. Regardless whether it was an HTTP request, Kafka message or anything else, as long as the data is in a compatible format, we should not have to adapt our business logic to it.

That was the primary reason, why prior to Karafka 1.2 you would put logic in controllers that inherited from ApplicationController or KarafkaController. And this was a mistake.

More and more companies use Karafka within a typical Ruby on Rails stack in which controllers are meant to be Rails controllers. Less experienced developers that encounter Karafka controllers within Rails app/controllers namespace would often end up trying to use some Rails controllers API specific magic without realizing that they're within Karafka controller scope. To eliminate this problem and to match Kafka naming conventions, the processing units that are responsible for feeding you with Kafka data are being renamed to Consumers and from now on, there are no controllers in the Karafka ecosystem.

# Within app/consumers
class UsersCreatedConsumer < ApplicationConsumer
  def consume
    params_batch.each { |params| User.create!(params['user']) }

New instrumentation engine using Dry-Monitor

Note: Dry-Monitor usage requires a separate article. Here's just a brief summary of what we did with it.

Old Karafka monitor was too magical. It would auto-detect the context in which it is invoked, automatically building notification scopes and doing a lot of other things. This was really cool but it was:

  • Slow
  • Hard to maintain
  • Bug sensitive
  • Code change sensitive
  • Not isolated from the rest of the system
  • Hard to use with custom tools like NewRelic or Airbrake
  • Limited when it comes to instrumenting with multiple tools at the same time
  • Too custom to be easily replaced

We are proud to announce, that from now on, Dry-Monitor is the instrumentation backbone of the whole Karafka ecosystem. Here's a simple example of what you can achieve using it:

Karafka.monitor.subscribe 'params.params.parse.error' do |event|
  puts "Oh no! An error: #{event[:error]} occurred!"

and to be honest, possibilities are endless. From simple logging, through in-production performance monitoring up to multi-target complex instrumentation. Please refer to the Monitoring and logging section of Karafka Wiki for more details.

Dynamic Karafka::Params::Params parent class

Karafka is designed to handle a lot of messages. Each incoming message is wrapper with a lazy evaluated hash-like object. Prior to 1.2, each params object was built based on ActiveSupport::HashWithIndifferentAccess. Truth be told, it is not the fastest library ever (benchmark details here), especially when compared to a PORO Hash:

Common Hash#[] access:  8306261.5 i/s
Common Hash#fetch access:  6053147.2 i/s - 1.37x slower
HashWithIndifferentAccess #[] String:  3803546.0 i/s - 2.18x slower
HashWithIndifferentAccess#fetch String:  1993671.6 i/s - 4.17x slower
HashWithIndifferentAccess#fetch Symbol:  1932004.0 i/s - 4.30x slower
HashWithIndifferentAccess #[] Symbol:  1422367.3 i/s - 5.84x slower
Hash#with_indifferent_access #[] String:   470876.8 i/s - 17.64x slower
Hash#with_indifferent_access #fetch String:   414701.6 i/s - 20.03x slower
Hash#with_indifferent_access #fetch Symbol:   410033.7 i/s - 20.26x slower
Hash#with_indifferent_access #[] Symbol: 381347.2 i/s - 21.78x slower

Now imagine that in some cases, we create 50 0000 objects like that per second. This had to have a serious impact on the framework performance. As always, there needs to be a trade-off. Should we go with a Hash in the name of performance or should we use HashWithIndifferentAccess for the sake of the "simplicity"? We will let you choose whatever you find more suitable.

For that reason, we've provided a config params_base_class setting that you can use to set up the base params class from which Karafka::Params::Params will inherit. By default, it is a plain Hash.

require 'active_support/hash_with_indifferent_access'

class App < Karafka::App
  setup do |config|
    # Other settings...
    # config.params_base_class = Hash
    config.params_base_class = HashWithIndifferentAccess

Keep in mind, that you can use other base classes like for example concurrent hash for your advantage. The only requirement is that it needs to have the same API as a Ruby Hash.

System callbacks reorganization with multiple callbacks support

Note: This will be unified with a one set of events that you will be able to hook up to in 1.3 using Dry-Events.

Due to the fact, that some of the things happen in Karafka outside of consumers scope, there are two types of callbacks available:

- Lifecycle callbacks - callbacks that are triggered during various moments in the Karafka framework lifecycle. They can be used to configure additional software dependent on Karafka settings or to do one-time stuff that needs to happen before consumers are created.
- Consumer callbacks - callbacks that are triggered during various stages of messages flow

You can read more about them and how to use them in the Callbacks wiki section.

before_fetch_loop configuration block for early client usage (#seek, etc)

This new callback will be executed once per each consumer group per process before we start receiving messages. This is a great place if you need to use Kafka's #seek functionality to reprocess already fetched messages again.

Note: Keep in mind, that this is a per process configuration (not per consumer) so you need to check if a provided consumer_group (if you use multiple) is the one you want to seek against.

class App < Karafka::App
  # Setup and other things...

  # Moves the offset back to 100 message, so we can reprocess messages again
  # @note If you use multiple consumers group, make sure you execute #seek on a client of
  #   a proper consumer group not on all of them
  before_fetch_loop do |consumer_group, client|
    topic = 'my_topic'
    partition = 0
    offset = 100

    if consumer_group.topics.map(&:name).include?(topic)
      client.seek(topic, partition, offset)

Rewritten NewRelic client

Thanks to NewRelic kindness, we were able to rewrite the whole listener that now can collect various information about the Karafka data flow. It is super easy to use and extend. You can find it in the Monitoring and Logging wiki section.

Key and/or partition key support for responders

You can now provide key and/or partition_key when using responders:

module Users
  class CreatedResponder < KarafkaResponder
    topic :users_created

    def respond(user)
      respond_to :users_created, user, key: user.id

Alias for client#mark_as_consumed on a consumer level

Simple yet powerful. For max performance, you may use manual offset commit management. If you do that, you can now use the #mark_as_consumed directly, without having to refer to the #client object.

class UsersCreatedConsumer < ApplicationConsumer
  def consume
    params_batch.each { |params| User.create!(params['user']) }
    mark_as_consumed params_batch.last

Incompatibilities and breaking changes

Controllers are now Consumers

Please refer to the features section with this one. It is both a feature and a breaking change at the same time.

after_fetched renamed to after_fetch to normalize the naming convention

class ExamplesConsumer < Karafka::BaseConsumer
  include Karafka::Consumers::Callbacks

  after_fetched do
    # Some logic here

  def consume
    # some logic here

is now:

class ExamplesConsumer < Karafka::BaseConsumer
  include Karafka::Consumers::Callbacks

  after_fetch do
    # Some logic here

  def consume
    # some logic here

received_at renamed to receive_time to follow ruby-kafka and WaterDrop conventions

received_at params key is now receive_time. This means that two timestamp values are available for each params object:

  • receive_time - the moment message was received by our Karafka process
  • create_time - the moment our message was created in the producer

Hash is now the default params base class in favor of ActiveSupport::HashWithIndifferentAccess

Long story short: performance and fewer dependencies. You can still use it though:

require 'active_support/hash_with_indifferent_access'

class App < Karafka::App
  setup do |config|
    # Other settings...
    config.params_base_class = HashWithIndifferentAccess

All metadata keys are strings by default

Since now the default params class is a Hash, we had to pick either symbols or strings as key names for all the metadata attributes. We've decided to go with strings as they are more serialization friendly and cooperate with various backends used with Karafka.

Note: If you use HashWithIndifferentAccess, nothing really changes for you.

def consume
  params_batch.first.keys #=> ["parser", "partition", "offset", "key", "create_time", ...]

JSON parsing defaults now to string keys

Since there is no indifferent access by default, when lazy parsing the JSON Kafka data, it will default to string keys that will be merged to the params object. If you're not planning to use the HashWithIndifferentAccess make sure that your code-base is ready for this change.

Karafka 1.1:

class UsersCreatedConsumer < ApplicationConsumer
  def consume
    # Assuming user data is in the 'user' json scope
    params_batch.each do |params| params[:user] #=> { name: 'Maciek' }
      params['user'] #=> { name: 'Maciek' }
      params['receive_time'] #=> 2018-02-27 18:53:31 +0100

Karafka 1.2:

class UsersCreatedConsumer < ApplicationConsumer
  def consume
    # Assuming user data is in the 'user' json scope
    params_batch.each do |params| params[:user] #=> nil
      params['user'] #=> { name: 'Maciek' }
      # Note, that system keys are strings as well
      params['receive_time'] #=> 2018-02-27 18:53:31 +0100

Configurators removed in favor of the after_init block configuration

What were configurators? Let me quote 1.1 wiki on that one:

For additional setup and/or configuration tasks you can create custom configurators. Similar to Rails these are added to a config/initializers directory and run after app initialization.

Due to a changed lifecycle of Karafka process, more things are being built dynamically upon boot. This means that in order to run initializers in a good way, we would have to control the load order in a more granular way. That's why this functionality has been replaced with an after_init callback declaration:

class App < Karafka::App
  # Setup and other things...

  # Once everything is loaded and done, assign Karafka app logger as a Sidekiq logger
  # @note This example does not use config details, but you can use all the config values
  #   to setup your external components
  after_init do |_config|
    Sidekiq::Logging.logger = Karafka::App.logger

Note: you can have as many callbacks of any type as you want to. They also can be objects as long as the respond to a #call method.

Karafka ecosystem gems versioning convention

Karafka is combined from several independent libraries. The most important are:

  • Karafka - The main gem that is used to build Karafka applications that consume messages
  • WaterDrop - WaterDrop is a standalone Karafka component library for generating Kafka messages
  • Capistrano-Karafka - Integration for deployment using Capistrano
  • Karafka Sidekiq Backend - an optional proxy that will pass messages received from Karafka into Sidekiq jobs

Some Karafka users had problems using mismatched versions of those gems. From now on, they all will be released in sync up to the second version point. It means that if you decide to use Karafka 1.2 with other ecosystem libraries, you should match them to 1.2.* as well.

Note: This should be resolved automatically as we locked all the proper versions within gemspec, but still worth mentioning.


Our Wiki has been updated accordingly to the 1.2 status. You probably may want to look at the rewritten Monitoring and logging section and the new Testing guide that illustrates how you can test various Karafka ecosystem components.

Upgrade guide

Controllers are now Consumers

Following steps are required to move from controllers:

  • 1. Create app/consumers directory
  • 2. Rename ApplicationController (or KarafkaController) to ApplicationConsumer / KarafkaConsumer
  • 3. Move the ApplicationController and all Karafka controllers to app/consumers
  • 4. Rename files and classes by replacing "Controller" with "Consumer"
  • 5. If you use callbacks, don't forget about Karafka::Consumers::Callbacks
  • 6. Do exactly the same with your specs/tests
  • 7. Replace the controller consumers groups definition in the karafka.rb file with consumer
  • 8. Rename all the "Controller" with "Consumer" in the karafka.rb file

Karafka, WaterDrop and friends version match

This should be resolved automatically but if you prefer, you can always lock all the Karafka ecosystem gems in your gemfile:

gem 'karafka', '~> 1.2'
gem 'karafka-sidekiq-backend', '~> 1.2'
gem 'capistrano-karafka', '~> 1.2'

Ruby on Rails HashWithIndifferentAccess params compatibility mode

If you still want to use HashWithIndifferentAccess, feel free to:

require 'active_support/hash_with_indifferent_access'
class App < Karafka::App
  setup do |config|
    # Other settings...
    # config.params_base_class = Hash
    config.params_base_class = HashWithIndifferentAccess

Default monitor and logger update

Please refer to the Monitoring and logging Wiki section for details of the way both of those things work now. If you used the default monitoring and logging without any customization, all you need to do is add this to your karafka.rb file after the setup part:


NewRelic client update

If you use our NewRelic example client, please take a look at the new one and upgrade accordingly.

Callbacks rename

class ExamplesConsumer < Karafka::BaseConsumer
  include Karafka::Consumers::Callbacks
  # Rename this
  after_fetched do
    # Some logic here

  # To this
  after_fetch do
    # Some logic here

Karafka params received_at renamed to receive_time

Again, just a name change: if you use 'received_at' params timestamp, you'll enjoy it under the 'receive_time' key.

Getting started with Karafka

If you want to get started with Kafka and Karafka as fast as possible, then the best idea is to just clone our example repository:

git clone https://github.com/karafka/karafka-example-app ./example_app

then, just bundle install all the dependencies:

cd ./example_app
bundle install

and follow the instructions from the example app Wiki.

Copyright © 2024 Closer to Code

Theme by Anders NorenUp ↑