Running with Ruby


Kafka on Rails: Using Kafka with Ruby on Rails – Part 2 – Getting started with Ruby and Kafka

  1. Kafka on Rails: Using Kafka with Ruby on Rails – Part 1 – Kafka basics and its advantages
  2. Kafka on Rails: Using Kafka with Ruby on Rails – Part 2 – Getting started with Ruby and Kafka

Kafka Docker local setup

Before we combine Kafka with Ruby, it is good to have a working local Kafka process. Kafka requires Zookeeper and, to be honest, a local setup can be a bit tricky. The easiest way to get one is to run both in Docker containers. Here’s an example script that should be enough for basic local work. It will spin up a single-node Kafka cluster that you can use out of the box:

KAFKA_ADVERTISED_HOST_NAME=127.0.0.1

docker stop zookeeper
docker stop kafka
docker rm zookeeper
docker rm kafka

# You can disable those two once initially pulled
docker pull jplock/zookeeper:3.4.6
docker pull ches/kafka

docker run \
  -d \
  --name zookeeper \
  jplock/zookeeper:3.4.6

docker run \
  -d \
  --name kafka \
  -e KAFKA_ADVERTISED_HOST_NAME=$KAFKA_ADVERTISED_HOST_NAME \
  --link zookeeper:zookeeper \
  -p $KAFKA_ADVERTISED_HOST_NAME:9092:9092 \
  ches/kafka

ZK_IP=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' zookeeper)
KAFKA_IP=$(docker inspect --format '{{ .NetworkSettings.IPAddress }}' kafka)

echo "Zookeeper: $ZK_IP"
echo "Kafka: $KAFKA_IP"

To check that it works, you can just telnet to it:

telnet 127.0.0.1 9092
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
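
If you would rather run the same check from Ruby, here is a minimal sketch that uses only the standard library. The port_open? helper is a hypothetical name, not part of any Kafka gem, and it only confirms that something accepts TCP connections on the port. It does not speak the Kafka protocol:

```ruby
require 'socket'

# Returns true when a TCP connection to host:port can be opened within
# the timeout, false otherwise.
def port_open?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

puts "Kafka reachable: #{port_open?('127.0.0.1', 9092)}"
```

It is handy as a guard in a boot or test script, so you fail early with a clear message when the containers are not running.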

Note: If you need anything fancy, you can find a more complex Dockerfile setup for running Kafka here.

Getting started with Karafka framework

Karafka is a framework that simplifies the development of Apache Kafka-based Ruby and Rails applications. It provides a higher-level abstraction that lets you focus on your business logic instead of implementing lower-level layers. It gives developers a set of tools dedicated to building multi-topic applications, similar to how Rails applications are built.

As the README states:

  • You can integrate Karafka with any Ruby-based application.
  • Karafka does not require Sidekiq or any other third party software (apart from Kafka itself).
  • Karafka works with Ruby on Rails but it is a standalone framework that can work without it.
  • Karafka has a minimal set of dependencies, so adding it won’t be a huge burden for your already existing applications.
  • It handles processing, using multiple threads, so it will utilize your CPU better (especially for IO-bound applications).

The way you should start with Kafka and Karafka heavily depends on the state of your system. I always recommend one approach for existing, complex systems and a different one for greenfield applications, especially those that don’t use Rails at all.

When using Kafka, it’s quite common to treat applications as parts of a bigger pipeline (similar to a Bash pipeline) and forward the processing results to other applications. Karafka provides two ways of dealing with that:

  • Via responders (recommended for a more complex, complete integration)
  • Using WaterDrop directly – as a messaging layer that can be easily introduced to any application that is already running.

Brownfield system initial integration

Note: This introduction aims to get you sending messages as fast as possible. A broader description of decomposing an already existing Rails application will be provided in one of the upcoming posts in this series.

One of the easiest ways to get started with Kafka and Karafka in an already existing (and often complex) system is by introducing a simple messaging layer that will broadcast events to the Kafka cluster. This approach has several advantages:

  • You can get familiar with the stack without bigger changes to your system.
  • It’s easier.
  • It does not require much configuration and setup.
  • You won’t have to change your deployment process, as messaging can happen from any Ruby process you run, such as a Puma process, a Sidekiq process, a Resque process, etc.

To do so, you need to install WaterDrop. It is a standalone Karafka component library for sending Kafka messages. Despite being one of the framework components, it can also act independently, which makes it easier to bootstrap and use from already running production systems. You can consider it an intermediate step between not having Karafka at all and running it at full scale.

In order to use it, you need to add this to your Gemfile:

gem 'waterdrop'

and run

bundle install

Once you’re done, you also need to create a config/initializers/water_drop.rb configuration file that will contain at least a single Kafka seed broker address:

WaterDrop.setup do |config|
  config.kafka.seed_brokers = %w[kafka://localhost:9092]
end

After that, you should be able to send messages. To check that everything works as expected, just try to deliver a single message with a sync producer:

WaterDrop::SyncProducer.call('message', topic: 'my-topic')
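
To give an idea of what a "simple messaging layer" on top of that call could look like, here is a minimal sketch. The EventBroadcaster class, the event naming and the JSON envelope are all hypothetical; the only thing taken from WaterDrop is the call(message, topic:) shape shown above. The producer is injected, so the example runs without a broker:

```ruby
require 'json'
require 'time'

# Hypothetical thin messaging layer: wraps an event in a JSON envelope and
# hands it to a producer. The producer is anything that responds to
# call(message, topic:).
class EventBroadcaster
  def initialize(producer)
    @producer = producer
  end

  # name    - event name, e.g. 'user.created' (hypothetical naming scheme)
  # payload - hash with the event details
  # topic   - Kafka topic the message should be delivered to
  def broadcast(name, payload, topic:)
    message = JSON.generate(
      event: name,
      payload: payload,
      sent_at: Time.now.utc.iso8601
    )
    @producer.call(message, topic: topic)
    message
  end
end

# Stand-in producer for trying this out locally; in an application with
# WaterDrop configured you would pass WaterDrop::SyncProducer instead.
sent = []
fake_producer = ->(message, topic:) { sent << [topic, message] }

broadcaster = EventBroadcaster.new(fake_producer)
broadcaster.broadcast('user.created', { id: 1 }, topic: 'users')
puts sent.inspect
```

Keeping the producer behind a small class like this means your models and controllers never reference the Kafka client directly, which makes it easy to stub in tests.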

Note: It’s a really good idea to disable topic auto-creation on a production Kafka cluster. Typos happen to everyone. You can read more about Kafka brokers configuration options here.

Note: If you want to go full-scale for both producing and processing messages, just go to the Integrating with Ruby on Rails and other frameworks section of the Karafka Wiki and follow the setup instructions.

Fresh start with a greenfield system

When you don’t need integration with your current stack or you already send messages and want to consume them from a separate application, you can start easily with a clean installation:

mkdir app_dir
cd app_dir
echo "source 'https://rubygems.org'" > Gemfile
echo "gem 'karafka'" >> Gemfile

bundle install
bundle exec karafka install

The karafka install command will create all the files and directories that are required to run the Karafka server process. The most interesting one is karafka.rb, which contains all the configuration details and will hold the routing that matches controllers with the proper Kafka topics.

Note: Karafka controllers will be renamed to Karafka consumers in the upcoming 1.2 release.

Summary – Getting started is easy!

This part of the series wasn’t really long. Karafka is well written and adding it to the stack is not a big problem. And because Kafka messages are immutable, sending messages is a great way to start working with it.

One thing I can suggest at the end of this article is not to throw yourself in at the deep end by implementing producing and consuming at the same time (especially if you don’t have experience with Kafka). Quite often, the initial concept and vision of the processing flow change after some modeling. Broadcasting without consumption gives you a really good playground to test your ideas without any risk.

Stay tuned :-)


Ruby 2.5.0 upgrade remarks

There are already a lot of articles on the shiny new features of Ruby 2.5.0. Here, instead, are some problems and challenges you may encounter during the upgrade process.

Devise SyntaxError

Note: The Devise team has already fixed this one in a new Devise release.

If you encounter this error:

SyntaxError: /.../devise-3.5.5/app/controllers/devise/sessions_controller.rb:5: syntax error, unexpected '{', expecting keyword_end
...ter only: [:create, :destroy] { request.env["devise.skip_tim...
...                              ^

the only thing you can do for now is to edit devise/sessions_controller.rb in your local Devise installation and change:

prepend_before_filter only: [:create, :destroy] { request.env["devise.skip_timeout"] = true }

to (note the parentheses):

prepend_before_filter(only: [:create, :destroy]) { request.env["devise.skip_timeout"] = true }
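
If you want to see the difference outside of Devise, here is a small self-contained sketch. The prepend_filter method is a hypothetical stand-in for the Rails macro; it just records what it receives. It shows the two forms that attach the block to the method call without any ambiguity:

```ruby
# Hypothetical stand-in for a Rails-style filter macro: it records the
# options and the block it was given.
def prepend_filter(options = {}, &block)
  { options: options, block: block }
end

# Parentheses around the arguments make it explicit that the braces
# belong to the prepend_filter call and not to the array literal.
with_parens = prepend_filter(only: [:create, :destroy]) { :skipped }

# A do...end block binds to prepend_filter here, so the parentheses
# can be left out.
with_do_end = prepend_filter only: [:create, :destroy] do
  :skipped
end

puts with_parens[:block].call
puts with_do_end[:block].call
```

Both calls hand the block to prepend_filter, so either style is a safe way to write such macros in your own code.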

Note: To determine your local Devise path, use the following command:

bundle show devise

There will probably be a Devise gem release in a few days that fixes this on RubyGems as well.

ActionCable Argument Error

If you use ActionCable, it is a good idea to postpone the upgrade until this issue is fixed and an upgraded version of ActionCable has been released.

#<Thread:0x00007fbb44c01b90@/Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:73 run> terminated with exception (report_on_exception is true):
Traceback (most recent call last):
	5: from /Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:73:in `block (2 levels) in spawn'
	4: from /Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:84:in `run'
	3: from /Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:84:in `loop'
	2: from /Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:94:in `block in run'
	1: from /Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:94:in `select'
/Users/petercopter/.rbenv/versions/2.5.0-rc1/lib/ruby/gems/2.5.0/gems/actioncable-5.1.4/lib/action_cable/connection/stream_event_loop.rb:94:in `lock': wrong number of arguments (given 283856384, expected 0) (ArgumentError)
WebSocket error occurred: Broken pipe

OpenSSL problem

SecureRandom now prefers OS-provided sources over OpenSSL. It also means that OpenSSL is no longer required by default, so if you use it in a gem, you will have to add:

require 'openssl'

or you will end up with an error like this one:

NameError:
  uninitialized constant OpenSSL
  Did you mean?  Open3
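
As a minimal sketch of the safe pattern, require both libraries explicitly in any file that uses both. The token and digest below are only illustrative values:

```ruby
require 'securerandom'
require 'openssl' # explicit: do not count on another library loading it

# SecureRandom works on its own; OpenSSL is used here only for the digest.
token = SecureRandom.hex(16)
digest = OpenSSL::Digest.new('SHA256').hexdigest(token)

puts "token=#{token}"
puts "sha256=#{digest}"
```

The rule of thumb is simple: every file that references the OpenSSL constant should require it itself.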

Travis does not yet support 2.5.0

Note: Travis already fixed this one.

If you manage a multi-ruby version library, don’t update Travis yet unless you want to compile Ruby yourself. If you list 2.5.0 as one of the versions, Travis will pick the preview1 instead of the final release and you might end up with an error similar to this one:

Traceback (most recent call last):
	1: from /home/travis/.rvm/gems/ruby-2.5.0-preview1@global/bin/ruby_executable_hooks:15:in `<main>'
/home/travis/.rvm/gems/ruby-2.5.0-preview1@global/bin/ruby_executable_hooks:15:in `eval': /home/travis/.rvm/rubies/ruby-2.5.0-preview1/bin/bundle:4: syntax error, unexpected tSTRING_BEG, expecting keyword_do or '{' or '(' (SyntaxError)
exec "$bindir/ruby" -x "$0" "$@"
                       ^
/home/travis/.rvm/rubies/ruby-2.5.0-preview1/bin/bundle:9: syntax error, unexpected keyword_do_block, expecting end-of-input
Signal.trap("INT") do
                   ^~

yield_self as an incompatibility

Again, if you are a maintainer of a multi-ruby version library, don’t forget to backport yield_self into your code-base if you are planning to use it:

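class Object
  # Define the backport only when the running Ruby lacks yield_self
  # (pre-2.5), so the native implementation is never overridden.
  unless method_defined?(:yield_self)
    def yield_self(*args)
      yield(self, *args)
    end
  end
end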
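
Once yield_self is available (Ruby 2.5+ or the backport above), it lets you write a chain of transformations top to bottom. A small sketch with a made-up topic name:

```ruby
# Each step receives the result of the previous one.
normalized = '  My-Topic '
  .yield_self { |name| name.strip }
  .yield_self { |name| name.downcase }
  .yield_self { |name| name.tr('-', '_') }

puts normalized # prints "my_topic"
```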

Dir::Tmpname#make_tmpname is no longer available

If you use Dir::Tmpname#make_tmpname, it is no longer available. Long story short: you need to generate unique names on your own. Click here to see how Rails core team did it.
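
If you just need something that works, here is one possible sketch of generating names on your own. This is not the Rails implementation. The unique_tmp_name helper and its name format are hypothetical, and it relies on SecureRandom for the random part:

```ruby
require 'securerandom'
require 'tmpdir'

# Hypothetical replacement helper: combines a date stamp, the process id
# and random hex, so concurrent processes are unlikely to collide.
def unique_tmp_name(prefix, suffix = '')
  stamp = Time.now.strftime('%Y%m%d')
  "#{prefix}#{stamp}-#{Process.pid}-#{SecureRandom.hex(4)}#{suffix}"
end

path = File.join(Dir.tmpdir, unique_tmp_name('upload-', '.txt'))
puts path
```

Note that this only builds a name. If you need the file itself to be created atomically, Tempfile from the standard library is still the better tool.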

Summary

There aren’t many problems with 2.5.0. All of the things that I’ve encountered are either easy to fix or will disappear after a few weeks of adoption. Long live the Ruby core team! :)

Credits

Cover photo by Victoria Pickering, used under the Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license. Changes made: added an overlay layer with the article title on top of the original picture.


Copyright © 2018 Running with Ruby
