Tag: parallel

Ruby: Stream processing of shell command results

There are various methods for calling shell commands in Ruby. Most of them either not wait at all for the results, or wait until the command execution is finished.

In many cases it is ok, because programmers want shell command results for further processing. Unfortunately this means that while a shell command runs, there's no way to get partial results and process them (multitasking FTW). It also means that all the results have to be buffered. It might be (for a long running intensive commands) a source of memory leaks in your applications.

Luckily there's a great way to process shell command data in a stream (row after row).

The task

Lets assume that we want to find first 10 files in our file system that match a given pattern (note that it could be achieved way better with just shell commands but it's not the point here).

The bad way

Here's a typical code to achieve that (and believe me - I've encountered solutions like that in production systems):

require 'memory_profiler'
report = MemoryProfiler.report do
  pattern = /test/
  results = `find / 2> /dev/null`.split("\n")
  selection = results.select { |file| file =~ pattern }


This might seem elegant and it definitely works, but let's check Ruby's process memory usage:

Memory usage in MB (before and after find)

Total allocated: 661999052 bytes (2925598 objects)
Total retained:  40 bytes (1 objects)

allocated memory by gem
 661999052  other

allocated memory by class
 632408588  String
  26181752  Array
   3408440  MatchData
       232  Hash
        40  Process::Status

We had to use nearly 700MB of memory and it took us 4.7 seconds just to find few matching files. The time wouldn't be that bad,  but memory usage like this is a bit overkill. It is bad mostly because find / lists all the files and the more things we have, the bigger output we get. This also means that our code will behave differently dependent on what machine it will run. For some we might not have memory problems but for others it might grow over 1GB.

Now imagine what would happen if we would execute this code in 25 Sidekiq concurrent workers...

Of course with GC running we might not kill our machine, but memory spikes will look kinda weird and suspicious.

The good way - hello IO.popen

Instead of waiting for all the files from find / command, let's process each line separately. To do so, we can use the IO.popen method.

IO.popen runs the specified command as a subprocess; the subprocess’s standard input and output will be connected to the returned IO object. (source)

It means that we can execute find command and feed our main process with every line of the output separately.

Note: IO.popen executed without a block will not wait for the subprocess to finish!

require 'memory_profiler'
report = MemoryProfiler.report do
  pattern = /test/
  selection = []

  IO.popen('find / 2> /dev/null') do |io|
    while (line = io.gets) do
      # Note - here you could use break to get out and sigpipe
      # subprocess to finish it early. It will however mean that your subprocess
      # will stop running early and you need to test if it will stop without
      # causing any trouble
      next if selection.size > 10
      selection << line if line =~ pattern




Total allocated: 362723119 bytes (2564923 objects)
Total retained:  394 bytes (3 objects)

allocated memory by gem
 362723119  other

allocated memory by class
 362713215  String
      8432  IO
      1120  MatchData
       232  Hash
        80  Array
        40  Process::Status

45% less memory required.

The best way (for some cases)

There's also one more way to do the same with the same #popen but in a slightly different style. If you:

  • Don't need to process all the lines from the executed command
  • Can terminate subprocess early
  • Are aware of how to manage subprocesses

you can then stream data into Ruby as long as you need and terminate once you're done. Than way Ruby won't fetch new lines and won't have to GC them later on.

require 'memory_profiler'
report = MemoryProfiler.report do
  pattern = /test/
  selection = []
  run = true

  io = IO.popen('find / 2> /dev/null')

  while (run && line = io.gets) do
    if selection.size > 10
      Process.kill('TERM', io.pid)
      run = false

    selection << line if line =~ pattern



Since we don't wait for the subprocess to finish, it definitely will be faster but what about memory consumption?

Total allocated: 509989 bytes (5613 objects)
Total retained:  448 bytes (4 objects)

allocated memory by gem
    509989  other

allocated memory by class
    499965  String
      8432  IO
      1120  MatchData
       232  Hash
       200  Array
        40  Process::Status

99% less than the original solution!

Note: This solution is not always applicable.


The way you execute shell commands really depends on few factors:

  • Do you need an output results at all?
  • Do you need all the lines from the output at the same time?
  • Can you do other stuff and return once the data is ready?
  • Can you process partial data?
  • Can you terminate subprocess early?

When you go with your code out of Ruby scope and when you execute shell commands, it is always good to ask yourself those questions. Sometimes achieving stream processing ability can be done only when the system is being built, so it is really good to think about that before the implementation. In general I would recommend to always consider streaming in every place where we cannot exactly estimate the external command result size. That way you won't be surprised when there will be a lot more data that initially anticipated.

Note: Attentive readers will notice, that I didn't benchmark memory used in the subprocess. That is true, however it was irrelevant to our case as the shell command for all the cases was exactly the same.

Cover photo by: heiwa4126 on Creative Commons license.

Getting ready for new concurrency in Ruby 3 with Guilds

Ruby Guilds are the new way concurrency will be handled in Ruby 3. There's still a long way to go until we reach that point, but I believe that we can already start implementing some of the concepts that will make our lives easier when we reach  Ruby 3.

Note: this is not an article explaining what are Guilds and how do they work. You can read excellent explanations on that here:

Note 2: everything here is based on some assumptions which means that in few years the concept might look totally different. However it doesn't mean that you shouldn't use good practices or some recommendations that I describe below.

Guilds basic concepts in a TL;DR version

  • Guilds have at least one thread (and a thread has a fiber)
  • Threads in different guilds can run in parallel
  • Threads in a same guild can not run in parallel because of GIL/GVL/GGL (Giant Guild Lock)
  • A guild can't access the objects of other guilds
  • Guilds are allowed to communicate with each other using channels (Guild::Channel)
  • Objects can be copied between guilds (deep copy)
  • Objects can be moved between guilds
  • Immutable objects (deeply immutable) can be shared between guilds

It sounds simple (and Koichi said that the initial concept implementation has only 400 lines), but it creates many problems that will have to be solved. I will try to cover some of them as they might have an impact on the overall performance of our code.

Don't try to unlearn locking and multi threading Ruby 2 approach

TL;DR: you will still have to know how to deal with multi threading and it's problems the way it is handled in Ruby 2.

GVL is insufficient to guard against data races on Ruby2 and this won't change inside single guild with multiple threads. Since Ruby core team aims to make Ruby 3 compatible with Ruby 2 software (so the community won't split with incompatible Ruby versions), any Ruby 2 software will run in Ruby 3 in a single Guild. So all your synchronization and locking problems won't go away without an effort.

Scaling with guilds won't be linear so don't think it will solve all your problems

Guilds won't be silver bullets. They will give Ruby programmers a new, great set of tools but they will for sure create some problems. If you hope that memory usage will drastically go down and that performance will go up, without you doing anything you might be really disappointed when new Ruby appears.

Objects owners mean more checking

TL;DR: the less you share the smaller the transferring overhead will be.

Objects will have guild owners. It means that Ruby will have to have references to which guild an object belongs. One of the slides from Koichi's presentation states that an exception will be thrown when trying to access object from other guild. It means that Ruby will have to have some sort of checker that will run either on:

  • every object access
  • every object access for objects that were transfered at least once (flag or something?)
  • every object access of an object that is not frozen and references in other guilds

Either way, there will be way more access checking. Ruby already checks the class of each object on it's access, so maybe this could be combined together.

What that means for us? The less objects we will share between guilds the faster they will run.

Transferring ownership - moving references vs copying

TL;DR: It might be faster (and for sure safer) to start using immutable structures if you plan to transfer a lot of data in between guilds.

Programmers will definitely want to transfer ownership not only for simple objects but also for more complex (and big) data structures. It means that Ruby not only will have to move main objects but all sub-objects (arrays of arrays of objects, etc).

And here a question emerges: wouldn't it be better to just copy the whole structure instead of updating all the references? Is there even a programming language that has a GC and allows moving mutable objects directly (without deep copying) between threads?

Method cache will remain global

TL;DR: OpenStruct will be a worse idea than it is right now. Try even harder not to invalidate method cache too much.

When we redefine a method (or add new), method lookup will have to be performed an cache invalidation needs to occur on all the guilds. This means that we will have to stop execution of all the guilds at once (since there shouldn't be a case when one runs on an old method version and another already uses new one).

If you wonder what OpenStruct does to your Ruby code and what impact exactly it has, you can read my article about that here: http://mensfeld.pl/2015/04/ruby-global-method-cache-invalidation-impact-on-a-single-and-multithreaded-applications/.

Global data

TL;DR: Freeze all the global data you define, stop using global variables, don't redefine stuff unless you really need to and don't overwrite constants.

By global data I mean:

  • Global variables
  • Class and module objects
  • Class variables
  • Constants
  • Instance variables of class and module objects

Operations that redefine things will be either impossible or slower. It will impact also access (mutable constants access between guilds).


There isn't much to summarize since for each section there's a TL;DR but it's worth pointing out that we need to be more cautions about sharing our data and about doing a lot of meta-programming beyond good practices (like redefining dynamically built constants) and everything should be fine.

Cover photo by: Shuets Udono on Creative Commons license

Copyright © 2022 Closer to Code

Theme by Anders NorenUp ↑