Tag: caching

Rails 3.2, Redis-store, views caching and expire_fragment with Regexp

In one of my projects, I have a VPS with slow disk I/O, so I decided to move from the disk cache to something else. Since I pass Regexps to the expire_fragment method, I decided to use Redis-store. It gives me exactly what I need:

  • Performance
  • Persistence
  • I already know Redis ;)
  • I use Redis in the same project for other purposes
  • "Almost" working Regexp support

expire_fragment with Regexp: why aren't you working?

After a few minutes with the Redis-store source code, I figured out why ;) It uses the native Redis KEYS command to get all the matching keys and then expires them. Unfortunately, KEYS doesn't support Regexp matching :( Instead, it works with glob-style wildcard patterns, and so does Redis-store.

Quick fix

OK, this isn't so bad. My Regexps are relatively simple and there aren't too many of them, so converting them should not be a problem. Most of the time I expire fragments outside controllers, so I've created an additional layer in the expire process (read more about this issue). We just need to map all the Regexps into a "Redis-acceptable" form. As I mentioned above, my Regexps are simple, so mapping them was really easy (a few examples):

/announcements-index/ => "*announcements-index*"
/weekly-topics-index/ => "*weekly-topics-index*"

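The Ruby code for such conversions looks like this (Regexp#to_s returns something like "(?-mix:announcements-index)", so we take the part after the last colon and drop the closing parenthesis):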

fragment = "*#{fragment.to_s.split(':').last.gsub(')', '')}*"

This solution works for simple Regexps, and it works for me. Unfortunately, this isn't the only issue with Redis-store. I've overridden the expire_fragment method in my layer:

  def expire_fragment(fragment, options = nil)
    if Rails.configuration.cache_store == :redis_store
      if fragment.is_a?(Regexp)
        fragment = "*#{fragment.to_s.split(':').last.gsub(')', '')}*"
      end
    end
    super
  end

But even with this in place, only exact key matches expire; wildcard patterns still don't.
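The conversion above depends on the exact shape of the Regexp#to_s output, so it quietly mangles any pattern that contains a colon or a parenthesis. A slightly more defensive variant could read the pattern through Regexp#source and refuse anything that is not plain literal text. This is only a sketch: regexp_to_redis_glob is a hypothetical helper name, not part of Rails or Redis-store, and the whitelist of allowed characters is an assumption you may want to adjust.

```ruby
# Hypothetical helper (not part of Rails or Redis-store).
# Converts a *simple* Regexp, made of literal text only, into a
# glob pattern that the Redis KEYS command understands.
def regexp_to_redis_glob(fragment)
  # Strings and other keys are passed through untouched
  return fragment unless fragment.is_a?(Regexp)

  # /announcements-index/.source is "announcements-index"
  source = fragment.source

  # Assumption: only word characters, slashes, colons and hyphens are
  # safe to copy into a glob. Anything else is rejected, not guessed at.
  unless source =~ /\A[\w\/:-]+\z/
    raise ArgumentError, "Regexp too complex to convert: #{fragment.inspect}"
  end

  "*#{source}*"
end

puts regexp_to_redis_glob(/announcements-index/)
puts regexp_to_redis_glob(/weekly-topics-index/)
```

Raising on complex patterns is a deliberate choice here: a fragment that silently fails to expire is harder to notice than an exception.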

cache_store.delete_matched doesn't work?

The expire_fragment method in ActionController::Caching looks like this:

def expire_fragment(key, options = nil)
  return unless cache_configured?
  key = fragment_cache_key(key) unless key.is_a?(Regexp)

  instrument_fragment_cache :expire_fragment, key do
    if key.is_a?(Regexp)
      cache_store.delete_matched(key, options)
    else
      cache_store.delete(key, options)
    end
  end
end

So, as you can see, the delete_matched method is invoked only when we pass a Regexp. But hey! We never pass one :( We pass a string with a wildcard, and Rails tries to expire it using the delete method. Luckily, patching this is really simple:

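module ActiveSupport
  module Cache
    class RedisStore < Store
      # Route every delete through delete_matched so that wildcard strings work.
      # options defaults to nil to match the signature of Store#delete.
      def delete(key, options = nil)
        delete_matched(key, options)
      end
    end
  end
end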

And that's all :) After applying both solutions presented here, Redis-store should work with simple Regexps without any problems.
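One thing to keep in mind: with the patch above, every delete goes through delete_matched, and therefore through KEYS, which scans the whole keyspace. A possible refinement is to take that path only when the key actually contains a wildcard. The sketch below shows the idea on a stand-in class so that it runs without Rails or Redis. FakeStore and WildcardAwareDelete are hypothetical names, and Module#prepend assumes Ruby 2.0 or newer.

```ruby
# Stand-in for the real cache store; it only records which method was called.
class FakeStore
  attr_reader :calls

  def initialize
    @calls = []
  end

  def delete(key, options = nil)
    @calls << [:delete, key]
  end

  def delete_matched(pattern, options = nil)
    @calls << [:delete_matched, pattern]
  end
end

# Hypothetical module: only keys containing "*" are treated as patterns.
module WildcardAwareDelete
  def delete(key, options = nil)
    if key.is_a?(String) && key.include?("*")
      delete_matched(key, options)
    else
      super # plain keys keep the cheap, direct delete
    end
  end
end

FakeStore.send(:prepend, WildcardAwareDelete)

store = FakeStore.new
store.delete("views/announcements-index")
store.delete("views/*weekly-topics-index*")
p store.calls
```

In a real application the module would be prepended to the Redis-store cache class instead of FakeStore; whether that is worth it depends on how many keys the Redis database holds.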

Using Redis as a temporary cache for data shared by multiple independent processes

There are times when you need to share some data between multiple processes, and one way to do it is with Redis. Not so long ago I was working on an application that performs some calculations based on crawled websites. A crawler pulls URLs from a page and sends them to RabbitMQ. Then a worker fetches those URLs and starts multiple calculation processes (each of them independent of the others; they just use the same dataset). They should have used the same dataset, but to be honest... they didn't. Each process downloaded the same website over and over again, so instead of being downloaded once, a page was downloaded at least N times (where N is the number of processes working with that page). That was insane! Each page was analyzed by every process only once, and the maximum time needed for those calculations was approximately 10-15 minutes (with 10 processes). Quite a large share of that time was spent downloading the page.

Below you can see how it worked before the optimization:

[Figure: how it was working before]

And how it should work:

[Figure: how it should work]

As you can see above, there is one method that communicates with Redis. It should try to retrieve a cached page (or any other resource you want to store), and if that fails, it should download the page directly from the Internet. As mentioned before, data should stay in Redis only for a certain amount of time (max 15 minutes). Furthermore, I don't want to take care of expired data myself. Wouldn't it be great if the data could just "magically" disappear? It certainly would!

Redis key expiration

We can set a timeout on each Redis key. After the timeout has expired, the key will be deleted automatically. You can set the expiration time like this (example from the Redis documentation):

redis>  SET mykey "Hello"
OK
redis>  EXPIRE mykey 10
(integer) 1
redis>  TTL mykey
(integer) 10
redis>  SET mykey "Hello World"
OK
redis>  TTL mykey
(integer) -1
redis>

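We will store pages in Redis and set the TTL to 15 * 60 seconds (15 minutes). Note that in the example above the second SET removes the timeout (TTL returns -1), so the expiration has to be set again after every write.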

Connecting Ruby to Redis

Making Ruby work with Redis is really easy. There is a gem called "redis", which is quite simple to use:

require 'rubygems'
require 'redis'

options = {
  :timeout     => 120,
  :thread_safe => true
}

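host = ENV["REDIS_HOST"] || 'localhost'
# ENV values are strings, so convert the numeric settings explicitly
port = (ENV["REDIS_PORT"] || 6379).to_i
db   = (ENV["REDIS_DB"]   || 2).to_i

options.merge!({ :host => host, :port => port, :db => db })

# Redis.new is used here because Redis.connect was deprecated
# and later removed from the gem
redis = Redis.new(options)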

redis.set "my_key", "value"
p redis.get "my_key" # => "value"

Creating our main class

The source code is self-explanatory:

require 'open-uri'
require 'digest/md5'

class WebPage
  attr_reader :redis, :ttl

  # ttl is the cache lifetime in seconds (pass 15 * 60 for 15 minutes)
  def initialize(redis, ttl = 5)
    @redis = redis
    @ttl = ttl
  end

  # Read page
  def read(url)
    # Try to get the page content from Redis
    # and if it fails
    unless d = get(url)
      # Just open it and read
      # (URI.open needs Ruby 2.5+; on older versions use open(url))
      d = URI.open(url).read
      set(url, d)
    end
    d
  end

  private

  # Set url key with page content
  def set(url, page)
    k = key(url)
    @redis.set(k, page)
    @redis.expire(k, @ttl)
  end

  # Get data from a Redis key (nil if the key is missing or expired)
  def get(url)
    @redis.get(key(url))
  end

  # Prepare Redis key for a certain url
  def key(url)
    "webpage:#{Digest::MD5.hexdigest(url)}"
  end
end

Of course, if you want to use this in production, you should also add SSL (https) support and error handling (404, 500, timeouts, etc.).
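If you want to check the read-through logic without a Redis server or network access, one option is to inject both dependencies. The sketch below is not the class above: FakeRedis is a hypothetical in-memory stand-in that mimics only set, get and expire, and CachedFetcher is a hypothetical variant of WebPage that takes the download step as a block.

```ruby
require 'digest/md5'

# In-memory stand-in for the three Redis calls used by the cache.
class FakeRedis
  def initialize
    @data = {}
    @deadlines = {}
  end

  def set(key, value)
    @data[key] = value
    @deadlines.delete(key) # like Redis, a plain SET clears any TTL
    "OK"
  end

  def expire(key, seconds)
    return false unless @data.key?(key)
    @deadlines[key] = Time.now + seconds
    true
  end

  def get(key)
    deadline = @deadlines[key]
    if deadline && Time.now >= deadline
      # The key has expired, so drop it before answering
      @data.delete(key)
      @deadlines.delete(key)
    end
    @data[key]
  end
end

# Same read-through idea as WebPage, with the download step injected.
class CachedFetcher
  def initialize(redis, ttl, &downloader)
    @redis = redis
    @ttl = ttl
    @downloader = downloader
  end

  def read(url)
    key = "webpage:#{Digest::MD5.hexdigest(url)}"
    cached = @redis.get(key)
    return cached if cached

    body = @downloader.call(url)
    @redis.set(key, body)
    @redis.expire(key, @ttl)
    body
  end
end

downloads = 0
fetcher = CachedFetcher.new(FakeRedis.new, 15 * 60) do |url|
  downloads += 1 # count how often we would have hit the network
  "<html>content of #{url}</html>"
end

3.times { fetcher.read("http://example.com/") }
puts downloads # only the first read triggers a download
```

Passing the downloader in as a block also gives you a natural place to add the error handling mentioned above.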

Performance benchmark

I've prepared a simple chart where you can see the performance boost when requesting the same page over and over again (Redis cache benchmark source code):

To minimize the effect of temporary CPU load (some random requests to my dev server, etc.), I repeated the test set 10 times and averaged the data, so it should be quite reliable.

Making 100 requests without the cache takes about 6.5 seconds. Making 100 requests with the cache takes about 0.09 seconds. That is about 72 times faster!
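The speedup is simply the ratio of the two timings: 6.5 / 0.09 ≈ 72. The sketch below shows how such a comparison could be measured with Ruby's Benchmark module. It is not the original benchmark: the download is simulated with sleep, a plain Hash stands in for Redis, and the request count and delay are arbitrary assumptions, so the numbers it prints will differ from the ones above.

```ruby
require 'benchmark'

# Simulated slow download; stands in for a real HTTP request.
slow_fetch = lambda do |url|
  sleep 0.02
  "<html>#{url}</html>"
end

# A plain Hash stands in for Redis here.
cache = {}
cached_fetch = lambda do |url|
  cache[url] ||= slow_fetch.call(url)
end

url = "http://example.com/"
requests = 10

uncached = Benchmark.realtime { requests.times { slow_fetch.call(url) } }
cached   = Benchmark.realtime { requests.times { cached_fetch.call(url) } }

puts "without cache: %.3f s" % uncached
puts "with cache:    %.3f s" % cached
puts "speedup:       %.1fx" % (uncached / cached)
```

With the cache, only the first request pays for the simulated download, which is where the difference comes from.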

Summary

Caching data is really important, and the differences can be really big (10, 100, 1000 times faster with caching enabled!). If you design any type of application, always think about caching as one of the ways to speed it up.

Copyright © 2022 Closer to Code
