Tag: Ruby

Reading the uncompressed GZIP file size in Ruby without decompression

There are cases where you have a compressed GZIP file for which you want to determine the uncompressed data size without having to extract it.

For example, if you work with large text-based documents, you can either display their content directly in the browser or share it as a file upon request depending on the file size.

Luckily for us, the GZIP file format specification includes the following statement:

         +=======================+
         |...compressed blocks...| (more-->)
         +=======================+

           0   1   2   3   4   5   6   7
         +---+---+---+---+---+---+---+---+
         |     CRC32     |     ISIZE     |
         +---+---+---+---+---+---+---+---+

         ISIZE (Input SIZE)
            This contains the size of the original (uncompressed) input
            data modulo 2^32.

It means that as long as the uncompressed payload is less than 4GB, the ISIZE value will represent the uncompressed data size.

You can get it in Ruby by combining #seek, #read and #unpack1 as followed:

# Open file for reading
file = File.open('data.gzip')
# Move to the end to obtain only the ISIZE data
file.seek(-4, 2)
# Read the needed data and decode it to unsigned int
size = file.read(4).unpack1('I')
# Close the file after reading
file.close
# Print the result
puts "Your uncompressed data size: #{size} bytes"

Cover photo by Daniel Go on Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0). Image has been cropped to 766x450px.

RubyGems dependency confusion attack side of things

Note: This article is not to deprecate any of the findings and achievements of Alex Birsan. He did great work exploiting specific vulnerabilities and patterns. It is to present the RubyGems side of the story and to reassure you. We actively work to provide a healthy and safe ecosystem for our users.

After reading the Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies I felt, that the Ruby community requires a bit of explanation from people involved in RubyGems security assessment. So here it is.

It’s you who is responsible for the security of your software, and bugs do exist

First of all, let me remind you that your system security should never rely solely on other people’s OSS work. In the end, it will be your system that will get hacked. Secondly, any software has bugs, whether we’re talking about Bundler, RubyGems, Yarn, or any other piece of code.

Looking at this incident as someone heavily involved with making the Ruby ecosystem secure, RubyGems did everything right.

Does RubyGems allow for malicious packages?

Starting from September 2020, Alex has uploaded gems used for his research. Diffend spotted them, and each of the findings was reported to the RubyGems security team.

RubyGems security team assessed each of them, going through the source codes to ensure they were not malicious. And they were not. The reason why they were allowed to stay is that:

  1. A security researcher uploaded them that we knew of.
  2. They included a description and reasoning why they do what they do (see here).
  3. Several people from RubyGems checked them to ensure they were not doing any harm.

RubyGems policy regarding gems is simple:

As long as the gem is not doing any harm or is not misleading in a harmful way, it won’t be removed.

Thousands of gems download things upon being installed, run compilers, and send requests over the internet. For some advanced cases, there is no other way.

It is a constant race

We inspect gems and take many countermeasures to ensure the whole ecosystem’s safety. While we cannot promise you that we will catch every single attack ever, we are making progress, and we are getting better and better with our detection systems.

Even the day I’m writing this article, we’ve yanked (removed) 8 gems that were a build-up for a more significant scale attack.

It is not about RubyGems only but also about Bundler

Bundler is a complex piece of code. The resolving engine needs to deal with many corner-cases. Security of resolution is the most important thing, but other factors like correctness and speed are also important.

While we do not know the exact way of installing those gems, our gut feeling is that they might have been resolved “incorrectly”. “Incorrectly,” however, indicates that they were resolved in an invalid way. At the same time, it may turn out that they were resolved exactly as expected but not as the end-user/programmer wanted. It might be, that it was due to this bug. That is, “dependency of my dependency will be checked in RubyGems.”

I use private gems, please help!

If you use private gems with private dependencies, you can do few things at the moment:

  1. Most important: Update Bundler once the patch is released.
  2. Most important: Update Bundler to the 2.2.10 version.
  3. Always use Bundler source blocks to ensure that private gems can only come from private sources. If your private gems depend on other private gems, you may need to declare those gems in the private source block as well.
  4. Setup a tool like Diffend and create rules that will block Bundler from using RubyGems as a source for gems that match your internal gems naming conventions.
  5. Mirror RubyGems and upstream only gems you are interested in + your own.
  6. [workaround] Book the gems names in RubyGems (althought this may also be a potential insight for hackers into internal naming conventions of your company).

Summary

So yes, there was a bug in Bundler (fixed in 2.2.10), but at the same time, if it was not for explicit permission from the RubyGems security team, those gems would have been yanked soon after they were released. That’s why in this particular case, I would rather say that those companies got “researched” rather than hacked.

This incident made RubyGems reconsider the “no harm” policy, and it may be subject to change in the near future.


Cover photo by Jan Hrdina on Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0).

Copyright © 2021 Closer to Code

Theme by Anders NorenUp ↑