Fixing a Simple JRuby Bug
A little over a week ago, JRuby 9.3 was
released which is kind of
a major thing for the project. Unbeknownst to us, our GitHub Actions spec tests
for Facter were running with
jruby-head
,
meaning that the workflows silently switched to testing with the newly released
JRuby. Shortly after, some of our tests started to fail. But first, some background.
What is JRuby?
JRuby is an implementation of the Ruby programming language atop the Java Virtual Machine, written largely in Java.
The Puppet Agent and Facter run under MRI, the standard C Ruby implementation, but there are some puppetlabs projects which actively use JRuby.
Why JRuby?
I just said that Facter runs under CRuby, so why do we run our tests on JRuby? The answer is: puppetserver.
Like its name suggests, Puppet Server is an application that compiles configurations for nodes, and serves them to the agents. Historically, the server has been a WEBrick/Passenger Ruby application, but due to poor performance under heavy loads it was since rewritten and now runs on the Java Virtual Machine. The server is mostly a Clojure application now, but Ruby is still an important part of it, especially since critical parts of the puppet code are executed both by the server and the agent. So, what better way to run Ruby in Java than JRuby?
With this in mind, and because Facter is a tight dependency of Puppet, we have to validate that the project behaves properly when ran under JRuby.
The problem
Some time this week, our Facter PR spec tests started failing on JRuby. At first we thought the failures were transient—this often happens for JRuby, but we get them predominantly on Puppet, not Facter. We rekicked the jobs, they failed again. The error messages were not suggestive at all, so we thought we’d rekick the jobs some more before actually starting to investigate what happened.
Unluckily enough, the same tests were failing over and over again. Here’s what a failing test looked like:
4) Facter#fact when there is a resolved fact with type nil returns nil
Failure/Error: value = fact_collection.value(user_query)
ArgumentError:
wrong number of arguments calling `initialize` (given 0, expected 1..3)
# ./lib/facter.rb:569:in `resolve_fact'
# ./lib/facter.rb:430:in `fact'
# ./spec/facter/facter_spec.rb:279:in `block in <main>'
Now, whether you know about Facter or not, this functionality is pretty important; it handles retrieving information from the fact collection, which is like the central point of Facter.
There were 6 failures in total, all of them complaining about the same thing:
ArgumentError: wrong number of arguments calling `initialize` (given 0, expected 1..3)
Who was calling initialize
? And on what?
The cause
Jumping to the location of the code in question, we can narrow things down to the following:
begin
value = fact_collection.value(user_query)
add_fact_to_searched_facts(user_query, value)
rescue KeyError, TypeError
nil
end
The block rescues KeyError
and TypeError
, but our actual exception is an
ArgumentError
. Let’s dig deeper by checking the FactCollection#value
method.
def value(user_query)
fetch(user_query) do
split_user_query = Facter::Utils.split_user_query(user_query)
split_user_query.reduce(self) do |memo, key|
raise KeyError unless memo.respond_to?(:fetch)
memo.fetch(key) { memo.fetch(key.to_s) }
end
end
end
What the heck? There’s nothing calling initialize
on anything here. It’s
pry
time. I set a breakpoint in the #value
method
and ran a specific failing test, then attempted to execute everything in that method:
From: /home/gabi/repo/facter/lib/facter/models/fact_collection.rb @ line 50 Facter::FactCollection#value:
45: def value(user_query)
46: require 'pry'; binding.pry
47: fetch(user_query) do
48: split_user_query = Facter::Utils.split_user_query(user_query)
49: split_user_query.reduce(self) do |memo, key|
=> 50: raise KeyError unless memo.respond_to?(:fetch)
51:
52: memo.fetch(key) { memo.fetch(key.to_s) }
53: end
54: end
55: end
[1] pry(#<Facter::FactCollection>)> raise KeyError unless memo.respond_to?(:fetch)
=> nil
[2] pry(#<Facter::FactCollection>)> raise KeyError
ArgumentError: wrong number of arguments calling `initialize` (given 0, expected 1..3)
[3] pry(#<Facter::FactCollection>)> KeyError.new
ArgumentError: wrong number of arguments calling `initialize` (given 0, expected 1..3)
from org/jruby/RubyClass.java:878:in `new'
And there you have it, something went wrong with the KeyError
exception which
could no longer be initialized without arguments. As a sanity check I attempted
the same on CRuby:
irb(main):001:0> raise KeyError
Traceback (most recent call last):
4: from /home/gabi/.rbenv/versions/2.7.1/bin/irb:23:in `<main>'
3: from /home/gabi/.rbenv/versions/2.7.1/bin/irb:23:in `load'
2: from /home/gabi/.rbenv/versions/2.7.1/lib/ruby/gems/2.7.0/gems/irb-1.2.3/exe/irb:11:in `<top (required)>'
1: from (irb):1
KeyError (KeyError)
irb(main):002:0> KeyError.new
=> #<KeyError: KeyError>
irb(main):003:0> RUBY_VERSION
=> "2.7.1"
The reason
At this point I didn’t know for sure, but I suspected something changed within
JRuby since this definitely worked with the latest JRuby 9.2 release. I found
this
commit which changed some stuff within the KeyError
class and went in the
9.3.0.0 release. I don’t know Java so I have no idea whether or not this is the
actual culprit.
I searched through the pull requests and issues on GitHub, and wasn’t able to find anything related to my issue. After all, 9.3 had been out for only a week or so. Then I said, jokingly, what if I fixed this?
The fix
I have no experience whatsoever in Java, I always associated the language with guys in suits working at large enterprises, and I always preferred the conciseness of Ruby. Java was… long.
Either way, I decided to shoot my shot. Building JRuby is quite easy and the compile/test feedback loop is fast so I assured myself that I’d be able to quickly iterate on the dumb code I was about to write.
I discovered that there was a test suite pulled from CRuby which validated the
behavior of the KeyError
exception. I ran it. There were 3 failures, and this
is before I started writing any code. Exciting!
I went ahead and implemented an initialize
method with no arguments in the RubyKeyError
class:
@JRubyMethod
public IRubyObject initialize(ThreadContext context) {
return context.nil;
}
I ran the test suite again:
$ jruby test/mri/runner.rb test/mri/ruby/test_key_error.rb
Run options:
# Running tests:
[3/6] TestKeyError#test_key = 0.01 s
1) Failure:
TestKeyError#test_key [/home/gabi/repo/jruby/test/mri/ruby/test_key_error.rb:26]:
ArgumentError expected but nothing was raised.
[5/6] TestKeyError#test_receiver = 0.00 s
2) Failure:
TestKeyError#test_receiver [/home/gabi/repo/jruby/test/mri/ruby/test_key_error.rb:19]:
ArgumentError expected but nothing was raised.
Finished tests in 0.140380s, 42.7411 tests/s, 142.4704 assertions/s.
6 tests, 20 assertions, 2 failures, 0 errors, 0 skips
ruby -v: jruby 9.3.1.0-SNAPSHOT (2.6.8) 2021-10-02 e793316754 OpenJDK 64-Bit Server VM 11.0.12+7 on 11.0.12+7 +jit [linux-x86_64]
So, initializing a KeyError
with no arguments worked now, but there were
still some inconsistencies between JRuby and CRuby. Both rubies allow you to initialize
a KeyError
with a key
and receiver
. If you attempt to query one of those
and they are not set, you get an ArgumentError
in CRuby:
irb(main):001:0> KeyError.new.key
...
ArgumentError (no key is available)
irb(main):002:0> KeyError.new.receiver
...
ArgumentError (no receiver is available)
Compare this with JRuby:
irb(main):001:0> KeyError.new.key
=> nil
irb(main):002:0> KeyError.new.receiver
=> nil
The JRuby implementation mistakenly sets the variables to nil
if
uninitialized, so I set them to the Java null
instead, and for the #key
and
#receiver
methods themselves, I added a null
check where I threw the ArgumentError
. This is how the impacted code looks now (my changes are highlighted):
private IRubyObject initializeCommon(ThreadContext context, IRubyObject message, IRubyObject[] receiverKey) {
IRubyObject receiver;
IRubyObject key;
if (receiverKey == null) {
receiver = null;
key = null;
} else {
receiver = receiverKey[0];
key = receiverKey[1];
}
setMessage(message);
this.receiver = receiver;
this.key = key;
return context.nil;
}
@JRubyMethod
public IRubyObject receiver() {
if (receiver == null) {
throw getRuntime().newArgumentError("no receiver is available");
}
return receiver;
}
@JRubyMethod
public IRubyObject key() {
if (key == null) {
throw getRuntime().newArgumentError("no key is available");
}
return key;
}
I reran the test suite and everything passed now! It was peculiar to me that
the tests were failing from the get-go, so I correctly assumed that they were
not running in JRuby’s CI. The rake task which executes MRI tests collects
files to run from a
file,
and this specific suite wasn’t there. I added it, and then I was able to
properly run the tests with jruby -S rake test:mri
.
I opened a pull request which was promptly merged, so I guess this was my first open-source contribution to a Java project, which to my luck happened to be JRuby.
The moral of this story is: don’t let your dreams be dreams, if you see a bug that seems simple enough to fix, try to do it (even if it’s Java and you have 0 experience with the language).