RubyConf 2019 – Digesting MRI by Studying Alternative Ruby Implementations by Christian Bruckmayer


(upbeat music) – Hello, hello, RubyConf. – [Audience] Hello, woo! – Hey, I hope you all had a great lunch. Welcome to my talk. And I would like to start
my talk with a little story about my first job, which is actually also my last job. It was the first job I had after university and I was one of the first
engineers in the team. There was only my boss
and another engineer and then I joined. And over time, we started to grow this team and we hired
more and more developers. So in the end, we had
around 12 developers. And so over this time,
I was helping onboard all these new engineers,
I did pair programming and mob programming, and I was taking more and more responsibilities. And at some point, I got promoted
to be a senior developer. And it was still my first job and I was at a point in my
career where I was sitting down and thinking and reflecting,
what is a senior developer? What actually is a senior developer? So I was sitting down, thinking about it. And one thing that came to my mind is experience. So probably most of you would agree that somebody who has been a software
engineer for 10 years, 15 years, who has a lot of experience is maybe a senior developer. Or somebody who provides leadership, who can help bring a feature
from inception to production, split it up and delegate tasks to different developers. Or maybe somebody who is a good
mentor, who can help junior developers to level up to
become better developers. And one thing that also came to my mind is somebody who is a really good programmer; “masters programming” is maybe a little bit strong, but somebody who is really good at programming. And I was a Ruby programmer, so for me, it was somebody who was really good at programming Ruby. So I was sitting down and thinking, “How can I become a
better Ruby programmer?” And it came to my mind: Ruby is just a computer program. And just as you can study a Rails application to understand it better, you could study Ruby itself to become a better Ruby programmer. And that is the topic of my talk today: digesting MRI by studying alternative
Ruby implementations. My name is Christian Bruckmayer. Most of what I talk about today is on my website, bruckmayer.net; I wrote a few blog articles there, mostly about JRuby and the performance work I did on it. So if you’re interested,
if you want to learn more, and also see more benchmarks,
then go to my blog. I live at the moment in
Bristol in the United Kingdom, Bristol is around two
hours away from London, so it’s not London; there are more cities in England besides London. I currently work at Cookpad; the global headquarters of Cookpad is in Bristol. If you’re into home cooking,
definitely check out Cookpad, if you’re interested in cooking, then talk to me after the talk. And today, I brought three
different examples for you. One example is about MRI, my second example is about Rubinius and Opal, and the last example is about JRuby. The first two examples are quite small, and most of what I will talk about is JRuby. So let’s get started. My first example is about
MRI/CRuby. When most people speak about Ruby, they actually mean MRI, which is the reference implementation of Ruby, started by Matz in 1993. It’s implemented in C and, as I said, when most people say Ruby, they mean MRI. So to recap, I wanted to become
a better Ruby programmer. I thought, “Ruby is a computer program, “so I will just go to GitHub,
read the Ruby source code, “go to the Ruby Issue
Tracker, maybe fix a bug, “implement a feature and eventually, “I will be a better Ruby programmer.” So my first step was I went
to the Ruby Issue Tracker, browsed around, searched for features or bugs I could work on
and I found this issue, Array#minmax is much slower
than calling both #min and #max. So what does that mean? You have an array with the numbers one to nine, and you can call array.minmax and it will return an array with the numbers one and nine, or you can call array.min and array.max and they return the same result. So I did a benchmark, compared the two methods, and it turns out array.minmax is actually 1.8 times slower than calling array.min and array.max, and that was surprising because I was expecting that calling one method should be faster than calling two methods. And the reason is that Array#minmax comes from the Enumerable module and doesn’t use Array’s fast internal methods.
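A minimal sketch of that kind of comparison, using the benchmark-ips gem, might look like this (my assumption; the talk doesn’t say which benchmarking tool was used):

    require "benchmark/ips"

    array = (1..9).to_a

    Benchmark.ips do |x|
      x.report("Array#minmax")  { array.minmax }
      x.report("#min and #max") { [array.min, array.max] }
      x.compare!
    end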
So I thought, “Sounds like a good first issue,” so I got started, started programming, created a pull request and yeah, I was happy. (audience laughing) But my pull request got closed really quickly because there was already a patch attached to the issue on the Ruby Issue Tracker, which I didn’t see. And I was really disappointed
because I did all this work and I was hoping to get a patch into Ruby and then I was thinking back,
my goal was understanding Ruby better, so my goal wasn’t
getting a patch into Ruby. And actually, I learned something new, I learned how to compile
Ruby, I learned to program a little bit in C, which I hadn’t done since university. It challenged me to try out
something new and eventually, it also inspired me to look into different implementations of Ruby. Which brings us to my second example, which is about Rubinius and Opal. So Rubinius is a Ruby
implementation written in Ruby, so only the virtual machine
is implemented in C and C++. And Opal is actually a Ruby-to-JavaScript transpiler, which makes it possible
to run your Ruby code in the browser, so it’s really cool. And my idea was: as MRI is the reference implementation of Ruby, I will go to
the Ruby release notes, see what is new in Ruby
and then I will check out the different implementations and see if they already
implemented these features and, if not, I will do it. So in Ruby 2.5, String got a method called delete_prefix. What delete_prefix does: you have a string, you call delete_prefix on it, and it removes the first part of the string if it matches the given prefix. In Rubinius, it looks very simple: it just does a type conversion and then, if the string starts with the prefix, it removes it. Otherwise, it returns a copy of the string. And in Opal, it actually works very, very similarly; one difference you can see there is the %x literal, which usually shells out and executes a command in your shell. What it does in Opal is execute JavaScript code.
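In plain Ruby, the behaviour described here looks roughly like this (a simplified sketch based on the talk’s description, not the actual Rubinius or Opal source; the standalone delete_prefix helper is just for illustration):

    # Coerce the prefix to a String, strip it if it matches,
    # otherwise return a copy of the original string.
    def delete_prefix(string, prefix)
      prefix = String.try_convert(prefix)
      raise TypeError, "prefix must be convertible to a String" unless prefix

      if string.start_with?(prefix)
        string[prefix.length..-1]
      else
        string.dup
      end
    end

    delete_prefix("hello world", "hello ")  # => "world"
    delete_prefix("hello world", "foo")     # => "hello world" (a copy)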
So I started implementing it, and my first step was playing around with this new method. I tried, “It should work with a symbol, right?” So I was trying it out with a symbol and I got a TypeError, and I was a little bit surprised because I was expecting it to work with a symbol. So I was looking at the implementation again, and it does a conversion with to_str. And as it turns out, a symbol does not implement the to_str method, because a symbol is not a string. It implements the to_s method, but not to_str.
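You can see that difference directly in IRB (the exact error message may vary slightly between Ruby versions):

    :pre.respond_to?(:to_s)    # => true
    :pre.respond_to?(:to_str)  # => false

    "prefix".delete_prefix(:pre)
    # TypeError: no implicit conversion of Symbol into String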
And there’s actually a really interesting issue on the Ruby Issue Tracker from six years ago with discussions about whether a symbol should implement the to_str method. They also discuss why it shouldn’t: because a symbol is not a string, it just behaves like a string in some places. And I learned about explicit and implicit conversions: in the explicit case, we say “this symbol is a string” and then use it as a string, and in the implicit case, Ruby does the conversion in the background for us. So for instance, here we can just create a class Prefix, implement the to_str method and then use an instance of it as a prefix, and it would work.
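As a sketch of that idea (the Prefix class here is a hypothetical example, not code from the talk’s slides):

    class Prefix
      def initialize(value)
        @value = value
      end

      # Implicit conversion: Ruby calls to_str behind the scenes
      # wherever a String is expected.
      def to_str
        @value
      end
    end

    "hello world".delete_prefix(Prefix.new("hello "))  # => "world"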
And interestingly, if you look into Rails, that is what they do with the path class: path was initially inherited from String, and Aaron Patterson refactored it about seven years ago so that it no longer inherits from String, and he implemented the to_str method instead. So what you can see here is that you can take a string and concatenate it with a plus if you implement the to_str method, which doesn’t work if you only implement to_s.
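For example, reusing the hypothetical Prefix class from above (OnlyToS is likewise made up for illustration):

    "Hello, " + Prefix.new("world")  # => "Hello, world" (to_str is called implicitly)

    class OnlyToS
      def to_s
        "world"
      end
    end

    "Hello, " + OnlyToS.new
    # TypeError: no implicit conversion of OnlyToS into String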
There was a pull request from seven years ago where Aaron explains it very well on the issue. And then I looked at the delete_prefix implementation again and I was surprised: if you don’t remove the prefix, so if it doesn’t match, it returns a copy of the string, which surprised me a little bit. And on Monday evening, I was
out for dinner and I met Ryan, who did a talk this
morning about Artichoke, which is a Rust implementation of Ruby, and I was talking with
him and I was saying, “I think I found a bug in
Artichoke because you don’t return “a copy of the string, you
return the same string,” so this is actually the
bug, and he was saying, “Ah, that’s interesting,
why did you not fix it, “why did you not send a pull request?” So yesterday evening, I did a pull request and Ryan already merged it, so
it’s fixed now in Artichoke. And I was looking more into why it does return a copy of the string, and it turns out most methods on String return a copy if they don’t do anything, just to be consistent. For instance, I have the string “Hello” and I want to delete the character “a”, which doesn’t appear in this string, and it returns a copy. The same if you use chomp, which removes a trailing newline; there is no trailing newline in this string, so it returns a copy of the string. Or if you use gsub to replace an “a”, which doesn’t match anything in this case, it also returns a copy of the string.
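A quick illustration of that consistency:

    s = "Hello"

    s.delete("a")           # => "Hello" (nothing to delete)
    s.chomp                 # => "Hello" (no trailing newline)
    s.gsub("a", "b")        # => "Hello" (nothing replaced)

    s.delete("a").equal?(s) # => false (a new String object, not the receiver)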
So what I learned with this contribution: it was a really small contribution across three different projects, actually only three or four lines of code, but I learned so many things about Ruby and I dug into the Rails code and studied Rails, so it was really interesting. Which brings us to my last example, which is about JRuby. So JRuby is the Java implementation of Ruby, which runs on the JVM, supports real concurrency and is really fast.
What is written in the JRuby documentation on the GitHub page, one of the first sentences, is: “It aims to be a complete, correct and fast implementation of Ruby.” So I thought, “That sounds really interesting”; if one of the main goals is a fast implementation, then it’s probably fun to work on some performance issue. So I went to the Ruby Issue Tracker and the JRuby issue tracker again, looked around for what I could do and, eventually, I found this issue: hash tables with open addressing. That is a feature which got introduced in Ruby 2.4 and made hash tables around 40% faster. And I thought, “Ah, that
sounds really interesting. “How hard can it be?” Just two days ago, Aaron
tweeted, “Implement a hash? Bin there done that.” So it seems to be quite easy. So I was looking into the issue and I found the patch for MRI, which is almost 1,500 lines of code. And I’m not a C programmer;
I did some C in university but I’m not a C programmer at all. So I was like, “Okay, this is
probably not going to happen, “I can’t read like 1500 lines of code in C “and then transfer the
knowledge to JRuby.” So my initial idea was Rubinius, which I already showed you in the first part of the talk: “I will look into Rubinius; if they implemented it, I will just read the Ruby code there and then I will implement it in JRuby.” However, they didn’t implement it. So my next idea was: I had already compiled Ruby, so I knew how to compile it, and I thought, “I will just sprinkle in some puts, then execute different code and see what it does, to understand it better.” So the rb_p function is just a
print in C and I thought okay, maybe that would work, so I
compiled Ruby, tried it out and got a segmentation fault,
so that obviously didn’t work and so I was going back to
the issue and I was starting to read the patch and
trying to understand it. And at the top of the patch there is a really good description of the algorithm they implemented, just written down, explaining how it works. So I thought, “Maybe I can
try to implement this in Ruby, “try to understand what
they did, and then transfer “the knowledge to Java
and research a little bit “how open addressing works,”
so that was my next step. So, a hash, to recap: you create a hash, then you can set a key with a value and then you can retrieve the key again. The approach they had implemented before is called separate chaining. In this case, we create a class Hash which has a buckets variable, which is an array of nested arrays; what you see in the brackets means that each value of the array is by default an empty array. If you want to set a key, we first try to find the key; if the key exists, we replace the value. If the key does not exist, we search for the bucket and append an entry to it, and an entry is just a struct holding the key and the value. The find method finds the bucket and then just iterates over all entries in this array. So we have an array of nested arrays, basically. And you find the bucket by calculating an index, and the index is the hash value of the key modulo the length of the buckets array.
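Reconstructed in Ruby, the separate-chaining version described here might look something like this (a simplified sketch based on the talk’s description, not the actual code from the slides; the class name is mine):

    class ChainedHash
      Entry = Struct.new(:key, :value)

      def initialize(size = 11)
        # Each bucket is an array of entries whose keys hash to the same index.
        @buckets = Array.new(size) { [] }
      end

      def []=(key, value)
        entry = find(key)
        if entry
          entry.value = value
        else
          bucket_for(key) << Entry.new(key, value)
        end
      end

      def [](key)
        find(key)&.value
      end

      private

      def find(key)
        bucket_for(key).find { |entry| entry.key == key }
      end

      def bucket_for(key)
        @buckets[key.hash % @buckets.length]
      end
    end

    h = ChainedHash.new
    h[:language] = "Ruby"
    h[:language]  # => "Ruby"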
So I was wondering: why is this approach not good, or why would we want to implement a different approach? And the reason is called cache locality. Cache locality, and this
is copied from Wikipedia, is “If a particular storage
location is referenced, “then it is likely that
nearby memory locations “will be referenced in the near future.” So what does it mean? Imagine you have an array and
you iterate over it to print out what is in it. In most cases, if you access the first value of the array, you also access the second, the third, the fourth, the fifth element, so that’s a very common use case. And it turns out most modern computer architectures implement a feature where, if you access a value from memory, nearby memory is loaded into the CPU’s cache as well, so the next time you access it, it will already be in the cache. So in this case, when you access the first element, it will hopefully load the whole array into the CPU’s cache, and then accessing the
second and the third element will be really fast. So that is basically what we want to do with open addressing,
because with the separate-chaining approach, where you have nested arrays, it’s unlikely that you get a cache hit. So in open addressing, the first step is that we get rid of the nested arrays. Setting a key starts the same way: we check if the key already exists and, if it exists, we replace the value. If it doesn’t exist, we find a bucket and insert the entry into it, so we don’t append to a nested array anymore, we just insert the entry there. Find also just uses the bucket from the buckets array; the main difference now is how we find the bucket. We calculate the index again, and then we iterate over the one array and try to find the key. And if we don’t find the key there, we calculate the next index. In our simplified version for this talk, we just take the next slot in the array; the real implementations use a more sophisticated probe sequence, but here we just take the next element.
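In Ruby, that simplified open-addressing version might look roughly like this (again a sketch based on the talk’s description, with linear probing and without resizing or deletion; the class name is mine):

    class OpenAddressedHash
      Entry = Struct.new(:key, :value)

      def initialize(size = 11)
        @buckets = Array.new(size)  # one flat array, no nesting
      end

      def []=(key, value)
        index = find_index(key)
        if @buckets[index]
          @buckets[index].value = value
        else
          @buckets[index] = Entry.new(key, value)
        end
      end

      def [](key)
        @buckets[find_index(key)]&.value
      end

      private

      # Start at hash % length and walk forward until we hit the key
      # or an empty slot (assumes the table never fills up completely).
      def find_index(key)
        index = key.hash % @buckets.length
        until @buckets[index].nil? || @buckets[index].key == key
          index = (index + 1) % @buckets.length
        end
        index
      end
    end

    h = OpenAddressedHash.new
    h[:language] = "Ruby"
    h[:language]  # => "Ruby"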
So I did this in Ruby first, tried to understand it, and then I started to implement it in JRuby, which meant refactoring it internally. And there’s this really famous
quote from the Pragmatic Programmer, says, “Different
languages solve the same “problems in different ways.” In the beginning, I found
it really annoying in Java, the static typing, and
then when you started to do a big refactoring,
I really appreciated it because it guided me if you
needed to replace a value, it told you that you forgot
something, you need to go there. So doing the same refactoring in Ruby would be probably really
hairy, really difficult and you also need to wait
until your tests pass or fail until you find some issues. And in software, we
also have this approach, release early, release often. So I started working on
this issue and really quick, I realized I’m stuck. So what I did is I created
like a draft pull request and asked people from the
JRuby community to help me. And what we heard on
Monday in the keynote is, “Ideas become bigger when you share them.” So I think that was one of the reasons I finished this project,
because I asked early for help and asked for input. Otherwise, I probably
would have been stuck and probably wouldn’t finish it. So I finished the first
implementation of this new approach but, it turns out, it wasn’t faster. So we were discussing in my draft pull request,
why it’s not faster, what we could try, and one
approach a contributor came up with was that we should
remove the entry object. Because then we remove a
lot of object allocations. So what we did instead is
we removed the entry object; at the first position we entered the key and at the second position, the bucket index plus one, we entered the value, and
we removed a lot of object allocations, and that actually did improve the performance a lot.
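Sketched in Ruby, the entry-free layout described here looks something like this (a hypothetical illustration; the real JRuby change is in Java and also handles resizing, deletion and so on). Keys live at even indices, with the matching values right after them:

    class FlatHash
      def initialize(size = 11)
        # 2 slots per bucket: key at index 2*i, value at index 2*i + 1.
        @buckets = Array.new(size * 2)
      end

      def []=(key, value)
        index = find_index(key)
        @buckets[index] = key
        @buckets[index + 1] = value
      end

      def [](key)
        index = find_index(key)
        @buckets[index + 1] if @buckets[index]
      end

      private

      # Linear probing over the key slots only (assumes keys are not nil or
      # false and that the table never fills up completely).
      def find_index(key)
        index = (key.hash % (@buckets.length / 2)) * 2
        until @buckets[index].nil? || @buckets[index] == key
          index = (index + 2) % @buckets.length
        end
        index
      end
    end

And a couple of weeks ago,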
I was at a Ruby conference and Akira Matsuda gave
a talk about performance optimizations and he was saying this, “Performance optimization is a game “in which you want to
achieve a high score.” So stuff like this, removing objects, is something you can’t usually do
in your day-to-day job, because if I came
in tomorrow morning and start removing
objects, then my coworkers probably would say no,
but in projects like this, you can do it because you want
to make JRuby really fast, so you accept that you
increase the complexity for better performance. So I finished the pull request and eventually we merged it, which took around two months;
I was working in the evenings and then on the weekends, the
whole patch was around 700 lines of code. And after it got merged,
a couple of weeks later, we needed to revert it temporarily. The issue is, it turns out, that the new implementation
is not as robust under concurrency as the
previous implementation was. So yeah, we are currently
working on merging it back in soon, but we
need to make it more solid and robust in concurrency
cases, which for instance, is not an issue in MRI because you have the
global interpreter lock. So to recap this section, what helped me really
a lot was prototyping, which I still use a lot
in my day-to-day job, so if there’s a feature
which is too complex or which I’m not sure about, then I just do a really quick prototype, maybe even outside of the project. And then also: ask for help. So ask the community,
make draft pull requests, share your ideas. So yeah, those were my
three different examples. And I started my talk with a question, like what defines a senior developer? And I wanted to become
a better Ruby programmer and master Ruby. Probably nobody really masters
Ruby, except maybe Matz, but programming, you never stop learning, every day you learn something new and that’s what you need to
accept. And with open source, you can just browse around GitHub and learn how changes are implemented, how Ruby is implemented
and learn something new. And we also have a huge community which is usually happy to help. So yeah, thank you. (audience applauding) (uplifting music) (toy squeaking)
