
\documentclass{acm_proc_article-sp}

\begin{document}

\title{YARV: Yet Another RubyVM}
\subtitle{Innovating the Ruby Interpreter}

\author{
\alignauthor Koichi Sasada \\
  \affaddr{Graduate School of Technology,} \\
  \affaddr{Tokyo University of Agriculture and Technology} \\
  \affaddr{2-24-16 Nakacho, Koganei-shi, Tokyo, Japan.} \\
  \email{sasada@namikilab.tuat.ac.jp}
}

\maketitle

\begin{abstract}

Ruby - an Object-Oriented scripting language - is used world-wide 
because of its ease of use. However, the current interpreter is slow. To 
solve this problem, some virtual machines were developed, but none with 
adequate performance or functionality. To fill this gap, I have 
developed a Ruby interpreter called \emph{YARV (Yet Another Ruby VM)}. 
YARV is based on a stack machine architecture and features optimizations 
for high speed execution of Ruby programs. In this poster, I introduce 
the Ruby programming language, discuss certain characteristics of Ruby 
from the aspect of a Ruby interpreter implementer, and explain methods 
of implementation and optimization. Benchmark results are given at the 
end.

% short abstract
% Ruby - an Object-Oriented scripting language - is used world-wide because of its ease of use. However, Ruby is slow. In this background, I have developed YARV (Yet Another Ruby VM). YARV is based on a stack machine architecture and features optimizations for high speed execution of Ruby programs.




\end{abstract}

\keywords{Interpreter Implementation, Scripting Language, Ruby} 
% NOT required for Proceedings

\section{Introduction}

Ruby is an interpreted scripting language developed by Yukihiro 
Matsumoto for quick and easy object-oriented programming\cite{ruby-lang, 
pickaxe}. It is simple, straightforward, extensible, and portable. It 
has many features for processing text files and performing system 
management tasks (as in Perl\cite{perl}), among others.

Ruby has the following characteristics:

\begin{itemize}
\small
\item Simple syntax
\item Normal OO features (class, method call, etc.)
\item Advanced OO features (all values are objects, Mix-in, Singleton method, etc.)
\item Dynamic-typing, re-definable behavior, dynamic evaluation
\item Operator overloading
\item Exception handling
\item Closure and method invocation with a block
\item Garbage collection support
\item Dynamic module loading
\item Many useful libraries
\item Highly portable
\end{itemize}

However, the current Ruby interpreter is slow. This is because the current 
interpreter (old-ruby) works by traversing an abstract syntax tree and 
evaluating each node. To solve this problem, I have developed a new Ruby 
interpreter called YARV (Yet Another RubyVM), which is a stack machine 
that runs Ruby programs compiled into an intermediate representation of 
sequential instructions. I am working on replacing old-ruby with YARV.

This poster is dedicated to discussing the Ruby programming language and 
the advantages and challenges that Ruby presents as an interpreter 
target, with speed-up being the principal goal. YARV's implementation 
and optimization features are then presented and the results are 
evaluated.



\section{YARV Implementation}


\subsection{Overview}

YARV is a simple stack machine written in C. The VM has a stack, a 
program counter (PC), a stack pointer (SP), and some frame pointers (FP). 
YARV compiles a Ruby script into sequences of YARV instructions 
(intermediate code). The instruction set is designed specifically for Ruby.

YARV reuses many parts of old-ruby, namely the Ruby script parser, the 
object management mechanism, the garbage collector and more. In fact, 
YARV is implemented as an extension module for old-ruby.

YARV currently works on Linux with GCC and on Windows 2000/XP with Visual 
C++ or Cygwin.

\subsection{Interpreter Auto Generation}

To create the virtual machine, I generate the code for the VM from a 
description written in a VM description language (VMDL), in the manner of 
vmgen\cite{vmgen}. Figure \ref{vmdl} shows the definition of a VM 
instruction (named \verb|instruction1|).

In the VMDL, one declares operands, stack operands, and return values 
for each instruction. The programmer does not need to write the code that 
controls the PC, the SP, and the stack, or that fetches operands.

Furthermore, the VM generation system can produce optimized code 
automatically from a VMDL description. This topic is treated in the next 
subsection.

\begin{figure}
\begin{quote}
\scriptsize
\begin{verbatim}
DEFINE_INSTRUCTION
instruction1 // instruction name
(VALUE op1)  // operand values
(VALUE sp1)  // popped values from stack
(VALUE r1)   // values to be pushed onto the stack
{
  // instruction logic of instruction1
  // using op1, sp1 
  // and finally assigning a value to r1
}
\end{verbatim}
\end{quote}
\caption{VM description language}
\label{vmdl}
\end{figure}
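
As a rough illustration of what the generator takes off the programmer's 
hands, the following C sketch shows the kind of code that could be emitted 
for the definition in Figure \ref{vmdl}. It is not YARV's actual output: 
the \verb|vm_t| structure, the function name \verb|insn_instruction1|, the 
use of \verb|long| for \verb|VALUE|, and the body \verb|r1 = op1 + sp1| 
are all assumptions made for this example.

```c
/* Simplified stand-in for Ruby's VALUE type (assumption). */
typedef long VALUE;

/* Hypothetical VM state: only the registers this example needs. */
typedef struct {
    const VALUE *pc;   /* program counter: next word of the instruction stream */
    VALUE *sp;         /* stack pointer: next free stack slot */
} vm_t;

/*
 * What a generator might emit for:
 *
 *   DEFINE_INSTRUCTION
 *   instruction1
 *   (VALUE op1)
 *   (VALUE sp1)
 *   (VALUE r1)
 *   { r1 = op1 + sp1; }
 */
void insn_instruction1(vm_t *vm)
{
    /* Generated prologue: fetch the operand and pop the stack operand. */
    VALUE op1 = *vm->pc++;
    VALUE sp1 = *--vm->sp;
    VALUE r1;

    /* Body written by the instruction author, copied verbatim. */
    { r1 = op1 + sp1; }

    /* Generated epilogue: push the return value. */
    *vm->sp++ = r1;
}
```

Only the braced body in the middle corresponds to what the instruction 
author writes. The operand fetch, the pop, and the push around it are 
what the three parameter lists of the description would let a generator 
produce.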

\subsection{Optimization}

YARV was implemented with many optimization techniques in order to 
create a high-speed interpreter.

Instruction dispatch uses direct threaded 
code\cite{direct-threaded-code}, implemented with GCC's labels-as-values 
extension, instead of a \verb|switch/case| statement in C.
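
This style of dispatch can be sketched in a few lines of C. The example 
below is a toy three-instruction machine, not YARV's instruction set: the 
opcode names, the function \verb|run_threaded|, and the 64-entry limits 
are assumptions made for illustration. It relies on the GCC extension 
(\verb|&&label| and \verb|goto *ptr|), so it is not standard C.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy opcodes; YARV's real instruction set is different. */
enum { OP_PUSH, OP_ADD, OP_HALT };

#define MAX_CODE  64
#define MAX_STACK 64

/* Runs a well-formed program of `len` words (balanced stack use, ends in
 * OP_HALT) and returns the value on top of the stack at OP_HALT.
 * Returns -1 if the program is too long or contains an unknown opcode. */
long run_threaded(const long *code, size_t len)
{
    /* Handler addresses, indexed by opcode (GCC labels-as-values). */
    static void *const handlers[] = { &&do_push, &&do_add, &&do_halt };
    void *threaded[MAX_CODE];
    long stack[MAX_STACK];
    long *sp = stack;            /* next free stack slot */
    void **pc;
    size_t i;

    if (len > MAX_CODE)
        return -1;

    /* Translation pass: replace every opcode by its handler address,
     * so that dispatch needs neither a table lookup nor a switch. */
    for (i = 0; i < len; i++) {
        long op = code[i];
        if (op < OP_PUSH || op > OP_HALT)
            return -1;
        threaded[i] = handlers[op];
        if (op == OP_PUSH && i + 1 < len) {   /* copy the inline operand */
            i++;
            threaded[i] = (void *)(intptr_t)code[i];
        }
    }
    pc = threaded;

/* Each handler ends by jumping directly to the next handler. */
#define NEXT() goto *(*pc++)

    NEXT();

do_push:                         /* push the operand that follows */
    *sp++ = (long)(intptr_t)*pc++;
    NEXT();

do_add:                          /* pop b, pop a, push a + b */
    sp--;
    sp[-1] += sp[0];
    NEXT();

do_halt:
    return sp[-1];

#undef NEXT
}
```

With a \verb|switch| statement, every instruction returns to one shared 
indirect branch at the top of the loop. Here each handler carries its own 
indirect jump, which is the property that threaded code is used for.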

Since all values in Ruby are objects, Ruby has no primitive types. For 
example, the Ruby program \verb|1+2| actually means \verb|1.+(2)| (the 
method \verb|+| is invoked on the receiver \verb|1| with the argument 
\verb|2|). To maximize efficiency, some methods are compiled to 
specialized instructions. In this case, the method \verb|+| is compiled 
to the specialized instruction \verb|opt_plus|. \verb|opt_plus| 
checks the receiver (self) and the argument. If they are both Fixnums, 
it checks whether the method \verb|+| of class Fixnum has been redefined. 
If it has not been redefined, it adds the two values and pushes the 
result onto the stack. Otherwise, the normal method dispatch sequence is 
performed.
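
The checks performed by \verb|opt_plus| can be sketched in C as follows. 
This is an illustration under stated assumptions, not YARV's source. The 
tagging scheme (a Fixnum $n$ stored as $2n+1$) is a simplification, the 
flag \verb|fixnum_plus_redefined| and the stub \verb|dispatch_plus| are 
hypothetical names, and the range check on the sum is an addition that 
the description above does not mention. For brevity the function takes 
and returns values directly instead of popping and pushing them on the VM 
stack.

```c
#include <stdint.h>

typedef intptr_t VALUE;          /* simplified tagged word (assumption) */

/* Assumed tagging: a Fixnum n is stored as 2n + 1, so its low bit is 1. */
#define FIXNUM_P(v)  (((v) & 1) != 0)
#define INT2FIX(i)   ((VALUE)((intptr_t)(i) * 2 + 1))
#define FIX2INT(v)   (((intptr_t)(v) - 1) / 2)
#define FIXNUM_MAX   (INTPTR_MAX / 2)
#define FIXNUM_MIN   (INTPTR_MIN / 2)

/* Hypothetical flag, set when Fixnum#+ is redefined. */
int fixnum_plus_redefined = 0;

/* Counts slow-path entries so that the example can be observed. */
int slow_path_calls = 0;

/* Stand-in for the normal method dispatch sequence. A real VM would look
 * up and call the method; this stub only records that it was reached. */
VALUE dispatch_plus(VALUE recv, VALUE arg)
{
    (void)recv;
    (void)arg;
    slow_path_calls++;
    return 0;                    /* placeholder result, not a Fixnum */
}

VALUE opt_plus(VALUE recv, VALUE arg)
{
    /* Fast path: both operands are Fixnums and Fixnum#+ is untouched. */
    if (FIXNUM_P(recv) && FIXNUM_P(arg) && !fixnum_plus_redefined) {
        intptr_t sum = FIX2INT(recv) + FIX2INT(arg);
        /* Assumption: a sum outside the Fixnum range takes the slow path. */
        if (sum >= FIXNUM_MIN && sum <= FIXNUM_MAX)
            return INT2FIX(sum);
    }
    /* Otherwise fall back to ordinary method dispatch. */
    return dispatch_plus(recv, arg);
}
```

In the common case the cost is two tag tests, one flag test, and one 
addition, with no method lookup and no new frame. Setting the flag to a 
non-zero value forces every call through \verb|dispatch_plus|, which 
keeps redefinition of \verb|+| observable.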

Operand unification and instruction unification (the latter also known as 
superinstructions) are also used. If the programmer specifies that an 
instruction with specific operands, or a specific instruction sequence, 
should be unified, the VM generation system generates the unified 
instructions and the compiler logic that emits them.
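
Both kinds of unification can be sketched on a toy instruction set in C. 
Everything named here is hypothetical: the opcodes, the choice of 
\verb|PUSH 1| as the operand-unified case and of \verb|PUSH n; ADD| as 
the superinstruction, and the functions \verb|unify| and \verb|run|. In 
YARV the corresponding code is generated rather than written by hand, and 
this toy has no jumps, so it ignores the question of branch targets 
falling inside a fused sequence.

```c
#include <stddef.h>

enum {
    OP_PUSH,      /* PUSH n     : push operand n                    */
    OP_ADD,       /* ADD        : pop b, pop a, push a + b          */
    OP_HALT,      /* HALT       : stop; result is the top of stack  */
    OP_PUSH_1,    /* PUSH_1     : operand unification of "PUSH 1"   */
    OP_PUSH_ADD   /* PUSH_ADD n : superinstruction for "PUSH n; ADD" */
};

/* Rewrites `in` (len words) into `out` and returns the new length.
 * `out` must have room for at least len words. */
size_t unify(const long *in, size_t len, long *out)
{
    size_t i = 0, n = 0;
    while (i < len) {
        if (in[i] == OP_PUSH && i + 2 < len && in[i + 2] == OP_ADD) {
            out[n++] = OP_PUSH_ADD;          /* fuse PUSH n; ADD */
            out[n++] = in[i + 1];
            i += 3;
        } else if (in[i] == OP_PUSH && i + 1 < len && in[i + 1] == 1) {
            out[n++] = OP_PUSH_1;            /* fold the operand 1 */
            i += 2;
        } else if (in[i] == OP_PUSH && i + 1 < len) {
            out[n++] = in[i];                /* copy PUSH and its operand */
            out[n++] = in[i + 1];
            i += 2;
        } else {
            out[n++] = in[i++];              /* copy a one-word instruction */
        }
    }
    return n;
}

/* Minimal interpreter for well-formed programs of at most 64 words. */
long run(const long *code)
{
    long stack[64];
    long *sp = stack;                        /* next free stack slot */
    const long *pc = code;
    for (;;) {
        switch (*pc++) {
        case OP_PUSH:     *sp++ = *pc++;          break;
        case OP_PUSH_1:   *sp++ = 1;              break;
        case OP_ADD:      sp--; sp[-1] += sp[0];  break;
        case OP_PUSH_ADD: sp[-1] += *pc++;        break;
        case OP_HALT:     return sp[-1];
        default:          return -1;         /* unknown opcode */
        }
    }
}
```

For the program \verb|PUSH 1; PUSH 2; ADD; HALT|, the pass above produces 
\verb|PUSH_1; PUSH_ADD 2; HALT|. The result is the same, but the VM 
dispatches three instructions instead of four and fetches one operand 
instead of two.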

YARV supports 2-level (2 registers, 5 states) static stack 
caching\cite{stack-caching}. The VM generation system generates the stack 
caching instructions and the translator used during compilation.

Ahead-of-Time (AOT) compilation of Ruby programs is also supported. The 
AOT compiler translates a Ruby program to a C program which runs on YARV. 
The C compiler then generates native machine code that is more efficient 
than YARV instruction code.


\section{Evaluation}

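Table \ref{eval} shows the running times of benchmarks on old-ruby and 
YARV. The last column gives the ratio of the old-ruby time to the YARV 
time. These results were measured on a Pentium-M 1.2 GHz machine with 
1024 MB of memory, running Windows XP with Cygwin and GCC 3.4.4.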

\begin{table}
\caption{Benchmark results}
\label{eval}
\begin{center}
\begin{tabular}{c|rrr}
\hline\hline
Benchmark & Ruby (sec) & YARV (sec) & Ruby/YARV\\
\hline
ackermann & 29.86 & 2.61   & 11.4 \\
Fibonacci & 12.72 & 1.83   &  7.0 \\
tak       & 17.36 & 2.41   &  7.2 \\
matrix    &  3.92 & 1.40   &  2.8 \\
sieve     & 10.45 & 1.81   &  5.7 \\
count\_words & 0.69 & 0.63 &  1.1 \\
whileloop & 16.22 & 0.99   & 16.4 \\
\hline
\end{tabular}
\end{center}
\end{table}

\section{Conclusion}

In this extended abstract, I described the characteristics of the Ruby 
language and a new implementation of the Ruby interpreter called YARV. 
The interpreter optimization techniques applied in YARV make it faster 
than old-ruby on every benchmark measured, by factors ranging from 1.1 
to 16.4.

In the current Ruby interpreter, multi-threading is implemented at user 
level. This means that we cannot write truly parallel applications in 
Ruby. As a solution to this problem, YARV will support native threads 
(threads managed by the operating system). This will enable Ruby 
programs to be more scalable.

I am also planning to design a mechanism for multiple VM instances, like 
the Java Multi-Tasking VM\cite{java-mvm}. This will improve performance 
and assist application programs that embed a Ruby interpreter.

Ultimately, I will replace the current Ruby interpreter with YARV, and 
YARV will become The Ruby Virtual Machine.


\section*{Acknowledgement}

This project is supported by the Exploratory Software Project 2004 (youth) 
and 2005 of IPA (Information-technology Promotion Agency, Japan).

I would like to thank Michael Neumann, Daniel Amelang, the 
yarv-devel mailing list members, and all Rubyists.

\bibliographystyle{abbrv}
\bibliography{yarve}

\end{document}

