Darryl Gove: Ruby performance gains on SPARC
Written by Darryl Gove   
Friday, 07 March 2008

Ruby runs on a VM, which is responsible for context switches as well as garbage collection. Consequently, the VM code contains calls to flush register windows. A colleague of mine, Miriam Blatt, has been examining the code, and we think we've found some places where the calls to flush register windows are unnecessary. The code appears in versions 1.8 and 1.9 of Ruby, but I'll focus on 1.8.* in this discussion.

As outlined in my blog entry on register windows, flushing them is both expensive and rarely needed. The key points at which the register windows must be flushed to memory are context switches and the start of garbage collection.

Ruby defines a macro called FLUSH_REGISTER_WINDOWS in defines.h. The macro only does something on IA64 and SPARC, so the changes I'll discuss here are defined so that they leave the behaviour on IA64 unchanged. My suspicion is that the changes are equally valid for IA64, but I lack an IA64 system to check them on.

The FLUSH_REGISTER_WINDOWS macro gets used in eval.c in the EXEC_TAG macro, THREAD_SAVE_CONTEXT macro, rb_thread_save_context routine, and rb_thread_restore_context routine. (There's also a call in gc.c for the garbage collection.)

The first thing to notice is that the THREAD_SAVE_CONTEXT macro calls rb_thread_save_context, which flushes the register windows itself, so the separate FLUSH_REGISTER_WINDOWS call in the THREAD_SAVE_CONTEXT macro is redundant. However, we've not seen this particular flush cause any performance issues in our tests (although it's possible that the tests didn't stress multithreading).
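To make the redundancy concrete, here is a minimal sketch in C. It is not Ruby's code: every name in it is a hypothetical stand-in, the flush is replaced by a counter, and setjmp is replaced by a stub that returns 0. It also assumes that ruby_setjmp evaluates its first argument immediately before saving the jump buffer.

```c
/* All names below are hypothetical stand-ins, not Ruby's own definitions. */
int demo_flush_count = 0;

/* Counts flushes instead of executing a real register-window flush. */
static void demo_flush(void) { demo_flush_count++; }

struct demo_thread { int saved; };

/* Stand-in for rb_thread_save_context: it flushes on its own account. */
static void demo_save_context(struct demo_thread *th)
{
    demo_flush();
    th->saved = 1;
}

/* Stub in place of setjmp; always reports the "direct call" return value. */
static int demo_setjmp_stub(struct demo_thread *th) { (void)th; return 0; }

/* Assumed shape of ruby_setjmp: evaluate `before`, then save the context. */
#define demo_setjmp(before, th) ((before), demo_setjmp_stub(th))

/* Original form: explicit flush, then a save that flushes again. */
#define DEMO_SAVE_ORIG(th) \
    (demo_flush(), demo_setjmp(demo_save_context(th), (th)))

/* Modified form: the explicit flush becomes a no-op. */
#define DEMO_SAVE_MOD(th) \
    ((void)0, demo_setjmp(demo_save_context(th), (th)))

/* Returns how many flushes one save performs in the chosen form. */
int demo_flushes_per_save(int use_original)
{
    struct demo_thread th = { 0 };

    demo_flush_count = 0;
    if (use_original)
        (void)DEMO_SAVE_ORIG(&th);
    else
        (void)DEMO_SAVE_MOD(&th);
    return demo_flush_count;
}
```

With these stand-ins, demo_flushes_per_save(1) returns 2 and demo_flushes_per_save(0) returns 1, so the context is still flushed once per save after the explicit call is dropped.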

The more important call is the one in EXEC_TAG. This is executed very frequently in Ruby code, but the flush does not appear to be necessary at all: it is neither a context switch nor the start of garbage collection. Removing this call to flush register windows leads to significant performance gains: upwards of 10% when measured on an older v880 box, with some of the benchmarks nearly doubling in performance.
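One rough way to picture the cost: EXEC_TAG sits at the entry to a setjmp-protected region, so a flush placed there is paid once per region, whether or not anything ever jumps back to it. The sketch below illustrates this in plain C and is not Ruby's implementation; tag_flushes, tag_flush, tag_body and tag_run are hypothetical names, and the flush is simulated with a counter.

```c
#include <setjmp.h>

/* Hypothetical counter standing in for real register-window flushes. */
long tag_flushes = 0;
static void tag_flush(void) { tag_flushes++; }

/* Stand-in for prot_tag->buf. */
static jmp_buf tag_buf;

/* Body of a protected region; jumps back to the tag when `fail` is set. */
static void tag_body(int fail)
{
    if (fail)
        longjmp(tag_buf, 1);
}

/* Enters n protected regions, optionally flushing on entry to each one.
   Returns how many regions were left through longjmp. */
long tag_run(long n, int flush_on_entry)
{
    volatile long i;
    volatile long jumps = 0;

    tag_flushes = 0;
    for (i = 0; i < n; i++) {
        if (flush_on_entry)
            tag_flush();            /* what the original EXEC_TAG pays */
        if (setjmp(tag_buf) == 0)
            tag_body(i % 2 == 1);   /* every second region jumps out */
        else
            jumps++;
    }
    return jumps;
}
```

Calling tag_run(10, 1) leaves tag_flushes at 10, while tag_run(10, 0) leaves it at 0; both return 5, since the non-local exits behave the same with or without the flush in this simulation. Whether real SPARC code is equally indifferent is exactly the question the tests are meant to answer.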

The source code modifications for 1.8.6 are as follows:

$ diff defines.h.orig defines.h.mod
228a229,230
> # define EXEC_FLUSH_REGISTER_WINDOWS ((void)0)
> # define SWITCH_FLUSH_REGISTER_WINDOWS ((void)0)
232a235,236
> # define EXEC_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
> # define SWITCH_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
234a239,240
> # define EXEC_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
> # define SWITCH_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS

The change to defines.h adds new variants of the FLUSH_REGISTER_WINDOWS macro to be used in the EXEC_TAG and THREAD_SAVE_CONTEXT macros. On SPARC the new variants are defined as ((void)0), so the two flushes disappear. To preserve the current behaviour elsewhere, they are defined as FLUSH_REGISTER_WINDOWS on IA64 and on all other architectures (where FLUSH_REGISTER_WINDOWS is itself ((void)0)).
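The selection logic can be tried out on any machine with a small stand-alone sketch. This is an illustration under stated assumptions, not the patch itself: the DEMO_SPARC and DEMO_IA64 switches, the DEMO_* macros and the arch_* names are all hypothetical, and the flush is simulated with a counter so that no SPARC or IA64 hardware is needed.

```c
/* Hypothetical switch: pretend we are building for SPARC.
   Replace with DEMO_IA64, or remove it, to exercise the other branches. */
#define DEMO_SPARC 1

/* Hypothetical counter standing in for a real register-window flush. */
int arch_flush_calls = 0;
void arch_flush(void) { arch_flush_calls++; }

#if defined(DEMO_SPARC)
# define DEMO_FLUSH        arch_flush()
# define DEMO_EXEC_FLUSH   ((void)0)      /* flush removed from EXEC_TAG */
# define DEMO_SWITCH_FLUSH ((void)0)      /* flush removed from the switch */
#elif defined(DEMO_IA64)
# define DEMO_FLUSH        arch_flush()
# define DEMO_EXEC_FLUSH   DEMO_FLUSH     /* behaviour unchanged */
# define DEMO_SWITCH_FLUSH DEMO_FLUSH
#else
# define DEMO_FLUSH        ((void)0)      /* nothing to flush */
# define DEMO_EXEC_FLUSH   DEMO_FLUSH
# define DEMO_SWITCH_FLUSH DEMO_FLUSH
#endif

/* Simulates entering a protected region and then starting a collection.
   Returns the number of flushes performed along the way. */
int arch_flushes_for_exec_then_gc(void)
{
    arch_flush_calls = 0;
    DEMO_EXEC_FLUSH;    /* entry to a protected region (EXEC_TAG) */
    DEMO_FLUSH;         /* the flush that gc.c still performs */
    return arch_flush_calls;
}
```

With DEMO_SPARC defined as above, arch_flushes_for_exec_then_gc() returns 1: only the garbage-collection flush remains. Building the same file with DEMO_IA64 instead gives 2, matching the unchanged IA64 behaviour.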

$ diff eval.c.orig eval.c.mod
1025c1025
< #define EXEC_TAG() (FLUSH_REGISTER_WINDOWS, ruby_setjmp(((void)0), prot_tag->buf))
---
> #define EXEC_TAG() (EXEC_FLUSH_REGISTER_WINDOWS, ruby_setjmp(((void)0), prot_tag->buf))
10290c10290
< (rb_thread_switch((FLUSH_REGISTER_WINDOWS, ruby_setjmp(rb_thread_save_context(th), (th)->context))))
---
> (rb_thread_switch((SWITCH_FLUSH_REGISTER_WINDOWS, ruby_setjmp(rb_thread_save_context(th), (th)->context))))

The changes to eval.c just use the new macros instead of the old FLUSH_REGISTER_WINDOWS call.

These code changes have worked on all the tests we've used (including `gmake test-all`). However, I can't be certain that no workload requires these flushes. This appears to be the putback that added the flush call to EXEC_TAG, and its comment suggests that the change may not be necessary. I'd love to hear comments either agreeing with the analysis or pointing out why the flushes are necessary.

Update: added diff -u output.
$ diff -u defines.h.orig defines.h.mod
--- defines.h.orig Tue Mar 4 16:32:05 2008
+++ defines.h.mod Wed Mar 5 14:22:06 2008
@@ -226,12 +226,18 @@
 ;
 }
 # define FLUSH_REGISTER_WINDOWS flush_register_windows()
+# define EXEC_FLUSH_REGISTER_WINDOWS ((void)0)
+# define SWITCH_FLUSH_REGISTER_WINDOWS ((void)0)
 #elif defined(__ia64)
 void *rb_ia64_bsp(void);
 void rb_ia64_flushrs(void);
 # define FLUSH_REGISTER_WINDOWS rb_ia64_flushrs()
+# define EXEC_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
+# define SWITCH_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
 #else
 # define FLUSH_REGISTER_WINDOWS ((void)0)
+# define EXEC_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
+# define SWITCH_FLUSH_REGISTER_WINDOWS FLUSH_REGISTER_WINDOWS
 #endif

 #if defined(DOSISH)

$ diff -u eval.c.orig eval.c.mod
--- eval.c.orig Tue Mar 4 16:32:00 2008
+++ eval.c.mod Wed Mar 5 14:22:13 2008
@@ -1022,7 +1022,7 @@
 #define PROT_LAMBDA INT2FIX(2) /* 5 */
 #define PROT_YIELD INT2FIX(3) /* 7 */

-#define EXEC_TAG() (FLUSH_REGISTER_WINDOWS, ruby_setjmp(((void)0), prot_tag->buf))
+#define EXEC_TAG() (EXEC_FLUSH_REGISTER_WINDOWS, ruby_setjmp(((void)0), prot_tag->buf))

 #define JUMP_TAG(st) do { \
 ruby_frame = prot_tag->frame; \
@@ -10287,7 +10287,7 @@
 }

 #define THREAD_SAVE_CONTEXT(th) \
- (rb_thread_switch((FLUSH_REGISTER_WINDOWS, ruby_setjmp(rb_thread_save_context(th), (th)->context))))
+ (rb_thread_switch((SWITCH_FLUSH_REGISTER_WINDOWS, ruby_setjmp(rb_thread_save_context(th), (th)->context))))

 NORETURN(static void rb_thread_restore_context _((rb_thread_t,int)));
 NORETURN(NOINLINE(static void rb_thread_restore_context_0(rb_thread_t,int,void*)));

Read the original article: http://blogs.sun.com/d/entry/ruby_performance_gains_on_sparc.
