Disassemble
Wednesday, October 26, 2011
A neat feature of Factor is the ability to disassemble words and quotations into the machine code generated by the compiler. Slava Pestov created the disassembler in 2008 and has improved it since then, including switching its implementation to udis86.
Constant Folding
The compiler performs constant folding. Using the compiler.tree.debugger vocabulary, you can output the optimized form of a quotation:
IN: scratchpad [ 2 2 + ] optimized.
[ 4 ]
Using the disassembler, you can see the machine code this generates:
IN: scratchpad [ 2 2 + ] disassemble
011c1a5530: 4983c608 add r14, 0x8
011c1a5534: 49c70640000000 mov qword [r14], 0x40
011c1a553b: c3 ret
011c1a553c: 0000 add [rax], al
011c1a553e: 0000 add [rax], al
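The 0x40 stored by the mov instruction is not a bug: it looks like the tagged representation of the folded result 4. As a small sketch, assuming this build uses 4 tag bits with a fixnum tag of zero (an assumption that is consistent with the output above, not something the disassembler states), shifting 4 left by 4 bits gives the same value:

```factor
USING: math prettyprint ;
IN: scratchpad

! Assumption: fixnums carry 4 low tag bits that are all zero.
4 4 shift .     ! prints 64, which is 0x40 in hexadecimal
64 -4 shift .   ! shifting back by the tag bits recovers 4
```

Under the same assumption, the add r14, 0x8 before it would be advancing the data stack pointer by one 8-byte cell to make room for the pushed value.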
Local Variables
One of the questions that comes up sometimes is whether local variables affect performance. We can examine two words that add numbers together, one using locals and one just using the stack:
IN: scratchpad : foo ( x y -- z ) + ;
IN: scratchpad :: bar ( x y -- z ) x y + ;
The “optimized output” looks a little different:
IN: scratchpad \ foo optimized.
[ + ]
IN: scratchpad \ bar optimized.
[ "COMPLEX SHUFFLE" "COMPLEX SHUFFLE" R> + ]
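However, the generated machine code is identical apart from the addresses the words were loaded at (and therefore the relative jump offsets). Both words compile to a tail jump to +: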
IN: scratchpad \ foo disassemble
01115de7b0: 488d1d05000000 lea rbx, [rip+0x5]
01115de7b7: e9e49439ff jmp 0x110977ca0 (+)
01115de7bc: 0000 add [rax], al
01115de7be: 0000 add [rax], al
IN: scratchpad \ bar disassemble
01115ef620: 488d1d05000000 lea rbx, [rip+0x5]
01115ef627: e9748638ff jmp 0x110977ca0 (+)
01115ef62c: 0000 add [rax], al
01115ef62e: 0000 add [rax], al
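The same comparison can be repeated on a word where the inputs are consumed out of order, so that the stack version needs an explicit shuffle. This is only a sketch: the word names sub-stack and sub-locals are made up for illustration, and the disassembly is not reproduced here because it depends on the Factor build and platform.

```factor
USING: compiler.tree.debugger kernel locals math prettyprint
tools.disassembler ;
IN: scratchpad

! Hypothetical words: both compute y - x.
: sub-stack ( x y -- z ) swap - ;
:: sub-locals ( x y -- z ) y x - ;

! Both versions should leave the same result on the stack.
5 3 sub-stack 5 3 sub-locals = .   ! prints t

! Compare what the optimizer and the code generator produce.
\ sub-stack optimized.
\ sub-locals optimized.
\ sub-stack disassemble
\ sub-locals disassemble
```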
Dynamic Variables
Another frequently used feature is dynamic variables, implemented by the namespaces vocabulary. For example, the definition of the print word looks for the current value of the output-stream variable and then calls stream-print on it:
IN: scratchpad \ print see
USING: namespaces ;
IN: io
: print ( str -- ) output-stream get stream-print ; inline
The optimized output inlines the implementation of get:
IN: scratchpad [ "Hello, world" print ] optimized.
[
"Hello, world" \ output-stream 0 context-object assoc-stack
stream-print
]
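The assoc-stack call in the optimized output reflects how a dynamic variable lookup works: get searches a stack of namespaces, and the innermost binding wins. A minimal sketch of that behavior, using a made-up symbol named demo-var and the with-variable combinator from the namespaces vocabulary:

```factor
USING: namespaces prettyprint ;
IN: scratchpad

! Hypothetical variable, used only for this illustration.
SYMBOL: demo-var

1 demo-var [
    demo-var get .                           ! prints 1
    2 demo-var [ demo-var get . ] with-variable  ! prints 2, inner binding shadows outer
    demo-var get .                           ! prints 1 again, inner scope is gone
] with-variable
```

Each with-variable pushes a new namespace for the duration of its quotation, which is why the lookup has to walk a stack rather than read a single slot.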
You can inspect the generated machine code, which is annotated with the names of the Factor words being called:
IN: scratchpad [ "Hello, world" print ] disassemble
011c0c6c40: 4c8d1df9ffffff lea r11, [rip-0x7]
011c0c6c47: 6820000000 push dword 0x20
011c0c6c4c: 4153 push r11
011c0c6c4e: 4883ec08 sub rsp, 0x8
011c0c6c52: 4983c618 add r14, 0x18
011c0c6c56: 48b8dbc5a31a01000000 mov rax, 0x11aa3c5db
011c0c6c60: 498946f0 mov [r14-0x10], rax
011c0c6c64: 498b4500 mov rax, [r13+0x0]
011c0c6c68: 488b4040 mov rax, [rax+0x40]
011c0c6c6c: 498906 mov [r14], rax
011c0c6c6f: 48b86c91810e01000000 mov rax, 0x10e81916c
011c0c6c79: 498946f8 mov [r14-0x8], rax
011c0c6c7d: e8de4e36ff call 0x11b42bb60 (assoc-stack)
011c0c6c82: 4883c418 add rsp, 0x18
011c0c6c86: 488d1d05000000 lea rbx, [rip+0x5]
011c0c6c8d: e94e5264ff jmp 0x11b70bee0 (stream-print)
011c0c6c92: 0000 add [rax], al
011c0c6c94: 0000 add [rax], al
011c0c6c96: 0000 add [rax], al
011c0c6c98: 0000 add [rax], al
011c0c6c9a: 0000 add [rax], al
011c0c6c9c: 0000 add [rax], al
011c0c6c9e: 0000 add [rax], al