Stack machine, part 2 - Hear me calling

In the previous part of this series, we looked at implementing a very basic stack-based virtual machine in Elixir. At the end of it we began to ponder the concept of functions.

But a return instruction does more than just returning a value. It also continues program execution at the place of the function call. This leads us into the realm of codifying functions and keeping track of what piece of code we're actually executing. Let's look at that in the next part of this series.

So here we are.

Function, procedure, subroutine, whatever we would like to call it, the concept of executing a different instruction than the next one can be handled in many ways. One way is to use an instruction pointer to keep track of which instruction to execute next. Calling a subroutine would be as easy as altering the instruction pointer. Using an instruction pointer would be straightforward if we had that code as an array. Thus far, we haven't had any explicit instruction pointer, but rather been working through a list representing the code being executed.

Before replacing our list representation with an array, let's see just how far we can get without having an explict instruction pointer.

Starting point

We'll back our implementation a few steps and start with the following implementation of our stack-based virtual machine.

defmodule StackMachine do  
  def execute(code), do: _execute(code, [])

  defp _execute([], stack), do: stack
  defp _execute([:push | [value | code]], stack) do
    _execute(code, [value | stack])
  end
  defp _execute([:add | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a + b | rest_of_stack])
  end
  defp _execute([:multiply | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a * b | rest_of_stack])
  end
end  

Until now, this is the fake bytecode we've been executing with our virtual machine.

[:push, 4, :push, 5, :add, :push, 2, :multiply]

It's quite specific, but we can picture a generic subroutine, sum_and_double, that performs the addition and the doubling.

[:add, :push, 2, :multiply]

We need to be able to refer to this code by some name, a label of sorts. So we expand the bytecode with a new instruction, label. As with push, the instruction will be immediately followed by the label itself . Both should be ignored when executing the code, effectively treating them as a no-op.

[:label, :sum_and_double, :add, :push, 2, :multiply]
  defp _execute([:label | [_label | code]], stack) do
    _execute(code, stack)
  end

Of course, if we have some other code following our subroutine, there's nothing that stops us from also executing it. So we reintroduce our return instruction, with a minor change. We let it return the stack unaltered, rather than the top-most value. This means that the semantics of the instruction have changed from return a result to return to the call site of this subroutine.

[:label, :sum_and_double, :add, :push, 2, :multiply, :return]
  defp _execute([:return | _code], stack), do: stack

Heed the call

So how do we call this subroutine? Remember that we still need to make sure that there are two integers on the stack before doing so.

[:push, 4, :push, 5, ???]

Pushing the label of the desired subroutine onto the stack and then adding a call instruction is a good start.

[:push, 4, :push, 5, :push, :sum_and_double, :call]

Of course, we still need to add the subroutine we previously devised.

[
  :push, 4, :push, 5, :push, :sum_and_double, :call,
  :label, :sum_and_double, :add, :push, 2, :multiply, :return
]

But wait, you say. Won't we continue straight into the subroutine after returning from it? We most certainly will. To avoid that, we add a halt instruction. For good measure we also throw in an inspect instruction that will print the top-most value on the stack without removing it.

[
  :push, 4, :push, 5, :push, :sum_and_double, :call,
  :inspect, :halt,
  :label, :sum_and_double, :add, :push, 2, :multiply, :return
]

Our toy bytecode is certainly growing. Time to implement these new instructions.

  defp _execute([:halt | _code], _stack), do: :halted
  defp _execute([:inspect | code], stack) do
    IO.puts hd(stack)
    _execute(code, stack)
  end

Now for the call instruction. Assuming that a subroutine is always declared after the code calling it, we just need to skip ahead to the subroutine. Once the subroutine returns the resulting stack, we'll use it when executing the instruction following the call instruction.

  defp _execute([:call | code], [label | stack]) do
    new_stack = _execute(subroutine(code, label), stack)
    _execute(code, new_stack)
  end

  defp subroutine([:label | [label | code]], label), do: code
  defp subroutine([_ | code], label), do: subroutine(code, label)

The subroutine function pretty much mimics an Enum.drop_while, skipping bytecode elements until a :label instruction followed by the desired subroutine name is found. Notice how this wouldn't work if our bytecode stored subroutines before the call site.

All in all, we can now put this together and give it a go in iex.

defmodule StackMachine do  
  def execute(code), do: _execute(code, [])

  defp _execute([], stack), do: stack
  defp _execute([:push | [value | code]], stack) do
    _execute(code, [value | stack])
  end
  defp _execute([:add | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a + b | rest_of_stack])
  end
  defp _execute([:multiply | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a * b | rest_of_stack])
  end
  defp _execute([:label | [_label | code]], stack) do
    _execute(code, stack)
  end
  defp _execute([:inspect | code], stack) do
    IO.puts hd(stack)
    _execute(code, stack)
  end
  defp _execute([:call | code], [label | stack]) do
    new_stack = _execute(subroutine(code, label), stack)
    _execute(code, new_stack)
  end
  defp _execute([:return | _code], stack), do: stack
  defp _execute([:halt | _code], _stack), do: :halted

  defp subroutine([:label | [label | code]], label), do: code
  defp subroutine([_ | code], label), do: subroutine(code, label)
end  
iex(3)> StackMachine.execute [:push, 4, :push, 5, :push, :add_and_double, :call, :inspect, :halt, :label, :add_and_double, :add, :push, 2, :multiply, :return]  
18  
:halted

As intended, the number 18 is printed to the console and then the virtual machine halts. Great success!

In our next episode, we'll take a look at making our subroutines callable no matter where they reside in the bytecode. And it all begins with an attempt to make our virtual machine catch fire.