Stack machine, part 3 - Burst into flames

In the first part we built a very basic stack-based machine and we introduced the concept of subroutines in the second part.

Halt and catch fire

Granting the name to a highly enjoyable TV series, the fictive halt and catch fire instruction is too geeky not to include in our small, but growing instruction set. Use it anytime you want to make your virtual machine completely unresponsive.

  defp _execute([:hcf, _code], _stack) do
    :timer.sleep :infinity
  end

The downside with this approach? It's too efficent. Your computer will not be set ablaze. You won't even hear your cooling fans revving.

We can add instructions all day long by implementing them in Elixir, but if we could make it possible to call a subroutine from itself we can create an endless loop that will at least consume some CPU cycles. It won't be enough to combust our computer, but at least it will closer to it.

Going back

In our current version, when calling a subroutine we will traverse the bytecode until we find a matching label. The bytecode beyond that label will be executed, typically until a return instruction is encountered at which point the subroutine will finish and bytecode execution will resume from the original call site.

This poses a number of problems, chief amongst them is that we can only invoke subroutines that are ahead of us in the bytecode. Thus the following bytecode cannot be executed.

[
  :push, 0, 
  :label, :increment_and_inspect, 
  :push, 1,
  :add,
  :inspect, 
  :push, :increment_and_inspect, 
  :call
]

Yes dear reader, that is a recursive subroutine call, and yes, we're not handling infinite recursion quite yet. But one problem at a time.

A number of ways of tackling this comes to mind, for example:

  1. Introduce a pointer to keep track of the original bytecode and use that bytecode to find the subroutine to be invoked.

  2. Extract subroutines when initially loading the bytecode.

  3. Transform the bytecode on load so that every instruction has a memory address that can be referred to by calls and jumps.

These all have their strengths and weaknesses. Let's go through them one by one to see how much we need to change to make use of them.

Retaining the original bytecode

Assuming we have access to the original bytecode, we can introduce a subroutine function and adjust the existing _subroutine function bodies.

  defp subroutine(label) do
    _subroutine(original_code, label)
  end

  defp _subroutine([:label | [label | code]], label), do: code
  defp _subroutine([_ | code], label), do: _subroutine(code, label)

We also need to change the call instruction to only pass the desired label to the new subroutine function.

  defp _execute([:call | code], [label | stack]) do
    new_stack = _execute(subroutine(label), stack)
    _execute(code, new_stack)
  end

Looks nice, but how do we get hold of the original bytecode without passing it through in every function call, thus changing every instruction we've implemented? Well, we could use an agent to hold it for us and wrap the fetching of the original bytecode in an original_code function. This requires expanding the public execute function to start the agent in addition to adding the function for fetching the original code.

  def execute(code) do
    Agent.start_link(fn -> %{original_code: code} end,
                     name: __MODULE__)
    _execute(code, [])
  end

  defp original_code do
    Agent.get(__MODULE__, fn(state) -> state.original_code end)
  end

Given the above changes, we should now be able to execute our bytecode example.

iex(4)> StackMachine.execute [
...(4)>   :push, 0,
...(4)>   :label, :increment_and_inspect,
...(4)>   :push, 1,
...(4)>   :add,
...(4)>   :inspect,
...(4)>   :push, :increment_and_inspect,
...(4)>   :call
...(4)> ] 
1
2
3
...

Congratulations! You now have an infinite loop in your iex session. Hit ctrl-c to terminate it. Later in the series we'll look at conditional branching, but for now we'll resort to more basic ways to end our test runs.

Given this approach, here's the full implementation so far.

defmodule StackMachine do
  def execute(code) do
    Agent.start_link(fn -> %{original_code: code} end,
                     name: __MODULE__)
    _execute(code, [])
  end

  defp original_code do
    Agent.get(__MODULE__, fn(state) -> state.original_code end)
  end

  defp _execute([], stack), do: stack
  defp _execute([:push | [value | code]], stack) do
    _execute(code, [value | stack])
  end
  defp _execute([:add | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a + b | rest_of_stack])
  end
  defp _execute([:multiply | code], [a | [b | rest_of_stack]]) do
    _execute(code, [a * b | rest_of_stack])
  end
  defp _execute([:label | [_label | code]], stack) do
    _execute(code, stack)
  end
  defp _execute([:inspect | code], stack) do
    IO.puts hd(stack)
    _execute(code, stack)
  end
  defp _execute([:call | code], [label | stack]) do
    new_stack = _execute(subroutine(label), stack)
    _execute(code, new_stack)
  end
  defp _execute([:return | _code], stack), do: stack
  defp _execute([:halt | _code], _stack), do: :halted

  defp subroutine(label) do
    _subroutine(original_code, label)
  end

  defp _subroutine([:label | [label | code]], label), do: code
  defp _subroutine([_ | code], label), do: _subroutine(code, label)
end

Now, while we have solved the main problem of not being able to invoke a subroutine again, we haven't addressed the ineffeciency of traversing the bytecode whenever we want to find the subroutine being invoked.

Extracting subroutines on load

What if we not only stored the original bytecode in an agent, but also extracted the various subroutines? That way we need not go through the bytecode list more than once.

  def execute(code) do
    Agent.start_link(
      fn -> %{subroutines: extract_subroutines(code)} end,
      name: __MODULE__)
    _execute(code, [])
  end

  def extract_subroutines(code) do
    _extract_subroutines(code, %{})
  end

  defp _extract_subroutines([], subroutines), do: subroutines
  defp _extract_subroutines([:label, label | code], subroutines) do
    _extract_subroutines(code, Map.put(subroutines, label, code))
  end
  defp _extract_subroutines([_ | code], subroutines) do
    _extract_subroutines(code, subroutines)
  end

We also get rid of the original_code and _subroutine functions and rework the subroutine function.

  defp subroutine(label) do
    Agent.get(__MODULE__,
              fn(state) -> state.subroutines[label] end)
  end

The ineffeciency mitigated, we have another problem to deal with and it's caused by having a single agent running. If we were to attempt executing something like the following, the problem will come out in the open.

iex(1)> code = [:label, :first, :halt]
[:label, :first, :halt]
iex(2)> StackMachine.execute code
:halted
iex(3)> more_code = [:push, :second, :call, :label, :second, :halt]
[:push, :second, :call, :label, :second, :halt]
iex(4)> StackMachine.execute more_code
** (FunctionClauseError) no function clause matching in StackMachine._execute/2
    stack_machine.ex:19: StackMachine._execute(nil, [])
    stack_machine.ex:37: StackMachine._execute/2

What just happened? Well, subroutine(:second) will return nil as there is no such key in the agent's state.

The first mistake we made when introducing an agent was not checking the return value of Agent.start_link! We can fix that particular omission by updating our execute function.

  def execute(code) do
    {:ok, _pid} = Agent.start_link(
      fn -> %{subroutines: extract_subroutines(code)} end,
      name: __MODULE__)
    _execute(code, [])
  end

Given this change, we get a different error when running the same bytecodes as above.

** (MatchError) no match of right hand side value: {:error, {:already_started, #PID<0.60.0>}}
    stack_machine.ex:3: StackMachine.execute/1

Great! Now we can't even call the execute function twice without blowing up. By allowing the initialisation of the virtual machine to be entangled with the repeated use of it, we have failed to make the execute function re-entrant. We could rewrite our execute function to set up a supervision tree handling both the code/function-holding agent and the process actually executing the bytecode. Or we could skip registering the agent's process name and insted pass its PID (process ID) along to all functions needing it. Or we could...

Notice that sinking feeling? It's the one telling you that we would've been better off if we had test-driven our design. In fact, we'll get to that in the next episode, but first let's briefly consider the ramifications of transforming the bytecode on load.

Transforming the bytecode on load

As we clearly saw in our last few attempts, blindly typing code is equally good for exploring our options as it's to getting us into more and more trouble. So let's stop digging for now and just think a little about the next option.

Imagine that we had the code in a data structure that more acted like an array in other programming languages. Indexing any element would be a matter of O(1) rather than O(n). Thus, we could escape having to traverse our bytecode when looking for a subroutine by advancing the program counter to the index of the instruction to execute. When loading the bytecode we would remove the :label instructions and replace their use with the index of the instruction to jump to.

So what data structure can we use for this in Elixir? How about a tuple? It does have the O(1) property we desire and we will only need pay the performance cost of creating the tuple once, when loading the bytecode. Well, unless we start to dabble with self-modifying code which I'm sure we'll get to eventually.

Essentially, we turn this:

[:push, 0, :label, :increment_and_inspect, :push, 1, :add, :inspect, :push, :increment_and_inspect, :call]

into this:

{:push, 0, :push, 1, :add, :inspect, :push, 2, :call}

This involves a bit more complexity than our subroutine-extracting solution as we need to juggle some state and indices (conceptually pointers, but it does remove the need for keeping a separate process around to figure out what code to execute when invoking a subroutine. In essence, it turns our indirect branching scheme into a direct branching one.

This does look like a reasonable path to explore. Just as the BEAM, the Erlang virtual machine, loads and transforms its bytecode on load, so can we. However, we'll not do that without getting some tests in place first.

Stay tuned for the next installment of this epic quest!