the j1 forth cpu

this article requires a basic understanding of forth, but is aimed at those with an intermediate understanding.

modern cpus are complex. x86-64 cpus have over 557 registers available to a programmer. some arm cpus can execute jvm bytecode via the jazelle mode. modern arm cpus have an instruction specifically for converting floating point numbers to integers the way javascript does. even the x86 vendors have had enough of their own architecture - VIA cpus have an undocumented alternate instruction set. most of this complexity is unnecessary. suffice it to say, we have a problem on our hands.

luckily for us, x86 and arm cpus aren't the only ones around. there are a variety of considerably more compact architectures: the likes of risc-v, mips, and powerpc. however, among the architectures that aren't x86 or arm, a unique category exists - stack machines.

forth computers

in the 70s and 80s, making your own computer architecture and instruction set was somewhat more common than it is now. some machines, like the xerox alto, supported programmable instruction sets - they read their microcode from a region of RAM that the user could write to, so the user could effectively implement their own ISA. however, a xerox system was incredibly expensive, and not at all the topic of this article.

stack machines are machines whose instructions take their operands from a stack, rather than from operands encoded alongside the instruction. the very first stack machines came into being in the 50s, however they don't really resemble the later 'forth computers', like the Novix chips. forth computers are characterised by two main traits: their dual stack (one stack for data, one for return addresses), and their focus on subroutine performance.
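to make the dual stack idea concrete, here is a minimal sketch of a toy stack machine in python. it is purely illustrative: the opcode names (`lit`, `dup`, `add`, `call`, `ret`) and the `run` function are assumptions made up for this example, not the instruction set of the j1, the novix, or any real chip. the point is only that operands live on a data stack, while return addresses live on a separate return stack.

```python
# a toy dual-stack machine. hypothetical opcodes, not a real instruction set.

def run(program, entry=0):
    data = []   # data stack: operands and results
    ret = []    # return stack: return addresses pushed by "call"
    pc = entry  # program counter: index of the next instruction
    while True:
        op, *args = program[pc]
        pc += 1
        if op == "lit":      # push a literal onto the data stack
            data.append(args[0])
        elif op == "dup":    # duplicate the top of the data stack
            data.append(data[-1])
        elif op == "add":    # pop two operands, push their sum
            b = data.pop()
            a = data.pop()
            data.append(a + b)
        elif op == "call":   # save the return address, then jump
            ret.append(pc)
            pc = args[0]
        elif op == "ret":    # return; an empty return stack halts the machine
            if not ret:
                return data
            pc = ret.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")

program = [
    ("lit", 3),    # 0: push 3
    ("call", 4),   # 1: call "double"
    ("call", 4),   # 2: call "double" again
    ("ret",),      # 3: return stack is empty here, so the machine halts
    ("dup",),      # 4: "double" starts here
    ("add",),      # 5
    ("ret",),      # 6
]

result = run(program)
print(result)
```

note that a call here touches only the return stack: nothing on the data stack has to be saved or shuffled around, so arguments and results are simply left in place for the callee and caller.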

traditionally, subroutine calls in computers are slow. if you use a lot of subroutines, your program slows down immensely. one of the first optimisations that a compiler does is inlining: pasting the code of a subroutine into the body of other subroutines, where a call would be. forth hardware is the opposite. calls