DLX

Designer	John L. Hennessy and David A. Patterson
Bits	32-bit
Introduced	1990s
Version	1.0
Design	RISC
Type	Register-Register & Load-store
Encoding	Fixed
Branching	Condition register
Endianness	Bi-endian
Extensions	None, but MDMX & MIPS-3D could be used
Open	Yes
Registers
General purpose	32 (R0=0)
Floating point	32 (paired DP for 32-bit)

The DLX (pronounced "Deluxe") is a RISC processor architecture designed by John L. Hennessy and David A. Patterson, the principal designers of the Stanford MIPS and the Berkeley RISC designs, respectively. These two designs are the benchmark examples of RISC design, which takes its name from the Berkeley project.

The DLX is essentially a cleaned-up, modernized, and simplified MIPS CPU. It has a simple 32-bit load/store architecture, somewhat unlike the modern MIPS CPU. Because it was intended primarily for teaching purposes, the DLX design is widely used in university-level computer architecture courses.

There are two known implementations: ASPIDA and VAMP. The ASPIDA project resulted in a core with several notable features: it is open source, supports Wishbone, uses an asynchronous design, supports multiple ISAs, and is ASIC-proven. VAMP is a DLX variant that was mathematically verified as part of the Verisoft project. It was specified with PVS, implemented in Verilog, and runs on a Xilinx FPGA. A full stack from compiler to kernel to TCP/IP was built on it.

In the original MIPS architecture, one of the methods used to gain performance was to force all instructions to complete in one clock cycle. Compilers therefore had to insert "no-ops" (NOPs) wherever an instruction would definitely take longer than one clock cycle. Input and output activities (like memory accesses) specifically forced this behaviour, leading to artificial program bloat. In general, MIPS programs were forced to contain many wasteful NOP instructions, an unintended consequence of the design. The DLX architecture does not force single-clock-cycle execution and is therefore immune to this problem.
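The NOP padding described above can be illustrated with a toy model. The following is a minimal sketch, not a description of any real MIPS compiler: it assumes that only loads take longer than one cycle, and that a single NOP is enough when the very next instruction reads the loaded register. The `Instr` type and `pad_load_delays` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical, simplified instruction record (not a real encoding)."""
    op: str                       # mnemonic, e.g. "LW", "ADD", "NOP"
    dest: Optional[str] = None    # register written, if any
    srcs: Tuple[str, ...] = ()    # registers read


NOP = Instr("NOP")


def pad_load_delays(program):
    """Insert a NOP after each load whose result the next instruction reads.

    Assumption: one delay slot per load; real pipelines may differ.
    """
    padded = []
    for i, instr in enumerate(program):
        padded.append(instr)
        nxt = program[i + 1] if i + 1 < len(program) else None
        if instr.op == "LW" and nxt is not None and instr.dest in nxt.srcs:
            padded.append(NOP)
    return padded


program = [
    Instr("LW", "R1", ("R2",)),
    Instr("ADD", "R3", ("R1", "R4")),   # reads R1 right after the load
    Instr("LW", "R5", ("R2",)),
    Instr("SUB", "R6", ("R7", "R8")),   # does not read R5, so no NOP is needed
]

padded = pad_load_delays(program)
print([i.op for i in padded])
```

In this model the four-instruction program grows to five instructions, which is the kind of program bloat the paragraph refers to.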

In the DLX design, a more modern approach to handling long instructions was used: data forwarding and instruction reordering. Longer instructions are "stalled" in their functional units and then re-inserted into the instruction stream when they can complete. Externally, this behaviour makes it appear as if execution had occurred linearly.
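As a rough illustration of stalling in hardware, the sketch below models a simplified in-order pipeline with forwarding. It is an assumption-laden toy, not the actual DLX pipeline: it does not model instruction reordering, it assumes ALU results are forwarded with no penalty, and it assumes a load followed immediately by a dependent instruction costs exactly one stall cycle. The function name `run_with_interlock` and the tuple layout are hypothetical.

```python
def run_with_interlock(program):
    """Return (issue_cycles, stalls) for a list of (op, dest, srcs) tuples.

    Toy model: one instruction issues per cycle unless it reads the
    destination of an immediately preceding load, in which case the
    hardware holds it back for one cycle. The program is not modified.
    """
    cycle = 0
    stalls = 0
    issue_cycles = []
    prev = None
    for op, dest, srcs in program:
        if prev is not None and prev[0] == "LW" and prev[1] in srcs:
            cycle += 1      # hardware-inserted bubble, invisible to software
            stalls += 1
        issue_cycles.append(cycle)
        cycle += 1
        prev = (op, dest, srcs)
    return issue_cycles, stalls


program = [
    ("LW", "R1", ("R2",)),
    ("ADD", "R3", ("R1", "R4")),   # load-use dependency: one stall
    ("SUB", "R5", ("R3", "R6")),   # ALU result assumed forwarded: no stall
]

issue_cycles, stalls = run_with_interlock(program)
print(issue_cycles, stalls)
```

The program stays three instructions long; the delay shows up only as an extra cycle in the timing, so software sees execution in program order.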

DLX instructions can be broken down into three types: R-type, I-type, and J-type. R-type instructions are pure register instructions, with three register references contained in the 32-bit word. I-type instructions specify two registers and use 16 bits to hold an immediate value. Finally, J-type instructions are jumps, containing a 26-bit address.
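The three formats can be sketched as a small field decoder. The text above gives only the field widths; the bit positions used here (6-bit opcode in the top bits, 5-bit register fields, an 11-bit function field for R-type) follow the layout commonly presented for DLX and should be treated as an assumption. The `decode` function, its field names, and the opcode values in the examples are hypothetical, and the caller supplies the format rather than deriving it from a real opcode table.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def decode(word, fmt):
    """Split a 32-bit instruction word into fields for format 'R', 'I' or 'J'."""
    opcode = (word >> 26) & 0x3F              # 32 - 26 = 6 opcode bits
    if fmt == "R":
        return {
            "opcode": opcode,
            "rs1": (word >> 21) & 0x1F,       # first source register
            "rs2": (word >> 16) & 0x1F,       # second source register
            "rd": (word >> 11) & 0x1F,        # destination register
            "func": word & 0x7FF,             # remaining 11 bits
        }
    if fmt == "I":
        return {
            "opcode": opcode,
            "rs1": (word >> 21) & 0x1F,
            "rd": (word >> 16) & 0x1F,
            "imm": sign_extend(word & 0xFFFF, 16),
        }
    if fmt == "J":
        return {
            "opcode": opcode,
            "offset": sign_extend(word & 0x3FFFFFF, 26),
        }
    raise ValueError(f"unknown format: {fmt!r}")


# Example words built by hand; the opcode numbers are arbitrary placeholders.
r_word = (0 << 26) | (1 << 21) | (2 << 16) | (3 << 11) | 0x20
i_word = (8 << 26) | (2 << 21) | (1 << 16) | 0xFFFC
j_word = (2 << 26) | 0x3FFFFFF

print(decode(r_word, "R"))
print(decode(i_word, "I"))
print(decode(j_word, "J"))
```

Because every format is 32 bits wide with the opcode in the same position, a decoder can read the opcode first and then select the remaining field layout, which is one reason fixed-width encodings are convenient for teaching.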

...
Wikipedia