The PA-8000 (PCX-U), code-named Onyx, is a microprocessor developed and fabricated by Hewlett-Packard (HP) that implemented the PA-RISC 2.0 instruction set architecture (ISA). It was a completely new design with no circuitry derived from previous PA-RISC microprocessors. The PA-8000 was introduced on 2 November 1995 when shipments began to members of the Precision RISC Organization (PRO). It was used exclusively by PRO members and was not sold on the merchant market. All follow-on PA-8x00 processors (PA-8200 to PA-8900, described further below) are based on the basic PA-8000 processor core.
The PA-8000 was used by:
The PA-8000 is a four-way superscalar microprocessor that executes instructions out-of-order and speculatively. These features were not found in previous PA-RISC implementations, making the PA-8000 the first PA-RISC CPU to break the tradition of using simple microarchitectures and high-clock rate implementation to attain performance.
The PA-8000 has a four-stage front-end. During the first two stages, four instructions are fetched from the instruction cache by the instruction fetch unit (IFU). The IFU contains the program counter, branch history table (BHT), branch target address cache (BTAC) and a four-entry translation lookaside buffer (TLB). The TLB is used to translate virtual address to physical addresses for accessing the instruction cache. In the event of a TLB miss, the translation is requested from the main TLB.
The PA-8000 performs branch prediction using static or dynamic methods. Which method the PA-8000 used was selected by a bit in each TLB entry. Static prediction considers most backwards branches as taken and forward branches as not taken. Static prediction also predicted the outcome of branches by examining hints encoded in the instructions themselves by the compiler.
Dynamic prediction uses the recorded history of a branch to decide whether it is taken or not taken. A 256-entry BHT is where this information is stored. Each BHT entry is a 3-bit shift register. The PA-8000 used a majority vote algorithm, a branch is taken if the majority of the three bits are set, and not taken if they are clear. A mispredicted branch causes a five-cycle penalty. The BHT is updated when the outcome of the branch is known. Although the PA-8000 can execute two branch instructions per cycle, only one of the outcomes is recorded as the BHT is not dual-ported to simplify its implementation.