
Fermi (microarchitecture)

Nvidia Fermi
Fabrication process: 40 nm and 28 nm
History
Predecessor: Tesla
Successor: Kepler

Fermi is the codename for a GPU microarchitecture developed by Nvidia as the successor to the Tesla microarchitecture. It was the primary microarchitecture used in the GeForce 400 series and GeForce 500 series. It was followed by Kepler and was used alongside Kepler in the GeForce 600 series, GeForce 700 series, and GeForce 800 series, in the latter two only in mobile GPUs. In the workstation market, Fermi was used in the Quadro x000 series and Quadro NVS models, as well as in Nvidia Tesla computing modules. All desktop Fermi GPUs were manufactured on a 40 nm process; mobile Fermi GPUs were manufactured on 40 nm and 28 nm processes.

The architecture is named after Enrico Fermi, an Italian physicist.

Fermi Graphics Processing Units (GPUs) feature 3.0 billion transistors; a schematic is sketched in Fig. 1.

Each streaming multiprocessor (SM) features 32 single-precision CUDA cores, 16 load/store units, four Special Function Units (SFUs), a 64 KB block of high-speed on-chip memory (see the L1+Shared Memory subsection), and an interface to the L2 cache (see the L2 Cache subsection).

Load/Store Units: Allow source and destination addresses to be calculated for 16 threads per clock, and load and store data from/to cache or DRAM.
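
As a rough illustration of the load/store figure, the number of clocks needed to compute addresses for one warp follows from dividing the warp size by the number of units. The sketch below assumes a 32-thread warp (the standard CUDA warp size, which this section does not state explicitly); the helper name `clocks_per_warp` is hypothetical and exists only for this example.

```python
import math

WARP_SIZE = 32          # assumption: threads per warp (standard CUDA warp size)
LOAD_STORE_UNITS = 16   # from the text: addresses for 16 threads per clock


def clocks_per_warp(units_per_sm: int, warp_size: int = WARP_SIZE) -> int:
    """Clocks needed for `units_per_sm` lanes to cover one warp,
    assuming each lane handles one thread per clock."""
    if units_per_sm <= 0:
        raise ValueError("units_per_sm must be positive")
    return math.ceil(warp_size / units_per_sm)


ldst_clocks = clocks_per_warp(LOAD_STORE_UNITS)
print(ldst_clocks)  # 2
```

Under these assumptions, address calculation for a full warp takes two clocks on the 16 load/store units.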

Special Function Units (SFUs): Execute transcendental instructions such as sine, cosine, reciprocal, and square root. Each SFU executes one instruction per thread, per clock; with four SFUs per SM, a 32-thread warp executes over eight clocks. The SFU pipeline is decoupled from the dispatch unit, allowing the dispatch unit to issue to other execution units while the SFU is occupied.
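
The benefit of decoupling can be sketched with a toy timing model. This is not a description of Fermi's actual scheduler: it assumes a 32-thread warp, one instruction issued per clock at most, no data dependencies, and busy times derived only from the unit counts given above (four SFUs, 16 load/store units). The function `total_clocks` and the instruction labels are hypothetical.

```python
WARP_SIZE = 32  # assumption: threads per warp (standard CUDA warp size)

# Clocks a unit type stays busy with one warp instruction (toy model).
BUSY_CLOCKS = {
    "sfu": WARP_SIZE // 4,    # four SFUs per SM -> 8 clocks
    "ldst": WARP_SIZE // 16,  # 16 load/store units per SM -> 2 clocks
}


def total_clocks(instructions, decoupled):
    """Return the clock at which the last instruction finishes."""
    next_issue = 0  # earliest clock at which the dispatcher can issue
    unit_free_at = {kind: 0 for kind in BUSY_CLOCKS}
    finish = 0
    for kind in instructions:
        # An instruction starts once the dispatcher and its unit are both free.
        start = max(next_issue, unit_free_at[kind])
        end = start + BUSY_CLOCKS[kind]
        unit_free_at[kind] = end
        finish = max(finish, end)
        # Decoupled: the dispatcher moves on at the next clock.
        # Coupled: the dispatcher waits until the unit has finished.
        next_issue = start + 1 if decoupled else end
    return finish


program = ["sfu", "ldst", "ldst", "ldst"]
print(total_clocks(program, decoupled=False))  # 14
print(total_clocks(program, decoupled=True))   # 8
```

In this model the three load/store instructions complete while the SFU is still busy, so the decoupled case finishes when the SFU instruction does.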

Integer Arithmetic Logic Unit (ALU): Supports full 32-bit precision for all instructions, consistent with standard programming language requirements. It is also optimized to efficiently support 64-bit and extended precision operations.
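
To illustrate what supporting 64-bit operations on a 32-bit ALU involves, the sketch below builds a 64-bit addition from two 32-bit additions linked by a carry. It shows the general carry-propagation technique only and is not a description of how Fermi implements it in hardware; the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add32_with_carry(a, b, carry_in=0):
    """Add two 32-bit values plus a carry; return (32-bit sum, carry out)."""
    total = (a & MASK32) + (b & MASK32) + carry_in
    return total & MASK32, total >> 32


def add64_from_32bit_ops(x, y):
    """Add two 64-bit values using only 32-bit additions (wraps modulo 2**64)."""
    # Low halves first; the carry out feeds the high-half addition.
    lo, carry = add32_with_carry(x & MASK32, y & MASK32)
    hi, _ = add32_with_carry((x >> 32) & MASK32, (y >> 32) & MASK32, carry)
    return (hi << 32) | lo


print(hex(add64_from_32bit_ops(0xFFFFFFFF, 1)))  # 0x100000000
```

The same pattern extends to wider (extended-precision) integers by chaining further 32-bit additions through the carry.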


Wikipedia