In computing, a data warehouse appliance (DWA) is a computer architecture for data warehouses (DW) marketed specifically for big data analysis and discovery. The term was coined by Foster Hinshaw to describe a system that is simple to use (rather than merely pre-configured) and delivers high performance for its workload. A DWA includes an integrated set of servers, storage, operating systems, and databases.
In marketing, the term evolved to include pre-installed and pre-optimized hardware and software as well as similar software-only systems promoted as easy to install on specific recommended hardware configurations or preconfigured as a complete system. These are marketing uses of the term and do not reflect the technical definition.
A DWA is designed specifically for high-performance big data analytics and is delivered as an easy-to-use packaged system. DW appliances are marketed for data volumes in the terabyte to petabyte range.
The data warehouse appliance (DWA) has several characteristics that differentiate it from similar machines in a data center, such as an enterprise data warehouse (EDW).
Most DW appliances use massively parallel processing (MPP) architectures to provide high query performance and platform scalability. An MPP architecture consists of independent processors or servers executing in parallel. Most MPP architectures implement a "shared-nothing architecture", in which each server operates self-sufficiently and controls its own memory and disk. DW appliances distribute data onto dedicated disk storage units connected to each server in the appliance. This distribution allows a DW appliance to resolve a relational query by scanning the data on each server in parallel. This divide-and-conquer approach delivers high performance and is designed to scale close to linearly as new servers are added to the architecture.
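The shared-nothing pattern described above can be illustrated with a minimal Python sketch. It is not the implementation of any particular appliance: the function names (`distribute`, `local_scan`, `parallel_query`), the sample `order_id`/`amount` rows, and the use of hash distribution are all assumptions made for illustration, and threads stand in for the independent servers of a real system.

```python
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def distribute(rows, num_nodes, key):
    """Hash-distribute rows so that each simulated node owns a disjoint slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        # crc32 gives a stable hash, so placement is the same on every run.
        node_id = crc32(str(row[key]).encode()) % num_nodes
        nodes[node_id].append(row)
    return nodes


def local_scan(partition, min_amount):
    """Scan one node's own data and return a partial aggregate (sum, count)."""
    total = 0
    count = 0
    for row in partition:
        if row["amount"] >= min_amount:
            total += row["amount"]
            count += 1
    return total, count


def parallel_query(nodes, min_amount):
    """Run the scan on every node in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda part: local_scan(part, min_amount), nodes))
    total = sum(t for t, _ in partials)
    count = sum(c for _, c in partials)
    return total, count


# Hypothetical sample data: 100 orders with amounts 10, 20, ..., 1000.
rows = [{"order_id": i, "amount": i * 10} for i in range(1, 101)]
nodes = distribute(rows, num_nodes=4, key="order_id")

# Roughly: SELECT SUM(amount), COUNT(*) FROM orders WHERE amount >= 500
total, count = parallel_query(nodes, min_amount=500)
print(total, count)
```

Each node touches only the rows it owns, and only the small partial aggregates are combined at the end, so the merged result does not depend on how many nodes the data is spread across.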
"Data warehouse appliance" is a term coined by Foster Hinshaw, the founder of Netezza. In creating the first data warehouse appliance, Hinshaw and Netezza built on the foundations developed by Model 204, Teradata, and others to pioneer a new category that addresses consumer analytics efficiently by providing a modular, scalable, easy-to-manage, and cost-effective database system.