Oracle Solaris Cluster (sometimes Sun Cluster or SunCluster) is a high-availability cluster software product for Solaris, originally created by Sun Microsystems, which was acquired by Oracle Corporation in 2010. It is used to improve the availability of software services such as databases, network file sharing, electronic commerce websites, and other applications. Solaris Cluster operates with redundant computers, called nodes, so that one or more nodes continue to provide service if another fails. Nodes may be located in the same data center or on different continents.
Solaris Cluster provides services that remain available even when individual nodes or components of the cluster fail. It offers two types of highly available (HA) services: failover services and scalable services.
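The difference between the two service types can be illustrated with a small toy model. The sketch below is not the Solaris Cluster API; the class `ToyCluster`, its methods, and the node names are hypothetical and exist only for illustration. It assumes the usual meaning of the terms: a failover service runs on one node at a time and moves to another node when its host fails, while a scalable service runs on several nodes at once.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """Toy model of a cluster (hypothetical, not the Solaris Cluster API)."""

    nodes: list[str]
    failed: set[str] = field(default_factory=set)

    def healthy_nodes(self) -> list[str]:
        # Nodes that have not been marked as failed, in configured order.
        return [n for n in self.nodes if n not in self.failed]

    def fail(self, node: str) -> None:
        # Simulate the loss of a single node.
        self.failed.add(node)

    def failover_hosts(self) -> list[str]:
        # A failover service runs on exactly one healthy node at a time;
        # here it is simply the first healthy node in the list.
        return self.healthy_nodes()[:1]

    def scalable_hosts(self) -> list[str]:
        # A scalable service runs on every healthy node simultaneously.
        return self.healthy_nodes()


cluster = ToyCluster(nodes=["node1", "node2", "node3"])
before = (cluster.failover_hosts(), cluster.scalable_hosts())

cluster.fail("node1")
after = (cluster.failover_hosts(), cluster.scalable_hosts())

print("before failure:", before)
print("after failure: ", after)
```

In this model both services stay available after `node1` fails: the failover service moves to `node2`, and the scalable service continues on the two remaining nodes with reduced capacity.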
To eliminate single points of failure, a Solaris Cluster configuration has redundant components, including multiple network connections and data storage with multiple connections via a storage area network. Clustering software such as Solaris Cluster is a key component in a business continuity solution, and the Solaris Cluster Geographic Edition was created specifically to address that requirement.
Solaris Cluster is an example of kernel-level clustering software. Some of its processes run as normal system processes on the host systems, but it also has special access to operating system or kernel functions on those hosts.
In June 2007, Sun released the source code to Solaris Cluster via the OpenSolaris HA Clusters community.
Solaris Cluster Geographic Edition (SCGE) is a management framework that was introduced in August 2005. It enables two Solaris Cluster installations to be managed as a unit, in conjunction with one or more data replication products, to provide disaster recovery for a computer installation. Because data updates are continuously replicated to a remote site in near-real time, that site can rapidly take over the provision of a service if the entire primary site is lost as a result of a disaster, whether natural or man-made. This is key to minimizing the recovery point objective (RPO) and recovery time objective (RTO) for the service.
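The relationship between replication lag, takeover time, and the two objectives can be made concrete with a short calculation. The sketch below is illustrative only: the function `achieved_recovery`, the timestamps, and the target values are all hypothetical and are not taken from any Solaris Cluster documentation. It treats the data-loss window as the time between the last replicated update and the disaster, and the downtime as the time between the disaster and service restoration at the remote site.

```python
from __future__ import annotations

from datetime import datetime, timedelta


def achieved_recovery(
    last_replicated: datetime,
    disaster: datetime,
    service_restored: datetime,
) -> tuple[timedelta, timedelta]:
    """Return (data-loss window, downtime) for one simulated disaster."""
    data_loss_window = disaster - last_replicated  # compared against the RPO
    downtime = service_restored - disaster         # compared against the RTO
    return data_loss_window, downtime


# Hypothetical scenario: replication lags by 5 seconds and the remote
# site takes over 10 minutes after the primary site is lost.
disaster = datetime(2024, 1, 1, 12, 0, 0)
loss, downtime = achieved_recovery(
    last_replicated=disaster - timedelta(seconds=5),
    disaster=disaster,
    service_restored=disaster + timedelta(minutes=10),
)

# Hypothetical objectives chosen for this example.
rpo = timedelta(seconds=30)
rto = timedelta(minutes=15)
meets_objectives = loss <= rpo and downtime <= rto

print("data-loss window:", loss)
print("downtime:", downtime)
print("meets objectives:", meets_objectives)
```

Under these assumed numbers, near-real-time replication keeps the data-loss window well inside the RPO, and a prompt takeover by the remote site keeps the downtime inside the RTO.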