High-Availability
High-availability is achieved at three levels: replication, which typically requires manual failover; clustering, where failover is automatic (and which typically includes replication as well); and multi-site high-availability, which can withstand an infrastructure failure affecting an entire facility.
The common configuration for high-availability infrastructure includes load-balanced front-end Web servers, clustered application servers, a database cluster with a separate standby replica, and mirrored onboard or external storage. Additionally, all critical components such as switches, routers, firewalls, and load-balancers are deployed in pairs to ensure redundancy at every level. Finally, mirrored components are split up among independent racks that reside on separate Power Distribution Units (PDUs), so that the two fault domains remain completely discrete.
Load Balancing
Load balancing provides both fault tolerance and scalability. External software-based load balancers from Logicworks’ strategic partner Zeus Technology continuously check the health of individual Web servers and route traffic around any problematic server. Scalability is as simple as adding a new server to the pool of existing servers. Load balancing similarly facilitates ongoing maintenance: a server can be patched without disrupting service, and a patch or an application upgrade can be tested by adding the upgraded server to the pool and pulling it out if the upgrade causes unforeseen issues.
Logicworks load balancing solutions are GigE capable, and scalable to full wire-speed. Load balancers themselves can be configured in a clustered, multi-site, or cold-standby configuration. The Zeus platform also provides for SSL offload, as well as unwrap and re-wrap, giving secure applications all the capabilities and granularity of unencrypted load balancing.
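The health-checked pool described above can be illustrated with a minimal Python sketch. It is not the Zeus implementation; the `ServerPool` class, the `is_healthy` callable, and the server names are hypothetical, and a real load balancer would probe servers over the network rather than call a local function.

```python
class ServerPool:
    """Round-robin pool that routes around servers failing a health check."""

    def __init__(self, servers, is_healthy):
        self.servers = list(servers)
        self.is_healthy = is_healthy  # hypothetical callable: server name -> bool
        self._next = 0

    def add(self, server):
        # Scaling out, or introducing an upgraded server for testing.
        self.servers.append(server)

    def remove(self, server):
        # Pulling a server out for patching, or backing out a bad upgrade.
        self.servers.remove(server)

    def pick(self):
        """Return the next healthy server, or None if no server is healthy."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if self.is_healthy(server):
                return server
        return None


down = {"web2"}  # assumed state: web2 is currently failing its checks
pool = ServerPool(["web1", "web2", "web3"], lambda s: s not in down)
picks = [pool.pick() for _ in range(4)]  # web2 is never selected
```

Adding capacity or testing an upgrade is then a call to `pool.add(...)`, and backing out a bad upgrade is a call to `pool.remove(...)`, without touching the other servers.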
Replication
Replication strategies are applicable to databases and file systems, and typically involve manual intervention for failover. While the terminology for database replication schemes varies (standby server for Oracle, slave for MySQL, and mirror for MS-SQL), the concepts are relatively consistent. Similar strategies exist for filesystems, such as rsync or DFS. Replicas can be used for reporting, backups, or read-only server pools without increasing load on the primary. Replicas can also be situated in an alternate datacenter, providing multi-site fault tolerance. Since corruption or unintended deletions on a master will also be applied to the replica, point-in-time backup and recovery is typically part of a replication solution as well.
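To see why a replica alone does not protect against an unintended deletion, consider the following Python sketch. It models replication as replaying an ordered change log; the `replay` function, the log format, and the sample records are all hypothetical and do not correspond to any particular database's replication format.

```python
def replay(log, until=None):
    """Rebuild key/value state by applying changes in order.

    `log` is assumed to be sorted by timestamp. If `until` is given,
    changes with a later timestamp are ignored (point-in-time recovery).
    """
    state = {}
    for timestamp, op, key, value in log:
        if until is not None and timestamp > until:
            break
        if op == "set":
            state[key] = value
        elif op == "delete":
            state.pop(key, None)
    return state


log = [
    (1, "set", "invoice-1", "paid"),
    (2, "set", "invoice-2", "open"),
    (3, "delete", "invoice-1", None),  # unintended deletion on the primary
]

replica = replay(log)             # the replica faithfully applies the deletion too
recovered = replay(log, until=2)  # point-in-time recovery to just before it
```

The replica ends up missing the deleted record exactly as the primary does, while replaying the log only up to timestamp 2 restores it.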
Clustering
Clusters allow for automated failover from a primary to a secondary server within a few seconds, without any loss of transactions written to disk. Logicworks provides a full suite of clustering solutions for the major database platforms (Oracle, MySQL and MS-SQL) as well as most filesystems. Depending on the application, secondary cluster members may also be accessed in read-only mode for reporting, backups or replication. Clustering solutions are built on either shared SAN storage or NAS- or DRBD-based storage. Since both cluster members can be affected by a corrupt block device, Logicworks recommends that all cluster solutions include some form of replication.
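One common way to automate failover is a heartbeat timeout, sketched below in Python. This illustrates the general idea only and is not how any specific cluster product works; the `FailoverMonitor` class and its three-second timeout are assumptions, and production clusters also need safeguards such as fencing to keep both members from becoming active at once.

```python
class FailoverMonitor:
    """Promotes the secondary when the primary's heartbeat goes silent."""

    def __init__(self, timeout_seconds):
        self.timeout = timeout_seconds
        self.last_heartbeat = None
        self.active = "primary"

    def heartbeat(self, now):
        # Called whenever a heartbeat from the primary arrives.
        self.last_heartbeat = now

    def check(self, now):
        """Return the active member, failing over if the primary is overdue."""
        overdue = (
            self.last_heartbeat is not None
            and now - self.last_heartbeat > self.timeout
        )
        if self.active == "primary" and overdue:
            self.active = "secondary"
        return self.active


monitor = FailoverMonitor(timeout_seconds=3)  # hypothetical timeout
monitor.heartbeat(now=0)
before = monitor.check(now=2)  # heartbeat is recent, primary stays active
after = monitor.check(now=4)   # more than 3 seconds of silence, secondary takes over
```

Timestamps are passed in explicitly so the logic is easy to follow. Note that this sketch never fails back on its own: once the secondary is active, it stays active even if the primary resumes sending heartbeats.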
Warm Standby
Virtualization technology from VMware enables an alternate form of high-availability, a warm standby, which is appropriate for servers where data changes infrequently. When servers are deployed as virtual machines on a pair (or more) of ESX servers, snapshots of the virtual machines can be copied to all ESX hosts and started up when either the primary ESX server or the virtual machine goes down.
Multi-Site Redundancy
True disaster recovery capability requires multi-site configurations, with duplicate hardware at a secondary facility. All servers at a single facility, even if they share no hardware, are subject to the same environmental conditions, a risk that is mitigated only by a presence in one of Logicworks' alternate facilities. Depending on business requirements, failover to the secondary facility can be implemented manually, through global server load balancing, or through round-robin DNS.







