High Availability
High Availability means delivering mission-critical applications quickly and reliably
to the clients who depend on your services. If a client can’t get to your services,
then they’re unavailable. Your company makes the money that sustains its business
only as long as one thing holds: your client base can shop online. Nerve
racking? You bet.
Not to sound overly simplistic, but systems up, servers serving, and the business
running is what High Availability is all about. Systems will fail, so how will your
company handle this failure? Anyone who has ever been in charge of a service that
needed to be up all the time and watched it crash knows how the company’s CEO or
vice presidents look at their angriest. High Availability is the industry term for systems
that are available 99.999 percent of the time (called “Five Nines”), and it is the way around
this. Five Nines is a way of saying a service or system will be up almost 100 percent of the
time. To achieve this level of availability, you need to deploy systems that can survive
failure. The ways to do this are clustering and load balancing.
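To make the Five Nines figure concrete, here is a minimal sketch that converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration.

```python
# Assumption: a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return how many minutes per year a service may be down."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("Two Nines", 99.0), ("Three Nines", 99.9),
                   ("Four Nines", 99.99), ("Five Nines", 99.999)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label:12} {pct:7.3f}%  {minutes:10.2f} minutes/year")
```

At 99.999 percent, the budget works out to a little over five minutes of downtime in an entire year, which is why a single server with no failover is unlikely to meet it.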
Throughout the book, you also learn about other forms of High Availability, such
as Redundant Array of Inexpensive Disks (RAID) and redundancy, in all aspects of
hardware and software components. You can see a simple example of a Highly Available
infrastructure in Figure 1-1. Although this book focuses on clustering and load-balancing
solutions, you’re given the big picture, so you can prepare almost all your components
for High Availability and redundancy.
Clustering and Load Balancing Defined
Clustering is a means of providing High Availability. A cluster is a group of machines
acting as a single entity to provide resources and services to the network. In time of
failure, a failover will occur to a system in that group that will maintain availability
of those resources to the network. You can be alerted to the failure, repair the system
failure, and bring the system back online to participate as a provider of services once
more. You learn about many forms of clustering in this chapter. Clustering can allow
for failover to other systems and it can also allow for load balancing between systems.
Load balancing is using a device, which can be a server or an appliance, to balance the
load of traffic across multiple servers waiting to receive that traffic. The device sends
incoming traffic based on an algorithm to the most underused machine or spreads the
traffic out evenly among all machines that are on at the time. A good example of using
this technology would be if you had a web site that received 2,000 hits per day. If, in
the months of November and December, your hit count tripled, you might be unable to sustain that type of increased load. Your customers might experience time outs, slow
response times, or worse, they might be unable to get to the site at all. With that picture
fresh in your mind, consider two servers providing the same web site. Now you have
an alternative to slow response time and, by adding a second or a third server, the
response time would improve for the customer. High Availability is provided because,
with this technology, you can always have your web site or services available to the
visiting Internet community. You have also systematically removed the single point
of failure from the equation. In Figure 1-2, you can see what a clustered solution can
provide you. A single point of failure is removed because you now have a form of
redundancy added in.
Pros and Cons to Clustering and Load Balancing
You could now be asking yourself, which is better to implement, clustering or load
balancing? You can decide this for yourself after you finish this book, when you know
all the details necessary to implement either solution. To give you a quick rundown of
the high-level pros and cons to each technology, consider the following. With clustering,
you depend on the actual clustered nodes to make a decision about the state of the
network and what to do in a failure. If Node A in a cluster senses a problem with Node
B (Node B is down), then Node A comes online. This is done with heartbeat traffic,
which is a way for Node A to know that Node B is no longer available and it must
come online to take over the traffic. With load balancing, a single device (a server
or an appliance) sends traffic to any available node in the load-balanced group of nodes. Load
balancing uses heartbeat traffic as well but, in this case, when a node goes offline, the
“load” is recalculated among the remaining nodes in the group. Also, with clustering
(not load balancing), you’re normally tied down or restricted to a small number of
participating nodes. For example, if you want to implement a clustered solution with
Windows 2000 Advanced Server, you might use a two-node cluster. With load balancing,
you can implement up to 32 nodes and, if you use a third-party utility, you can scale
way beyond that number. You can even mix up the operating system (OS) platforms, if
needed, to include Sun Solaris or any other system you might be running your services
on. Again, this is something that’s thoroughly explained as you work your way through
the book. This section is simply used to give you an idea of your options. Finally,
you have the option to set up tiered access to services and to mix both architectures
(clustering and load balancing) together. You can set up the first tier of access to your
web servers as load balanced and the last tier of access as your clustered SQL databases.
This is explained in more detail in the upcoming section on N-tier architecture, “N-Tier
Designs.”
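The heartbeat behavior described in this section can be sketched in a few lines. This is an illustration only, not how Windows clustering or NLB is implemented: the class names, the three-second timeout, and the least-used selection rule are all assumptions. Timestamps are passed in explicitly so the example behaves the same on every run.

```python
from dataclasses import dataclass

# Assumed value: seconds of silence before a node is treated as down.
HEARTBEAT_TIMEOUT = 3.0


@dataclass
class Node:
    name: str
    last_heartbeat: float       # time the last heartbeat was received
    active_connections: int = 0


class LoadBalancedGroup:
    """Hypothetical balancer: tracks heartbeats and dispatches to live nodes."""

    def __init__(self, names, now):
        self.nodes = [Node(name, now) for name in names]

    def heartbeat(self, name, now):
        """Record that `name` was heard from at time `now`."""
        for node in self.nodes:
            if node.name == name:
                node.last_heartbeat = now

    def live_nodes(self, now):
        """Nodes whose most recent heartbeat is within the timeout."""
        return [n for n in self.nodes
                if now - n.last_heartbeat <= HEARTBEAT_TIMEOUT]

    def dispatch(self, now):
        """Send one request to the least-used live node and return its name."""
        live = self.live_nodes(now)
        if not live:
            raise RuntimeError("no nodes available")
        target = min(live, key=lambda n: n.active_connections)
        target.active_connections += 1
        return target.name


group = LoadBalancedGroup(["A", "B", "C"], now=0.0)
print([group.dispatch(now=1.0) for _ in range(3)])  # all three nodes are live

# Nodes A and C keep sending heartbeats; node B goes silent.
group.heartbeat("A", now=4.0)
group.heartbeat("C", now=4.0)
print([group.dispatch(now=5.0) for _ in range(4)])  # B is skipped
```

Once node B misses the timeout, new requests are shared between the remaining nodes only. A passive node in a two-node cluster could use the same timeout test to decide when to come online.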
Hot Spare
A hot spare is a machine you can purchase and configure to be a mirror image of the
machine you want to replace if a failure occurs. Figure 1-3 shows an example of a hot
spare in use. A hot spare can be set aside for times of disaster, but it could sit there unused,
waiting for a failure. When the disaster occurs, the hot spare is brought online to participate
in the place of the systems that failed. This isn’t a good idea because the system sitting idle
isn’t being used and, in many IT shops, it will be “borrowed” for other things. This means
you never have that hot spare. Even administrators who manage to keep the hot spare as a
spare are missing out on using that machine as a balancer of the load. Also, why
configure the hot spare in time of failure? Your clients lose connectivity and you have to
remove the old machine, and then replace it with the new one and have all your clients
reconnect to it. Or, worse yet, the angry client shopping online could be gone forever to
shop somewhere else online if it’s a web server hosting an ecommerce site. Setting up a
second server as a hot spare is redundant, but there is a better way. Set this second machine
up in a cluster. Although the hot spare method might seem a little prehistoric, it’s still
widely used in IT shops that can’t afford highly available systems, but still need some form
of backup solution.
A Need for Redundancy
You already learned about some forms of redundancy in the first few portions of this
chapter in the discussion on clustering. Now let’s look at why redundancy of systems
is so important and what options you have besides a cluster. Being redundant (or
superfluous) is the term used to explain exceeding what’s necessary. If this is applied
to an IT infrastructure, then it would be easy to say that if you need a power supply to
power your server, then two power supplies would exceed what’s necessary. Of course,
in time of failure, you always wish you’d exceeded what you need, correct? The need
for redundancy is obvious if you want to have your business continue operations in
time of disaster.
The need for redundancy is apparent in a world of High Availability. Your options
today are overwhelming. You can get redundant “anything” in the marketplace. You
can purchase servers from Dell and Compaq with redundant power supplies: if one
fails, the other takes over. You have redundant power supplies in Cisco Catalyst switches,
for example. For a Catalyst 4006, you can put in up to three redundant power supplies.
This is the kind of design you want when configuring your core network. Redundancy
can extend beyond hardware components into the logical configuration of
routes in your routers and wide area network (WAN) protocol technologies. For example,
if your frame relay network drops off the face of the Earth, your router can
dial around it using ISDN. All in all, redundant services are key to a Highly Available
network design.
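A rough way to see what a redundant component buys you is to compute the chance that at least one copy is still working. The sketch below assumes the copies fail independently, which real hardware only approximates, and the 99 percent single-unit figure is an illustrative number, not a vendor specification.

```python
def combined_availability(unit_availability: float, count: int) -> float:
    """Availability of `count` redundant units when any one of them suffices.

    Assumes failures are independent of each other.
    """
    return 1 - (1 - unit_availability) ** count


# Illustrative assumption: one power supply is available 99% of the time.
single = 0.99
for count in (1, 2, 3):
    print(f"{count} power supplies: {combined_availability(single, count):.6%}")
```

Under these assumptions, each added power supply shrinks the expected unavailability by a factor of one hundred, which is why redundancy in otherwise ordinary parts is a common route toward Five Nines.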
Manageability
With clustered solutions, you have the benefit of managing your systems as one
system. When you configure clustering with network load balancing (NLB) and with
Application Center 2000, you find that setting up and managing systems under one
console, and monitoring performance under one console, makes your life much easier.
Because we all know life as a Network and Systems administrator is far from easy, this
can be an incredible help to your efforts.
Reliability
Reliability is being able to guarantee you’ll have services available to requests from
clients. Think about it: you buy a brand new car—don’t you want it to be reliable?
The theory is the same when dealing with mission-critical network services. If server
components fail, you can plan outages, usually at night and in off hours, to repair them. What
if you run 24-hour-a-day operations? You want to be able to absorb the disaster that
occurs and reliably deliver the service you offer.
Scalability
Scalability is your option to grow above and beyond what you’ve implemented today.
For instance, say you purchased two servers to configure into a cluster with a separate shared storage device. If you want to say the solution you have is scalable, then you
would say you could add two more servers to that clustered group when the need for
growth arrived. Scalability (or being able to scale) is a term you would use to explain
that capability to grow either up or out of your current solution.
Scale Up
Scaling up is the term for building up a single machine. If you have one server,
and that server provides printing services to all the clients on your network, you
might want to increase its memory because, while performance monitoring the server,
you see that virtual memory is constantly paged from your hard disk. The fact that you
are “adding” to a single system to build it up and not adding more systems to share the
load means you are scaling up, as seen in Figure 1-4.
Scale Out
Scaling out is clustering as seen in Figure 1-5. You have one server providing a web site
to clients and, while performance monitoring, you notice page hits have increased by
50 percent in one month. You are exceeding limits on your current hardware, but you
don’t want to add more resources to this single machine. You decide to add another
machine and create a cluster.
CLUSTERING WITH NT 4.0
Before you get into the high-level overview of clustering and load balancing with
Windows 2000 and the Server 2003 platforms, you should know where this all started.
I won’t go over the history of clustering and how Microsoft got involved, but I’ll give
you an overview on why Windows 2000 clustering is a worthy solution to implement
on your network.
Windows 2000 Clustering Services were first born on the Windows NT 4.0 Server
Enterprise Edition. On hearing of its arrival and implementing the services, those
involved quickly discovered this wasn’t something they wanted to implement on their
mission-critical applications. Microsoft Cluster Server, code-named “Wolfpack,”
wasn’t reliable. A plethora of problems occurred while running the service, including
slow performance when using Fibre Channel and large numbers of hard disks that
stopped serving clients altogether for no apparent reason, only for administrators to
discover later it was another bug. This defeated the entire purpose of clustering in the first place and many
quickly lost faith in the solution Microsoft had provided. Faith wasn’t restored when
most of the fixes you could implement were supplied from Microsoft in the form of a
tool called: “Install the latest service pack.”
Fast-forward to Windows 2000 and you have a whole different solution, which you
discover throughout this book. All in all, the service has grown exponentially with the
newer releases of Windows server-based OSs, and has become a reliable and applicable
solution in your network infrastructure. If you plan to design an NT cluster, be aware
that NT Server 4.0 doesn’t support clustering, but it will work with load balancing.
Windows NT 4.0 Enterprise Edition will work with load balancing and can be clustered
with two nodes.