Thursday, December 20, 2007

SERVER 2003 CLUSTERING AND LOAD BALANCING

With the upcoming release of Server 2003 on the horizon, now’s the time to start
thinking about using this platform for your Clustered solutions as well. Windows
2000 will be around for quite some time. Companies haven’t even moved away from
NT 4 yet and have little to no intention of doing so. At some point in the next decade, Microsoft will also take a stance on Windows 2000 and its end-of-life (EOL) sequence. What's next, you ask? A product called
Server 2003 will eventually replace Windows 2000. This book looks at clustering
and load-balancing Server 2003. One of the most confusing pieces of Microsoft’s new
naming convention is that it has also retired its BackOffice solution and upgraded the name to Server 2003 Enterprise servers (“BackOffice” is the name that applied to running Exchange 5.5 or Proxy 2.0 on top of Windows NT 4.0). Windows 2000 also
has services that can be added to it, such as Exchange 2000 and Internet Security
and Acceleration (ISA) Server 2000, which are the subsequent upgrades from the
previously mentioned products.
Windows Server 2003 Enterprise Servers
The name Server 2003 can be confusing. I want to demystify this term, so you
understand how it will be referenced throughout the remainder of this book. You
have the OS, which is slated to succeed Windows 2000 Server, and you have the Server 2003 Enterprise server line, such as SQL 2000. And then you have the products just mentioned, like Exchange 2000 and ISA 2000. My goal is to cover the configuration and installation of clustered services that combine with most of these products. SQL 2000 is covered
in great detail because it’s a big player in N-tier architecture. You’ll most likely be
involved with N-tier architecture while configuring High Availability solutions.
Windows Server 2003
At press time, the full version of Windows Server 2003 wasn't yet released; it was at Release Candidate 2 (RC2), almost out of testing and ready for full production. After you
read this book, you’ll already know how to configure and cluster the full version of
Windows Server 2003. The program’s release should be in sync with this book’s release.
What I want to accomplish is to lay out the overall strategy and enhancements, so you
can consider this product in your upgrade or migration path for the future. Or, even more important, you could find the product's enhancements so superior that you'll want to implement it immediately on release. Let's look at where Server 2003
is going with clustering and load balancing.
Server 2003 Clustering Enhancements
First, your clustered node count went up. In Windows 2000 Advanced Server, you were
locked down to a two-node cluster, but Server 2003 Enterprise version will allow for
four-node clusters. (Datacenter Server moves up to eight nodes.) Also new to Server 2003 is load balancing across all its versions: the standard edition of Windows 2000 Server was incapable of NLB, but every edition of Windows Server 2003 is capable. Another huge addition is the integration of the Windows Cluster Service with Active Directory. A virtual computer object is created, which allows applications to use Kerberos authentication, as well as delegation. Unless you
have the hardware, it doesn't matter; if you do, though, 64-bit support is now available. New configuration and management tools have been added, which you read about in great detail in Chapter 3. They do make life easier. Network enhancements have also been made to smooth cluster traffic, including a default multicast heartbeat option in which unicast traffic is used only if multicasting fails entirely. You have options to make communication more secure as well. New
storage enhancements have also been worked into the product to allow more flexibility
with a shared quorum device. And, you have new cluster-based troubleshooting tools,
which you look at closely as an enhancement.
Server 2003 Load-Balanced Enhancements
A brand new management utility is being offered in Server 2003 load-balancing
services. You now have a central management utility from which to manage NLB
clusters. You see this in detail in Chapter 3 and make comparisons to Application
Center 2000, as necessary. You can now configure virtual clusters. This is a huge step up because you previously had limitations on how you performed IP addressing on load-balanced clusters, but now you can configure clustering almost like switch-based
virtual local area networks (VLANs). You learn about this in Chapter 3. You also have
Internet Group Management Protocol (IGMP) support, which lets you configure multicast groupings for NLB clusters. Another greatly needed enhancement is the introduction of Bidirectional Affinity, which you need in order to implement server publishing while using ISA Server 2000. Bidirectional Affinity is used to create
multiple instances of NLB on the same host to make sure that responses from servers
that are published via ISA Server can be routed through the correct ISA server in the
cluster. Two separate algorithms are used on both the internal and external interfaces
of the servers to aid in determining which node services the request.
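The details of NLB's filtering algorithm aren't public, but the core idea behind it — every node independently hashes each incoming client address, and exactly one node accepts the traffic — can be sketched roughly like this (the MD5 hash and host list here are illustrative stand-ins, not the actual algorithm):

```python
import hashlib

def nlb_owner(client_ip: str, hosts: list[int]) -> int:
    """Map a client IP to exactly one host ID. Every NLB node runs
    the same deterministic function on each incoming packet, so the
    nodes agree -- with no negotiation traffic -- on which single
    node should accept it."""
    digest = hashlib.md5(client_ip.encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % len(hosts)
    return hosts[bucket]

hosts = [1, 2, 3, 4]  # a four-node NLB cluster
owner = nlb_owner("203.0.113.25", hosts)
# Each node evaluates the same hash, so only `owner` replies.
assert all(nlb_owner("203.0.113.25", hosts) == owner for _ in hosts)
```

Because the decision is purely local, no load balancer sits in front of the cluster to become its own single point of failure.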
As you can see, huge enhancements exist in the new Server 2003 technology, which
you learn about in great detail in Chapter 3 when we discuss load balancing and
clustering Windows Server 2003. You need to review the basics here so you can plan
for it, if necessary. All the major differences will be highlighted, as we configure
the clustered and load-balanced solutions. Chapter 3 covers the granular details of
configuration and implementation of Server 2003.
APPLICATION CENTER 2000
With the creation and shipment of Application Center 2000, Microsoft placed itself
on a map few others could reach. Application Center 2000 is the future of cluster
management. This Server 2003 enterprise server platform adds massive functionality
to your clustered and load-balanced solutions. You already know Windows 2000
Advanced Server can provide you with load balancing and clustering, so now you'll learn about the benefits Application Center 2000 can add. Microsoft wanted to expand on the NLB and clustering functionality of Windows 2000 Advanced Server
and it created the ultimate package to get that done. Microsoft Application Center 2000
is used to manage and administer web and COM+ components from one central console.
This was a problem in the past without Application Center 2000. Many customers
complained about how archaic it was to manage their clusters and load-balanced
solutions, so Microsoft obliged them with the Application Center 2000 Management
Console. Through this console, you can manage all your cluster nodes and all your
clusters in one Microsoft Management Console (MMC) snap-in. Health monitoring was also a snafu in the past, nearly unmanageable. As you see in Chapter 8, you can monitor
the entire cluster from one console, instead of having to do performance monitoring
on every cluster node separately with Microsoft Health Monitor. You’ll also see that
configuring a cluster without Application Center 2000 can be difficult.
In the next few chapters, you learn to configure clustered and load-balanced
solutions, and then, in later chapters, you do the same thing using Application Center
2000. You’ll see clearly that the management of difficult settings becomes much easier
to configure and manage. Application Center 2000 also provides the power to manage
your web sites and COM components, all within the same console. This is important
because, many times, most of what you’ll be load balancing are your web site and
ecommerce solutions. You also have some other great add-ons, such as the capability to use alerting, and so forth, when using Windows 2000 and Application Center 2000 to manage your cluster.

Component Load Balancing
For High Availability, you might need to cluster and load balance not only entire server platforms, but also critical applications that use Component Object Model (COM) and COM+ services. Most high-availability demands come from the
need to produce services quickly and reliably, like application components for an online
store. You might need to load balance specific servers and pages, as well as the COM+
components shared by all servers within the group. With component load balancing
(CLB), the possibilities are endless. CLB comes to Windows 2000 once you install Application Center 2000, and it offers something that wasn't available with older versions of NT 4.0: the capability to scale up to 16 clustered nodes of servers
dedicated to processing the code for COM and COM+ objects. CLB clustering and
routing also needs Application Center 2000, which you use to implement this solution.
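As a rough illustration of the routing idea (real CLB orders member servers by measured response time; the plain round-robin and class names below are invented stand-ins for the sketch):

```python
from itertools import cycle

class ComponentRouter:
    """Toy stand-in for a CLB routing list: activation requests for a
    COM+ component are spread across up to 16 member servers. Real
    CLB ranks servers by response time; simple round-robin is used
    here only to show the routing concept."""
    MAX_NODES = 16

    def __init__(self, servers):
        if len(servers) > self.MAX_NODES:
            raise ValueError("CLB routing lists top out at 16 nodes")
        self._ring = cycle(servers)

    def activate(self, component: str) -> str:
        # Each activation request lands on the next server in turn.
        server = next(self._ring)
        return f"{component} -> {server}"

router = ComponentRouter(["app1", "app2", "app3"])
print(router.activate("Orders.Checkout"))  # Orders.Checkout -> app1
```

The point is that callers never address a specific component server; the routing tier decides, which is what lets the component layer scale out independently.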
Chapters 4, 6, and 7 cover the granular details of configuration and implementation
of Application Center with Microsoft Servers.

HIGHLY AVAILABLE DATABASES WITH SQL SERVER 2000
SQL Server is by far the most up-and-coming database product today. With its lower-than-average cost against bigger players like Oracle, SQL Server eats up more and
more market share as it continues to be promoted and moved into more infrastructures.
That said, more companies are relying on its uptime. For those who don’t know what
SQL Server is, it's the Microsoft database server product. SQL Server 2000 (a Server 2003
Enterprise product) is mentioned here and is covered in depth throughout the book
because it’s an integral part of web-based commerce sites and it’s finding its way into
nearly every product available that does some form of logging or network management.
I think it’s clear why this product needs to be clustered and highly available. An
example of SQL Clustering can be seen in Figure 1-14. Chapter 5 covers the clustering
in granular detail. You also learn some little-known facts about what clustering this
product costs, how to convince management this product is relatively cheap to cluster,
and why clustering it makes sense.

DESIGNING A HIGHLY AVAILABLE SOLUTION
Now that you know all the basics of High Availability, clustering, and load balancing,
you need to learn how to develop its design. This is, by far, the most important phase
in any project. Many networks have been built with good intentions but, because of the lack of design work in the early stages of rolling out the solution, they wound up costing more, taking longer, or not panning out as expected.
In this book, I hope to get you to a point where you can completely bypass that
scenario. I want you to be the one who designs the proper solution and correctly budgets
for it in the early stages of development and project planning. First, you need to define what you're trying to accomplish. This section gives you an overall approach to any solution you need to deliver. In other words, I won't go into deep detail here about Application Center per se, but you'll get a thorough overall process to follow up until you need to design the Application Center task within the project. When you get to the
appropriate chapters where each technology is different, I’ll include a design phase
section to help you incorporate that piece of technology into your overall design and
the project plan you want to create. For this section, you need to get that overall 40,000-foot
view of the entire project. This is critical because, without the proper vision, you might make some glaring omissions in the beginning stages of the plan that could come back to haunt you later.
To create a great solution, you first need to create a vision of what you want to
accomplish. If this is merely a two-node cluster, then you should take into account what
hardware solution you want to purchase. Getting involved with a good vendor is crucial
to the success of your overall design. You could find each vendor has different costs
that won’t meet your budget or each vendor might have clustering hardware packages
with shared storage solutions that meet your needs more closely than other hardware
vendors. For instance, you could find you’d like to have servers with three power
supplies instead of two within each server. You might decide you want your management
network connection to be connected via fiber or Gigabit Ethernet and have your shared
storage at the same speeds. You have much to think about at this stage of overall design.
Something else to think about is what service do you want to provide? You must
understand that the product you’re delivering needs to function properly and you
need to know what the client's level of expectation is. You could have a client who
has a specific Service Level Agreement (SLA), which he expects you to honor. When
I shop for new services, I always want to know what’s in the contract based on my
own expectations. You might also want to get an overall feel of the expected deadlines.
By what date does this solution need to be rolled out live into production? This is
important to plan for because, based on what pieces of hardware you need to purchase,
you could have lead time on ordering it. Remember, if the hardware is sizable and
pricey, you might need to account for a little more time to get it.
Another consideration is budget. This is covered in its own section because budget
warrants its own area of discussion. You also need to consider the surrounding
infrastructure. I once encountered a design where the entire clustered solution was laid out in Visio format and looked outstanding, but the planners didn't account for the fact that they didn't order the separate switch for the Management VLAN. Although this
was a painless oversight, my hope is this book can eliminate most of these types of
errors from ever occurring.
Creating a Project Plan
By creating a project plan like the one seen in Figure 1-15, you have a way to keep track
of your budget needs, your resources—whether the resources are actual workers or
technicians of server-based hardware—and many other aspects of rolling out a Highly
Available solution. Make no mistake, creating a High Availability solution is no small
task. There is much to account for, and many things need to be addressed at every step of the design, setup, and rollout of this type of solution. Having at least
a documented project plan can keep you organized and on track. You don’t necessarily
need a dedicated project manager (unless you feel the tasks are so numerous and spread over so many locations and business units that it warrants the use of one), but you should
at least have a shared document for everyone in your team to monitor and sign off on.
Pilots and Prototypes
You need to set up a test bed to practice on. If you plan on rolling anything at all out
into your production network, you need to test it in an isolated environment first. To
do this, you can set up a pilot. A pilot is simply a scaled-down version of the real solution, where you can quite easily get an overall feel of what you'll be rolling out into your
live production network. A prototype is almost an exact duplicate set to the proper scale
of the actual solution you'll be rolling out. A prototype can be costly to implement because of the hardware involved but, if asked, you can at least accurately say you could set up a pilot instead to simulate the environment you'll be designing. Working with a
hardware vendor directly is helpful and, during the negotiation phase of the hardware,
ask the vendor which other companies have implemented their solutions. I can usually
get a list of companies using their products and make contacts within those companies,
so I can see their solutions in action. And I hit newsgroups and forums to post general questions to see what answers I turn up on specific vendors and their solutions. You
could also find the vendors themselves might be willing to arrange for you to visit one of their clients to see the solutions in action. This has worked for me and I'm sure it
could also be helpful to you.
Designing a Clustered Solution
Now that you’ve seen the 40,000-foot view, let’s come down to 10,000 feet. Don’t worry.
In upcoming chapters (and starting with the next chapter), you get into specific
configurations. To understand all the new terminology, though, it’s imperative for you
to look at basic topology maps and ideas, so we can share this terminology as we cover
the actual solution configurations. As you look at clustering Windows 2000 Advanced
Server in the next chapter, we’ll be at ground level, looking at all the dialog boxes and
check boxes we’ll need to manipulate. First, you need to consider the design of a general
cluster, no matter how many nodes it will service. Let’s look at a two-node cluster for a
simple overview. Now let’s look at some analysis facts.
Addressing the Risks
When I mention this in meetings, I usually get a weird look. If we’re implementing
a cluster, is that what we’re using to eliminate the single point of failure that was the
original problem? Why would you now have to consider new risks? Although you
might think this type of a question is ridiculous, it isn’t. The answer to this question is
something that takes experience to answer. I've set up clustering only to find out that the service running on the cluster was now redundant but much slower than it was
without the clustering. This is a risk. Your user community will, of course, make you
aware of the slow-down in services. They know because they deal with it all day.
Another risk is troubleshooting. Does your staff know how to troubleshoot and
solve cluster-based problems? I’ve seen problems where a clustered Exchange Server 2000
solution took 12 people to determine what the problem was because too many areas
of expertise were needed for just one problem. You needed someone who knew
network infrastructure to look through the routers and switches, you needed an e-mail
specialist, and you needed someone who knew clustering. That doesn’t include the
systems administrators for the Windows 2000 Advanced Servers that were implemented.
Training of personnel on new systems is critical to the system's success . . . and yours.

Have power concerns been addressed? I got to witness the most horrifying, yet hilarious,
phenomenon ever to occur in my experience as an IT professional. One of the junior
administrators on staff brought up a server to mark the beginning of the age of
Windows 2000 in our infrastructure, only to find out the power to that circuit was
already at its peak. The entire network went down—no joke. (Was that a sign or what?)
This was something I learned the hard way. Consider power and uninterruptible power
supplies as well. Power design is covered in more detail in Chapter 2.
Designing Applications and Proper Bandwidth
What will you be running on this cluster? This is going to bring you back to planning
your hardware solution appropriately. In each of the following chapters, you’ll be
given a set of basic requirements, which you’ll need to get your job done with the
solution you’re implementing. Of course, when you add services on top of the cluster
itself, you’ll also need to consider adding resources to the hardware.
You should also consider the bandwidth connections based on the application.
Bandwidth and application flows can be seen in Figure 1-16. Some services will use
more bandwidth than others and this must be planned by watching application flows.
In later chapters, we’ll discuss how to test your clustered solutions with a network and
protocol analyzer to make sure you’re operating at peak performance, instead of trying
to function on an oversaturated and overused network segment.
You also need to consider whether your applications are cluster aware, which means
they support the cluster API (application programming interface). Applications that
are cluster aware will be registered with the Cluster Service. Applications that are
noncluster aware can still be failed over, but will miss out on some of the benefits of
cluster-aware applications. That said, you might want to consider this if the whole
reason you’re clustering is for a mission-critical application that might not be cluster
aware. Most of Microsoft’s product line is cluster aware, but you might want to check
with a vendor of a third-party solution to see if their applications function with the
cluster API.
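The Cluster Service probes a cluster-aware resource through two health-check entry points, LooksAlive (a cheap, frequent check) and IsAlive (a thorough one). Here's a minimal Python sketch of that contract; the dictionary-based service object and class name are purely illustrative, not part of any real API:

```python
class ClusterAwareResource:
    """Sketch of the health-check contract a cluster-aware
    application exposes to the Cluster Service. A non-cluster-aware
    app can still be failed over, but it gets only a generic process
    check instead of this finer-grained monitoring."""

    def __init__(self, service: dict):
        self.service = service

    def looks_alive(self) -> bool:
        # Cheap, frequent check: is the process still running?
        return self.service.get("running", False)

    def is_alive(self) -> bool:
        # Thorough, less frequent check: is it answering requests?
        return self.looks_alive() and self.service.get("responding", False)

svc = {"running": True, "responding": False}
res = ClusterAwareResource(svc)
assert res.looks_alive() and not res.is_alive()  # hung, not yet dead
```

The last case is exactly what a generic process check misses: a service that's up but hung would never trigger failover without the deeper IsAlive-style probe.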
Determining Failover Policies
Failover will occur through disaster or testing and, when it does, what happens is
based on a policy. Until now, we’ve covered the fundamentals of what failover entails,
but now we can expound on the features a bit. You can set up policies for failover and
failback timing, as well as configuring a policy for preferred node. Failover, failback,
and preferred nodes are all based on setting up MSCS (Microsoft Cluster Service) or
simply the Cluster Service.
Failover Timing Failover timing is used for simple failover to another standby node in the group upon failure. Another option is to have the Cluster Service make a number of attempts to restart the failed node before failing over to a Passive node. In situations where you want the primary node brought back online immediately, this is the policy to implement. Failover timing design is based on what amount of downtime any node can acceptably experience. If you're looking at failover timing
for critical systems, where nodes can't be down at all (the 99.999 percent standard), then you need to test your systems to make sure your failover timing is quick enough that your clients aren't caused any disruption.
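To put the 99.999 percent figure in perspective, and to sketch the restart-before-failover idea, here's a rough illustration; the function names and the simple retry policy are simplified assumptions, not the actual MSCS configuration interface:

```python
def downtime_budget_minutes(availability: float) -> float:
    """Minutes of downtime allowed per year at a given availability."""
    return 365.25 * 24 * 60 * (1 - availability)

def should_fail_over(restart_attempts: int, max_restarts: int) -> bool:
    """Restart-first policy: retry the failed node a few times before
    moving the resource group over to a Passive node."""
    return restart_attempts >= max_restarts

# "Five nines" leaves only about 5.26 minutes of downtime per year.
print(round(downtime_budget_minutes(0.99999), 2))  # 5.26
```

With so little slack in the budget, even a few slow failovers a year can blow the SLA, which is why measuring actual failover time matters.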
Failback Timing Failing back is the process of going back to the original primary node
that originally failed. Failback can be immediate or you can set a policy to allow timing
to be put in place to have the failback occur in off-hours, so the network isn’t disturbed
again with a changeover in the clustered nodes.

Preferred Node A preferred node can be set via policy, so if that node is available, then
that will be the Active node. You’d want to design this so your primary node could
be set up with high hardware requirements. This is the node you’d want to serve the
clients at all times.
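The preferred-owner policy boils down to a simple decision, sketched here (the node names and fallback rule are illustrative only):

```python
def pick_active_node(nodes_online: set[str], preferred: str) -> str:
    """Preferred-owner policy: host the resource group on the
    preferred (typically beefier) node whenever it is online;
    otherwise fall back to any surviving node."""
    if preferred in nodes_online:
        return preferred
    return sorted(nodes_online)[0]  # deterministic fallback choice

assert pick_active_node({"nodeA", "nodeB"}, "nodeA") == "nodeA"
assert pick_active_node({"nodeB", "nodeC"}, "nodeA") == "nodeB"
```

Combined with a failback policy, this is what pulls the workload back onto the heavyweight node after it recovers.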
Selecting a Domain Model
I’ve been asked many times about clustering domain controllers and how this affects the
design. You can cluster your domain controllers (or member servers), but an important
design rule to consider is this: all nodes must be part of the same domain. A simple
design consideration is that you never install services like SQL on top of a domain
controller; otherwise, your hardware requirements will go sky high. When designing a
Windows 2000 clustered solution, you’ll want to separate services as much as possible.
Make sure when you cluster your domain controllers that you also take traffic overhead
into consideration. Now, you’ll not only have to worry about replication and
synchronization traffic, but also about management heartbeat traffic. Be cautious
about how you design your domain controllers and, when they’re clustered in future
chapters, I’ll point this out to you again.
Limitations of Clusters
But I thought clustering would be the total solution to my problems? Wrong! Clustering
works wonders, but it has limits. When designing the cluster, it’s imperative for you
to look at what you can and can’t do. Again, it all comes down to design. What if you
were considering using Encrypting File System (EFS) on your clustered data? Could
you set that up or would you need to forego that solution for the clustered one? This
question usually doesn’t come up when you’re thinking about clustering a service
because all you can think about are the benefits of clustering. You should highlight
what you might have to eliminate to support the clustered service. In the case of EFS,
you can’t use it on cluster storage. That said, you’ll also need to use disks on cluster
storage configured as basic disks. You can’t use dynamic disks and you must always
use NT file system (NTFS), so you won’t be able to use FAT or any of its variations.
You must also only use TCP/IP. Although in this day and age, this might not be
shocking to you, it could be a surprise to businesses that want to use Windows 2000
clustering while only running IPX/SPX in their environments. This is something you
should consider when you design your clustered solution.
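These restrictions lend themselves to a simple pre-flight check during design. A hedged sketch (the dictionary format is invented for illustration; it isn't the output of any real inventory tool):

```python
def validate_cluster_disk(disk: dict) -> list[str]:
    """Pre-flight check mirroring the limits above: cluster storage
    must be a basic disk formatted NTFS, EFS is off-limits, and only
    TCP/IP is supported for node communication."""
    problems = []
    if disk.get("type") != "basic":
        problems.append("dynamic disks are not supported on cluster storage")
    if disk.get("fs") != "NTFS":
        problems.append("cluster disks must be NTFS (no FAT variants)")
    if disk.get("efs"):
        problems.append("EFS cannot be used on cluster storage")
    if disk.get("protocol", "TCP/IP") != "TCP/IP":
        problems.append("only TCP/IP is supported (no IPX/SPX)")
    return problems

print(validate_cluster_disk({"type": "basic", "fs": "NTFS"}))  # []
```

Running a checklist like this on paper, before the hardware order goes in, is far cheaper than discovering a FAT volume or an IPX-only segment at rollout.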
Capacity Planning
Capacity planning involves memory, CPU utilization, and hard disk structure. After
you choose what kind of clustered model you want, you need to know how to equip
it. You already know you need to consider the hardware vendors, but when you’re
capacity planning, this is something that needs to be fully understood and designed
specifically for your system.
