15 Replies Latest reply on Sep 13, 2007 2:55 PM by Al Baker

    Which way to go with clusters

    The ScareCrow Level 1
      Hi All,

      Just after some information here, maybe some links if anyone has some.

      In the new year I will be setting up a new Intranet for my work.
      At least 3 web servers, 2 db servers (sql server)

      Which is the "best" way to set up the server cluster: use Windows or ColdFusion to cluster?

      How are people doing this out there?

      Ken
        • 1. Re: Which way to go with clusters
          Al Baker
          Ken

          Have you received an answer to this one? I am moving a domain from a single server running MX 7 Enterprise to a two server cluster running MX 8 Enterprise that also includes a dual "LVS Load Balancing and Firewall".

          I've never set up a clustered environment before. Is there a white paper, knowledge base article(s), or any other kind of documentation on moving a domain from a single server to a cluster using MX 8? I can't seem to find any good documentation on this.

          Also, I'm wondering about all kinds of things, such as:

          1) Do I need the "LVS Load Balancing and Firewall"?
          2) What about file uploads and using cffile to write them to both servers?
          3) Are there any coding issues besides using j2ee sticky session variables? (And are there any issues switching to j2ee session variables?)

          I could sure use yours or anyone else's help in figuring all this out!

          -al
          • 2. Re: Which way to go with clusters
            Grizzly9279 Level 1
            This is as good a place to start as any:
            http://www.adobe.com/devnet/coldfusion/clustering.html

            Your approach to clustering will largely depend on the type of application you're running, and what its specific requirements and usage are. From my experience, I've found that applications which rely heavily on complex data structures (e.g., CFCs) stored in session are probably best suited for a ColdFusion round-robin clustered, "sticky sessions" setup.

            In our production environment, we have 2 load balanced IIS web servers (via Microsoft NLB - network load balancing), each having a JRun connector pointing to a cluster of two ColdFusion application servers. We have NLB configured with IP affinity for the web servers, and the ColdFusion cluster configured with "sticky sessions". So when a user comes in, they tend to get pinned to one web server, and one application server. If any of the servers go down in the middle of that user's session on our system, they'll seamlessly get transferred over to the other server. Flip-flopping between web servers (w/ separate JRun connectors), and between app servers (w/ separate ColdFusion instances running) works just fine.
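
To illustrate the "pinned, but fails over" behaviour described above, here is a minimal Python sketch. It is not how the JRun connector or NLB actually implement affinity; the function name `pick_server`, the server names `cf1`/`cf2`, and the hash-based pinning are all assumptions made for the example.

```python
import zlib

def pick_server(session_id, servers, healthy):
    """Return the server a session is pinned to, falling back if it is down."""
    # A stable hash means the same session id always prefers the same server.
    start = zlib.crc32(session_id.encode("utf-8")) % len(servers)
    # Walk the list from the preferred server until a healthy one is found.
    for offset in range(len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if candidate in healthy:
            return candidate
    raise RuntimeError("no healthy servers available")

# Hypothetical two-node cluster, both nodes up.
servers = ["cf1", "cf2"]
pinned = pick_server("abc123", servers, healthy={"cf1", "cf2"})
```

While both servers are healthy, the session keeps landing on `pinned`; if that server is removed from the `healthy` set, the same call returns the other one. Whether the session data survives that move is the separate question of session replication.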

            To answer Al's specific questions:
            1) Hardware load balancing may not be necessary in your case. You can use the clustering and load balancing capabilities built directly into ColdFusion. This is desirable since it should handle fail-over scenarios much more reliably, particularly when a ColdFusion service gets "hung" or "stopped".

            2) You should consider using a "shared" file storage location (e.g., a network drive). This of course does introduce a single point of failure, so from a fault-tolerance perspective you may find that problematic. If that is a concern, then you can consider investing in software to manage a file cluster on a SAN, for example. We're currently leveraging Microsoft Windows 2003 Enterprise and its built-in File-Clustering management capabilities to address this. We've had great success with it.

            3) I've not run into any specific issues with J2EE session vars. Enabling them was a seamless change for us. The only "gotcha" you should probably be aware of is that if you're still using ColdFusion 7.1 or earlier, there are issues with session replication when CFCs are stored in session. ColdFusion 7 is not able to serialize CFC objects, so when a session is replicated to another server (in a failover scenario, for example), CFC data is often lost. It is for this very reason that we've chosen to use sticky sessions (we're still on CF 7). It is my understanding that this issue was "fixed" in ColdFusion 8, since CFCs are now serializable.

            Anyways, I hope this helps! Feel free to reply back with any other questions.
            • 3. Re: Which way to go with clusters
              nummsa
              quote:

              Originally posted by: Al Baker
              Ken

              Have you received an answer to this one? I am moving a domain from a single server running MX 7 Enterprise to a two server cluster running MX 8 Enterprise that also includes a dual "LVS Load Balancing and Firewall".

              I've never set up a clustered environment before. Is there a white paper, knowledge base article(s), or any other kind of documentation on moving a domain from a single server to a cluster using MX 8? I can't seem to find any good documentation on this.

              Also, I'm wondering about all kinds of things, such as:

              1) Do I need the "LVS Load Balancing and Firewall"?
              2) What about file uploads and using cffile to write them to both servers?
              3) Are there any coding issues besides using j2ee sticky session variables? (And are there any issues switching to j2ee session variables?)

              I could sure use yours or anyone else's help in figuring all this out!

              -al


              We've been using an LVS powered cluster for 2 years now with a high visit count. I can absolutely attest to this setup being production quality, albeit the setup for this LVS system is a true project.

              I have redundant LVS "directors" talking via heartbeat over rs-232 null modem for automated failover that essentially takes 5 seconds - unless your upstream switch has a huge TTL on the ARP cache.

              We have only two "application server" nodes, still running MX7 (while we test 8), that are running in the Linux environment. There are many considerations that you need to make; here are just a few:
              1) Only use the "WRR" LVS scheduler.
              2) If you absolutely need to use the ColdFusion SESSION variables, you must set up LVS persistence with exactly the same timeout as your CF application(s) are set up for.
              3) If you want more true "clustering", turn off LVS persistence and build new or retrofit all CF applications to use the CLIENT variables instead.
              4) If you want to do file uploads (cffile) or cfforms - you'll need a shared filesystem such as NFS.

              Our application servers share files over an NFS mount which has proved itself to be VERY stable. All our CF applications use only the CLIENT variables to make sure it's closer to a true "cluster". The "WRR" LVS scheduler is great because if a node goes down it will immediately take it out of the "pool" unlike other LVS schedulers (learned that the hard way). We have an NFS mount for all the coldfusion /lib/ files so that all configuration elements are shared between the nodes which makes administration that much easier for simple things like DSNs. This still causes problems for other configuration elements but is great for the stuff we change often. All the .CFM files are served off another shared NFS mount so that they all have the exact same content. We've configured LVS to be smart enough to know if a node is down or the NFS mount is down for that node through "request/receive" functionality built into the ldirectord package commonly bundled with LVS.
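
For readers unfamiliar with weighted round-robin, the idea can be sketched in a few lines of Python. This is a simplified illustration only: it is not the actual LVS "WRR" algorithm, and the function name `wrr_schedule`, the node names, and the `is_up` health callback are all hypothetical.

```python
from itertools import cycle

def wrr_schedule(weights, is_up, count):
    """Return `count` node picks in weighted round-robin order, skipping down nodes."""
    # One scheduling round: each node name repeated `weight` times.
    one_round = [name for name, weight in weights.items() for _ in range(weight)]
    # Refuse to spin forever when nothing in the pool can take traffic.
    if not any(is_up(name) for name in one_round):
        raise RuntimeError("no nodes available")
    picks = []
    for name in cycle(one_round):
        if len(picks) == count:
            break
        # A node that fails its health check is simply passed over.
        if is_up(name):
            picks.append(name)
    return picks
```

With weights such as `{"node1": 2, "node2": 1}`, node1 receives two requests for every one sent to node2; if the health callback reports node1 as down, every pick goes to node2 until it recovers.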

              Beyond that - I think it's a great solution for HA. We've run numerous load-tests on this 2 node configuration and it's been absolutely fantastic.

              I'd love to get a writeup going and have considered it - but it's just a horrific amount of details. If you want to undertake this project - give yourself 2 months with all hardware in hand to get it properly configured. That and DO NOT forget about the always present ARP cache issue that you'll need to face at one point in time.
              • 4. Re: Which way to go with clusters
                The ScareCrow Level 1
                Al,
                No I have not received an answer as yet. But your post has got a couple.
                I find the quote
                quote:

                Your approach at clustering will largely depend on the type of application you're running, and what it's specific requirements and usage are
                from Grizzly9279 interesting as I would have thought it should make no difference.

                I have an Intranet for a large Government department. This has both cf/static web sites and "application" web sites.

                Ken
                • 5. Re: Which way to go with clusters
                  Grizzly9279 Level 1
                  RE: Ken (The ScareCrow)

                  I know what you mean, I used to feel the same way. But the more you get into clustering technologies (particularly with ColdFusion), the more you'll learn that there are a lot of variables that go into it.

                  The architecture you choose is going to be at least in part driven by your application's dependencies on in-memory state variables (SESSION, APPLICATION, CLIENT, etc).

                  Does your intranet site rely heavily on SESSION vars? If it doesn't use SESSIONs at all, then this process does become much simpler. At that point, the choice between software load balancing and hardware load balancing becomes more of an infrastructure/architecture driven decision, and less of a software/application driven decision.

                  I think nummsa's recommendation to give yourself 2 months to "feel it out" is a good one. We actually needed an entire summer to get things to a point where we were comfortable.
                  • 6. Re: Which way to go with clusters
                    nummsa Level 1
                    quote:

                    Originally posted by: Grizzly9279
                    ...
                    Does your intranet site rely heavily on SESSION vars? If it doesn't use SESSIONs at all, than this process does become much more simple. At that point, the choice between software load balancing and hardware load balancing becomes more of an infrastructure/architecture driven decision, and less of an software/application driven decision.
                    ...



                    Grizzly,

                    This is a very good point and my solution a while ago was to force utilizing CLIENT variables. CF has limited the CLIENT variables to only being simple data types, i.e. an integer or a string. This means that you can't store Structs or Arrays in the CLIENT variables like you can in the SESSION variables. What I came up with essentially saves and recalls all CLIENT information in WDDX packets... I have an unfinished writeup here:
                    http://www.overset.com/2006/09/18/memento-pattern-and-client-variables-in-coldfusion/
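
The general idea of packing complex data into a store that only accepts simple values can be sketched as follows. This is not the code from the writeup linked above: it is a Python illustration that uses JSON as a stand-in for WDDX, and the `ClientScope` class and the `save_state`/`load_state` helpers are hypothetical names.

```python
import json

class ClientScope:
    """Stand-in for a CLIENT-like store that only accepts simple string values."""

    def __init__(self):
        self._store = {}

    def set(self, key, value):
        # Mimic the restriction: structs and arrays are rejected outright.
        if not isinstance(value, str):
            raise TypeError("CLIENT-style storage only holds simple values")
        self._store[key] = value

    def get(self, key):
        return self._store[key]

def save_state(client, key, data):
    """Serialize a nested structure to one string before storing it."""
    client.set(key, json.dumps(data))

def load_state(client, key):
    """Recall the stored string and rebuild the original structure."""
    return json.loads(client.get(key))
```

Because the stored value is a plain string, it can live in a shared database that every cluster node reads, which is what makes this approach independent of which node handles the request.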

                    I think this is something you cannot avoid when doing clustering; otherwise you'll have to use persistent connections across the board. One very important consideration is that persistent cluster connections are flawed because they mostly maintain persistence by saving the IP address of the originating guest. This means that persistence breaks for anyone who reaches your site through a pool of web caching proxies, where every "hit" originates from a different IP address. AOL users are an example. Say you have a customer using your e-commerce system, which uses the SESSION variables, and you have persistence set up on the cluster based on the originating IP address of the guest. AOL users (those that use the internal web browser within AOL's software) go through a large pool of caching web proxies, so each hit can arrive from a different IP address, land on a different node, and get a different session. There are ways around this problem, but LVS specifically does not solve this. You would basically need a way to do Layer 7 (application level) persistent clustering which is not easy.

                    In sum, it makes the most sense to use CLIENT variables that are stored in a database so that all cluster nodes have the same access to it.
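
The proxy-pool problem can be made concrete with a small Python sketch that compares persistence keyed on source IP with persistence keyed on a session token. The node names, the hashing scheme, and the function names `route_by_ip` and `route_by_cookie` are assumptions for illustration; this is not how LVS or ColdFusion route requests internally.

```python
import zlib

# Hypothetical pool of application server nodes.
NODES = ["node1", "node2", "node3"]

def _bucket(key):
    """Map an arbitrary string key onto one node with a stable hash."""
    return NODES[zlib.crc32(key.encode("utf-8")) % len(NODES)]

def route_by_ip(source_ip):
    """Layer 4 style persistence: the node depends only on the client IP."""
    return _bucket(source_ip)

def route_by_cookie(session_token):
    """Layer 7 style persistence: the node depends on a token the client sends."""
    return _bucket(session_token)
```

A single user whose requests arrive from many proxy addresses will be spread across nodes by `route_by_ip`, while `route_by_cookie` keeps returning the same node as long as the client presents the same token.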
                    • 7. Re: Which way to go with clusters
                      Grizzly9279 Level 1
                      RE: nummsa

                      I know what you're saying, and yes, one approach is to abandon SESSION vars altogether and convert your application to use CLIENT vars exclusively for persistent data.

                      If this is not possible (such as with our current application), you can utilize ColdFusion's software clustering and load balancing technology as an alternative to your LVS approach. ColdFusion's software approach to clustering works great, and it doesn't go off of IP address alone. Instead, it uses the combination of your J2EE SESSIONID, CFID, and CFTOKEN to identify you as a unique user. (This will generally require users to have cookies enabled, however.)

                      You said:
                      "You would basically need a way to do Layer 7 (application level) persistent clustering which is not easy. "

                      And I'd rebut by saying it actually *IS* easy, since ColdFusion Enterprise Edition supports this out of the box! :)

                      • 8. Re: Which way to go with clusters
                        nummsa Level 1
                        quote:

                        Originally posted by: Grizzly9279
                        And I'd rebut by saying it actually *IS* easy, since ColdFusion Enterprise Edition supports this out of the box! :)



                        Grizzly,

                        I'll eat my words and say you're right - but if I remember correctly, this is only true in the Windows environment. I unfortunately had the limitation of using the Linux environment, which is why I referred to it being not easy. I haven't looked in 2 years, but there might be a graceful way to do this type of clustering. And it's actually really good to hear that you've had great luck with the session-based persistence as it comes with ColdFusion - good news!

                        I don't know why it bothers me so much, but I don't like having persistent connections to a cluster. It somewhat defeats the purpose of a cluster, and I just see a non-persistent cluster as being that much more efficient. In the end, a cluster is a cluster and the whole concept of High Availability and scalability is just greatness.
                        • 9. Re: Which way to go with clusters
                          Grizzly9279 Level 1
                          RE: nummsa

                          Yeah...I can't vouch for Linux at all, so you might be right! We did used to run ColdFusion 5.0 on Solaris back in ~2002, but it certainly wasn't clustered.

                          Since then we've moved over to Windows exclusively, only because experience had taught us that Windows was really the "bread and butter" platform for hosting ColdFusion (and that opinion may be outdated at this point). And it was only after this decision that we started looking into clustering options.

                          I whole-heartedly agree with you that the "sticky sessions" approach to clustering (having a user "pinned" to one server) is less than ideal. The only reason this becomes necessary is to work around application-level constraints and dependencies. Since we have an application that is woefully and almost indefinitely married to the SESSION scope (w/ complex structures), we really have no choice but to stay with the "sticky sessions" approach. I can assure you that if we did not have this dependency, we would almost certainly opt for a more traditional hardware approach to load balancing.

                          In the end, you're right, however. It shouldn't really matter which approach you take. If you have 1000 users accessing the cluster, the "weight" of that processing should be evenly distributed across the cluster regardless of your clustering approach. The "sticky sessions" approach *does* introduce a higher probability for one server to experience a small period of unnecessarily high load, but over a longer period of time, the load should find itself fairly evenly distributed.

                          In the end, I think we're in agreement here :) I hope this discussion will be helpful to others who aren't sure which approach to take with clustering ColdFusion application servers.
                          • 10. Re: Which way to go with clusters
                            The ScareCrow Level 1
                            Thanks for all the useful information guys.

                            Grizzly
                            quote:

                            Does your intranet site rely heavily on SESSION vars?

                            While I would not say heavily, there are a number of sites that have logins and use session vars.
                            We can't use cookies as we have a few thousand PCs that are "locked down" and thus have basically everything turned off.

                            I can see from the posts here that I'm going to have to do a lot of research. Which I expected, but wanted to get a feel from the community first.

                            Just some information in case someone has a particular solution to match.

                            The Department has about 14 thousand PCs that need to access information very regularly.
                            Of that about 2 thousand are "locked down". Can't download anything, can't install anything, cookies turned off, etc.

                            I'm looking at 5 Web servers (Windows, CF8) with 2 db servers (SQL Server).

                            While I would prefer not to have to rebuild the applications, for some it has been a while since they were built and it might be of benefit to upgrade to the latest version of CF.

                            Ken
                            • 11. Which way to go with clusters
                              Grizzly9279 Level 1
                              RE: Ken

                              Interesting...
                              If you have application(s) that are in some way relying on SESSION vars, and you don't see it being possible to rewrite the application(s) to use CLIENT vars instead (which is completely understandable), then I would recommend using ColdFusion's built-in clustering and load balancing capabilities, with "sticky sessions" enabled. Though not strictly necessary, "sticky sessions" could prevent some of the unnecessary overhead of session replication from one server to the next as users are bounced around from server to server.

                              Do you have a ballpark idea on how many concurrent users you can expect at any one time? For our site, we only see < 2% of our total user-base clicking around our site at any given time. So if we had ~10,000 total users, less than 200 of them might have active sessions at any given time. And an even smaller subset of that will have active threads being processed at any given time (maybe 15-20?).
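
The back-of-the-envelope estimate above can be written down as a small Python helper. The function name `estimate_load` is hypothetical, and the 10% thread ratio is an assumption chosen to land at the top of the "15-20" range quoted in the post; real figures need to come from your own logs or load tests.

```python
def estimate_load(total_users, active_fraction, threads_per_active):
    """Estimate concurrent sessions and busy request threads from a user count."""
    # Share of the total user base with an active session at any moment.
    active_sessions = total_users * active_fraction
    # Share of those sessions with a request actually being processed.
    busy_threads = active_sessions * threads_per_active
    return round(active_sessions), round(busy_threads)

# Figures from the post: ~10,000 users, 2% active, roughly 1 in 10 mid-request.
sessions, threads = estimate_load(10_000, 0.02, 0.10)
```

Applying the same ratios to 14,000 PCs would be `estimate_load(14_000, 0.02, 0.10)`, but that carries one site's usage pattern over to another, so treat it only as a starting point.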

                              2 "beefy" Intel Xeon boxes (in our IBM BladeCenter) are more than enough for our load. They're hardly sweating most of the time. Based on some JMeter tests I've done, our 2 servers can handle up to ~10x more traffic than they currently see.

                              For you, I'm thinking 5 boxes dedicated to the web/app role might be overkill, but it really will depend on that number of concurrent users (and concurrent threads) you expect to be processing. It will also depend on how "heavy" your application is from a pure processing perspective. You may want to start out with only 1 server, run some tests, and see what your limits are. Then, incrementally add another server and see what you gain from it. Then, add a third server and see if the percent gain is the same as it was from adding the second server. See what I mean?
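
One way to keep track of the "add a server, measure again" experiment is to compute the percent gain from each added box. The Python sketch below assumes you have a throughput figure (for example, requests per second at an acceptable response time) for each cluster size; the function name `marginal_gains` and the sample numbers are invented for illustration.

```python
def marginal_gains(throughput_by_servers):
    """Percent throughput gain contributed by each additional server.

    throughput_by_servers[i] is the measured throughput with i + 1 servers.
    """
    gains = []
    # Compare each cluster size with the one before it.
    for prev, curr in zip(throughput_by_servers, throughput_by_servers[1:]):
        gains.append(round((curr - prev) / prev * 100, 1))
    return gains

# Hypothetical measurements for 1, 2 and 3 servers (requests per second).
gains = marginal_gains([100, 190, 260])
```

If the gain from the third server is much smaller than the gain from the second, the bottleneck has probably moved somewhere else (the database, shared storage, or the load balancer), and further boxes are unlikely to help.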

                              I'd highly recommend using JMeter to simulate various traffic loads to see how your servers perform under expected load (and perhaps peak load). JMeter allows you to record a web session, including whatever pauses you take as you click around a web site. You can then take that recording and simulate a few hundred, or a few thousand users doing exactly what you did. It's very cool stuff...and very useful in determining how many app servers you're going to need.

                              Anyways, best of luck with it Ken. It sounds like you have the resources, and hopefully the time to make sure this is done "right" the first time. Feel free to come back with questions if you run into any obstacles.

                              RE: the ~2000 PCs that don't allow cookies, you may run into problems with those. How do those users currently navigate intranet sites that require SESSIONs? Are those sites set up to pass CFID and CFTOKEN on every link?
                              If so, you may run into issues once J2EE session variables are enabled. You'll need to test that out to really see where you stand. Again, best of luck!
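
For anyone unsure what "passing CFID and CFTOKEN on every link" amounts to, here is a minimal Python sketch of the URL rewriting involved. The helper name `add_session_tokens` and the sample values are hypothetical; in CFML itself the built-in `URLSessionFormat()` function serves this purpose, so this is only an illustration of the idea.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def add_session_tokens(url, cfid, cftoken):
    """Return `url` with CFID and CFTOKEN appended to its query string."""
    parts = urlsplit(url)
    # Drop any stale identifiers so they are never duplicated on the link.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ("cfid", "cftoken")
    ]
    query += [("CFID", cfid), ("CFTOKEN", cftoken)]
    return urlunsplit(parts._replace(query=urlencode(query)))

# Hypothetical link and identifiers.
link = add_session_tokens("/reports/index.cfm?year=2007", "1234", "56789")
```

Every link and form action a cookie-less browser follows has to be rewritten this way, which is why such sites need to be retested after a change to how sessions are identified.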
                              • 12. Re: Which way to go with clusters
                                Scott_thornton Level 1
                                Ken,

                                you are an aussie, and in health i think? I'd like to hear from you about how you are going about it. At hunter health we are in the process of setting up two clustered environments, one for our extranet, and one for my application. the one for my app has stalled as we are waiting on the experiences gained from the extranet site...

                                I am migrating from SQL 2000 to a clustered 64bit 2005 db over the weekend...

                                scott.thornton at hnehealth.nsw.gov.au

                                cheers,
                                • 13. Re: Which way to go with clusters
                                  Al Baker Level 1
                                  Grizzly,

                                  Your comment, "We're currently leveraging Microsoft Windows 2003 Enterprise, and it's built-in File-Clustering management capabilities to address this. We've had great success with it." appears to be the file upload solution I *think* I'm looking for. I finally got around to trying to Google it and am getting lost in an avalanche of data. Do you have a link or links I can use to get more specifics on how to set this up?

                                  -al
                                  • 15. Re: Which way to go with clusters
                                    Al Baker Level 1
                                    Grizzly,

                                    Thanks. What you sent me and other links pointed me to DFS Replication as a potential solution. Our environment is going to have a large number of domains that live on one web server or the other and a single web site that will use ColdFusion clustering to load-balance across both servers. It looks like I can use DFS Replication on the folder structure in the single web site so that any changes to the web site files are replicated back-and-forth between the two servers. That is, assuming I'm reading it right!

                                    -al