andrew.large

How to determine when FMS is starting to get overloaded?

Aug 5, 2011 2:59 PM

We have a set of FMS instances deployed on Amazon EC2.  One of the things we want to be able to do is automatically detect when we should start up another FMS instance.  To do that, I've been looking for metrics I could measure on the local FMS box to help me identify "transition" points, e.g., when we should add capacity or remove excess capacity.

 

I ran some load tests to find the capacity limits of a particular box, but ran into a couple of problems:

 

   * Traditional system metrics (CPU, memory, run queue length) did not do a great job of predicting when we'd hit a wall.  Load was really the only thing that seemed to climb much, and it was only at about 4 (on a 4-core box) when things went south.

 

   * When we *did* hit a wall, it was a pretty sharp cliff.  We seemed to be doing fine at 70+70 streams (~300 kbps streams in, reflected out) and at 75+75 streams, but when I went to 80+80 streams, BAM!  Things just started unravelling, with very little in the way of error logs to indicate what might be happening.  All of a sudden, my counters for simultaneous streams etc. dropped from 80ish to 20ish (while I was still publishing 80 to the server).
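
For a rough sense of scale, the aggregate bandwidth at each test step can be worked out directly. This Python sketch assumes every stream is a flat 300 kbps and that "N+N" means N streams published in and N reflected out; both are simplifications of the figures above, and the function name is made up for illustration.

```python
# Back-of-the-envelope bandwidth for the load-test steps described above.
# Assumption: each stream is a flat 300 kbps, with N streams in and N out.
STREAM_KBPS = 300


def aggregate_mbps(streams_in, streams_out, kbps=STREAM_KBPS):
    """Return inbound, outbound and total bandwidth in Mbps."""
    inbound = streams_in * kbps / 1000
    outbound = streams_out * kbps / 1000
    return {"in": inbound, "out": outbound, "total": inbound + outbound}


for n in (70, 75, 80):
    bw = aggregate_mbps(n, n)
    print(f"{n}+{n} streams: {bw['in']:.1f} Mbps in, "
          f"{bw['out']:.1f} Mbps out, {bw['total']:.1f} Mbps total")
```

Under those assumptions, 80+80 works out to about 24 Mbps each way. A cap somewhere near that figure would be consistent with the cliff, though the arithmetic alone does not prove it.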

 

I tried bumping up the EC2 instance size (under the theory that we were being bandwidth-capped or stream-capped), but didn't really see much difference.

 

I see two possibilities:

 

  * We actually are being bandwidth- or stream-capped, and going up to a bigger box didn't help.

 

  * There are a number of other metrics on the server I could look at that would have shown a gradual degradation.

 

Assuming the latter, does anyone have any suggestions for what metrics I might measure on the FMS to decide if we were starting to get loaded?  For example, I've thought about comparing Stream.time to NetStream.time for streams I'm reflecting out of the server.
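
To make the Stream.time versus NetStream.time idea concrete, here is a minimal Python sketch of how such a drift figure might feed an add/remove-capacity decision. It assumes the two time values have already been collected somehow (how to get them out of FMS is not covered here), and the thresholds and function names are hypothetical placeholders, not anything FMS provides.

```python
from statistics import median

# Hypothetical thresholds in seconds of drift; tune them from your own load tests.
SCALE_UP_DRIFT = 2.0
SCALE_DOWN_DRIFT = 0.5


def drift_seconds(publisher_time, subscriber_time):
    """How far a reflected stream lags the published stream, in seconds."""
    return max(0.0, publisher_time - subscriber_time)


def scaling_decision(drifts):
    """Return 'scale_up', 'scale_down' or 'hold' from recent drift samples.

    The median is used so that one stalled subscriber does not trigger
    scaling, and the gap between the two thresholds avoids flapping.
    """
    if not drifts:
        return "hold"
    typical = median(drifts)
    if typical >= SCALE_UP_DRIFT:
        return "scale_up"
    if typical <= SCALE_DOWN_DRIFT:
        return "scale_down"
    return "hold"


# Made-up sample: the publisher is at 120 s, three subscribers lag behind it.
samples = [
    drift_seconds(120.0, 119.8),
    drift_seconds(120.0, 117.0),
    drift_seconds(120.0, 117.5),
]
print(scaling_decision(samples))
```

Whether drift actually degrades gradually before the cliff is exactly the open question, so this only helps if load tests show the number moving before the server falls over.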

 
Replies
    Aug 6, 2011 6:14 AM   in reply to andrew.large

    You can determine that your FMS server is overloaded when any of the following happens: it physically breaks down, runs out of RAM, has no bandwidth left, you have server-side code that is inefficient, or you have a client-side file that is making 100 connections per client.

     
    Aug 8, 2011 5:42 PM   in reply to andrew.large

    Well, it's like this: you need to know how many users your server can handle before it fails, and there is no real way to load test it without having a lot of people log into your application. FMS can theoretically handle as many users as you can throw at it, until your hardware either fails or runs out of resources such as RAM. So if you are unable to monitor your server and determine that it is going to fail when, say, 100 more users log on, your only options are to add a new server, load balancer, or edge server, or to put up a message saying "our site has failed, please come back when we have bought more junk to support you". (By the way, Facebook, we have more users than you, hahaha.) I say this because your site will probably never achieve more users than your server can handle, unless you are using a desktop computer to run FMS.

     
    Aug 9, 2011 1:21 PM   in reply to andrew.large

    I've pretty much read the entire Server-Side ActionScript documentation and have never seen anything like what you are looking for. You probably need to use a hardware appliance to monitor the network so you can determine when you are about to have problems. However, if you can identify the variables you wish to monitor, you might write a C++ application to alert you. It's possible to use a C++ application to authenticate users with FMS, so you might also be able to write a C++ app and use it within FMS for your purposes. I don't have much experience doing this, so I emphasize that this has only medium chances of success. Waste your time with my suggestion at your own risk, so to speak.

     
    Aug 9, 2011 3:39 PM   in reply to andrew.large

    EC2 provides no QoS guarantees.  Your answer will be variable.  That said, more than likely it is an EC2 VM or network resource being exhausted, and probably not FMS related, with the exception of FMS creating the load.

     

    However, if there is something wrong in FMS, you would need to see what the logs show.

     
    Aug 9, 2011 5:04 PM   in reply to andrew.large

    Thanks everyone for bringing this up and getting the discussion going. Here are my quick thoughts on this one.

    Any load testing, as mentioned, starts with CPU and memory metrics, and the same goes for FMS.
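
As a small illustration of the CPU side, a load figure is easier to compare across instance sizes once it is divided by the core count. This is a generic Python sketch, not anything FMS-specific; `load_per_core` is a made-up helper, and `os.getloadavg()` is only available on Unix-like hosts.

```python
import os


def load_per_core(load_average, cores):
    """Normalise a load average by core count; about 1.0 means every core is busy."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load_average / cores


# The original post saw a load of about 4 on a 4-core box when things failed.
print(load_per_core(4.0, 4))

# On a Unix-like host the live one-minute value can be read like this.
if hasattr(os, "getloadavg"):
    one_minute, _, _ = os.getloadavg()
    print(load_per_core(one_minute, os.cpu_count() or 1))
```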


    For a live case, CPU usage with the default FMS configuration will be a little high. This is because of the aggregate messages and other queues that are maintained. One can disable these (in application.xml) to considerably reduce CPU usage.

     

    Memory usage increases as more streams are served, but it eventually stabilizes. In my experience, for 1200 connections all playing a 500 kbps stream, I would expect memory usage of somewhere around 2-3 GB. (I can confirm the numbers later if accuracy is needed.)

     

    One other good thing to look at is the buffer length on the subscribers. An abnormal increase in its value shows the server is unable to fill the client's buffer in time.

     

    Another related option is to look for frequent NetStream.Buffer.Empty and NetStream.Buffer.Full codes. If they are coming up too fast, it means the buffer on the client side is being emptied faster than we want.
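
One way to put a number on "too fast" is a sliding-window count of buffer-empty events. The Python sketch below assumes the clients report each NetStream.Buffer.Empty status to some collector with a timestamp; the `EventRate` class and the limit of 3 per minute are hypothetical and would need tuning against real traffic.

```python
from collections import deque


class EventRate:
    """Count events seen within the last `window_seconds`.

    Assumes timestamps are recorded in non-decreasing order.
    """

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._times = deque()

    def record(self, timestamp):
        """Register one event, e.g. one NetStream.Buffer.Empty status."""
        self._times.append(timestamp)

    def count(self, now):
        """Drop events older than the window, then return how many remain."""
        cutoff = now - self.window_seconds
        while self._times and self._times[0] <= cutoff:
            self._times.popleft()
        return len(self._times)


# Hypothetical limit: more than 3 empties per minute counts as "too fast".
MAX_EMPTIES_PER_MINUTE = 3

empties = EventRate(window_seconds=60.0)
for t in (0.0, 10.0, 50.0, 55.0, 58.0):
    empties.record(t)

print(empties.count(now=65.0) > MAX_EMPTIES_PER_MINUTE)
```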

     

    Latency is by far the best identifier. Measure how far the subscribers deviate from the 'actual' live stream; queues and aggregation of messages will play a part here again.
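
Since the goal is to spot gradual degradation, it may help to track the trend of latency rather than a single reading. The following Python sketch fits a least-squares slope to (time, latency) samples; the sample values are made up, and how the latency readings are collected is left as an assumption.

```python
def latency_slope(samples):
    """Least-squares slope of latency over time (seconds of latency per second).

    `samples` is a sequence of (wall_clock_seconds, latency_seconds) pairs.
    A slope near zero means latency is stable; a positive slope means
    subscribers are falling further behind live.
    """
    n = len(samples)
    if n < 2:
        return 0.0
    mean_t = sum(t for t, _ in samples) / n
    mean_l = sum(l for _, l in samples) / n
    num = sum((t - mean_t) * (l - mean_l) for t, l in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den if den else 0.0


# Made-up readings: latency behind live, sampled every 10 seconds.
steady = [(0, 1.0), (10, 1.0), (20, 1.0), (30, 1.0)]
growing = [(0, 1.0), (10, 1.5), (20, 2.0), (30, 2.5)]

print(latency_slope(steady))
print(latency_slope(growing))
```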

     

    Core logging is enabled for any system overload (more than 90% FMS CPU usage). Watch out for these log entries. As long as you don't find them, I am sure FMS is doing fine.

     

    Another option to take a look at is fibers. You can enable or disable them and compare the performance difference.

     

    In the end, each of us has to do some benchmarking to find the just_before_fail_state. We keep doing that internally, with lots of load, expecting it to crash.
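
As a sketch of what that benchmarking could produce, the Python below scans the results of a load ramp and reports the last step that was still served properly. The 95% ratio and the function name are assumptions for illustration; the sample figures are the ones from the original post.

```python
def last_healthy_level(results, min_ratio=0.95):
    """Return the last offered load before the first failing step.

    `results` is a list of (streams_offered, streams_served) pairs in ramp
    order. A step is healthy when served / offered is at least `min_ratio`.
    Returns None if even the first step failed.
    """
    healthy = None
    for offered, served in results:
        if offered > 0 and served / offered >= min_ratio:
            healthy = offered
        else:
            break
    return healthy


# Figures from the original post: fine at 70 and 75, about 20 served at 80.
ramp = [(70, 70), (75, 75), (80, 20)]
print(last_healthy_level(ramp))
```

With a cliff as sharp as the one described, the useful operating limit would presumably sit some margin below the value this returns.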

     
    May 11, 2012 4:08 AM   in reply to Nikhil Pavan Kalyan

    Hi Nikhil,

     

    Can you suggest some ways to load test FMS itself?

     
    May 12, 2012 7:19 AM   in reply to Raj__S
     
    May 14, 2012 3:03 AM   in reply to SE_0208

    Hi SE_0208,

    Are there any free tools for this?
    (LoadRunner seems to be an HP tool and somewhat pricey.)

     

    Can we also use fmscheck, as suggested here: http://www.richinternet.de/blog/index.cfm?entry=6EA082F4-A85E-FD95-A8AB8C7A1770D09A ?

    And please, if yes, where and how should we check test success/failure?

     

    Thank you in advance.

     