Customizing Autoscale Functionality in OpenShift - Archived

OpenShift’s autoscaling functionality is one of its most asked-about features. The autoscaler is a self-contained daemon (called haproxy_ctld) that runs inside your gears and watches HAProxy (the load balancer OpenShift uses to balance traffic across several gears). As the number of people requesting access to your site goes up, haproxy_ctld asks the broker to add more gears. As the number of concurrent users on your app goes down, haproxy_ctld tells the broker to start destroying gears.

How it Works Today

This actually works really well as a default auto-scaling behavior. To look at a bit of scaling and queueing theory, consider this: if you have just one user on your app, it would be extremely rare for adding more gears to improve performance. Scaling out doesn’t really make things ‘faster’. Scaling out, whether by adding more virtual machines, gears, or whatever, has one specific goal: handle more users in the same amount of time.

Visually, the example most people like to think about is the number of cash registers at a fast food restaurant. If one person shows up to order a burger and fries, having five registers open doesn’t get them their food any faster than having one. Where you see the benefit is when lots of people suddenly show up: more registers make it easier for everyone to get their order in and get their food.

So, back to the web example: if lots of people suddenly show up to use your site and things seem slow, the easiest thing to do is add more gears. (I’ll also point out that we’re assuming app-layer issues here, not data-layer issues.) More gears means more processing power, and that added capacity helps serve these users.

There’s another issue worth mentioning: what happens when my app is slow even though more users haven’t shown up? The auto-scaler works well in this scenario too. Say a page that used to take 100ms suddenly takes 1,000ms to serve. The net effect is that, at any point in time, more concurrent requests are active, and haproxy will add more gears. Back to the burger example: if it suddenly takes twice as long to take people’s orders and hand them a burger, the lines get longer just as they would if more people showed up. (Though this all assumes the cooks, the data layer, are infinitely fast.)
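
One way to see why slower responses look the same as more users to haproxy: by Little’s Law, the number of in-flight (concurrent) requests equals the arrival rate times the response time. A quick back-of-the-envelope sketch, with a purely illustrative arrival rate:

    # Little's Law: concurrent requests = arrival rate x response time
    arrival_rate = 50.0                  # requests per second (illustrative)
    [0.1, 1.0].each do |response_time|   # 100ms vs 1,000ms per request
      concurrent = (arrival_rate * response_time).round
      puts "#{(response_time * 1000).round}ms responses -> ~#{concurrent} concurrent requests"
    end
    # 100ms responses -> ~5 concurrent requests
    # 1000ms responses -> ~50 concurrent requests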

It’s important to remember that haproxy watches these concurrent connections as they go up and down. We recently added some better moving-average metrics to make the behavior more consistent. There remains one bug in the scaler today that makes the ‘current concurrent users’ metric less reliable than we’d like, and we are working on a fix. But what if this isn’t the behavior you want?

I don’t want to use concurrent users!

So what if you don’t want to scale on concurrent users, or you want to tune some parameters of the current auto-scaler? To do this, you just need to implement your own auto-scaler daemon (haproxy_ctld). Don’t worry, this isn’t as complex as it might seem. Since the daemon runs inside your own gear, you can customize it however you wish.

The basics are this: create your own haproxy_ctld.rb and make it executable:

.openshift/action_hooks/haproxy_ctld.rb

and place it in your git repo. The haproxy_ctld.rb that ships by default with OpenShift is well documented for this purpose and can be easily adapted. See for yourself:
https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-haproxy/usr/bin/haproxy_ctld.rb
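
If you’d like a feel for the overall shape before reading the real thing, here is a rough, hypothetical skeleton of the kind of polling loop such a daemon runs. The names add_gear, remove_gear, and @check_interval come from the stock script and the snippets below; the class name and the scale_up_needed?/scale_down_ok? methods are made up for illustration:

    #!/usr/bin/env ruby
    # Hypothetical sketch of a custom .openshift/action_hooks/haproxy_ctld.rb.
    # Start from the stock script linked above; this only shows the general shape.
    require 'logger'

    class MyScaler
      def initialize
        @log = Logger.new($stdout)   # the stock daemon logs to haproxy/logs/scale_events.log
        @check_interval = 5          # seconds between checks
      end

      # Replace these with your own metric: haproxy sessions, memory, queue depth, ...
      def scale_up_needed?
        false
      end

      def scale_down_ok?
        false
      end

      # Stubs; in the stock script these ask the broker to add or remove a gear.
      def add_gear
        @log.error('scale-up requested')
      end

      def remove_gear
        @log.error('scale-down requested')
      end

      def run
        loop do
          if scale_up_needed?
            add_gear
          elsif scale_down_ok?
            remove_gear
          end
          sleep @check_interval
        end
      end
    end

    MyScaler.new.run if __FILE__ == $0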

Let’s see an example

First, let’s create a scaled app that can use this sort of functionality:

rhc app create -a scaletest -t php-5.3 -s

Next, from inside the newly cloned scaletest repo, let’s add the haproxy_ctld.rb file to it:

wget -qO ./.openshift/action_hooks/haproxy_ctld.rb \
https://github.com/openshift/origin-server/raw/master/cartridges/openshift-origin-cartridge-haproxy/usr/bin/haproxy_ctld.rb
chmod +x .openshift/action_hooks/haproxy_ctld.rb
git add .openshift/action_hooks/haproxy_ctld.rb
git commit -a -m "Added my own custom haproxy_ctld.rb"
git push
rhc app restart -a scaletest

After the git push and app restart, you are now running your own custom auto-scaler. The problem is that your custom scaler is identical to our default scaler! So let’s look at making some changes. Perhaps we want to check more often: we could update @check_interval on line 85 to be 2 instead of 5, as shown below. Then git commit, git push, and you’re good to go.
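
The change itself is a one-liner in your copy of the script (line 85 refers to the stock haproxy_ctld.rb at the time of writing, so the exact line may drift):

    # .openshift/action_hooks/haproxy_ctld.rb
    @check_interval = 2   # was 5; poll haproxy every 2 seconds instead of every 5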

In reality, though, most people will want something more advanced. One of the more difficult limitations is that many people want to make scaling decisions based on some metric from the compute gears, while the haproxy_ctld daemon only runs on the haproxy gear.

To work around this, we can re-purpose the haproxy ssh key to ssh into our remote gears and look at various metrics. For example, let’s say we want to scale up based on memory usage; we could ssh into each gear and find that information. The example below will scale up as soon as any gear is using 400MB or more of memory. The full haproxy_ctld.rb changes are attached to this blog and available for download:

      # Scale up when any gear is using 400M or more memory.
      mem_scale_up = 419430400

      # Scale down when every gear is using 300M or less memory
      mem_scale_down = 314572800

      # min_gears - Once this number of gears is met, don't try to scale down any lower
      min_gears = 2

      gear_list['web'].each do |uuid, array|
        mem_usage = `ssh -i ~/.openshift_ssh/id_rsa #{uuid}@#{array['dns']} 'oo-cgroup-read memory.memsw.usage_in_bytes'`.to_i
        if mem_usage >= mem_scale_up
          @log.error("memory usage (#{mem_usage}) on #{array['dns']} is above threshold(#{mem_scale_up}), adding new gear")
          self.add_gear
        end
      end

To break this apart: first, we use ssh to read an integer value from cgroups giving the gear’s current memory usage:

ssh -i ~/.openshift_ssh/id_rsa $UUID@$gear_dns 'oo-cgroup-read memory.memsw.usage_in_bytes'

Next, if that memory usage is at or above our 400MB threshold, issue a self.add_gear:

self.add_gear if mem_usage >= mem_scale_up

Scaling down is much the same; however, we have lots of protections around scaling down by default. The idea is that we don’t want to constantly scale up and down based on the load at any given second. It takes time to add and remove gears, and some frameworks handle gears coming and going more gracefully than others.

In OpenShift we’ve decided to err on the side of performance: scale up as soon as possible when resources are needed, and only scale down when we’re confident the load is gone. This flap protection provides some boundaries so OpenShift doesn’t constantly scale up and down. For now, let’s ignore the flap protection and do a check similar to the one above, but scale down when memory usage on every gear is at or below mem_scale_down (300MB), as long as we stay above min_gears:

      mem_gear_flag = 0
      gear_count = 0
      gear_list['web'].each do |uuid, array|
        mem_usage = `ssh -i ~/.openshift_ssh/id_rsa #{uuid}@#{array['dns']} 'oo-cgroup-read memory.memsw.usage_in_bytes'`.to_i
        gear_count += 1
        if mem_usage > mem_scale_down
          # At least one gear is still above the scale-down threshold; don't remove anything.
          mem_gear_flag = 1
        end
      end
      if mem_gear_flag == 0 and gear_count > min_gears
        @log.error("All gears below threshold(#{mem_scale_down}).  Removing gear. #{gear_count}:#{min_gears}")
        self.remove_gear
      end

Next, to test this, we need to create some load and ‘code bloat’. For this we will use a deliberately inefficient PHP script that uses up memory. The bloat it creates is what our auto-scaler will detect and issue scale-up and scale-down events on:

  <?php
  // Allocate a large string (roughly 87MB) to inflate this request's memory usage.
  $chunk = str_repeat('0123456789', 128*1024*70);
  print 'Memory usage: ' . round(memory_get_usage()/(1024*1024)) . 'M<br />';
  sleep(5);   // hold the memory for a few seconds so the scaler can see it
  unset($chunk);
  ?>

After pushing this code into our app, we can trigger scale-up and scale-down events by tightly controlling the number of concurrent connections with ab (ApacheBench):

ab -t 100000 -c 5 http://stest-demo.rhcloud.com/

You can increase overall memory usage by increasing the “-c” flag; for example, -c 10 uses twice as much memory as -c 5. Then just tail scale_events.log in haproxy/logs/ in your gear and watch for scale-up and scale-down events:

E, [2014-02-04T20:43:58.754323 #245373] ERROR -- : memory usage (453468160) on 52f00e07500446c679000238-mmcgrath.rhcloud.com is above threshold(419430400), adding new gear

Then kill the ab or reduce -c and the gear will be removed:

E, [2014-02-04T20:49:03.582673 #245373] ERROR -- : All gears below threshold(314572800).  Removing gear. 3:2

This is obviously a very basic change to the auto-scaler; I wouldn’t recommend it for production use, but it should serve as a guide for those wanting to create their own auto-scaling algorithms. An even more advanced approach is to have the application itself call directly into the broker. In that scenario, instead of relying on an auto-scale daemon (haproxy_ctld), the app would use its own metrics and data to decide when to issue scale-up and scale-down events.
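
As a very rough sketch of that last idea, and purely as an assumption to check against the REST API documentation for your OpenShift release (the endpoint path, the ‘scale-up’/‘scale-down’ event names, and the credential handling below are unverified), the app could call the broker from its own monitoring code along these lines:

    # Hypothetical sketch: the app itself asking the broker to scale, instead of haproxy_ctld.
    # ASSUMPTION: an application 'events' endpoint accepting scale-up/scale-down events --
    # confirm the exact path and event names against your broker's REST API docs.
    require 'net/http'
    require 'uri'

    BROKER = 'https://openshift.redhat.com'   # or your own broker host
    DOMAIN = 'mydomain'                       # illustrative
    APP    = 'scaletest'
    USER   = ENV['BROKER_USER']               # illustrative credential handling
    PASS   = ENV['BROKER_PASS']

    def send_scale_event(event)
      uri = URI("#{BROKER}/broker/rest/domains/#{DOMAIN}/applications/#{APP}/events")
      req = Net::HTTP::Post.new(uri)
      req.basic_auth(USER, PASS)
      req.set_form_data('event' => event)     # e.g. 'scale-up' or 'scale-down'
      Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
    end

    # The app's own notion of "busy" -- stubbed here.
    def app_under_pressure?
      false
    end

    send_scale_event('scale-up') if app_under_pressure?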

I’ve also attached both the ‘slow.php’ – https://www.openshift.com/sites/default/files/slow.php_.txt – and the ‘haproxy_ctld.rb’ – https://www.openshift.com/sites/default/files/haproxy_ctld.rb_.txt – referenced in this post; feel free to take them and customize as needed.
