Office Grid Computing using Virtual environments – Part 3

By , Friday 4th December 2009 11:37 pm

Introduction

I work in a company where we run many batch jobs processing millions of records of data each day and I’ve been thinking recently about all the machines that sit around each and every day doing nothing for several hours. Wouldn’t it be good if we could use those machines to bolster the processing power of our systems? In this set of articles I’m going to look at the potential benefits of employing an office grid using virtualised environments.

In Part 2 we looked at the jobs a server will run, and how jobs should be configured in order to achieve the greatest amount of processing whilst ensuring that each job is processed without fail.

Setting up your worker – or LiMP server

The next step in the process is to set up your virtual workers. For this I’m going to use an installation of CentOS running in VirtualBox. I’m going to install MySQL and PHP on the server, making it a LiMP (Linux, MySQL, PHP) server (I may have made that name up).

  • Install VirtualBox on your Windows machine
  • Download and install CentOS (current version 5.3) within a newly created virtual machine

There’s no point in me going through this, as there are probably thousands of great tutorials out there (OK, here’s one: Creating and Managing centOS virtual machine under virtualbox). The important point to note is that I called my virtual machine GridMachine.

As far as my choices of virtualisation client and operating system go, there is no big compelling reason for either choice. VirtualBox is something I use on my home machine and is supported on the three major operating systems. I chose CentOS as it’s a good stable OS and I use it on my own web server. I am a great believer in the right tools for the job (although I’m applying a ‘use the quickest and easiest for you’ mentality here), so if operating system X runs your code quicker and more efficiently, use that instead :)

Importantly, make sure that your VM uses DHCP; otherwise each new virtual machine would need to be configured separately, which is something we don’t want. With DHCP handing out IPs for you, you can copy your virtual machine about the office without worrying about setting each one up (this improves scalability and reduces worker administration).
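
As a rough sketch of what that means on a CentOS worker, the interface config only needs a few lines. The function name dhcp_ifcfg and the eth0 device are my assumptions for illustration, so adjust them to suit your own VM:

```shell
#!/bin/sh
# Print a minimal DHCP interface config for a Red Hat style system.
# Assumption: the interface is eth0 and the config file is
# /etc/sysconfig/network-scripts/ifcfg-eth0 (adjust for your VM).
dhcp_ifcfg() {
    device="${1:-eth0}"
    cat <<EOF
DEVICE=$device
BOOTPROTO=dhcp
ONBOOT=yes
EOF
}

# Preview the config; as root, redirect it into the ifcfg file to apply it.
dhcp_ifcfg eth0
```

Leaving out any HWADDR line is deliberate in this sketch: a hard-coded MAC address can stop the interface coming up if a copied VM ends up with a different MAC.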

The process you should aim for is to obtain a new physical machine, install VirtualBox, and then deploy the virtual image with little else to do. It might be wise to set up all your workers on a different subnet so that you can at least see how many machines are running. You’ll also need to give your machines a long or unlimited DHCP lease.

How to run Jobs on the worker

This is an interesting area and there are several valid methods for processing jobs on the worker. Here I’ll just discuss the two most obvious:

  • Perpetually running script: A script, be it a shell script or a PHP script, is executed once on the worker and runs as an infinite loop. I’ve discounted this method, as after one crash of the script your worker will cease to run until someone intervenes.
  • Cron based script execution: Every X minutes the cron daemon kicks off a call to your script to get things going. Without some checking this could lead to many copies of your worker script running at once.

My decision was to go with cron, which kicks off a shell script every 10 minutes. My shell script performs the following tasks:

  1. Get a process list and grep this for ‘php’. If not found then continue.
  2. Call your job code, in my case this would be something PHP based
  3. Worker script completes its run
  4. Ready to go again on the next appropriate call

My bash script looks something like the following:

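#!/bin/sh
# Look for any running process whose command line contains "php",
# excluding the grep command itself.
if ps ax | grep -v grep | grep php > /dev/null
then
    # A job is still running from an earlier cron call, so do nothing.
    echo "Job is currently processing, exit"
else
    # No job found, so start the processing script.
    echo "Job is not running, start now"
    php yourJobProcessingScript.php
fi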

Note: the echoes are almost completely pointless, but may help the next person who comes along and tries to edit the script.
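
Grepping the process list for ‘php’ will also match any unrelated PHP process, so here is a minimal sketch of an alternative guard using a lock directory. This is a swapped-in technique rather than what I actually ran; the run_once name, the lock path and the crontab path are all assumptions for illustration:

```shell
#!/bin/sh
# Run a command only if no other copy holds the lock directory.
# mkdir either creates the directory or fails, so two cron runs
# cannot both acquire the lock.
# Usage: run_once LOCKDIR COMMAND [ARGS...]
run_once() {
    lockdir="$1"
    shift
    if mkdir "$lockdir" 2>/dev/null
    then
        "$@"
        status=$?
        rmdir "$lockdir"
        return $status
    else
        echo "Job is currently processing, exit"
        return 1
    fi
}

# Hypothetical crontab entry (the path is an assumption), every 10 minutes:
# */10 * * * * /home/worker/worker.sh
# where worker.sh would finish with something like:
# run_once /tmp/gridworker.lock php yourJobProcessingScript.php
```

One trade-off to be aware of: if the worker dies part-way through a job the lock directory is left behind, so it would need clearing when the VM boots.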

That concludes the setup of the worker virtual machine: quick, simple, and easy to copy to each new piece of hardware that is received. The ‘cleverness’ of the grid system really isn’t in the virtualised OS; it’s all to do with the code created to process jobs, the job configuration, and making sure that the job runs when appropriate (i.e. when the host is idle).

Setting up Windows to Initialise Workers

The first task is to work out the command required to run the virtual machine from the Windows command line. If you’ve installed VirtualBox in the default location and you’ve named your worker GridMachine, then the command required to start your worker is:

"C:\Program Files\Sun\VirtualBox\VBoxManage.exe" startvm GridMachine

However, to run the virtual machine in a ‘headless’ state we need to use:

"C:\Program Files\Sun\VirtualBox\VBoxHeadless.exe" -startvm GridMachine --vrdp=off

This will start the virtual machine without the GUI and allow it to save state gracefully. The second argument turns off VRDP (VirtualBox’s RDP server) so it doesn’t conflict with Windows RDP or give you a message about listening on port 3389. The virtual machine name is cAsE sEnSiTiVe!

Next, we’ll need to set Windows up to kick off our worker VM once the machine has been idle. To do this (on Windows XP), go to Start -> All Programs -> Accessories -> System Tools -> Scheduled Tasks, as below:

[Image: scheduled tasks]

Next, click on ‘Add Scheduled Task’, followed by Browse to add a custom program. Navigate to your VBoxManage program and click OK. Schedule your task for any of the options (we’ll change this in a minute) and continue. After skipping the next screen, Windows will ask which user you want to run this task as; I’d suggest either ‘Administrator’ or creating a new privileged user. Remember, we don’t want to interfere with the standard staff account on the machine at any point. Click Next and tick ‘show advanced options for this task’.

To the end of the Run textbox add our ‘startvm GridMachine’ string and ensure that ‘run only when logged in’ is left unticked. Visit the Schedule tab next and change the schedule drop-down to the option ‘when idle’, then choose the amount of time you’d like the machine to be idle before moving on to the next tab.

Finally, untick the option to stop the task if it has been running for X amount of time, but do tick the option to stop the task if the machine is no longer idle.

[Image: schedule]

That’s it for the Windows host setup!

Summary

In this part we have set up a virtual machine to act as a worker, as well as the way in which we call and execute our job processing scripts (for me, a PHP script). We then looked at how to set up our copies of Windows to start the virtual machine in headless mode when the computer becomes idle, and save its state when the user resumes usage of the machine. Hopefully at this point you’re seeing how simple it is to set up such a system and are itching to get some experiments going yourself!

Next time

In Part 4 we’ll be looking at using tools to ensure that you’re running the latest version of the code and data sources so that obtained results are always up-to-date with the latest business information and logic.

Office Grid Computing using Virtual environments – Part 5

By , Friday 4th December 2009 11:03 pm

Introduction

I work in a company where we run many batch jobs processing millions of records of data each day and I’ve been thinking recently about all the machines that sit around each and every day doing nothing for several hours. Wouldn’t it be good if we could use those machines to bolster the processing power of our systems? In this set of articles I’m going to look at the potential benefits of employing an office grid using virtualised environments.

In Part 4 we looked at using tools to ensure that we’re running the latest version of the code and data sources so that obtained results are always up-to-date with the latest business information and logic.

Pre-Deployment

Before deploying your grid system, if there’s one thing you do, and one thing alone, it’s benchmark your current system! No matter what you tell colleagues about how much extra work your system is going to do, unless you have numbers to back this up your guarantees are worth nothing. So:

  • How many records can you process currently? Per day? Per hour?
  • How long does it typically take to turn around a job?
  • How much more capacity do you have?
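
To turn those questions into numbers, here is a minimal sketch for working out records per hour from a timed run. The records_per_hour name and the example figures are made up for illustration:

```shell
#!/bin/sh
# Records processed per hour, given a record count and elapsed seconds.
# Integer arithmetic only, so the result is rounded down.
records_per_hour() {
    records="$1"
    seconds="$2"
    echo $(( $records * 3600 / $seconds ))
}

# Example with made-up numbers: 250000 records in 90 minutes (5400 seconds).
records_per_hour 250000 5400   # prints 166666
```

Record the figure alongside the date and the code version, so later benchmarks have something to be compared against.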

There are also additional questions:

  • If your processing server (or one of your processing servers) goes down how will this affect your capabilities, will you be crippled?
  • What advantages do you hope/expect to get from a grid system?
  • Are your office machines capable of running the jobs?
  • Are your jobs suited (or can they be converted) to work in this style of running?

The last major point is to take your time on any major change like this. Update your processing code to work using the new methodology, then benchmark again. Possibly set up your processing server to run a virtual machine; after all, your processing server will just be another worker (just a relatively powerful one). Allow the new process to settle.

Deployment

My suggestion would be to pop into the office one weekend and perform all the installations and setup. Do this just before a fortnight’s holiday and leave some other poor chap to deal with the consequences… maybe not…

Deployment for a system like this needs to be slow. Despite being relatively simple to set up, this system will affect your entire office infrastructure (well, the digital one). Firstly, roll out to a couple of machines at a time, and monitor network traffic and how the worker hosts perform on a day-to-day basis. You may need to alter your job configuration in response to your findings.

Once the system has settled with a few machines (let’s say 10% of all office machines, i.e. 5), keep monitoring network traffic and host machine performance. Next, benchmark again; you should now be processing more jobs than in your first benchmarks (say, 33% more). Check this is so, or that you’re at least in this ballpark. If not, investigate what is going on before moving on. Repeat this cycle until you happily have all office machines running without killing individual machine performance or grinding your network to a standstill.
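
Checking whether you’re in the ballpark is simple arithmetic. Here is a minimal sketch; the percent_gain name and the throughput figures are assumptions for illustration only:

```shell
#!/bin/sh
# Percentage change of the current throughput against the baseline.
# Integer arithmetic only, so the result is truncated toward zero.
percent_gain() {
    baseline="$1"
    current="$2"
    echo $(( ($current - $baseline) * 100 / $baseline ))
}

# Made-up example: 120000 records/hour before, 160000 after the rollout.
percent_gain 120000 160000   # prints 33
```

A negative result means the grid has made things slower, which is exactly the case to investigate before rolling out any further.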

At all times keep benchmarking, even after all deployments are made. Check how new code updates affect the speed of your system, and check that all workers are reporting in and processing jobs. Slowly (very slowly) adjust your job configuration to get the best from your workers and network.

Stop!

What if you want to stop your workers from running at some point? They are all out there running, regenerating, and trying their best to process data like hungry insects. The answer may seem obvious, but it’s worth adding just in case it’s overlooked. Simply edit your processing script with an exit(0) or die() or some other statement to kill your processing job. This is an important reason why we always try to update to the latest processing script before any run!
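
A variation on the same idea, sketched here under my own assumptions rather than taken from the system I ran, is to have the worker’s shell script look for a stop marker in the freshly updated working copy. The should_run name, the STOP file convention and the paths are all hypothetical:

```shell
#!/bin/sh
# Decide whether the worker may start a job.
# Assumption: the argument is the checked-out working copy, and a file
# named STOP committed to it (a hypothetical convention) means "do not run".
should_run() {
    workdir="$1"
    if [ -e "$workdir/STOP" ]
    then
        echo "STOP file found, not starting job"
        return 1
    fi
    return 0
}

# Sketch of use in the worker shell script (paths are assumptions):
# cd /home/worker/jobs && svn update
# should_run /home/worker/jobs && php yourJobProcessingScript.php
```

Either way the stop signal travels with the code update, so every worker picks it up on its next run.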

Demonstration System

In order to write this set of short articles I created a very small grid to demonstrate the technologies and methodologies. I read lots of articles and tutorials, and used various tools to set up and monitor what was going on. By no means have I gone out and saturated a whole office with traffic, nor have I had access to a regular staff member’s PC to see how host performance was affected.

My demonstration system was very humble indeed. I used my regular desktop, set up as a job control server. On this I had installed MySQL server (set up as a master in replication), PHP, and SVN linked through Apache (for access via the worker VM).

I then created a CentOS worker machine in VirtualBox on a 6-year-old Windows XP laptop. After copying the VM onto the machine, I set up the scheduled tasks as specified and let it go.

The virtual machine was set up with PHP, Subversion, and MySQL. I checked out a branch named ‘worker’ from my job control server’s repository and made sure it could be updated using ‘svn update’. Next I set up MySQL as a slave and checked that data was replicating from MySQL on the job control server down to the worker VM. After all this I set up the bash script and the cron job.
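
For the replication check, here is a minimal sketch that reads the output of SHOW SLAVE STATUS and confirms both replication threads are running. The replication_ok name is an assumption, and how you pass credentials to the mysql client is left out:

```shell
#!/bin/sh
# Read "SHOW SLAVE STATUS\G" output on stdin and succeed only if both
# the I/O thread and the SQL thread report Yes.
replication_ok() {
    status=$(cat)
    printf '%s\n' "$status" | grep -q 'Slave_IO_Running: Yes' &&
    printf '%s\n' "$status" | grep -q 'Slave_SQL_Running: Yes'
}

# On the worker this might be fed by (credentials omitted, an assumption):
# mysql -e 'SHOW SLAVE STATUS\G' | replication_ok || echo "replication down"
```

Running a check like this before each job would stop a worker from processing against stale data.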

My processing script basically went along the lines of this (very simple stuff):

  • Read in the name field
  • Counted the number of similar names in a table from the data source held on the VM
  • Counted the number of names as above but splitting the name by spaces (i.e. forename, middle, surname)
  • Repeated this process 1,000 times

Each job took approximately 20 minutes to run. At one point I opened several copies of the worker VM on the windows laptop and watched the jobs be checked off by each of the worker IP addresses. At this point I also confirmed that replication automatically restarted.

Leaving the laptop to idle resulted in a worker starting to process jobs from the job control server. When resuming laptop usage there was a delay of about 30-60 seconds. This is a fair amount of time, and staff would need to be made aware that their machine may pause for a short while when they return to it. Newer machines may not have a pause this long. The amount of processing performed by these machines during idle periods would more than outweigh staff members having to wait a short period (say 1 minute) on arriving at their machines of a morning, provided they were made aware of this (I frequently wait longer than this for a Windows Defender update to take place, and it’s a useful time to grab a morning coffee!).

Overall I feel confident that I have demonstrated the technologies that could be used to create such a system. I have shown that such a system does work on a (very) small scale and, with some more experimenting, could be scaled up to utilise the resources of an office’s machines. If I don’t get to the point of doing this I would be very interested to know/see when someone else does.

Conclusions / Evaluation

The next obvious step would be to actually get a real world example and start to deploy a system such as this within an office environment and see what happens. Asking a business to commit to this without a trailblazing company to prove the technology and its effectiveness may be a little difficult. Grid/distributed computing is very popular in some circles and has some large applications (BOINC, SETI@home, Folding@home, etc). I did not, however, find a smaller scale and simple system like this in my searches that could be rolled out within an office environment.

I created a basically free system using mostly open source software and tools available in almost any office. The technologies were demonstrated at a basic level and shown to perform and work as expected. Hopefully I have shown that with not much work and with a very simple setup you can deploy an office grid computing system that is powerful, cheap, and scalable all at the same time.

Once a system is up and running there is almost no end to the customisations and improvements you can make. For example, statistics and benchmarking can easily be added, showing the worth of such a system every day. New machines can be added quickly and easily as and when they arrive, with upgrades to existing hardware bolstering your processing power.

I hope you’ve enjoyed reading this series of articles and it’s given you food for thought on running an office grid system. The solution presented here won’t necessarily work in all situations but should be adaptable to allow you to get your data processing done using your own solution.

Please feel free to send me any comments, corrections, or improvements and I’ll do my best to keep this article updated to match.
