myff admin

So what's next?

We now have all currently installed servers in the same rack, with 1Gb connections between them.

So what does this leave to do before we can leave the world of hardware and wholeheartedly get back into feature development?

Quite a lot unfortunately.

1) We need to tweak network connections, which is always scary.
2) Backups need reviewing; we now have a lot of scope for improving things even more.
3) We will have one server on a different IP subnet, which limits some things.
4) The old "slarti" server still has not shipped and needs installing.
5) We need to play with running virtual servers via iSCSI on the NAS. This cannot be done at full performance, though, without taking the forums offline again.

It is a bit silly really, but despite us now having the server capacity to run the forums several times over, we still could do with another server to aid in shuffling things around to make everything work to its full potential.
myff admin

Network connections tweaked.

Time now for a "chill pill"  

It really is a very simple job, but the consequences of error are not good at all!
myff admin

Nothing really running in anger yet, but I'm busy tweaking advertising servers running on the NAS in various configurations and it is going quite smoothly.

I was a trifle concerned at one point that the NAS would be an expensive white elephant. 1500 is not at all a trivial sum of money round here! But it is beginning to prove its worth already: the ability to create and copy server configurations without wondering where the disk space is coming from is very useful.
myff admin

Playing about with the machine import/export process.

Funny one this. There is an official and painful method of doing this via the GUI, but if you dig you can be cleverer and do things much more sanely and quickly. I have set things up for the latter method, but I'm doing it the hard way first and it is going to be about 12 hours in total. I figure the easy way will be more like 30 minutes.
myff admin

Oh and all this is not academic, we are still faced with moving forums to our new subnet and this type of thing is what will make it all a lot easier.
myff admin

Shows how far out you can be: more like 4 hours for the "easy way". For a forums server that would end up being rather a lot slower still.
myff admin

Interesting. A full copy of our imported machine happened in under 19 minutes. This is conceptually much the same as an import and fits in with my initial time estimates. Both the import and the copy were operating via NFS shares on the NAS.
myff admin

Also notable as I'm provisioning the Ford client, getting the backup of one of the other clients over to Ford to act as a base took about 4 minutes  

It is a world removed from backups being transmitted over the web and hence being hard to work with due to their inaccessibility.

I really like the fact that we now have our own internal network behind the firewall. After close to 6 months reworking the hardware infrastructure, work that is still not complete, it is really good to start to feel tangible benefits.
myff admin

Wow, and again: transferring templates to finish up the new forum client, 2.5GB took 39 seconds. That is 64MB a second.
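For anyone wanting to check the sums, here is a minimal sketch of the arithmetic. It assumes 1GB = 1000MB, and the function name is just something made up for illustration.

```python
def throughput_mb_per_s(size_gb: float, seconds: float, mb_per_gb: int = 1000) -> float:
    """Average transfer rate in megabytes per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return size_gb * mb_per_gb / seconds


rate = throughput_mb_per_s(2.5, 39)
print(f"{rate:.1f} MB/s")

# Megabytes to megabits (x8), compared against a 1000 Mbit/s link.
link_share = rate * 8 / 1000
print(f"{link_share:.0%} of a 1Gb link")
```

So a single copy was using roughly half of what a 1Gb connection can carry in theory.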
myff admin

Nothing is easy. In order to get an additional type of backup onto the NAS, we are going to have to move the main backups to the NAS as well. This still leaves us with the systems on RAID, backups on another RAID system and further backups taken offsite, so it is hardly a big issue, but it is not what I would prefer to see.
myff admin

From the staggeringly fast to the chronically slow

Exporting the backup machine worked at a tolerable speed. Importing is working at no more than a few GB an hour; we're looking at several days for the import to complete at this rate. Chances are that it will die before succeeding.
myff admin

Oddly the speed has increased to a more tolerable 1GB a minute. This is still pretty painful, but at least it takes the estimate to more like 10 hours, which has a good chance of working.
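A rough sketch of how those estimates hang together. The 600GB image size is not a measured figure; it is simply what 10 hours at 1GB a minute implies, and 3GB an hour is my stand-in for "a few".

```python
def hours_to_import(size_gb: float, gb_per_hour: float) -> float:
    """Time to import an image of size_gb at a steady rate, in hours."""
    if gb_per_hour <= 0:
        raise ValueError("gb_per_hour must be positive")
    return size_gb / gb_per_hour


IMAGE_GB = 600  # assumption: inferred from "10 hours at 1GB a minute"

slow = hours_to_import(IMAGE_GB, 3)    # "a few GB an hour"
fast = hours_to_import(IMAGE_GB, 60)   # 1GB a minute

print(f"slow: {slow:.0f} hours ({slow / 24:.1f} days)")
print(f"fast: {fast:.0f} hours")
```

At the slow rate that comes out at over a week, which is why the speed-up mattered so much.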
myff admin

Hmm, I also note that the GUI has finally picked up the commands I ran at the command line. Normally this happens straight away, but in this case it wasn't happening.

It is notable that there are quite a few howls of anguish on the XenServer forums about many of these areas. Some aspects of virtualisation clearly have a way to go. For example, when this import process completes I am going to have to remember to enter some slightly complex details to finish the job off. Whilst this isn't a "disaster recovery" scenario, the type of thing we are doing here is pretty similar. If you don't get those details right you can recover, but as you might imagine, if you are in a panic and trying to recover from a disaster then you want/need the procedures to be very simple.
myff admin

Whilst I'm musing, this has been going for a little over 4 hours and the size imported has got to just about the size of one of the forum servers. I suppose on a lot of levels that is not too bad as a disaster scenario.
myff admin

Right so we now have a backup box running on the NAS, but because it took so damn long, a set of backups went to the old backup system before the new one could replace it. So I'm now backing up the old backup system  

At which point we can clear the old backup system off local disk space and get back to testing the snapshot backup system, which needed more local disk space.

I think it is almost certain that this sort of thing is going to convince me to do whatever it takes to expand our hardware base still further.
myff admin

More in the same vein as experimenting with these new procedures continues: we are up to 76% full on part of the NAS. That is hardly terminal, as the data taking up the space is going to be deleted practically as soon as it finishes being created. But my feeling, more and more, is that we are running such a number of forums and other hosting that spare capacity beyond all likely need should not be regarded as a luxury but as a means to ease improving the service.
myff admin

There have been a few more hiccups on the way, but the snapshot backups are starting to fall into place.

On some levels these raise the question as to whether we could drop the nightly database backups. Such backups are a pain since they can lock people's forums for a few minutes each night, sometimes causing issues with running out of database connections.

Snapshot type backups are far cleverer in the way they operate.

There is, however, something old fashioned and solid about a textual SQL dump: no magic, just data you can deal with.
myff admin

Things are continuing to fall into place with the extra snapshot backups. There have been a few hiccups with the amount of temporary space required, and also a scary interval where disk space did not appear to be freed properly when old backups were deleted.

I am impatient to get the "old" big server fully emptied and subnetted so we can actually start using all the capacity we have freely. Currently 25% of load is still on that server, mostly because the adserver still has not been dealt with.
myff admin

As I have said elsewhere, one of the things we are moving into is hosting some major adservers. It is an area that works well with other things we do, and obviously it is not good from a company perspective to have all our eggs in the forum basket.

One thing I have come up with today that might have good implications elsewhere is a system whereby we can run two proxy adservers and two real adservers, and if anything fails a backup system will automatically take over.

The implication for the forums is probably that we will use the same two proxy servers, and possibly just a dummy adserver along with the real one. Hence if the adserver does go down, people's forums will not be locked waiting for adverts.
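To show the general idea rather than the actual setup, here is a minimal sketch of "try the real adserver, fall back to a dummy". Everything named here (fetch_ad, PLACEHOLDER_AD, the backend names and the fake fetcher) is hypothetical; the real system works at the proxy level and is not shown.

```python
from typing import Callable, Iterable

# Hypothetical stand-in for the "dummy adserver": it serves no advert at all.
PLACEHOLDER_AD = ""


def fetch_ad(backends: Iterable[str],
             fetch: Callable[[str], str],
             fallback: str = PLACEHOLDER_AD) -> str:
    """Try each ad backend in order; return the fallback if all of them fail."""
    for backend in backends:
        try:
            return fetch(backend)
        except OSError:
            # Connection refused, timeout and similar: try the next backend.
            continue
    return fallback


def fake_fetch(backend: str) -> str:
    """Pretend fetcher for the demo: any backend named 'down-...' is dead."""
    if backend.startswith("down"):
        raise ConnectionError(f"{backend} is not answering")
    return f"<advert from {backend}>"


print(fetch_ad(["down-ad1", "ad2"], fake_fetch))             # second backend answers
print(repr(fetch_ad(["down-ad1", "down-ad2"], fake_fetch)))  # both down: empty placeholder
```

In real use the fetch function would need a short timeout, otherwise a hung adserver still holds the page up, which is exactly the problem being avoided.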
myff admin

The adserver is now working at its new location and will go live later.

Meanwhile more domains are moving from the "old" server. This is getting tricky, as until we get the licensing issue sorted they are having to go to a server that is not exactly over-capacious and which also doesn't have enough licences for all that really should be moved.

But even if there are short term load issues, the job simply has to be pushed through. In effect we are emptying out over 1/3 of capacity onto the remaining 2/3, but having to do it in a rather poor fashion as we can't distribute the load properly.

Once the job is done the "old" server will be put on the right subnet and things can be switched back to it.
myff admin

Just reviewing the performance logs, and the server we are moving stuff to is the only server that is showing signs of breaking a sweat.

All things are relative though, and it is the server the support forum is on and I see no sign of the forum having a performance problem.
myff admin

Bad news is that we did hit a performance issue on an adserver this afternoon, though I don't actually think it was related to piling things onto the server.

Good news though is that we now have a valid licence for the server we want to move stuff to, so assuming no more technical issues there, the issue has gone away.
myff admin

We're now down to the domain and that need moving. will be on route in a minute and poses few problems.

is more rocket science as it is the root of all the forum addresses   It is not simply a web site that can basically be copied.

What will happen is a new version will be set up, have a few improvements done and then get tested before we switch over to it. That is the last scary bit!

We will then have a very large and totally empty server, that can be moved to the new subnet, given some TLC, and then.... well that will be a question
myff admin

Basically have decided to let the dust settle for the rest of today and tonight, and then crack on with the DNS system tomorrow. I'd like to get away from weekend working, but we have a team meeting Monday to map out a lot of roles, responsibilities, procedures etc and so it ain't going to get done Monday and there will have to be a 48 hour gap after this work takes place before we can really move to finalise things on the server.

So it's tomorrow, or even more delay.

myff admin

Things will be more "restful" when we have all the server capacity fully usable.

So one final push is called for.
myff admin


This is 75% done, the main nameserver system is on the new subdomain, though the world will take some time to see it.

What has not been done yet is ensuring all the secondary nameservers get their updates from the right place.

Since the records are currently identical that is not urgent, and I will work through systematically later today, assuming all is working smoothly.
myff admin

All seems to have worked very smoothly, which worries me a bit.

The old server is still being accessed at the rate of one person with a poor broadband connection. This may be people who have not updated their own DNS, or some systems that have not quite got there with the final transfers.

But basically we are virtually at the point where we can switch and let people find out the hard way that instructions about wrong links that won't work forever mean what they say on the box!
myff admin

The old server is close to flatlining. Some things are proving laggy though; a few emails still get through to it, for example.

But we have to metaphorically pull the plug at some point.
myff admin

Have cleaned and updated the old big server today.

It is still running the old stuff and will be able to do so until the subnet is changed. But its fast local RAID drives are using zero space.

There was a very big scare with the upgrades, following the upgrade no forums server would boot at all  

This was because, following the update to the XenServer software, all the machines believed they had a phantom DVD in their virtual DVD trays and wanted to boot from it. Telling each machine that its DVD tray was empty fixed things!
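For the record, a sketch of how that fix could be scripted across a batch of machines. I am assuming here that the XenServer command line tool's `xe vm-cd-eject` is the equivalent of emptying the tray; the fix above may just as well have been done by hand. The helper names and the UUIDs are made up, and it defaults to a dry run that only prints the commands.

```python
import subprocess
from typing import Iterable, List


def eject_commands(vm_uuids: Iterable[str]) -> List[List[str]]:
    """Build one `xe vm-cd-eject` command per VM UUID."""
    return [["xe", "vm-cd-eject", f"uuid={vm_uuid}"] for vm_uuid in vm_uuids]


def eject_all(vm_uuids: Iterable[str], dry_run: bool = True) -> None:
    """Print the commands (dry run) or run them on the XenServer host."""
    for cmd in eject_commands(vm_uuids):
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=False: a VM whose tray is already empty may report an
            # error, and that should not stop the loop.
            subprocess.run(cmd, check=False)


# Made-up UUIDs, for illustration only.
eject_all(["11111111-aaaa", "22222222-bbbb"])
```

The point is really that with 20-odd virtual machines you want this sort of thing to be a loop, not 20-odd trips through a menu.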

myff admin

Virtualisation gremlins!

Virtual servers are where it's at! I just did a count and we have 26 machines running right now.

The physical server count is 7.

The benefits are enormous, the pain for those that joined the party a year or two before we did is also rather large!

We are joining as the technology gets a bit beyond the totally bleeding edge. But only a bit!
myff admin

Well today after a week in which the candle has been burning at both ends (to a greater extent than usual!) we are now running adservers delivering well over 500,000 adverts an hour  

There is undeniably some polish needed to that work, but I sincerely hope that after the bank holiday we can start bending the stick back towards running and developing forum code.
myff admin

Good grief, it has got to Thursday, at least one forum improvement has been installed, but it does seem like firefighting on the advert hosting we are doing is the core of the week.

I'm taking the opportunity to tweak some of the server side of things, and to polish up my new Lucid Linux desktop to make for some quicker work bits.
myff admin

And another week has passed   Doesn't time fly when the fire extinguishers are going  

Essentially I'm still playing with ad servers. Every day seems to bring a new crisis and new improvements. Notably, one system that is not quite live includes two custom compilations of lighttpd, one with a patch I have written to do GeoIP location via a proxy server.

It's a bit of a distraction from the forum work, but the work does pay the bills, and it is pretty important that we stay healthily in the black.
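The patch itself is C inside lighttpd and is not reproduced here. As a sketch of the underlying problem, on the assumption that it is the usual one: behind a proxy, every request appears to come from the proxy's own address, so the location lookup has to use the address the proxy passes along (commonly in an X-Forwarded-For header) and only when the request really came from one of our own proxies. The function name and addresses below are made up.

```python
import ipaddress
from typing import Optional, Set


def client_ip(remote_addr: str,
              forwarded_for: Optional[str],
              trusted_proxies: Set[str]) -> str:
    """Pick the address to geolocate for a request."""
    # Only believe the header if the request came from one of our proxies.
    if remote_addr not in trusted_proxies or not forwarded_for:
        return remote_addr
    # Walk the forwarded chain right to left, skipping our own proxies.
    for candidate in reversed([part.strip() for part in forwarded_for.split(",")]):
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            # Malformed header: fall back to the connecting address.
            return remote_addr
        if candidate not in trusted_proxies:
            return candidate
    return remote_addr


proxies = {"10.0.0.5", "10.0.0.6"}
print(client_ip("10.0.0.5", "203.0.113.7", proxies))      # visitor behind our proxy
print(client_ip("198.51.100.9", "203.0.113.7", proxies))  # header from a stranger is ignored
```

Ignoring the header from untrusted senders matters, since otherwise anyone could claim to be browsing from wherever they liked.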

myff admin

A few months ago I would have simply said "what do you need? Let's set it up."

As of this minute, though, we have the likelihood of more adservers going onto the system. I am hoping that they might all fit into the capacity so long as we get it right; if they don't look like doing so, it means a lot more money being spent on more servers.

As such the last thing we need right now is an unlimited bandwidth gameserver. It might not be that big a thing, but it could be the proverbial straw that means we need to increase capacity to be on the safe side.

Can you give some idea on what you would need?

myff admin

Another day fire fighting the ad-server.

Though I might point out that the plan was for the main one to be launched after a few weeks of testing, which turned into a few days.... not surprising that a few things are still needing TLC.

On the whole it's good though, and it is certainly a foundation stone for the forums.
myff admin

I guess as we hit the weekend a few things did get done on the forums this week, but no getting away from the fact that I am spending a lot of each day skyping with the ad guys  

There are three ad-servers to spin, and two still need a lot of urgent work.

Meanwhile what I want to do next on the forums needs time and consideration and will not lend itself to part time attention.

