hosteverything.net

Unexpected Server Outage 10/11 December

I'm sorry to report that, after a smooth switch of hosteverything.net's server to a new ISP and a new server with more space and bandwidth back in August, suith.hosteverything.net went offline on Saturday the 10th at around 16:00.

Initially I thought it was a network outage - there had been a brief one earlier in the week - and a report was filed with my ISP. The network was reported as up, and the server was then reported as up but no longer accepting connections from other systems. The server was rebooted but failed to become available - it was on-line but not accepting connections. Upon visiting the hosting facility I discovered that the server had come back up, but was coming on-line very slowly due to an email problem - the server was being flooded by incoming email. This was resolved and the server left running. At approximately 1:30 on Sunday the 11th the server went off-line for a second time.

A second visit, the following morning, isolated the problem to a faulty network cable that had been further strained during the previous visit. This was replaced and the server became available at 10:30 on the 11th. I decided to check that the server was starting up properly, and upon reboot a faulty hard disk was discovered, which had also been slowing the system's start-up. The drive was removed for testing & replacement.

No data was lost during the outages, though a few pieces of mail - mostly spam - may have failed to be delivered; most will have come through. An additional offsite backup of the live system was taken at 11:30am on the 11th.

A replacement hard drive for the faulty unit - a 'live backup' of the main hard drive in the server - was purchased. This was installed in the server at around 15:00 on the 11th, taking the server offline intermittently over the following 2 hours. During this process I discovered the new drive was 41MB smaller (grrrr) than the required size and could not be used as a mirror of the live volume. Instead a snapshot backup of the server was taken onto the new drive and the drive was then removed.
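
For anyone curious why such a small difference matters: a mirror has to hold a full copy of the live volume, so the replacement must be at least as large as the volume it mirrors. The sketch below is only an illustration of that check. The function names and the drive sizes are made up for the example (the real sizes are not given here), and it assumes the shortfall is 41 decimal megabytes.

```python
def can_mirror(live_bytes: int, replacement_bytes: int) -> bool:
    """A mirror member must be at least as large as the volume it copies."""
    return replacement_bytes >= live_bytes


def shortfall_mb(live_bytes: int, replacement_bytes: int) -> float:
    """How many decimal megabytes the replacement falls short by (0 if none)."""
    return max(0, live_bytes - replacement_bytes) / 1_000_000


# Hypothetical sizes for illustration only; only the 41 MB gap comes from the notice.
LIVE_BYTES = 80_000_000_000
REPLACEMENT_BYTES = LIVE_BYTES - 41_000_000

if __name__ == "__main__":
    if can_mirror(LIVE_BYTES, REPLACEMENT_BYTES):
        print("Replacement drive can be used as a mirror.")
    else:
        gap = shortfall_mb(LIVE_BYTES, REPLACEMENT_BYTES)
        print(f"Replacement drive is {gap:.0f} MB too small to mirror the live volume.")
```

Running a check like this before fitting a drive would flag the mismatch without taking the server offline.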

The server has been on-line since 16:10 on the 11th and is running at full performance. Incremental offsite snapshot backups continue to be taken nightly at 01:00.

The new hard drive is currently being set up and tested on a second system and I will hopefully install it within the next week. I'll send out a notice to let you know when to expect downtime, but I'm afraid it's likely to be around 4-5 hours whilst all data is backed up, the new drive is switched in (as the live drive), the old live drive is moved to live backup, and everything is tested and the system brought back on-line. I'm sorry for any inconvenience this may cause.

Let me know if there is a particularly bad time for the server to be taken down and I'll schedule to avoid that. It's already scheduled for outside of office hours.