Hypervisor Acting Up in Tacoma

The Xen hypervisor is acting up in Tacoma, causing intermittent crashes there. The watchdog timeout catches each crash and reboots the system, so we come back up on our own. I’m still working on it.

Also, we’ll be moving again and, ironically, back to Arizona, since I’m now in sole control of the colocation contract for that server and connectivity there is stable. This time it will be on brand-new hardware, running Xen once again.

I’ll update this post as things change.

Update 6:46PM PDT: Okay, I’ve tracked down what is triggering the crash in the hypervisor. I’m attempting to build new patches with a potential fix applied. Unscheduled crashing should be over, but I will be using off-peak hours to do some testing.

Short Outage at the Phoenix Server

The Phoenix virtualization server stopped responding around 00:28 on Monday, April 9th. I don’t have a reason, and the hardware self-check is only complaining about a failing DRM (Direct Rendering Manager, not Digital Rights Management) checksum and three bad sectors on one of the RAID disks. I’m pretty sure the DRM issue is Xen-related, and the bad sectors are normal for a drive like that one.

So, no idea why it went down.

Effect: Podcast downloads for Ask an Atheist were down, and DNS for vis.nu was broken once our one-hour TTL expired. Tonight’s mail will be delayed, but we’re well within tolerances for mail delivery.