Bringing Phoo back to life
2006-03-25
16:12
I started Apache with the old config, so it’s working. All services on Phoo should be online now, but the DNS may be lagging a bit behind.
11:20
Dovecot and backups were running at around 03:00 this morning, then I had to get some sleep. The initial backup is still running (lots of data). Now I’m going to play around with Apache again; hopefully I can get it to work with FastCGI the way I want, that is, with pre-spawned processes for all users.
01:44
Mails are flowing from Beaver to Phoo! MySQL-replication is working, Postfix is running. Dovecot will be enabled soon. New backups will be made in a few minutes.
2006-03-24
Apache, Dovecot and MySQL
23:55
Since OpenBSD 3.9 has MySQL 5.0 in ports, and a MySQL master and its slaves need to run the same major version, I needed to update MySQL on Beaver (which runs OpenBSD 3.8) from 4.1 to 5.0.
It took quite some time to fix this because:
It took about 30 minutes to build the package
I forgot to run mysql_fix_privilege_tables, ran mysqlcheck on all databases and kept getting “errno 9”; it took a while to figure out that the user wasn’t allowed to use enough file descriptors
I had to increase the openfiles value in /etc/login.conf
I had to restore a copy of the databases from backup, since I had fiddled around with them
I then had to start MySQL with --skip-grant-tables and run mysql_fix_privilege_tables
But now 5.0 is running on Beaver and Phoo. I just have to move around some temporary databases created on Beaver, resync the databases from Phoo and then enable replication.
Postfix and Dovecot should “just work”.
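For anyone chasing a similar “errno 9”, here is a minimal sketch in Python (not something I ran on Beaver) that prints the symbolic name of an errno value and the open-file limits of the current process. The function names are made up for illustration. Run it as the user that starts MySQL to see the limits that user actually gets.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return the symbolic name and the system message for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


def open_files_limit():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def report():
    """Print errno 9 and the current descriptor limits in readable form."""
    name, message = describe_errno(9)
    soft, hard = open_files_limit()
    print("errno 9 is %s (%s)" % (name, message))
    print("open files: soft limit %s, hard limit %s" % (soft, hard))
```

Calling report() prints both values. On OpenBSD the limits come from the user’s login class in /etc/login.conf, so a change there only shows up in new sessions.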
19:20
I’ve got the replication/syncing going now, using rsync and SSH. Now it’s time for MySQL, Postfix and Dovecot!
15:30
Right now I’m looking for a secure way to replicate all the important data on Phoo to Beaver. I’m probably going to use rsync through SSH or through IPsec, syncing the files every 10 minutes or so. I just need a small script that checks whether a transfer is already in progress; if so, it aborts and checks again 10 minutes later.
The things that need to be replicated to Beaver are:
All mails in IMAP storage
Home directories and websites (i.e. /home and /var/www)
Bitlbee accounts
MySQL (already using MySQL’s built-in replication)
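The “small script” could look roughly like the sketch below. It is written in Python and is not the script I ended up using. It takes a non-blocking lock on a file, so a run that starts while the previous transfer is still going simply exits, and cron tries again at the next interval. The lock path, the destination host and the rsync options are assumptions made up for the example.

```python
import fcntl
import subprocess

# Hypothetical values, chosen only for this example.
LOCK_PATH = "/tmp/phoo-sync.lock"
SOURCES = ["/home", "/var/www"]
DEST = "backup@beaver.example.org:/backup/phoo/"


def acquire_lock(path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle


def build_rsync_command(source, dest):
    """Build the rsync command line for one source directory, tunnelled over SSH."""
    return ["rsync", "-az", "-e", "ssh", source, dest]


def main():
    """Sync every source, unless another run still holds the lock."""
    lock = acquire_lock(LOCK_PATH)
    if lock is None:
        return 0  # a transfer is still running; the next cron run will retry
    try:
        for source in SOURCES:
            subprocess.run(build_rsync_command(source, DEST), check=True)
    finally:
        lock.close()  # closing the file releases the lock
    return 0
```

Saved as a script with a final line calling main() and run from cron every 10 minutes, this gives the abort-and-retry behaviour described above. The lock is released automatically if the script dies, so there is no stale lock file to clean up.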
Past tense:
2006-03-23
Package updates and Apache wrestling
I updated all the installed packages and removed some unused ones. Because 3.9 isn’t released yet, there are no pre-built packages, so I had to build all the packages from ports. It took some time to get everything working because of old libs and such.
Previously we’ve been using Apache with mod_php for PHP5, but I never really liked it, so I thought I’d try to install a better PHP5 environment now that the service is down anyway. I want to use Apache with FastCGI and pre-spawned PHP5 processes. I’m using this setup, but with Lighttpd, on Hera and I’m pretty happy with the results. Every user has a couple of PHP processes that interpret their scripts, and no scripts are run as a shared user. Everything is chrooted in /var/www too.
With Apache’s mod_fastcgi I couldn’t figure out a way to configure pre-spawned FastCGI processes for all .php files; I could only make it work with a specific script (e.g. /users/jage/htdocs/index.php). This wasn’t what I wanted, so after I whined a bit on IRC, xevz mentioned mod_perl and using embedded Perl in httpd.conf. As I’m not a Perl programmer I took a look at mod_ruby instead, but since its documentation was lacking I started reading about mod_perl. I did some tests but couldn’t get it to work at that point.
I enabled Bitlbee on Phoo and dentarg pointed im.starkast.net back to Phoo. I wonder how many users of our Bitlbee server will continue to use it for IM after this massive downtime. I’m going to copy the Bitlbee accounts from Phoo to Beaver every five minutes and keep a copy of the Bitlbee configuration on Beaver, so that we can use it as a backup server for all users in case Phoo wants to cause more trouble.
2006-03-22
The visit
serp and I visited Phoo in the colocation facility this Wednesday. The first thing we did was to connect a monitor to Phoo and see what was going on. What we saw was a normal login screen without any error messages. I could type my username and press enter, but it never asked for a password; it just “hung”. After we pushed the reset button, we ran fsck and disabled all the services. We then scoured the logs but didn’t find much of a clue as to why it had stopped working like that. In the daemon log it looked as if Phoo had been offline for over a month, and in the messages log all I could see was some attempts by ntpd to connect to servers.
We updated OpenBSD from 3.8-stable to 3.9 from CVS because we have had some weird reboots with 3.8; hopefully this code will be more stable.
Memtest86 ran a full test on all the RAM (2GiB) and no errors or faults were found.
We spent several hours trying to flash the IPMI card without success, so we decided to bring the card home with us; hopefully we can get it replaced.
This trip took us a whole day. When I came home, all I did was reboot the server with a freshly built kernel and then go to bed.