Ugh…

November 11th, 2009

So much to do, and so little time.

Here is a quick update on some of the things I’m working on.

CP2 – CP3 migration.  I was able to get CP3 booting /root from the iSCSI SAN, so now it’s time for me to schedule and begin testing the transfer of some accounts from CP2 to CP3.  Once I can verify that transfers work, I’ll update DNS and then schedule the transfer of more important (high-visibility) accounts.  Then we observe to see if these accounts continue to have the problems they’ve had in the past.

SAN problems.  Random issues with an older NSM160.  One of our Network Storage Modules went off-line on Friday last week (11/6/09), about mid-day.  I was able to call HP and get support immediately.  We initially rebooted the unit and it returned to normal for a few hours.  It then went off-line again later that day.  HP took a closer look at the unit and discovered some really odd artifacts.  The unit had stopped writing logs in August.  HP engineering also found some really odd permission problems with the software RAID components.  We took the unit off-line using their Repair NSM function, rebuilt & re-initialized the RAID array, and then reintroduced the unit into the cluster so that the data could re-stripe itself to the rebuilt unit.  Fortunately, we have not seen any additional problems from that unit since doing so.  HP has shipped a replacement unit, which is back-ordered, but should hopefully arrive in the next 3-5 days.

SAN/iQ upgrade.  Met with our HP/LHN Storage Sales Specialist and talked (for a good 2 hours) about updates down the line and our current situation.  It was kinda fortunate that we had encountered problems the previous week so that I could bring up any support issues in our conversation.  Overall, I feel that their support services are getting much better.  I assume that most of their merger-pains have subsided and they are doing a rather decent job.  I plan to schedule and perform a SAN/iQ software upgrade in early January.  We shall see how that goes.

WAD2.  WAD is the acronym we’ve given to our Web, Auth, & Database servers.  The redundant set (thus the reasoning behind 1 & 2) in a secondary site randomly went off-line on Monday morning (11/09/09).  It was quite cumbersome to have these units go down right then, since I was in a meeting with our HP/LHN rep.  After the meeting, I did some digging around and I believe I’ve narrowed down the cause of the interruption to an Automatic Transfer Switch.  The switch may be faulty.  We’re looking towards replacing it, and two faulty UPSes in the same space.

During the interruption of the WAD2 servers, actually a bit before that…  While I was helping our Network Admin with some Radius configuration I ran into two headaches.  1) Our service agreement for this product needs to be renewed.  2) It appears as if all production services are running on the WAD1 setup, except for Radius and bulk import utilities for administrative data.  When WAD2 went down, it impacted those utilities and, more importantly, Radius Authentication for our new wireless network.  Furthermore, once those services were restored, Joe found some issues with NFS and iSCSI mounts.  Oh, I also discovered that Radius Authentication services (for some perl reason) cannot be enabled on WAD1 systems and so I’m faced with trying to figure out what the best fix will be in the foreseeable future.

Support Contract renewal for VMware.  I’ve completed all of the leg work and paper work.  I just need the P.O. to be signed so that I can submit it.  Once that is done I can begin planning an update and eventually a capacity upgrade.

We’ve hired a Web Programmer / Systems Administrator.  I don’t know the specifics but I’m content.  The CIO hire is another process altogether that I’m not very keyed into.

I guess that about wraps it up.

–Raf

iSCSI Boot, er, well okay then how about Root…

October 7th, 2009

I meant to post this last week, but we know how that goes…

The original title was going to be, iSCSI Boot w/ an HP DL380 G5.

We first purchased three of these units back in …  < SEE PO DB > and got them racked and powered.  We allocated one of them to replace NISC40, an image server for a Linux Lab.  The other unit we christened CP3 (a replacement for CP2) which will run CPanel for our Academic Faculty, Staff, & Depts.  We never really got around to configuring the 3rd one.  I believe I spent some time trying to get iSCSI boot working but I’m not really certain.

Several years later, it’s time for me to revisit the project.  I began by downloading the latest binary from HP and reading the updated documentation.  The process is fairly straightforward.  You configure the Option ROM on the NIC by providing the necessary iSCSI parameters.  You tell the BIOS you want to boot iSCSI from the NIC, then you reboot and hope that you establish a connection and that somehow your system recognizes the iSCSI LUN as a boot disk, and off you go.

I was able to update all of the firmware and what not on the box using a USB key and an HP FirmWare CD image.  I was able to find out what IP Address we had configured the iLO2 port for and uploaded the iSCSI Boot configuration file.  I then modified the BIOS to enable iSCSI Boot and to load the Option ROM, but I wasn’t able to establish a connection.  I tried changing a few parameters in the iSCSI Boot configuration file but was unsuccessful.  To top it off, there weren’t any error messages displayed on the console, so I didn’t know what exactly was going on.

So, in order to find out what was going on I configured an old Mac laptop w/ Wireshark (a tedious task that ended up consuming a day and a half, but thank you MacPorts) and decided to sniff the network traffic.  I believe I discovered two major issues.

1.  It appears that the Option ROM doesn’t support iSCSI Load Balancing.  That is to say it cannot communicate w/ our VIP and then change its connection to the specific module, as told by the VIP.  No problem, we’ll just tell it to communicate w/ a specific module.  That workaround, however, raises the question of what happens after initialization: if redirection has to occur after the OS boots, will the OS comply?  There are a few issues with this, but for the moment I have a workaround.

2.  I suspected that network initialization between the NIC and the port on the switch doesn’t occur fast enough for the Option ROM to establish a connection and log in to the iSCSI target.  This doesn’t occur if I have a small hub as an intermediary since the link is already active.  Aaron suspects that we might be able to hard code the port to Gig speeds @ full duplex to get around this issue.  Before I do that, I’d like to make sure that I get iSCSI Boot to work.  However, I’m now thinking that this might be more important to verify now, rather than later.  It would be a great pain to have come all this way to get iSCSI Boot working, only to find out that there is no way to delay the initialization of the Option ROM.  Being stuck at 10 Base-T @ half duplex would not be ideal.
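One way to put a number on the second issue is to time how long the target takes to become reachable after the cable is plugged in, first through the switch port and then through the hub.  The following is only a rough sketch of that idea, not something I actually ran: wait_for_portal is a made-up helper name, and it assumes the target listens on the standard iSCSI TCP port (3260).  It only checks that the port accepts a TCP connection; it does not attempt an iSCSI login.

```python
import socket
import time


def wait_for_portal(host, port=3260, timeout_s=60.0, interval_s=1.0,
                    connect=socket.create_connection,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll a TCP port until it accepts a connection.

    Returns the seconds elapsed before the first successful connect,
    or None if nothing answered within timeout_s.  The connect, clock
    and sleep arguments exist only so the loop can be tested offline.
    """
    start = clock()
    while True:
        try:
            # A short per-attempt timeout keeps the polling loop responsive.
            conn = connect((host, port), timeout=interval_s)
        except OSError:
            conn = None
        if conn is not None:
            conn.close()
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(interval_s)
```

Calling wait_for_portal("192.0.2.10") right after plugging in the cable (the address is a documentation placeholder, substitute a storage module's IP) would give a ballpark figure for how long the link takes to come up on each path.  If the switch port is consistently many seconds slower than the hub, that would support the idea that the Option ROM simply gives up too early.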

Well, with those two out of the way, my attempts to configure iSCSI boot have so far failed.  My first attempt was to simply copy the existing OS installation onto the iSCSI LUN, make the appropriate changes, and hopefully boot that way.  Wishful thinking…  There is still far too much that is unknown for that to successfully work.

My next step was to configure iSCSI Boot and try a clean OS install, as described in the documentation.  There is a KickStart file that helps configure the installation to support an iSCSI LUN from the Option ROM.  I managed to get that initial portion to work, however near the end of the install the post scripts failed and brought the entire thing to a screeching halt.  It looks as if the installation didn’t properly create a kernel image on the /boot LUN (no initrd file to be found).  I spent some time mucking about the install trying to see if I could create said kernel, but w/o the working install directly I didn’t get too far.

I decided that I would try again w/ a more recent OS release (RHEL 5.4).  So, at the end of the day on Friday (10/2/09) I downloaded a newer install image and burnt it to DVD, on Monday.  Hopefully the post install scripts in the KS.cfg file will work w/ this release.  If they don’t, I’ll be forced to post to HP and give their support line a call to see if they can help.  Thankfully, the install completed and the post install scripts of the KS.cfg file ran.  Unfortunately, iSCSI boot still didn’t work.  The iSCSI connection was initialized at boot by the Option ROM, but GRUB failed to load.  All I was able to get was a black screen w/ the letters GRUB in the upper left hand corner.  No prompt (GRUB>), just GRUB.  I troubleshot this for a bit on Monday.  I thought that perhaps the MBR wasn’t properly installed on the iSCSI LUN.  Somewhere in the process of trying to fix that I hosed the /boot partition on the physical disks installed in the server and had to spend a few hours getting that back to normal.

If I can get iSCSI boot to work, that would be great.  It would mean that any future RHEL builds (although none are planned and it is highly unlikely that we will continue to run RHEL unless specific software requires it) can be booted from the SAN and have backups (via SAN snapshots) taken care of.  The reality is that we will probably employ Ubuntu Server on most new Linux boxes.  The unfortunate aspect of this is that even if I do manage to get iSCSI boot working, I’ll have to perform a new clean install of CPanel in order to begin migrating existing Academic hosts from our older install.  Regardless, it is a worthwhile project to spend my time on.

The good news is that since the RHEL install worked and the post scripts ran I had a somewhat capable initrd image to work with. I tried copying that to the /boot partition of the internal disks and tried using a local /boot to mount an iSCSI root (/).  That failed, but at least it would get to a certain point and then fail.  I decided that I could probably take a look at the initrd image and hack it in order to initialize an iSCSI connection and possibly mount /root from iSCSI.  I would have gotten to it sooner except for the fact that a backup volume on my backup server filled and I was forced to repair the volume / server for the entire day on Tuesday.

Which brings me up to today.  I’m happy to report that I have successfully managed to get this system booting /root from iSCSI.  Now, if I can modify one of the original initrd files to include the additional drivers and scripts of the latest RHEL install I might be able to copy the local root to another iSCSI LUN and try to boot root off of iSCSI w/ the OS and our CPanel build installed and configured.

Joy,

–Raf

OMG, since June ! ! !

September 22nd, 2009

I was going to post some massive update to cover June, July, Aug., & Sept. but I need coffee at the moment.  Oh well, maybe next time (in January).

–Raf

Updates…

June 17th, 2009

I’ve been managing to keep myself busy with various projects here and there.  Here is a re-cap of things since my previous post.

I’m going to order myself a new work laptop.  My old G4 is simply not cutting it anymore.  I was having a hard time deciding whether or not to go with a portable workstation or another compact and I think I’ve decided on getting the smaller of the two.  I’ve also decided to stick with a Mac since I feel that the OS is simply better (since it’s practically tailored for the hardware).  I’ll aim to have it ordered, hopefully, early next week.

Raf back @ Work

For some reason, my webcam stopped working.  I think it was due to either a Windows XP update, or a VMware update.  Regardless, it took a while to get it working again for a few reasons.  1) It really isn’t supported on Windows XP.  2) It is also not supported on Linux.  3) I forgot the steps I took to get it to work the first time.  I started to suspect that there might have been a problem with the VMware update and how it passes USB to the VMs.  I eventually cleaned up my XP VM, allocated more RAM to it (since it was painfully slow), and re-ran multiple installs until I got it right.

I’ve been working on a demo install of Xythos Enterprise Document Management Suite.  It took me a while to get past the install, but I managed to get it installed with some help from the Xythos Support Staff (we apparently didn’t setup PostgreSQL properly, thanks Ubuntu), and I’m now in the process of thinking through a possible migration strategy.  It is too bad that their import scripts don’t deal well with Solaris ACLs and recursive directories.  There is also a larger issue with looking up group information, but I’ll have to revisit that one with Xythos Support, or most likely Professional Services ($$$).  On the surface, this might work, however our migration path is looking to be an absolute pain.  I’m at the point where I’ll probably import over a few user folders and let them loose on the demo system to get some feedback.

Hrm, I have another WordPress update to do.  I’m thinking I’ll hold off a bit longer, however since the last one went rather smoothly I’ll probably do it once I have some downtime.  I also have a Gallery update to install.  Since my last Gallery update went horribly wrong, I’m definitely going to wait until I have some downtime.

I still have to plan some SAN updates, however I need everyone in my Dept. to be tuned in and on-board regarding the procedure.  At this rate, I may end up doing them in late July or early August.  I don’t anticipate them being too troublesome, as long as I can schedule a large enough down time window.  Beyond a simple software update I would prefer to do a hardware update but, without a boss to budget and approve such an order, that operation is above my pay grade.  What I really want to do is try out a few Failover Manager(s) on our VMware config, but that might be asking too much, especially since it puts the VMware config in a mission critical sort of production that we’d rather avoid.

2nd planter close up

At home, I’ve been busting ass building planters from recycled pallets.  I was surprised by how awesome they look, if I do say so myself.  Click on the image to see more pictures of the progress I’ve made thus far.  I have two more planters to make from heavy duty industrial pallets, which took a while to break apart.  I have a few more ideas for recycling pallets, but I have to finish two more before I can move on.  I got the idea to use pallets from the Instructables website.  The best part of the project was hunting pallets.  The worst is a three way tie between pulling nails, taking apart industrial pallets, and hauling dirt.

Hrm, I guess that pretty much sums it up.  There are probably a few things here and there that I’ve omitted, but I think I’ve included the juicy bits.

Laters,

–Raf

Thrashing

May 28th, 2009

So, after updating my PowerBook G4 (12″ DVI) everything seemed to be okay until I realized that the disk was being accessed quite heavily (based on sound alone), and that the computer seemed a bit slow.  I thought it was due to some DB update, or something along those lines, and decided to wait it out.  The next day it was still at it and the usability had worsened.  I decided to reboot to see if that would resolve the issue, which it did.  After rebooting, I checked the Console.app to see what was being logged.  As it turns out, I had run out of memory and was thrashing.

May 26 19:51:42: — last message repeated 4 times —
May 26 19:51:45 rafs-powerbook-g4-12 kernel[0]: (default pager): [KERNEL]: no space in available paging segments
May 26 19:51:54: — last message repeated 1 time —
May 26 19:51:54 rafs-powerbook-g4-12 kernel[0]: proc: table is full
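Had I been watching for these messages, I would have caught this a day earlier.  Here is a minimal sketch of a scanner for them.  The two message strings come straight from the log above; the function name, the category names, and the way "last message repeated" lines are folded into the preceding match are my own assumptions.

```python
import re

# Substrings taken from the kernel messages quoted above.
PRESSURE_PATTERNS = {
    "paging_space_exhausted": re.compile(r"no space in available paging segments"),
    "proc_table_full": re.compile(r"proc: table is full"),
}
# syslog collapses duplicates into "last message repeated N time(s)".
REPEAT = re.compile(r"last message repeated (\d+) times?")


def count_pressure_events(lines):
    """Count memory-pressure messages, including collapsed repeats."""
    counts = {name: 0 for name in PRESSURE_PATTERNS}
    last = None  # category of the most recent matching line, if any
    for line in lines:
        repeat = REPEAT.search(line)
        if repeat:
            # Credit repeats only when we saw the line being repeated.
            if last is not None:
                counts[last] += int(repeat.group(1))
            continue
        last = None
        for name, pattern in PRESSURE_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
                last = name
                break
    return counts
```

Feeding it the lines of a saved log file (for example, the lines of an open file object) returns a small dict of counts.  A repeat line whose original message falls outside the excerpt, like the first line above, is skipped rather than guessed at.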

Let that be a lesson: reboot early, reboot often.

–Raf

Our Kitchen Uplift

May 26th, 2009
Kitchen Facelift

This is just a test to see how the image upload/insert feature works in the WordPress Dashboard.  I had always meant to add a plug-in in order to figure out how to do that, but instead I opted for the software update to take care of that for me.  The completed kitchen face lift took about 3 months.

[UPDATE] Hrm, the image had to be scaled down to 70% in order not to interfere w/ the right side bar.  I imagine that multiple reductions would allow text to align to the left or right.  Interesting…  Once the image has been posted, the Edit feature resets the scale so that the last change in scale is the new base line, or 100%.  I assume that is good because it would allow one to rescale the image if they choose to do so.

Overall, the new features are quite nice.

–Raf

That wasn’t too bad…

May 26th, 2009

Ooo, the new Dashboard is cool.

More later,

–Raf

Ha ha…

May 26th, 2009

So much for keeping this thing up-2-date.

I just managed to get myself out of a busted Gallery update.  I’m now going to try to update WordPress.  If I don’t succeed or if I can’t easily recover, this might be the end of this blog.

–Raf

Blah, blah, blah…

March 9th, 2009

I’m still feeling ridiculously busy, even though work is pretty quiet because it’s Spring Break.  There are many projects that we’re finalizing at work, and more along the way.

We finally got around to finishing up dovecot testing w/ various e-mail clients, two weeks ago.  Vasantha and I put the new dovecot in place on Sunday while Aaron & Sarah updated our core network infrastructure, and things look good.  Aaron is on schedule to replace our routing core tonight.

Regarding Dovecot, there was a moment where we thought we might have had to update / patch our mail server, however we were able to get around that.  We’ll probably have no alternative but to patch our mail server once it comes time to update sendmail.  We did eventually discover an issue w/ the subscriptions list, but we were able to work around it pretty quickly.  Like I said, things look good.

On the home front, Mali and I have finished phase 2 of our kitchen redo.  We’ve both been working pretty hard the last couple of weeks putting it all together.  We still have to put down our floating floor, but that will wait until later this week.

Part of the reason that I feel so stressed and busy is that I also have a lot of ideas regarding some house related projects that I would like to undertake, in addition to my regular work during the week.

I’d like to begin the initial stages of planning out the back yard and build a few planters for Mali’s garden.  I’ve been thinking about making some kind of book shelving monstrosity for the library/study/lounge.  Then there is the fact that we’re rearranging the layout and purpose of the rooms we initially setup.

The “Entertainment Room” is going to be converted into a formal dining room and the “Entry Room / Grand Room” is going to be converted into a formal grand room / living room.  Both of these require moving the TV.  My TV is broken and so I have to see about getting it fixed or simply getting rid of it.  It would be a shame to not be able to get some money for it.  Additionally, there doesn’t appear to be any way to move it into the living room w/o complicating the layout of the furniture, so it might just have to go away.

Our library / study / lounge will continue to be just that.  Eventually, we’ll need to rearrange the things that are in there and figure out what to do with the rest of the room.  I’m pretty certain that all of these will take more time to accomplish than we anticipate.  We’ll just have to see how much we can get done this summer / year.

We had a big party this Sunday to celebrate our new kitchen and a few birthdays.  We made another huge batch of tamales w/ the help of some friends.  We’re all talking about possibly making more tamales later this month.

Always a blast,

–Raf

Updates…

February 24th, 2009

Emotionally, I feel busy, very busy…  Not quite over-whelmed, however I’ll have some long days and nights ahead of me if I’m to finish all that I’d like to finish.

Friday -

Friday evening, Mali and I went shopping for most of the hardware we needed to finish up our kitchen.  We purchased our faucet and a small base cabinet unit, to use next to our fridge, from Ikea.  We’ll probably return once more to purchase a bar table once we get the kitchen in order.  We then went to Lowe’s to pick up the rest of the hardware that we needed for the plumbing and electrical work that we planned on doing this weekend.

Saturday -

We tore out our kitchen sink on Saturday.  I managed to get most of the plumbing done, and I prepped the dishwasher for installation.  We cut the piece of wood that we’re going to use as our end panel and Mali began painting it, as well as the drawers from the buffet table that we’ll be using in what will become our formal dining room.

Sunday -

Finished the plumbing work and began prepping the sink for installation.  We also cut notches out of the counter top to accommodate the sink and prepared the sink cabinet by removing the front and rear boards that provided stiffening support (I’m not too worried about it, but we’ll see how well it holds up).  The Ikea sink provides two long rods to use as stiffeners; I installed one of them, but will have to wait to see where to install the 2nd.  Unfortunately, since we were still waiting for paint to dry, we didn’t get much else done.  Had lunch w/ Gabe & Rahul @ Cuba Libre.  Upon returning from lunch, I installed the new faucet on our sink, as well as the included strainer and the strainer/garbage disposal mounting unit.

Monday -

  • Tapes
  • Started using our new secure-wireless network, now w/ 802.11 A and N; not that I have the hardware to utilize these new features.
  • RAM Upgrade for a PowerBook G4 = DONE

At home, I installed the power cord for the garbage disposal.  I’ll have to spend some time working on routing the new power receptacle and switch later this week.  I then read up for class.

Tuesday -

  • Updated Ubuntu desktop, took some time to get VMware Workstation back in order (it always does).

Not much else planned for today.  I have to help Vasantha test a new version of Dovecot and write a short paper for my class tonight.  I won’t be doing much @ home tonight since I’ll be getting home so late in the evening.

–Raf