Seth's Blog: Ms. In-between

If the only reason you’re only wearing one hat is because you’ve always only worn one hat, that’s not a good reason.

Seth’s Blog: Ms. In-between.

Tradition for tradition’s sake is never a good reason to continue doing something in a technical environment.  The landscape is ever changing.  You should always be re-evaluating what you’re doing while keeping an eye on the bleeding edge of your profession. It’s the difference between being relevant and being the (only) caretaker of a legacy system.

When was the last time you challenged a tradition?

Your Cloud Needs a Sys Admin – O'Reilly Broadcast

The programmer-managed infrastructure suffers from a death by a thousand cuts. The programmer is competent with technology and fully capable of setting up a system that can support the application being built. The programmer, however, lacks a detailed understanding of ongoing infrastructure management. Consequently, the programmer-managed infrastructure ultimately leads to an environment incapable of adjusting to changing demands and potentially opens vulnerabilities to hackers through discreet channels.

The reverse is true of the sys admins who fancy themselves programmers. They can craft Perl programs to do just about any task. Those programs, however, ultimately lack the solid architecture that programming skills provide.

Your Cloud Needs a Sys Admin – O’Reilly Broadcast.

Fair warning:  I’m a sysadmin so my opinion might be slightly biased on this idea.

Yes, you do need a sysadmin.  Just like the blog post suggests.  Like everything in business, you want to use the right tool for the right job.  Sure, you can use a wrench to hammer a nail in, but in the process you’re likely going to smash a finger, chip the wood, and spend excessive time frustrating yourself when pounding that nail into place doesn’t go fast enough to meet your nail-hammerin’ schedule.

Sysadmins are hammers (the right tool) when it comes to nails (managing systems and infrastructures).  A good sysadmin, as the blog post alludes to, has the broad knowledge required for running a system (or set of systems) effectively and safely (read:  securely).  More often than not, we have the experience needed to tell you exactly when you can cut corners, when you shouldn’t, and why doing so may or may not be the right thing for your environment.

There’s an old AI koan that I’m reminded of.

A novice was trying to fix a broken Lisp machine by turning the power off and on. Knight, seeing what the student was doing spoke sternly: “You can not fix a machine by just power-cycling it with no understanding of what is going wrong.” Knight turned the machine off and on. The machine worked.

Ubuntu Linux adds private cloud backing | Open Source – InfoWorld

Ubuntu Linux adds private cloud backing

Canonical’s upcoming server upgrade supports the Eucalyptus project’s open source system for cloud implementation using hardware and software already in place

Canonical is touting private cloud capabilities in an upgrade to its Ubuntu Linux OS being announced on Tuesday.

Available for free download on October 29, Ubuntu 9.10 Server Edition introduces UEC (Ubuntu Enterprise Cloud), an open source cloud computing environment based on the same APIs as Amazon EC2 (Elastic Compute Cloud). Businesses can take advantage of private clouds, Canonical said.

Ubuntu Linux adds private cloud backing | Open Source – InfoWorld.

This should prove interesting.  If we were able to leverage something like this, we could build out a private cloud for researchers.  The Eucalyptus system certainly looks useful.  Especially if they’re touting it as API compatible with other external cloud vendors.  We’d certainly need to do some heavy investigation to figure out what running our own cloud would actually mean.  I can certainly see it as being completely different than running a classic high performance computing grid.

You, too, can have a cloud in the privacy of your own home!  Time to keep up with the Joneses again!

Four Short Links, Oct 14, 2009

  • Larry Ellison hates cloud computing – funny clip of Ellison lambasting the idea of clouds. Yes, really, clouds have been around for over a decade, we just didn’t know it (or realize it).
  • Dynamic general and slow query log before MySQL 5.1 – This is an interesting way of handling the slow and general query logs on pre-5.1 MySQL instances. We don’t need this for the slow query log, but there have been occasions when we’ve needed the general query log, and enabling or disabling it requires a full restart of the service on 5.0 and earlier. You still take a performance hit because you’re always logging, but I would expect it to be fairly minimal on modern fast hardware.
  • Watch out for your CRON jobs – Over at the MySQL Performance Blog, Peter Zaitsev gives some good guidelines on things to pay attention to when designing your cron jobs. Not just for databases. I like the idea of keeping historical run time information so you can see when large jumps in run time occur (which could be a problem).
  • How Did Danger Not Backup Its Servers? How Did Microsoft Allow Such A Failure? – Oy. A few days late on this one, but really? Total data loss from an upgrade. Scary. This is a reminder: we all test our backups, but how many of us test our restores?
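
The run-time history idea from the cron link above could be sketched roughly like this. It is a minimal wrapper, not anything from Peter's post: `timed_run` and the `CRON_HISTORY` variable are names made up for illustration, and the default log path is an assumption.

```shell
# Hypothetical wrapper: run a command and append one line per run to a
# history file, so large jumps in run time are easy to spot later.
# CRON_HISTORY is an assumed variable name, not a cron feature.
timed_run() {
    history_file=${CRON_HISTORY:-/var/tmp/cron-history.log}
    start=$(date +%s)
    "$@"
    rc=$?
    end=$(date +%s)
    # Fields: epoch start, seconds elapsed, exit status, command line.
    printf '%s %s %s %s\n' "$start" "$((end - start))" "$rc" "$*" >> "$history_file"
    return "$rc"
}
```

A cron job would then call something like `timed_run /path/to/nightly-job` (a placeholder path) instead of calling the job directly, and the history file can be grepped or graphed whenever a job starts to feel slow.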

    Seth's Blog: Make a decision

    It doesn’t have to be a wise decision or a perfect one. Just make one.

    In fact, make several. Make more decisions could be your three word mantra.

    Seth’s Blog: Make a decision.

    Finally, someone agrees with me.  One of the biggest demoralizers in an organization (in my experience) was the lack of someone standing up and just making a decision.  Pick a direction.  Any direction.  It may not be the right decision (and may even turn out to be flat wrong).  But, it’s a direction that your employees can latch onto and do something with.

    The lack of decisions means no traction, no effort, no movement.  Decisions mean progress.

    You can always correct a bad decision.  You can’t always recover from the lack of one.

    Four Short Links: Thurs., October 1st, 2009

    • Your People – Rand in Repose: “Your People will piss you off because the relationship is genuine. They do not coddle and they do not spin. Consequently, Your People error-correct you in ways that others cannot.”
    • Wordle: “Wordle is a toy for generating ‘word clouds’ from text that you provide.” This looks like a great way to make designs for T-shirts or posters. Imagine doing these for your organization, covering all the words that describe your mission statement.
    • justniffer: a packet sniffer that outputs common log format style information on your packets. Nifty!
    • piwik: Someone’s gone and done it … an open source clone of Google Analytics.

    Crazy … bcfg2 for netapp?

    While talking with Tom, it struck me that we could use bcfg2 to manage
    our filer configs. Filers are basically a bunch of files with some
    sort of way to trigger a re-read. If we mount the /etc dirs of a filer
    on the management node and run bcfg2 in a chrooted environment against
    that directory structure, we get “instant” manageability of the config
    files.

    A few custom actions that fire off when the config files change and we
    have something we can use to better enforce policy on the filers. And
    we could build up a fake filer using the Netapp emulator to act as a
    test environment.

    Of course, two things occur to me after talking with Narayan about it.
    I’m now a bit deflated that someone already thought of doing this (though
    with VM images and read-only netboot systems). Second, I think someone
    would chase me around with a baseball bat if we actually went through
    with this.

    Nonetheless, it was a fun thought experiment. Just thought I would
    share. :-)

    “The distance between insanity and genius is measured only by success.”

    The Duct Tape Programmer – Joel on Software

    Jamie Zawinski is what I would call a duct-tape programmer. And I say that with a great deal of respect. He is the kind of programmer who is hard at work building the future, and making useful things so that people can do stuff. He is the guy you want on your team building go-carts, because he has two favorite tools: duct tape and WD-40. And he will wield them elegantly even as your go-cart is careening down the hill at a mile a minute.

    [...]

    And the duct-tape programmer is not afraid to say, “multiple inheritance sucks. Stop it. Just stop.”

    You see, everybody else is too afraid of looking stupid because they just can’t keep enough facts in their head at once to make multiple inheritance, or templates, or COM, or multithreading, or any of that stuff work. So they sheepishly go along with whatever faddish programming craziness has come down from the architecture astronauts who speak at conferences and write books and articles and are so much smarter than us that they don’t realize that the stuff that they’re promoting is too hard for us.

    The Duct Tape Programmer – Joel on Software.

    Oy.  How closely this parallels the sysadmin world!  It’s interesting to see unique and cool frameworks come together in these amazing bread-slicing-new-wheel-building-ultra-rad panaceas that get things done.  Assuming you have time to complete them.  One of the things I’ve had to force myself to learn is what I’ve termed “rational perfectionism in technology”.  Otherwise known as “Is it good enough?  Ship it!”

    It’s a difficult thing to accept, I know.  As a geek, sometimes it just kills me to let something go out that isn’t up to the uber-high standards I generally have.  But, sometimes you just have to.  You’ve got a job to do and the customer is waiting on you to do it.  Very often we will sit on something trying to achieve this golden age of usefulness in a piece of software instead of taking a step back and trying to figure out if what we have now will work for the customer.  It’s the technological equivalent of gilding the lily.

    But many times we are faced with a task that cannot or should not be delayed if at all possible.  Many times, it’s perfectly acceptable to put something out there that hits only 80% of the need so the customer can start doing the job they need to do.  From there, you can iteratively improve as necessary.  And, more often than not, I’ve found that the customer is perfectly happy with that 80% solution and may not even notice the warts that you’ve laboriously fretted over.

    An artist is never satisfied with his work.  An art lover very often is.

    Of course, we now approach this with the devil’s advocate voice in mind.  I’m not saying half-ass the job.  We should always strive to do our best when trying to give the customer what they want (or need).  Just be balanced in it.  It’s a constant juggle for me (which makes it kind of fun) to understand just where that technological center of gravity is and hover around it for as long as I can.

    Just remember:  just as you don’t knock the duct tape if it gets the job done, don’t knock the 80% mark if it makes your customer happy.  After all, that’s why we’re here … to do what we can to help the customer.

    -Update-

    Joel’s post references the Worse-is-better concept. I thought this quote was rather interesting (and man do I agree with it …)

    The lesson to be learned from this is that it is often undesirable to go for the right thing first. It is better to get half of the right thing available so that it spreads like a virus. Once people are hooked on it, take the time to improve it to 90% of the right thing.

    Ugh. Four hours wasted.

    I’m sitting here with a browser in one window and a text console to an installing system in another. Why? Because I’m waiting for the installation to finish. I’ve been debugging an odd bcfg2 failure during kickstart post-install for our provisioning system. It first started last night when I left the office. I’d just fired off a reinstall of an IAM system to verify that it would work correctly from the production kickstart (as I’d just pushed out the first real production bits to it).

    This morning, I got in only to stare at a console still stuck in the kickstart post install. Sigh. Ok, dig around to find the magic remote rescue arcana so I can poke around for the logs. See that two files aren’t binding correctly in bcfg2, which potentially croaked the install (it certainly looked like it hung, that’s for sure). Get the kickstart updated to use the “right” profile for now.

    Reboot, reinstall. Lather. Rinse. Repeat.

    Ok, kickstart is completing successfully! Yay! Confetti and champagne for everyone!

    Reboot.

    Hey, grub doesn’t have the right setup. Easy fix in the repo by moving the TGenshi template processing into the right group. Go to run a quick update on the IAM system and .. hey, where’s bcfg2?

    *headdesk*

    No wonder post didn’t error out. It didn’t actually do anything! Well .. it did. It errored out on yum because … the rpmforge repo got corrupted. Why did it get corrupted? Well, it appears that the stable thing we’ve been doing for months is now broken because the server hosting the rpmforge gpg keys and yum repo setup isn’t answering requests.

    Fix the url, reinstall and now I’m back where I started early this morning. A broken bcfg2 config that stalls out in post.

    I love four hour snipe hunts.

    At least we know where we need to fix some things, including:

    • Pulling the rpmforge-repo rpm locally
    • Possibly mirroring all of rpmforge for the repos we need
    • Better error handling in the kickstart post
    • Better portholes into the post install to see where errors are. Like, why isn’t our safety shell starting on tty2 like it should be?
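
For the error-handling item in the list above, one possible shape is a small helper that every post-install step runs through, so a failed step gets logged by name instead of passing silently. This is only a sketch: `run_step`, `failed_steps`, and the `POST_LOG` path are hypothetical names, not part of kickstart or bcfg2.

```shell
# Hypothetical helper for a kickstart %post script: log each step's output
# and record which steps failed. POST_LOG is an assumed path.
POST_LOG=${POST_LOG:-/root/ks-post.log}
failed_steps=""

run_step() {
    name=$1
    shift
    echo "== $name: $*" >> "$POST_LOG"
    if "$@" >> "$POST_LOG" 2>&1; then
        echo "== $name: ok" >> "$POST_LOG"
    else
        rc=$?
        echo "== $name: FAILED (exit $rc)" >> "$POST_LOG"
        # Remember the failure so the end of %post can report or stop on it.
        failed_steps="$failed_steps $name"
        return "$rc"
    fi
}
```

Each command in post would be wrapped as `run_step some-label command args`, and the end of the script could check whether `$failed_steps` is non-empty before declaring the install good. That would have flagged the failed yum step right away instead of four hours later.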

    Tips: See the contents of an RPM post-install script

    Today I needed to look at MySQL.com’s post-install scripts for MySQL Advanced. Being lazy, and not having quick and easy access to the specfile, I used my black belt in Google-fu and found that it was simpler than I thought. In my case, I had already installed the RPM. All I needed to do was:

    rpm -q --scripts MySQL-server-advanced-gpl | less

    and out popped the post-install script. Why did I need to do this? Part of the work I’m doing with bcfg2 is to automate the installation and configuration of my MySQL servers. My requirements state that I need to be able to run more than one instance on the same server. Unfortunately, the default installation of MySQL-server-advanced-gpl sets up a basic db instance for you, right smack in the directory I’m working in. The problem is, this walks all over the directory layout I need for my servers.

    At least I know what’s causing the default install to occur. Now, to go make bcfg2 do something about it.
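
The same query also works against a package file that has not been installed yet, by adding rpm’s `-p` option. A small sketch that handles both cases follows; `show_rpm_scripts` is a made-up helper name, and only the `rpm -q --scripts` and `rpm -qp --scripts` invocations are real rpm usage.

```shell
# Sketch: show the scriptlets for either an installed package name or an
# .rpm file on disk. show_rpm_scripts is a hypothetical helper name.
show_rpm_scripts() {
    if [ -f "$1" ]; then
        # -p queries a package file instead of the installed RPM database.
        rpm -qp --scripts "$1"
    else
        rpm -q --scripts "$1"
    fi
}
```

For example, `show_rpm_scripts MySQL-server-advanced-gpl | less` reproduces the command above. If only the post-install scriptlet is wanted, `rpm -q --queryformat '%{POSTIN}\n'` followed by the package name should print just that part.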