Improving Hadoop datanode disk fault tolerance

By design, Hadoop is meant to tolerate failures gracefully. One of those failure modes is an HDFS datanode going offline because it lost a data disk. By default, the datanode process will shut itself down after a single disk failure. When this happens, the HDFS namenode discovers that […]
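For context, the number of data-disk failures a datanode will ride out is controlled by a setting in hdfs-site.xml; the value below is only an example.

```xml
<!-- hdfs-site.xml: let the datanode keep running after some volume failures.
     The default of 0 means any single failed disk shuts the datanode down. -->
<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <value>2</value>
</property>
```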

Rebooting Linux temporarily loses (some) limits.conf settings

Like any well-managed environment, you probably have custom settings in your /etc/security/limits.conf because of application-specific requirements. Maybe you have to allow more open files. Maybe you have to cap the memory available to a process. Or maybe you just like being ultra-hardcore in defining exactly what a user can do. […]
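A minimal sketch of the kind of entries in question; the user names and values here are made up for illustration.

```
# /etc/security/limits.conf
# <domain>  <type>  <item>   <value>
appuser     soft    nofile   8192       # more open file descriptors
appuser     hard    nofile   16384
@devs       hard    as       1048576    # address-space cap, in KB
```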

Augeas made my grub.conf too quiet!

A reader contacted me after working through the examples in my previous post on Augeas. He was having a difficult time figuring out how to add a valueless key back into the kernel command line, the opposite of what I was doing with quiet and rhgb: “Many thanks. I have been pounding […]
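Assuming the stock Grub lens, adding a valueless kernel argument such as quiet can be sketched with augtool's `clear` command, which creates a node with no value (the grub.conf path below is an assumption about the reader's layout):

```shell
# Create the valueless "quiet" argument on the first kernel line,
# then write the change back to grub.conf.
augtool <<'EOF'
clear /files/boot/grub/grub.conf/title[1]/kernel/quiet
save
EOF
```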

Using augeas and puppet to modify grub.conf

In my environment, we rely heavily on Puppet for large-scale automation and updating of our various systems. One task we perform infrequently is modifying grub.conf across many (or even all) systems to apply the same types of changes. In Puppet, there are several ways you can do […]
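As a hedged sketch of the approach, Puppet's built-in `augeas` resource type can drive this kind of edit; the resource title, context, and kernel arguments below are illustrative, not the post's actual manifest.

```puppet
# Remove two arguments from the first kernel line of grub.conf.
# Puppet only applies the change if the nodes actually exist.
augeas { 'grub-kernel-args':
  context => '/files/boot/grub/grub.conf',
  changes => [
    'rm title[1]/kernel/rhgb',
    'rm title[1]/kernel/quiet',
  ],
}
```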

Running Hadoop data nodes from USB thumb drives?

I received an interesting question today from a reader regarding the use of USB thumb drives for the OS drives in Hadoop datanodes. Have you ever put the OS for a Hadoop node on a USB thumb drive? (or considered it) I have a smaller 8 node cluster and that would free up one of […]

Pig ‘local’ mode fails when Kerberos auth is enabled.

I ran across this interesting Kerberos authentication bug today on Cloudera’s CDH4. It appears to affect all versions of Pig, but only when running in local mode. I want to run Pig in local mode, which means Pig runs everything it needs for the MapReduce job on your local machine without having […]
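For reference, local mode is selected with Pig's `-x` flag (the script name here is a placeholder):

```shell
# Local mode: Pig runs the job in a single JVM on this machine,
# reading from the local filesystem instead of HDFS.
pig -x local myscript.pig
```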

Followup on Cloudera HUE’s Kerberos kt_renewer

Just a short followup about the HUE kt_renewer issue I discovered. It turns out that the issue was me and not HUE. The fix turned out to be pretty simple once I saw the clue in a related issue. It seems like Cloudera Manager had the same issue. The problem ended up being a missing […]

Kerberos kt_renewer failures with HUE on CDH4

First off, I’m not exactly sure if this is a Hadoop User Experience (HUE) issue or if this is a broken setup in my Kerberos environment. I have a thread open on the HUE users list, but haven’t had any followup. I’ve just fired up HUE for the first time to talk with a Kerberos-enabled […]

Mass-gzip files inside HDFS using the power of Hadoop

I have a bunch of text files sitting in HDFS that I need to compress. It’s on the order of several hundred files comprising several hundred gigabytes of data. There are several ways to do this. I could individually copy down each file, compress it, and re-upload it to HDFS. This takes an excessively long […]
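One way to make the cluster do the compression in parallel is a map-only Hadoop Streaming job with gzip output compression enabled. This is a sketch under CDH4-era property names; the jar path and HDFS paths are examples, not the post's actual commands.

```shell
# Map-only streaming job: each mapper cats its input split back out,
# and the output is written gzip-compressed by the framework.
hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming.jar \
  -D mapred.output.compress=true \
  -D mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec \
  -D mapred.reduce.tasks=0 \
  -input  /data/raw \
  -output /data/gz \
  -mapper /bin/cat
```

Note that the results land as part-*.gz files in the output directory rather than keeping the original file names.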

Using cobbler with a fast file system creation snippet for Kickstart %post install of Hadoop nodes

I run Hadoop servers with twelve 2 TB hard drives in them. One of the bottlenecks occurs during kickstart, when anaconda creates the filesystems. Previously, I had a specific partition configuration that was brought in during %pre, but this caused the filesystem-formatting section of kickstart to take several hours […]
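The general idea can be sketched as a %post snippet that formats the data disks in parallel instead of letting anaconda do them serially; the device names and filesystem type are assumptions, not the snippet from the post.

```shell
# %post sketch: background one mkfs per data disk, then wait for all of them.
for dev in /dev/sd{b..m}1; do
    mkfs.ext4 -q "$dev" &
done
wait   # block until every background mkfs finishes
```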