
Less, More, and None

I’ve been thinking about positive life choices lately. Trying to improve my outlook, really. More of the good things, less of the time-sucking things, none of the time-wasting stuff. I ran across this format recently and thought it would be a good way to keep track of things I wanted to accomplish throughout the year. I’m starting a bit late, but hey … better late than never.

This is about bringing priority and focus to my life.

➖ Less

  • time in front of a television
  • YouTube channel subscriptions
  • shopping for physical books (kindle first, if I must)
  • focusing on acquiring stuff for hobbies instead of doing the hobby
  • Facebook

➕ More

  • exercise starting with walking and weight lifting
  • hiking with my wife
  • camping and fishing
  • working on the radio
  • reading through the stack of backlogged books
  • woodworking
  • getting out to see friends
  • keeping the office tidy
  • personal time for introspection and note taking
  • writing
  • time sitting on the back porch to relax
  • researching places to live
  • photography (and get those photos printed)
  • self-awareness and taking more responsibility for my actions
  • playing with the cats
  • getting dirty in the garden
  • saving for the future
  • giving to the non-profits that mean something to me
  • learning to better engage with others

🚫 None

  • paying attention to politics (bye bye /r/politics)
  • paying attention to evening news
  • getting sucked into Netflix binge watching

Inspired by a post from Jacoby Young

Things I wished I knew before archiving data in Hadoop HDFS

I was recently in a good discussion about sizing a Hadoop HDFS cluster for doing long-term archiving of data. Hadoop seems like a great fit for this, right? It has easy expansion of data storage as your data footprint grows, it is fault tolerant and somewhat self-recovering, and generally just works. From a high-level […]

Hive Metastore and Impala UnknownHostException during table creation

Like many environments, we run a few long-lived Hadoop clusters in our lab for doing testing of various feature and functionality scenarios before they are placed in a production context. These are used as big sandboxes for our team to play with and do development upon. Today, we encountered a strange Hive Metastore error on one environment […]

Docker for Mac Tips for Troubleshooting Container Problems

I’ve used Docker for Mac since the Beta release opened to wider audiences. With the rapid prototyping I’m doing on Hadoop environments, I’m finding it great for providing quick environments to test out theories. Problem: How do you access the Docker for Mac VM? The problem with a black box is not being able to easily get inside […]

A followup on the strange stunnel behavior in docker

This is a quick followup to the strange stunnel behavior I was seeing that I wrote about previously. After discussing the issue with a colleague, we came up with two different solutions to this problem with stunnel writing to /dev/console inside a docker container. Indirect route with docker exec In his method, we invoke the container and […]

Strange stunnel debug logging behavior in docker

I’ve been playing with xenserver lately to quickly model small Hadoop clusters. One of the frustrating things about xenserver is the lack of good graphical user interfaces that provide for a minimal amount of automation. This means I’m frequently dropping to the command line on the xenserver master and running the xe tools by hand […]

Containing a snakebite with python pex

I’ve used Hadoop for several years now. One of the most frustrating parts of using Hadoop is the time it takes to start up the Java HDFS client to run simple tasks. Even listing a directory can take several seconds because of the startup cost associated with launching the JVM. In 2013, Spotify open sourced a pure […]

Creating RPMS with fpm and docker

Now and again, you need to create RPMS of third-party tools, such as Python libraries or Ruby gems.  The most effective (and best!) way to do this is with fpm. Historically, I have built RPMS using a few dedicated virtual machines for CentOS5 and CentOS6-specific builds. These virtual machines have gotten crufty with all the various libraries installed. Wouldn’t it be nice to have […]

Hadoop distcp network failures with WebHDFS

… or why do I get “Cannot assign requested address” errors?! At some point or another, every Hadoop Operations person will have to copy large amounts of data from one cluster to another. This is a trivial task thanks to hadoop distcp. But it is not without its quirks and issues. I will discuss a […]

Google Chrome, SPNEGO, and WebHDFS on Hadoop

I’ve previously noted that we’re using Kerberos to handle the authentication on our Hadoop clusters. One of the features that we previously lacked because of configuration issues was the ability to use WebHDFS to browse around the cluster. With our latest cluster, we figured out the right incantation of Kerberos and SPNEGO configurations […]