Have you ever been confused about the different ways to handle missing data in the Ruby language? I know I have, and I’m sure I’m not alone in that. The options Ruby offers come in the form of several methods: “present?”, “blank?”, “nil?”, and “empty?”. They are all somewhat related, since each of them checks for the absence of data in some way. But they are also all different, in the sense that they check for different types of absence, so to speak. Besides, there’s also a question of availability: not all of them are available to the same objects.
Sound confusing? Well, you’ve come to the right place. This post is going to clear up that confusion. We’ll start by explaining the general need for ways in which to model absence not just in Ruby, but in programming in general. Then we’ll briefly cover some of Ruby’s characteristics because such knowledge is important for understanding some of the choices made by the designers of the language.
Finally, we’ll explain how to deal with missing/unknown data in Ruby, covering each one of the methods in detail, including examples. Let’s get started.
We’re going to dive into the different ways in which the Ruby programming language models absence. But before we do that, let’s cover the need for absence modeling with programming in general.
I’ve seen some people define programming as the manipulation of data. And I’d say most people think of data as “a thing”, rather than “not a thing.” Or better yet, they think of data as the digital representation of a number of things—that can range from physical objects to events, quantities, or even very abstract concepts. However, such a definition of data is inaccurate or, at least, incomplete. It’s missing something—the very concept of “missing something.”
This might be surprising for beginners, but in software development, we devote a non-trivial amount of energy to handling the concept of absence. It’s all too common to have a piece of data that is missing, unknown (or even unknowable), or generally unavailable for some reason.
Ruby, of course, isn’t the only language that needs the concept of “nothing.” Far from it. If you’ve ever coded in PHP, then you’ve certainly had to deal with NULL (and the same goes for C#, Java, and pretty much every mainstream modern programming language). In the functional paradigm, it’s common to use approaches such as optional types. Procedural languages could use return codes to signal that a given piece of data is missing.
Why is the absence of data such an important concept in programming? Well, for the same reason it’s important in life in general. We—as in the human species—don’t know a lot of things. In math, you have the concept of variables to represent unknown values. The concept of absence is present (no pun intended) even in music. Those of you who are musicians and can read sheet music know there are symbols to denote pauses, i.e. absence of sound.
So, it shouldn’t come as a surprise that, in programming, we also often have to deal with the concept of missing data. As I see it, the need for “nothingness” comes in three categories. First, we have the situation of unavailable data. We don’t know the answer yet, but we might know it in the future. An example of that would be data that is still not ready due to some time-sensitive constraint. For instance, sales reports of a given month, when the month isn’t over yet.
Another category of missing/invalid data would be data representing events that haven’t occurred yet and might never occur. For instance, think about an “employees” table in a relational database. It probably makes sense for it to have a column called “termination date”, which might never come.
Finally, there’s also the case of absent data used to indicate that the question itself is not valid. If someone asks me whether the current king of Brazil is bald or not…I’d think such a person is mocking me. Or maybe they’re just terribly misinformed. Anyway, there is no king of Brazil, which means that the question itself isn’t valid. In database design, for instance, we’d probably use “null” to model that answer.
In the previous section, we’ve covered the concept of “nothingness” or absence of data in programming, drawing comparisons between the need for representing absence in programming and in other areas. Then, we’ve shown you the different use cases for representing absence.
With that out of the way, now it’s time to start talking about Ruby specifically. This language handles missing data in several different ways. It’s time for you to learn about them so you can employ the correct tool for the correct situation.
The Ruby programming language first appeared in 1995. Yukihiro “Matz” Matsumoto created the language drawing inspiration from some of his favorite languages, such as Perl, Smalltalk, Eiffel, Ada, and Lisp.
Ruby, like most mainstream programming languages, supports developing in several paradigms, effectively making it a multi-paradigm language. But obviously, Ruby’s focus is the object-oriented paradigm. People like to say that in Ruby everything is an object. That’s not quite accurate, but for the sake of our post, let’s consider that it is.
What is the implication of “everything is an object”? Well, to accomplish things in Ruby, you call methods on objects. Or, to put it another way, you pass messages to them. So, consider this trivial line of code:
result = 2 + 2
Even someone who’s never written a line of Ruby can understand the line above. The value of ‘result’ is obvious. But where is the object orientation? Where are the methods? The answer is simple. The line above is nothing but syntactic sugar. What’s really happening is this:
result = 2.+(2)
You’re calling the ‘+’ method on the ‘2’ object, and passing the other ‘2’ as an argument. So, even when expressing such a simple idea, Ruby still keeps loyal to its object-oriented ideas.
This Ruby characteristic of almost everything being an object results in interesting—and often surprising—choices when modeling some concepts. For instance, in Ruby, even “true” and “false” are objects! They’re the single instances (singletons) of the “TrueClass” and “FalseClass”, respectively.
So, it shouldn’t come as a surprise that in Ruby, even absence is modeled as some kind of object, because that would be simply staying true to the language’s philosophy.
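To make the “everything is an object” idea a bit more concrete, here is a small sketch you can paste into irb. It only uses core Ruby; nothing in it is specific to this post.

```ruby
# Operators are ordinary method calls on the receiver.
puts 2 + 2                  # the sugared form
puts 2.+(2)                 # the explicit method call
puts 2.public_send(:+, 2)   # the "message passing" view of the same call

# Integers know which methods they respond to, like any other object.
puts 2.respond_to?(:+)      # true

# true and false are objects too, each the only instance of its class.
puts true.class             # TrueClass
puts false.class            # FalseClass
```

All three additions produce the same result, because they are the same method call written three ways.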
Now we’re going to cover each one of the four main methods of checking for missing data in the Ruby language. We’ll start with the most general and universally available one, and progress towards the more specific.
Let’s start out with “Nil” since it’s the most common and easy-to-understand way of representing nothingness in Ruby. In terms of what it means, Nil is exactly the same thing as null in other languages. So, what is null?
Null, as it exists in many languages, is a sort of value that represents “nothing”. The way it’s defined can and does vary a lot between languages, but the gist of it is the same. Null represents the absence of something. It can be returned from functions, and also passed as an argument. But there’s something problematic about that construct.
Most of the time, when some consumer code receives null/nil, things go sour. Even though this special value can be gladly passed around as if it was a valid value, it isn’t. It doesn’t answer to the same contract as the valid object it’s impersonating. Let’s see a quick example. Suppose we have a “Calculator” class, which exposes methods to perform a number of different calculations. I’ll omit the code for this class, for brevity’s sake. Then, consider the following code:
def printFactorial(calculator, number)
  puts calculator.factorial(number)
end
If you then call this method, passing an instance of the Calculator class along with a number, it should correctly calculate its factorial. So, consider the following line:
printFactorial(Calculator.new, 5)
You’ve certainly figured out that the line above would print 120. But what about the following line:
printFactorial(nil, 5)
Were Ruby a compiled language, the compiler wouldn’t bat an eye at the line above. But at runtime, an exception would be raised, with the following message:
undefined method `factorial' for nil:NilClass
Ruby being a dynamically typed language means that, in our example, we can pass all sorts of things that aren’t a calculator to the printFactorial method. There’s nothing preventing us from passing a string, a number, a date or any other object to the method. As long as the object doesn’t have a “factorial” method, we get the same type of error. And that would most likely be a bug, resulting from a mistake by the developer.
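If you want to reproduce the failure yourself, the sketch below might help. Keep in mind that the post omits the real Calculator class, so the one here is a hypothetical stand-in with a straightforward factorial implementation.

```ruby
# Hypothetical stand-in for the Calculator class the post leaves out.
class Calculator
  def factorial(number)
    (1..number).reduce(1, :*)
  end
end

def printFactorial(calculator, number)
  puts calculator.factorial(number)
end

printFactorial(Calculator.new, 5)   # prints 120

# Anything that lacks a `factorial` method fails the same way, not just nil.
[nil, "not a calculator", 42].each do |impostor|
  begin
    printFactorial(impostor, 5)
  rescue NoMethodError => e
    puts "#{impostor.inspect} failed with #{e.class}"
  end
end
```

The rescue is only there so the script keeps running and shows all three failures; in real code you would rather prevent the bad call than catch it.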
How do we handle the possibility of a received parameter being null? Glad you’ve asked. To deal with that we should use the “nil?” method. Take a look at the updated example:
def printFactorial(calculator, number)
  if calculator.nil?
    "Invalid calculator"
  else
    calculator.factorial(number)
  end
end
The “nil?” method returns true only for the nil object; every other object returns false. That makes it the natural way to test for nil while staying true to the object-oriented values of the Ruby language. Sure, nothing prevents you from performing a simple comparison to nil:
def printFactorial(calculator, number)
  if calculator == nil
    "Invalid calculator"
  else
    calculator.factorial(number)
  end
end
You could also use the ternary operator to shorten the code:
def printFactorial(calculator, number)
  calculator == nil ? "Invalid calculator" : calculator.factorial(number)
end
We could go even further and apply the safe-navigation operator to simplify—or not, depending on your personal tastes—the code even more. Let’s see how it goes:
def printFactorial(calculator, number)
  calculator&.factorial(number) || "Invalid calculator"
end
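Here is a runnable sketch of the safe-navigation version, in case you want to see what “&.” actually does. The Calculator class is again a hypothetical stand-in, and the method name factorial_or_message is made up for this example.

```ruby
# Hypothetical stand-in for the Calculator class the post leaves out.
class Calculator
  def factorial(number)
    (1..number).reduce(1, :*)
  end
end

# Hypothetical name; same body as the safe-navigation example above.
def factorial_or_message(calculator, number)
  calculator&.factorial(number) || "Invalid calculator"
end

puts factorial_or_message(Calculator.new, 5)   # 120
puts factorial_or_message(nil, 5)              # Invalid calculator

# On its own, `&.` skips the call and evaluates to nil when the receiver is nil.
p nil&.length       # nil
p "ruby"&.length    # 4
```

One thing to watch with the “||” fallback: it kicks in for any nil or false result, not only for a nil receiver. That’s harmless here because a factorial is always a number, but it could hide a legitimate nil or false returned by some other method.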
However, in terms of how it’s implemented, nil is fundamentally different than in other languages. In Ruby, nil is—you’ve guessed it—an object. It’s the single instance of the NilClass class. Since nil in Ruby is just an object like virtually anything else, this means that handling it is not a special case. You’re just calling methods on an object (or passing messages to it, for you Smalltalkers out there) and that’s it.
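A quick irb-style sketch shows what “nil is just an object” means in practice. It relies only on core Ruby methods.

```ruby
# nil has a class and responds to methods like any other object.
puts nil.class          # NilClass
puts nil.nil?           # true
p nil.to_s              # ""
p nil.to_a              # []
p nil.inspect           # "nil"

# Every object answers nil?, and only nil answers true.
[false, 0, "", []].each do |value|
  puts "#{value.inspect}.nil? => #{value.nil?}"
end
```

Notice that false, zero, the empty string and the empty array are all “not nil”. Telling those cases apart is exactly what the remaining methods are for.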
In the previous section, we’ve covered the “nil” construct in Ruby. You can think of nil as the most general way of representing absence in Ruby. Anything can potentially be nil, and every object can answer to the “nil?” method. Now let’s cover a much more specific way of representing a lack of data, and that is emptiness.
Everything can be nil, but only collections can be empty. The “empty?” method can be used with objects such as Array, Set, and Hash, and returns true when the collection doesn’t have any elements. Let’s see some examples:
(1..10).select{|x|x>10}.empty?
=> true
["ruby", "python", "java"].empty?
=> false
Set.new.empty?
=> true
However, the “empty?” method is not part of Enumerable, which makes sense if you consider that not every object that enumerates knows whether it has any values to enumerate. So the following line, which tries to use the method on an instance of Range, doesn’t work:
(1..10).empty?
In order for it to work, we could first convert the range to an array and only then use the “empty?” method:
(1..10).to_a.empty?
And in this case, the answer will obviously be false.
The “empty?” method can also be used for strings, which makes perfect sense since strings can be thought of as a collection of characters. The method will not work for numbers or dates, though.
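The sketch below shows where “empty?” is and isn’t available in plain Ruby (no Rails loaded). The helper nothing_to_process? is a hypothetical name, included only to show the nil-then-empty check you often end up writing by hand.

```ruby
require "set"

puts "".empty?              # true
puts " ".empty?             # false: a space is still a character
puts({}.empty?)             # true
puts Set.new([1]).empty?    # false

# Numbers simply don't have the method.
puts 42.respond_to?(:empty?)    # false
begin
  42.empty?
rescue NoMethodError => e
  puts "42.empty? raised #{e.class}"
end

# nil doesn't have it either, so a collection that might be missing
# needs two checks. (Hypothetical helper name.)
def nothing_to_process?(items)
  items.nil? || items.empty?
end

puts nothing_to_process?(nil)     # true
puts nothing_to_process?([])      # true
puts nothing_to_process?([1, 2])  # false
```

The order of the two checks matters: calling “empty?” first would raise on nil.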
Let’s now cover the “blank?” method. There’s one important difference between this method—and the next one as well—and those that came before. The first two methods are native to the Ruby language. The “blank?” method, on the other hand, was introduced by the Ruby on Rails framework. With that out of the way, let’s see what this method is all about.
This method is interesting because it checks for several properties of a given object at once. This is a very common use case. Suppose you have a method that is supposed to take an array as a parameter. You’ll often want to check not only that it isn’t nil, but also that it’s not empty. And since a string can be considered a collection of characters, the same reasoning applies here. A similar argument could be made for a boolean value. You might want to check that it isn’t nil and it isn’t false. Yes, you could write several checks, but that might get really old, pretty fast. Enter “blank?”.
The Rails framework offers this method as a quicker way of performing several checks related to the validity of an object. The method will return true for a given object if it’s nil, false, empty or a whitespace string. Check out the following examples:
[].blank?
=> true
" ".blank?
=> true
false.blank?
=> true
[1, 2, 3, 4, 5].blank?
=> false
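If you don’t have Rails at hand, the plain-Ruby sketch below approximates the rules just described. To be clear, blank_like? is a hypothetical helper written for this post, not the code Rails ships, and the average method is only there to show a typical guard.

```ruby
# Hypothetical approximation of the rules described above:
# nil, false, empty collections and whitespace-only strings count as blank.
def blank_like?(value)
  case value
  when nil, false
    true
  when String
    value.match?(/\A[[:space:]]*\z/)   # empty or whitespace only
  else
    value.respond_to?(:empty?) ? value.empty? : false
  end
end

puts blank_like?([])                # true
puts blank_like?(" ")               # true
puts blank_like?(false)             # true
puts blank_like?([1, 2, 3, 4, 5])   # false
puts blank_like?(0)                 # false: numbers have no empty?

# One check replaces the separate nil and empty tests.
def average(numbers)
  return nil if blank_like?(numbers)
  numbers.sum.to_f / numbers.size
end

p average(nil)      # nil
p average([])       # nil
p average([2, 4])   # 3.0
```

In a Rails application you would call “blank?” directly on the object instead of going through a helper like this.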
Finally, we get to the “present?” method. This one, like the previous method, isn’t native to the Ruby language itself, but provided by the Rails framework. This method is, by far, the easiest of all to understand. Here it goes: “present?” is just the negation of “blank?”. That’s it. So, let’s revisit the example from the previous section, but this time using “present?”, instead of “blank?”:
[].present?
=> false
" ".present?
=> false
false.present?
=> false
[1, 2, 3, 4, 5].present?
=> true
Some people might wonder whether it’s really necessary to have a method that is just a negation of another one. Well, personally, I always find conditions written in the positive to be way easier to understand than the ones written in the negative. So, at least to me, asking if something is present is clearer than asking if it’s absent. Your mileage may vary, of course, which isn’t a problem, since you have both options at your disposal.
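To see all four checks side by side, here is a small plain-Ruby comparison script. The helpers blank_like?, present_like? and empty_or_na are hypothetical approximations written for this post so the script runs without Rails; they are not the framework’s own code.

```ruby
# Hypothetical plain-Ruby approximations of the Rails helpers.
def blank_like?(value)
  return true if value.nil? || value == false
  return value.match?(/\A[[:space:]]*\z/) if value.is_a?(String)
  value.respond_to?(:empty?) ? value.empty? : false
end

# "present" is defined purely as the negation of "blank".
def present_like?(value)
  !blank_like?(value)
end

# empty? is not defined everywhere, so report "n/a" where it is missing.
def empty_or_na(value)
  value.respond_to?(:empty?) ? value.empty? : "n/a"
end

[nil, false, "", " ", [], [nil], 0].each do |value|
  puts format("%-6s nil?=%-5s empty?=%-5s blank=%-5s present=%-5s",
              value.inspect, value.nil?, empty_or_na(value),
              blank_like?(value), present_like?(value))
end
```

Running it makes the differences easy to spot: only nil is nil, only objects that define “empty?” can be empty, and the blank and present columns are always opposites.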
Nothing is definitely something. At least, when it comes to programming. The amount of effort that is put into figuring out clever ways to deal with absent or missing data in programming might seem weird, especially to beginners. But as you’ve just seen, there are valid use cases for that. The Ruby programming language offers not only one, but many ways to handle “nothing.” In fact, things can even get confusing with so many similar methods. This post was our attempt to clear up the confusion.
We started out by explaining the need to express missing or unknown data, not only in Ruby but in programming in general, and even in other areas. Then we proceeded to cover some characteristics of Ruby that stem from its object-oriented philosophy. With those principles laid out, we finally explained in detail the four main methods of checking for the absence or presence of data in Ruby.
While we hope the knowledge presented in this post will be useful to you, it’s likely not enough. If you want to write amazing applications, in Ruby or other languages, you’ll have to employ a number of tools and techniques, from unit testing to a solid logging approach, and from sound exception handling to monitoring tools like Retrace. You can learn about all of that here, on the Stackify blog.
Thanks for reading, until next time.