Using JavaScript to Learn about JavaScript

My friend asked me to explain how object-oriented programming works in JavaScript. I decided that if I was to do that, the first thing for me to do would be to, uh, learn how it works, and then figure out where to go from there. Part two would probably involve writing some sort of "computer code" as well, but I'm not afraid of that; I'm a Computer Programmer, you see.

Each of the last five times in five years that I did any reading on OOP in JavaScript, I remember someone saying, "remember, JS OOP is /prototype-based/". Oooookay. What I couldn't help noticing was that every other upstanding OO language (C++, Java), and even the more subversive ones (Python), has a keyword to announce "I'm defining a class here". Usually that keyword is, surprise, "class". But JS accomplishes all (well, most) of its class defining by making the word "function" do double duty.

So, how do you tell, by looking at a function definition, whether you're really looking at a class declaration, or just another function from the pumpkin patch? This is what I needed to find out, if I was to get this whole mess straightened out in my head. So I set about writing myself a few examples of classes... or functions... well, bits of code, with various permutations of the features that characterize classes and functions in JS. Along the way, I was able to use a few neat properties of JavaScript not found in conventional languages, properties that I do understand pretty well. These things are tangential to the topic, but that's exactly what makes them good real-world examples of those JavaScript features.

Here We Go

From my naive starting vantage point, it seemed as if what set a function apart from a class declaration was the existence of assignments to members of "this". See, your typical function looks something like this:

  
function AddTwoToMe(x) {
	 return 2 + x;
}

while your typical sample object constructor looks like this:

function Employee(in_name, in_salary) {
	 this.name = in_name;
	 this.salary = in_salary;
}

Anybody can look at these two and clearly tell that the first is supposed to operate as a function, as understood in C, Pascal, Lisp, Python, etc., while the second has all the hallmarks of a constructor and OOP (namely, assignments to the "this" object). But let's step back a moment. You and I know which is which here because we understand how human programmers typically write code. The first function returns something; why would anybody use that to create an object? We "know" what the writer "meant". Likewise, we can see that nothing is happening in the second function, except for some assignments to a variable that only has meaning within the function. So we can see that the author of the function had an object constructor in mind. But ultimately, we work with dumb computers. Computers only do what you tell them. What happens if you take the first function, the one which actually does something, and add a "this" assignment? Does it magically change the function to a constructor? Does it lose its ability to simply return the input parameter incremented by 2? I knew this question had to be answered before I could get anywhere in my understanding of what JS's object system is all about.

So, I needed: a function which simply returns a value; a function which simply makes some assignments; and a third function which does a little of both. And I wanted to try to use each of these in the contexts where a simple function call was expected, and where an object is expected.

Here's a basic model of what these functions might look like:

function returnAValue () {
        return 5;
}

function initAnObject() {
        this.foo = 6;
}

function initAnObjectAndReturnAValue() {
        this.foo = 7;
        return 8;
}

OK, now suppose I have these three functions. How do I need to poke and prod them to figure out what they do and don't do? Well, these -- "things" can be called as functions, something like:

    var a = returnAValue();

But objects, of the OOP type, are created, it turns out, by something like:

    var o = new initAnObject();

(Admittedly, as I notice this, I'm starting to get an inkling of where the difference lies. But I've got a story to tell, so we'll just pretend we didn't see anything, and forge ahead blindly.)

So I have these three functions, and I want to call each of them with a simple function-style call, and with that "new" keyword. Trouble is, I hate, fscking hate to type (never mind this blog entry, whose word count even as I type stands at 738 -- I'm not that good with conciseness in expository prose), and nothing would take the wind out of my sails faster than having to type

   x = A();
   doSomethingWithX();
   x = new A();
   doSomethingWithX();
   x = B();
   doSomethingWithX();
   :
   :
   :

for functions A, B, and C. Life is just too damned short to go cutting and multiply pasting chunks of code, hand-editing a few variable names, constants, or whatever in each of the pasted copies. Tedious. How can we avoid doing this here?

Well, notice that we have three functions to do stuff with, and two things that we want to do with each function. Since three is greater than two, we want to focus on abstracting the functions, and iterating through them somehow.

One of my two favorite things in most scripting languages, and the thing I probably miss most in C and C++, is "for ... in" loops -- the ones where you can put arbitrary objects in the list. In Bourne-style shells, you can only do this with text strings, but that's usually all you need, and it looks like this:

for filename in groceries.txt to-do.txt expenses.odc
do
	if ! [ -f "$filename" ]
	then
		echo "$filename" has gone missing\!
	fi
done

Just list the things you want to look at right on the line, and the for loop will do stuff with each of them. Can we do that in JS with each of our three functions? You know, like

     for (x in returnAValue, initAnObject, initAnObjectAndReturnAValue) {
	 doStuff();
     }

The answer, at least for current versions of JS, is no. But since functions are objects, we could load them up into an array, even using "array literal" syntax:

    fArray = [returnAValue, initAnObject, initAnObjectAndReturnAValue];

Then we could do:

     for (i = 0; i < fArray.length; i++) {
	 x = fArray[i]();
	 doSomethingWithX();
	 x = new fArray[i]();
	 doSomethingWithX();
     }

Well this is pretty nice. If you've never seen this done before, the syntax for calling the function may seem a little unsettling at first. Most Java programmers, I guess, have never seen the brackets [] followed immediately by parentheses (). But it makes sense if you think about it: when i is 0, fArray[i] is returnAValue (a reference to the function as an object), while fArray[i]() is returnAValue() (that function being executed).
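If the brackets-then-parentheses business still looks odd, here's a minimal sketch you can run on its own. The functions double and square and the array ops are made-up names for this illustration, not part of the experiment:

```javascript
// Two throwaway functions (hypothetical names, used only for this sketch).
function double(n) { return n * 2; }
function square(n) { return n * n; }

var ops = [double, square];

// ops[0] is a reference to the function object; nothing runs yet.
var ref = ops[0];

// ops[i](4) looks the function up in the array and then calls it.
var results = [];
for (var i = 0; i < ops.length; i++) {
    results.push(ops[i](4));
}

console.log(ref === double); // true
console.log(results);        // [ 8, 16 ]
```

The same reading applies to "new fArray[i]()": the lookup happens first, and "new" is then applied to whatever function came out of the array.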

But there are a couple of problems with this. First, there's something less-than-satisfying about defining all your functions, then having to put them all in a list. It's not a huge deal, but there does seem to be some redundancy there, as well as a potential maintenance problem: every time you add a new function that you want to do stuff with, you have to define it first, then find the array and add the name of the function to the array.

The second problem is more serious for our particular exercise, as will soon become apparent. Let's start looking at what we're really going to end up putting inside that for loop. Up to now, I've been using "doSomethingWithX()" as a dummy stand-in for something as-yet unnamed. What do we really want to do in this loop? Poke and prod the results of these "function calls". We want to see the results and, very importantly, to know which functions and invocation styles produced which results. So, to accomplish that without the benefit of the array, I might have something like this:

x = initAnObjectAndReturnAValue();
alert("Return value of initAnObjectAndReturnAValue() is " + x);
x = new initAnObjectAndReturnAValue();
if (x) {
   alert ("got an object out of initAnObjectAndReturnAValue");
}

But when we try to put these functions in a list, we can't do this! Inside the for loop, the functions are dereferenced by an index into the array, and there's no way to take the element in the array and reverse the operation to find the name of the original function*. We have to create some sort of association between each function and its name if we want to iterate through them and still have access to the names from inside the loop.

This is a place where we can use the other of my two favorite features of most modern scripting languages: that container known variously as hash table, dictionary, and associative array. Let's replace our numerically-indexed array with a JS associative array:

functionlist = {"returnAValue":returnAValue,
	     "initAnObject":initAnObject,
	     "initAnObjectAndReturnAValue":initAnObjectAndReturnAValue}

With this hash/table/whatever-you-want-to-call-it, we can also use a for loop, and now we have access to the function name inside the loop:

for (name in functionlist) {
     x = functionlist[name]();
     alert("Return value of " + name + " is " + x);
     x = new functionlist[name]();
     if (x) {
         alert("got an object out of " + name);
     }
}

OK, now we've found a good workaround for the problem of not having access to the names. That was one of the problems with the plain array version. The other one -- the "ugliness factor" -- seems not only to remain but to have gotten worse. I still have to write each function name not twice, but three times now (though there's no need to use the exact function names for the hash keys if you can get away with meaningful abbreviations), and if I add a new function, I have to remember to add it to the hash. Isn't there some way to do this all at once?

Why, yes, there is! It turns out there's no need to give a function a name in JavaScript. "Anonymous functions" can, syntactically, be used anywhere that any other object can go: as an argument to another function, as an rvalue in an assignment, as a member of an array, etc. Just to warm you up to this idea, let's take a look at an alternate way of defining one of our functions.

function returnAValue () {
        return 5;
}

can alternately be defined as

var returnAValue = function () {
	return 5;
}
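The other spots mentioned above, an argument to another function and a member of an array, work the same way. Here's a small sketch; applyTwice and the variable names are hypothetical, invented just for this illustration:

```javascript
// applyTwice is a made-up helper: it calls f on x, then calls f on the result.
function applyTwice(f, x) {
    return f(f(x));
}

// An anonymous function passed directly as an argument.
var nine = applyTwice(function (n) { return n + 2; }, 5);

// An anonymous function stored as an array member, then called in place.
var fs = [function () { return "hello"; }];
var greeting = fs[0]();

console.log(nine);     // 9
console.log(greeting); // hello
```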

So this shows you what an anonymous function looks like; why don't we try stuffing a bunch of them as values into a hash?

functionlist = {"returnAValue":function () {
			return 5;
			},

             "initAnObject":function () {
		       this.foo = 6;
		       },

             "initAnObjectAndReturnAValue":function () {
		       this.foo = 7;
		       return 8;
		       }
};

OK, this is it. We've reached nirvana. We've achieved all our goals for eliminating redundancy and enhancing maintainability: all function names are only mentioned once, and we have a convenient way to call them all in turn in a loop. We can use the same calling code that we used in the first example with the hash table; we have access to the function names (in some sense, anyway -- in fact these functions have no names in the traditional sense, but the important thing is that we have names for them).

OK, now where were we? Learning about OOP coding in JS. Oh, but wait, one more nicety. This one's real quick and short, I promise, as long as you know what exception handling is. So, debugging JavaScript can be a real exercise in frustration. You do something wrong, refer to an object property that's undefined or something, and the script just stops. You have no idea where or why. You can litter your code with debugging statements which usually throw up an alert every time you "got here", and then you have to take them all out at production time. Yes, there are some tools to make this less painful, but you can smooth things out a hell of a lot just by catching exceptions. For example, this

     x = functionlist[name]();
     alert("Return value of " + name + " is " + x);

becomes this

    try {
	x = functionlist[name]();
	alert("Return value of " + name + " is " + x);
    }
    catch(ex) {
	alert("Couldn't call " + name + " as a func:" + ex);
    }

so by just coding this once, you either get the desired result, or you get a message with decent info on where your code failed, but the script doesn't have to die at that point.
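To see the "script doesn't have to die" part in action, here's a minimal sketch with one deliberately broken function in the middle of a hash. The names broken, log, and undefinedThing are assumptions made up for the demo, and it collects messages in an array and prints them with console.log rather than popping up alerts:

```javascript
var log = [];

var broken = {
    "fine": function () { return 1; },
    // undefinedThing is never declared, so this throws a ReferenceError on purpose.
    "oops": function () { return undefinedThing.foo; },
    "alsoFine": function () { return 3; }
};

for (var key in broken) {
    try {
        log.push(key + " returned " + broken[key]());
    } catch (ex) {
        // The failure is recorded and the loop carries on with the next function.
        log.push("Couldn't call " + key + ": " + ex.name);
    }
}

console.log(log.join("\n"));
// fine returned 1
// Couldn't call oops: ReferenceError
// alsoFine returned 3
```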

OK, we're really done with the tangents. For real. I mean it. Here's the final code we're going to do our exploring with -- and we're going to analyze the results, too! Hey, you back in the corner! Wake up! I don't get paid to stand here to lecture to a bunch of snorers.

functionlist = {"returnAValue":function () {
                        return 5;
                        },

                "initAnObject":function () {
                       this.foo = 6;
                       },

                "initAnObjectAndReturnAValue":function () {
                       this.foo = 7;
                       return 8;
                       }
};

for (name in functionlist) {
    try {
	x = functionlist[name]();
	alert("Return value of " + name + " is " + x);
    }
    catch(ex) {
	alert("Couldn't call " + name + " as a func:" + ex);
    }
    try {
	x = new functionlist[name]();
	alert ("got an object out of " + name);
	//alert ("...and value of x is now " + x); //Always shows [object Object]
    }
    catch(ex) {
	alert("Couldn't get an object out of " + name + ": " + ex);
    }
}
alert ("Done");

We'll put this code into a file called objects.js, and include the file in a very small HTML file like the following:

<html>
<head><title>Messin' with OOP in JS</title>
<script src="js/objects.js"></script>
</head>
<body>
This is my body.
</body>
</html>

When we do this, here are the alerts we get:

Return value of returnAValue is 5
got an object out of returnAValue (aha, an object was created even without initializing any members!)
Return value of initAnObject is undefined
got an object out of initAnObject
Return value of initAnObjectAndReturnAValue is 8
got an object out of initAnObjectAndReturnAValue
Done

So, what can we conclude from these outputs? We first note that no exceptions were thrown, so everything we've done in our script is a legitimate (but not necessarily useful) way to use JavaScript. We also note that there is not much difference between the different types of functions. The biggest difference is in whether or not you return a value, but that's not really a big revelation; you've probably written JS functions that don't return a value, and the real surprise here is that it doesn't go to pieces when you try to get a return value from such a function. But as far as OOP goes, there really is no difference between these functions. If you call them with the "new" operator, each of them will return an object.
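The alerts don't show what's inside those objects, or where "this.foo" ends up when there's no "new" around. Here's a sketch that pokes at both, using the same hash but with console.log in place of alert. Two things it leans on: "new" throws away a returned number, and a call written as functionlist[...]() runs with "this" set to functionlist itself.

```javascript
var functionlist = {
    "returnAValue": function () { return 5; },
    "initAnObject": function () { this.foo = 6; },
    "initAnObjectAndReturnAValue": function () { this.foo = 7; return 8; }
};

// With "new": a fresh object every time; the returned 5 or 8 is discarded.
var a = new functionlist["returnAValue"]();
var c = new functionlist["initAnObjectAndReturnAValue"]();
console.log(typeof a, a.foo); // object undefined
console.log(typeof c, c.foo); // object 7

// Without "new": the call goes through the hash, so "this" is the hash,
// and the assignment lands on functionlist rather than on a new object.
var r = functionlist["initAnObject"]();
console.log(r);                // undefined
console.log(functionlist.foo); // 6
```

So the plain calls in the experiment weren't harmless no-ops; they were quietly hanging a foo property on the hash.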

So here's our general rule for the "difference" between functions and class declarations: a function defines/instantiates an object if and only if it is invoked with the "new" operator.

OK, we're now at 2,490 words, so I hope that's enough to clarify things.

I didn't manage to get to prototypes; that'll have to wait until the next installment.

*Some browsers provide a name property for functions, but not all do.
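For what it's worth, the name property on functions was later standardized in ES2015, so in an engine that implements it, the plain array version can recover the names after all. A minimal sketch, assuming such an engine (older browsers may hand back undefined here):

```javascript
function returnAValue() { return 5; }
function initAnObject() { this.foo = 6; }

var fArray = [returnAValue, initAnObject];

var names = [];
for (var i = 0; i < fArray.length; i++) {
    // .name holds the identifier the function was declared with.
    names.push(fArray[i].name);
}

console.log(names.join(", ")); // returnAValue, initAnObject
```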