Robert Biddle and Ewan Tempero
In this paper we will examine the Java language, and consider how easy it is for beginning programmers to learn. We address primarily the issues that arise directly from the language itself, and discuss whether the promises are compromised by the pitfalls. This analysis is the result of our teaching of Java to people in industry, our consideration of whether Java is suitable as a first programming language for university students, and our earlier work involving similar issues with regard to C++.
More than most new languages, Java [1] is being considered for and by beginning programmers. In particular, Java is being considered by educators like us. We spent some time considering Java as the introductory programming language for our first year courses in computer science at our university. This paper documents a number of issues that arose in our discussions that concerned the language itself.
There has been much promotion of the advantages of Java, starting with the original Sun "white paper" [4] which promised some advantages, like simplicity and robustness, which are of value to beginners. In this paper we take a different approach, and concentrate on identifying potential difficulties, hoping this may be of use to anyone considering teaching Java to beginners.
Our primary interest was in whether Java is a good language to teach beginning programmers, and so we were looking to see how easy the language is to learn, and whether the language supports principles that will serve the learners well in their future programming. Java is similar to C++, and because there has been much recent debate about teaching and learning C++, we base most of this paper on comparisons with C++. Our discussion is based on our examination of the language, our experience in teaching Java to industry professionals, and our experience in teaching other languages, especially C++ [2].
In this section we address the detail level of the Java language. This level of the language is quite significant for beginning programmers, because beginners really need to understand the fundamentals of data and control to understand how computing works.
In Java the primitive types, their operations, and the syntax for using them are all very similar to those of C++. For beginners this is probably a richer set of opportunities than is desirable, but many of the complicated issues, such as bitwise operations or operations combined with assignment, need not be introduced early. In Java the boolean type is heavily used, an improvement over C++, where the common practice of using numeric values instead of booleans undermines important lessons of type safety and type checking.

The Java statements for selection and iteration are almost exactly the same as those in C++. A number of weaknesses are carried over, including the need to distinguish single statements from multiple statements, and the odd characteristic of the C++ switch statement that cases drop through from one to the next. However, the Java if, while, do, and for statements have the advantage of working with boolean conditions, rather than the C distinction between zero and non-zero. Like C, Java allows control over iteration with break (immediate end of loop) and continue (immediate end of one iteration), but Java adds label matching to allow multi-level escape, which is probably not something to introduce early to beginners. Java has no direct branch statement like C's goto at all, but this is seldom introduced to beginners anyway.

In Java, as in C++, assignment is an operator, meaning that assignments in mid-expression are allowed (e.g. x = b + (c = 5);). This does not have to be introduced early, but many C++ beginners learn about this in the worst way, by confusing the assignment operator = with the equality operator ==. An if statement with the condition (x = 0) performs not a comparison but an assignment, and the value of the condition then depends on whether the assignment produced a zero or non-zero value. Beginners in Java make the same mistake, but in Java the comparison operators return boolean values, whereas an assignment such as x = 0 yields a numeric value, and the if statement requires a boolean value. Accordingly, this common mistake results in a compile-time error in Java, instead of the more problematic run-time incorrect behaviour that results in C and C++. (However, this can still happen when the assignment expression produces a boolean type!)
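The remaining boolean case can be made concrete with a minimal sketch. The class and method names below are hypothetical and chosen only for illustration. The int version of the mistake is shown as a comment because the compiler rejects it.

```java
// Illustrative sketch; AssignmentPitfall and its methods are hypothetical names.
class AssignmentPitfall {
    // The condition assigns true to 'done' instead of comparing, so this
    // method returns true whatever value the caller passed in.
    static boolean mistakenCheck(boolean done) {
        if (done = true) {      // assignment, not comparison; still compiles
            return true;
        }
        return false;
    }

    // The intended comparison (more idiomatically written as: if (done)).
    static boolean intendedCheck(boolean done) {
        if (done == true) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int x = 5;
        // if (x = 0) { }       // rejected at compile time: int is not boolean
        System.out.println(x == 0);                 // prints false
        System.out.println(mistakenCheck(false));   // prints true
        System.out.println(intendedCheck(false));   // prints false
    }
}
```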
In Java there are important distinctions between primitive data, discussed above, and object data. Beginners need to use objects even at an early stage to access the world beyond their program, and there are pedagogical reasons for beginners to get experience in using objects before they go on to create their own classes. Simple use of pre-existing classes and objects is reasonably straightforward, and uses a syntax similar to C++. Of course, beginners will have to be introduced to the underlying concepts, such as classes, methods, and fields. (It will also be necessary to explain that some methods can be used for a class, whereas others apply only to individual objects: we discuss this in a later section.)
Using a pre-existing class to declare and use objects of that class requires that beginners understand in some detail the differences between primitive types and object types. The key difference is that a primitive type always has value semantics, and an object type always has reference semantics. Primitive type variables can be initialised or assigned from a literal value, or from data of the same type, and a copy is made. Object type variables can be initialised or assigned from a newly created object of the type, or from an existing object of the same type. However, in initialisation or assignment of objects, the variable does not receive a copy, but a reference. A variable of object type is a reference to an object. Assigning one such variable to another results in two variables that reference the same object, so changes made through one are also visible through the other.
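The difference between value and reference semantics can be sketched as follows. The Counter class is a hypothetical stand-in for any object type and is not part of any library.

```java
// Illustrative sketch; ReferenceDemo and Counter are hypothetical names.
class ReferenceDemo {
    static class Counter {
        int count = 0;
    }

    // Primitive assignment copies the value.
    static int primitiveCopy() {
        int a = 1;
        int b = a;   // b receives a copy of the value in a
        b = 2;       // a is unaffected
        return a;
    }

    // Object assignment copies the reference, not the object.
    static int sharedObject() {
        Counter first = new Counter();
        Counter second = first;   // both variables now refer to one object
        second.count = 42;        // the change is visible through 'first' too
        return first.count;
    }

    public static void main(String[] args) {
        System.out.println(primitiveCopy());  // prints 1
        System.out.println(sharedObject());   // prints 42
    }
}
```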
Object variables can also be initialised or assigned from the special value null, and at this point the beginner must really understand the idea of reference semantics. In many ways, understanding object references is very similar to understanding pointers. As a result, although Java is claimed to be simpler than C++ because of the absence of explicit pointers, it is unfortunately true that it may seem more complex at initial stages because of the importance of the implicit pointers.

Understanding reference assignment also means understanding that when an object variable is assigned to, the original referenced object value is no longer accessible via that variable. This in turn means understanding that when that variable had been the only reference to the object value, the object value is now inaccessible. And this, at quite an early stage, may lead to the need to explain "garbage collection".
Another reason that beginners must cope with understanding object data and reference semantics is that in Java both arrays and strings are objects. In both cases, there is special syntax involved, but understanding the reference semantics is still important. Moreover, strings and arrays are very common in programming, and cannot easily be avoided or postponed for long.
The syntax for using arrays does resemble that of C++, but there is an important difference. In C++, array indexing is syntactic sugar disguising pointer dereferencing, and rigorous checks on array use are difficult. In Java, array use is syntactic sugar disguising object use, and checking is more straightforward. For example, every array access is bounds checked, which is a significant improvement for beginners.
The declaration of an array requires understanding of objects, and involves initialising with the new operator and the desired size of the array (e.g., int[] marks = new int[10]). While it may be a good thing that the syntax suggests the object nature, it is unfortunate that such a fundamental data structure requires such complication.
As objects, arrays can be assigned or passed as parameters, and of course reference semantics apply, which may lead to surprises, as changes made through one array variable will also be visible through the other. Arrays are assignment compatible if their base type is the same, so arrays of different size can be assigned to each other or passed as parameters.
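A minimal sketch of these three array properties (aliasing on assignment, size not being part of the type, and bounds checking) follows. The class and method names are hypothetical.

```java
// Illustrative sketch; ArrayDemo and its methods are hypothetical names.
class ArrayDemo {
    // Assigning an array variable copies the reference, not the elements.
    static int aliasing() {
        int[] marks = new int[10];
        int[] same = marks;
        same[0] = 75;           // also visible through 'marks'
        return marks[0];
    }

    // The size is not part of the array type, so this assignment is legal.
    static int differentSizes() {
        int[] marks = new int[10];
        marks = new int[3];
        return marks.length;
    }

    // Out-of-range indexing is detected at run time instead of
    // silently touching unrelated memory.
    static boolean boundsChecked() {
        int[] marks = new int[10];
        try {
            marks[10] = 1;      // valid indices are 0 to 9
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(aliasing());        // prints 75
        System.out.println(differentSizes());  // prints 3
        System.out.println(boundsChecked());   // prints true
    }
}
```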
The situation of strings is broadly similar to that of arrays, but strings are not arrays. They are objects, albeit with special syntax. Using them in a straightforward way is easy, and the literal string syntax resembles that of C. But variables of type String are references to objects, with the potential for problems. Perhaps the worst problem is that the equality operator == tests for the equality of the references, a pitfall almost as dangerous as the C = vs. == confusion.

The top structural element of a Java program is the package. Pragmatically, it allows the programmer freedom in naming when creating new code, without fear of making library code or other programmers' code difficult to access. Neither of these is a critical concern for beginners, but they are reasonable things for beginners to grow into, and Java makes this easy because, if not otherwise specified, everything goes into a default, nameless package that beginners do not have to be aware of to use. There is a problem for beginners in using packages, but it concerns access control in classes, and is discussed below.
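The string comparison pitfall noted above, where == compares references and not contents, can be illustrated with a short sketch. The class and method names are hypothetical. The second string is created with new so that it is guaranteed to be a distinct object.

```java
// Illustrative sketch; StringEquality and its methods are hypothetical names.
class StringEquality {
    // Compares references: true only if both refer to the same object.
    static boolean sameReference(String a, String b) {
        return a == b;
    }

    // Compares contents, which is usually what a beginner intends.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String a = "java";
        String b = new String("java");   // distinct object, equal contents
        System.out.println(sameReference(a, b));  // prints false
        System.out.println(sameContents(a, b));   // prints true
    }
}
```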
The principal structural element of a Java program is the class. A Java class consists of fields and methods. As in C++, each field and method can be designated public (access allowed to any class user), private (access allowed only to class methods), or protected (access allowed only to class methods or inheriting class methods; in Java, other classes in the same package also have access). All this is reasonable, and properly emphasizes the importance of encapsulation, and the more complicated concepts involved with protected do not need to be introduced right away.

There are, however, two bothersome problems that do affect beginners. Firstly, whereas in C++ the absence of an explicit access control implies private, in Java it implies a nameless fourth access control category. This category allows access throughout the package in which the class is declared, but denies access beyond the package. While such an access category is not itself unreasonable, it does create a problem for beginners. Because it is the default, it is the result if the programmer forgets to specify access control. And because beginners typically work with smaller programs that consist of classes within a single package (usually the default nameless one), the effect is that forgetting access control yields open access. So in this way, Java fails to reinforce its support for encapsulation, and beginners can easily miss the point.

The second problem with Java access control is more superficial, but also involves a failure to support the concept of encapsulation. In Java, the class declaration involves fields and methods, as with C++. However, in C++ the class can specify only the function prototypes for the methods, and the function implementations can be specified elsewhere. By grouping all public prototypes, a C++ class makes the interface to the class apparent, and the distinction between the interface and the implementation is made clear to beginners. A Java class, however, requires method implementations to be given in detail in the class declaration. As a result, it is no trivial matter to look at a Java class declaration and see what the interface is. For beginners, this is unfortunate, and the important idea about separation of interface and implementation is undermined.

Java supports exceptions and exception handling in a similar way to recent additions to C++, and features similar structures for throw and catch. Which exceptions a method throws is part of its interface, and this is another reason why making it difficult to clearly see the interface is problematic. Worse still is that we regard exception handling as a complicated topic, certainly unsuitable for very new programmers. Yet the Java style and elements of the Java standard class library extensively use exceptions that the programmer must handle.

Like C++, Java supports "constructor" methods to initialise instances of the class, and they work in a similar way. However, there are several advantages with Java. Firstly, in C++ constructors are implicitly called whenever an instance is declared or dynamically created; in Java they are only called on dynamic creation. One of the problems in C++ is that beginners often forget that declaration involves a constructor call, but this is less common with dynamic creation. Moreover, because of Java's reference semantics, there is no need for the C++ copy-constructor, a common source of nightmares for C++ learners. And lastly, because of Java's garbage collection, there is no need for the C++ "destructors". Java does support a finalize method to be called for instances about to be garbage collected, but it is not necessary for dynamic memory management the way a C++ destructor is, and so beginners need not be introduced to these complexities.

There are some significant differences between Java and C++ concerning data. The way in which Java methods are passed parameters reflects assignment: primitive types are passed by value, and object types are passed by reference (more precisely, the reference itself is copied, just as in assignment). This does mean that beginners who understand assignment should have little difficulty understanding parameter passing. However, it doesn't support a greater understanding of the possibilities of parameter passing that might be useful. Moreover, the restricted options mean that programmers will have difficulty if they need a method to return a primitive type other than via the method return value, or if they need to be sure a method will not change the value of an object parameter.
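The consequences of these parameter-passing rules can be sketched as follows. The class and method names are hypothetical, and StringBuilder is used only as a convenient mutable object type.

```java
// Illustrative sketch; ParameterDemo and its methods are hypothetical names.
class ParameterDemo {
    // The method receives a copy of the int, so the caller's variable
    // cannot be changed this way.
    static void increment(int n) {
        n = n + 1;
    }

    // The method receives a copy of the reference, so it can change the
    // object that the caller also refers to.
    static void append(StringBuilder text) {
        text.append("!");
    }

    // Rebinding the parameter only changes the local copy of the reference.
    static void replace(StringBuilder text) {
        text = new StringBuilder("replaced");
    }

    static int afterIncrement() {
        int x = 1;
        increment(x);
        return x;
    }

    static String afterAppendAndReplace() {
        StringBuilder s = new StringBuilder("hi");
        append(s);
        replace(s);
        return s.toString();
    }

    public static void main(String[] args) {
        System.out.println(afterIncrement());         // prints 1
        System.out.println(afterAppendAndReplace());  // prints hi!
    }
}
```

The sketch also shows the two limitations mentioned above: a method cannot hand back a primitive through a parameter, and nothing prevents a method from modifying an object it is given.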
Another issue that may be problematic for beginners concerns a difference between class fields and local variables inside methods. Fields are initialised by default, primitive types to various "zero" values, and object types to null. Local variables, however, are not initialised by default at all. While there are some language implementation reasons for this distinction, it makes one more thing of which beginners must be aware.

The most important way in which classes are used together in Java is by composition, whereby one class has a field that is of another class. In this way classes can be built using other classes, and this use can be a hidden implementation detail. As with C++ and most languages, this important relationship is not highlighted or explicit in any way, and it can be difficult for beginners to realise its great significance.
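A minimal sketch of composition, which also shows default field initialisation, might look like this. The Engine and Car classes are hypothetical examples and do not come from any library.

```java
// Illustrative sketch; CompositionDemo, Engine and Car are hypothetical names.
class CompositionDemo {
    static class Engine {
        private boolean running;   // field: initialised to false by default

        void start() {
            running = true;
        }

        boolean isRunning() {
            return running;
        }
    }

    // Car is built from an Engine; users of Car never see the Engine.
    static class Car {
        private final Engine engine = new Engine();

        void drive() {
            engine.start();
        }

        boolean isMoving() {
            return engine.isRunning();
        }
    }

    static boolean movingBeforeDrive() {
        return new Car().isMoving();
    }

    static boolean movingAfterDrive() {
        Car car = new Car();
        car.drive();
        return car.isMoving();
    }

    public static void main(String[] args) {
        // boolean local;                // a local variable has no default;
        // System.out.println(local);    // reading it here would not compile
        System.out.println(movingBeforeDrive());  // prints false
        System.out.println(movingAfterDrive());   // prints true
    }
}
```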
The relationship that is typically more celebrated and explicitly acknowledged is inheritance. Java supports inheritance in a similar way to C++. Java makes a helpful simplification by only supporting single inheritance, and not multiple inheritance.
As well as inheritance, Java includes something similar involving the concept of "interface". An interface is like a class, but only has interface specifications. An interface can be used as a type specifier, and a class can be designated as implementing an interface, whereupon instances of that class conform to the type specification. This is very like inheritance, and indeed our approach to teaching about inheritance is based on emphasising interface conformance [3]. Whereas Java limits inheritance to single inheritance, it allows a class to conform to multiple interfaces. This allows something very similar to multiple inheritance, although it is less problematic.

The Java support for inheritance, and interfaces, seems reasonable and justifiable. However, for beginners it has the problem that there are a number of concepts that must be understood before seeing how it all works together. In some ways it is simpler than with C++, but in other ways more complex, and great care is needed in introducing these relationships to beginning programmers.
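As a sketch of a class conforming to more than one interface, consider the following. The interface and class names are hypothetical and chosen only for illustration.

```java
// Illustrative sketch; all names here are hypothetical.
class InterfaceDemo {
    interface Printable {
        String print();
    }

    interface Resettable {
        void reset();
    }

    // One class may implement several interfaces, although it may
    // extend only one class.
    static class Counter implements Printable, Resettable {
        private int count = 0;

        void increment() {
            count++;
        }

        public String print() {
            return "count=" + count;
        }

        public void reset() {
            count = 0;
        }
    }

    // An interface name can be used as a type specifier.
    static String describe(Printable item) {
        return item.print();
    }

    static String demo() {
        Counter counter = new Counter();
        counter.increment();
        counter.increment();
        String before = describe(counter);
        Resettable resettable = counter;   // same object, second interface
        resettable.reset();
        return before + " then " + describe(counter);
    }

    public static void main(String[] args) {
        System.out.println(demo());  // prints count=2 then count=0
    }
}
```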
Another important technique in structuring programs concerns genericity, especially the ability to parameterize a container class with respect to the type of what it contains. In C++ this is explicitly supported by the template construct. Java has no equivalent, and adopts the practice of many other object-oriented languages in using inheritance to accomplish genericity.

Using inheritance to accomplish genericity is done by writing the container class so that the type of what it contains is a superclass from which other classes can inherit. In Java, this usually means writing the container class in terms of the class Object, which is an unstated superclass of every class. This allows the container class to contain any object type.

The problem with this approach is that it allows objects of any type to be mixed together. With a generic container (such as a C++ template class) this kind of error can be detected at compile time, but the inheritance approach requires run-time checks. Of course, run-time checks can only detect such errors when they actually arise, and so programs may contain latent errors of this kind that may remain undetected despite testing.
While supporting genericity via inheritance does limit the concepts that a beginning programmer must learn, we feel this advantage comes at a significant cost. Not only does it require run-time checks for type safety, but the whole approach muddles two different ideas and programming strategies. We feel that the Java approach is less helpful to beginners than is desirable.
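The run-time nature of the check can be sketched with a small container written in terms of Object. ObjectStack is a hypothetical class written for this illustration, with a fixed capacity and no overflow handling.

```java
// Illustrative sketch; ObjectStackDemo and ObjectStack are hypothetical names.
class ObjectStackDemo {
    // A container written in terms of Object accepts any object type.
    static class ObjectStack {
        private final Object[] items = new Object[10];  // fixed capacity
        private int size = 0;

        void push(Object item) {
            items[size++] = item;
        }

        Object pop() {
            return items[--size];
        }
    }

    static boolean wrongTypeDetectedOnlyAtRunTime() {
        ObjectStack names = new ObjectStack();
        names.push("Ada");
        names.push(Integer.valueOf(7));   // compiles: any object is accepted
        try {
            String top = (String) names.pop();  // down-cast, checked at run time
            return false;
        } catch (ClassCastException e) {
            return true;                  // the mistake surfaces only here
        }
    }

    public static void main(String[] args) {
        System.out.println(wrongTypeDetectedOnlyAtRunTime());  // prints true
    }
}
```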
There are other problems that come up, mainly when dealing with the Java standard class library.
An unexpected problem comes from the requirement that most kinds of exceptions must be either explicitly caught, or explicitly passed through to any caller. Because a number of methods in the library throw exceptions, it is very difficult to write anything but the most simple of applications and not have to deal with exceptions. For example, a common early assignment is to read input and manipulate it in some way. Almost any kind of use of an input stream may throw an exception, and in some cases (such as when reading numbers) possibly more than one kind (from the input stream itself, and from the method that parses numbers). The exception mechanism is not something we want to talk about early in the course, but the design of both Java exceptions and the Java standard libraries makes this difficult. Our solution has been to wrap up such code in our own classes, but that somewhat defeats the lesson of "standard libraries".
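The kind of wrapper described above might look like the following sketch. It is not the authors' actual class: the name ReadNumberDemo, the readInt method and the fallback parameter are all assumptions made for illustration. It handles both kinds of exception mentioned, one from the input stream and one from number parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Illustrative sketch; ReadNumberDemo and readInt are hypothetical names.
class ReadNumberDemo {
    // Reads one line and parses it as an int, returning 'fallback' if the
    // input ends, cannot be read, or is not a number.
    static int readInt(BufferedReader in, int fallback) {
        try {
            String line = in.readLine();           // may throw IOException
            if (line == null) {
                return fallback;                   // end of input
            }
            return Integer.parseInt(line.trim());  // may throw NumberFormatException
        } catch (IOException e) {
            return fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // A StringReader stands in for keyboard input to keep the sketch self-contained.
        BufferedReader good = new BufferedReader(new StringReader("42\n"));
        BufferedReader bad = new BufferedReader(new StringReader("forty-two\n"));
        System.out.println(readInt(good, -1));  // prints 42
        System.out.println(readInt(bad, -1));   // prints -1
    }
}
```

Hiding the try and catch inside a helper keeps exception handling out of the beginner's first programs, at the cost of not using the standard library directly.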
As mentioned earlier, Java does not have template types as in C++; instead, genericity is achieved by the use of the class Object. Thus the standard container classes Vector and Stack allow values of any object type to be mixed freely. The usual convention when using these classes is to cast the value to the expected type when getting it out of these containers. This prevents weird behavior because the cast does a run-time check; however, it does mean that if values of the wrong type are put into the container, this error will not be caught until run-time. But probably the most aggravating consequence of the lack of template types is the fact that beginners must now be exposed to the concept of down-casting if they want to use the container classes that are part of the Java standard libraries.

Another complication with the standard container classes is due to the distinction between primitive and object types. Because the container classes expect instances of classes that extend Object, they cannot be used with primitive types such as int and char, types that are very commonly used in early discussions. While the Java standard libraries do contain "object" versions of the primitive types (such as Integer and Character), having to explain this to beginners is a distraction we would prefer to avoid.

Most container classes have a method such as contains, which determines whether the specified value is in the container. In order for this to work, the values placed in the container must be able to be compared for equality. Recall that == tests for reference equality, not value equality. This means that classes such as Vector rely on the fact that there is a method in Object named equals. This method does not do anything interesting (it merely compares references); it is expected that the class of any actual value placed in a Vector will override equals to provide the correct behaviour. This causes two problems for beginners.

First, beginners often forget to override equals, but, because Object provides it, they get no warning as to what they have done: their program compiles fine but does not execute as they expect.

Second, if programmers do remember to create their own equals, they often prefer it to look like:

    class Myclass {
        public boolean equals(Myclass e) {
            // ....
        }
    }
but here Java's overloading mechanism regards this as a different operation than the one expected by Vector; the expected one requires the argument to be of type Object. Understanding why it has to be this way (given the choices Java has made about generic types) is more than we normally expect from beginners.

Finally, in Java there is no way to avoid talking about static methods or fields early. For example, many useful methods, such as those for cosine, logarithm, or random number generation, appear as static methods in the Math class. Such methods are often used in example programs for beginners. Another construct that is commonly used is enumerated types. Java does not have enumerated types. Its designers favour using constructs that involve static, such as:

    class Direction extends Object {
        public static final int North = 1;
        public static final int South = 2;
        public static final int East = 3;
        public static final int West = 4;
    }
Even more difficult to avoid is the fact that Java uses a static method to deal with the "startup" problem. The "startup" problem concerns how an application should start: what happens first? According to the object-oriented model, the first thing that should happen is that a message gets sent to an object. So how do you get an object? In Java, there are no global variables, so in order to get an object, it must be created inside of a class somehow, but then a message must be sent to that class to get that object created. The Java solution is to call a static method, and in Java, that method must look like this:

    public static void main(String[] args) {
        // ...
    }
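A minimal sketch of how such a method might bootstrap an object-oriented program is given here. The Greeter class is hypothetical. The static main method exists only to create the first object and send it the first message.

```java
// Illustrative sketch; Greeter is a hypothetical class.
class Greeter {
    private final String name;

    Greeter(String name) {
        this.name = name;
    }

    String greet() {
        return "Hello, " + name;
    }

    // Called on the class itself, before any object exists.
    public static void main(String[] args) {
        Greeter greeter = new Greeter("world");  // the first object
        System.out.println(greeter.greet());     // prints Hello, world
    }
}
```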
Thus, short of avoiding applications altogether, there is no way to avoid introducing beginners to static methods. Note that avoiding applications altogether is possible in Java: just use applets. Applets are the kind of application intended to provide a user interface embedded in a web page, and they are supported by structures in the standard library. However, new kinds of applets are created by inheritance, meaning that for the first program we show to beginners, they must see either static or inheritance, which is not a happy choice.

We have presented a discussion on potential problems with teaching Java to beginners. We believe there are a number of pitfalls that educators must be aware of when teaching Java to beginning programmers:
In this paper we have not addressed whether the object-oriented paradigm is itself reasonable for teaching beginners; however, many educators have made the decision that it is. Having made this decision, the promises made for Java raise hopes that it might be an excellent teaching language: simplicity and robustness seem very attractive. And because Java resembles C++ so much, it seems reasonable to regard C++ as a benchmark for evaluating Java. And so, we expect that Java will be better as a teaching language than C++.
After examining Java, however, and on the basis of early experience, we are disappointed. As we have detailed in this paper, we do believe Java is a superior teaching language to C++ in many ways, but the promises are compromised by pitfalls.
Is Java a sensible language to use for teaching beginning programmers? This is a difficult question to answer, because a sensible answer cannot be based on the language alone. We do believe, on balance, Java to be marginally better for teaching than C++. However, if Java had remained in obscurity, this margin would not be enough for us to choose it to teach with, because the popularity of C++ is a practical advantage in many ways. Java has itself become very popular, though, with its own practical advantages such as its widespread availability and the existence of standard libraries. Anyone making a decision must do so taking into account these practical considerations, which are still unfolding and which will differ in individual circumstances.
We believe that for teaching beginning programmers, the Java language itself is reasonable, but not compelling.