Tuesday, May 3, 2011

LANGUAGE DESIGN

Computer programming languages are developed to make it easier for humans to direct computation.
At various times in the past it was thought that a single language could be best for all programming tasks.
For instance, IBM planned to “unify” scientific and business programming in the 1960s with PL/I, replacing
both FORTRAN and Cobol. In the 1980s there was talk of Pascal replacing all other languages because of its
superior type checking and block structure.
As time has passed, however, more languages, not fewer, have come into use, and new ones still appear.
We think this is due to the maturing of the programming discipline. Just as any able mechanic will carry several
different tools for working with a 10 mm nut (open-end wrench, box wrench, crows-foot wrench, shallow
socket, deep socket, etc.), any able programmer will carry knowledge of several different languages so that they
can select the best one for a particular circumstance.
Some languages provide better run-time performance, some provide unusually compact syntax for quick
“one-off” programs, some offer particularly strong features for manipulating text, some for working with matrices
of numbers, etc. In evaluating a language, computer scientists consider many properties.

From the earliest days, efficiency of execution has been a desirable property. In fact, FORTRAN was
widely adopted in large part because it created code that was very nearly as fast as assembly language code.
Without its characteristic efficiency, FORTRAN would have been adopted much more slowly by the programmers
of the 1950s and 1960s who worked in an environment where the cost of running a program was an expensive
multiple of the CPU seconds the program consumed.
Human readability is another desirable trait in a language. Cobol syntax is as “wordy” as it is because the
designers of Cobol wanted the code to be self-documenting. The designers hoped to guarantee that Cobol would
be easy for a human to read, regardless of the commenting style of the author.
A language that is easy to implement has an advantage. The language Ada can serve as a contrary example.
While Ada is an excellent and carefully designed language, it has been adopted more slowly than some
others, in part because its size and complexity initially made it more difficult to implement, especially on
smaller computers.
Computer scientists also praise a language for expressiveness. This is a somewhat subjective judgment, but
an example of unusual expressiveness will illustrate the property. Perl offers the “if” conditional familiar to us
in most languages, and Perl also offers the “unless” conditional, which runs its body only when the condition is
false, the opposite of “if.” Having both
forms can be called “syntactic sugar,” since there is no functional requirement for a language to have both,
but having both allows more natural expression of some conditions.
Expressiveness is also relative to particular types of applications. C’s built-in facilities for manipulating
bits mark it as unusually expressive in that way, and make it an especially good language for writing operating
systems and drivers. Matlab’s matrix manipulation syntax is wonderfully expressive for matrix algebra applications
like statistics and image processing.
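
As a rough illustration of the bit-level expressiveness described above, here is a minimal C sketch that packs several on/off flags into a single byte. The flag names and the helper functions are hypothetical, chosen only for this example; real driver code would take its bit layout from the hardware documentation.

```c
#include <stdint.h>

/* Hypothetical device-status flags, one bit each (illustrative names). */
#define FLAG_READY (1u << 0)
#define FLAG_ERROR (1u << 1)
#define FLAG_BUSY  (1u << 2)

/* Return status with the given flag bits turned on. */
uint8_t set_flags(uint8_t status, uint8_t flags) {
    return (uint8_t)(status | flags);
}

/* Return status with the given flag bits turned off. */
uint8_t clear_flags(uint8_t status, uint8_t flags) {
    return (uint8_t)(status & (uint8_t)~flags);
}

/* Return nonzero if every bit in flags is set in status. */
int has_flags(uint8_t status, uint8_t flags) {
    return (status & flags) == flags;
}
```

The operators |, &, ~, and << map almost directly onto machine instructions, which is part of why this style suits operating systems and drivers.
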
Another very desirable trait in a language is regularity. Regularity means consistency of behavior, consistency
of appearance, and avoidance of special cases. In C, an example of an irregularity is the use of the == equality operator.
Any two scalar values can be compared using ==, but the contents of two arrays cannot: applied to arrays, == compares
their addresses rather than their elements, so arrays must be compared
element by element. The == operator cannot be applied in a general way to all data structures. There are almost
always good reasons for irregularities, but, other things being equal, a more regular language is more desirable.
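
The element-by-element comparison mentioned above might look like the following sketch. The function name int_arrays_equal is made up for this example, and it assumes both arrays hold at least n elements.

```c
#include <stdbool.h>
#include <stddef.h>

/* Compare the first n elements of two int arrays, one element at a time. */
bool int_arrays_equal(const int *a, const int *b, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return false;   /* first mismatch decides the answer */
        }
    }
    return true;            /* no mismatch found */
}
```

Writing x == y for two distinct arrays still compiles in C, but the expression compares the arrays' addresses and so is false even when the contents match. That is the special case a programmer has to remember.
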
Computer scientists praise languages that are extensible. Many languages today allow the writer to define
new data types, for instance. That was not an option in early versions of FORTRAN, which came on the scene
supporting only integer and floating-point data types. Languages can also be extended by adding to libraries
of shared routines. A language like LISP even allows the writer to extend the keywords of the language by
writing new functions.
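
To make the idea of a programmer-defined data type concrete, here is a small C sketch. The Rational type and the rational_mul function are hypothetical names invented for this example; the result is deliberately not reduced to lowest terms, to keep the code short.

```c
/* A user-defined type: a rational number stored as numerator/denominator. */
typedef struct {
    int num;
    int den;   /* assumed to be nonzero */
} Rational;

/* Multiply two rationals; the result is not reduced to lowest terms. */
Rational rational_mul(Rational a, Rational b) {
    Rational r = { a.num * b.num, a.den * b.den };
    return r;
}
```

Once defined, Rational can be used for variables, parameters, and return values much like the built-in types.
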
Standardization is another advantage; a language with a formal standard encourages wider adoption. Ada, C,
Cobol, and many others now boast international standards for the languages. Perl, on the other hand, does
not—Perl is whatever Larry Wall and the Perl Porters decide they want “everyone’s favorite Swiss Army
Chainsaw” to be (http://www.perl.com/pub/a/2000/04/whatsnew.html).
Another desirable property of a language is machine independence. Java is the best example of a machine-independent
language. Given that a Java Virtual Machine is available for the host hardware, the same Java
source code should run the same way on any machine. (This promise of “write once, run anywhere” has largely
been fulfilled today, but in the beginning days of Java, the popular quip was, “Java: write once, run away.”)
On the other hand, programmers using C must keep in mind the hardware platform on which the code will
run since, for example, the sizes of data types vary on different machines. An int variable may be 16 bits long
on one computer, and 32 bits long on another. The programmer seeking to write a C program to run on multiple
platforms must accommodate these differences somehow.
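
One common way to accommodate these differences, sketched here under the assumption of a C99 or later compiler that provides the exact-width types in stdint.h, is to ask for a specific width instead of relying on plain int. The function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Width of plain int on this platform, in bits; this varies by target. */
size_t int_width_bits(void) {
    return sizeof(int) * CHAR_BIT;
}

/* Sum n samples using int32_t, which is exactly 32 bits wide wherever
   the implementation provides it. Overflow is not checked in this sketch. */
int32_t sum_samples(const int32_t *samples, size_t n) {
    int32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += samples[i];
    }
    return total;
}
```

The first function reports what the platform happens to give; the second states its requirement in the type itself, so its range does not change from machine to machine.
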
Finally, some languages are more secure than others. Strict type checking is one feature designed to
enhance security. This was one of the lauded virtues of Pascal, when Pascal was being promoted in the 1980s
as the answer to all programming problems. Boundary checking on arrays is another feature designed to promote
security, and descriptions of the Java security model boast Java’s array boundary checking as an advance over
languages such as C.
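
For contrast, C performs no boundary check of its own: writing past the end of an array is undefined behavior, whereas Java throws an ArrayIndexOutOfBoundsException. A C programmer who wants the check has to write it, roughly as in this sketch. The function name checked_store is hypothetical, and the caller is assumed to pass the array's true length.

```c
#include <stdbool.h>
#include <stddef.h>

/* Store value at array[index] only if index is within bounds.
   Returns true if the store happened, false if it was refused. */
bool checked_store(int *array, size_t length, size_t index, int value) {
    if (index >= length) {
        return false;   /* out of bounds: refuse rather than corrupt memory */
    }
    array[index] = value;
    return true;
}
```

The extra comparison on every access is the price of the safety, which is one reason the check is left to the programmer in C.
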
While all these properties may be desirable, they are not all possible to achieve in the same language. For
instance, the security of strict type checking probably will reduce some forms of programmer expressiveness
(e.g., treating characters as integers, which can be used to improve execution speed in some applications),
increase program size, and perhaps reduce ultimate efficiency. Tradeoffs make language design a challenging
occupation, and different tradeoffs make different languages more suitable for different types of tasks.
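
The character-as-integer idiom mentioned above can be sketched in C as follows. The function name digit_value is hypothetical. The subtraction relies on the C guarantee that the digit characters '0' through '9' have consecutive codes; a more strictly typed language such as Pascal would require an explicit conversion (ord) before doing the arithmetic.

```c
/* Convert a decimal digit character to its numeric value.
   Returns -1 if c is not one of '0' through '9'. */
int digit_value(char c) {
    if (c >= '0' && c <= '9') {
        return c - '0';   /* characters are small integers in C */
    }
    return -1;
}
```
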
