Aims
The general aim of this course is to provide an overview of the basic concepts that appear in modern programming languages, the principles that underlie the design of programming languages, and their interaction.
Lectures
· Introduction, motivation, and overview. What is a programming language? Application domains in language design. Program execution models. Theoretical foundations. Language standardisation. History.
· The first procedural language: FORTRAN (1954-58). Execution model. Data types. Control structures. Storage. Subroutines and functions. Parameter passing.
· The first declarative language: LISP (1958-62). Expressions, statements, and declarations. S-expressions and lists. Recursion. Static and dynamic scope. Abstract machine. Garbage collection. Programs as data. Parameter passing. Strict and lazy evaluation.
· Block-structured procedural languages: Algol (1958-68), BCPL (1967), Pascal (1970), C (1971-78). Block structure. Parameters and parameter passing. Stack and heap storage. Data types. Arrays and pointers.
· Object-oriented languages -- Concepts and origins: Simula (1964-67), Smalltalk (1971-80). Dynamic lookup. Abstraction. Subtyping. Inheritance. Object models.
· Types, data abstraction, and modularity: C++ (1983-98), SML (1984-97). Types in programming. Type systems. Type checking and type inference. Polymorphism. Overloading. Type equivalence. Information hiding. Modularity. SML module system: signatures, structures, and functors. [2 lectures]
· State of the art.
Objectives
At the end of the course students should
· be familiar with several language paradigms and how they relate to different application domains
· understand the design space of programming languages, including concepts and constructs from past languages as well as those that may be used in the future
· develop a critical understanding of the programming languages that we use by being able to identify and compare the same concept as it appears in different languages
Recommended reading
Books:
* Mitchell, J.C. (2003). Concepts in programming
languages. Cambridge University Press.
* Pratt, T.W. & Zelkowitz, M.V. (2001). Programming languages: design and implementation.
Prentice Hall.
Papers:
Kay, A.C. (1993). The early history of
Smalltalk. ACM
SIGPLAN Notices, Vol. 28, No. 3.
Kerninghan, B. (1981). Why Pascal is not my favorite programming language.
AT&T Bell Laboratories. Computing Science Technical Report No. 100.
Koenig, A. (1994). An anecdote about ML type inference. USENIX Symposium on Very
High Level Languages.
McCarthy, J. (1960). Recursive functions of symbolic expressions and their
computation by machine. Communications of the ACM, 3(4):184-195.
Stroustrup, B. (1991). What is Object-Oriented Programming? (1991 revised
version). Proceedings 1
European
Software Festival.
Concepts
Calling/Passing by Value/Name
Pass/Call-by-value. The most common method is to evaluate the expressions in the actual-parameter list, and pass the resulting values.
In pass-by-value, the actual parameter is evaluated. The value of the actual parameter is then stored in a new location allocated for the function parameter.
Under call-by-value, a formal parameter corresponds to the value of an actual parameter. That is, the formal x of a procedure P takes on the value of the actual parameter.
The idea is to evaluate a call P(E) as follows:
x := E;
execute the body of procedure P;
if P is a function, return a result.
Pass/Call-by-name.? A less common method is to transmit the expression in the actual parameter list unevaluated, and let the call function evaluate them as needed using
eval.
1. In pass-by-reference, the actual parameter must have an
L-value. The L-value of the actual parameter is then
bound to the formal parameter.
2. Under call-by-reference, a formal parameter becomes a
synonym for the location of an actual parameter. An actual
reference parameter must have a location.
important to the programmer in several ways:
Side effects. Assignments inside the function body may have different effects under pass-by-value and pass-by-reference.
Aliasing. Aliasing occurs when two names refer to the same object or location.
Aliasing may occur when two parameters are passed by reference or one parameter passed by reference has the same location as the global variable of the procedure.
Call-by-value/result is also known as copy-in/copy-out
because the actuals are initially copied into the formals and the formals are eventually copied back out to the actuals. Actuals that do not have locations are passed by value.
Actuals with locations are treated as follows:
1. Copy-in phase. Both the values and the locations of the actual parameters are computed. The values are assigned to the corresponding formals, as in call-by-value,
and the locations are saved for the copy-out phase.
2. Copy-out phase. After the procedure body is executed, the nal values of the formals are copied back out to the locations computed in the copy-in phase.
A parameter in Pascal is normally passed by value. It is passed by reference, however, if the keyword var appears before the declaration of the formal parameter.
procedure proc(in: Integer; var out: Real);
The only parameter-passing method in C is call-by-value;
however, the effect of call-by-reference can be achieved using pointers. In C++ true call-by-reference is available using reference parameters.
In Java Objects are not passed by reference. A correct statement would be Object references are passed by value.
Ada supports three kinds of parameters:
1. in parameters, corresponding to value parameters;
2. out parameters, corresponding to just the copy-out phase of call-by-value/result; and
3. in out parameters, corresponding to either reference parameters or value/result parameters, at the discretion of the implementation./
The Algol 60 report describes call-by-name as follows:
1. Actual parameters are textually substituted for the formals.
Possible conflicts between names in the actuals and local names in the procedure body are avoided by renaming the locals in the body.
2. The resulting procedure body is substituted for the call. Possible conflicts between nonlocals in the procedure body and locals at the point of call are avoided by renaming the locals at the point of call.
ALGOL 60 allowed for two evaluation strategies for parameter passing: the common call-by-value, and call-by-name. Call-by-name had certain limitations in contrast to call-by-reference, making it an undesirable feature in imperative language design. For example, it is impossible in ALGOL 60 to develop a procedure that will swap the values of two parameters if the actual parameters that are passed in are an integer variable and an array that is indexed by that same integer variable
A formal parameter is a declaration that appears in the declaration of the subprogram. (The computation in the body of the subprogram is written in terms of forma parameters.)
An actual parameter is a value that the calling program sends to the subprogram.
Types
A type is a collection of computational entities that share
some common property.
There are three main uses of types in programming
languages:
1. naming and organizing concepts,
2. making sure that bit sequences in computer memory
are interpreted consistently,
3. providing information to the compiler about data
manipulated by the program.
A type system for a language is a set of rules for associating a
type with phrases in the language.
Terms strong and weak refer to the effectiveness with which
a type system prevents errors. A type system is strong if it
accepts only safe phrases. In other words, phrases that are
accepted by a strong type system are guaranteed to evaluate
without type error. A type system is weak if it is not strong.
A programming language is type safe if no program is allowed
to violate its type distinctions.
A type error occurs when a computational entity is used in a
manner that is inconsistent with the concept it represents.
Type checking is used to prevent some or all type errors,
ensuring that the operations in a program are applied properly.
Run-time type checking: The compiler generates code so
that, when an operation is performed, the code checks to
make sure that the operands have the correct types.
Examples: LISP, Smalltalk.
Compile-time type checking: The compiler checks the
program text for potential type errors.
Example: SML.
Name equality. Pure name equality. A type name is equal
to itself, but no constructed type is equal to any other
constructed type.
Transitive name equality. A type name is equal to itself
and can be declared equal to other type names.
Type-expression equality. A type name is equal only to
itself. Two type expressions are equal if they are
formed by applying the same constructor to equal
expressions. In other words, the expressions have to
be identical.
Type inference is the process of determining the types
of phrases based on the constructs that appear in them.
Object Orientated Programming
Four main language concepts for object-oriented languages:
1. Dynamic lookup.
2. Abstraction.
3. Subtyping.
4. Inheritance.
ML isn't an OOP so attempts such as:
val BoolStack = ref {pop=fn,push=fn}
: {pop:unit -> bool, puattempts ssh:bool -> unit} ref
fail.
Dynamic lookup means that when a message is sent to an object, the method to be executed is selected dynamically, at run time, according to the implementation of the object that receives the message. In other words, the object .chooses. how to respond to a message. The important property of dynamic lookup is that different objects may implement the same operation differently, and so may respond to the same message in different ways.
Abstraction means that implementation details are hidden inside a program unit with a specic interface. For objects, the interface usually consists of a set of methods that
manipulate hidden data. Abstraction based on objects is similar in many ways to abstraction based on abstract data types: Objects and abstract data types both combine functions and data, and abstraction in both cases involves distinguishing between a public interface and private implementation. Other features of object-oriented languages, however, make abstraction in object-oriented languages more exible than abstraction with abstract data types.
Subtyping is a relation on types that allows values of one type to be used in place of values of another. Specically, if an object a has all the functionality of another object b,
then we may use a in any context expecting b. The basic principle associated with subtyping is substitutivity: If A is a subtype of B, then any expression of type A may be used without type error in any context that requires an expression of type B.
The primary advantage of subtyping is that it permits uniform operations over various types of data.
Inheritance is the ability to reuse the denition of one kind of object to dene another kind of object.
The importance of inheritance is that it saves the effort of duplicating (or reading duplicated) code and that, when one class is implemented by inheriting form another, changes to one affect the other. This has a significant impact on code maintenance and modification.
Comparison of Languages
| Language | Typing | Value passing | OOP | Model of execution | Paradigms | Datatypes |
| Fortran | Incomplete. Strong-static. | Reference | 2003 onwards. | Compilation | imperative,procedural, object-oriented | Can't resize arrays. |
| Lisp | Untyped (though scheme is strong, dynamic) |
Call by value+
name. |
No | Interpretation, Compilation | functional | |
| C | static, weak, unsafe | Arrays and functions by reference, rest by value. | No | Compilation | imperative, flow-driven | |
| C++ | static, strong, unsafe, nominative | Both | Yes | Compilation | imperative, object-oriented, generic,multi-platform | |
| C# | static, strong, both safe and unsafe | Both. Keyword ref | Yes | JIT compilation | imperative, object-oriented, generic, multi-platform | |
| Algol | strong | Value and name | No | |||
| Pascal | static, strong, safe | No. Object pascal. | Compilation | imperative | ||
| Small Talk | dynamic, strong, safe, duck | Yes | JIT compilation | object-oriented, functional, concurrent, event-driven, imperative, declarative | ||
| SML | strong, static | Pass by value. | No | Interpretation | multi-paradigm: functional, imperative | |
| Java | static, strong | Pass by value | Yes | Interpretation | imperative, object-oriented, multi-platform, generic |
Major advances in Languages
● FORTRAN: Flat register machine; memory arranged as linear array
● LISP: cons cells, read-eval-print loop
● Algol family: stack of activation records; heap storage
● BCPL, C: underlying machine + abstractions
● Simula: Object references
● FP, ML: functions are basic control structure
● Smalltalk: objects and methods, communicating by messages
● Java: Java virtual machine
Fortran
Example
C FORTRAN IV WAS ONE OF THE FIRST PROGRAMMING
C LANGUAGES TO SUPPORT SOURCE COMMENTS
WRITE (6,7)
7 FORMAT(13H HELLO, WORLD)
STOP
END
Execution model
● A FORTRAN program= main program +subprograms
Each is compiled separately, during loading different programs are linked into an executable form.
● All storage is allocated statically before program execution begins; no run-time memory management
● Register machine- memory is a linear array. No stacks, no recursion
● A value is returned in a FORTRAN function by assigning a value to the name of a function.
● Parameter passing is uniformly by reference.
Data types
● Can't resize arrays and character strings.
● Incomplete type checking.
● Global or local scope.
COMMON blocks are are named blocks of data used to share data between different subprograms.
EQUIVALANCE allows two or more variables to link to the same data location-aliasing.
In a block-structured language, each program or subprogram is organised as a set of nested blocks. The trend in programming is to reduce the size of subprograms, so the
use of unnamed blocks is less useful than it used to be.
Fortran 2003 is oop : type extension and inheritance, polymorphism, dynamic type allocation, and type-bound procedures.
Lisp
Expressions. A syntactic entity that may be evaluated to
determine its value.
Statement. A command that alters the state of the
machine in some explicit way.
Declaration. A syntactic entity that introduces a new
identier, often specifying one or more attributes.
Type
Most operations in LISP take list arguments and return list
values.
( cons '(a b c) '(d e f) )
Lisp is an untyped language.
Scheme has strong, dynamic typing.
Scope
In the program text; dynamic scope rules relate references with associations for names during program execution.
There are two main rules for nding the declaration of a global identier:
Static scope. A global identier refers to the identier with that name that is declared in the closest enclosing scope of the program text.
Dynamic scope. A global identier refers to the identier associated with the most recent environment.
Abstract
The idealized abstract machine for Pure LISP has four parts:
1. A LISP expression to be evaluated.
2. A continuation, which is a function representing the remaining of the program to evaluate when done with the current expression.
3. An association list, also know as the A-list.
The purpose of the A-list is to store the values of variables that may occur either in the current expression to be evaluated or in the remaining expressions in the program.
4. A heap, which is a set of cons cells (or dotted pairs) that might be pointed to by pointers in the A-list.
Recursion: In general . . . , the routine for a recursive function uses itself as a
subroutine.
Garbage Collection
At a given point in the execution of a program P, a memory location ` is garbage if no completed execution of P from this point can access location `. In other words, replacing the contents of ` or making this location inaccessible to P cannot affect any further
execution of the program.
Garbage collection is the process of detecting garbage during the execution of a program and making it available.
LISP data and LISP program have the same syntax and internal representation. This allows data structures to be executed as programs and programs to be modified as data.
Passing parameters
The actual parameters in a function call are always expressions, represented as lists structures.
LISP provides two main methods of parameter passing:
call/pass by value and call/pass by name.
Algol
Introduction
The main characteristics of the Algol family are:
the familiar semicolon-separated sequence of statements, block structure, functions and procedures, and static typing.
/
Typing
Automatic type conversions were not fully specified
(e.g., x := x/y was not properly defined when x and y were integers )
The type of a procedure parameter to a procedure does not include the types of parameters.
An array parameter to a procedure is given type array, without array bounds.
Algol 60 was designed around two parameter-passing mechanisms, call-by-name and call-by-value. Call-by-name interacts badly with side effects; call-by-value is expensive for arrays.
Pascal
Pascal is a block-structured language in which static scope rules are used to determine the meaning of nonlocal references to names.
Pascal is a quasi-strong, statically typed programming language.
BCPL
Data types
The most important feature is the store: a set of numbered storage cells arranged so that the numbers labelling adjacent cells differ by one.
All storage cells are of the same size and each of them holds a value (= bit-pattern). A value is the only kind of object that can be manipulated directly in BCPL, and every variable and expression in that language will always
evaluate to one of these.
The design of BCPL distinguishes between two classes of data types.
1. Conceptual types. The kind of abstract object the programmer had in mind.
2. Internal types. Basic types for modelling conceptual types.
There is no need for type declarations in the language. This helps to make programs concise and also simplies problems such as the handling of actual/formal parameter correspondence and separate compilation.
2. It gives the language nearly the same power as one with dynamically varying types (as in LISP), and yet retains the efciency of a language (like FORTRAN) with manifest types. In languages (such as Algol) where the elements of arrays must all have the same type, one needs some other linguistic device in order to handle dynamically varying data structures.
BCPL uses a form of static storage, called global vector, which allows separately compiled modules to reference and call each other and to share data. This facility is not
unlike the FORTRAN COMMON storage area.
In BCPL, variables may be divide into two classes:
1. Static variables. The extent of a static variable is the entire execution time of the program. The storage cell is allocated prior to execution and continues to exist until execution is complete.
2. Dynamic variables. A dynamic variable is one whose extent starts when its declaration is executed and continues until
A pointer in BCPL is the address of a word of store.
Parameters
The BCPL procedure call uses the call-by-value technique for parameter passing.
BCPL has been carefully designed so that it is possible to represent a procedure by a simple BCPL value, called the procedure value. The procedure value is placed in a
variable bearing the name of the procedure.
C
Introduction
B was a pared-down version of BCPL, designed to run on the small computer used by the Unix project. The main difference between B and C is that B was untyped whereas C has types and type-checking rules.
The C programming language is similar to Algol 60, Algol 68, and Pascal in some respects: command-oriented syntax, blocks, local declarations, and recursive functions.
However, C also shares some features with its untyped precursor BCPL, such as pointer arithmetic. C is also more restricted than most Algol-based languages in that functions cannot be declared inside nested blocks: All functions are declared outside the main program. This simplies storage management.
Simula
OOP was invented with SIMULA and improved with small talk.
Simula: Extremely influential as the first language with classes objects, dynamic lookup, subtyping, and inheritance.
Objects in SIMULA
Class: A procedure returning a pointer to its activation record.
Object: An activation record produced by call to a class, called an instance of the class.
Objects are deallocated by the garbage collector (which deallocates objects only when they are no longer reachable from the program that created them).
Objects: A SIMULA object is an activation record produced by call to a class.
Classes: A SIMULA class is a procedure that returns a pointer to its activation record. The body of a class may initialise the objects it creates.
Dynamic lookup: Operations on an object are selected from the activation record of that object.
Abstraction: Hiding was not provided in SIMULA 67 but was added later and used as the basis for C++.
Subtyping: Objects are typed according to the classes that create them. Subtyping is determined by class hierarchy.
Inheritance
Example
POINT CLASS COLOREDPOINT(C); COLOR C;
BEGIN
BOOLEAN PROCEDURE EQUALS(Q); REF(COLOREDPOINT) Q;
...;
END***COLOREDPOINT**
Small Talk
Methods are public.
Any code with a pointer to an object may send any message to that object. If the corresponding method is defined in the class of the object, or any superclass, the method will be invoked. This makes all methods of an object visible to any code that can access the object.
Instance variables are protected.
The instance variables of an object are accessible only to methods of the class of the object and to methods of its subclasses.
The run-time structures used for Smalltalk classes and objects support dynamic lookup in two ways.
1. Methods are selected through the receiver object.
2. Method lookup starts with the method dictionary of the class of the receiver and then proceeds upwards through the class hierarchy.
● Objects: A Smalltalk object is created by a class.
At run time, an object stores its instance variables and a pointer to the instantiating class.
● Classes: A Smalltalk class defines variables, class methods, and the instance methods that are shared by all objects of the class.
● Abstraction: Abstraction is provided through protected instance variables. All methods are public but instance variables may be accessed only by the methods of the class and methods of subclasses.
● Subtyping: Smalltalk does not have a compile-time type system. Subtyping arises implicitly through relations between the interfaces of objects. Subtyping depends on the set of messages that are understood by an object, not the representation of objects or whether inheritance is used.
● Inheritance: Smalltalk subclasses inherit all instance variables and methods of their superclasses. Methods defined in a superclass may be redefined in a subclass or deleted.
Java
The language itself derives much of its syntax from C and C++ but has a simpler object model and fewer low-level facilities.
are not objects. Primitive types store their values in the stack rather than being references to values. This was a conscious decision by Java's designers for performance reasons. Because of this, Java is not considered to be a pure object-oriented programming language.
Java does not support pointer arithmetic as is supported in for example C++. This is because the garbage collector may relocate referenced objects, invalidating such pointers. Another reason that Java forbids this is that type safety and security can no longer be guaranteed if arbitrary manipulation of pointers is allowed.
C#

Simplicity
-perhaps a reaction against C++
-though even Java 1.0 had complex features: e.g. overloading resolution, multiple class loaders
-Safety and security
-strongly typed (mostly static)
-automatic memory management (=> no pointer errors)
-code access security by "stack inspection"
-Portability
-compiled to bytecode, executed by Java Virtual Machine
-Object-orientation
-simple model of implementation inheritance for classes
-multiple interface inheritance
