17 Typing

Effective use of object technology requires that we clearly specify, in the texts of our systems, the types of all objects that they will manipulate at run time. This rule, known as static typing (a notion defined precisely in the next sections), makes our software:

• More reliable, by enabling compilers and other tools to suppress discrepancies before they have had time to cause damage.
• More readable, by providing precious information to authors of client systems, future maintainers of our own software, and other readers.
• More efficient, since this information helps a good compiler generate better code.

Although the typing issue has been extensively discussed in non-O-O contexts, and static typing applied to many non-O-O languages, the concepts are particularly clear and relevant in object technology since the approach as a whole is largely based on the idea of type, merged with the idea of module to yield the basic O-O construct, the class.

The desire to provide static typing has been a major influence on the mechanisms discussed in earlier chapters. Here we need to take a comprehensive look at typing and devise solutions to the remaining difficulties raised by this concept.

17.1 THE TYPING PROBLEM

One nice thing can be said about the typing issue in object-oriented software construction: it may not be an easy problem, but it is a simple problem: simple, that is, to state.

The Basic Construct

The problem’s simplicity comes from the simplicity of the object-oriented model of computation. If we put aside some of the details, only one kind of event ever occurs during the execution of an object-oriented system: feature call, of the general form

    x.f (arg)

which executes on the object attached to x the operation f, using the argument arg, with the understanding that in some cases arg stands for several arguments, or no argument at all.

Smalltalk programmers would say “pass to the object x the message f with argument arg”, and use another syntax, but those are differences of style, not substance.

That everything relies on this Basic Construct accounts in part for the general feeling of beauty that object-oriented ideas arouse in many people.
From the Basic Construct follows the basic kind of abnormal event that might occur at execution time:

Definition: type violation
A run-time type violation (or just type violation for short) occurs in the execution of a call x.f (arg), where x is attached to an object OBJ, if either:
V1 • There is no feature corresponding to f and applicable to OBJ.
V2 • There is such a feature, but arg is not an acceptable argument for it.

The typing problem is the need to avoid such events:

Object-oriented typing problem
When do we know whether the execution of an object-oriented system may produce a type violation?

The key word is when. If the feature or arguments do not match, you will find out sooner or later: applying the feature “raise salary” to an instance of SUBMARINE or “fire the torpedoes” to an instance of EMPLOYEE will not work; somehow the execution will fail. But you may prefer to find out sooner rather than later.

Static and dynamic typing

Although intermediate variants are possible, two main approaches present themselves:

• Dynamic typing: wait until the last possible moment, the execution of each call.
• Static typing: rely on a set of rules that determine, from the text of a system, whether its executions may cause type violations. Only execute systems for which the rules guarantee that no violation will ever occur.

The names are easy to explain: with dynamic typing, type verification occurs at execution time (dynamically); with static typing, it is performed on the text of the software (statically, that is to say before any execution).

The terms “typed” and “untyped” are sometimes used for “statically typed” and “dynamically typed”. To avoid any confusion we will stick to the full names.

Static typing is only interesting if the rules can be checked automatically. Since software texts are usually processed by a compiler before being executed, it is convenient to have the compiler, rather than a separate tool, take care of these checks. The rest of the discussion will indeed assume for simplicity that the compiler and the type checker are the same tool. This assumption yields a simple definition:

Definition: statically typed language
An object-oriented language is statically typed if it is equipped with a set of consistency rules, enforceable by compilers, whose observance by a system text guarantees that no execution of the system can cause a type violation.
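The two cases of the type violation definition can be observed directly in a dynamically typed language. The following is a minimal sketch in Python, which is not the notation of this book; the classes Employee and Submarine and the helper call_feature are hypothetical names chosen to echo the example above. It shows that under dynamic typing each violation surfaces only when the offending call is actually executed.

```python
class Employee:
    """Hypothetical class, for illustration only."""

    def __init__(self, salary):
        self.salary = salary

    def raise_salary(self, amount):
        # Adding a non-numeric amount to a number fails with TypeError.
        self.salary += amount


class Submarine:
    """Hypothetical class that has no raise_salary feature."""

    def fire_torpedoes(self, count):
        return count


def call_feature(obj, feature_name, arg):
    """Execute obj.feature_name(arg) and classify the outcome."""
    try:
        feature = getattr(obj, feature_name)
    except AttributeError:
        return "V1"  # no feature of that name is applicable to obj
    try:
        feature(arg)
    except TypeError:
        return "V2"  # the feature exists, but arg is not acceptable
    return "ok"
```

A statically typed language, as defined above, would reject the last two calls in the test below from the text of the system, before any execution; here nothing is reported until each call runs.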
In the literature you will encounter the term “strong typing”. It corresponds to the all-or-nothing nature of this definition, which demands rules that guarantee the absence of type violations. Weak forms of static typing, whose rules eliminate certain type violations but not all, are also possible, and some O-O languages are indeed weakly-statically-typed in this sense. We shall strive, however, for the strongest possible form.

Some authors also talk about strong forms of dynamic typing. But this is a contradiction. In a dynamically typed language (also known as an “untyped” language), there are no type declarations; entities simply become associated with whatever values the execution of the software attaches to them. No static type checking is possible.

Typing rules

Our object-oriented notation is statically typed. Its type rules have been introduced in earlier chapters; they boil down to three simple constraints:

• Every entity or function must be declared as being of a certain type, as in acc: ACCOUNT; every routine declares zero or more formal arguments, with a type for each, as in put (x: G; i: INTEGER).
• In any assignment x := y, and in any routine call using y as the actual argument for the formal argument x, the type of the source y must conform to the type of the target x. The definition of conformance is based on inheritance (B conforms to A if it is a descendant of A), complemented by rules for generic parameters. (Type Conformance rule, page 474.)
• In a call of the form x.f (arg), f must be a feature of the base class of x’s type, and must be available to the class in which the call appears. (Feature Call rule, page 473.)

Realism

Although the definition of “statically typed language” is precise, it also highlights the need for informal criteria in devising type rules. Consider the following two extreme cases:

• An all-valid language in which every syntactically correct system is also typewise-valid, with no need for type rules. Such languages are possible (imagine for example a small notation for Polish-style additions and subtractions with integers); unfortunately, as readers familiar with the theory of computation will know, no useful general-purpose language can meet that criterion.
• An all-invalid language, easy to devise: just take any existing language and add a type rule that makes any system invalid! This makes the language typed according to the definition: since no system passes the rules, no system that passes the rules can cause a type violation.

We may say that an all-valid language is usable, but not useful for general-purpose development; an all-invalid language may be useful, but it is not usable.

What we need in practice is a type system that makes the language both useful and usable: powerful enough to express the computations we need; convenient enough not to force us into undue complications to satisfy the type rules.
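To make the three typing rules listed earlier in this section more concrete, here is a toy checker sketched in Python (again, not the notation of this book). It is an illustration under strong simplifying assumptions: single inheritance only, no genericity, no export restrictions, and a hypothetical class table whose names (ACCOUNT, SAVINGS_ACCOUNT, deposit, add_interest) are invented for the example. It works purely on declarations, that is, on the text of a system, and never executes a call.

```python
# Hypothetical class table: name -> (parent or None, {feature: formal argument type}).
CLASSES = {
    "ANY": (None, {}),
    "INTEGER": ("ANY", {}),
    "ACCOUNT": ("ANY", {"deposit": "INTEGER"}),
    "SAVINGS_ACCOUNT": ("ACCOUNT", {"add_interest": "INTEGER"}),
}


def ancestors(class_name):
    """Yield class_name, then its parent, its grandparent, and so on."""
    while class_name is not None:
        yield class_name
        class_name = CLASSES[class_name][0]


def conforms(source, target):
    """Simplified conformance: source is target or one of its descendants."""
    return target in ancestors(source)


def find_feature(class_name, feature):
    """Return the formal argument type of feature, or None if it is absent."""
    for c in ancestors(class_name):
        features = CLASSES[c][1]
        if feature in features:
            return features[feature]
    return None


def check_assignment(decl, target, source):
    """Validity of target := source under the first two rules."""
    if target not in decl or source not in decl:
        return False  # first rule: every entity must be declared
    return conforms(decl[source], decl[target])  # second rule


def check_call(decl, target, feature, arg):
    """Validity of target.feature (arg) under the three rules."""
    if target not in decl or arg not in decl:
        return False  # first rule
    formal = find_feature(decl[target], feature)
    if formal is None:
        return False  # third rule: feature must exist in the base class
    return conforms(decl[arg], formal)  # second rule, applied to the argument
```

With declarations acc: ACCOUNT and sav: SAVINGS_ACCOUNT, the checker accepts acc := sav and rejects sav := acc, whatever objects might be attached to these entities at run time.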
We will say that a language is realistic if it is both useful and usable. Unlike the definition of static typing, which always yields an indisputable answer to the question “Is language X statically typed?”, the definition of realism is partly subjective; reasonable people may disagree on whether a language, equipped with certain type rules, is still useful and usable.

In this chapter we will check that the typed notation defined in the preceding chapters is realistic.

Pessimism

In discussing approaches to O-O typing we should keep in mind another general property of static typing: it is always, by nature, a pessimistic policy. Trying to guarantee that no computation shall ever fail, you disallow some computations that might succeed.

To see this, consider a trivial non-O-O language, Pascal-like, with distinct types INTEGER and REAL. With the declaration n: INTEGER, the assignment n := r, where r is of type REAL, will be rejected as violating the type rules. So all the following will be considered type-invalid and rejected by the compiler:

    n := 0.0             [A]
    n := 1.0             [B]
    n := -3.67           [C]
    n := 3.67 - 3.67     [D]

Of these invalid operations, [A], if permitted to execute, would always work since any number system will provide an exact representation for the floating-point number 0.0, which can be transformed unambiguously to the integer 0. [B] would almost certainly work too. [C] is ambiguous (do we want the rounded version, the truncated version of the number?). But [D] would work. So would

    if n ^ 2 < 0 then n := 3.67 end     [E]

because the assignment will never be executed (n ^ 2 denotes the square of n). If we replace n ^ 2 by just n, where n is read from user input just before the test, some executions would work (those for which n is non-negative), others would not. Assigning to n a very large real number, not representable as an integer, would not work.

In a typed language, all these examples (those which would always work, those which would never work, and those which would work some of the time) are equally and mercilessly considered violations of the type rules, and any compiler will reject them.

The question then is not whether to be pessimistic but how pessimistic we can afford to be. We are back to the realism requirement: if the type rules are so pessimistic as to bar us from expressing in a simple way the computations that we need, we will reject them. But if they achieve type safety with little loss of expressive power, we will accept them and enjoy the benefits. For example making n := r invalid turns out to be good news if the environment provides functions such as round and truncate, enabling you to convert a real into an integer in exactly the way you want, without the ambiguity of an implicit conversion.
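The ambiguity of example [C] is easy to see once the conversion is spelled out. The sketch below uses Python's built-in round and math.trunc as stand-ins for the round and truncate functions mentioned above; the wrapper to_integer and its policy argument are hypothetical, introduced only for the illustration. Python itself is dynamically typed, so nothing here is rejected at compile time; the point is only that the two explicit conversions disagree on -3.67 and agree on the results of [A] and [D].

```python
import math


def to_integer(r, policy):
    """Convert the real r to an integer using an explicitly chosen policy."""
    if policy == "round":
        return round(r)       # nearest integer (exact halves go to the even neighbor)
    if policy == "truncate":
        return math.trunc(r)  # drop the fractional part, toward zero
    raise ValueError("unknown policy: " + policy)
```

For -3.67 the rounded version is -4 and the truncated version is -3, which is exactly the choice an implicit conversion would have to make silently.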
17.2 STATIC TYPING: WHY AND HOW

Although the advantages of static typing seem obvious, it is necessary to review the terms of the debate.

The benefits

The reasons for using a statically typed form of object technology were listed at the very beginning of this chapter: reliability, readability and efficiency.

The reliability value comes from the use of static typing to detect errors that would otherwise manifest themselves only at run time, and only in certain runs. The rule that forces you to declare entities and functions (the first of our three type rules above) introduces redundancy into the software text; this enables the compiler, through the other two rules, to detect inconsistencies between the purpose and actual use of an entity, feature or expression.

Catching errors early is essential, as correction cost grows quickly with the detection delay. This property, intuitively clear to all software professionals, is confirmed quantitatively, for specification errors, by Boehm’s well-known studies, plotting the cost of correcting an error against the time at which it is found (base 1 if found at requirements time), for both a set of large industrial projects and a controlled small project experiment:

[Figure: Relative cost of correcting errors. After [Boehm 1981]; reproduced with permission. Vertical axis: correction cost (marks at 1, 20, 500, 1000). Horizontal axis: time error found (requirements, design, code, development test, acceptance test, operation). Curves: LARGE PROJECTS, SMALL PROJECT.]

The readability benefit is also appreciable. As the examples appearing throughout this book should show convincingly, declaring every entity and function with a certain type is a powerful way of conveying to the software reader some information about its intended uses. This is particularly precious for maintainers of the software.

If readability were not part of the goal we might be able to obtain some of the other benefits of typing without explicit declarations. It is possible indeed, under certain conditions, to use an implicit form of typing in which the compiler, instead of requiring software authors to declare entity types, attempts to determine the type of each entity automatically from its uses. This is known as type inference. But from a software engineering perspective explicit declarations are a help, not a penalty; types should be clear not just to the compiler but to the human reader.
Finally, the efficiency benefit can make the difference between success and failure of object technology in practice. Without static typing, the execution of x.f (arg) can take an arbitrarily long time: as we saw in the discussion of inheritance, the basic algorithm looks for a feature f in the base class C of x’s type; if it does not find it, it looks in C’s parents, and so on. This is a fatal source of inefficiency. It can be mitigated by improvements to the basic algorithm, and the authors of the Self language have done extensive work to enable better code generation for a dynamically typed language. But it is through static typing that O-O software has been able to approach or equal the efficiency of traditional software.

(For more details on the implementation techniques discussed in this section see “Dynamic binding and efficiency”, page 508. On Self, see the bibliographical notes.)

The key idea was explained in the earlier discussion. When the compiler generates the code for x.f (arg), it knows the type of x. Because of polymorphism, this is not necessarily the type of the attached run-time object OBJ, and so does not uniquely determine the proper version of f. But the declaration restricts the set of possible types, enabling the compiler to generate tables providing run-time access to the right f at minimum, and constant-bounded, expense. Further optimizations of static binding and inlining, also facilitated by typing, eliminate the expense altogether in applicable cases.

Arguments for dynamic typing

In spite of these benefits of static typing, dynamic typing keeps its supporters, found in particular in the Smalltalk community. Their argument mainly follows from the realism issue cited above: they contend that static typing is too constraining, preventing the unfettered expression of software ideas. Terms such as “stranglehold” and “chastity belt” are often heard in such discussions.

This argument can be correct, but only for a statically typed language that misses some important facilities. It is indeed remarkable that all the type-related concepts introduced in preceding chapters are necessary; remove any of them, and the straitjacket comment becomes valid in at least some cases. But by including them all we obtain enough flexibility to make static typing both practical and pleasurable.

The ingredients of successful typing

Let us review the mechanisms which permit realistic static typing. They have all been introduced in earlier chapters, so that we only need a brief reminder for each; listing them all together shows the consistency and power of their combination.

Our type system is entirely based on the notion of class. Even basic types such as INTEGER are defined by classes. So we do not need special rules for predefined types. (Here the notation departs from “hybrid” languages such as Object Pascal, Java and C++, which retain the type system of an older language along with the class-based system of object technology.)

Expanded types give us more flexibility by allowing types whose values denote objects along with types whose values denote object references (see “COMPOSITE OBJECTS AND EXPANDED TYPES”, 8.7, page 254).

Crucial flexibility is afforded by inheritance and the associated notion of conformance (see “Limits to polymorphism”, page 474). This addresses the major limitation of traditional typed languages such as
$17.2 STATIC TYPING:WHY AND HOW 617 Pascal and Ada,where an assignment x:=y requires the types ofx and y to be identical. This rule is too strict:it prevents you from using an entity that may denote objects of various related types,such as a SAVINGS ACCOUNT and a CHECKING ACCOUNT. With inheritance,all we require is that the type ofy conform to the type ofx;this is the case ifx is of type ACCOUNT,y of type SAVINGS ACCOUNT,and the latter class is a descendant of the former. Chapter 15. To be practical,a statically typed language requires its inheritance scheme to support multiple inheritance.A principal part of common objections against static typing is that it prevents you from looking at objects in different ways.For example an object of type DOCUMENT might need to be transmitted over a network,and so will need the features associated with objects of type MESSAGE.But this is only a problem with a language that is restricted to single inheritance;with multiple inheritance you can introduce as many viewpoints as you need. Multiple inheritance DOCUMENT MESSAGE MAILABLE DOCUMENT Chapter 10. We also need genericity,to define flexible yet type-safe container data structures. For example a list class will be defined as class L/ST [G]...Without this mechanism, static typing would force us to declare a different class for each type of list element-an obviously unsustainable solution. "CONSTRAINED Genericity needs in some cases to be constrained,allowing us to apply certain GENERICITY". operations to entities of a generic type.For example if a generic class SORTABLE L/ST 16.4page585. has a sort operation,it requires a comparison operation on entities of type G,the generic parameter.This is achieved by associating with G a generic constraint COMPARABLE: class SORTABLE LIST [G->COMPARABLE]... meaning that any actual generic parameter used for SORTABLE L/ST must be a descendant of class COMPARABLE,which has the required comparison features. 
Another indispensable mechanism is assignment attempt (see “ASSIGNMENT ATTEMPT”, 16.5, page 591), to access objects whose type the software does not control. If y denotes an object obtained from a database or a network, you cannot be sure it has the expected type; the assignment attempt x ?= y will assign to x the value of y if it is of a compatible type, but otherwise will make x void. Without assignment attempt we could not abide by the type rules in such cases.

Assertions (chapter 11), associated, as part of the idea of Design by Contract, with classes and features in the form of preconditions, postconditions and class invariants, allow you to describe semantic constraints which cannot be captured by type specifications. Although with the “interval types” of such languages as Pascal and Ada you can declare, for
example, that a certain entity takes its values between 10 and 20, no type mechanism will enable you to state that i must be either in that interval or negative, and always twice as much as j. Here class invariants come to the rescue, by letting you specify exactly what you need, however sophisticated the constraint.

Anchored declarations (see “ANCHORED DECLARATION”, 16.7, page 599) are essential in practice to avoid redeclaration avalanche. By declaring y: like x you make sure that y will follow any redeclaration of the type of x in a descendant. Without this mechanism developers would be endlessly redeclaring routines for type purposes only.

Anchored declarations are a specific case of our last required language mechanism: covariance, which will be discussed in more detail later in this chapter.

A practical property of the environment is also essential: fast incremental recompilation. When you write a system or (more commonly) modify an existing system, you will want to see the effect soon. With static typing you must first let the compiler retypecheck the system. Traditional compiling techniques require recompiling the whole system (and going through a linking process); the time may be painfully long, especially for a proportionally small change to a large system. This phenomenon has been a major a contrario argument for interpreted approaches, such as those of early Lisp and Smalltalk environments, which execute systems with no or little processing, hence no type checking. But modern compiler technology removes this argument. A good compiler will detect what has changed since the last compilation, and reprocess only that part, keeping the recompilation time small, and proportional to the size of the change, not of the system. The Melting Ice Technology described in the last chapter of this book achieves this goal, typically permitting recompilation in a matter of seconds after a small change even to a large system.

“A little bit typed”?

It was noted above that we should aim for a strong form of static typing. This means that we should avoid any loopholes in the static requirements or, if any such loopholes remain, identify them clearly, if possible providing tools to flag any software using them.

The most common loophole, in languages that are otherwise statically typed, is the presence of conversions that disguise the type of an entity. In C and its derivatives, conversions are called “casts” and follow a simple syntax: (OTHER_TYPE) x denotes the value of x presented to the compiler as if it were of type OTHER_TYPE; there are few limitations on what that type may be, regardless of x’s actual type.

Such mechanisms evade the constraints of type checking; casting is indeed a pervasive feature of C programming, including in the ANSI C variant (which is “more” typed than its precursor, the so-called Kernighan and Ritchie version). Even in C++, examination of published software shows that casts, although less frequent, remain an accepted and possibly indispensable occasional practice.
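The effect of such a loophole can be imitated in Python, used here only as an illustration and not as part of the book's notation. The standard function typing.cast presents a value to a static checker as if it had another type and returns it unchanged at run time; the second function below is a checked counterpart in the spirit of the assignment attempt x ?= y described earlier. The classes, the feature interest_rate and the function names forced and attempted are hypothetical.

```python
from typing import Optional, cast


class Account:
    pass


class SavingsAccount(Account):
    def interest_rate(self) -> float:
        return 0.02  # arbitrary value for the example


def forced(obj: object) -> SavingsAccount:
    # A static checker now treats the result as a SavingsAccount,
    # but cast() performs no run-time check and returns obj unchanged.
    return cast(SavingsAccount, obj)


def attempted(obj: object) -> Optional[SavingsAccount]:
    # Checked counterpart: yields the object if it really has the
    # candidate type, and None (the analogue of void) otherwise.
    return obj if isinstance(obj, SavingsAccount) else None


from_network: object = "not an account"

disguised = forced(from_network)   # passes the checker, is still a str
checked = attempted(from_network)  # None: the mismatch is detected
```

Calling disguised.interest_rate() would fail at run time with an AttributeError even though the annotations promise a SavingsAccount, whereas a caller of attempted has to deal with the None case explicitly.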
It seems difficult to accept claims of static typing if at any stage the developer can eschew the type rules through casts. Accordingly, the rest of this chapter will assume that the type system is strict and allows no casts.

You may have noted that assignment attempts, mentioned above as an essential component of a realistic type system, superficially resemble casts. But there is a fundamental difference: an assignment attempt does not blindly force a different type; it tries a candidate type, and enables the software to check whether the object actually matches that type. This is safe, and indispensable in some circumstances. The C++ literature sometimes includes assignment attempts (“downcasts”) in its definition of casts; clearly, the above prohibition of casts only covers the harmful variant, and does not extend to assignment attempts.

Typing and binding: avoiding the confusion

Although as a reader of this book you will have no difficulty distinguishing static typing from static binding, you may meet people who confuse the two notions. This may be due in part to the influence of Smalltalk, whose advocacy of a dynamic approach to both typing and binding may leave the inattentive observer with the incorrect impression that the answer to both questions must be the same. (The analysis developed in this book suggests that to achieve reliability and flexibility it is preferable to combine dynamic binding with static typing.) Let us carefully compare the two concepts.

Both have to do with the semantics of the Basic Construct x.f (arg); they cover the two separate questions that it raises:

Typing and binding
• Typing question: When do we know for sure that at run time there will be an operation corresponding to f and applicable to the object attached to x (with the argument arg)?
• Binding question: Which operation will the call execute?
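The two questions can be told apart in a small Python sketch, offered as an illustration under stated assumptions rather than as the book's notation. Python answers both questions dynamically by default, so the static answer to the binding question is simulated here by naming the class explicitly. The classes Polygon and Rectangle, their bodies and the three helper functions are hypothetical.

```python
class Polygon:
    def perimeter(self) -> str:
        return "Polygon version"


class Rectangle(Polygon):
    def perimeter(self) -> str:  # redeclares the inherited operation
        return "Rectangle version"


def has_operation(x: object, name: str) -> bool:
    # Typing question answered dynamically: look for the operation
    # only at execution time, just before the call would happen.
    return callable(getattr(x, name, None))


def call_dynamically_bound(x: Polygon) -> str:
    # Binding decided from the run-time type of the attached object.
    return x.perimeter()


def call_statically_bound(x: Polygon) -> str:
    # Binding fixed before execution from the declared type Polygon.
    return Polygon.perimeter(x)


r: Polygon = Rectangle()
```

With r attached to a Rectangle, the dynamically bound call runs the Rectangle version and the statically bound call runs the Polygon version. A static answer to the typing question would instead come from a checker rejecting x.perimeter() whenever the declared type of x offers no such operation.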
Typing addresses the existence of at least one operation; binding addresses the choice of the right one among these operations, if there is more than one candidate.

In object technology:
• The typing question follows from polymorphism: since x may denote run-time objects of several possible types, we must make sure that an operation representing f is available in all cases.
• The binding question follows from redeclaration: since a class can change an inherited feature, as with RECTANGLE redefining perimeter inherited from POLYGON, there may be two or more operations all vying to be the one representing f for a particular call.

Both answers can be dynamic, meaning at execution time, or static, meaning before execution. All four possibilities appear in actual languages:
• Some non-O-O languages, such as Pascal and Ada, have both static typing and static binding. In these languages each entity represents objects of only one type, specified statically; the approach yields reliability at the expense of flexibility.
• Smalltalk and other O-O languages influenced by it have dynamic binding and dynamic typing. This is the reverse choice: favoring flexibility at the expense of reliability enforcement.
• Some non-O-O languages are untyped (really meaning, as we have seen, dynamically typed) and statically bound. They include assembly languages and some scripting languages.
• The notation developed in this book supports static typing and dynamic binding.

Note the peculiarity of C++, which supports static typing (although in a non-strong form because of the presence of casts) and, for binding, a static policy by default, while permitting dynamic binding at the price of explicit virtual declarations. (The C++ policy was discussed in “The C++ approach to binding”, page 514.)

The reason for choosing static typing and dynamic binding is clear. To the first question, “when do we know we have a feature?”, the most attractive answer for reliable software engineering is the static one: “at the earliest possible time”, that is, compilation time, to catch errors before they catch you. To the second question, “what feature do we use?”, the most attractive answer is the dynamic one: “the right feature”, that is, the feature directly adapted to the object’s type. As discussed in detail in the presentation of inheritance, this is the only acceptable solution unless static and dynamic binding have the same effect.

The following fictitious inheritance hierarchy helps make these notions more vivid.

[Figure: Kinds of flying object. Inheritance hierarchy with classes AIRCRAFT, PLANE, COPTER, BOEING, AIRBUS, B_737, B_747, B_747_400 and A_320; the feature lower_landing_gear appears with the markers * (deferred), + (effected) and ++ (redefined).]

For a call of the form my_aircraft.lower_landing_gear