8 The run-time structure: objects

In the previous chapter we saw that classes may have instances, called objects. We must now turn our attention to these objects and, more generally, to the run-time model of object-oriented computation.

Where the previous chapters were mostly concerned with conceptual and structural issues, the present one will, for the first time in this book, include implementation aspects. In particular it will describe how the execution of object-oriented software uses memory, a discussion continued by the study of garbage collection in the next chapter. As already noted, one of the benefits of object technology is to restore implementation issues to their full status; so even if your interest is mostly in analysis and design topics you should not be afraid of this excursion into implementation territory. It is impossible to understand the method unless you have some idea of its influence on run-time structures.

The study of object structures in this chapter indeed provides a particularly good example of how wrong it is to separate implementation aspects from supposedly higher-level issues. Throughout the discussion, whenever we realize the need for a new O-O technique or mechanism, initially introduced for some implementation-related purpose, the real reason will almost always turn out to be deeper: we need the facility just as much for purely descriptive, abstract purposes. A typical example will be the distinction between references and expanded values, which might initially appear to be an obscure programming technique, but in reality provides a general answer to the question of sharing in whole-to-parts relations, an issue that figures prominently in many discussions of object-oriented analysis.

This contribution of implementation is sometimes hard to accept for people who have been influenced by the view, still prevalent in the software literature, that all that counts is analysis. But it should not be so surprising. To develop software is to develop models. A good implementation technique is often a good modeling technique as well; it may be applicable, beyond software systems, to systems from various fields, natural and artificial.

More than implementation in the strict sense of the term, then, the theme of this chapter is modeling: how to use object structures to construct realistic and useful operational descriptions of systems of many kinds.
8.1 OBJECTS

At any time during its execution, an O-O system will have created a certain number of objects. The run-time structure is the organization of these objects and of their relations. Let us explore its properties.

What is an object?

First we should recall what the word “object” means for this discussion. There is nothing vague in this notion; a precise technical definition was given in the previous chapter:

Definition: object
An object is a run-time instance of some class.

(The definition appeared on page 166. See also the Object rule, page 171.)

A software system that includes a class C may at various points of its execution create (through creation and cloning operations, whose details appear later in this chapter) instances of C; such an instance is a data structure built according to the pattern defined by C; for example an instance of the class POINT introduced in the previous chapter is a data structure consisting of two fields, associated with the two attributes x and y declared in the class. The instances of all possible classes constitute the set of objects.

The above definition is the official one for object-oriented software. But “object” also has a more general meaning, coming from everyday language. Any software system is related to some external system, which may contain “objects”: points, lines, angles, surfaces and solids in a graphics system; employees, pay checks and salary scales in a payroll system; and so on. Some of the objects created by the software will be in direct correspondence with such external objects, as in a payroll system that includes a class EMPLOYEE, whose run-time instances are computer models of employees.

This dual use of the word “object” has some good consequences, which follow from the power of the object-oriented method as a modeling tool. Better than any other method, object technology highlights and supports the modeling component of software development. This explains in part the impression of naturalness which it exudes, the attraction it exerts on so many people, and its early successes (still among the most visible) in such areas as simulation and user interfaces. The method here enjoys the direct mapping property which an earlier chapter described as a principal requirement of good modular design (see “Direct Mapping”, page 47). With software systems considered to be direct or indirect models of real systems, it is not surprising that some classes will be models of external object types from the problem domain, so that the software objects (the instances of these classes) are themselves models of the corresponding external objects.

But we should not let ourselves get too carried away by the word “object”. As always in science and technology, it is a bit risky to borrow words from everyday language and give them technical meanings. (The only discipline which seems to succeed in this delicate art is mathematics, which routinely hijacks such innocent words as “neighborhood”, “variety” or “barrel” and uses them with completely unexpected meanings, perhaps the
reason why no one seems to have any trouble.) The term “object” is so overloaded with everyday meanings that in spite of the benefits just mentioned its use in a technical software sense has caused its share of confusion. In particular:

• As pointed out in the discussion of direct mapping, not all classes correspond to object types of the problem domain. The classes introduced for design and implementation have no immediate counterparts in the modeled system. They are often among the most important in practice, and the most difficult to find.

• Some concepts from the problem domain may yield classes in the software (and objects in the software’s execution) even though they would not necessarily be classified as objects in the usual sense of the term if we insist on a concrete view of objects. A class such as STATE in the discussion of the form-based interactive system, or COMMAND (to be studied in a later chapter in connection with undo-redo mechanisms), falls in this category. (See chapter 20 about the form-based system. About the notion of command, see chapter 21.)

When the word “object” is used in this book, the context will clearly indicate whether the usual meaning or (more commonly) the technical software meaning is intended. When there is a need to distinguish, one may talk about external objects and software objects.

Basic form

A software object is a rather simple animal once you know what class it comes from.

Let O be an object. The definition on the previous page indicates that it is an instance of some class. More precisely, it is a direct instance of just one class, say C.

Because of inheritance, O will then be an instance, direct or not, of other classes, the ancestors of C; but that is a matter for a future chapter, and for the present discussion we only need the notion of direct instance. The word “direct” will be dropped when there is no possible confusion.

C is called the generating class, or just generator, of O. C is a software text; O is a run-time data structure, produced by one of the object creation mechanisms studied below.

Among its features, C has a certain number of attributes. These attributes entirely determine the form of the object: O is simply a collection of components, or fields, one for each attribute.

Consider class POINT from the previous chapter (for the text of class POINT see page 176). The class text was of the form:

    class POINT feature
        x, y: REAL
        … Routine declarations …
    end

The routines have been omitted, and for good reason: the form of the corresponding objects (the direct instances of the class) is solely determined by the attributes, although the operations applicable to the objects depend on the routines. Here the class has two attributes, x and y, both of type REAL, so a direct instance of POINT is an object with two fields containing values of that type, for example:
[Figure: object P_OBJ, a direct instance of POINT, with fields x and y holding the values 3.4 and -8.09]

Notice the conventions used here and in the rest of this book for representing an object as a set of fields, shown as adjacent rectangles containing the associated values. Below the object the name of the generating class, here POINT, appears in parentheses and in italics; next to each field, also in italics, there appears the name of the corresponding attribute, here x and y. Sometimes a name in roman (here P_OBJ) will appear above the object; it has no counterpart in the software but identifies the object in the discussion. (See “Graphical conventions”, page 271.)

In diagrams used to show the structure of an object-oriented system, or more commonly of some part of such a system, classes appear as ellipses. This convention, already used in the figures of the previous chapter, avoids any confusion between classes and objects.

Simple fields

Both attributes of class POINT are of type REAL. As a consequence, each of the corresponding fields of a direct instance of POINT contains a real value.

This is an example of a field corresponding to an attribute of one of the “basic types”. Although these types are formally defined as classes, their instances take their values from predefined sets implemented efficiently on computers. They include:

• BOOLEAN, which has exactly two instances, representing the boolean values true and false.

• CHARACTER, whose instances represent characters.

• INTEGER, whose instances represent integers.

• REAL and DOUBLE, whose instances represent single-precision and double-precision floating-point numbers.

Another type which for the time being will be treated as a basic type, although we will later see that it is actually in a different category, is STRING, whose instances represent finite sequences of characters (see “STRINGS”, 13.5, page 456).

For each of the basic types we will need the ability to denote the corresponding values in software texts and on figures. The conventions are straightforward:

• For BOOLEAN, the two instances are written True and False.

• To denote an instance of CHARACTER you will write a character enclosed in single quotes, such as 'A'.
• To denote an instance of STRING, write a sequence of characters in double quotes, as in "A STRING".

• To denote an instance of INTEGER, write a number in an ordinary decimal notation with an optional sign, as in 34, -675 and +4.

• You can also write an instance of REAL or DOUBLE in ordinary notation, as in 3.5 or -0.05. Use the letter e to introduce a decimal exponent, as in -5.e-2 which denotes the same value as the preceding example.

A simple notion of book

Here is a class with attribute types taken from the preceding set:

    class BOOK1 feature
        title: STRING
        date, page_count: INTEGER
    end

A typical instance of class BOOK1 may appear as follows:

[Figure: An object representing a book. An instance of BOOK1 with fields title = "The Red and the Black", date = 1830, page_count = 341]

Since for the moment we are only interested in the structure of objects, all the features in this class and the next few examples are attributes; none are routines.

This means that our objects are similar at this stage to the records or structure types of non-object-oriented languages such as Pascal and C. But unlike the situation in these languages there is little we can do with such a class in a good O-O language: because of the information hiding mechanisms, a client class has no way of assigning values to the fields of such objects. In Pascal, or in C with a slightly different syntax, a record type with a similar structure would allow a client to include the declaration and instruction

    b1: BOOK1
    …
    b1.page_count := 355

(Warning: not permitted in the O-O notation! For discussion only.)

which at run time will assign value 355 to the page_count field of the object attached to b1. With classes, however, we should not provide any such facility: letting clients change object fields as they please would make a mockery of the rule of information hiding, which
implies that the author of each class controls the precise set of operations that clients may execute on its instances. No such direct field assignment is possible in an O-O context; clients will perform field modifications through procedures of the class. Later in this chapter we will add to BOOK1 a procedure that gives clients the effect of the above assignment, if the author of the class indeed wishes to grant them such privileges.

We have already seen that C++ and Java actually permit assignments of the form b1.page_count := 355. But this simply reflects the inherent limits of attempts to integrate object technology in a C context.

As the designers of Java themselves write in their book about the language: “A programmer could still mess up the object by setting [a public] field, because the field [is] subject to change” through direct assignment instructions ([Arnold 1996], page 40). Too many languages require such “don’t do this” warnings. Rather than propose a language and then explain at length how not to use it, it is desirable to define hand in hand the method and a notation that will support it. (See also “If it is baroque, fix it”, page 670.)

In proper O-O development, classes without routines, such as BOOK1, have little practical use (except as ancestors in an inheritance hierarchy, where descendants will inherit the attributes and provide their own routines; or to represent external objects which the O-O part can access but not modify, for example sensor data in a real-time system). But they will help us go through the basic concepts; then we will add routines.

Writers

Using the types mentioned above, we can also define a class WRITER describing a simple notion of book author:

    class WRITER feature
        name, real_name: STRING
        birth_year, death_year: INTEGER
    end

[Figure: A “writer” object. An instance of WRITER with fields name = "Stendhal", real_name = "Henri Beyle", birth_year = 1783, death_year = 1842]

References

Objects whose fields are all of basic types will not take us very far. We need objects with fields that represent other objects. For example we will want to represent the property that a book has an author, denoted by an instance of class WRITER.
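The information hiding point made above for BOOK1 can be illustrated with a minimal sketch in Python rather than in the book's notation. It assumes that read-only properties are an acceptable stand-in for exported attributes, and the procedure name set_page_count is hypothetical (the text only announces that such a procedure will be added later). Python enforces privacy by convention only, so this approximates the rule rather than reproducing it.

```python
class Book1:
    """Rough analogue of BOOK1: clients may read the fields but not assign them."""

    def __init__(self, title: str, date: int, page_count: int) -> None:
        # Leading underscores mark the fields as internal (a Python convention).
        self._title = title
        self._date = date
        self._page_count = page_count

    @property
    def title(self) -> str:
        return self._title

    @property
    def date(self) -> int:
        return self._date

    @property
    def page_count(self) -> int:
        return self._page_count

    def set_page_count(self, n: int) -> None:
        """Hypothetical procedure through which the class author grants modification."""
        if n < 0:
            raise ValueError("page count must be non-negative")
        self._page_count = n


b1 = Book1("The Red and the Black", 1830, 341)

# Direct assignment is rejected because the property has no setter.
try:
    b1.page_count = 355
    direct_assignment_allowed = True
except AttributeError:
    direct_assignment_allowed = False

# The same effect is available through the procedure of the class.
b1.set_page_count(355)
```

Here the class author, not the client, decides which modifications exist and what checks they perform.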
A possibility is to introduce a notion of subobject. For example we might think of a book object, in a new version BOOK2 of the book class, as having a field author which is itself an object, as informally suggested by the following picture:

[Figure: Two “book” objects with “writer” subobjects. Two instances of BOOK2, one with title = "The Red and the Black", date = 1830, page_count = 341, the other with title = "Life of Rossini", date = 1823, page_count = 307; each contains its own WRITER subobject with name = "Stendhal", real_name = "Henri Beyle", birth_year = 1783, death_year = 1842]

Such a notion of subobject is indeed useful and we will see, later in this chapter, how to write the corresponding classes.

But here it is not exactly what we need. The example represents two books with the same author; we ended up duplicating the author information, which now appears as two subobjects, one in each instance of BOOK2. This duplication is probably not acceptable:

• It wastes memory space. Other examples would make this waste even more unacceptable: imagine for example a set of objects representing people, each one with a subobject representing the country of citizenship, where the number of people represented is large but the number of countries is small.

• Even more importantly, this technique fails to account for the need to express sharing. Regardless of representation choices, the author fields of the two objects refer to the same instance of WRITER; if you update the WRITER object (for example to record an author’s death), you will want the change to affect all book objects associated with the given author.

Here then is a better picture of the desired situation, assuming yet another version of the book class, BOOK3:
[Figure: Two “book” objects with references to the same “writer” object. The author fields of two BOOK3 instances, "The Red and the Black" (date 1830, page_count 341) and "The Charterhouse of Parma" (date 1839, page_count 307), both point to a single WRITER object with name "Stendhal", real_name "Henri Beyle", birth_year 1783, death_year 1842.]

The author field of each instance of BOOK3 contains what is known as a reference to a possible object of type WRITER. It is not difficult to define this notion precisely:

Definition: reference
A reference is a run-time value which is either void or attached.
If attached, a reference identifies a single object. (It is then said to be attached to that particular object.)

In the last figure, the author reference fields of the BOOK3 instances are both attached to the WRITER instance, as shown by the arrows, which are conventionally used on such diagrams to represent a reference attached to an object. The following figure has a void reference (perhaps to indicate an unknown author), showing the graphical representation of void references:

[Figure: An object with a void reference field. A BOOK3 instance with title "Candide, or Optimism", date 1759, page_count 120, and a void author field. (“Candide” was published anonymously.)]
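The sharing and void cases pictured above can be tried out in any language with reference semantics. The following is a minimal sketch in Python, not in the notation of this book; the class names Writer and Book3 simply mirror the figures, and Python's None is assumed here as a stand-in for a void reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Writer:
    name: str
    real_name: str
    birth_year: int
    death_year: Optional[int] = None  # None: death not yet recorded


@dataclass
class Book3:
    title: str
    date: int
    page_count: int
    author: Optional[Writer] = None  # a reference: attached, or void (None)


stendhal = Writer("Stendhal", "Henri Beyle", 1783)

# Both author fields are attached to the same Writer object.
red_and_black = Book3("The Red and the Black", 1830, 341, stendhal)
charterhouse = Book3("The Charterhouse of Parma", 1839, 307, stendhal)

# A void reference, as in the "Candide" figure.
candide = Book3("Candide, or Optimism", 1759, 120)

# Updating the shared object is visible through every reference to it.
stendhal.death_year = 1842
print(red_and_black.author.death_year)  # 1842
print(charterhouse.author.death_year)   # 1842
print(candide.author is None)           # True
```

Because the two books hold references rather than private copies, the author's death is recorded once and observed from both books, which is the behavior the subobject version could not provide.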
The definition of references makes no mention of implementation properties. A reference, if not void, is a way to identify an object; an abstract name for the object. This is similar to a social security number that uniquely identifies a person, or an area code that identifies a phone area. Nothing implementation-specific or computer-specific here.

The reference concept of course has a counterpart in computer implementations. In machine-level programming it is possible to manipulate addresses; many programming languages offer a notion of pointer. The notion of reference is more abstract. Although a reference may end up being represented as an address, it does not have to; and even when the representation of a reference includes an address, it may include other information.

Another property sets references apart from addresses, although pointers in typed languages such as Pascal and Ada (not C) also enjoy it: as will be explained below, a reference in the approach described here is typed. This means that a given reference may only become attached to objects of a specific set of types, determined by a declaration in the software text. This idea again has counterparts in the non-computer world: a social security number is only meant for persons, and area codes are only meant for phone areas. (They may look like normal integers, but you would not add two area codes.)

Object identity

The notion of reference brings about the concept of object identity. Every object created during the execution of an object-oriented system has a unique identity, independent of the object’s value as defined by its fields. In particular:

I1 • Two objects with different identities may have identical fields.

I2 • Conversely, the fields of a certain object may change during the execution of a system; but this does not affect the object’s identity.
These observations indicate that a phrase such as “a denotes the same object as b” may be ambiguous: are we talking about objects with different identities but the same contents (I1)? Or about the states of an object before and after some change is applied to its fields (I2)? We will use the second interpretation: a given object may take on new values for its constituent fields during an execution, while remaining “the same object”. Whenever confusion is possible the discussion will be more explicit. For case I1 we may talk of equal (but distinct) objects; equality will be defined more precisely below.

A point of terminology may have caught your attention. It is not a mistake to say (as in the definition of I2) that the fields of an object may change. The term “field” as defined above denotes one of the values that make up an object, not the corresponding field identifier, which is the name of one of the attributes of the object’s generating class.

For each attribute of the class, for example date in class BOOK3, the object has a field, for example 1759 in the object of the last figure. During execution the attributes will never change, so each object’s division into fields will remain the same; but the fields themselves may change. For example an instance of BOOK3 will always have four fields, corresponding to attributes title, date, page_count, author; these fields (the four values that make up a given object of type BOOK3) may change.

The study of how to make objects persistent will lead us to explore further properties of object identity. (See “Object identity”, page 1052.)
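Properties I1 and I2 can be observed directly in a language that separates identity from field-by-field equality. The following is a minimal sketch in Python, not in the notation of this book; the class Book1, the variable names and the new page count 320 are assumptions chosen for illustration. Python's `is` operator compares identities, while `==` on a dataclass compares fields.

```python
from dataclasses import dataclass, fields


@dataclass
class Book1:
    title: str
    date: int
    page_count: int


a = Book1("Life of Rossini", 1823, 307)
b = Book1("Life of Rossini", 1823, 307)
c = a  # a second reference attached to the same object as a

# I1: two objects with different identities may have identical fields.
print(a == b)  # True: equal field by field
print(a is b)  # False: two distinct objects

# I2: changing a field does not affect the object's identity.
identity_before = id(a)
a.page_count = 320  # hypothetical new value
print(id(a) == identity_before)  # True: still the same object
print(c.page_count)              # 320: c is attached to that same object
print(a == b)                    # False: the fields now differ

# The attributes (field identifiers) are fixed by the class;
# only the field values change.
print([f.name for f in fields(a)])  # ['title', 'date', 'page_count']
```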
Declaring references

Let us see how to extend the initial book class, BOOK1, which only had attributes of basic types, to the new variant BOOK3 which has an attribute representing references to potential authors. Here is the class text, again just showing the attributes; the only difference is an extra attribute declaration at the end:

    class BOOK3 feature
        title: STRING
        date, page_count: INTEGER
        author: WRITER
            -- This is the new attribute.
    end

The type used to declare author is simply the name of the corresponding class: WRITER. This will be a general rule: whenever a class is declared in the standard form

    class C feature … end

then any entity declared of type C through a declaration of the form

    x: C

denotes values that are references to potential objects of type C. The reason for this convention is that using references provides more flexibility, and so is appropriate in the vast majority of cases. You will find further examination of this rule (and of the other possible conventions) in the discussion section of this chapter (see page 272).

Self-reference

Nothing in the preceding discussion precludes an object O1 from containing a reference field which (at some point of a system’s execution) is attached to O1 itself. This kind of self-reference can also be indirect. In the situation pictured below, the object with "Almaviva" in its name field is its own landlord (direct reference cycle); the object "Figaro" loves "Susanna" which loves "Figaro" (indirect reference cycle).

[Figure: Direct and indirect self-reference. Three PERSON1 objects, each with fields name, landlord and loved_one: "Almaviva", whose landlord reference is attached to the object itself, and "Figaro" and "Susanna", whose loved_one references are attached to each other.]
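The two kinds of cycle described in the text can be reproduced with a few assignments. The following is a minimal sketch in Python, not in the notation of this book; the class Person1 mirrors PERSON1 from the figure, and only the attachments stated in the text are set, so every other reference field is left void (None).

```python
from typing import Optional


class Person1:
    def __init__(self, name: str) -> None:
        self.name = name
        # Both reference fields start out void.
        self.landlord: Optional["Person1"] = None
        self.loved_one: Optional["Person1"] = None


almaviva = Person1("Almaviva")
figaro = Person1("Figaro")
susanna = Person1("Susanna")

# Direct cycle: the object is attached to itself.
almaviva.landlord = almaviva

# Indirect cycle: two objects attached to each other.
figaro.loved_one = susanna
susanna.loved_one = figaro

print(almaviva.landlord is almaviva)         # True
print(figaro.loved_one.loved_one is figaro)  # True
print(figaro.landlord is None)               # True: still void
```

Following a chain of references in such a structure may never terminate unless the traversal keeps track of the objects it has already visited.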