22 How to find the classes

Foremost among the goals of object-oriented methodology, since the structure of O-O software is based on decomposition into classes, is that it should give us some advice on how to find these classes. Such is the purpose of the following pages. (In some of the literature you will see the problem referred to as “finding the objects”, but by now we know better: what is at stake in our software architectures is not individual objects, but object types, that is, classes.)

At first we should not expect too much. Finding classes is the central decision in building an object-oriented software system; as in any creative discipline, getting such decisions right takes talent and experience, not to mention luck. Expecting to obtain infallible recipes for finding the classes is as unrealistic as would be, for an aspiring mathematician, expecting to obtain recipes for inventing interesting theories and proving their theorems. Although both activities, software construction and theory construction, can benefit from general advice and the example of successful predecessors, both also require creativity of the kind that cannot fully be covered by mechanical rules. If (like many people in the industry) you still find it hard to compare the software developer to a mathematician, just think of other forms of engineering design: although it is possible to provide basic guidelines, no teachable step-by-step rules can guarantee good design of buildings or airplanes.

In software too, no book advice can replace your know-how and ingenuity. The principal role of a methodological discussion is to indicate some good ideas, draw your attention to some illuminating precedents, and alert you to some known pitfalls.

This would be true with any other software design method. In the case of object technology, the observation is tempered by some good news, coming to us in the form of reuse. Because much of the necessary invention may already have been done, you can build on others’ accomplishments.

There is more good news. By starting with humble expectations but studying carefully what works and also what does not, we will be able, little by little and against all odds, to devise what in the end deserves to be called a method for finding the classes. One of the key steps will be the realization that, as always in design, a selection technique is defined by two components: what to consider, and what to reject.
22.1 STUDYING A REQUIREMENTS DOCUMENT

To understand the problem of finding classes, it may be best to begin by assessing a widely publicized approach.

The nouns and the verbs

A number of publications (see the bibliographical notes) suggest using a simple rule for obtaining the classes: start from the requirements document (assuming there is one, of course, but that is another story); in function-oriented design you would concentrate on the verbs, which correspond to actions (“do this”); in object-oriented design you underline the nouns, which describe objects. So according to this view a sentence of the form

    The elevator will close its door before it moves to another floor.

would lead the function-oriented designer to detect the need for a “move” function; but as an object-oriented designer you should see in it three object types, ELEVATOR, DOOR and FLOOR, which will give classes. Voilà!

Would that life were that simple. You would bring your requirements documents home at night, and play Object Pursuit around the dinner table. A good way to keep the children away from the TV set, and make them revise their grammar lessons while they help Mom and Dad in their software engineering work.

But such a simple-minded technique cannot take us very far. Human language, used to express system requirements, is so open to nuance, personal variation and ambiguity that it is dangerous to make any important decision on the basis of a document which may be influenced as much by the author’s individual style as by the actual properties of the projected software system.

Any useful result that the “underline the nouns” method would give us is obvious anyway. Any decent O-O design for an elevator control system will include an ELEVATOR class. Obtaining such classes is not the difficult part. To repeat an expression used in an earlier discussion, they are here for the picking. For the non-obvious classes a syntactic criterion (such as nouns versus verbs in a document that is by essence open to many possible stylistic variants) is close to useless.

Although by itself the “underline the nouns” idea would not deserve much more consideration, we can use it further, not for its own sake but as a foil; by understanding its limitations we can gain insights into what it truly takes to find the classes and how the requirements document can help us in this endeavor.

Avoiding useless classes

The nouns of a requirements document will cover some classes of the final design, but will also include many “false alarms”: concepts that should not yield classes.

In the elevator example door was a noun. Do we need a class DOOR? Maybe, maybe not. It is possible that the only relevant property of elevator doors for this system is that
they may be opened and closed. Then to express the useful properties of doors it suffices to include in class ELEVATOR the query and commands

    door_open: BOOLEAN;
    close_door is … ensure not door_open end;
    open_door is … ensure door_open end

In another variant of the system, however, the notion of door may be important enough to justify a separate class. The only resource here is the theory of abstract data types, and the only relevant question is:

    Is “door” a separate data type with its own clearly identified operations, or are all the operations on doors already covered by operations on other data types such as ELEVATOR?

Only your intuition and experience as a designer will tell you the answer. In looking for it, you will be aided by the requirements document, but do not expect grammatical criteria to be of more than superficial help. Turn instead to the ADT theory, which will help you ask customers or future users the right questions.

We encountered a similar case in the undo-redo mechanism design (chapter 21). The discussion distinguished between commands, such as the line insertion command in a text editor, and the more general notion of operation, which includes commands but also special requests such as Undo. Both of these words figured prominently in the statement of the problem; yet only COMMAND yielded a data abstraction (one of the principal classes of the design), whereas no class in the solution directly reflects the notion of operation. No analysis of a requirements document can suggest this striking difference of treatment.

Is a new class necessary?

Another example of a noun which may or may not give a class in the elevator example is floor. Here (as opposed to the door and operation cases) the question is not whether the concept is a relevant ADT: floors are definitely an important data abstraction for an elevator system. But this does not necessarily mean we should have a FLOOR class.

The reason is simply that the properties of floors may be entirely covered, for the purposes of the elevator system, by those of integers. Each floor has a floor number; then
if a floor (as seen by the elevator system) has no other features than those associated with its floor number, you may not need a separate FLOOR class. A typical floor feature that comes from a feature of integers is the distance between two floors, which is simply the difference of their floor numbers. (Many hotels have no floor 13, so the arithmetic may be a bit more elaborate.)

If, however, floors have properties other than those of their numbers (that is to say, according to the principles of abstract data types and object-oriented software construction, significant operations not covered by those of integers), then a FLOOR class will be appropriate. For example, some floors may have special access rights defining who can visit them; then the FLOOR class could include a feature such as

    rights: SET [AUTHORIZATION]

and the associated procedures. But even that is not certain: we might get away with including in some other class an array

    floor_rights: ARRAY [SET [AUTHORIZATION]]

which simply associates a set of AUTHORIZATION values with each floor, identified by its number.

Another argument for having a specific class FLOOR would be to limit the available operations: it makes sense to subtract two floors and to compare them (through the infix "<" function), but not to add or multiply them. Such a class may be written as an heir to INTEGER (see exercise E22.1, page 745). The designer must ask himself, however, whether this goal really justifies adding a new class.

This discussion brings us once again to the theory of abstract data types. A class does not just cover physical “objects” in the naïve sense. It describes an abstract data type: a set of software objects characterized by well-defined operations and formal properties of these operations. A type of real-world objects may or may not have a counterpart in the software in the form of a type of software objects, that is, a class. When you are assessing whether a certain notion should yield a class or not, only the ADT view can provide the right criterion: do the objects of the system under discussion exhibit enough specific operations and properties of their own, relevant to the system and not covered by existing classes?

The qualification “relevant to the system” is crucial (see “BEYOND SOFTWARE”, 6.6, page 147). The aim of systems analysis is not to “model the world”. This may be a task for philosophers, but the builders of software systems could not care less, at least for their professional activity. The task of analysis is to model that part of the world which is meaningful for the software under study or construction. This principle is reinforced by the ADT approach (that is to say, the object-oriented method), which holds that objects are only defined by what we can do with them (what the discussion of abstract data types called the Principle of Selfishness). If an operation or property of an object is irrelevant to the purposes of the system, then it should not be included in the result of your analysis, however interesting it may be for other purposes. For a census processing system, the notion of PERSON may have features mother and father; but for a payroll processing system which does not require information about the parents, every PERSON is an orphan.
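The two floor designs weighed above can be put side by side in a short sketch. It is written in Python rather than in the notation of this chapter, and every name in it (Floor, number, floor_rights, the string-valued authorizations) is an assumption made for illustration. Python's int type cannot shed its addition and multiplication through inheritance, so the sketch wraps an integer instead of making the class an heir of the integer type; that is a swapped-in technique, not the one suggested in the text.

```python
from dataclasses import dataclass
from functools import total_ordering
from typing import Dict, Set


@total_ordering
@dataclass(frozen=True)
class Floor:
    """Hypothetical floor type exposing only comparison and subtraction."""

    number: int

    def __lt__(self, other: "Floor") -> bool:
        # Comparing two floors is meaningful; total_ordering derives
        # <=, > and >= from this method and the generated __eq__.
        if not isinstance(other, Floor):
            return NotImplemented
        return self.number < other.number

    def __sub__(self, other: "Floor") -> int:
        # Distance between two floors: the difference of their numbers.
        if not isinstance(other, Floor):
            return NotImplemented
        return self.number - other.number

    # No __add__ or __mul__ is defined, so adding or multiplying
    # two floors raises TypeError.


# Alternative with no Floor class at all: floors are plain integers and
# the access rights live in a table indexed by floor number.
# Authorizations are modeled as strings purely for illustration.
floor_rights: Dict[int, Set[str]] = {
    1: {"public"},
    7: {"staff", "maintenance"},
}
```

Whether the restricted operation set is worth a class of its own remains the judgment call described above; the table-based alternative keeps floors as integers and gives up that restriction.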
If all of the operations and properties that you can identify for a type of objects are irrelevant in this sense, or are already covered by the operations and properties of a previously identified class, the conclusion is that the object type itself is irrelevant: it must not yield a class.

This explains why an elevator system might not include FLOOR as a class: as noted above, from the point of view of the elevator system floors have no relevant properties other than those of the associated integer numbers. A Computer Aided Design system designed for architects, by contrast, will have a FLOOR class, since in that case the floor has several specific attributes and routines.

Missing important classes

Not only can nouns suggest notions which do not yield classes: they can also fail to suggest some notions which should definitely yield classes. There are at least three sources of such accidents.

Do not forget that, as noted, the aim of this discussion is no longer to convince ourselves of the deficiencies of the “underline the nouns” approach, whose limitations are by now so obvious that the exercise would not be very productive. Instead, we are analyzing these limitations as a way to gain more insight into the process of discovering classes.

The first cause of missed classes is simply the flexibility and ambiguity of human language, the very qualities that make it suitable for an amazingly wide range of applications, from speeches and novels to love letters, but not very reliable as a medium for accurate technical documents. Assume the requirements document for our elevator example contains the sentence

    A database record must be created every time the elevator moves from one floor to another.

The presence of the noun “record” suggests a class DATABASE_RECORD; but we may totally miss a more important data abstraction: the notion of a move between two floors. With the above sentence in the requirements document, you will almost certainly need a MOVE class, which could be of the form

    class MOVE feature
        initial, final: FLOOR; -- Or INTEGER if no FLOOR class
        record (d: DATABASE) is …
        … Other features …
    end -- class MOVE

This will be an important class, which a grammar-based method would miss because of the phrasing of the above sentence. Of course if the sentence had appeared as

    A database record must be created for every move of the elevator from one floor to another
then “move” would have been counted as a noun, and so would have yielded a class! We see once again the dangers of putting too much trust in a natural-language document, and the absurdity of making any serious property of a system design, especially its modular structure, dependent on such vagaries of style and mood.

The second reason for overlooking classes is that some crucial abstractions may not be directly deducible from the requirements. Cases abound in the examples of this book (panel-driven system: chapter 20; undo-redo: chapter 21). It is quite possible that the requirements for a panel-driven system did not explicitly cite the notions of state and application; yet these are the key abstractions, which condition the entire design. It was pointed out earlier that some external-world object types may have no counterpart among the classes of the software; here we see the converse: classes of the software that do not correspond to any external-world objects. Similarly, if the author of the requirements for a text editor with undo-redo has written “the system must support line insertion and deletion”, we are in luck since we can spot the nouns insertion and deletion; but the need for these facilities may just as well follow from a sentence of the form

    The editor must allow its users to insert or delete a line at the current cursor position.

leading the naïve designer to devote his attention to the trivial notions of “cursor” and “position” while missing the command abstractions (line insertion and line deletion).

The third major cause of missed classes, shared by any method which uses the requirements document as the basis for analysis, is that such a strategy overlooks reuse. It is surprising to note that much of the object-oriented analysis literature takes for granted the traditional view of software development: starting from a requirements document and devising a solution to the specific problem that it describes. One of the major lessons of object technology is the lack of a clear-cut distinction between problem and solution. Existing software can and should influence new developments.

When faced with a new software project, the object-oriented software developer does not accept the requirements document as the alpha and omega of wisdom about the problem, but combines it with knowledge about previous developments and available software libraries (see “THE CHANGING NATURE OF ANALYSIS”, 27.2, page 906). If necessary, he will criticize the requirements document and propose updates and adaptations which will facilitate the construction of the system; sometimes a minor change, or the removal of a facility which is of limited interest to the final users, will produce a dramatic simplification by making it possible to reuse an entire body of existing software and, as a result, to decrease the development time by months. The corresponding abstractions are most likely to be found in the existing software, not in the requirements document for the new project.

Classes COMMAND and HISTORY_LOG from the undo-redo example are typical. The way to find the right abstractions for this problem is not to rack one’s brain over the requirements document for a text editor: either you come upon them through a process of intellectual discovery (a “Eureka”, for which no sure recipe exists); or, if someone else has already found the solution, you reuse his abstractions. You may of course be able to reuse the corresponding implementation too if it is available as part of a library; this is even better, as the whole analysis-design-implementation work has already been done for you.
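To make the command abstractions mentioned above more concrete, here is a minimal sketch in Python rather than in the notation of this chapter. It is not the design of chapter 21: the class names follow COMMAND and HISTORY_LOG, but the method names (execute, undo, run, undo_last) and the representation of the text as a list of strings are assumptions chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class Command(ABC):
    """Hypothetical command abstraction: an action that can be undone."""

    @abstractmethod
    def execute(self, lines: List[str]) -> None:
        """Apply the command to the text."""

    @abstractmethod
    def undo(self, lines: List[str]) -> None:
        """Cancel the effect of the last execute."""


class LineInsertion(Command):
    def __init__(self, position: int, text: str) -> None:
        self.position = position
        self.text = text

    def execute(self, lines: List[str]) -> None:
        lines.insert(self.position, self.text)

    def undo(self, lines: List[str]) -> None:
        del lines[self.position]


class LineDeletion(Command):
    def __init__(self, position: int) -> None:
        self.position = position
        self.deleted: Optional[str] = None  # remembered for undo

    def execute(self, lines: List[str]) -> None:
        self.deleted = lines.pop(self.position)

    def undo(self, lines: List[str]) -> None:
        if self.deleted is not None:
            lines.insert(self.position, self.deleted)


class HistoryLog:
    """Hypothetical log of executed commands, most recent last."""

    def __init__(self) -> None:
        self._done: List[Command] = []

    def run(self, command: Command, lines: List[str]) -> None:
        command.execute(lines)
        self._done.append(command)

    def undo_last(self, lines: List[str]) -> None:
        # Undoing with an empty history is a no-op.
        if self._done:
            self._done.pop().undo(lines)
```

Note that neither “cursor” nor “position” appears as a class here: the position is a mere attribute of each command, which is consistent with the point that the nouns of the requirement sentence do not single out the abstractions that matter.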
Discovery and rejection

    It takes two to invent anything. One makes up combinations; the other chooses, recognizes what is important to him in the mass of things which the first has imparted to him. What we call genius is much less the work of the first than the readiness of the second to choose from what has been laid before him.
    Paul Valéry (cited in [Hadamard 1945])

Along with its straightforward lessons, this discussion has taught us a few more subtle consequences.

The simple lessons have been encountered several times: do not put too much trust in a requirements document; do not put any trust in grammatical criteria.

A less obvious lesson has emerged from the review of “false alarms”: just as we need criteria for finding classes, we need criteria for rejecting candidate classes, that is to say concepts which initially appear promising but end up not justifying a class of their own. The design discussions of this book illustrate many such cases. To quote just one example: a discussion, yet to come, of how best to provide for pseudo-random number generation (see “Pseudo-random number generators: a design exercise”, page 754), starts naturally enough by considering the notion of random number, only to dismiss it as not the appropriate data abstraction.

The O-O analysis and design books that I have read include little discussion of this task. This is surprising because in the practice of advising O-O projects, especially with relatively novice teams, I have found that eliminating bad ideas is just as important as finding good ones.

It may even be more important. Sit down with a group of users, developers and managers trying to get started with object technology with a fresh new project and enthusiasm fresher yet. There will be no dearth of ideas for classes (usually proposed as “objects”). The problem is to dam the torrent before it damns the project. Although some class ideas will probably have been missed, many more will have to be examined and rejected. As in a large-scale police investigation, many leads come in, prompted or spontaneous; you must sort the useful ones from the canards.

So we must adapt and extend the question that serves as the topic for this chapter. “How to find the classes” means two things: not just how to come up with candidate abstractions but also how to unmask the inadequate among them. These two tasks are not executed one after the other; instead, they are constantly interleaved. Like a gardener, the object-oriented designer must all the time nurture the good plants and weed out the bad:

    Class Elicitation principle
    Class elicitation is a dual process: class suggestion, class rejection.

The rest of this chapter studies both components of the class elicitation process.
22.2 DANGER SIGNALS

To guide our search it is preferable to start with the rejection part. It will provide us with a checklist of typical pitfalls, alert us to the most important criteria, and help us keep our search for good classes focused on the most productive efforts.

Let us review a few signs that usually indicate a bad choice of class. Because design is not a completely formalized discipline, you should not treat these signs as proof of a bad design; in each case one can think of some circumstances that may make the original decision legitimate. So what we will see is not, in the terms of a previous chapter, “absolute negatives” (sure-fire rules for rejecting a design) but “advisory negatives”: danger signals that alert you to the presence of a suspicious pattern, and should prompt you to investigate further. Although in most cases they should lead you to revise the design, you may occasionally decide in the end that it is right as it stands.

The grand mistake

Many of the danger signals discussed below point to the most common and most damaging mistake, which is also the most obvious: designing a class that isn’t.

The principle of object-oriented software construction is to build modules around object types, not functions. This is the key to the reusability and extendibility benefits of the approach. But beginners will often fall into the most obvious pitfall: calling “class” something which is in fact a routine. Writing a module as class… feature … end does not make it a true class; it may just be a routine in disguise.

This Grand Mistake is easy to avoid once you are conscious of the risk. The remedy is the usual one: make sure that each class corresponds to a meaningful data abstraction.
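As an illustration of the Grand Mistake, the following Python sketch (Python being used here only for convenience, not as the notation of this book) sets a “class that isn’t” next to a class built around a data abstraction. Both class names, `ResultPrinter` and `Results`, and all their features are hypothetical, invented for the contrast.

```python
from typing import List


class ResultPrinter:
    """A routine in disguise: one operation, and no object type behind it."""

    def run(self, values: List[float]) -> str:
        return ", ".join(f"{v:.1f}" for v in values)


class Results:
    """A data abstraction: a collection of measurements offering several services."""

    def __init__(self) -> None:
        self._values: List[float] = []

    def record(self, value: float) -> None:
        """Add a new measurement."""
        self._values.append(value)

    @property
    def count(self) -> int:
        """Number of measurements recorded so far."""
        return len(self._values)

    def mean(self) -> float:
        """Average of the recorded measurements."""
        if not self._values:
            raise ValueError("no results recorded")
        return sum(self._values) / len(self._values)

    def formatted(self) -> str:
        """Printable form: just one feature among others."""
        return ", ".join(f"{v:.1f}" for v in self._values)
```

The formatting code is the same in both versions. What changes is where it lives: in the second version it is one service offered on objects of a well-defined type, which is the position the text argues for.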
What follows is a set of typical traits alerting you to the risk that a module which presents itself as a candidate class, and has the syntactical trappings of a class, may be an illegal immigrant not deserving to be granted citizenship in the O-O society of modules. (On the distinction between absolute and advisory rules, see “A typology of rules”, page 666.)

My class performs…

In a design meeting, an architecture review, or simply an informal discussion with a developer, you ask about the role of a certain class. The answer: “This class prints the results” or “this class parses the input”, or some other variant of “This class does…”.

The answer usually points to a design flaw. A class is not supposed to do one thing but to offer a number of services (features) on objects of a certain type. If it really does just one thing, it is probably a case of the Grand Mistake: devising a class for what should just be a routine of some other class.

Perhaps the mistake is not in the class itself but in the way it is being described, using phraseology that is too operational. But you had better check.

In recent years the “my class does…” style has become widespread. A NeXT document (NeXT documentation for OpenStep, pre-release 4.0) describes classes as follows: “The NSTextView class declares the programmatic interface to objects that display text laid out…”; “An NSLayoutManager coordinates the layout
and display of characters…”; “NSTextStorage is a semi-concrete subclass of NSMutableAttributedString that manages a set of client NSLayoutManagers, notifying them of any changes…”. Even if (as is most likely the case here) the classes discussed represent valuable data abstractions, it would be preferable to describe them less operationally by emphasizing these abstractions.

Imperative names

Assume that in a tentative design you find a class name such as PARSE or PRINT, a verb in the imperative or infinitive. It should catch your attention, as signaling again a probable case of a class that “does one thing”, and should not be a class.

Occasionally you may find that the class is right. Then its name is wrong. This is an “absolute positive” rule:

    Class Name rule
    A class name must always be either:
    • A noun, possibly qualified.
    • (Only for a deferred class describing a structural property) an adjective.

Although like any other one pertaining to style this rule is partly a matter of convention, it helps enforce the principle that every class represents a data abstraction.

The first form, nouns, covers the vast majority of cases. A noun may be used by itself, as in TREE, or with some qualifying words, as in LINKED_LIST, qualified by an adjective, and LINE_DELETION, qualified by another noun.

The second case, adjectives, arises only for a specific case: structural property classes describing an abstract structural property, as with the Kernel Library class COMPARABLE describing objects on which a certain order relation is available. Such classes should be deferred; their names (in English or French) will often end with ABLE. They are meant to be used through inheritance to indicate that all instances of a class have a certain property; for example in a system for keeping track of tennis rankings class PLAYER might inherit from COMPARABLE. In the taxonomy of inheritance kinds, this scheme will be classified as structure inheritance (see “Structure inheritance”, page 831).

The only case that may seem to suggest an exception to the rule is command classes, as introduced in the undo-redo design pattern (see chapter 21) to cover action abstractions. But even then you should stick to the rule: call a text editor’s command classes LINE_DELETION and WORD_CHANGE, not DELETE_LINE and REPLACE_WORD.

English leaves you more flexibility in the application of this rule than many other languages, since its grammatical categories are more an article of faith than an observation of fact, and almost every verb can be nouned. If you use English as the basis for the names in your software it is fair to take advantage of this flexibility to devise shorter and simpler names: you may call a class IMPORT where other languages might treat the equivalent as a verb only, forcing you to use nouns such as IMPORTATION. But do not cheat: class
IMPORT should cover the abstraction “objects being imported” (nominal), not, except for a command class, the act of importing (verbal).

It is interesting to contrast the Class Name rule with the discussion of the “underline the nouns” advice at the beginning of this chapter. “Underline the nouns” applied a formal grammatical criterion to an informal natural-language text, the requirements document; this is bound to be of dubious value. The Class Name rule, on the other hand, applies the same criterion to a formal text: the software.

Single-routine classes

A typical symptom of the Grand Mistake is an effective class that contains only one exported routine, possibly calling a few non-exported ones. The class is probably just a glorified subroutine, a unit of functional rather than object-oriented decomposition.

A possible exception arises for objects that legitimately represent abstracted actions, for example a command in an interactive system, or what in a non-O-O approach would have been represented by a routine passed as argument to another routine. But the examples given in an earlier discussion (see “Small classes”, page 714) show clearly enough that even in such cases there will usually be several applicable features. We noted that a mathematical software object representing a function to be integrated will not just have the feature item (a: REAL): REAL, giving the value of the function at point a: others may include domain of definition, minimum and maximum over a certain interval, derivative. Even if a class does not yet have all these features, checking that it would make sense to add them later will reinforce your conviction that you are dealing with a genuine object abstraction.

In applying the single-routine rule, you should consider all the features of a class: those introduced in the class itself, and those which it inherits from its parents. It is not necessarily wrong for a class text to declare only one exported routine, if this is simply an addition to a meaningful abstraction defined by its ancestors. It may, however, point to a case of taxomania, an inheritance-related disease which will be studied as part of the methodology of inheritance (see “TAXOMANIA”, 24.4, page 820).

Premature classification

The mention of taxomania suggests a warning about another common mistake of novices: starting to worry about the inheritance hierarchy too early in the process.

As inheritance is central in the object-oriented method, so is a good inheritance structure (more accurately, a good modular structure, including both inheritance and client relations) essential to the quality of a design. But inheritance is only relevant as a relation among well-understood abstractions. When you are still looking for the abstractions, it is too early to devise the inheritance hierarchy.

The only clear exception arises when you are dealing with an application domain for which a pre-existing taxonomy is widely accepted, as in some branches of science. Then the corresponding abstractions will emerge together with their inheritance structure. (Before accepting the taxonomy as the basis for your software’s structure, do check that it is indeed well recognized and stable, not just someone’s view of things.)
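Returning to the single-routine discussion, the function-to-be-integrated example can be sketched as follows, in Python rather than in the notation of this book. Only `item` comes from the text; the class names `RealFunction` and `Square`, the features `derivative_at`, `maximum_on` and `minimum_on`, and their sampling-based implementations are assumptions made for the illustration.

```python
from abc import ABC, abstractmethod


class RealFunction(ABC):
    """A function from reals to reals, viewed as an object abstraction."""

    @abstractmethod
    def item(self, a: float) -> float:
        """Value of the function at point a."""

    def derivative_at(self, a: float, h: float = 1e-6) -> float:
        """Numerical estimate of the derivative at a (central difference)."""
        return (self.item(a + h) - self.item(a - h)) / (2 * h)

    def maximum_on(self, low: float, high: float, samples: int = 1000) -> float:
        """Approximate maximum over [low, high], obtained by sampling."""
        step = (high - low) / samples
        return max(self.item(low + i * step) for i in range(samples + 1))

    def minimum_on(self, low: float, high: float, samples: int = 1000) -> float:
        """Approximate minimum over [low, high], obtained by sampling."""
        step = (high - low) / samples
        return min(self.item(low + i * step) for i in range(samples + 1))


class Square(RealFunction):
    """Declares a single routine, but inherits the rest of the abstraction."""

    def item(self, a: float) -> float:
        return a * a
```

The text of `Square` declares only one routine, yet its objects offer four features once the inherited ones are counted. This is the situation in which a one-routine class text is a legitimate addition to an abstraction defined by an ancestor rather than a routine in disguise.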