APPENDIX Hierarchical Model In the network model,the data are represented by collections of records and relationships between data are represented by links.This structure holds for the hierarchical model as well.The only difference is that,in the hierarchical model, records are organized as collections of trees,rather than as arbitrary graphs. In this chapter we illustrate our concepts using a bank enterprise with the schema shown in Figure 2.15. E.1 Basic Concepts A hierarchical database consists of a collection of records that are connected to each other through links.A record is similar to a record in the network model. Each record is a collection of fields(attributes),each of which contains only one data value.A link is an association between precisely two records.Thus,a link here is similar to a link in the network model. Consider a database that represents a customer-account relationship in a bank- ing system.There are two record types:customer and account.The customer record type can be defined in the same manner as in Appendix A.It consists of three fields:customer name,customer street,and customercity.Similarly,the ac- count record consists of two fields:account number and balance. A sample database appears in Figure E.1.It shows that customer Hayes has account A-102,customer Johnson has accounts A-101 and A-201,and customer Turner has account A-305. Note that the set of all customer and account records is organized in the form of a rooted tree,where the root of the tree is a dummy node.As we shall see,a hierarchical database is a collection of such rooted trees,and hence forms a forest. We shall refer to each such rooted tree as a database tree. The content of a particular record may have to be replicated in several different locations.For example,in our customer-account banking system,an account may belong to several customers.The information pertaining to that account, or the information pertaining to the various customers to which that account may belong,will have to be replicated.This replication may occur either in the 1
APPENDIXE Hierarchical Model In the network model, the data are represented by collections of records and relationships between data are represented by links. This structure holds for the hierarchical model as well. The only difference is that, in the hierarchical model, records are organized as collections of trees, rather than as arbitrary graphs. In this chapter we illustrate our concepts using a bank enterprise with the schema shown in Figure 2.15. E.1 Basic Concepts A hierarchical database consists of a collection of records that are connected to each other through links. A record is similar to a record in the network model. Each record is a collection of fields (attributes), each of which contains only one data value. A link is an association between precisely two records. Thus, a link here is similar to a link in the network model. Consider a database that represents a customer-account relationship in a banking system. There are two record types: customer and account. The customer record type can be defined in the same manner as in Appendix A. It consists of three fields: customer name, customer street, and customer city. Similarly, the account record consists of two fields: account number and balance. A sample database appears in Figure E.1. It shows that customer Hayes has account A-102, customer Johnson has accounts A-101 and A-201, and customer Turner has account A-305. Note that the set of all customer and account records is organized in the form of a rooted tree, where the root of the tree is a dummy node. As we shall see, a hierarchical database is a collection of such rooted trees, and hence forms a forest. We shall refer to each such rooted tree as a database tree. The content of a particular record may have to be replicated in several different locations. For example, in our customer-account banking system, an account may belong to several customers. The information pertaining to that account, or the information pertaining to the various customers to which that account may belong, will have to be replicated. This replication may occur either in the 1
2 Appendix E Hierarchical Model Hayes Main Harrison Johnson Alma Palo Alto Turner Putnam Stamford A-102 400 A-101 500 A-201 900 A-305350 Figure E.1 Sample database. same database tree or in several different trees.Record replication has two major drawbacks: 1.Data inconsistency may result when updating takes place. 2.Waste of space is unavoidable. We shall deal with this issue in Section E.5 by introducing the concept of a virtual record. E.2 Tree-Structure Diagrams A tree-structure diagram is the schema for a hierarchical database.Such a diagram consists of two basic components: 1.Boxes,which correspond to record types 2.Lines,which correspond to links A tree-structure diagram serves the same purpose as an entity-relationship(E-R) diagram;namely,it specifies the overall logical structure of the database.A tree- structure diagram is similar to a data-structure diagram in the network model. The main difference is that,in the latter,record types are organized in the form of an arbitrary graph,whereas in the former,record types are organized in the form of a rooted tree. We have to be more precise about what a rooted tree is.First,there can be no cycles in the underlying graph.Second,there is a record type that is designated as the root of the tree.The relationships formed in the tree-structure diagram must be such that only one-to-many or one-to-one relationships exist between a parent and a child.The general form of a tree-structure diagram appears in Figure E.2. Note that the arrows are pointing from children to parents.A parent may have an arrow pointing to a child,but a child must have an arrow pointing to its parent. The database schema is represented as a collection of tree-structure diagrams. For each such diagram,there exists one single instance of a database tree.The root of this tree is a dummy node.The children of the dummy node are instances of the
2 Appendix E Hierarchical Model Figure E.1 Sample database. same database tree or in several different trees. Record replication has two major drawbacks: 1. Data inconsistency may result when updating takes place. 2. Waste of space is unavoidable. We shall deal with this issue in Section E.5 by introducing the concept of a virtual record. E.2 Tree-Structure Diagrams A tree-structure diagram is the schema for a hierarchical database. Such a diagram consists of two basic components: 1. Boxes, which correspond to record types 2. Lines, which correspond to links A tree-structure diagram serves the same purpose as an entity–relationship (E-R) diagram; namely, it specifies the overall logical structure of the database. A treestructure diagram is similar to a data-structure diagram in the network model. The main difference is that, in the latter, record types are organized in the form of an arbitrary graph, whereas in the former, record types are organized in the form of a rooted tree. We have to be more precise about what a rooted tree is. First, there can be no cycles in the underlying graph. Second, there is a record type that is designated as the root of the tree. The relationships formed in the tree-structure diagram must be such that only one-to-many or one-to-one relationships exist between a parent and a child. The general form of a tree-structure diagram appears in Figure E.2. Note that the arrows are pointing from children to parents. A parent may have an arrow pointing to a child, but a child must have an arrow pointing to its parent. The database schema is represented as a collection of tree-structure diagrams. For each such diagram, there exists one single instance of a database tree. The root of this tree is a dummy node. The children of the dummy node are instances of the
E.2 Tree-Structure Diagrams 3 B Figure E.2 General structure of a tree-structure diagram. root record type in the tree-structure diagram.Each record instance may,in turn, have several children,which are instances of various record types,as specified in the corresponding tree-structure diagram. To understand how tree-structure diagrams are formed,we shall show how to transform E-R diagrams to their corresponding tree-structure diagrams.We first show how to apply such transformations to single relationships.We then explain how to ensure that the resulting diagrams are in the form of rooted trees. E.2.1 Single Relationships Consider the E-R diagram of Figure E.3a;it consists of the two entity sets customer and account related through a binary,one-to-many relationship depositor,with no descriptive attributes.This diagram specifies that a customer can have several accounts,but an account can belong to only one customer.The corresponding tree- structure diagram appears in Figure E.3b.The record type customer corresponds to customer account customer name depositor account number customer street balance customer_city (a)E-R diagram customer name customer street customer city customer account number balance account (b)Tree structure diagram Figure E.3 E-R diagram and its corresponding tree-structure diagram
E.2 Tree-Structure Diagrams 3 Figure E.2 General structure of a tree-structure diagram. root record type in the tree-structure diagram. Each record instance may, in turn, have several children, which are instances of various record types, as specified in the corresponding tree-structure diagram. To understand how tree-structure diagrams are formed, we shall show how to transform E-R diagrams to their corresponding tree-structure diagrams. We first show how to apply such transformations to single relationships. We then explain how to ensure that the resulting diagrams are in the form of rooted trees. E.2.1 Single Relationships Consider the E-R diagram of Figure E.3a; it consists of the two entity sets customer and account related through a binary, one-to-many relationship depositor, with no descriptive attributes. This diagram specifies that a customer can have several accounts, but an account can belong to only one customer. The corresponding treestructure diagram appears in Figure E.3b. The record type customer corresponds to Figure E.3 E-R diagram and its corresponding tree-structure diagram
Appendix E Hierarchical Model customer name customer street customer_city customer account number balance account Figure E.4 Tree-structure diagram with one-to-one relationship. the entity set customer.It includes three fields:customer-name,customer street,and customer city.Similarly,account is the record type corresponding to the entity set account.It includes two fields:account number and balance.Finally,the relationship depositor has been replaced with the link depositor,with an arrow pointing to customer record type. An instance of a database corresponding to the described schema may thus contain a number of customer records linked to a number of account records,as in Figure E.1.Since the relationship is one to many from customer to account, a customer can have more than one account,as does Johnson,who has both accounts A-101 and A-201.An account,however,cannot belong to more than one customer;none do in the sample database. If the relationship depositor is one to one,then the link depositor has two arrows:one pointing to account record type,and one pointing to customer record type(Figure E.4).A sample database corresponding to this schema appears in Figure E.5.Since the relationship is one to one,an account can be owned by precisely one customer,and a customer can have only one account,as is indeed the case in the sample database. If the relationship depositor is many to many(see Figure E.6a),then the trans- formation from an E-R diagram to a tree-structure diagram is more complicated. Only one-to-many and one-to-one relationships can be directly represented in the hierarchical model. There are many different ways to transform this E-R diagram to a tree-structure diagram.All these diagrams,however,share the property that the underlying database tree (or trees)will have replicated records. Hayes Main Harrison Lindsay Park Pittsfield Turner Putnam Stamford A-102 400 A-222 700 A-201 900 A-305 350 Figure E.5 Sample database corresponding to diagram of Figure E.4
4 Appendix E Hierarchical Model Figure E.4 Tree-structure diagram with one-to-one relationship. the entity set customer. It includes three fields: customer name, customer street, and customer city. Similarly, account is the record type corresponding to the entity set account. It includes two fields: account number and balance. Finally, the relationship depositor has been replaced with the link depositor, with an arrow pointing to customer record type. An instance of a database corresponding to the described schema may thus contain a number of customer records linked to a number of account records, as in Figure E.1. Since the relationship is one to many from customer to account, a customer can have more than one account, as does Johnson, who has both accounts A-101 and A-201. An account, however, cannot belong to more than one customer; none do in the sample database. If the relationship depositor is one to one, then the link depositor has two arrows: one pointing to account record type, and one pointing to customer record type (Figure E.4). A sample database corresponding to this schema appears in Figure E.5. Since the relationship is one to one, an account can be owned by precisely one customer, and a customer can have only one account, as is indeed the case in the sample database. If the relationship depositor is many to many (see Figure E.6a), then the transformation from an E-R diagram to a tree-structure diagram is more complicated. Only one-to-many and one-to-one relationships can be directly represented in the hierarchical model. There are many different ways to transform this E-R diagram to a tree-structure diagram. All these diagrams, however, share the property that the underlying database tree (or trees) will have replicated records. Figure E.5 Sample database corresponding to diagram of Figure E.4
E.2 Tree-Structure Diagrams 5 customer account customer_name depositor account_number customer_street balance customer_city (a)E-R diagram customer_name customer street customer city account number balance customer account account number balance customer_name customer_street customer_city account customer treeT tree T2 (b)Tree structure diagram Figure E.6 E-R diagram and its corresponding tree-structure diagrams. The decision regarding which transformation should be used depends on many factors,including The type of queries expected on the database The degree to which the overall database schema being modeled fits the given E-R diagram We shall present a transformation that is as general as possible.That is,all other possible transformations are a special case of this one transformation. To transform the E-R diagram of Figure E.6a into a tree-structure diagram,we take these steps: 1.Create two separate tree-structure diagrams,Ti and T2,each of which has the customer and account record types.In tree Ti,customer is the root;in tree 1,account is the root. 2.Create the following two links: depositor,a many-to-one link from account record type to customer record type,in Ti account customer,a many-to-one link from customer record type to ac- count record type,in T2 The resulting tree-structure diagrams appear in Figure E.6b.The presence of two diagrams(1)permits customers who do not participate in the depositor relation- ship as well as accounts that do not participate in the depositor relationship,and (2)permits efficient access to account information for a given customer as well as customer information for a given account
E.2 Tree-Structure Diagrams 5 Figure E.6 E-R diagram and its corresponding tree-structure diagrams. The decision regarding which transformation should be used depends on many factors, including • The type of queries expected on the database • The degree to which the overall database schema being modeled fits the given E-R diagram We shall present a transformation that is as general as possible. That is, all other possible transformations are a special case of this one transformation. To transform the E-R diagram of Figure E.6a into a tree-structure diagram, we take these steps: 1. Create two separate tree-structure diagrams, T1 and T2, each of which has the customer and account record types. In tree T1, customer is the root; in tree T2, account is the root. 2. Create the following two links: • depositor, a many-to-one link from accountrecord type to customerrecord type, in T1 • account customer, a many-to-one link from customer record type to account record type, in T2 The resulting tree-structure diagrams appear in Figure E.6b. The presence of two diagrams (1) permits customers who do not participate in the depositor relationship as well as accounts that do not participate in the depositor relationship, and (2) permits efficient access to account information for a given customer as well as customer information for a given account
Appendix E Hierarchical Model Hayes Main Harrison Johnson Alma Palo Alto Smith North Rye A-102400 A-101 500 A-201900 A-201 900 A-215 700 (a) A-102400 A-101 500 A-201900 A-215700 Hayes Main Harrison Johnson Alma Palo Alto Johnson Alma Palo Alto Smith North Rye Smith North Rye (b) Figure E.7 Sample database corresponding to diagram of Figure E.6b. A sample database corresponding to the tree-structure diagram of Figure E.6b appears in Figure E.7.There are two database trees.The first tree(Figure E.7a) corresponds to the tree-structure diagram Ti;the second tree(Figure E.7b)corre- sponds to the tree-structure diagram T2.As we can see,all customer and account records are replicated in both database trees.In addition,account record A-201 ap- pears twice in the first tree,whereas customer records Johnson and Smith appear twice in the second tree. If a relationship also includes a descriptive attribute,the transformation from an E-R diagram to a tree-structure diagram is more complicated.A link cannot contain any data value.In this case,a new record type needs to be created,and the appropriate links need to be established.The manner in which links are formed depends on the way the relationship depositor is defined. Consider the E-R diagram of Figure E.3a.Suppose that we add the attribute access date to the relationship depositor,to denote the most recent date on which a customer accessed the account.This newly derived E-R diagram appears in Figure E.8a.To transform this diagram into a tree-structure diagram,we must 1.Create a new record type access date with a single field. 2.Create the following two links:
6 Appendix E Hierarchical Model Figure E.7 Sample database corresponding to diagram of Figure E.6b. A sample database corresponding to the tree-structure diagram of Figure E.6b appears in Figure E.7. There are two database trees. The first tree (Figure E.7a) corresponds to the tree-structure diagram T1; the second tree (Figure E.7b) corresponds to the tree-structure diagram T2. As we can see, all customer and account records are replicated in both database trees. In addition, account record A-201 appears twice in the first tree, whereas customer records Johnson and Smith appear twice in the second tree. If a relationship also includes a descriptive attribute, the transformation from an E-R diagram to a tree-structure diagram is more complicated. A link cannot contain any data value. In this case, a new record type needs to be created, and the appropriate links need to be established. The manner in which links are formed depends on the way the relationship depositor is defined. Consider the E-R diagram of Figure E.3a. Suppose that we add the attribute access date to the relationship depositor, to denote the most recent date on which a customer accessed the account. This newly derived E-R diagram appears in Figure E.8a. To transform this diagram into a tree-structure diagram, we must 1. Create a new record type access date with a single field. 2. Create the following two links:
E.2 Tree-Structure Diagrams 7 access date customer : account customer name customer street depositor account number customer city balance (a)E-R diagram customer name customer street customer_city customer access_date access date account number balance account (b)Tree structure diagram Figure E.8 E-R diagram and its corresponding tree-structure diagram. .customer date,a many-to-one link from access date record type to customer record type date account,a many-to-one link from account record type to access date record type Hayes Main Harrison Johnson Alma Palo Alto Turner Putnam Stamford 10June2009 24May2009 17June2009 10June2009 A-102 400 A-101 500 A-201 900 A-305 350 Figure E.9 Sample database corresponding to diagram of Figure E.8b
E.2 Tree-Structure Diagrams 7 Figure E.8 E-R diagram and its corresponding tree-structure diagram. • customer date, a many-to-one link from access daterecord type to customer record type • date account, a many-to-one link from account record type to access date record type Figure E.9 Sample database corresponding to diagram of Figure E.8b
Appendix E Hierarchical Model customer name customer streef customer_city account number balance customer account access date access date access_date access date account_number balance customer name customer_street customer city account customer Figure E.10 Tree-structure diagram with many-to-many relationships. The resulting tree-structure diagram is illustrated in Figure E.8b. An instance corresponding to the described schema appears in Figure E.9.It shows that: Hayes has account A-102,which was last accessed on 10 June 2009. Johnson has two accounts:A-101,which was last accessed on 24 May 2009, and A-201,which was last accessed on 17 June 2009. Hayes Main Harrison Johnson Alma Palo Alto Smith North Rye 10June2009 24May2009 17June2009 21une2009 3 June 2009 A-102 400 A-101 500A-201900 A-201 900 A-215 700 (a) A-102400 A-101500 A-201900 A-215700 10June2009 24May2009 17June2009 21June2009 3 June 2009 Hayes Main Harrison Johnson Alma Palo Alto Johnson Alma Palo Alto Smith North Rye Smith North Rye (b) Figure E.11 Sample database corresponding to diagram of Figure E.10
8 Appendix E Hierarchical Model Figure E.10 Tree-structure diagram with many-to-many relationships. The resulting tree-structure diagram is illustrated in Figure E.8b. An instance corresponding to the described schema appears in Figure E.9. It shows that: • Hayes has account A-102, which was last accessed on 10 June 2009. • Johnson has two accounts: A-101, which was last accessed on 24 May 2009, and A-201, which was last accessed on 17 June 2009. Figure E.11 Sample database corresponding to diagram of Figure E.10
E.2 Tree-Structure Diagrams 9 Turner has account A-305,which was last accessed on 10 June 2009 Note that two different accounts can be accessed on the same date,as were accounts A-102 and A-305.These accounts belong to two different customers,so the access date record must be replicated to preserve the hierarchy. If the relationship depositor were one to one with the attribute date,then the transformation algorithm would be similar to the one described.The only difference would be that the two links customer date and date account would be one-to-one links. Assume that the relationship depositor is many to many with the attribute access date;here again,we can choose among a number of alternative transfor- mations.We shall use the most general transformation;it is similar to the one applied to the case where the relationship depositor has no descriptive attribute. The record types customer,account,and access date need to be replicated,and two separate tree-structure diagrams must be created,as in Figure E.10.A sample database corresponding to this schema is in Figure E.11. Until now,we have considered only binary relationships.We shift our at- tention here to general relationships.The transformation of E-R diagrams cor- responding to general relationships into tree-structure diagrams is complicated. branch branch_name branch_city assets customer account customer_name CAB account_balance customer_street balance customer_city (a)E-R diagram branch_name branch_city assets branch_name branch_city assets branch branch customer_name customer_streef customer_city account_number balance account account_number balance customer_name customer_street customer_city customer (b)Tree structure diagrams Figure E.12 E-R diagram and its corresponding tree-structure diagrams
E.2 Tree-Structure Diagrams 9 • Turner has account A-305, which was last accessed on 10 June 2009. Note that two different accounts can be accessed on the same date, as were accounts A-102 and A-305. These accounts belong to two different customers, so the access date record must be replicated to preserve the hierarchy. If the relationship depositor were one to one with the attribute date, then the transformation algorithm would be similar to the one described. The only difference would be that the two links customer date and date account would be one-to-one links. Assume that the relationship depositor is many to many with the attribute access date; here again, we can choose among a number of alternative transformations. We shall use the most general transformation; it is similar to the one applied to the case where the relationship depositor has no descriptive attribute. The record types customer, account, and access date need to be replicated, and two separate tree-structure diagrams must be created, as in Figure E.10. A sample database corresponding to this schema is in Figure E.11. Until now, we have considered only binary relationships. We shift our attention here to general relationships. The transformation of E-R diagrams corresponding to general relationships into tree-structure diagrams is complicated. Figure E.12 E-R diagram and its corresponding tree-structure diagrams
10 Appendix E Hierarchical Model Rather than present a general transformation algorithm,we present a single ex- ample to illustrate the overall strategy that you can apply to deal with such a transformation. Consider the E-R diagram of Figure E.12a,which consists of the three entity sets customer,account,and branch,related through the general relationship set CAB with no descriptive attribute. There are many different ways to transform this E-R diagram into a tree- structure diagram.Again,all share the property that the underlying database tree (or trees)will have replicated records.The most straightforward transformation is to create two tree-structure diagrams,as shown in Figure E.12b. An instance of the database corresponding to this schema is illustrated in Fig- ure E.13.It shows that Hayes has account A-102 in the Perryridge branch;Johnson has accounts A-101 and A-201 in the Downtown and Perryridge branches,respec- tively;and Smith has accounts A-201 and A-215 in the Perryridge and Mianus branches,respectively. We can extend the preceding transformation algorithm in a straightforward manner to deal with relationships that span more than three entity sets.We simply replicate the various record types,and generate as many tree-structure Downtown Brooklyn 9000000 Perryridge Horseneck 17000000 Mianus Horseneck 400000 Haves Main Harrison Smith North Rye Johnson Alma Palo Alto A-102 400 Johnson Alma Palo Alto Smith North Rye A-101500 A-201900 A-201900 A-215700 (a) Downtown Brooklyn 9000000 Perryridge Horseneck 17000000 Mianus Horseneck 400000 A-102400 A-101 500 A-201900 A215700 Hayes Main Harrison Johnson Alma Palo Alto Johnson Alma Palo Alto Smith North Rye (b) Smith North Rye Figure E.13 Sample database corresponding to diagram of Figure E.12b
10 Appendix E Hierarchical Model Rather than present a general transformation algorithm, we present a single example to illustrate the overall strategy that you can apply to deal with such a transformation. Consider the E-R diagram of Figure E.12a, which consists of the three entity sets customer, account, and branch, related through the general relationship set CAB with no descriptive attribute. There are many different ways to transform this E-R diagram into a treestructure diagram. Again, all share the property that the underlying database tree (or trees) will have replicated records. The most straightforward transformation is to create two tree-structure diagrams, as shown in Figure E.12b. An instance of the database corresponding to this schema is illustrated in Figure E.13. It shows that Hayes has account A-102 in the Perryridge branch; Johnson has accounts A-101 and A-201 in the Downtown and Perryridge branches, respectively; and Smith has accounts A-201 and A-215 in the Perryridge and Mianus branches, respectively. We can extend the preceding transformation algorithm in a straightforward manner to deal with relationships that span more than three entity sets. We simply replicate the various record types, and generate as many tree-structure Figure E.13 Sample database corresponding to diagram of Figure E.12b