Contents Preface Acknowledgements xiv Preliminaries and Notation PART I:Design and Analysis of Data Types 1 1 The Complexity of Algorithms 2 1.1 Comparing the Efficiency of Algorithms 2 1.2 Time and Space Complexity of Algorithms 6 1.3 Asymptotic Time Complexity-Big O and Big S Notations 9 1.4 Importance of Growth Rate of Time Complexities 12 1.5 Rules for Deriving Time Complexities of Pascal Programs 14 Exercises 21 Bibliographic Notes and Further Reading 23 2 Abstract Data Types and Program Design 25 2.1 Elements of Program Design 25 2.2 Abstraction Mechanisms in Program Design Process 27 2.3 Formal Specification of Abstract Data Types 39 Exercises 43 Bibliographic Notes and Further Reading 44 3 Elementary (Linear)ADTs 48 3.1 The ADT Stack 48 3.2 The ADT Queue 58 3.3 The ADT List 64 3.4 Case Study:Evaluation of Arithmetic Expressions 9 Exercises 86 Bibliographic Notes and Further Reading 89 4 Non-linear ADTs-Trees 91 4.1 Trees 4.2 Binary Trees 4.3 Threaded Binary Trees 9910 4.4 Binary Search Trees 5
vili Contents 4.5 Balanced Trees:AVL Trees 112 4.6 Balanced Trees:B-trees 122 Exercises 131 Bibliographic Notes and Further Reading 134 5 Abstract Data Type Sets-I 136 5.1 ADT Set:Definitions 137 5.2 Implementing Sets using Bit-vectors 139 5.3 Implementing Sets using Lists 141 5.4 Implementing Sets using Trees 145 Exercises 149 Bibliographic Notes and Further Reading 150 6 Abstract Data Type Sets-II 152 6.1 ADT Mapping 152 6.2 Hashing (Key Transformation) 155 6.3 Priority Queues 170 6.4 The ADT Relation 178 Exercises 189 Bibliographic Notes and Further Reading 197 7 Non-Linear ADTs-Graphs 200 7.1 Definitions and Terminology 200 7.2 ADT Graph and its Implementation 203 7.3 Spanning Trees 206 7.4 Path-finding Algorithms 217 Exercises 224 Bibliographic Notes and Further Reading 228 PART II:Algorithm Design with Abstract Data Types 231 8 Techniques for Developing Efficient Algorithms 232 8.1 Divide and Conquer 232 8.2 Dynamic Programming 237 8.3 Other Techniques 245 Exercises 245 Bibliographic Notes and Further Reading 249 9 Sorting:an Algorithm on the ADT List 251 9.1 An Abstract Sorting Algorithm 251 9.2 Easy Split/Hard Join Sorting Algorithms 254 9.3 Hard Split/Easy Join Sorting Algorithms 259 9.4 Sorting by Utilising the Representation of Elements 270 9.5 Taxonomy,Comparisons and Related Issues 273
Contents iⅸ Exercises 277 Bibliographic Notes and Further Reading 279 10 Graph Traversals and Algorithms 281 10.1 Depth First Search 281 10.2 Breadth First Search 283 10.3 Detection of Cycles 285 10.4 Topological Ordering 287 10.5 Connectedness of Graphs 290 Exercises 292 Bibliographic Notes and Further Reading 293 11 String-searching Algorithms 295 11.1 Definitions 295 11.2 Data Structures for Strings 296 11.3 String-searching Algorithms 297 11.4 String-searching in Large Static Strings 307 Exercises 310 Bibliographic Notes and Further Reading 311 12 Hard'Problems and NP-completeness 313 12.1 NP-problems and NP-complete Problems 315 12.2 Methods for Solving 'Hard'Problems 318 Exercises 329 Bibliographic Notes and Further Reading 332 Solutions to Selected Exercises 335 Index 373
Preface This book is intended as a second course on programming with data structures. The concepts of 'data structures'and their impact on design and efficiency of computer algorithms are well established and well studied.For instance,see the authoritative text by D.E.Knuth,The Art of Computer Programming (Knuth, 1973),and The Design and Analysis of Computer Algorithms by A.V.Aho, J.E.Hopcroft and J.O.Ullman (Aho et al.,1974).Because of their long stand- ing history,'data structures'and their role in program design are well understood, and in the last decade or so,numerous textbooks have appeared in this field.For instance,see Baron and Shapiro (1980),Aho et al.(1974),Gotlieb and Gotlieb (1978),Stubbs and Webre (1985),Reingold and Hansen (1983),Korsh (1986), Wirth (1976)and Martin (1986). However,following general design methods based on high-level abstraction of systems (computer systems,engineering systems,etc.)and their modular decom- positions,'data structures'have also been studied in this modular framework. The approach taken is based on the notion of an Abstract Data Type which is defined as an abstract mathematical model with a defined set of operations.The treatment of abstract data types is very informal.The specification of data types and their corresponding operations are presented in a form directly representable in a Pascal-like language. The textbook Data Structures and Algorithms by A.V.Aho,J.E.Hopcroft and J.D.Ullman (Aho et al.,1983)is a pioneering work incorporating the notion of abstraction in the design of data structures.Other texts following a similar approach have appeared:Stubbs and Webre (1985),Wirth (1986)and Martin (1986).This book follows the same general principles as the above,in particular, Aho et.al.(1983)in describing data abstractions in languages such as Pascal. However,the material and programs are more structured and relationships between ADTs and implementations are made more clear. The primary advantage gained by using abstract data types in program design is that they lead to better structured and modular programs.This type of modu- larity ensures that logical tasks and data are grouped together to form an inde- pendent module whose functional specification is visible to other programs using it.These application programs do not require detailed information on how the tasks and data are realised in computer hardware.Thus various strategies may be used to implement a module's data types and operations so far as they conform to the specification of that module.This separation of specification and imple- mentation of programs ensure that many design decisions,such as efficiency, trade-offs between speed and storage etc.,can be made at a later stage when the
Preface xi logical properties of data types and their operations have proved to satisfy the application. In Part I,abstract data types are classified with respect to the single relation 'predecessor/successor'which may or may not exist among the individual elements of a data type.Those data types for which this relation does not hold are primarily sets.Those with this relation defined on them are further classified into linear and non-linear data types.Non-linear data types are finally classified into trees and graphs.Only when the specification of an abstract data type is completely given do we begin to consider implementation issues.In this phase, the major emphasis rests on 'efficient'implementations of a data type and its operations.Therefore,a chapter is devoted to discussing the notion of 'effic- iency'of an algorithm so that later in the book the efficiency of the implemen- tations of abstract data types and algorithms may be quantified.Furthermore, this enables us to compare different implementations of the same abstract data type or the same task in a given programming language and a computer system. Declarative languages,in particular functional languages such as HOPE (Burstall et al.,1980)and ML (Milner,1978)allow for more succinct specifica- tion of ADTs and their implementations.(See for example,specification of lists in the Bibliographic notes,at the end of chapter 3). However,to date,these languages do not yield the same level of efficiency on conventional computers. Prerequisites The reader is assumed to have practical experience of a Pascal-like language. Although Pascal is used to express all the algorithms,they could easily be trans- formed into other similar languages.The efficiency of the algorithms is a major concern of the book and therefore it is advantageous if the reader has some knowledge about how various constructs of a Pascal-like language are imple- mented-with a view to comparing their relative speeds.However,this is not essential since the brief explanations given in the book should be sufficient to grasp these ideas. Structure of the Book The book is in two parts.Part I begins by examining the time and space require- ments of computer algorithms and develops a notation which is used in the remainder of the book to compare various implementations of abstract data types.Chapter 2 introduces the concept of abstraction in program design and shows how data types can be abstracted from their detailed implementation issues.Section 2.3 briefly describes a method of formally specifying the data type stack and proving that its implementation conforms to that specification
xii Preface The remainder of the book relies on informal specifications and proofs.This section and the bibliographic notes at the end of chapter 3 are intended as a guide for more interested readers,should they wish to explore the topic more rigorously;therefore it may be skipped in its entirety or on the first reading. Chapters 3 to 7 then discuss a variety of abstract data types including stacks, queues,lists,sets,trees,relations and graphs.Each abstract data type is defined as a mathematical model with a number of operations defined on that model. The operations are given as a set of Pascal procedures and functions.Various implementation strategies of these data types are discussed in terms of Pascal constructs.Examples of applications of these abstract data types are also given. Part II further describes many algorithms and common techniques for deve. loping efficient algorithms using abstract data types.Programming paradigms such as divide and conquer,dynamic programming,graph searching,tabulation techniques and randomised algorithms,are discussed.Chapter 9 develops a divide and conquer algorithm for the sorting problem and it is shown how implemen- tation considerations are delayed until the final stage of program design.Finally, it is emphasised that these algorithm design techniques are not universal tools for producing efficient algorithms for all problems.In particular,chapter 12 briefly discusses the class of 'hard'problems for which most algorithms are not very efficient.A few techniques are described,such as how to handle such inherently difficult problems. Exercises and Assignments Computer programming can only really be learnt through experience.Merely reading about it is no substitute for first-hand practical experience.Therefore, to reinforce the ideas,exercises are included at the end of each chapter.These are usually simple and can be tackled after the content of the chapter is well understood.Some involve verifying algorithms by checking them manually on paper,while others require the writing of programs or fragments of programs. Guidance on the solutions of selected exercises is provided at the end of the book.Since the emphasis of the book is on the efficient implementation of various abstract data types and the comparison of algorithms,a few practical assignments are also included among the exercises.These take the form of designing,writing and testing actual programs.A few guidelines on the design of data types and algorithms for these assignments are given in the solutions to exercises. Bibliographic Notes and Further Reading No book of this size can deal comprehensively with abstract data types,data structures,algorithm design and the theory and practice of programming with data.However,the material presented here should provide an introduction to the practical issues of abstract data type and algorithm design.For motivated
Preface xⅲ readers,each chapter concludes with a brief list of related literature and some comments on it.Many of the references are articles published in computer journals;these journals and their abbreviations are given below. ACM:Association for Computing Machinery ACM Computing Surveys ACM TODS:ACM Transactions on Databases Acta Informatica BIT:Behandlings Informations Tidskrift-Denmark CACM:Communication of the ACM Computer Journal-British Computer Society Journal of the ACM SIAM:Social and Industrial Applied Mathematics SIAM Journal of Computing Software Practice and Experience References Aho,A.V.,Hopcroft,J.E.and Ullman,J.D.(1974).The Design and Analysis of Computer Algorithms,Addison-Wesley,Reading,Massachusetts. Aho,A.V.,Hopcroft,J.E.and Ullman,J.D.(1983).Data Structures and Algorithms,Addison-Wesley,Reading,Massachusetts. Baron,R.J.and Shapiro,L.G.(1980).Data Structures and their Implementa- tions,PWS Publications,Boston,Massachusetts. Burstall,R.M.,MacQueen,D.B.and Sannella,D.T.(1980).'Hope:an experi- mental applicative language',in Proceedings of 1980 Lisp Conference,Stanford, California,pp.136-143. Gotlieb,C.C.and Gotlieb,L.R.(1978).Data Types and Data Structures, Prentice-Hall,Englewood Cliffs,New Jersey. Knuth,D.E.(1973).The Art of Computer Programming.Volume 1:Funda- mental Algorithms,2nd edition,Addison-Wesley,Reading,Massachusetts. Korsh,J.F.(1986).Data Structures,Algorithms and Program Style,PWS Computer Science,Boston,Massachusetts. Martin,J.J.(1986).Data Types and Data Structures,Prentice-Hall,Englewood Cliffs,New Jersey. Milner,R.(1978).'A theory of type polymorphism in programming',Journal of Computer and Systems Science,Vol.17,No.3 pp.348-375. Reingold,E.and Hansen,W.(1983).Data Structures,Little,Brown,Boston, Massachusetts. Stubbs,D.and Webre,N.(1985).Data Structures with Abstract Data Types and Pascal,Brooks/Cole Publishing,Monterey,California. Wirth,N.(1976).Algorithms Data Structures Programs,Prentice-Hall (1986). Englewood Cliffs,New Jersey. Wirth,N.(1986).Algorithms Data Structures,Prentice-Hall,Englewood Cliffs, New Jersey
Acknowledgements My sincerest thanks go to Martin Henson for many stimulating discussions.He also read the drafts of some chapters and I found his constructive comments and criticisms invaluable. I acknowledge the assistance of the Department of Computer Science at the University of Essex in providing computer facilities and support while this book was being produced. I thank all the students attending the course on data structures and algorithms at the University of Essex who provided me with many useful comments on the material presented in the book. I also greatly appreciate all the help and effort that the secretaries,Marisa Bostock and Ann Cook,put into the typing of the first draft. And last but not least,I am most grateful to my wife,Hengameh,who was my main source of encouragement whenever my enthusiasm waned,and for her continued support and forbearance. Colchester Manoochehr Azmoodeh 1990 xiv
Preliminaries and Notation The only prerequisites for reading this book are practical experience in program- ming in a Pascal-like language and a sound understanding of general programming concepts together with a good grasp of Pascal-like data types (that is,reals, integers,characters,arrays and records).However,the analysis of many data structures and algorithms in the book requires a minimal understanding of such basic mathematical concepts as sets,functions etc.For the purposes of reference and also to introduce the notation used,a brief survey of these concepts is given below.A discussion of the idiosyncrasies of Pascal syntax,as used in the book, then follows. Mathematical Backgrounds Sets A set is a collection of objects called members.The members of a set are usually of the same type.For example S:A set of colours =fred,blue,yellow,white} S2:The set of negative numbers={x lx is an integer and x<0} S3:An empty set={). The and 'are used to represent sets;in S2,x is a variable and''should read as 'such that'.Therefore set S2 should read as 'the set of all values of x such that x is an integer andx is less than 0'.'A is a member of a set S'and 'B is not a member of a set S'are denoted as:A ES and BS.Thus red∈S1 brown年S1 -15∈S2 3398年S2 Members of sets are not ordered.Therefore fred,blue}={blue,red}.The number of members in a set S is called its cardinality and is denoted by ISI.Therefore, |S11=4,IS3l=0. Subsets Set S is a subset of S'if every member of S is also a member of S'.This is denoted as SC S'.We denote'S is not a subset of S''as S S'. XV