30 Concurrency, distribution, client-server and the Internet Like humans, computers can team up with their peers to achieve results that none of them could obtain alone; unlike humans, they can do many things at once (or with the appearance of simultaneity), and do all of them well. So far, however, the discussion has implicitly assumed that the computation is sequential — proceeds along a single thread of control. We should now see what happens when this assumption no longer holds, as we move to concurrent (also known as parallel) computation. Concurrency is not a new subject, but for a long time interest in it remained mostly confined to four application areas: operating systems, networking, implementation of database management systems, and high-speed scientific software. Although strategic and prestigious, these tasks involve only a small subset of the software development community. Things have changed. Concurrency is quickly becoming a required component of just about every type of application, including some which had traditionally been thought of as fundamentally sequential in nature. Beyond mere concurrency, our systems, whether or not client-server, must increasingly become distributed over networks, including the network of networks — the Internet. This evolution gives particular urgency to the central question of this chapter: can we apply object-oriented ideas in a concurrent and distributed context? Not only is this possible: object technology can help us develop concurrent and distributed applications simply and elegantly. 30.1 A SNEAK PREVIEW As usual, this discussion will not throw a pre-cooked answer at you, but instead will carefully build a solution from a detailed analysis of the problem and an exploration of possible avenues, including a few dead ends. 
Although necessary to make you understand the techniques in depth, this thoroughness might lead you to believe that they are complex; that would be inexcusable, since the concurrency mechanism on which we will finally settle is in fact characterized by almost incredible simplicity. To avoid this risk, we will begin by examining a summary of the mechanism, without any of the rationale. Warning: SPOILER! If you hate “spoilers”, preferring to start with the full statement of the issues and to let the drama proceed to its dénouement step by step and inference by inference, ignore the one-page summary that follows and skip directly to the next section (30.2).
The extension covering full-fledged concurrency and distribution will be as minimal as it can get starting from a sequential notation: a single new keyword — separate. How is this possible? We use the fundamental scheme of O-O computation: feature call, x ● f (a), executed on behalf of some object O1 and calling f on the object O2 attached to x, with the argument a. But instead of a single processor that handles operations on all objects, we may now rely on different processors for O1 and O2 — so that the computation on O1 can move ahead without waiting for the call to terminate, since another processor handles it. Because the effect of a call now depends on whether the objects are handled by the same processor or different ones, the software text must tell us unambiguously what the intent is for any x. Hence the need for the new keyword: rather than just x: SOME_TYPE, we declare x: separate SOME_TYPE to indicate that x is handled by a different processor, so that calls of target x can proceed in parallel with the rest of the computation. With such a declaration, any creation instruction !! x ● make (…) will spawn off a new processor — a new thread of control — to handle future calls on x. Nowhere in the software text should we have to specify which processor to use. All we state, through the separate declaration, is that two objects are handled by different processors, since this radically affects the system’s semantics. Actual processor assignment can wait until run time. Nor do we settle too early on the exact nature of processors: a processor can be implemented by a piece of hardware (a computer), but just as well by a task (process) of the operating system, or, on a multithreaded OS, just a thread of such a task.
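In class text form, the declarations just described might look as follows. This is only an illustrative sketch; the class names MANAGER and WORKER, and their features, are hypothetical:

```eiffel
class MANAGER feature

    worker: separate WORKER
            -- An object handled by a different processor.

    process is
            -- Start a worker on its own processor, then proceed without waiting.
        do
            !! worker.make
                -- Creation spawns a new processor (a new thread of control)
                -- to handle future calls on `worker'.
            local_step
                -- Executes in parallel with any pending operations on `worker'.
        end

    local_step is
        do
            -- Work carried out by the current object's own processor.
        end

end -- class MANAGER
```

Note that nothing in this text says which physical resource will serve as the worker's processor; that mapping is deferred to the Concurrency Configuration File mentioned below.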
Viewed by the software, “processor” is an abstract concept; you can execute the same concurrent application on widely different architectures (time-sharing on one computer, distributed network with many computers, threads within one Unix or Windows task…) without any change to its source text. All you will change is a “Concurrency Configuration File” which specifies the last-minute mapping of abstract processors to physical resources. We need to specify synchronization constraints. The conventions are straightforward: • No special mechanism is required for a client to resynchronize with its supplier after a separate call x ● f (a) has gone off in parallel. The client will wait when and if it needs to: when it requests information on the object through a query call, as in value := x ● some_query. This automatic mechanism is called wait by necessity. • To obtain exclusive access to a separate object O2, it suffices to use the attached entity a as an argument to the corresponding call, as in r (a). • A routine precondition involving a separate argument such as a causes the client to wait until the precondition holds. • To guarantee that we can control our software and predict the result (in particular, rest assured that class invariants will be maintained), we must allow the processor in charge of an object to execute at most one routine at any given time. • We may, however, need to interrupt the execution of a routine to let a new, high-priority client take over. This will cause an exception, so that the spurned client can take the appropriate corrective measures — most likely retrying after a while. This covers most of the mechanism, which will enable us to build the most advanced concurrent and distributed applications through the full extent of O-O techniques, from multiple inheritance to Design by Contract — as we will now study in detail, forgetting for a while all that we have read in this short preview. A complete summary appears in 30.11, page 1025.
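Several of these conventions can be seen at work in one short routine. In this hypothetical sketch (class BUFFER and its features empty, item and remove are illustrative only), a consumer obtains exclusive access to a separate buffer by passing it as an argument, waits through the precondition until the buffer is non-empty, and resynchronizes, by wait by necessity, at the query call:

```eiffel
class CONSUMER feature

    consume (b: separate BUFFER) is
            -- Remove and use one item of `b'.
            -- Passing `b' as an argument gives exclusive access to it;
            -- the precondition makes the client wait until it holds.
        require
            not b.empty
        local
            x: INTEGER
        do
            x := b.item
                -- A query call: wait by necessity resynchronizes here.
            b.remove
        end

end -- class CONSUMER
```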
30.2 THE RISE OF CONCURRENCY Back to square one. We must first review the various forms of concurrency, to understand how the evolution of our field requires most software developers to make concurrency part of their mindset. In addition to the traditional concepts of multiprocessing and multiprogramming, the past few years have introduced two innovative concepts: object request brokers and remote execution through the Net. Multiprocessing More and more, we want to use the formidable amount of computing power available around us; less and less, we are willing to wait for the computer (although we have become quite comfortable with the idea that the computer is waiting for us). So if one processing unit would not bring us quickly enough the result that we need, we will want to rely on several units working in parallel. This form of concurrency is known as multiprocessing. Spectacular applications of multiprocessing have involved researchers relying on hundreds of computers scattered over the Internet, at times when the computers’ (presumably consenting) owners did not need them, to solve computationally intensive problems such as breaking cryptographic algorithms. Such efforts do not just apply to computing research: Hollywood’s insatiable demand for realistic computer graphics has played its part in fueling progress in this area; the preparation of the movie Toy Story, one of the first to involve artificial characters only (only the voices are human), relied at some point on a network of more than one hundred high-end workstations — more economical, it seems, than one hundred professional animators. Multiprocessing is also ubiquitous in high-speed scientific computing, to solve ever larger problems of physics, engineering, meteorology, statistics, investment banking.
More routinely, many computing installations use some form of load balancing: automatically dispatching computations among the various computers available at any particular time on the local network of an organization. Another form of multiprocessing is the computing architecture known as client-server computing, which assigns various specialized roles to the computers on a network: the biggest and most expensive machines, of which a typical company network will have just one or a few, are “servers” handling shared databases, heavy computations and other strategic central resources; the cheaper machines, ubiquitously located wherever there is an end user, handle decentralizable tasks such as the human interface and simple computations; they forward to the servers any task that exceeds their competence. The current popularity of the client-server approach is a swing of the pendulum away from the trend of the preceding decade. Initially (nineteen-sixties and seventies) architectures were centralized, forcing users to compete for resources. The personal computer and workstation revolution of the eighties was largely about empowering users with resources theretofore reserved to the Center (the “glass house” in industry jargon). Then they discovered the obvious: a personal computer cannot do everything, and some resources must be shared. Hence the emergence of client-server architectures in the nineties. The inevitable cynical comment — that we are back to the one-mainframe-many-terminals architecture of our youth, only with more expensive terminals now called “client workstations” — is not really justified: the industry is simply searching, through trial and error, for the proper tradeoff between decentralization and sharing.
Multiprogramming The other main form of concurrency is multiprogramming, which involves a single computer working on several tasks at once. If we consider general-purpose systems (excluding processors that are embedded in an application device, be it a washing machine or an airplane instrument, and single-mindedly repeat a fixed set of operations), computers are almost always multiprogrammed, performing operating system tasks in parallel with application tasks. In a strict form of multiprogramming the parallelism is apparent rather than real: at any single time the processing unit is actually working on just one job; but the time to switch between jobs is so short that an outside observer can believe they proceed concurrently. In addition, the processing unit itself may do several things in parallel (as in the advance fetch schemes of many computers, where each clock cycle loads the next instruction at the same time it executes the current one), or may actually be a combination of several processing units, so that multiprogramming becomes intertwined with multiprocessing. A common application of multiprogramming is time-sharing, allowing a single machine to serve several users at once. But except in the case of very powerful “mainframe” computers this idea is considered much less attractive now than it was when computers were a precious rarity. Today we consider our time to be the more valuable resource, so we want the system to do several things at once just for us. In particular, multi-windowing user interfaces allow several applications to proceed in parallel: in one window we browse the Web, in another we edit a document, in yet another we compile and test some software. All this requires powerful concurrency mechanisms. Providing each computer user with a multi-windowing, multiprogramming interface is the responsibility of the operating system.
But increasingly the users of the software we develop want to have concurrency within one application. The reason is always the same: they know that computing power is available by the bountiful, and they do not want to wait idly. So if it takes a while to load incoming messages in an e-mail system, you will want to be able to send an outgoing message while this operation proceeds. With a good Web browser you can access a new site while loading pages from another. In a stock trading system, you may at any single time be accessing market information from several stock exchanges, buying here, selling there, and monitoring a client’s portfolio. It is this need for intra-application concurrency which has suddenly brought the whole subject of concurrent computing to the forefront of software development and made it of interest far beyond its original constituencies. Meanwhile, all the traditional applications remain as important as ever, with new developments in operating systems, the Internet, local area networks, and scientific computing — where the continual quest for speed demands ever higher levels of multiprocessing.
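The e-mail scenario just described, expressed with the separate mechanism previewed in 30.1, might be sketched as follows; class names MAIL_CLIENT, MAILBOX and MESSAGE, and their features, are hypothetical:

```eiffel
class MAIL_CLIENT feature

    mailbox: separate MAILBOX
            -- Handled by its own processor.

    send_while_loading (out_message: MESSAGE) is
            -- Start loading incoming messages, then send `out_message'
            -- without waiting for the loading to complete.
        do
            start_loading (mailbox)
            send (out_message)
                -- Proceeds in parallel with the loading.
        end

    start_loading (m: separate MAILBOX) is
            -- Ask `m' to fetch incoming messages.
        do
            m.load_incoming
                -- A separate call: control returns to the caller at once.
        end

    send (out_message: MESSAGE) is
        do
            -- Transmit `out_message'.
        end

end -- class MAIL_CLIENT
```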
Object request brokers Another important recent development has been the emergence of the CORBA proposal from the Object Management Group, and the OLE 2/ActiveX architecture from Microsoft. Although the precise goals, details and markets differ, both efforts promise substantial progress towards distributed computing. The general purpose is to allow applications to access each other’s objects and services as conveniently as possible, either locally or across a network. The CORBA effort (more precisely its CORBA 2 stage, clearly the interesting one) has also placed particular emphasis on interoperability: • CORBA-aware applications can cooperate even if they are based on “object request brokers” from different vendors. • Interoperability also applies to the language level: an application written in one of the supported languages can access objects from an application written in another. The interaction goes through an intermediate language called IDL (Interface Definition Language); supported languages have an official IDL binding, which maps the constructs of the language to those of IDL. IDL is a common-denominator O-O language centered on the notion of interface. An IDL interface for a class is similar in spirit to a short form, although more rudimentary (IDL in particular does not support assertions); it describes the set of features available on a certain abstraction. From a class written in an O-O language such as the notation of this book, tools will derive an IDL interface, making the class and its instances of interest to client software. A client written in the same language or another can, through an IDL interface, access across a network the features provided by such a supplier. Remote execution Another development of the late nineties is the mechanism for remote execution through the World-Wide Web.
The first Web browsers made it not just possible but also convenient to explore information stored on remote computers anywhere in the world, and to follow logical connections, or hyperlinks, at the click of a button. But this was a passive mechanism: someone prepared some information, and everyone else accessed it read-only. The next step was to move to an active setup where clicking on a link actually triggers execution of an operation. This assumes the presence, within the Web browser, of an execution engine which can recognize the downloaded information as executable code, and execute it. The execution engine can be a built-in part of the browser, or it may be dynamically attached to it in response to the downloading of information of the corresponding type. This latter solution is known as a plug-in mechanism and assumes that users interested in a particular execution mechanism can download the execution engine, usually free, from the Internet.
This idea was first made popular by Java in late 1995 and 1996; Java execution engines have become widely available. Plug-ins have since appeared for many other mechanisms. An alternative to providing a specific plug-in is to generate, from any source language, code for a widely available engine, such as a Java engine; several compiler vendors have indeed started to provide generators of Java “bytecode” (the low-level portable code that the Java engine can execute). For the notation of this book the two avenues have been pursued: ISE has a free execution engine; and at the time of writing a project is in progress to generate Java bytecode. Either approach raises the potential of security problems: how much do you trust someone’s application? If you are not careful, clicking on an innocent-looking hyperlink could unleash a vicious program that destroys files on your computer, or steals your personal information. More precisely you should not, as a user, be the one asked to be careful: the responsibility is on the provider of an execution engine and the associated library of basic facilities. Some widely publicized Java security failures in 1996 caused considerable worries about the issue. The solution is to use carefully designed and certified execution engines and libraries coming from reputable sources. Often they will have two versions: • One version is meant for unlimited Internet usage, based on a severely restricted execution engine. In ISE’s tool, the I/O library facilities in this restricted version only read and write to and from the terminal, not files. The “external” mechanism of the language has also been removed, so that a vicious application cannot cause mischief by going to C, say, to perform file manipulations. The Java “Virtual Machine” (the engine) is also draconian in what it permits Internet “applets” to do with the file system of your computer.
The other version has fewer or no such restrictions,and provides the full power of the libraries,file I/O in particular.It is meant for applications that will run ona secure Intranet (internal company network)rather than the wilderness of the Internet. In spite of the insecurity specter,the prospect of unfettered remote execution,a new step in the ongoing revolution in the way we distribute software,has generated enormous excitement,which shows no sign of abating. 30.3 FROM PROCESSES TO OBJECTS To support all these mind-boggling developments,requiring ever more use of concurrent processing,we need powerful software support.How are we going to program these things?Object technology,of course,suggests itself. Robin Milner is said to have exclaimed,in a 1991 workshop at an O-O conference, Cited in [Matsuoka "I can't understand why objects [of-O languages]are not concurrent in the first place". 1993]. Even if only in the second or third place,how do we go about making objects concurrent?
956 CONCURRENCY, DISTRIBUTION, CLIENT-SERVER AND THE INTERNET §30.3

This idea was first made popular by Java in late 1995 and 1996; Java execution engines have become widely available. Plug-ins have since appeared for many other mechanisms. An alternative to providing a specific plug-in is to generate, from any source language, code for a widely available engine, such as a Java engine; several compiler vendors have indeed started to provide generators of Java “bytecode” (the low-level portable code that the Java engine can execute). For the notation of this book the two avenues have been pursued: ISE has a free execution engine; and at the time of writing a project is in progress to generate Java bytecode.

Either approach raises the potential of security problems: how much do you trust someone’s application? If you are not careful, clicking on an innocent-looking hyperlink could unleash a vicious program that destroys files on your computer, or steals your personal information. More precisely you should not, as a user, be the one asked to be careful: the responsibility is on the provider of an execution engine and the associated library of basic facilities. Some widely publicized Java security failures in 1996 caused considerable worries about the issue.

The solution is to use carefully designed and certified execution engines and libraries coming from reputable sources. Often they will have two versions:

• One version is meant for unlimited Internet usage, based on a severely restricted execution engine. In ISE’s tool, the I/O library facilities of this restricted version only read and write to and from the terminal, not files. The “external” mechanism of the language has also been removed, so that a vicious application cannot cause mischief by going to C, say, to perform file manipulations. The Java “Virtual Machine” (the engine) is also draconian in what it permits Internet “applets” to do with the file system of your computer.
• The other version has fewer or no such restrictions, and provides the full power of the libraries, file I/O in particular. It is meant for applications that will run on a secure Intranet (internal company network) rather than the wilderness of the Internet.

In spite of the insecurity specter, the prospect of unfettered remote execution, a new step in the ongoing revolution in the way we distribute software, has generated enormous excitement, which shows no sign of abating.

30.3 FROM PROCESSES TO OBJECTS

To support all these mind-boggling developments, requiring ever more use of concurrent processing, we need powerful software support. How are we going to program these things? Object technology, of course, suggests itself.

Robin Milner is said to have exclaimed, in a 1991 workshop at an O-O conference, “I can’t understand why objects [of O-O languages] are not concurrent in the first place” (cited in [Matsuoka 1993]). Even if only in the second or third place, how do we go about making objects concurrent?
If we start from non-O-O concurrency work, we will find that it largely relies on the notion of process. A process is a program unit that acts like a special-purpose computer: it executes a certain algorithm, usually repeating it until some external event triggers termination. A typical example is the process that manages a printer, repeatedly executing

	“Wait until there is at least a job in the print queue”
	“Get the next print job and remove it from the queue”
	“Print the job”

Various concurrency models differ in how processes are scheduled and synchronized, compete for shared hardware resources, and exchange information. In some concurrent programming languages, you directly describe a process; in others, such as Ada, you may also describe process types, which at run time are instantiated into processes, much as the classes of object-oriented software are instantiated into objects.

Similarities

The correspondence seems indeed clear. As we start exploring how to combine ideas from concurrent programming and object-oriented software construction, it seems natural to identify processes with objects, and process types with classes. Anyone who has studied concurrent computing and discovers O-O development, or the other way around, will be struck by the similarities between these two technologies:

• Both rely on autonomous, encapsulated modules: processes or process types; classes.

• Like processes and unlike the subroutines of sequential, non-O-O approaches, objects will, from each activation to the next, retain the values they contain.

• To build reasonable concurrent systems, it is indispensable in practice to enforce heavy restrictions on how modules can exchange information; otherwise things quickly get out of hand. The O-O approach, as we have seen, places similarly severe restrictions on inter-module communication.
• The basic mechanism for such communication may loosely be described, in both cases, under the general label of “message passing”.

So it is not surprising that many people have had a “Eureka!” when first thinking, Milner-like, about making objects concurrent. The unification, it seems, should come easily. This first impression is unfortunately wrong: after the similarities, one soon stumbles into the discrepancies.

Active objects

Building on the analogies just summarized, a number of proposals for concurrent O-O mechanisms (see the bibliographical notes) have introduced a notion of “active object”. An active object is an object that is also a process: it has its own program to execute. In a definition from a book on Java (Doug Lea, “Concurrent Programming in Java”, Addison-Wesley, 1996):

	“Each object is a single, identifiable process-like entity (not unlike a Unix process) with state and behavior.”
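To make the notion concrete, here is a sketch, in Python rather than the book's notation, of an active object in the sense just quoted: the printer process of the earlier example, realized as a thread with its own agenda and state retained across activations (the class and job names are illustrative, not from the book):

```python
import queue
import threading

class PrinterProcess(threading.Thread):
    """An 'active object': a process-like entity with state and behavior.
    It has its own program to execute (run), which loops over a print queue."""

    def __init__(self, jobs):
        super().__init__()
        self.jobs = jobs          # shared print queue
        self.printed = []         # state retained from one activation to the next

    def run(self):
        # The object's own agenda: wait for a job, remove it, print it.
        while True:
            job = self.jobs.get()     # "wait until there is at least a job"
            if job is None:           # external event triggering termination
                break
            self.printed.append(job)  # stand-in for actually printing the job

# Usage: clients can only feed the queue; the scheduling of operations
# belongs entirely to the object, not to its clients.
q = queue.Queue()
p = PrinterProcess(q)
p.start()
for j in ["report.txt", "photo.png"]:
    q.put(j)
q.put(None)   # request termination
p.join()
```

Note how the client's only access point is the queue: it cannot ask the object for an arbitrary service at an arbitrary time, which is precisely the conflict discussed next.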
This notion, however, raises difficult problems.

The most significant one is easy to see. A process has its own agenda: as illustrated by the printer example, it relentlessly executes a certain sequence of actions. Not so with classes and objects. An object does not do one thing; it is a repository of services (the features of the generating class), and just waits for the next client to solicit one of those services — chosen by the client, not the object. If we make the object active, it becomes responsible for the scheduling of its operations. This creates a conflict with the clients, which have a very clear view of what the scheduling should be: they just want the supplier, whenever they need a particular service, to be ready to provide it immediately!

The problem arises in non-object-oriented approaches to concurrency and has led to mechanisms for synchronizing processes — that is to say, specifying when and how each is ready to communicate, waiting if necessary for the other to be ready too. For example in a very simple, unbuffered producer-consumer scheme we may have a producer process that repeatedly executes

	“Make it known that producer is not ready”
	“Perform some computation that produces a value x”
	“Make it known that producer is ready”
	“Wait for consumer to be ready”
	“Pass x to consumer”

and a consumer process that repeatedly executes

	“Make it known that consumer is ready”
	“Wait for producer to be ready”
	“Get x from producer”
	“Make it known that consumer is not ready”
	“Perform some computation that uses the value x”

a scheme which we may also view pictorially:

[Figure: A simple producer-consumer scheme. Producer side: Produce, Wait, Communicate; consumer side: Wait, Communicate, Consume; the handshake passes x between the two Communicate steps.]

Communication occurs when both processes are ready for each other; this is sometimes called a handshake or rendez-vous. The design of synchronization mechanisms — enabling us in particular to express precisely the instructions “Make it known that process is ready” and “Wait for process to be ready” — has been a fertile area of research and development for several decades.
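The handshake just described can be sketched in Python (not the book's notation; the class and routine names are illustrative) using a condition variable, with the "make it known" and "wait" steps of the scheme mapped onto a shared flag:

```python
import threading

class Handshake:
    """One-slot, unbuffered handoff: producer and consumer must both be
    ready before the value x is passed (a handshake, or rendez-vous)."""

    def __init__(self):
        self.cond = threading.Condition()
        self.slot = None
        self.producer_ready = False

    def pass_to_consumer(self, x):
        with self.cond:
            self.slot = x
            self.producer_ready = True    # "make it known that producer is ready"
            self.cond.notify_all()
            while self.producer_ready:    # "wait for consumer to be ready"
                self.cond.wait()

    def get_from_producer(self):
        with self.cond:
            while not self.producer_ready:  # "wait for producer to be ready"
                self.cond.wait()
            x = self.slot                   # "get x from producer"
            self.producer_ready = False     # "make it known that producer is not ready"
            self.cond.notify_all()
            return x

h = Handshake()
received = []

def producer():
    for x in [1, 2, 3]:        # "perform some computation that produces a value x"
        h.pass_to_consumer(x)

def consumer():
    for _ in range(3):
        received.append(h.get_from_producer())  # "...that uses the value x"

t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer)
t1.start(); t2.start()
t1.join(); t2.join()
```

Because the producer blocks until its value has been taken, each communication is a true rendez-vous: neither party proceeds until both have met.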
All this is fine for processes, the concurrent equivalent of traditional sequential programs which “do one thing”; indeed, a concurrent system built with processes is like a sequential system with several main programs. But in the object-oriented approach we have rejected the notion of main program and instead defined software units that stand ready to provide any one of a number of possible features.

Reconciling this view with the notion of process requires elaborate synchronization constructs to make sure that each supplier is ready to execute a feature when the client needs it. The reconciliation is particularly delicate when both client and supplier are active objects, since each has its own agenda.

All this does not make it impossible to devise mechanisms based on the notion of active object, as evidenced by the abundant literature on the subject (to which the bibliographical notes to this chapter give many references). But this evidence also shows the complexity of the proposed solutions, of which none has gained wide acceptance, suggesting that the active object approach is not the right one.

Active objects clash with inheritance

Doubts about the suitability of the active object approach grow as one starts looking at how it combines with other O-O mechanisms, especially inheritance.

If a class B inherits from a class A and both are active (that is to say, describe instances that must be active objects), what happens in B to the description of A’s process? In many cases you will need to add some new instructions, but without special language mechanisms this means that you will almost always have to redefine and rewrite the entire process part — not an attractive proposition.

Here is an example of a special language mechanism.
Although the Simula 67 language does not support concurrency, it has a notion of active object: a Simula class can, besides its features, include a set of instructions, called the body of the class, so that we can talk of executing an object — meaning executing the body of its generating class. (See “Sequencing and inheritance”, page 1121, as part of the discussion of Simula.) The body of a class A can include a special instruction inner, which has no effect in the class itself but, in a proper descendant B, stands for the body of B. So if the body of A reads

	some_initialization; inner; some_termination_actions

and the body of B reads

	specific_B_actions

then execution of that body actually means executing

	some_initialization; specific_B_actions; some_termination_actions

Although the need for a mechanism of this kind is clear in a language supporting the notion of active object, objections immediately come to mind: the notation is misleading, since if you just read the body of B you will get a wrong view of what the execution does; it forces the parent to plan in detail for its descendants, going against basic O-O concepts (the Open-Closed principle); and it only works in a single-inheritance language.
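The effect described above can be roughly emulated in a language without inner (here Python, not Simula's actual semantics; the names some_initialization and so on are taken from the example) by having the parent's body call a hook that descendants redefine:

```python
class A:
    """Parent with a 'body': initialization, then the descendant's
    contribution (the inner part), then termination actions."""

    def __init__(self):
        self.trace = []     # records what execution actually does

    def body(self):
        self.trace.append("some_initialization")
        self.inner()        # stands for the body of a proper descendant
        self.trace.append("some_termination_actions")

    def inner(self):
        pass                # no effect in the class itself

class B(A):
    def inner(self):        # in B, inner stands for B's own body
        self.trace.append("specific_B_actions")

b = B()
b.body()
# executes: some_initialization; specific_B_actions; some_termination_actions
```

The emulation makes the objections visible too: reading B alone gives no clue that its actions are bracketed by A's, and the parent must have planned for exactly one slot of descendant behavior.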
Even with a different notation, the basic problem will remain: how to combine the process specification of a class with those of its proper descendants; how to reconcile parents’ process specifications in the case of multiple inheritance.

Later in this chapter we will see other problems, known as the “inheritance anomaly” and arising from the use of inheritance with synchronization constraints.

Faced with these difficulties, some of the early O-O concurrency proposals preferred to stay away from inheritance altogether. Although justifiable as a temporary measure to help understand the issues by separating concerns, this exclusion of inheritance cannot be sustained in a definitive approach to the construction of concurrent object-oriented software; this would be like cutting the arm because the finger itches. (For good measure, some of the literature adds that inheritance is a complex and messy notion anyway, as if telling the patient, after the operation, that having an arm was a bad idea in the first place.)

The inference that we may draw is simpler and less extreme. The problem is not object technology per se, in particular inheritance; it is not concurrency; it is not even the combination of these ideas. What causes trouble is the notion of active object.

Processes programmed

As we prepare to get rid of active objects it is useful to note that we will not really be renouncing anything. An object is able to perform many operations: all the features of its generating class. By turning it into a process, we select one of these operations as the only one that really counts. There is absolutely no benefit in doing this! Why limit ourselves to one algorithm when we can have as many as we want?

Another way to express this observation is that the notion of process need not be a built-in concept in the concurrency mechanism; processes can be programmed simply as routines.
Consider for example the concept of printer process cited at the beginning of this chapter. The object-oriented view tells us to focus on the object type, printer, and to treat the process as just one routine, say live, of the corresponding class:

	indexing
		description: "Printers handling one print job at a time"
		note: "A better version, based on a general class PROCESS, %
			%appears below under the name PRINTER"
	class PRINTER_1 feature -- Status report
		stop_requested: BOOLEAN is do … end
		oldest: JOB is do … end
	feature -- Basic operations
		setup is do … end
		wait_for_job is do … end
		remove_oldest is do … end
		print (j: JOB) is do … end

(On synchronization issues raised later: see “Synchronization for concurrent O-O computation”, page 980.)
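The same design can be sketched in Python rather than the book's Eiffel notation (the routine bodies, elided in the original, are filled in here with illustrative placeholder behavior): the printer is a passive object, and the process is just one routine, live, among the others.

```python
from collections import deque

class Printer1:
    """Printers handling one print job at a time.
    The 'process' is merely the routine live; nothing is built in,
    and clients remain free to call any other feature directly."""

    def __init__(self):
        self.queue = deque()
        self.stop_requested = False
        self.printed = []

    # Status report
    def oldest(self):
        return self.queue[0]

    # Basic operations
    def setup(self):
        self.printed.clear()

    def wait_for_job(self):
        # In a real concurrent setting this would block until a job arrives;
        # it is a stub in this sequential sketch.
        pass

    def remove_oldest(self):
        return self.queue.popleft()

    def print_job(self, j):
        self.printed.append(j)   # stand-in for actually printing

    def live(self):
        # The process, programmed as an ordinary routine.
        self.setup()
        while not self.stop_requested:
            if self.queue:
                self.wait_for_job()
                self.print_job(self.remove_oldest())
            else:
                self.stop_requested = True   # simplification for the sketch

p = Printer1()
p.queue.extend(["a.txt", "b.txt"])
p.live()
```

Nothing forces a client to call live: the object keeps all its other services available, which is exactly the point of programming the process as a routine instead of building it in.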