© 2000 by CRC Press LLC

Electronics Engineer's Handbook, study materials (English edition): Chapter 96, Operating Systems

While there are many advantages to distributed systems, there are also several disadvantages. The primary difficulty is that software for implementing distributed systems is large and complex. Small personal computers could not effectively run modern distributed applications. Software development tools for this environment are not well advanced. Thus, application developers are having a difficult time working in this environment.

An additional problem is network speed. Most office networks are currently based on IEEE standard 802.3 [IEEE, 1985], commonly (although erroneously) called Ethernet, which operates at 10 Mb/s (ten million bits per second). With this limited bandwidth, it is easy to saturate the network. While higher-speed networks such as FDDI¹ and ATM² networks do exist, they are not yet in common use.

While distributed computing has many advantages, we must also understand that without appropriate safeguards, our data may not be secure. Security is a difficult problem in a distributed environment.
Whom do you trust when there are potentially thousands of users with access to your local system? A network is subject to security attack by a number of mechanisms. It is possible to monitor all packets going across the network; hence, unencrypted data are easily obtained by an unauthorized user. A malicious user may cause a denial-of-service attack by flooding the network with packets, making all systems inaccessible to legitimate users.

Finally, we must deal with the problem of scale. To connect a few dozen or even a few hundred computers together may not cause a problem with current software. However, global networks of computers are now being installed. Scaling our current software to work with tens of thousands of computers running across large geographic boundaries with many different types of networks is a challenge that has not yet been met.

96.4 Fault-Tolerant Systems

Most computers simply stop running when they break. We take this as a given. There are many environments, however, where it is not acceptable for the computer to stop working. The space shuttle is a good example. There are other environments where you would simply prefer if the system continued to operate. A business using a computer for order entry can continue to operate if the computer breaks, but the cost and inconvenience may be high. Fault-tolerant systems are composed of specially designed hardware and software that are capable of continuous operation.

To build a fault-tolerant system requires both hardware and software modifications. Let's take a look at an example of a small problem that illustrates the type of changes that must be made. Remember, the goal of such a system is to achieve continuous operation. That means we can never purposely shut the computer off. How then do we repair the system if we cannot shut it off? First, the hardware must be capable of having circuit boards plugged and unplugged while the system is running; this is not possible on most computers.
Second, removing a board must be detected by the hardware and reported to the operating system. The operating system, the manager of resources, must then discontinue use of that resource. Each component of the computer system, both hardware and software, must be specially built to handle failures. It should also be obvious that a fault-tolerant system must have redundant hardware. If, for example, a disk controller should fail, there must be another controller communicating with the disks that can take over.

One problem with implementing a fault-tolerant system is knowing when something has failed. If a circuit board totally ceases operation, we can determine the failure by its lack of response to commands. Another failure mode exists where the failing component appears to work but is operating incorrectly. A common approach to detect this problem is a voting mechanism. By implementing three hardware replicas, the system can detect when any one has failed by its producing output inconsistent with the other two. In that case, the output of the two components in agreement is used.

The operating system must be capable of restarting a program from a known point when a component on which the program was running has failed. The system can use checkpoints for this purpose. When an application program reaches a known state, such as when it completes a transaction, it stores the current state of the

¹ Fiber distributed data interface. The FDDI standard specifies an optical fiber ring with a data rate of 100 Mb/s.
² Asynchronous transfer mode. A packet-oriented transfer mode moving data in fixed-size packets called cells. There is no fixed speed for ATM. Typical speed is currently 155 Mb/s, although there are implementations running at 2 Gb/s.
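
The voting mechanism described in this section can be illustrated with a minimal software sketch. It is not taken from the handbook: the function name `vote` and the two stand-in replica functions are hypothetical, and a real fault-tolerant system would perform the comparison in hardware. The sketch assumes exactly three replicas whose outputs can be compared for equality.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence, Tuple


def vote(outputs: Sequence[Hashable]) -> Tuple[Hashable, Optional[int]]:
    """Majority-vote over three replica outputs.

    Returns the agreed output and the index of the replica that
    disagreed, or None when all three replicas agree.
    """
    if len(outputs) != 3:
        raise ValueError("this sketch assumes exactly three replicas")
    # Find the most common output and how many replicas produced it.
    value, count = Counter(outputs).most_common(1)[0]
    if count == 3:
        return value, None
    if count == 2:
        # Exactly one replica is inconsistent with the other two.
        faulty = next(i for i, out in enumerate(outputs) if out != value)
        return value, faulty
    raise RuntimeError("no two replicas agree; the fault cannot be masked")


# Hypothetical replicas: two healthy, one that appears to work
# but is operating incorrectly.
def healthy(x: int) -> int:
    return x + 1


def broken(x: int) -> int:
    return x + 2


replicas = [healthy, healthy, broken]
result, faulty = vote([replica(41) for replica in replicas])
print(result, faulty)  # prints "42 2"
```

With three replicas a single incorrect component is both detected and masked, because the two outputs in agreement are used. If all three outputs differ, the sketch raises an error, since no majority exists.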
