[25] that leverages Service Level Agreements and business objectives to effectively manage IT resources at runtime, although they mainly focus on automating IT management and operations. Zhang et al. proposed a QoS-Aware Optimization Framework [26] to minimize the number of machines subject to response time and throughput requirements, utilizing the cross-layer relationship from business level to IT resource level; their work gives more emphasis to performance optimization.

VI. CONCLUSION

In this paper, we have proposed a workflow-based methodology to perform availability weak-point analysis over an SOA deployment framework. This computationally efficient methodology calculates a near-optimal solution: it minimizes the overall HA enhancement cost while satisfying the business-level availability requirements. We have shown by experimental evaluation that our analysis methodology can find a near-optimal solution and outperforms the exhaustive iteration method in computational efficiency.

A number of additional aspects still need to be addressed to provide high availability in an SOA deployment topology. One is the identification of availability weak points in the network, and of corresponding HA patterns to improve the network availability when necessary. This is somewhat different from our current work, as the network is typically burdened with security and isolation constraints that need to be taken into account while the availability weak points are addressed.
Additional follow-on work is to take performance requirements into account, because different HA solutions have different performance impacts on the IT infrastructure and business workflows. For example, cluster-based HA solutions generally improve performance, while failover-based HA solutions usually degrade performance.

ACKNOWLEDGMENTS

This is an enhanced version of the paper that we presented at NOMS 2008 [5].

The authors would like to thank Guerney Hunt, Jeffrey Kephart, Tamar Eilam, Alexander V. Konstantinou, and Alexander A. Totok for their comments and feedback that helped us shape our vision and improve this paper. The authors would also like to thank J.P. Martin-Flatin for greatly helping us to improve this paper.

REFERENCES

[1] R. Radhakrishnan, B. Sriraman, Aligning Architectural Approaches towards an SOA-Based Enterprise Architecture, in Proceedings of the Working IEEE/IFIP Conference on Software Architecture 2007 (WICSA), Mumbai, India, Jan. 2007, pp. 38-38.
[2] IBM International Technical Support Organization, WebSphere Application Server Network Deployment V6: High Availability Solutions, Oct. 2005, http://www.redbooks.ibm.com/redbooks/SG246688/wwhelp/wwhimpl/java/html/wwhelp.htm.
[3] M. Kamath, G. Alonso, R. Gunthor, C. Mohan, Providing High Availability in Very Large Workflow Management Systems, in Proceedings of the Fifth International Conference on Extending Database Technology (EDBT), Avignon, France, Mar. 1996, pp. 427-442.
[4] G. Candea, A. Fox, Designing for High Availability and Measurability, in Proceedings of the First Workshop on Evaluating and Architecting System dependability (EASY), Goteborg, Sweden, July 2001.
[5] L. Xie, J. Luo, J. Qiu, J. A. Pershing, Y. Li, Y. Chen, Availability Weak-Point Analysis over an SOA Deployment Framework, in Proceedings of IEEE/IFIP International Conference on Network Operations and Management Symposium 2008 (NOMS), Salvador, Bahia, Brazil, Apr. 2008.
[6] Business Process Execution Language for Web Services version 1.1, http://www.ibm.com/developerworks/library/specification/ws-bpel/.
[7] C. H. Lie, C. L. Hwang, F. A. Tillman, Availability of Maintained Systems: A State-of-the-Art Survey, IIE Transactions, Volume 9, Issue 3, Sep. 1977, pp. 247-259.
[8] K. Kanoun, H. Madeira, J. Arlat, A Framework for Dependability Benchmarking, in Proceedings of DSN Workshop on Dependability Benchmarking of The 2002 International Conference on Dependable Systems and Networks (DSN), Washington D.C., USA, Jun. 2002, pp. 12-15.
[9] G. Janakiraman, J. R. Santos, Y. Turner, Automated Multi-Tier System Design for Service Availability, in Proceedings of the First Workshop on Design of Self-Managing Systems (DSN), San Francisco, CA, USA, Jun. 2003.
[10] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, Academic Press, 1982.
[11] W. Arnold, T. Eilam, M. Kalantar, A. V. Konstantinou, A. A. Totok, Pattern based SOA Deployment, in Proceedings of the Fifth International Conference on Service Oriented Computing (ICSOC), Vienna, Austria, Sep. 2007, pp. 1-12.
[12] MATLAB - The Language of Technical Computing, http://www.mathworks.com/products/matlab/.
[13] Optimization Toolbox For Use with MATLAB Users Guide Version 2, http://www.mathworks.com/products/optimization/.
[14] M. Sullivan, R. Chillarege, Software Defects and their Impact on System Availability - A Study of Field Failures in Operating Systems, in Proceedings of the 21st International Symposium on Fault-Tolerant Computing (FTCS), Jun. 1991, pp. 2-9.
[15] J. B. Dugan, K. S. Trivedi, Coverage Modeling for Dependability Analysis of Fault-Tolerant Systems, IEEE Transactions on Computers, Volume 38, Issue 6, Jun. 1989, pp. 775-787.
[16] A.-E. Rugina, K. Kanoun, M. Kaaniche, A System Dependability Modeling Framework Using AADL and GSPNs, Architecting Dependable Systems IV, Volume 4615/2007, pp. 14-38.
[17] A. P. A. Van Moorsel, Action models: a reliability modeling formalism for fault-tolerant distributed computing systems, in Proceedings of IEEE International Computer Performance and Dependability Symposium 1998 (IPDS), Durham, North Carolina, USA, Sep. 1998, pp. 119-128.
[18] K. D. Figiel, D. R. Sule, A generalized reliability block diagram (RBD) simulation, in Proceedings of Winter Simulation Conference 1990 (WSC), New Orleans, Louisiana, Dec. 1990, pp. 551-556.
[19] R. Robinson, A. Polozoff, IBM WebSphere Developer Technical Journal: Planning for Availability in the Enterprise, IBM Software Services for WebSphere, Dec. 2003, http://www.ibm.com/developerworks/websphere/techjournal/0312 polozoff/polozoff.html.
[20] M. Y. Chen, E. Kiciman, E. Fratkin, A. Fox, E. A. Brewer, Pinpoint: Problem Determination in Large, Dynamic Internet Services, in Proceedings of 2002 International Conference on Dependable Systems and Networks (DSN), Bethesda, MD, USA, Jun. 2002.
[21] Hewlett Packard Company, HP MC/ServiceGuard, Jan. 2003, www.hp.com/products1/unix/highavailability/ar/mcserviceguard/index.html.
[22] Sun Microsystems, Cluster, Jan. 2003, www.sun.com/software/cluster/.
[23] Hewlett Packard, TruCluster Software, Jan. 2003, www.tru64unix.compaq.com/cluster/.
[24] V. Castelli, R. E. Harper, P. Heidelberger, S. W. Hunter, K. S. Trivedi, K. Vaidyanathan, W. P. Zeggert, Proactive management of software aging, IBM Journal of Research and Development, 45(2):311-332, Mar. 2001.
[25] I. Aib, M. Sall, C. Bartolini, A. Boulmakoul, R. Boutaba, G. Pujolle, Business-aware Policy-based Management, in Proceedings of the First IEEE International Workshop on Business-Driven IT Management (BDIM), Vancouver, Canada, Apr. 2006.
[26] C. Zhang, R. N. Chang, C. S. Perng, E. So, C. Q. Tang, T. Tao, QoS-Aware Optimization of Composite-Service Fulfillment Policy, in Proceedings of 2007 IEEE International Conference on Services Computing (SCC), Salt Lake City, Utah, USA, Jul. 2007, pp. 11-19.