CLOUD COMPUTING Principles and Paradigms.pdf
2011
CLOUD COMPUTING
Principles and Paradigms

Edited by
Rajkumar Buyya, The University of Melbourne and Manjrasoft Pty Ltd., Australia
James Broberg, The University of Melbourne, Australia
Andrzej Goscinski, Deakin University, Australia

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation.
You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:
Cloud computing : principles and paradigms / edited by Rajkumar Buyya, James Broberg, Andrzej Goscinski.
p. ; cm.
Includes bibliographical references and index.
ISBN 978-0-470-88799-8 (hardback)
1. Cloud computing. I. Buyya, Rajkumar, 1970- II. Broberg, James. III. Goscinski, Andrzej.
QA76.585.C58 2011
004.67'8—dc22
2010046367

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

CONTENTS

PREFACE
ACKNOWLEDGMENTS
CONTRIBUTORS

PART I FOUNDATIONS

1 Introduction to Cloud Computing
William Voorsluys, James Broberg, and Rajkumar Buyya
1.1 Cloud Computing in a Nutshell
1.2 Roots of Cloud Computing
1.3 Layers and Types of Clouds
1.4 Desired Features of a Cloud
1.5 Cloud Infrastructure Management
1.6 Infrastructure as a Service Providers
1.7 Platform as a Service Providers
1.8 Challenges and Risks
1.9 Summary
References

2 Migrating into a Cloud
T. S. Mohan
2.1 Introduction
2.2 Broad Approaches to Migrating into the Cloud
2.3 The Seven-Step Model of Migration into a Cloud
2.4 Conclusions
Acknowledgments
References

3 Enriching the ‘Integration as a Service’ Paradigm for the Cloud Era
Pethuru Raj
3.1 An Introduction
3.2 The Onset of Knowledge Era
3.3 The Evolution of SaaS
3.4 The Challenges of SaaS Paradigm
3.5 Approaching the SaaS Integration Enigma
3.6 New Integration Scenarios
3.7 The Integration Methodologies
3.8 SaaS Integration Products and Platforms
3.9 SaaS Integration Services
3.10 Business-to-Business Integration (B2Bi) Services
3.11 A Framework of Sensor-Cloud Integration
3.12 SaaS Integration Appliances
3.13 Conclusion
References

4 The Enterprise Cloud Computing Paradigm
Tariq Ellahi, Benoit Hudzia, Hui Li, Maik A. Lindner, and Philip Robinson
4.1 Introduction
4.2 Background
4.3 Issues for Enterprise Applications on the Cloud
4.4 Transition Challenges
4.5 Enterprise Cloud Technology and Market Evolution
4.6 Business Drivers Toward a Marketplace for Enterprise Cloud Computing
4.7 The Cloud Supply Chain
4.8 Summary
Acknowledgments
References

PART II INFRASTRUCTURE AS A SERVICE (IAAS)

5 Virtual Machines Provisioning and Migration Services
Mohamed El-Refaey
5.1 Introduction and Inspiration
5.2 Background and Related Work
5.3 Virtual Machines Provisioning and Manageability
5.4 Virtual Machine Migration Services
5.5 VM Provisioning and Migration in Action
5.6 Provisioning in the Cloud Context
5.7 Future Research Directions
5.8 Conclusion
References

6 On the Management of Virtual Machines for Cloud Infrastructures
Ignacio M. Llorente, Rubén S. Montero, Borja Sotomayor, David Breitgand, Alessandro Maraschini, Eliezer Levy, and Benny Rochwerger
6.1 The Anatomy of Cloud Infrastructures
6.2 Distributed Management of Virtual Infrastructures
6.3 Scheduling Techniques for Advance Reservation of Capacity
6.4 Capacity Management to meet SLA Commitments
6.5 Conclusions and Future Work
Acknowledgments
References

7 Enhancing Cloud Computing Environments Using a Cluster as a Service
Michael Brock and Andrzej Goscinski
7.1 Introduction
7.2 Related Work
7.3 RVWS Design
7.4 Cluster as a Service: The Logical Design
7.5 Proof of Concept
7.6 Future Research Directions
7.7 Conclusion
References

8 Secure Distributed Data Storage in Cloud Computing
Yu Chen, Wei-Shinn Ku, Jun Feng, Pu Liu, and Zhou Su
8.1 Introduction
8.2 Cloud Storage: From LANs to WANs
8.3 Technologies for Data Security in Cloud Computing
8.4 Open Questions and Challenges
8.5 Summary
References

PART III PLATFORM AND SOFTWARE AS A SERVICE (PAAS/SAAS)

9 Aneka—Integration of Private and Public Clouds
Christian Vecchiola, Xingchen Chu, Michael Mattess, and Rajkumar Buyya
9.1 Introduction
9.2 Technologies and Tools for Cloud Computing
9.3 Aneka Cloud Platform
9.4 Aneka Resource Provisioning Service
9.5 Hybrid Cloud Implementation
9.6 Visionary thoughts for Practitioners
9.7 Summary and Conclusions
Acknowledgments
References

10 CometCloud: An Autonomic Cloud Engine
Hyunjoo Kim and Manish Parashar
10.1 Introduction
10.2 CometCloud Architecture
10.3 Autonomic Behavior of CometCloud
10.4 Overview of CometCloud-based Applications
10.5 Implementation and Evaluation
10.6 Conclusion and Future Research Directions
Acknowledgments
References

11 T-Systems’ Cloud-Based Solutions for Business Applications
Michael Pauly
11.1 Introduction
11.2 What Enterprises Demand of Cloud Computing
11.3 Dynamic ICT Services
11.4 Importance of Quality and Security in Clouds
11.5 Dynamic Data Center—Producing Business-ready, Dynamic ICT Services
11.6 Case Studies
11.7 Summary: Cloud Computing offers much more than Traditional Outsourcing
Acknowledgments
References

12 Workflow Engine for Clouds
Suraj Pandey, Dileban Karunamoorthy, and Rajkumar Buyya
12.1 Introduction
12.2 Background
12.3 Workflow Management Systems and Clouds
12.4 Architecture of Workflow Management Systems
12.5 Utilizing Clouds for Workflow Execution
12.6 Case Study: Evolutionary Multiobjective Optimizations
12.7 Visionary thoughts for Practitioners
12.8 Future Research Directions
12.9 Summary and Conclusions
Acknowledgments
References

13 Understanding Scientific Applications for Cloud Environments
Shantenu Jha, Daniel S. Katz, Andre Luckow, Andre Merzky, and Katerina Stamou
13.1 Introduction
13.2 A Classification of Scientific Applications and Services in the Cloud
13.3 SAGA-based Scientific Applications that Utilize Clouds
13.4 Discussion
13.5 Conclusions
References

14 The MapReduce Programming Model and Implementations
Hai Jin, Shadi Ibrahim, Li Qi, Haijun Cao, Song Wu, and Xuanhua Shi
14.1 Introduction
14.2 MapReduce Programming Model
14.3 Major MapReduce Implementations for the Cloud
14.4 MapReduce Impacts and Research Directions
14.5 Conclusion
Acknowledgments
References

PART IV MONITORING AND MANAGEMENT

15 An Architecture for Federated Cloud Computing
Benny Rochwerger, Constantino Vázquez, David Breitgand, David Hadas, Massimo Villari, Philippe Massonet, Eliezer Levy, Alex Galis, Ignacio M. Llorente, Rubén S. Montero, Yaron Wolfsthal, Kenneth Nagin, Lars Larsson, and Fermín Galán
15.1 Introduction
15.2 A Typical Use Case
15.3 The Basic Principles of Cloud Computing
15.4 A Model for Federated Cloud Computing
15.5 Security Considerations
15.6 Summary and Conclusions
Acknowledgments
References

16 SLA Management in Cloud Computing: A Service Provider’s Perspective
Sumit Bose, Anjaneyulu Pasala, Dheepak R. A, Sridhar Murthy, and Ganesan Malaiyandisamy
16.1 Inspiration
16.2 Traditional Approaches to SLO Management
16.3 Types of SLA
16.4 Life Cycle of SLA
16.5 SLA Management in Cloud
16.6 Automated Policy-based Management
16.7 Conclusion
References

17 Performance Prediction for HPC on Clouds
Rocco Aversa, Beniamino Di Martino, Massimiliano Rak, Salvatore Venticinque, and Umberto Villano
17.1 Introduction
17.2 Background
17.3 Grid and Cloud
17.4 HPC in the Cloud: Performance-related Issues
17.5 Summary and Conclusions
References

PART V APPLICATIONS

18 Best Practices in Architecting Cloud Applications in the AWS Cloud
Jinesh Varia
18.1 Introduction
18.2 Background
18.3 Cloud Concepts
18.4 Cloud Best Practices
18.5 GrepTheWeb Case Study
18.6 Future Research Directions
18.7 Conclusion
Acknowledgments
References

19 Massively Multiplayer Online Game Hosting on Cloud Resources
Vlad Nae, Radu Prodan, and Alexandru Iosup
19.1 Introduction
19.2 Background
19.3 Related Work
19.4 Model
19.5 Experiments
19.6 Future Research Directions
19.7 Conclusions
Acknowledgments
References

20 Building Content Delivery Networks Using Clouds
James Broberg
20.1 Introduction
20.2 Background/Related Work
20.3 MetaCDN: Harnessing Storage Clouds for Low-Cost, High-Performance Content Delivery
20.4 Performance of the MetaCDN Overlay
20.5 Future Directions
20.6 Conclusion
Acknowledgments
References

21 Resource Cloud Mashups
Lutz Schubert, Matthias Assel, Alexander Kipp, and Stefan Wesner
21.1 Introduction
21.2 Concepts of a Cloud Mashup
21.3 Realizing Resource Mashups
21.4 Conclusions
References

PART VI GOVERNANCE AND CASE STUDIES

22 Organizational Readiness and Change Management in the Cloud Age
Robert Lam
22.1 Introduction
22.2 Basic Concept of Organizational Readiness
22.3 Drivers for Changes: A Framework to Comprehend the Competitive Environment
22.4 Common Change Management Models
22.5 Change Management Maturity Model (CMMM)
22.6 Organizational Readiness Self-Assessment: (Who, When, Where, and How)
22.7 Discussion
22.8 Conclusion
Acknowledgments
References

23 Data Security in the Cloud
Susan Morrow
23.1 An Introduction to the Idea of Data Security
23.2 The Current State of Data Security in the Cloud
23.3 Homo Sapiens and Digital Information
23.4 Cloud Computing and Data Security Risk
23.5 Cloud Computing and Identity
23.6 The Cloud, Digital Identity, and Data Security
23.7 Content Level Security—Pros and Cons
23.8 Future Research Directions
23.9 Conclusion
Acknowledgments
Further Reading
References

24 Legal Issues in Cloud Computing
Janine Anthony Bowen
24.1 Introduction
24.2 Data Privacy and Security Issues
24.3 Cloud Contracting models
24.4 Jurisdictional Issues Raised by Virtualization and Data Location
24.5 Commercial and Business Considerations—A Cloud User’s Viewpoint
24.6 Special Topics
24.7 Conclusion
24.8 Epilogue
References

25 Achieving Production Readiness for Cloud Services
Wai-Kit Cheah and Henry Kasim
25.1 Introduction
25.2 Service Management
25.3 Producer Consumer Relationship
25.4 Cloud Service Life Cycle
25.5 Production Readiness
25.6 Assessing Production Readiness
25.7 Summary
References

Index

PREFACE

Cloud computing has recently emerged as one of the buzzwords in the ICT industry. Numerous IT vendors are promising to offer computation, storage, and application hosting services and to provide coverage in several continents, offering service-level agreement (SLA)-backed performance and uptime promises for their services. While these “clouds” are the natural evolution of traditional data centers, they are distinguished by exposing resources (computation, data/storage, and applications) as standards-based Web services and following a “utility” pricing model where customers are charged based on their utilization of computational resources, storage, and transfer of data. They offer subscription-based access to infrastructure, platforms, and applications that are popularly referred to as IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service). While these emerging services have increased interoperability and usability and reduced the cost of computation, application hosting, and content storage and delivery by several orders of magnitude, there is significant complexity involved in ensuring that applications and services can scale as needed to achieve consistent and reliable operation under peak loads.

Currently, expert developers are required to implement cloud services. Cloud vendors, researchers, and practitioners alike are working to ensure that potential users are educated about the benefits of cloud computing and the best way to harness the full potential of the cloud. However, being a new and popular paradigm, the very definition of cloud computing depends on which computing expert is asked.
So, while the realization of true utility computing appears closer than ever, its acceptance is currently restricted to cloud experts due to the perceived complexities of interacting with cloud computing providers. This book illuminates these issues by introducing the reader to the cloud computing paradigm. The book provides case studies of numerous existing compute, storage, and application cloud services and illustrates capabilities and limitations of current providers of cloud computing services. This allows the reader to understand the mechanisms needed to harness cloud computing in their own respective endeavors. Finally, many open research problems that have arisen from the rapid uptake of cloud computing are detailed. We hope that this motivates the reader to address these in their own future research and development. We believe the book will serve as a reference for a larger audience such as systems architects, practitioners, developers, new researchers, and graduate-level students. This book also comes with an associated Web site (hosted at http://www.manjrasoft.com/CloudBook/) containing pointers to advanced on-line resources.

ORGANIZATION OF THE BOOK

This book contains chapters authored by several leading experts in the field of cloud computing. The book is presented in a coordinated and integrated manner starting with the fundamentals and followed by the technologies that implement them. The content of the book is organized into six parts:

I. Foundations
II. Infrastructure as a Service (IaaS)
III. Platform and Software as a Service (PaaS/SaaS)
IV. Monitoring and Management
V. Applications
VI. Governance and Case Studies

Part I presents fundamental concepts of cloud computing, charting their evolution from mainframe, cluster, grid, and utility computing. Delivery models such as Infrastructure as a Service, Platform as a Service, and Software as a Service are detailed, as well as deployment models such as Public, Private, and Hybrid Clouds. It also presents models for migrating applications to cloud environments.

Part II covers Infrastructure as a Service (IaaS), from enabling technologies such as virtual machines and virtualized storage, to sophisticated mechanisms for securely storing data in the cloud and managing virtual clusters.

Part III introduces Platform and Software as a Service (PaaS/SaaS), detailing the delivery of cloud-hosted software and applications. The design and operation of sophisticated, auto-scaling applications and environments are explored.

Part IV presents monitoring and management mechanisms for cloud computing, which become critical as cloud environments become more complex and interoperable. Architectures for federating cloud computing resources are explored, as well as service-level agreement (SLA) management and performance prediction.

Part V details some novel applications that have been made possible by the rapid emergence of cloud computing resources. Best practices for architecting cloud applications are covered, describing how to harness the power of loosely coupled cloud resources. The design and execution of applications that leverage cloud resources, such as massively multiplayer online game hosting, content delivery, and mashups, are explored.

Part VI outlines the organizational, structural, regulatory, and legal issues that are commonly encountered in cloud computing environments. Details on how companies can successfully prepare for and transition to cloud environments are explored, as well as how to achieve production readiness once such a transition is completed. Data security and legal concerns are explored in detail, as users reconcile moving their sensitive data and computation to cloud computing providers.
Rajkumar Buyya
The University of Melbourne and Manjrasoft Pty Ltd., Australia
James Broberg
The University of Melbourne, Australia
Andrzej Goscinski
Deakin University, Australia

ACKNOWLEDGMENTS

First and foremost, we are grateful to all the contributing authors for their time, effort, and understanding during the preparation of the book.

We thank Professor Albert Zomaya, editor of the Wiley book series on parallel and distributed computing, for his enthusiastic support and guidance during the preparation of the book and for enabling us to easily navigate through Wiley’s publication process.

We would like to thank members of the book Editorial Advisory Board for their guidance during the preparation of the book. The board members are: Dr. Geng Lin (CISCO Systems, USA), Prof. Manish Parashar (Rutgers: The State University of New Jersey, USA), Dr. Wolfgang Gentzsch (Max-Planck-Gesellschaft, München, Germany), Prof. Omer Rana (Cardiff University, UK), Prof. Hai Jin (Huazhong University of Science and Technology, China), Dr. Simon See (Sun Microsystems, Singapore), Dr. Greg Pfister (IBM, USA (retired)), Prof. Ignacio M. Llorente (Universidad Complutense de Madrid, Spain), Prof. Geoffrey Fox (Indiana University, USA), and Dr. Walfredo Cirne (Google, USA).

All chapters were reviewed and authors have updated their chapters to address review comments. We thank members of the Melbourne CLOUDS Lab for their time and effort in peer reviewing of chapters.

Raj would like to thank his family members, especially Smrithi, Soumya, and Radha Buyya, for their love, understanding, and support during the preparation of the book. James would like to thank his wife, Amy, for her love and support. Andrzej would like to thank his wife, Teresa, for her love and support.

Finally, we would like to thank the staff at Wiley, particularly, Simone Taylor (Senior Editor, Wiley), Michael Christian (Editorial Assistant, Wiley), and S. Nalini (MPS Limited, a Macmillan Company, Chennai, India).
They were wonderful to work with!

R.B.
J.B.
A.G.

CONTRIBUTORS

MATTHIAS ASSEL, High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, 70550 Stuttgart, Germany
ROCCO AVERSA, Department of Information Engineering, Second University of Naples, 81031 Aversa (CE), Italy
SUMIT BOSE, Unisys Research Center, Bangalore, India, 560025
JANINE ANTHONY BOWEN, ESQ., McKenna Long & Aldridge LLP, Atlanta, GA 30308, USA
DAVID BREITGAND, IBM Haifa Research Lab, Haifa University Campus, 31095, Haifa, Israel
JAMES BROBERG, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
MICHAEL BROCK, School of Information Technology, Deakin University, Geelong, Victoria 3217, Australia
RAJKUMAR BUYYA, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
HAIJUN CAO, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
WAI-KIT CHEAH, Advanced Customer Services, Oracle Corporation (S) Pte Ltd., Singapore 038986
YU CHEN, Department of Electrical and Computer Engineering, State University of New York—Binghamton, Binghamton, NY 13902
XINGCHEN CHU, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
BENIAMINO DI MARTINO, Department of Information Engineering, Second University of Naples, 81031 Aversa (CE), Italy
TARIQ ELLAHI, SAP Research Belfast, BT3 9DT, Belfast, United Kingdom
MOHAMED A. EL-REFAEY, Arab Academy for Science, Technology and Maritime Transport, College of Computing and Information Technology, Cairo, Egypt
JUN FENG, Department of Electrical and Computer Engineering, State University of New York—Binghamton, Binghamton, NY 13902
FERMÍN GALÁN, Telefónica I+D, Emilio Vargas, 6. 28043 Madrid, Spain
ALEX GALIS, University College London, Department of Electronic and Electrical Engineering, Torrington Place, London WC1E 7JE, United Kingdom
ANDRZEJ GOSCINSKI, School of Information Technology, Deakin University, Geelong, Victoria 3217, Australia
DAVID HADAS, IBM Haifa Research Lab, Haifa University Campus, 31095, Haifa, Israel
BENOIT HUDZIA, SAP Research Belfast, BT3 9DT, Belfast, United Kingdom
SHADI IBRAHIM, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
ALEXANDRU IOSUP, Electrical Engineering, Mathematics and Computer Science Department, Delft University of Technology, 2628 CD, Delft, The Netherlands
SHANTENU JHA, Center for Computation and Technology and Department of Computer Science, Louisiana State University, Baton Rouge, LA 70803
HAI JIN, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
DILEBAN KARUNAMOORTHY, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
HENRY KASIM, HPC and Cloud Computing Center, Oracle Corporation (S) Pte Ltd, #18-01 Suntec Tower Four, Singapore 038986
DANIEL S. KATZ, Computation Institute, University of Chicago, Chicago, Illinois 60637
HYUNJOO KIM, Department of Electrical and Computer Engineering, Rutgers, The State University of New Jersey, New Brunswick, NJ
ALEXANDER KIPP, High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, 70550 Stuttgart, Germany
WEI-SHINN KU, Department of Computer Science and Software Engineering, Auburn University, AL 36849
ROBERT LAM, School of Information and Communication Technologies, SAIT Polytechnic, Calgary, Canada T2M 0L4
LARS LARSSON, Department of Computing Science, Umeå University, Sweden
ELIEZER LEVY, SAP Research SRC Ra’anana, Ra’anana 43665, Israel
HUI LI, SAP Research Karlsruhe, Vincenz-Priessnitz-Strasse 1, 76131 Karlsruhe, Germany
MAIK A. LINDNER, SAP Research Belfast, BT3 9DT, Belfast, United Kingdom
PU LIU, IBM Endicott Center, New York, NY
IGNACIO M. LLORENTE, Distributed Systems Architecture Research Group, Departamento de Arquitectura de Computadores y Automática, Facultad de Informática, Universidad Complutense de Madrid, 28040 Madrid, Spain
ANDRE LUCKOW, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803
GANESAN MALAIYANDISAMY, SETLabs, Infosys Technologies Limited, Electronics City, Bangalore, India, 560100
ALESSANDRO MARASCHINI, ElsagDatamat spa, Rome, Italy
PHILIPPE MASSONET, CETIC, B-6041 Charleroi, Belgium
MICHAEL MATTESS, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
ANDRE MERZKY, Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, 70803
T. S. MOHAN, Infosys Technologies Limited, Electronics City, Bangalore, India, 560100
RUBÉN S. MONTERO, Distributed Systems Architecture Research Group, Departamento de Arquitectura de Computadores y Automática, Facultad de Informática, Universidad Complutense de Madrid, 28040 Madrid, Spain
SUSAN MORROW, Avoco Secure, London W1S 2LQ, United Kingdom
SRIDHAR MURTHY, Infosys Technologies Limited, Electronics City, Bangalore, India, 560100
VLAD NAE, Institute of Computer Science, University of Innsbruck, Technikerstraße 21a, A-6020 Innsbruck, Austria
KENNETH NAGIN, IBM Haifa Research Lab, Haifa University Campus, 31095, Haifa, Israel
SURAJ PANDEY, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
MANISH PARASHAR, Department of Electrical and Computer Engineering, Rutgers, The State University of New Jersey, New Jersey, USA
ANJANEYULU PASALA, SETLabs, Infosys Technologies Limited, Electronics City, Bangalore, India, 560100
MICHAEL PAULY, T-Systems, Aachen, Germany
RADU PRODAN, Institute of Computer Science, University of Innsbruck, A-6020 Innsbruck, Austria
LI QI, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
DHEEPAK R A, SETLabs, Infosys Technologies Limited, Electronics City, Bangalore, India, 560100
PETHURU RAJ, Robert Bosch India, Bangalore 560068, India
MASSIMILIANO RAK, Department of Information Engineering, Second University of Naples, 81031 Aversa (CE), Italy
PHILIP ROBINSON, SAP Research Belfast, BT3 9DT, Belfast, United Kingdom
BENNY ROCHWERGER, IBM Haifa Research Lab, Haifa University Campus, 31095, Haifa, Israel
LUTZ SCHUBERT, High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, 70550 Stuttgart, Germany
XUANHUA SHI, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
BORJA SOTOMAYOR, Department of Computer Science, University of Chicago, Chicago, IL
KATERINA STAMOU, Department of Computer Science, Louisiana State University, Baton Rouge, LA, 70803
ZHOU SU, Department of Computer Science, Graduate School of Science and Engineering, Waseda University, Japan
JINESH VARIA, Amazon Web Services, Seattle, WA 98109
CONSTANTINO VÁZQUEZ, Facultad de Informática, Universidad Complutense de Madrid, 28040 Madrid, Spain
CHRISTIAN VECCHIOLA, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
SALVATORE VENTICINQUE, Department of Information Engineering, Second University of Naples, 81031 Aversa (CE), Italy
UMBERTO VILLANO, Department of Engineering, University of Sannio, 82100 Benevento, Italy
MASSIMO VILLARI, Department of Mathematics, Faculty of Engineering, University of Messina, 98166 Messina, Italy
WILLIAM VOORSLUYS, Department of Computer Science and Software Engineering, The University of Melbourne, Parkville, Melbourne, VIC 3010, Australia
STEFAN WESNER, High Performance Computing Center Stuttgart (HLRS), University of Stuttgart, 70550 Stuttgart, Germany
YARON WOLFSTHAL, IBM Haifa Research Lab, Haifa University Campus, 31095, Haifa, Israel
SONG WU, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China

PART I FOUNDATIONS

CHAPTER 1
INTRODUCTION TO CLOUD COMPUTING
WILLIAM VOORSLUYS, JAMES BROBERG, and RAJKUMAR BUYYA

1.1 CLOUD COMPUTING IN A NUTSHELL

When plugging an electric appliance into an outlet, we care neither how electric power is generated nor how it gets to that outlet. This is possible because electricity is virtualized; that is, it is readily available from a wall socket that hides power generation stations and a huge distribution grid. When extended to information technologies, this concept means delivering useful functions while hiding how their internals work.
Computing itself, to be considered fully virtualized, must allow computers to be built from distributed components such as processing, storage, data, and software resources. Technologies such as cluster, grid, and now, cloud computing, have all aimed at allowing access to large amounts of computing power in a fully virtualized manner, by aggregating resources and offering a single system view. In addition, an important aim of these technologies has been delivering computing as a utility. Utility computing describes a business model for on-demand delivery of computing power; consumers pay providers based on usage (“pay- as-you-go”), similar to the way in which we currently obtain services from traditional public utility services such as water, electricity, gas, and telephony. Cloud computing has been coined as an umbrella term to describe a category of sophisticated on-demand computing services initially offered by commercial providers, such as Amazon, Google, and Microsoft. It denotes a model on which a computing infrastructure is viewed as a “cloud,” from which businesses and individuals access applications from anywhere in the world on demand. The main principle behind this model is offering computing, storage, and software “as a service.” Cloud Computing: Principles and Paradigms, Edited by Rajkumar Buyya, James Broberg and Andrzej Goscinski Copyright r 2011 John Wiley & Sons, Inc. 3 4 INTRODUCTION TO CLOUD COMPUTING Many practitioners in the commercial and academic spheres have attempted to define exactly what “cloud computing” is and what unique characteristics it presents. Buyya et al. have defined it as follows: “Cloud is a parallel and distributed computing system consisting of a collection of inter-connected and virtualised computers that are dynamically provisioned and presented as one or more unified computing resources based on service-level agreements (SLA) established through negotiation between the service provider and consumers.” Vaquero et al. 
have stated “clouds are a large pool of easily usable and accessible virtualized resources (such as hardware, development platforms and/or services). These resources can be dynamically reconfigured to adjust to a variable load (scale), allowing also for an optimum resource utilization. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the Infrastructure Provider by means of customized Service Level Agreements.” A recent McKinsey and Co. report claims that “Clouds are hardware-based services offering compute, network, and storage capacity where: Hardware management is highly abstracted from the buyer, buyers incur infrastructure costs as variable OPEX, and infrastructure capacity is highly elastic.” A report from the University of California Berkeley summarized the key characteristics of cloud computing as: “(1) the illusion of infinite computing resources; (2) the elimination of an up-front commitment by cloud users; and (3) the ability to pay for use... as needed...” The National Institute of Standards and Technology (NIST) characterizes cloud computing as “... a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g. networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” In a more generic definition, Armbrust et al. define cloud as the “data center hardware and software that provide services.” Similarly, Sotomayor et al. point out that “cloud” is more often used to refer to the IT infrastructure deployed on an Infrastructure as a Service provider data center.
While there are countless other definitions, there seem to be common characteristics among the most notable ones listed above, which a cloud should have: (i) pay-per-use (no ongoing commitment, utility prices); (ii) elastic capacity and the illusion of infinite resources; (iii) self-service interface; and (iv) resources that are abstracted or virtualised.

In addition to raw computing and storage, cloud computing providers usually offer a broad range of software services. They also include APIs and development tools that allow developers to build seamlessly scalable applications upon their services. The ultimate goal is allowing customers to run their everyday IT infrastructure “in the cloud.”

A lot of hype has surrounded the cloud computing area in its infancy, often considered the most significant switch in the IT world since the advent of the Internet. In the midst of such hype, a great deal of confusion arises when trying to define what cloud computing is and which computing infrastructures can be termed as “clouds.”

Indeed, the long-held dream of delivering computing as a utility has been realized with the advent of cloud computing. However, over the years, several technologies have matured and significantly contributed to making cloud computing viable. In this direction, this introduction tracks the roots of cloud computing by surveying the main technological advancements that significantly contributed to the advent of this emerging field. It also explains concepts and developments by categorizing and comparing the most relevant R&D efforts in cloud computing, especially public clouds, management tools, and development frameworks. The most significant practical cloud computing realizations are listed, with special focus on architectural aspects and innovative technical features.
1.2 ROOTS OF CLOUD COMPUTING

We can track the roots of cloud computing by observing the advancement of several technologies, especially in hardware (virtualization, multi-core chips), Internet technologies (Web services, service-oriented architectures, Web 2.0), distributed computing (clusters, grids), and systems management (autonomic computing, data center automation). Figure 1.1 shows the convergence of technology fields that significantly advanced and contributed to the advent of cloud computing.

Some of these technologies have been tagged as hype in their early stages of development; however, they later received significant attention from academia and were sanctioned by major industry players. Consequently, a specification and standardization process followed, leading to maturity and wide adoption. The emergence of cloud computing itself is closely linked to the maturity of such technologies. We present a closer look at the technologies that form the base of cloud computing, with the aim of providing a clearer picture of the cloud ecosystem as a whole.

1.2.1 From Mainframes to Clouds

We are currently experiencing a switch in the IT world, from in-house generated computing power into utility-supplied computing resources delivered over the Internet as Web services. This trend is similar to what occurred about a century ago when factories, which used to generate their own electric power, realized that it was cheaper just to plug their machines into the newly formed electric power grid.

Computing delivered as a utility can be defined as “on demand delivery of infrastructure, applications, and business processes in a security-rich, shared, scalable, and based computer environment over the Internet for a fee”.
FIGURE 1.1. Convergence of various advances leading to the advent of cloud computing. (Fields shown: hardware, with hardware virtualization and multi-core chips; distributed computing, with utility and grid computing; Internet technologies, with SOA, Web 2.0, Web services, and mashups; systems management, with autonomic computing and data center automation; all converging on cloud computing.)

This model brings benefits to both consumers and providers of IT services. Consumers can reduce IT-related costs by choosing to obtain cheaper services from external providers as opposed to heavily investing in IT infrastructure and personnel hiring. The “on-demand” component of this model allows consumers to adapt their IT usage to rapidly increasing or unpredictable computing needs.

Providers of IT services achieve better operational costs; hardware and software infrastructures are built to provide multiple solutions and serve many users, thus increasing efficiency and ultimately leading to faster return on investment (ROI) as well as lower total cost of ownership (TCO).

Several technologies have in some way aimed at turning the utility computing concept into reality. In the 1970s, companies that offered common data processing tasks, such as payroll automation, operated time-shared mainframes as utilities, which could serve dozens of applications and often operated close to 100% of their capacity. In fact, mainframes had to operate at very high utilization rates simply because they were very expensive and their costs had to be justified by efficient usage.

The mainframe era collapsed with the advent of fast and inexpensive microprocessors, and IT data centers moved to collections of commodity servers. Apart from its clear advantages, this new model inevitably led to isolation of workload into dedicated servers, mainly due to incompatibilities between software stacks and operating systems.
In addition, the unavailability of efficient computer networks meant that IT infrastructure had to be hosted in proximity to where it would be consumed. Altogether, these facts have prevented the utility computing reality from taking place on modern computer systems.

Similar to old electricity generation stations, which used to power individual factories, computing servers and desktop computers in a modern organization are often underutilized, since IT infrastructure is configured to handle theoretical demand peaks. In addition, in the early stages of electricity generation, electric current could not travel long distances without significant voltage losses. However, new paradigms emerged, culminating in transmission systems able to make electricity available hundreds of kilometers away from where it is generated. Likewise, the advent of increasingly fast fiber-optics networks has relit the fire, and new technologies for enabling sharing of computing power over great distances have appeared.

These facts reveal the potential of delivering computing services with the speed and reliability that businesses enjoy with their local machines. The benefits of economies of scale and high utilization allow providers to offer computing services for a fraction of what it costs for a typical company that generates its own computing power.

1.2.2 SOA, Web Services, Web 2.0, and Mashups

The emergence of Web services (WS) open standards has significantly contributed to advances in the domain of software integration. Web services can glue together applications running on different messaging product platforms, enabling information from one application to be made available to others, and enabling internal applications to be made available over the Internet.
Over the years a rich WS software stack has been specified and standardized, resulting in a multitude of technologies to describe, compose, and orchestrate services, package and transport messages between services, publish and discover services, represent quality of service (QoS) parameters, and ensure security in service access.

WS standards have been created on top of existing ubiquitous technologies such as HTTP and XML, thus providing a common mechanism for delivering services, making them ideal for implementing a service-oriented architecture (SOA). The purpose of a SOA is to address requirements of loosely coupled, standards-based, and protocol-independent distributed computing. In a SOA, software resources are packaged as “services,” which are well-defined, self-contained modules that provide standard business functionality and are independent of the state or context of other services. Services are described in a standard definition language and have a published interface.

The maturity of WS has enabled the creation of powerful services that can be accessed on-demand, in a uniform way. While some WS are published with the intent of serving end-user applications, their true power resides in their interfaces being accessible by other services. An enterprise application that follows the SOA paradigm is a collection of services that together perform complex business logic.

This concept of gluing services initially focused on the enterprise Web, but gained space in the consumer realm as well, especially with the advent of Web 2.0. In the consumer Web, information and services may be programmatically aggregated, acting as building blocks of complex compositions, called service mashups. Many service providers, such as Amazon, del.icio.us, Facebook, and Google, make their service APIs publicly accessible using standard protocols such as SOAP and REST.
Consequently, one can put an idea of a fully functional Web application into practice just by gluing pieces with a few lines of code.

In the Software as a Service (SaaS) domain, cloud applications can be built as compositions of other services from the same or different providers. Services such as user authentication, e-mail, payroll management, and calendars are examples of building blocks that can be reused and combined in a business solution in case a single, ready-made system does not provide all those features. Many building blocks and solutions are now available in public marketplaces.

For example, Programmable Web¹ is a public repository of service APIs and mashups currently listing thousands of APIs and mashups. Popular APIs such as Google Maps, Flickr, YouTube, Amazon eCommerce, and Twitter, when combined, produce a variety of interesting solutions, from finding video game retailers to weather maps. Similarly, Salesforce.com offers AppExchange,² which enables the sharing of solutions developed by third-party developers on top of Salesforce.com components.

1.2.3 Grid Computing

Grid computing enables aggregation of distributed resources and transparent access to them. Most production grids such as TeraGrid and EGEE seek to share compute and storage resources distributed across different administrative domains, with their main focus being speeding up a broad range of scientific applications, such as climate modeling, drug design, and protein analysis.

A key aspect of the grid vision realization has been building standard Web services-based protocols that allow distributed resources to be “discovered, accessed, allocated, monitored, accounted for, and billed for, etc., and in general managed as a single virtual system.” The Open Grid Services Architecture (OGSA) addresses this need for standardization by defining a set of core capabilities and behaviors that address key concerns in grid systems.
¹ http://www.programmableweb.com
² http://sites.force.com/appexchange

Globus Toolkit is a middleware that implements several standard Grid services and over the years has aided the deployment of several service-oriented Grid infrastructures and applications. An ecosystem of tools is available to interact with service grids, including grid brokers, which facilitate user interaction with multiple middleware and implement policies to meet QoS needs.

The development of standardized protocols for several grid computing activities has contributed, theoretically, to allowing delivery of on-demand computing services over the Internet. However, ensuring QoS in grids has been perceived as a difficult endeavor. Lack of performance isolation has prevented grid adoption in a variety of scenarios, especially in environments where resources are oversubscribed or users are uncooperative. Activities associated with one user or virtual organization (VO) can influence, in an uncontrollable way, the performance perceived by other users using the same platform. Therefore, the impossibility of enforcing QoS and guaranteeing execution time became a problem, especially for time-critical applications.

Another issue that has led to frustration when using grids is the availability of resources with diverse software configurations, including disparate operating systems, libraries, compilers, runtime environments, and so forth. At the same time, user applications would often run only on specially customized environments. Consequently, a portability barrier has often been present on most grid infrastructures, inhibiting users from adopting grids as utility computing environments.

Virtualization technology has been identified as the perfect fit to issues that have caused frustration when using grids, such as hosting many dissimilar software applications on a single physical platform.
In this direction, some research projects (e.g., Globus Virtual Workspaces) aimed at evolving grids to support an additional layer to virtualize computation, storage, and network resources.

1.2.4 Utility Computing

With increasing popularity and usage, large grid installations have faced new problems, such as excessive spikes in demand for resources coupled with strategic and adversarial behavior by users. Initially, grid resource management techniques did not ensure fair and equitable access to resources in many systems. Traditional metrics (throughput, waiting time, and slowdown) failed to capture the more subtle requirements of users. There were no real incentives for users to be flexible about resource requirements or job deadlines, nor provisions to accommodate users with urgent work.

In utility computing environments, users assign a “utility” value to their jobs, where utility is a fixed or time-varying valuation that captures various QoS constraints (deadline, importance, satisfaction). The valuation is the amount they are willing to pay a service provider to satisfy their demands. The service providers then attempt to maximize their own utility, where said utility may directly correlate with their profit. Providers can choose to prioritize high yield (i.e., profit per unit of resource) user jobs, leading to a scenario where shared systems are viewed as a marketplace, where users compete for resources based on the perceived utility or value of their jobs. Further information and comparison of these utility computing environments are available in an extensive survey of these platforms.

1.2.5 Hardware Virtualization

Cloud computing services are usually backed by large-scale data centers composed of thousands of computers. Such data centers are built to serve many users and host many disparate applications.
For this purpose, hardware virtualization can be considered as a perfect fit to overcome most operational issues of data center building and maintenance.

The idea of virtualizing a computer system’s resources, including processors, memory, and I/O devices, has been well established for decades, aiming at improving sharing and utilization of computer systems. Hardware virtualization allows running multiple operating systems and software stacks on a single physical platform. As depicted in Figure 1.2, a software layer, the virtual machine monitor (VMM), also called a hypervisor, mediates access to the physical hardware, presenting to each guest operating system a virtual machine (VM), which is a set of virtual platform interfaces.

The advent of several innovative technologies (multi-core chips, paravirtualization, hardware-assisted virtualization, and live migration of VMs) has contributed to an increasing adoption of virtualization on server systems. Traditionally, perceived benefits were improvements in sharing and utilization, better manageability, and higher reliability. More recently, with the adoption of virtualization on a broad range of server and client systems, researchers and practitioners have been emphasizing three basic capabilities regarding management of workload in a virtualized system, namely isolation, consolidation, and migration.

FIGURE 1.2. A hardware virtualized server hosting three virtual machines, each one running a distinct operating system and user-level software stack. (Layers shown, bottom to top: hardware; virtual machine monitor (hypervisor); virtual machines 1 to N, each with a guest OS and its own user software.)

Workload isolation is achieved since all program instructions are fully confined inside a VM, which leads to improvements in security.
Better reliability is also achieved because software failures inside one VM do not affect others. Moreover, better performance control is attained since execution of one VM should not affect the performance of another VM.

The consolidation of several individual and heterogeneous workloads onto a single physical platform leads to better system utilization. This practice is also employed for overcoming potential software and hardware incompatibilities in case of upgrades, given that it is possible to run legacy and new operating systems concurrently.

Workload migration, also referred to as application mobility, aims to facilitate hardware maintenance, load balancing, and disaster recovery. It is done by encapsulating a guest OS state within a VM and allowing it to be suspended, fully serialized, migrated to a different platform, and resumed immediately or preserved to be restored at a later date. A VM’s state includes a full disk or partition image, configuration files, and an image of its RAM.

A number of VMM platforms exist that are the basis of many utility or cloud computing environments. The most notable ones, VMware, Xen, and KVM, are outlined in the following sections.

VMware ESXi. VMware is a pioneer in the virtualization market. Its ecosystem of tools ranges from server and desktop virtualization to high-level management tools. ESXi is a VMM from VMware. It is a bare-metal hypervisor, meaning that it installs directly on the physical server, whereas others may require a host operating system. It provides advanced virtualization techniques of processor, memory, and I/O. In particular, through memory ballooning and page sharing, it can overcommit memory, thus increasing the density of VMs inside a single physical server.

Xen. The Xen hypervisor started as an open-source project and has served as a base to other virtualization products, both commercial and open-source.
It has pioneered the para-virtualization concept, in which the guest operating system, by means of a specialized kernel, can interact with the hypervisor, thus significantly improving performance. In addition to an open-source distribution, Xen currently forms the base of commercial hypervisors of a number of vendors, most notably Citrix XenServer and Oracle VM.

KVM. The kernel-based virtual machine (KVM) is a Linux virtualization subsystem. It has been part of the mainline Linux kernel since version 2.6.20, thus being natively supported by several distributions. In addition, activities such as memory management and scheduling are carried out by existing kernel features, thus making KVM simpler and smaller than hypervisors that take control of the entire machine.

KVM leverages hardware-assisted virtualization, which improves performance and allows it to support unmodified guest operating systems; currently, it supports several versions of Windows, Linux, and UNIX.

1.2.6 Virtual Appliances and the Open Virtualization Format

An application combined with the environment needed to run it (operating system, libraries, compilers, databases, application containers, and so forth) is referred to as a “virtual appliance.” Packaging application environments in the shape of virtual appliances eases software customization, configuration, and patching and improves portability. Most commonly, an appliance is shaped as a VM disk image associated with hardware requirements, and it can be readily deployed in a hypervisor.

On-line marketplaces have been set up to allow the exchange of ready-made appliances containing popular operating systems and useful software combinations, both commercial and open-source.
Most notably, the VMware virtual appliance marketplace allows users to deploy appliances on VMware hypervisors or on partners’ public clouds, and Amazon allows developers to share specialized Amazon Machine Images (AMI) and monetize their usage on Amazon EC2.

With a multitude of hypervisors, each supporting a different VM image format, and with the formats incompatible with one another, many interoperability issues arise. For instance, Amazon has its Amazon machine image (AMI) format, made popular on the Amazon EC2 public cloud. Other formats are used by Citrix XenServer, several Linux distributions that ship with KVM, Microsoft Hyper-V, and VMware ESX.

In order to facilitate packing and distribution of software to be run on VMs, several vendors, including VMware, IBM, Citrix, Cisco, Microsoft, Dell, and HP, have devised the Open Virtualization Format (OVF). It aims at being “open, secure, portable, efficient and extensible”. An OVF package consists of a file, or set of files, describing the VM hardware characteristics (e.g., memory, network cards, and disks), operating system details, startup and shutdown actions, the virtual disks themselves, and other metadata containing product and licensing information. OVF also supports complex packages composed of multiple VMs (e.g., multi-tier applications).

OVF’s extensibility has encouraged additions relevant to management of data centers and clouds. Mathews et al. have devised virtual machine contracts (VMC) as an extension to OVF. A VMC aids in communicating and managing the complex expectations that VMs have of their runtime environment and vice versa. A simple example of a VMC is when a cloud consumer wants to specify minimum and maximum amounts of a resource that a VM needs to function; similarly the cloud provider could express resource limits as a way to bound resource consumption and costs.
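The resource-bounds example can be made concrete with a small Python sketch. It is only an illustration of the idea of matching a consumer's requested range against a provider's limits; it does not reproduce the VMC format of Mathews et al. or any OVF schema, and the names ResourceBound and negotiate are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResourceBound:
    """Inclusive [minimum, maximum] range for one resource (hypothetical model)."""
    minimum: int
    maximum: int

    def __post_init__(self) -> None:
        if self.minimum > self.maximum:
            raise ValueError("minimum must not exceed maximum")


def negotiate(consumer: ResourceBound, provider: ResourceBound) -> Optional[ResourceBound]:
    """Return the range acceptable to both sides, or None if the ranges do not overlap."""
    low = max(consumer.minimum, provider.minimum)
    high = min(consumer.maximum, provider.maximum)
    if low > high:
        return None
    return ResourceBound(low, high)


# Consumer: the VM needs between 2 and 8 GB of memory.
# Provider: at most 4 GB per VM, to bound consumption and cost.
vm_memory_gb = ResourceBound(minimum=2, maximum=8)
provider_limit_gb = ResourceBound(minimum=0, maximum=4)
agreed = negotiate(vm_memory_gb, provider_limit_gb)
print(agreed)  # ResourceBound(minimum=2, maximum=4)
```

If the consumer's minimum exceeds the provider's limit, negotiate returns None, which corresponds to a contract that this provider cannot satisfy.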
1.2.7 Autonomic Computing

The increasing complexity of computing systems has motivated research on autonomic computing, which seeks to improve systems by decreasing human involvement in their operation. In other words, systems should manage themselves, with high-level guidance from humans.

Autonomic, or self-managing, systems rely on monitoring probes and gauges (sensors), on an adaptation engine (autonomic manager) for computing optimizations based on monitoring data, and on effectors to carry out changes on the system. IBM’s Autonomic Computing Initiative has helped define the four properties of autonomic systems: self-configuration, self-optimization, self-healing, and self-protection. IBM has also suggested a reference model for autonomic control loops of autonomic managers, called MAPE-K (Monitor, Analyze, Plan, Execute, and Knowledge) [34, 35].

The large data centers of cloud computing providers must be managed in an efficient way. In this sense, the concepts of autonomic computing inspire software technologies for data center automation, which may perform tasks such as: management of service levels of running applications; management of data center capacity; proactive disaster recovery; and automation of VM provisioning.

1.3 LAYERS AND TYPES OF CLOUDS

Cloud computing services are divided into three classes, according to the abstraction level of the capability provided and the service model of providers, namely: (1) Infrastructure as a Service, (2) Platform as a Service, and (3) Software as a Service. Figure 1.3 depicts the layered organization of the cloud stack from physical infrastructure to applications.

These abstraction levels can also be viewed as a layered architecture where services of a higher layer can be composed from services of the underlying layer. The reference model of Buyya et al. explains the role of each layer in an integrated architecture.
A core middleware manages physical resources and the VMs deployed on top of them; in addition, it provides the required features (e.g., accounting and billing) to offer multi-tenant pay-as-you-go services. Cloud development environments are built on top of infrastructure services to offer application development and deployment capabilities; in this level, various programming models, libraries, APIs, and mashup editors enable the creation of a range of business, Web, and scientific applications. Once deployed in the cloud, these applications can be consumed by end users.

FIGURE 1.3. The cloud computing stack.
SaaS: cloud applications (social networks, office suites, CRM, video processing); main access and management tool: Web browser.
PaaS: cloud platform (programming languages, frameworks, mashup editors, structured data); main access and management tool: cloud development environment.
IaaS: cloud infrastructure (compute servers, data storage, firewall, load balancer); main access and management tool: virtual infrastructure manager.

1.3.1 Infrastructure as a Service

Offering virtualized resources (computation, storage, and communication) on demand is known as Infrastructure as a Service (IaaS). A cloud infrastructure enables on-demand provisioning of servers running several choices of operating systems and a customized software stack. Infrastructure services are considered to be the bottom layer of cloud computing systems.

Amazon Web Services mainly offers IaaS, which in the case of its EC2 service means offering VMs with a software stack that can be customized similar to how an ordinary physical server would be customized. Users are given privileges to perform numerous activities on the server, such as: starting and stopping it, customizing it by installing software packages, attaching virtual disks to it, and configuring access permissions and firewall rules.
1.3.2 Platform as a Service

In addition to infrastructure-oriented clouds that provide raw computing and storage services, another approach is to offer a higher level of abstraction to make a cloud easily programmable, known as Platform as a Service (PaaS). A cloud platform offers an environment on which developers create and deploy applications and do not necessarily need to know how many processors or how much memory their applications will be using. In addition, multiple programming models and specialized services (e.g., data access, authentication, and payments) are offered as building blocks to new applications.

Google AppEngine, an example of Platform as a Service, offers a scalable environment for developing and hosting Web applications, which should be written in specific programming languages such as Python or Java, and use the service’s own proprietary structured object data store. Building blocks include an in-memory object cache (memcache), mail service, instant messaging service (XMPP), an image manipulation service, and integration with Google Accounts authentication service.

1.3.3 Software as a Service

Applications reside on the top of the cloud stack. Services provided by this layer can be accessed by end users through Web portals. Therefore, consumers are increasingly shifting from locally installed computer programs to on-line software services that offer the same functionality. Traditional desktop applications such as word processors and spreadsheets can now be accessed as a service on the Web. This model of delivering applications, known as Software as a Service (SaaS), alleviates the burden of software maintenance for customers and simplifies development and testing for providers [37, 41].

Salesforce.com, which relies on the SaaS model, offers business productivity applications (CRM) that reside completely on their servers, allowing customers to customize and access applications on demand.
1.3.4 Deployment Models

Although cloud computing has emerged mainly from the appearance of public computing utilities, other deployment models, with variations in physical location and distribution, have been adopted. In this sense, regardless of its service class, a cloud can be classified as public, private, community, or hybrid based on its deployment model, as shown in Figure 1.4.

FIGURE 1.4. Types of clouds based on deployment models.
Public/Internet clouds: third-party, multi-tenant cloud infrastructure and services, available on a subscription basis (pay as you go).
Private/Enterprise clouds: cloud computing model run within a company’s own data center or infrastructure, for internal and/or partner use.
Hybrid/Mixed clouds: mixed usage of private and public clouds, leasing public cloud services when private cloud capacity is insufficient.

Armbrust et al. propose definitions for public cloud as a “cloud made available in a pay-as-you-go manner to the general public” and private cloud as “internal data center of a business or other organization, not made available to the general public.”

In most cases, establishing a private cloud means restructuring an existing infrastructure by adding virtualization and cloud-like interfaces. This allows users to interact with the local data center while experiencing the same advantages of public clouds, most notably self-service interface, privileged access to virtual servers, and per-usage metering and billing.

A community cloud is “shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).”

A hybrid cloud takes shape when a private cloud is supplemented with computing capacity from public clouds. The approach of temporarily renting capacity to handle spikes in load is known as “cloud-bursting”.
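The cloud-bursting idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any real scheduler: jobs are reduced to a number of VM slots, placement is first-fit in arrival order, and the function name place_jobs is hypothetical.

```python
from typing import Dict, List


def place_jobs(demands: List[int], private_capacity: int) -> Dict[str, List[int]]:
    """Place each job (its demand in VM slots) on the private cloud while
    capacity remains; any job that does not fit is "burst" to a public cloud.

    First-fit in arrival order is an assumption made for illustration only.
    """
    placement: Dict[str, List[int]] = {"private": [], "public": []}
    free = private_capacity
    for demand in demands:
        if demand <= free:
            placement["private"].append(demand)
            free -= demand
        else:
            placement["public"].append(demand)
    return placement


# A private cloud with 8 slots receives jobs needing 4, 3, 5, and 2 slots.
print(place_jobs([4, 3, 5, 2], private_capacity=8))
# {'private': [4, 3], 'public': [5, 2]}
```

In this run the first two jobs use 7 of the 8 private slots, so the remaining two jobs are rented from the public cloud; a real policy would also weigh cost, data locality, and how long the spike is expected to last.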
1.4 DESIRED FEATURES OF A CLOUD

Certain features of a cloud are essential to enable services that truly represent the cloud computing model and satisfy expectations of consumers: cloud offerings must be (i) self-service, (ii) per-usage metered and billed, (iii) elastic, and (iv) customizable.

1.4.1 Self-Service

Consumers of cloud computing services expect on-demand, nearly instant access to resources. To support this expectation, clouds must allow self-service access so that customers can request, customize, pay for, and use services without intervention of human operators.

1.4.2 Per-Usage Metering and Billing

Cloud computing eliminates up-front commitment by users, allowing them to request and use only the necessary amount. Services must be priced on a short-term basis (e.g., by the hour), allowing users to release (and not pay for) resources as soon as they are not needed. For these reasons, clouds must implement features to allow efficient trading of service such as pricing, accounting, and billing. Metering should be done accordingly for different types of service (e.g., storage, processing, and bandwidth) and usage promptly reported, thus providing greater transparency.

1.4.3 Elasticity

Cloud computing gives the illusion of infinite computing resources available on demand. Therefore users expect clouds to rapidly provide resources in any quantity at any time. In particular, it is expected that the additional resources can be (a) provisioned, possibly automatically, when an application load increases and (b) released when load decreases (scale up and down).

1.4.4 Customization

In a multi-tenant cloud, user needs often differ greatly. Thus, resources rented from the cloud must be highly customizable. In the case of infrastructure services, customization means allowing users to deploy specialized virtual appliances and to be given privileged (root) access to the virtual servers.
Other service classes (PaaS and SaaS) offer less flexibility and are not suitable for general-purpose computing, but still are expected to provide a certain level of customization.

1.5 CLOUD INFRASTRUCTURE MANAGEMENT

A key challenge IaaS providers face when building a cloud infrastructure is managing physical and virtual resources, namely servers, storage, and networks, in a holistic fashion. The orchestration of resources must be performed in a way that rapidly and dynamically provisions resources to applications.

The software toolkit responsible for this orchestration is called a virtual infrastructure manager (VIM). This type of software resembles a traditional operating system, but instead of dealing with a single computer, it aggregates resources from multiple computers, presenting a uniform view to users and applications. The term “cloud operating system” is also used to refer to it. Other terms include “infrastructure sharing software” and “virtual infrastructure engine.”

Sotomayor et al., in their description of the cloud ecosystem of software tools, propose a differentiation between two categories of tools used to manage clouds. The first category (cloud toolkits) includes those that “expose a remote and secure interface for creating, controlling and monitoring virtualized resources,” but do not specialize in VI management. Tools in the second category (the virtual infrastructure managers) provide advanced features such as automatic load balancing and server consolidation, but do not expose remote cloud-like interfaces. However, the authors point out that the categories overlap; cloud toolkits can also manage virtual infrastructures, although they usually provide less sophisticated features than specialized VI managers do.
The availability of a remote cloud-like interface and the ability to manage many users and their permissions are the primary features that would distinguish “cloud toolkits” from “VIMs.” However, in this chapter, we place both categories of tools under the same group (of the VIMs) and, when applicable, we highlight the availability of a remote interface as a feature.

Virtually all VIMs we investigated present a set of basic features related to managing the life cycle of VMs, including networking groups of VMs together and setting up virtual disks for VMs. These basic features largely determine whether a tool can be used in practical cloud deployments. On the other hand, only a handful of tools offer advanced features (e.g., high availability) that allow them to be used in large-scale production clouds.

1.5.1 Features

We now present a list of both basic and advanced features that are usually available in VIMs.

Virtualization Support. The multi-tenancy aspect of clouds requires multiple customers with disparate requirements to be served by a single hardware infrastructure. Virtualized resources (CPUs, memory, etc.) can be sized and resized with certain flexibility. These features make hardware virtualization the ideal technology to create a virtual infrastructure that partitions a data center among multiple tenants.

Self-Service, On-Demand Resource Provisioning. Self-service access to resources has been perceived as one of the most attractive features of clouds. This feature enables users to directly obtain services from clouds, such as spawning the creation of a server and tailoring its software, configurations, and security policies, without interacting with a human system administrator. This capability “eliminates the need for more time-consuming, labor-intensive, human-driven procurement processes familiar to many in IT”.
Therefore, exposing a self-service interface, through which users can easily interact with the system, is a highly desirable feature of a VI manager.

Multiple Backend Hypervisors. Different virtualization models and tools offer different benefits, drawbacks, and limitations. Thus, some VI managers provide a uniform management layer regardless of the virtualization technology used. This characteristic is more visible in open-source VI managers, which usually provide pluggable drivers to interact with multiple hypervisors. In this direction, the aim of libvirt is to provide a uniform API that VI managers can use to manage domains (a VM or container running an instance of an operating system) in virtualized nodes using standard operations that abstract hypervisor-specific calls.

Storage Virtualization. Virtualizing storage means abstracting logical storage from physical storage. By consolidating all available storage devices in a data center, it allows creating virtual disks independent from device and location. Storage devices are commonly organized in a storage area network (SAN) and attached to servers via protocols such as Fibre Channel, iSCSI, and NFS; a storage controller provides the layer of abstraction between virtual and physical storage.

In the VI management sphere, storage virtualization support is often restricted to commercial products of companies such as VMware and Citrix. Other products feature ways of pooling and managing storage devices, but administrators are still aware of each individual device.

Interface to Public Clouds. Researchers have perceived that extending the capacity of a local in-house computing infrastructure by borrowing resources from public clouds is advantageous. In this fashion, institutions can make good use of their available resources and, in case of spikes in demand, extra load can be offloaded to rented resources.
A VI manager can be used in a hybrid cloud setup if it offers a driver to manage the life cycle of virtualized resources obtained from external cloud providers. To the applications, the use of leased resources must ideally be transparent.

Virtual Networking. Virtual networks allow creating an isolated network on top of a physical infrastructure independently from physical topology and locations. A virtual LAN (VLAN) allows isolating traffic that shares a switched network, allowing VMs to be grouped into the same broadcast domain. Additionally, a VLAN can be configured to block traffic originating from VMs on other networks. Similarly, the VPN (virtual private network) concept is used to describe a secure and private overlay network on top of a public network (most commonly the public Internet).

Support for creating and configuring virtual networks to group VMs placed throughout a data center is provided by most VI managers. Additionally, VI managers that interface with public clouds often support secure VPNs connecting local and remote VMs.

Dynamic Resource Allocation. Increased awareness of energy consumption in data centers has encouraged the practice of dynamically consolidating VMs onto fewer servers. In cloud infrastructures, where applications have variable and dynamic needs, capacity management and demand prediction are especially complicated. This fact triggers the need for dynamic resource allocation aiming at obtaining a timely match of supply and demand.

Energy consumption reduction and better management of SLAs can be achieved by dynamically remapping VMs to physical machines at regular intervals. Machines that are not assigned any VM can be turned off or put in a low-power state. In the same fashion, overheating can be avoided by moving load away from hotspots.
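The remapping of VMs to physical machines described above can be sketched as a bin-packing problem. The text does not name a specific algorithm, so the example below swaps in first-fit decreasing, a common heuristic, purely as an illustration. The function names `consolidate` and `idle_hosts`, the use of integer load units, and the assumption that all hosts have the same capacity are hypothetical.

```python
from typing import Dict, List


def consolidate(vm_loads: Dict[str, int], host_capacity: int,
                num_hosts: int) -> List[List[str]]:
    """Place VMs on hosts with first-fit decreasing (illustrative heuristic)."""
    hosts: List[List[str]] = [[] for _ in range(num_hosts)]
    free = [host_capacity] * num_hosts
    # Consider the largest VMs first so they are packed before small ones.
    for vm, load in sorted(vm_loads.items(), key=lambda kv: kv[1], reverse=True):
        for i in range(num_hosts):
            if load <= free[i]:
                hosts[i].append(vm)
                free[i] -= load
                break
        else:
            raise RuntimeError(f"no host has room for VM {vm!r}")
    return hosts


def idle_hosts(placement: List[List[str]]) -> List[int]:
    """Indices of hosts with no VMs, candidates for power-off or low power."""
    return [i for i, vms in enumerate(placement) if not vms]
```

For example, packing four VMs with loads of 50, 40, 30, and 20 units onto four hosts of 100 units each would use only two hosts, leaving the other two as candidates to be turned off. A production VI manager would also weigh migration cost, SLAs, and thermal data, which this sketch ignores.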
A number of VI managers include a dynamic resource allocation feature that continuously monitors utilization across resource pools and reallocates available resources among VMs according to application needs.

Virtual Clusters. Several VI managers can holistically manage groups of VMs. This feature is useful for provisioning computing virtual clusters on demand, and interconnected VMs for multi-tier Internet applications.

Reservation and Negotiation Mechanism. When users request computational resources to be available at a specific time, requests are termed advance reservations (AR), in contrast to best-effort requests, when users request resources whenever available. To support complex requests, such as AR, a VI manager must allow users to “lease” resources expressing more complex terms (e.g., the period of time of a reservation). This is especially useful in clouds in which resources are scarce; since not all requests may be satisfied immediately, they can benefit from VM placement strategies that support queues, priorities, and advance reservations.

Additionally, leases may be negotiated and renegotiated, allowing provider and consumer to modify a lease or present counter proposals until an agreement is reached. This feature is illustrated by the case in which an AR request for a given slot cannot be satisfied, but the provider can offer a distinct slot that is still satisfactory to the user. This problem has been addressed in OpenPEX, which incorporates a bilateral negotiation protocol that allows users and providers to come to an alternative agreement by exchanging offers and counter offers.

High Availability and Data Recovery. The high availability (HA) feature of VI managers aims at minimizing application downtime and preventing business disruption. A few VI managers accomplish this by providing a failover mechanism, which detects failure of both physical and virtual servers and restarts VMs on healthy physical servers.
This style of HA protects against host, but not VM, failures [57, 58].

For mission critical applications, when a failover solution involving restarting VMs does not suffice, additional levels of fault tolerance that rely on redundancy of VMs are implemented. In this style, redundant and synchronized VMs (running or in standby) are kept in a secondary physical server. The HA solution monitors failures of system components such as servers, VMs, disks, and network and ensures that a duplicate VM serves the application in case of failures.

Data backup in clouds should take into account the high data volume involved in VM management. Frequent backup of a large number of VMs, each one with multiple virtual disks attached, should be done with minimal interference in the system's performance. In this sense, some VI managers offer data protection mechanisms that perform incremental backups of VM images. The backup workload is often assigned to proxies, thus offloading production servers and reducing network overhead.

1.5.2 Case Studies

In this section, we describe the main features of the most popular VI managers available. Only the most prominent and distinguishing features of each tool are discussed in detail. A detailed side-by-side feature comparison of VI managers is presented in Table 1.1.

Apache VCL. The Virtual Computing Lab [60, 61] project was started in 2004 by researchers at the North Carolina State University as a way to provide customized environments to computer lab users. The software components that support NCSU’s initiative have been released as open-source and incorporated by the Apache Foundation.

Since its inception, the main objective of VCL has been providing desktop (virtual lab) and HPC computing environments anytime, in a flexible cost-effective way and with minimal intervention of IT staff.
In this sense, VCL was one of the first projects to create a tool with features such as: self-service Web portal, to reduce administrative burden; advance reservation of capacity, to provide resources during classes; and deployment of customized machine images on multiple computers, to provide clusters on demand.

In summary, Apache VCL provides the following features: (i) multi-platform controller, based on Apache/PHP; (ii) Web portal and XML-RPC interfaces; (iii) support for VMware hypervisors (ESX, ESXi, and Server); (iv) virtual networks; (v) virtual clusters; and (vi) advance reservation of capacity.

AppLogic. AppLogic is a commercial VI manager, the flagship product of 3tera Inc. from California, USA. The company has labeled this product as a Grid Operating System.

AppLogic provides a fabric to manage clusters of virtualized servers, focusing on managing multi-tier Web applications. It views an entire application as a collection of components that must be managed as a single entity. Several components such as firewalls, load balancers, Web servers, application servers, and database servers can be set up and linked together. Whenever the application is started, the system manufactures and assembles the virtual infrastructure required to run it. Once the application is stopped, AppLogic tears down the infrastructure built for it.

AppLogic offers dynamic appliances to add functionality such as Disaster Recovery and Power optimization to applications. The key distinction of this approach is that additional functionalities are implemented as another pluggable appliance instead of being added as a core functionality of the VI manager.

In summary, 3tera AppLogic provides the following features: Linux-based controller; CLI and GUI interfaces; Xen backend; Global Volume Store (GVS) storage virtualization; virtual networks; virtual clusters; dynamic resource allocation; high availability; and data protection.

TABLE 1.1. Feature Comparison of Virtual Infrastructure Managers

| VI Manager | License | Installation Platform of Controller | Client UI, API, Language Bindings | Backend Hypervisor(s) | Storage Virtualization | Interface to Public Cloud | Virtual Networks | Dynamic Resource Allocation | Advance Reservation of Capacity | High Availability | Data Protection |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Apache VCL | Apache v2 | Multi-platform (Apache/PHP) | Portal, XML-RPC | VMware ESX, ESXi, Server | No | No | Yes | No | Yes | No | No |
| AppLogic | Proprietary | Linux | GUI, CLI | Xen | Global Volume Store (GVS) | No | Yes | Yes | No | Yes | Yes |
| Citrix Essentials | Proprietary | Windows | GUI, CLI, Portal, XML-RPC | XenServer, Hyper-V | Citrix Storage Link | No | Yes | Yes | No | Yes | Yes |
| Enomaly ECP | GPL v3 | Linux | Portal, WS | Xen | No | Amazon EC2 | Yes | No | No | No | No |
| Eucalyptus | BSD | Linux | EC2 WS, CLI | Xen, KVM | No | EC2 | Yes | No | No | No | No |
| Nimbus | Apache v2 | Linux | EC2 WS, WSRF, CLI | Xen, KVM | No | EC2 | Yes | Via integration with OpenNebula | Yes (via integration with OpenNebula) | No | No |
| OpenNebula | Apache v2 | Linux | XML-RPC, CLI, Java | Xen, KVM | No | Amazon EC2, Elastic Hosts | Yes | Yes | Yes (via Haizea) | No | No |
| OpenPEX | GPL v2 | Multiplatform (Java) | Portal, WS | XenServer | No | No | No | No | Yes | No | No |
| oVirt | GPL v2 | Fedora Linux | Portal | KVM | No | No | No | No | No | No | No |
| Platform ISF | Proprietary | Linux | Portal | Hyper-V, XenServer, VMware ESX | No | EC2, IBM CoD, HP Enterprise Services | Yes | Yes | Yes | Unclear | Unclear |
| Platform VMO | Proprietary | Linux, Windows | Portal | XenServer | No | No | Yes | Yes | No | Yes | No |
| VMware vSphere | Proprietary | Linux, Windows | CLI, GUI, Portal, WS | VMware ESX, ESXi | VMware vStorage VMFS | VMware vCloud partners | Yes | VMware DRM | No | Yes | Yes |

Citrix Essentials. The Citrix Essentials suite is one of the most feature-complete VI management suites available, focusing on management and automation of data centers. It is essentially a hypervisor-agnostic solution, currently supporting Citrix XenServer and Microsoft Hyper-V.

By providing several access interfaces, it facilitates both human and programmatic interaction with the controller.
Automation of tasks is also aided by a workflow orchestration mechanism.

In summary, Citrix Essentials provides the following features: Windows-based controller; GUI, CLI, Web portal, and XML-RPC interfaces; support for XenServer and Hyper-V hypervisors; Citrix Storage Link storage virtualization; virtual networks; dynamic resource allocation; three-level high availability (i.e., recovery by VM restart, recovery by activating paused duplicate VM, and running duplicate VM continuously); data protection with Citrix Consolidated Backup.

Enomaly ECP. The Enomaly Elastic Computing Platform, in its most complete edition, offers most features a service provider needs to build an IaaS cloud.

Most notably, ECP Service Provider Edition offers a Web-based customer dashboard that allows users to fully control the life cycle of VMs. Usage accounting is performed in real time and can be viewed by users. Similar to the functionality of virtual appliance marketplaces, ECP allows providers and users to package and exchange applications.

In summary, Enomaly ECP provides the following features: Linux-based controller; Web portal and Web services (REST) interfaces; Xen back-end; interface to the Amazon EC2 public cloud; virtual networks; virtual clusters (ElasticValet).

Eucalyptus. The Eucalyptus framework was one of the first open-source projects to focus on building IaaS clouds. It has been developed with the intent of providing an open-source implementation nearly identical in functionality to Amazon Web Services APIs. Therefore, users can interact with a Eucalyptus cloud using the same tools they use to access Amazon EC2. It also distinguishes itself from other tools because it provides a storage cloud API, emulating the Amazon S3 API, for storing general user data and VM images.
In summary, Eucalyptus provides the following features: Linux-based controller with administration Web portal; EC2-compatible (SOAP, Query) and S3-compatible (SOAP, REST) CLI and Web portal interfaces; Xen, KVM, and VMware backends; Amazon EBS-compatible virtual storage devices; interface to the Amazon EC2 public cloud; virtual networks.

Nimbus. The Nimbus toolkit is built on top of the Globus framework. Nimbus provides most features in common with other open-source VI managers, such as an EC2-compatible front-end API, support for Xen, and a backend interface to Amazon EC2. However, it distinguishes itself from others by providing a Globus Web Services Resource Framework (WSRF) interface. It also provides a backend service, named Pilot, which spawns VMs on clusters managed by a local resource manager (LRM) such as PBS and SGE.

Nimbus’ core was engineered around the Spring framework to be easily extensible, which allows several internal components to be replaced and eases integration with other systems.

In summary, Nimbus provides the following features: Linux-based controller; EC2-compatible (SOAP) and WSRF interfaces; Xen and KVM backend and a Pilot program to spawn VMs through an LRM; interface to the Amazon EC2 public cloud; virtual networks; one-click virtual clusters.

OpenNebula. OpenNebula is one of the most feature-rich open-source VI managers. It was initially conceived to manage local virtual infrastructure, but has also included remote interfaces that make it viable to build public clouds. Altogether, four programming APIs are available: XML-RPC and libvirt for local interaction; a subset of EC2 (Query) APIs and the OpenNebula Cloud API (OCA) for public access [7, 65].

Its architecture is modular, encompassing several specialized pluggable components. The Core module or