Research Reports on Cloud Computing

Description
Cloud computing is a colloquial expression used to describe a variety of different computing concepts that involve a large number of computers that are connected through a real-time communication network (typically the Internet).

WHAT IS CLOUD COMPUTING
1. Definition
There are many definitions of cloud computing due to its fast development and the vast number of research papers associated with it. The definition from the NIST Working Definition of Cloud Computing, published by the U.S. Government's National Institute of Standards and Technology, is: “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” This is by far the clearest and most comprehensive definition of cloud computing and is widely used in research papers. It points out five main characteristics of cloud computing:
- On-demand self-service: cloud providers allow their computing resources, such as processing power, storage and virtual machines, to be acquired and used whenever their customers need them, without human interaction.
- Broad network access: the provided services can be accessed through a network (typically the Internet) from a variety of devices such as laptops or smartphones.
- Resource pooling: the same resources are shared among many users. This is referred to as multi-tenancy, where for example a physical server can host multiple virtual machines belonging to different users.
- Rapid elasticity: resources from the cloud can be quickly scaled up when demanded, or down when no longer required, according to the user's need.
- Measured service: resource usage is measured using appropriate metrics.


2. Related Technologies
Cloud computing is often compared with the following technologies. Each of them shares certain aspects with cloud computing and brings different benefits to it.
- Grid computing: grid computing is the combination of compute resources from multiple administrative domains applied to a common task. Grid computing is considered the backbone and infrastructure support of cloud computing, since it offered cloud computing scalability, multi-tenancy and multitasking. In grid computing, scalability is achieved through load balancing of application instances running separately on a variety of operating systems and connected through web services. Compute resources are allocated and de-allocated on demand, and system resources go up and down as more users or instances join and leave the network. Multi-tenancy is where the physical and virtual resources are able to serve multiple clients (tenants). Multitasking refers to multiple tasks, also known as processes, sharing compute resources. Multi-tenancy and multitasking allow many customers to perform different tasks, accessing a single or multiple application instances. Sharing resources among a large pool of users helps reduce infrastructure cost and peak load capacity.
- Utility computing: cloud computing realizes the utility computing concept, which is to provide computation as a utility in the way electricity or gas is provided, but on a wider and larger scale. A few cases describe the benefits that utility computing brings to cloud computing. First, before cloud computing, a company that wanted to provide a new IT service had to plan far ahead for provisioning, since the demand for a service varies with time. Provisioning a data center for the peak load, which only happens a few days per month, leads to resource underutilization at other times. Instead, cloud computing lets companies pay by the hour of computing resource usage, which leads to cost savings even if the hourly rate is higher than the rate of owning the hardware. Another case is when demand is unknown in advance. For example, a new service hosted on the cloud must be able to support a rising surge of attention and accesses during its first few weeks after launch, followed potentially by a reduction of customers as the service enters its stable state. Finally, companies that perform batch analysis can make use of the “cost associativity” of cloud computing for faster computation: the idea is that using 1,000 servers for one hour costs the same as using one server for 1,000 hours. With the ability to quickly scale up or down according to their needs, companies can analyze the same amount of data for the same cost but receive the result in a much more timely fashion.
- Virtualization: as the five main characteristics presented above show, the key to cloud computing is the ability to access compute resources on demand, and virtualization is the way to do so. Without virtualization, a single server can really only dedicate itself to one task, application, or function and serve it over a network, typically the Internet, which leads to over-allocating compute resources. Using virtualization, a powerful server can host multiple virtual servers, each with its own hardware specification such as CPU speed, RAM and storage; each of these servers can then be put to a single use. Since each virtual server is allocated only enough compute resources, more servers can be created; if demand goes higher, more power can be allocated to a server, or reduced if demand drops. While virtualization may be used to provide cloud computing, cloud computing is quite different from virtualization. Cloud computing may look like virtualization and is similar in fashion; however, it is better described as a service in which virtualization is part of the physical infrastructure.
- Autonomic computing: autonomic computing refers to the self-managing characteristics of distributed computing resources, adapting to unpredictable change while making its intrinsic complexity transparent to users and operators.
A typical autonomic computing system has the following characteristics:
- Self-configuration: automatic configuration of components.
- Self-healing: automatic discovery and correction of faults.
- Self-optimization: automatic monitoring and control of resources to ensure optimal functioning with respect to the defined requirements.
- Self-protection: proactive identification of, and protection from, arbitrary attacks.

The goal of autonomic computing is to lower the management complexity of today's computer systems. Although cloud computing adopts some features of autonomic computing, such as autonomic resource provisioning, its objective is to lower the resource cost rather than to reduce system management complexity.
3. Cloud Computing Architecture
3.1. Architecture
In this section, we discuss the architecture, deployment models and service models of cloud computing. In general, a typical cloud computing system is made up of four essential layers: the hardware layer, the infrastructure layer, the platform layer and the application layer. They are briefly described below:
- The hardware layer: this is where the physical resources of a cloud are managed, including servers, routers, switches, power and cooling systems. The hardware layer is typically a data center that contains thousands of servers, routers, switches, etc., which are organized and interconnected.
- The infrastructure layer: also known as the virtualization layer, this layer is in charge of creating a pool of storage and computing resources by partitioning the physical resources using virtualization technologies. This is an essential component of cloud computing, since key features such as dynamic resource allocation are only made available through virtualization.
- The platform layer: built on top of the infrastructure layer, the platform layer consists of operating systems and application frameworks. For example, Google App Engine provides APIs for implementing the storage, database and business logic of a typical web application.
- The application layer: this layer is probably the most visible to the end user. It consists of the actual cloud applications. Unlike traditional applications, cloud applications can benefit from automatic scaling to achieve better performance, availability and lower operating cost.
Compared to a typical service hosting environment, the cloud computing architecture offers a more flexible operating manner and is more modular. Each layer is loosely coupled with the layers above and below, allowing each layer to be developed separately. The modular architecture allows various types of applications to be hosted on the cloud while reducing management and maintenance cost.
3.2. Deployment models
3.2.1. Private cloud: The cloud infrastructure is provisioned for exclusive use by a single organization comprising multiple consumers (e.g., business units). It may be owned, managed, and operated by the organization, a third party, or some combination of them, and it may exist on or off premises.
3.2.2. Community cloud: The cloud infrastructure is provisioned for exclusive use by a specific community of consumers from organizations that have shared concerns (e.g., mission, security requirements, policy, and compliance considerations). It may be owned, managed, and operated by one or more of the organizations in the community, a third party, or some combination of them, and it may exist on or off premises.

3.2.3. Public cloud: The cloud infrastructure is provisioned for open use by the general public. It may be owned, managed, and operated by a business, academic, or government organization, or some combination of them. It exists on the premises of the cloud provider.
3.2.4. Hybrid cloud: The cloud infrastructure is a composition of two or more distinct cloud infrastructures (private, community, or public) that remain unique entities, but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load balancing between clouds).
3.3. Service Models
Each of the above deployment models must support one or more of the service models described below.
3.3.1. Software as a Service (SaaS): The capability provided to the user is to use the provider's applications running on a cloud infrastructure. The applications are accessible from various client devices through either a thin client interface, such as a web browser (e.g., web-based email), or a program interface. The user does not manage or control the underlying cloud infrastructure, including network, servers, operating systems, storage, or even individual application capabilities, with the possible exception of limited user-specific application configuration settings. SaaS is the best-known service model and has led the widespread adoption of cloud computing. SaaS offerings can be classified by their software and pricing model. Here is an overview of some SaaS providers:

Provider              Software                        Pricing model
Salesforce.com        CRM                             Pay per use
Google Gmail          Email                           Free
Process Maker Live    Business process management     Pay per use
XDrive                Storage                         Subscription
SmugMug               Data sharing                    Subscription
OpSource              Billing                         Subscription
Appian Anywhere       Business process management     Pay per use
Box.net               Storage                         Pay per use
MuxCloud              Data processing                 Pay per use

In SaaS, the cloud provider controls most of the software stack. The figure below illustrates how control and management responsibilities are shared.

As the figure depicts, the cloud user only has control over the specific application resources that are made available by the cloud provider. For example, a Gmail user can perform a variety of actions such as creating, sending, or deleting email messages. In some cases, a user can have limited administrative control of an application; for example, a Gmail user can create email accounts for other users. On the other hand, cloud providers have significantly more administrative control over the application. A provider is responsible for deploying, configuring, updating, and managing the operation of the application so that it provides the expected service levels to users. Although a user can have limited administrative control, that control exists only at the discretion of the provider.
3.3.2. Platform as a Service (PaaS): The capability provided to the user is to deploy onto the cloud infrastructure user-created or acquired applications created using programming languages, libraries, services, and tools supported by the provider. The user does not manage or control the underlying cloud infrastructure including network, servers, operating systems, or storage, but has control over the deployed applications and possibly configuration settings for the application-hosting environment. PaaS offerings can be classified by the availability of features that influence application development. The most relevant features are programming models, programming languages, frameworks and persistence options. Here is an overview of some PaaS providers:
- Aneka — Target use: .NET enterprise applications; Languages/frameworks: .NET; Programming models: Threads, Task, MapReduce; Persistence options: flat files, RDBMS.
- AppEngine — Target use: Web applications; Languages/frameworks: Python, Java; Programming models: request-based Web programming; Persistence options: BigTable.
- Force.com — Target use: Enterprise applications; Languages/frameworks: Apex; Programming models: workflow, request-based Web programming, Excel-like formula language; Persistence options: own object database.
- Azure — Target use: Enterprise and Web applications; Languages/frameworks: .NET; Programming models: unrestricted; Persistence options: Table/BLOB/queue storage, SQL Services.
- Heroku — Target use: Web applications; Languages/frameworks: Ruby on Rails; Programming models: request-based Web programming; Persistence options: PostgreSQL, Amazon RDS.
- Amazon Elastic MapReduce — Target use: Data processing; Languages/frameworks: Hive and Pig, Cascading, Java, Ruby, Perl, Python, PHP, C++; Programming models: MapReduce; Persistence options: Amazon S3.
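To make the PaaS model concrete, here is a minimal sketch of the kind of request-based web application that platforms such as App Engine or Heroku host. It assumes the Flask micro-framework purely for illustration; the route and port are arbitrary choices, not requirements of any specific provider.

    # Minimal sketch of a PaaS-style web application: the developer supplies only
    # the application code, while the platform supplies the runtime, HTTP routing
    # and automatic scaling.
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/")
    def index():
        # The platform maps incoming HTTP requests to this handler.
        return jsonify(message="Hello from a PaaS-hosted application")

    if __name__ == "__main__":
        # Local test run only; on a real platform the provider starts the app server.
        app.run(port=8080)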

In PaaS, the cloud provider controls the more privileged, lower levels of the software stack. The figure below illustrates how control and management responsibilities are shared.

The cloud provider controls the lower layers, the operating system and hardware, as well as networking infrastructure such as LANs, routers and switches. The user is provided with programming and utility interfaces for provisioning the execution environment in which the user's applications run. This provision also includes access to needed resources such as CPU cycles, memory, persistent storage, data stores, databases, network connections, etc. The programming models, i.e., the situations in which the application code gets activated, are determined by the provider. Billing and other management functions are based on monitoring of the user's applications.
3.3.3. Infrastructure as a Service (IaaS): The capability provided to the user is to provision processing, storage, networks, and other fundamental computing resources where the user is able to deploy and run arbitrary software, which can include operating systems and applications. The user does not manage or control the underlying cloud infrastructure but has control over operating systems, storage, and deployed applications, and possibly limited control of select networking components (e.g., host firewalls). IaaS offerings can be classified by the availability of features that influence the cost-benefit ratio experienced by user applications when they are moved to the cloud. The most relevant features are the geographic distribution of data centers, the variety of user interfaces and APIs to access the system, instance hardware capacity, the choice of virtualization platform and operating system, and the billing methods. Here is an overview of some IaaS providers:
- Amazon EC2 — Data centers: US, Europe; Interfaces and APIs: CLI, WS, Portal; Hardware capacity: CPU 1-20 EC2 compute units, memory 1.7-15 GB, storage 160-1690 GB plus 1 GB-1 TB per EBS volume; Guest operating systems: Linux, Windows; Smallest billing unit: hour.
- Flexiscale — Data centers: UK; Interfaces and APIs: Web console; Hardware capacity: CPU 1-4, memory 0.5-16 GB, storage 20-270 GB; Guest operating systems: Linux, Windows; Smallest billing unit: hour.
- GoGrid — Data centers: US; Interfaces and APIs: REST, Java, PHP, Python, Ruby; Hardware capacity: CPU 1-6, memory 0.5-8 GB, storage 30-480 GB; Guest operating systems: Linux, Windows; Smallest billing unit: hour.
- Joyent — Data centers: US; Hardware capacity: CPU 1/16-8, memory 0.25-32.5 GB, storage 5-100 GB; Guest operating systems: OpenSolaris; Smallest billing unit: month.
- RackSpace — Data centers: US; Interfaces and APIs: Portal, REST, Python, PHP, Java, .NET; Hardware capacity: CPU quad-core, memory 0.25-16 GB, storage 10-620 GB; Guest operating systems: Linux; Smallest billing unit: hour.

In IaaS, the cloud provider controls the most privileged, lowest layers of the software stack. The figure below illustrates how control and management responsibilities are shared.

The layer usually occupied by the operating system is now divided into two layers: the hypervisor, also known as the virtual machine monitor (VMM), and the hardware layer. A hypervisor uses the hardware to synthesize one or more VMs; each VM is a duplicate of a real machine. When a user rents access to a VM, it appears to the user as a real computer that can be administered via commands sent over the network. As depicted in the figure, the provider has complete control over the hardware layer and administrative control over the hypervisor. Users can make requests to the hypervisor to create and manage new VMs, but these requests are honored only when they do not conflict with the provider's policies on resource allocation. Interfaces to networking features, which users employ to configure custom virtual networks, are provided through the hypervisor. The user typically maintains complete control over the operation of the guest operating system in each VM, and over all software layers above it. While this structure gives users significant control over the software stack, users must take appropriate actions to operate, update and configure these traditional computing resources to meet their security and reliability needs. This structure also exposes to users many of the issues that were handled by the provider in SaaS and PaaS.
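As an illustration of how a user requests a VM from an IaaS provider, the following sketch uses the boto3 library for Amazon EC2. The AMI ID is a placeholder, and valid AWS credentials, permissions and region configuration are assumed; it is only a sketch of the provisioning call, not a complete deployment.

    import boto3  # AWS SDK for Python

    # Connect to the EC2 service; credentials and region come from the local AWS config.
    ec2 = boto3.resource("ec2", region_name="us-east-1")

    # Ask the provider's control plane for one virtual machine.
    instances = ec2.create_instances(
        ImageId="ami-12345678",      # guest operating system image (placeholder ID)
        InstanceType="t2.micro",     # hardware capacity of the virtual machine
        MinCount=1,
        MaxCount=1,
    )

    instance = instances[0]
    instance.wait_until_running()    # block until the provider reports the VM as running
    instance.reload()                # refresh attributes such as the public DNS name
    print("Launched VM:", instance.id, instance.public_dns_name)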

KEY TECHNOLOGIES USED IN CLOUD COMPUTING
In this section, we discuss key technologies that further fueled the demand for cloud computing:
- Server virtualization
- Service-oriented architecture (SOA)
- Open source software
- Web development
- Mashups

1. Server Virtualization
Over the last ten years, the trend in data centers has been toward decentralization, also known as horizontal scaling. Purchasing and maintaining centralized servers was too expensive, so applications were moved to their own dedicated servers, usually built on commodity hardware. Decentralization eases the pain of ongoing maintenance of each application, since patches and updates can be applied without interfering with other running systems. One application per server also helps with tracking problems as they arise, and increases security, since each system is isolated from the other systems in the network. However, decentralization comes at the expense of more power consumption, more physical space, and a greater management effort, which together account for up to $10,000 in annual maintenance costs per machine. Moreover, decentralization decreases the efficiency of each machine, leaving the average server idle 85% of the time. Together, these inefficiencies often eliminate any potential cost or labor savings promised by decentralization. Server virtualization tackles these problems in one approach. Using special software, one physical machine can be converted into multiple virtual machines, each with its own hardware configuration (CPU, RAM, storage, etc.) and capable of running its own operating system. In theory, virtual machines could be created to use all of the physical machine's processing power, though doing so would not be a good approach in a real-life deployment. Computer scientists have been applying virtualization to supercomputers for decades, but only in recent years has virtualization become feasible for ordinary servers.
1.1. Types of server virtualization

There are three ways to create virtual servers: full virtualization, para-virtualization and OS-level virtualization. The physical server is called the host, and the virtual servers are called guests.
- Full virtualization: this type of virtualization uses a special kind of software, called the hypervisor, that interacts directly with the physical machine's hardware and serves as a platform for the virtual servers' operating systems. Each virtual server runs its own OS and is kept completely independent of, and unaware of, the other virtual servers running on the same machine. The hypervisor controls the allocation of physical resources: when virtual servers run applications, the hypervisor allocates resources from the physical machine to the appropriate virtual server. The hypervisor also requires some computing resources for itself, which means that the physical machine must reserve some processing power and resources to run it. This can impact overall system performance.
- Para-virtualization: unlike full virtualization, the para-virtualization model offers potential performance benefits because the guest operating system or application is aware that it is running within a virtualized environment and has been modified to exploit this. The entire system works together as a cohesive unit, and the hypervisor does not need as much processing power as in full virtualization. One potential downside of this approach is that modified guests cannot be migrated back to run on a normal physical server.
- OS-level virtualization: this approach does not use a hypervisor at all. Instead, the virtualization capability is part of the host OS, which performs all the functions of a fully virtualized hypervisor. While this approach offers the best performance of the three approaches described here, it does so at the expense of flexibility: guests cannot run a different OS, or even a different version of the host's OS. Each virtual server still remains isolated and independent from the others, but all guest operating systems must be the same as the host's. This is called a homogeneous environment.
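Modern containers are a familiar example of OS-level virtualization. The sketch below assumes the Docker Engine and its docker Python package are installed; it starts a short-lived, isolated guest that shares the host's kernel, and the image name is only an example.

    import docker  # Python client for the Docker Engine (OS-level virtualization)

    # Connect to the local Docker daemon.
    client = docker.from_env()

    # Run a short-lived, isolated guest. It shares the host kernel, so it must be
    # a Linux userland on a Linux host (the "homogeneous environment" constraint).
    output = client.containers.run(
        "alpine:3.19",                                  # example image; any Linux image works
        ["echo", "hello from an isolated guest"],
        remove=True,                                    # clean up the container when it exits
    )
    print(output.decode().strip())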

1.2. Virtualization benefits
Virtualization brings benefits in both financial and technological aspects:
- Server virtualization conserves space through consolidation. It is common practice to dedicate each server to a single application. If several applications only use a small amount of processing power, the network administrator can consolidate several machines into one server running multiple virtual environments. For companies that have hundreds or thousands of servers, the need for physical space can decrease significantly.
- Server virtualization provides a way for companies to practice redundancy without purchasing additional hardware. Redundancy refers to running the same application on multiple servers: if a server fails for any reason, another server running the same application can take its place, thus minimizing any interruption in service. Building two virtual servers to perform the same tasks on one physical server is not good practice, because if the physical server were to crash, both virtual servers would also fail. In most cases, network administrators create redundant virtual servers on different physical machines.
- Virtual servers offer programmers isolated, independent systems in which they can test new applications or operating systems. Rather than buying a dedicated physical machine, the network administrator can create a virtual server on an existing machine. Because each virtual server is independent of all the other servers, programmers can run software without worrying about affecting other applications.
- Server hardware will eventually become obsolete and need to be replaced. To continue offering the services provided by these outdated systems, which are called legacy systems, the network administrator can create virtual versions of the legacy hardware on modern servers. From the application's point of view, nothing has changed. This gives the organization time to transition properly to new processes without worrying about hardware incompatibility or failure, particularly if the vendor that produced the legacy system no longer exists and cannot fix broken equipment.
- The task of migration, which is to move a virtual server from one physical machine to another, can be done with fewer restrictions under virtualization. Originally, this was only possible if both physical machines had the same hardware, operating system and processor. It is now possible to migrate a virtual server between physical machines with different processors, but only if the processors come from the same manufacturer.

2. Service-oriented architecture (SOA)
Service-oriented architecture is a model for organizing and utilizing distributed capabilities that may be controlled by different ownership domains and implemented using various technology combinations. In general, organizations or individuals create capabilities to support or solve a problem they encounter in the course of their business, and these capabilities may meet the needs of other entities. SOA defines how two computing entities, such as programs, interact in such a way as to enable one entity to perform a unit of work on behalf of another. Service interactions are defined using a description language, and each interaction is self-contained and loosely coupled, so that it is independent of any other interaction. It is not necessary to have a one-to-one correlation between needs and capabilities: one need could be fulfilled using various combinations of numerous capabilities, while a single capability may address one or many needs. One distinguishing aspect of SOA is that it provides a powerful framework for matching needs and capabilities, and for combining capabilities to address those needs by leveraging other capabilities. One capability may be repurposed across a multitude of needs.
Simple Object Access Protocol (SOAP)-based Web services are becoming the most common implementation of SOA. However, there are non-Web-service implementations of SOA that provide similar benefits. The protocol independence of SOA means that different consumers can communicate with the service in different ways. Ideally, there should be a management layer between providers and consumers to ensure complete flexibility regarding implementation protocols. There are a few architectural principles that should be followed when designing an SOA:
- Loose coupling: services maintain a relationship that minimizes dependencies and only maintain an awareness of each other.
- Service contract: services adhere to a communications agreement, as defined collectively by one or more service description documents.
- Service abstraction: beyond what is described in the service contract, services hide logic from the outside world.
- Service reusability: logic is divided into services with the intention of promoting reuse.
- Service composability: collections of services can be coordinated and assembled to form composite services.
- Service autonomy: services have control over the logic they encapsulate.
- Service optimization: all else being equal, high-quality services are generally preferable to low-quality ones.
- Service discoverability: services are designed to be outwardly descriptive so that they can be found and assessed via available discovery mechanisms.
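To make the service-contract and loose-coupling ideas concrete, the sketch below consumes a SOAP service through the Python zeep library. The WSDL URL and the operation name are hypothetical; the point is that the consumer depends only on the published description document, not on how the provider implements the service.

    from zeep import Client  # SOAP client that reads a WSDL service description

    # The service contract is the WSDL document; this URL is a hypothetical placeholder.
    client = Client("https://example.com/currency-service?wsdl")

    # Call an operation declared in the contract (operation and parameter names are
    # hypothetical). The consumer knows nothing about the provider's implementation.
    result = client.service.ConvertCurrency(amount=100, source="USD", target="EUR")
    print(result)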

Service-oriented architecture gives existing systems the flexibility and agility to respond to a rapidly changing business environment. Service-oriented architectures allow businesses and governments to capitalize on opportunity by:
- Becoming more agile: allowing the organization to respond quickly to new business imperatives, develop distinctive new capabilities and leverage existing assets for better responsiveness. Business and IT become more closely aligned.
- Driving cost reductions: SOA promotes the reuse of existing assets, increases efficiency and reduces development cost. New systems can be built faster and for less cost because of the reduction in integration expense; these systems are also easily reconfigured to adapt to new business models, since they are built for flexibility and have the long-term value of interoperability.
- Boosting ROI: while a service-oriented architecture provides a foundation for high performance, the value and increase in return on investment (ROI) is not reflected directly by the SOA itself but through the projects that the SOA enables.
Service-oriented architectures also allow organizations to meet IT goals. The technological value of SOAs includes:
- Simpler systems: SOAs are based on industry standards and can reduce complexity compared with integrating systems on a solution-to-solution basis.
- Service scale: SOA allows building scalable, evolvable systems; a system can be scaled down to support mobile devices or scaled up for large systems across the organization.
- Enhanced architectural flexibility: SOA supports the implementation of next-generation composite solutions, which focus on improving performance and loosely coupling multiple business processes from multiple systems in a simple user interface.
- Managing complex systems: when SOA is applied, centralized services are not required and end users are empowered with high-end communication for easier management.

3. Open Source Software

Traditionally, most software is bought or downloaded in a compiled, ready-to-run form. This means that the programmer's code, known as source code, has been run through a special program called a compiler, which translates the code into a form the computer can understand and execute. It is extremely difficult to modify the compiled version of most applications. Most software companies benefit from this, since other companies cannot copy their code and use it in a competing product; it also gives them control over the quality and features of a particular product. Open source software is at the complete opposite end of the spectrum: its source code is made available and licensed under an open-source license in which the copyright holder provides the rights to study, modify and distribute the software to anyone, free of charge and for any purpose. By supporting the concept of open source software, developers believe that their applications will be further developed, more useful and less error-prone.
Open source does not just mean access to the source code. The distribution terms of open-source software must comply with the following criteria:
- Free Redistribution: the license shall not restrict any party from selling or redistributing the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require any fee for such a sale.
- Source Code: the product must include source code, and must allow distribution in source code as well as compiled form. If some form of the product is not distributed with the source code, a well-publicized means of obtaining the source code must be provided. The source code must be obtainable at little or no reproduction cost, for example by downloading it via the Internet. Deliberately obfuscated source code is not allowed, and intermediate forms such as the output of a preprocessor or translator are not allowed.
- Derived Works: the license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software.
- Integrity of the Author's Source Code: the license may restrict the redistribution of modified source code only if it allows the distribution of "patch files" alongside the source code for the purpose of modifying the program at build time. The license must explicitly permit the distribution of programs built from the modified code, and it may require derived works to carry a different name or version number from the original software.
- No Discrimination Against Persons or Groups: the license must not discriminate against any person or group of persons.
- No Discrimination Against Fields of Endeavor: the license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
- Distribution of License: the rights attached to the program must apply to everyone to whom the program is redistributed, without the need to execute an additional license.
- License Must Not Be Specific to a Product: the rights attached to a program must not depend on the program being part of a particular software distribution. If the program is extracted from that distribution and used or distributed under the terms of the program's license, all parties to whom the program is redistributed should have the same rights as those granted with the original software distribution.
- License Must Not Restrict Other Software: the license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium be open-source software.
- License Must Be Technology-Neutral: no provision of the license may be predicated on any individual technology or style of interface.

4. Web Development
Web development is a broad term that is at times confused with web design, but the two should not be mistaken for one another. Web development includes any activity related to developing a web site for the Web or an intranet. This can include e-commerce business development, web design, web content development, client-side and server-side scripting (e.g., PHP, JavaScript), and web server configuration. However, among web professionals, web development usually refers to the non-design aspects of building web sites, i.e., writing markup and code. Web development can range from developing the simplest static single page of plain text to the most complex web-based Internet applications, electronic businesses, or social network services. Web development teams can span hundreds of people (web developers) in larger businesses, or be as small as a single webmaster. Web development is more a collaborative effort among different departments than the single effort of a designated department.

5. Mashups
A mashup is a technique by which a website or Web application uses data, presentation or functionality from two or more sources to create a new service. Mashups are made possible via Web services or public APIs that (generally) allow free access. Most mashups are visual and interactive in nature. To a user, a mashup should provide a richer, more interactive experience. A mashup is also beneficial to developers because it requires less code, allowing for a quicker development cycle. Mashups can be classified based on the following four questions: what to mash up, where to mash up, how to mash up, and for whom to mash up.
- What: depending on the sort of assets being combined or integrated, mashups are assigned to one of three categories: presentation, data or application functionality. A presentation mashup focuses on retrieving the information and layout of different Web sources without regard to the underlying data and application functionality. A data mashup merges data provided by different sources into one content page. A functionality mashup combines data and application functionality provided by different sources into a new service; the functionalities are accessible via APIs.
- Where: mashups can be distinguished by the location where they are mashed up. A server-side mashup integrates resources on the server; a client-side mashup integrates resources on the client, usually in a web browser.
- How: mashups can be further categorized by how the resources are integrated or combined into one representation. An extraction mashup can be considered a data wrapper, collecting and analyzing resources from different sources and merging them into one content page. In a flow mashup, the user customizes the resource flow of the Web page by combining resources from different sources; the resources are transformed and integrated within the mashup application.
- For whom: mashups can be built using different tools that combine content from different resources, but they are also distinguished by the target group of interest. A mashup can be categorized as a consumer mashup or an enterprise mashup. A consumer mashup is intended for public use and combines resources from public or private sources. An enterprise mashup merges resources of multiple systems in an enterprise environment; these mashups combine data and application functionality of different systems (e.g., ERP, CRM or SCM) in order to meet their objectives. The creation of enterprise mashups requires consideration of security, governance and enterprise policies.
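As a small illustration of a data mashup, the sketch below pulls JSON from two hypothetical public APIs and merges them into one view. The endpoint URLs and field names are placeholders, not real services.

    import requests  # simple HTTP client for calling the two source APIs

    # Both endpoints are hypothetical placeholders for real public APIs.
    WEATHER_API = "https://api.example.com/weather?city=Hanoi"
    EVENTS_API = "https://api.example.org/events?city=Hanoi"

    def fetch_json(url):
        # Retrieve one source and parse its JSON payload.
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.json()

    # Merge the two data sources into a single content page (here, a dict).
    weather = fetch_json(WEATHER_API)
    events = fetch_json(EVENTS_API)
    mashup = {
        "city": "Hanoi",
        "forecast": weather.get("forecast"),    # field names are assumptions
        "events_today": events.get("events"),
    }
    print(mashup)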

CLOUD COMPUTING BENEFITS IN BUSINESS AND TECHNOLOGY
1. Business benefits
1.1. Reduced costs
It is common for companies or enterprises to buy their own hardware, such as servers, network infrastructure and power systems, and to purchase software licenses. All their data are held on and flow through this infrastructure, and when a company or organization expands, it has to make further capital investments in storage space, servers and network infrastructure to increase the system's capability and capacity. More software licenses usually have to be purchased to support more users, too. Such infrastructure updates incur service downtime and can affect business performance. Software updates are even worse: when an update of a component in a software stack is not done correctly, or conflicts with other applications, the whole system suffers. Businesses have to manage IT assets, put in place a security infrastructure and a disaster recovery plan, keep both software and hardware up to date, and ensure that there are sufficient redundancy measures in place should any piece of hardware fail, in addition to the manpower they have to hire to troubleshoot the systems.
With cloud computing, businesses no longer have to own infrastructure and worry about its problems, since these are shifted to cloud providers. This drastically reduces IT capital expenditure. Instead, they pay for usage of infrastructure-as-a-service, platform-as-a-service, or software-as-a-service. This type of payment is known as the "pay-as-you-go" model; like paying for electricity or phone bills, it allows businesses to pay only for what they need and only when they need it. As their business expands, they can scale up their IT requirements without having to forecast or make large IT capital investments, reducing their investment risk while improving cash flow. IT operational cost is also reduced, since businesses do not have to pay experts to manage their systems, and power consumption may be reduced. CFOs can manage the company's finances better, since cloud computing provides the ability to budget effectively through predictable monthly costs.
1.2. Improved business agility
Many companies today focus on cost controls and how operational expenses can be reduced with the cloud. However, the most forward-thinking organizations are demonstrating that business agility is actually the greatest cloud benefit. Business agility is the ability of an organization to adapt rapidly and cost-efficiently in response to changes in the business environment. The prime benefits of agility include faster revenue growth; more effective responsiveness to risks and reputational threats; and greater, more lasting cost reduction, according to McKinsey & Company, the leading global management consulting firm.

Here are some specifics explaining how cloud computing helps businesses improve agility.
- Faster roll-out of new services. In an ever-changing business environment, where creating market distinction is particularly important, organizations need to do better than just innovate: they have to innovate faster than their competitors. Unfortunately, traditional IT infrastructure cannot support such a fast pace of innovation well. Consider, for instance, how a new service was traditionally created. Following the business strategy and service design phases, new infrastructure to support the service had to be purchased and installed. In particular, new servers had to be purchased, provisioned and configured appropriately. The new service also had to be integrated with the existing infrastructure, so that key capabilities such as security and service management could be applied. Cloud architecture represents a dramatic improvement on that model in every aspect. Rolling out new services in cloud computing does not require new infrastructure at all. Using virtualization, new servers can be created in a completely consistent way thanks to predefined image libraries. Once created, these servers automatically inherit all the capabilities of the cloud as a whole, such as scalability, on-demand self-service, network access and resource pooling. As a result, services that used to take weeks or months to deploy now take only hours or days. This tremendous acceleration, in turn, lets organizations act in a far swifter, more agile fashion than they could before.
- Reduced risk of innovation. One of the greatest challenges for business today is determining exactly what products or services customers want to buy, and then creating and offering the closest possible approximation of them. The better organizations perform at this task, the more successful they tend to be. Due to the static nature of traditional infrastructure, though, such innovation is difficult to approach because of the implied risks. Suppose, for instance, that a mid-size, brick-and-mortar retailer's customers appear interested in an online store and an associated discussion forum, in order to buy the company's products over the Internet and then share and evaluate their experiences using those products. Creating and offering that service would normally require investing in, and subsequently managing, a dedicated infrastructure to support it. But because the customer interest cannot be perfectly predicted, management may be reluctant to endorse the necessary investment, seeing it as too risky. So, in this scenario, the service is never created, and the potential benefits are never realized. With cloud computing, a completely different scenario applies. Instead of facing the burden of buying and deploying new infrastructure, the cloud's existing virtual infrastructure can easily be leveraged to support new services. The cost comes only from actual service utilization by customers.

If such an experiment turns out to be unsuccessful, the service can easily be discontinued with very little financial outlay. But if it is successful, it can be scaled up to meet whatever customers demand, generating revenue for the business. As more and more such experiments add up, businesses begin to realize how cloud computing encourages innovation by reducing risk.
- Location independence. Cloud computing is independent of business location: it does not matter where an organization or IT team is located, cloud computing delivers the same performance and functionality. This brings impressive benefits to business. For example, IT personnel can work from anywhere through a web browser, as long as there is an Internet connection, instead of being tied physically to an on-site operations center. A more general case for improved agility can also be made for the entire workforce, because internal IT services such as email or work calendars are available to all employees in the same way: over the Web. For globally distributed organizations, assembling a team now becomes a matter of expertise, not location. Team members from all over the world can collaborate relatively easily and naturally, since nobody has to move, or even commute, which improves business agility. Perhaps the most beneficial scenario of location independence is moving an entire organization from one physical location to another. If all of its IT services are delivered via an external cloud, the organization can pack up and move without much technical change and with zero service downtime. IT infrastructure does not have to be shut down, moved, reinstalled and tested, and end users can continue to use the services without any interruption.
- Higher business resilience via advanced disaster recovery. One aspect of business agility is often left out: the speed with which key services can be brought back online after going down. For organizations that rely on offering revenue-generating services, this is an important point to consider. Every minute of service downtime creates a larger and larger negative impact, ranging from slow services to loss of revenue for multiple days, leading to media coverage and substantial brand damage. Thanks to advanced backup processes such as data mirroring, and the intrinsically virtual nature of cloud architecture, cloud providers can restore entire servers with incredible speed, then repopulate them with the necessary data just as swiftly. They also provide 24/7 technical support, delivering peace of mind when a technical problem occurs.
- Faster software testing, faster time to market.

The virtual nature of the cloud makes it an incredibly efficient and cost-effective testing platform. Suppose, for example, that an organization creates new apps for mobile devices. Each new build must be rigorously tested to ensure it is as feature-complete and bug-free as possible. That means testing it on a clean system that has never been used for testing before. Setting up this kind of testing environment traditionally is a slow and cumbersome process, but with cloud computing such a system can be created nearly instantly and with perfect consistency from instance to instance. Since the testing cycle is far faster, and test environments are more consistent, the organization can now create better software and get it to market much more quickly, essentially defining business agility in the software development field.
1.3. Elasticity
With the pay-as-you-go pricing model, cloud computing offers elasticity that helps reduce business cost like never before. The key observation is that the cloud's ability to add or remove resources at a fine grain, and with a lead time of minutes rather than weeks, allows resources to be matched much more closely to the workload. Estimates of server utilization in data centers range from 5% to 20%, since the peak workload of a service may exceed the average by a factor of 2 to 10. Organizations must provision for the peak and allow resources to sit idle at non-peak times; the more pronounced the disproportion, the more the waste. Here is a simple example of how elasticity reduces this waste and therefore reduces cost.
Example: assume our service has a predictable daily demand where the peak requires 500 servers at noon but only 100 servers are needed at night. As long as the average utilization over the whole day is 300 servers, the actual work done over the whole day is 300 x 24 = 7,200 server-hours; but since we must provision for the peak of 500 servers, we pay for 500 x 24 = 12,000 server-hours, a factor of 1.7 more than what is needed. Therefore, as long as the pay-as-you-go cost per server-hour over the amortization period (usually 3 years) is less than 1.7 times the cost of owning the server, the cloud saves money.
In fact, the above example underestimates the benefits of elasticity, because in addition to the daily pattern, most nontrivial services also experience seasonal or other periodic demand variation, as well as unpredictable demand bursts due to external events. Since it can take weeks to acquire and install new equipment, the only way to handle such events with owned hardware is to provision for them in advance. If the spike prediction is correct, capacity is still wasted outside the spike, and if the spike is overestimated, it is even worse. Underprovisioning can also happen, accidentally turning away excess users. While the monetary effects of overprovisioning are easily measured, those of underprovisioning are harder to measure yet potentially just as serious: not only do rejected users generate zero revenue, they may never come back due to bad service.
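The arithmetic in the example above can be checked with a short calculation. The sketch below simply re-uses the numbers from the text (a 500-server peak, a 300-server average, and a roughly 3-year ownership period are the example's assumptions).

    # Worked version of the elasticity example: provisioning for the peak versus
    # paying only for the servers actually used.
    peak_servers = 500       # servers needed at the noon peak
    average_servers = 300    # average utilization over the day
    hours_per_day = 24

    used_server_hours = average_servers * hours_per_day   # 7,200 server-hours of real work
    paid_server_hours = peak_servers * hours_per_day       # 12,000 server-hours when owning for the peak
    overprovision_factor = paid_server_hours / used_server_hours

    print(f"Work actually needed : {used_server_hours} server-hours/day")
    print(f"Capacity paid for    : {paid_server_hours} server-hours/day")
    print(f"Overprovision factor : {overprovision_factor:.2f}")  # about 1.67, the "factor of 1.7"

    # The cloud is cheaper whenever its hourly price is below this factor times the
    # effective hourly cost of an owned server amortized over, say, 3 years.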

2. New application opportunities
Several important classes of existing applications will become even more compelling with cloud computing and contribute further to its momentum. Here are some of these classes of application.
- Mobile interactive applications. Tim O'Reilly, founder of O'Reilly Media, believes that "the future belongs to services that respond in real-time to information provided either by their users or by nonhuman sensors." These services are better placed in the cloud, not only because they must be highly available, but also because they rely on large datasets that are best hosted in the cloud. This is especially the case for services that use mashups. While not all mobile devices enjoy connectivity to the cloud 100% of the time, the challenge of disconnected operation has been addressed successfully in specific application domains, so we do not see this as a significant obstacle to the appeal of mobile applications.
- Parallel batch processing. Cloud computing presents a unique opportunity for batch-processing and analytics jobs that analyze terabytes of data and can take hours to finish. If there is enough data parallelism in the application, users can take advantage of the cloud's "cost associativity": using hundreds of computers for a short time costs the same as using a few computers for a long time. For example, instead of using 1 server for 200 hours to analyze a set of data, users can use 200 servers to analyze the same amount of data in only 1 hour at the same total cost (a small sketch of this kind of data-parallel processing follows this list).
- The rise of analytics. A special case of compute-intensive batch processing is business analytics. While the large database industry was originally dominated by transaction processing, that demand is leveling off. A growing share of computing resources is now spent on understanding customers, supply chains, buying habits, ranking, and so on. Hence, while online transaction volumes will continue to grow slowly, decision support is growing rapidly, shifting the resource balance in database processing from transactions to business analytics.
- Extension of compute-intensive desktop applications. The latest versions of the mathematics software packages Matlab and Mathematica are capable of using cloud computing to perform expensive evaluations. Other desktop applications might similarly benefit from seamless extension into the cloud. Symbolic mathematics involves a great deal of computing per unit of data, making it a domain worth investigating. An interesting alternative model might be to keep the data in the cloud and rely on having sufficient bandwidth to enable suitable visualization and a responsive GUI back to the human user. Offline image rendering or 3D animation might be a similar example: given a compact description of the objects in a 3D scene and the characteristics of the lighting sources, rendering the image is an embarrassingly parallel task with a high computation-to-bytes ratio.
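As referenced in the parallel batch processing item above, the sketch below illustrates data parallelism on a single machine with Python's multiprocessing module; in a cloud batch job, the same map-style decomposition would be spread over many rented servers. The word-count workload and the sample chunks are placeholders.

    from multiprocessing import Pool

    def count_words(chunk):
        # Process one independent slice of the data; in a cloud batch job each
        # slice would typically be handled by a separate worker node.
        return len(chunk.split())

    if __name__ == "__main__":
        # Placeholder dataset: in practice these would be file shards in cloud storage.
        chunks = [
            "cloud computing offers elasticity and pay as you go pricing",
            "batch analytics jobs can exploit data parallelism",
            "many servers for a short time cost the same as one server for a long time",
        ]
        with Pool(processes=4) as pool:
            counts = pool.map(count_words, chunks)   # the "map" half of MapReduce
        print("Words per chunk:", counts)
        print("Total words:", sum(counts))           # the "reduce" half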

CLOUD COMPUTING ISSUES
As an emerging technology, cloud computing comes with a number of issues, not all of which are unique to the cloud; many are concerns for all hosted IT services. The areas below clarify the relation between cloud computing and these open issues in both locally managed and outsourced IT services.
1. Computing Performance
Different types of applications require different levels of performance. For example, email is generally tolerant of short service interruptions, but industrial automation and real-time processing generally require both high performance and a high degree of predictability. Cloud computing raises several performance issues that are similar to those of other types of distributed computing.
- Latency
Latency is the time delay that a system experiences when processing a request. In cloud computing, delay is experienced while a request message travels to the provider and a response message is sent back to the customer. Generally, delay times are not a single expected number but a range, with a significant amount of variability caused by congestion, configuration errors, or failures. These factors are often not under the control of the provider or the user. However, network optimization technologies and web application acceleration services may be applied to lessen such poor performance.
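A simple way to see that latency is a distribution rather than a single number is to time repeated requests. The sketch below uses the requests library against a placeholder URL, purely as an illustration of such a measurement.

    import time
    import requests

    URL = "https://service.example.com/ping"   # placeholder endpoint, not a real service
    samples = []

    for _ in range(10):
        start = time.perf_counter()
        try:
            requests.get(URL, timeout=5)       # round trip to the provider and back
        except requests.RequestException:
            continue                           # failures also contribute to perceived delay
        samples.append((time.perf_counter() - start) * 1000)  # milliseconds

    if samples:
        print(f"min {min(samples):.1f} ms, max {max(samples):.1f} ms, "
              f"mean {sum(samples) / len(samples):.1f} ms over {len(samples)} requests")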

- Off-line data synchronization
Accessing data or documents in the cloud is problematic when the user does not have an Internet connection. The ability to synchronize data and documents produced while working offline with the data and documents already in the cloud is crucial, especially for SaaS clouds. Such synchronization might be achieved through version control, collaboration or other synchronization capabilities in the cloud.

- Data storage management
When data storage is considered in the context of clouds, users require the ability to: (1) provision additional storage capacity on demand, (2) know and restrict the physical location of the stored data, (3) verify how data was erased, (4) have access to a documented process for securely disposing of data storage hardware, and (5) administer access control over data. These are all challenges when data is hosted by an external party.
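Requirements (1), (2) and (5) above map naturally onto the storage APIs that cloud providers expose. The sketch below uses AWS S3 through boto3 as one possible illustration, with a placeholder bucket name; pinning the bucket to a region is one way to keep data within a chosen physical location.

    import boto3  # AWS SDK for Python

    s3 = boto3.client("s3", region_name="eu-west-1")

    # (1) Provision additional storage capacity on demand: create a new bucket.
    # (2) Restrict the physical location: the LocationConstraint keeps the data
    #     in the chosen region. The bucket name is a placeholder and must be unique.
    s3.create_bucket(
        Bucket="example-compliance-bucket",
        CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
    )

    # (5) Administer access control over data: store an object and keep it private.
    s3.put_object(
        Bucket="example-compliance-bucket",
        Key="reports/cloud-report.txt",
        Body=b"example payload",
        ACL="private",
    )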

2. Cloud Reliability
Reliability refers to the probability that a system will offer failure-free service for a specified period of time within the bounds of a specified environment. For the cloud, reliability is broadly a function of the reliability of four individual components: (1) the hardware and software facilities offered by providers, (2) the provider's personnel, (3) connectivity to the subscribed services, and (4) the user's personnel.
- Network Dependence

Cloud computing means that users access the provided services and data through a network, typically the Internet. Network dependence implies that every application is a network application, which means the application is relatively complex: the risk of errors or security vulnerabilities is higher than for non-networked, standalone applications. For example, cloud applications should apply cryptographic capabilities to secure data in transit. There have been several well-publicized regional Internet outages caused by denial-of-service attacks, viruses infiltrating web servers, worms taking down DNS servers, failures in undersea cables, and fiber optic cables being damaged during earthquakes and subsequent mudslides. Although these outages are relatively infrequent, they can affect network connectivity for hours. Backup plans for these rare but serious outage events should be carefully addressed in an organization's tactical IT plan.
- Cloud Provider Outages

In spite of clauses in service agreements implying high availability and minimal downtime for users, service or utility outages are inevitable, whether due to man-made causes (e.g., malicious attacks or inadvertent administrator errors) or natural causes (e.g., floods, tornados, etc.). Users should consider the following questions when deciding whether to move their business to the cloud:
- What frequency and duration of outages can the user tolerate without adversely impacting their business processes?
- What resiliency alternatives does the user have for contingency situations involving a prolonged outage?

3. Compliance
When data or processing is moved to a cloud, the consumer retains the ultimate responsibility for compliance, but the provider (having direct access to the data) may be in the best position to enforce compliance rules.
- Lack of Visibility

Users may lack visibility into how clouds operate. If so, they cannot easily verify whether their services are undertaken and delivered in a secure manner. Different models of cloud computing provide different levels of visibility. However, the option for a consumer to request that additional monitoring mechanisms be deployed at a provider's site is plausible and is currently used in a variety of non-cloud systems.
- Physical Data Location

Cloud providers decide where to set up their physical data centers based on several parameters, such as construction cost, energy cost, safety, etc. Users, however, may have to comply with international, state or federal statutes and directives that prohibit data storage outside certain physical boundaries or borders.
4. Information Security
Information security pertains to protecting the confidentiality and integrity of data and ensuring data availability. An organization that owns and runs its own IT operations will normally take the following types of measures for its data security:
- Organizational/administrative controls specifying who can perform data-related operations such as creation, access, disclosure, transport, and destruction.
- Physical controls relating to protecting storage media and the facilities housing storage devices.
- Technical controls for Identity and Access Management (IAM), encryption of data at rest and in transit, and other data audit and handling requirements for complying with regulatory requirements.

The following subsections briefly describe some security issues of cloud computing.
- Risk of Unintended Data Disclosure

Unclassified government systems are often operated in a manner where a single system is used to process non-sensitive, public information. In a typical scenario, a user will store sensitive and non-sensitive information in separate directories on a system, or in separate mail messages on an email server. In doing so, sensitive information is expected to be carefully managed to avoid unintended distribution. If a consumer wishes to use cloud computing for non-sensitive computing, while retaining the security advantages of on-premises resources for sensitive computing, care must be taken to store sensitive data in encrypted form only.
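One minimal way to follow the "encrypted form only" advice is client-side encryption before data ever leaves the organization. The sketch below uses the Fernet recipe from the Python cryptography package and is only an illustration; it deliberately omits real key management.

    from cryptography.fernet import Fernet

    # Generate and keep the key on premises; only ciphertext goes to the cloud.
    # Real deployments need proper key management, which this sketch omits.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    sensitive = b"customer records: do not store in plaintext off premises"
    ciphertext = cipher.encrypt(sensitive)     # safe to upload to cloud storage

    # Later, after downloading the ciphertext back from the cloud:
    plaintext = cipher.decrypt(ciphertext)
    assert plaintext == sensitive
    print("round trip OK, ciphertext length:", len(ciphertext))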

- Data Privacy
Privacy addresses the confidentiality of data for specific entities, such as consumers or others whose information is processed in a system. Privacy carries legal and liability concerns, and should be viewed not only as a technical challenge but also as a legal and ethical concern. Protecting privacy in any computing system is a technical challenge; in a cloud setting, this challenge is complicated by the distributed nature of clouds and the possible lack of consumer awareness of where data is stored and who has, or can have, access to it.
- System Integrity

Clouds require protection against intentional subversion or sabotage of their functionality. Within a cloud there are several stakeholders: consumers, providers, and a variety of administrators. The ability to partition access rights among these groups, while keeping malicious attacks at bay, is a key attribute of maintaining cloud integrity. In a cloud setting, any lack of visibility into the cloud's mechanisms makes it more difficult for consumers to check the integrity of cloud-hosted applications.


