
Flash Crowds and Dynamic Resource Provisioning: Not an Easy Task

Ion Morozan
Vrije Universiteit Amsterdam, The Netherlands
[email protected]

Abstract: Over the last few years, Cloud Computing (CC) has become a major trend in the business and networking world, with a predicted increase of 130% by 2016, as IDC [1] announced at its event in September of this year. Moving a business to the cloud offers so many benefits that companies can no longer resist. A large part of the appeal is that one of the core concepts of Cloud Computing is avoiding the impact of over-provisioning and under-provisioning. Growth in computing power requirements is no longer an issue, but flash crowds and fast resource provisioning remain a challenge that has not yet been fully solved. The aim of this paper is to describe why flash crowds (FC) are a problem that needs careful attention, and whether it is worth investing in a system able to handle a phenomenon that usually lasts only a short time. Another interesting aspect of flash crowds is that they are hard to differentiate from a DDoS attack. Both have very similar properties in terms of Internet traffic; consequently, we propose some solutions that aim to tell them apart. In addition, considerable research effort is being spent on handling flash crowds by allocating resources on the fly when the system encounters large spikes or surges in traffic to a particular Web site. To address this issue, we propose an algorithm for prediction-based resource measurement and provisioning that uses Linear Regression to satisfy upcoming resource demands.

I. INTRODUCTION

Cloud computing is a major change from how computers and software worked a few years ago. In the past, if you wanted to serve an application online, you had to invest a significant amount of money not only in developing the application, but also in the underlying infrastructure needed to support its possible success. Now, cloud computing allows dynamic resource scaling for businesses, which is the key characteristic that differentiates this emerging system from the traditional computing paradigm. It gives small companies the opportunity to host their applications in the cloud, thus eliminating the overhead of procuring traditional infrastructure resources, which typically takes several months. The pay-as-you-go model and dynamic resource provisioning offered by cloud computing are a clear advantage over the overhead associated with static provisioning. Since there are no clear patterns in how a service on the Internet is accessed, an intelligent way of dynamic resource provisioning is required that is effective in terms of both cost and performance. This is an open challenge that makes proactive prediction-based resource scaling necessary in order to deal with the traffic fluctuations encountered by Web applications.

Nowadays, the Internet has become a common household facility for millions of people, and flash crowds are not a rare phenomenon. The term flash crowd was coined in 1973 by Larry Niven in his science fiction story of the same name [2], where thousands of people took advantage of the ease of teleporting and went back in time to see historical events. On the Web, the pervasiveness of browsers and the rapid spread of news lead to a similar situation, where many users simultaneously request the same data from a popular Web site. This increased load can have undesired consequences for the system, such as crashes or unusually high response times. In the end, both can translate into lost revenue or negative customer attitudes toward the service.

Denial of Service (DoS) and Distributed DoS (DDoS) attacks represent a serious threat on the Internet. DDoS attacks are malicious requests launched deliberately from a collection of compromised systems known as a botnet. These requests do not need to be handled by the server; they are intended only to subvert the normal operation of a Web service. By analyzing the traffic behavior of flash crowds and DDoS attacks, we conclude that they look similar but differ substantially in access intent, distribution of IP addresses, and the speed at which traffic increases and decreases. Fortunately, there are ways to distinguish DDoS attacks from legitimate flash crowds [14], [15] and ways to fight back [3]. In this paper we evaluate the properties of both types of events, with special attention to the characteristics that distinguish the two. Identifying these characteristics allows us to formulate a strategy for online services to quickly discard malicious requests, thereby defeating the explicit attempt to prevent legitimate users from accessing the service.
Another problem that we tackle in the following sections is the prediction-based resource allocation model for handling the multi-time-scale variations seen in Internet workloads, together with system sizing and capacity planning in the cloud. Our proposed techniques use statistical and prediction models to forecast the coming wave of resource requirements, thus enabling proactive scaling to handle temporary spikes in workload. The remainder of this paper is organized as follows. In Section 2 we explain why flash crowds are a sensitive problem that needs careful attention and analyze whether investing

Fig. 1. Modeled and simulated flash traffic: (a) Flash Crowd Traffic Model; (b) Generated Flash Crowd Traffic.

in the infrastructure to sustain this short burst of traffic is necessary. Next, in Section 3 we differentiate flash crowds from DDoS attacks. In Section 4 we describe a mechanism to overcome the problem of large spikes in traffic to a particular Web site by using our proposed algorithm. We discuss related work in Section 5 and present our conclusions in Section 6.

II. HANDLE FLASH CROWDS

Predicting a phenomenon with an erratic pattern is a challenge that needs serious attention. As we have already described, a flash crowd is the event in which a server becomes unresponsive to the requests produced by numerous clients within a limited amount of time. Since flash events are characterized as atypical increases in load of relatively short duration, they can differ from each other in several aspects. For example, some exhibit very sharp growth rates, while others are long-lived or have very high peak magnitudes. According to various studies [4], [5], a FC has three major phases, each of which describes the relationship between the traffic (represented as requests) arriving at a server and a certain time interval. Figure 1(a) shows the three phases of the event: a ramp-up phase, a sustained traffic phase and a ramp-down phase. The flash event starts at time t0. Initially, there is the ramp-up phase, where the traffic level increases linearly from the normal request rate R_normal to the maximum request rate R_flash. All of this happens during the short time interval [t0, t1]. This step relates to the speed at which the

system needs to detect and react to the flash crowd. We model the second phase of the flash crowd with a sustained request rate, which is equal to R_flash and takes place between t1 and t2. This represents the maximum load imposed by the FC and expresses the resources required by the system (e.g. number of servers) to handle the surge in access. In the ramp-down phase, the traffic level gradually decreases and eventually returns to R_normal during the time interval [t2, t3]. As a final remark, the interval [t0, t3] is the length of time from the arrival of the crowd until the load returns to normal. This corresponds to the duration of the “anomaly” for the system.

Figure 1(b) shows an example of a real trace from a Web proxy that experiences a flash crowd. Under normal conditions the application serves 480 requests/minute (8 reqs/sec), but during the flash event this increases to around 10,000 requests/minute (roughly 160 reqs/sec). As we can see, these request bursts arrive without any prior notification, and worse, there is no easy way of identifying them beforehand. In Section 4 we look at some predictive and reactive techniques that try to overcome the effects of these events.

Having provided background on how flash crowds behave, we now turn our attention to an interesting question: is it worth investing in your own infrastructure capable of handling flash crowds? Nowadays, Web applications face the challenge of serving millions of users spread all over the world, where every user expects the service to be always available and reliable. Successful web services not only have a large number of users; they also evolve faster than the performance of computer hardware increases. Thus, at one point or another, a Web application that aims to become big needs the ability to scale.
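As a rough illustration of the three-phase model of Figure 1(a) described earlier in this section, the sketch below evaluates the request rate as a piecewise function of time. It is only a minimal sketch: the function names, the phase boundaries used in the demo, and the linear shape of the ramp-down are assumptions made here (the text states only that the ramp-up is linear and that the ramp-down is gradual); the two rates are the ones quoted for the trace in Figure 1(b).

```python
def flash_crowd_rate(t, t0, t1, t2, t3, r_normal, r_flash):
    """Request rate at time t under the three-phase flash crowd model."""
    if t < t0 or t >= t3:
        return float(r_normal)           # before or after the event
    if t < t1:                           # ramp-up: linear rise over [t0, t1)
        return r_normal + (r_flash - r_normal) * (t - t0) / (t1 - t0)
    if t < t2:                           # sustained phase at the peak rate
        return float(r_flash)
    # ramp-down over [t2, t3): a linear decay is an assumption of this sketch
    return r_flash - (r_flash - r_normal) * (t - t2) / (t3 - t2)


def generate_trace(duration, **model):
    """Sample the model once per time unit to obtain a synthetic trace."""
    return [flash_crowd_rate(t, **model) for t in range(duration)]


if __name__ == "__main__":
    # Rates (requests/minute) come from the trace discussed in the text;
    # the phase boundaries t0..t3 are made-up illustrative values.
    trace = generate_trace(60, t0=10, t1=20, t2=40, t3=55,
                           r_normal=480, r_flash=10000)
    print(trace[0], max(trace), trace[-1])
```

A trace generated this way can serve as synthetic input when experimenting with detection or provisioning logic, since t1 - t0 controls how quickly the system must react and R_flash controls how many resources it must bring online.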
If we answer this question through the eyes of a garage innovator [8], then investing tens of thousands of dollars in resources just in case the Web site experiences a sudden burst of access requests is not feasible, or at best only a small niche of start-ups could afford it. There are plenty of issues and trade-offs that a start-up faces during the development of a low-cost, scalable Web service. Fortunately, Cloud Computing provides a “win-win” way out of this puzzle by giving access to vast resources almost instantly in exchange for dynamic and flexible prices. The CC model has two main advantages that match the prime needs of a garage innovator: no up-front hardware investment and optimally utilized hardware. This environment successfully meets start-up needs such as low overhead during lean times, high scalability, simple schemas, and hardware that is designed to handle failures automatically and to support distributed deployments. There are several cloud computing solutions on the market today, with choices ranging from virtualized hardware to all-in-one programming interface packages. What differs between these solutions is the level of abstraction at which each offers the illusion of infinite resources [9]. For example, Amazon’s Elastic Compute Cloud (EC2) [10] provides a very low-level abstraction, virtualizing the hardware and offering a set of libraries that give developers full control over the software

stack. On the other hand, Google App Engine (GAE) [11] provides frameworks in several programming languages, such as Java or Python, that allow quick development of powerful applications. Therefore, garage innovators have several ways to build a scalable low-cost Web application depending on their requirements.

If we now turn our attention to the big players (e.g. Google, Microsoft), then clearly we have to change our approach. For these companies, money is the last thing they take into consideration when it comes to satisfying their customers. As a consequence, almost all of them have invested huge amounts of money in their own infrastructure, capable of serving millions of users simultaneously. This seems to be a logical decision, since big companies are always in competition with each other, and storing sensitive data on someone else’s infrastructure is like sharing your ideas with them. Security is their prime concern, but of course there are other benefits, such as flexibility. Who knows your products better than you do? For this reason, private data centers have been designed by the engineers of these companies to offer full support for their custom products. Last but not least, cost is another relevant factor. Clearly the cost of sustaining the load of popular services such as the Google search engine is huge, so it is sensible to invest in your own infrastructure rather than renting it.

In conclusion, we strongly believe that the answer to our question is: it depends. It depends on the position from which we play in the market. If you are a start-up, then building your application by taking advantage of the vast capabilities offered by Cloud Computing is probably the right way to do it. On the other hand, corporations are fully justified in investing in their own infrastructure, since privacy, scalability and the satisfaction of their clients are their first priority.

III. DDoS OR FLASH CROWDS

A Denial-of-Service (DoS) attack causes disruption and inconvenience and represents an explicit attempt to prevent legitimate users from accessing an information service. The key semantic difference between a flash event and a DoS attack is that the former represents legitimate access to a Web site while the latter does not. Yet this does not help in distinguishing them automatically, so we need to identify behavioral differences between them after understanding their individual properties.

A. Flash Crowds vs. DDoS

A Distributed DoS attack is a master-slave configuration in which a large number of distributed hosts flood the victim with an abundance of attack packets simultaneously. Routers and servers can handle only a finite amount of traffic at any given time, so when this limit is reached further traffic is ignored and users are denied access. Yahoo, eBay, and CNN are examples of sites that were inaccessible to their customers for a certain period of time due to DDoS attacks. DDoS attacks are classified into four categories: (1) UDP flooding attacks, (2) TCP SYN flooding attacks, (3) ICMP flooding attacks and (4) application-layer DDoS attacks, also known as flash crowd (FC) attacks. The FC attack is an attack that mimics the traffic of a flash event. The main distinction from typical DDoS attacks is that the number of clients changes dynamically to simulate legitimate clients. In the remainder of this section we describe some mechanisms to discriminate DDoS attack traffic from flash crowds.

Returning to the flash crowd attack presented above, a characteristic that makes it unique is that during a FC the generated traffic may fluctuate and look like a zig-zag wave [13], due to the dynamics in the number of clients, while ordinary DDoS traffic is continuously stable. A possible countermeasure is the use of graphical puzzles to differentiate bots from humans, but this requires human interaction, which can be annoying to users. Another, theoretical, solution presented by K. Prasad et al. [15] uses a discrimination algorithm based on entropy variations to detect patterns of flash crowd imitation.

A different way to discriminate FC from DDoS attacks is through packet arrival patterns. This technique uses the packet arrival rate to differentiate attack-source traffic from user traffic. The underlying idea is that human users need time to react to the outcome of a request (i.e. after a web page has been loaded, the user may take some time to respond), while in the case of a DDoS attack the transmission rate follows certain patterns and is therefore predictable. We can thus test the pattern of packet transmission by using mathematical models or statistical analysis.

Discriminating DDoS attacks from flash crowds using information distance relies on abstract distance metrics (e.g. Sibson distance, Hellinger distance) to measure the similarity among flows. The key point of this approach is that the traffic generated during a DDoS attack is issued by a program that acts in a predictable way, whereas real flash crowds come from randomly distributed users all over the Internet.

TABLE I. COMPARISON OF FLASH CROWD AND DDoS ATTACK

Category                    Flash Crowd           DDoS
Network status              Congested             Congested
Server status               Overloaded            Overloaded
Traffic Type                Genuine               Malicious
Traffic Control Response    Responsive            Unresponsive
Traffic Source              Web                   Any
Flow size                   Large No. of Flows    Any
Predictability              Predictable           Unpredictable
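To make the information-distance idea discussed in this section more concrete, the following minimal sketch computes the Hellinger distance between the packet-count distributions of flows and flags traffic whose flows are unusually similar to one another. It is not the algorithm of the cited works: the representation of a flow as per-bin packet counts, the helper names, and the threshold of 0.1 are all hypothetical choices made for illustration.

```python
import math


def normalize(counts):
    """Turn the per-bin packet counts of one flow into a probability distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("flow has no packets")
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))


def mean_pairwise_distance(flows):
    """Average Hellinger distance over all pairs of flows (needs at least two flows)."""
    dists = [hellinger(normalize(flows[i]), normalize(flows[j]))
             for i in range(len(flows)) for j in range(i + 1, len(flows))]
    return sum(dists) / len(dists)


def looks_like_attack(flows, threshold=0.1):
    """Flag traffic whose flows are suspiciously similar (threshold is hypothetical)."""
    return mean_pairwise_distance(flows) < threshold


if __name__ == "__main__":
    bot_flows = [[5, 5, 5, 5], [10, 10, 10, 10], [2, 2, 2, 2]]    # same program, same shape
    human_flows = [[9, 1, 0, 0], [0, 0, 2, 8], [1, 7, 2, 0]]      # independent users
    print(looks_like_attack(bot_flows), looks_like_attack(human_flows))
```

The design follows the observation in the text that attack traffic is generated by a program behaving predictably, so its flows cluster tightly, while flows from independent human users tend to lie further apart. A real deployment would have to choose the binning and the threshold from measured traffic.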

Table I compares the traffic of a flash crowd with that of a DDoS attack. As we can see, the main differences lie in the traffic type, where the traffic of a flash event is genuine while that of a DDoS attack is malicious, and in the response to traffic control, where the former is responsive while the latter is unresponsive. The “problem” with these countermeasures is that they are applicable only to noisy DDoS attacks, that is, attacks in which identifiable patterns reveal the malicious traffic. Recently M. Kang et al. [12] introduced a new type of DDoS attack, called Crossfire, that degrades and eventually cuts off network connections to a target area by flooding only a few links. The

strong aspects of this attack are that it ensures undetectability by using legitimate flows (e.g. 4 Kbps flows) to flood the target links, and attack-flow indistinguishability, since all the traffic carries different source and destination IP addresses, which makes bandwidth aggregation mechanisms ineffective. Another key characteristic of Crossfire is the flexibility in the choice of targets: the attack can be launched against any target area. Lastly, the attack is able to disconnect a target area persistently, which makes Crossfire a pure data plane attack. This carefully crafted DDoS attack can very easily disguise malicious traffic sent by bots as legitimate traffic sent by real users. Possible solutions to this attack would be (1) supporting application-layer overlays that route around flooded links by selecting different routes or (2) launching massive attacks to disrupt the bot markets. Since there are no strong solutions to DDoS attacks that can guarantee no downtime, we conclude this section by listing some useful techniques to manage them: (1) dropping extra requests that cannot be served by the system, (2) pushing the extra requests back to the network edge (and dropping them there) and (3) multi-routing the large web response traffic to reduce the impact on cross traffic.

IV. DYNAMIC RESOURCE PROVISIONING

Predicting peak Internet workloads has proven to be difficult, since the load on hosted applications varies over time due to long-term trends such as time-of-day effects (e.g. more people accessing the Internet during the daytime) and also due to flash crowds. Dynamic resource provisioning enables additional resources (e.g. servers) to be allocated to an application on the fly to handle workload increases. Its effectiveness depends on a number of factors, such as the resource allocation algorithm, the granularity of allocation and the allocation overheads.
We define the allocation scheme through two parameters: (1) responsiveness, defined as the amount of time it takes the allocation model to detect a load change and determine the new allocation, and (2) allocation overhead, which is the time needed to actually bring the new resources online once the allocation decision has been made. The problem that dynamic resource provisioning tries to solve is enabling the application to take automatic scaling decisions by evaluating its own requirements in real time, and thus to request resources beforehand through intelligent prediction in order to maximize performance. Most of the existing resource provisioning techniques, some of which we discuss in the next section, have been developed from the cloud service provider viewpoint. Our proposed solution is more forward-looking and robust in terms of resource prediction in the cloud for several reasons. First, it uses historical data for training and testing the forecasting models. Second, our prediction algorithm is developed using statistical learning algorithms which are simple yet effective. The algorithm that we propose uses linear regression to identify the ascending trend in the load of a Web application,

and verify whether the load anticipated in the near future might be classified as a flash crowd. The proposed learning algorithm is the Vector Autoregressive (VAR) model, a common technique used in finance, economics, and other fields to model the relationship among multiple time series. For example, an economist might use VAR models to understand the impact of monetary policy on inflation and unemployment. We extended this model and applied it to our purposes: predicting the most likely future outcome based on recent resource usage and historical data. To implement our prototype we used statsmodels [17], a Python module that provides classes and functions for the estimation of different statistical models. The idea is to correlate the linear interdependencies among multiple historical values of metrics, such as server CPU load or service response times (e.g. of a PHP web service), that were collected during the online profiling phase, in order to describe the evolution of the next possible values of these metrics. The model is triggered when the underlying environment encounters violations of the service level agreement (SLA) established between the parties. SLAs contain Quality of Service (QoS) properties that need to be measured and monitored during the provisioning of the service. Consequently, when the system identifies that some services exceed a predefined threshold (e.g. response delay > 500 ms or CPU load > 80%), the VAR model is run and decisions are taken in order to keep the system within the Service Level Objectives (SLO). The flow of the algorithm is as follows:
1) Load previously collected data (historical cpu_load and php_resp_time values) and real-time values;
2) Select data to fit the VAR model;
3) Run the VAR model and predict next values by using the statsmodels library;
4) Adjust resources of the system (e.g. add or remove servers) based on the resulting data and the predefined thresholds.
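The four steps above can be sketched as follows. This is not the implementation referenced in [18]: to stay dependency-free, the sketch swaps the statsmodels call for a hand-written first-order VAR fitted by ordinary least squares, and every function name, the lag order of 1, the forecast horizon and the scale-down thresholds (`cpu_low`, `resp_low`) are assumptions made here. Only the 80% CPU load and 500 ms response-time thresholds come from the text.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_var1(history):
    """Fit y_t = c + A y_(t-1) by least squares; one row [c, a_1..a_k] per metric."""
    k = len(history[0])
    lagged = [[1.0] + list(row) for row in history[:-1]]   # intercept + previous sample
    targets = history[1:]
    xtx = [[sum(r[i] * r[j] for r in lagged) for j in range(k + 1)]
           for i in range(k + 1)]
    coefs = []
    for metric in range(k):
        xty = [sum(r[i] * y[metric] for r, y in zip(lagged, targets))
               for i in range(k + 1)]
        coefs.append(solve(xtx, xty))                      # normal equations
    return coefs


def forecast(coefs, last, steps):
    """Iterate the fitted model forward from the last observed sample."""
    current, predicted = list(last), []
    for _ in range(steps):
        current = [c[0] + sum(a * v for a, v in zip(c[1:], current)) for c in coefs]
        predicted.append(current)
    return predicted


def sla_violated(sample, cpu_max=80.0, resp_max=500.0):
    """Thresholds from the text: CPU load > 80% or response delay > 500 ms."""
    cpu_load, resp_time = sample
    return cpu_load > cpu_max or resp_time > resp_max


def scaling_decision(predicted, cpu_low=30.0, resp_low=200.0):
    """Step 4; the scale-down thresholds are hypothetical values for this sketch."""
    if any(sla_violated(p) for p in predicted):
        return "add_server"
    if all(cpu < cpu_low and resp < resp_low for cpu, resp in predicted):
        return "remove_server"
    return "hold"


def provision(history, latest, steps=3):
    """Steps 1-4: trigger on an SLA violation, fit, predict, then decide."""
    if not sla_violated(latest):
        return "hold", []
    coefs = fit_var1(history + [latest])
    predicted = forecast(coefs, latest, steps)
    return scaling_decision(predicted), predicted
```

Each sample is a `[cpu_load, php_resp_time]` pair, so `provision(history, latest)` returns a decision together with the predicted samples that motivated it. With statsmodels available, the fitting and forecasting steps would instead typically go through `statsmodels.tsa.api.VAR`, which also supports higher lag orders; the hand-rolled fit here is only meant to show what the model estimates.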
A complete implementation of the algorithm can be found at [18]. The algorithm proved to capture the future requirements of the system well when the training phase is fed with a broad set of previously collected traces; it has therefore been integrated into ConPaaS [19], an open source cloud platform, as part of the dynamic resource provisioning scheme.

V. RELATED WORK

A lot of research is being conducted in the area of flash crowds. From handling them, to discriminating FC from DDoS attacks, to reducing their impact through predictive and reactive provisioning, all of these efforts have the same goal: keeping Web applications responsive regardless of the type of problem. The problem of resource provisioning has been addressed for single-tier [20], [21] and multi-tier [22], [23] Web applications. These models capture the performance impact of techniques such as caching and database replication and assume that the underlying machines are homogeneous. However, in Cloud Computing platforms resources are heterogeneous, so these systems do not apply. As a consequence a few research

works investigate the problem of resource provisioning in the Cloud. Stewart et al. [24] implemented a performance prediction model for Internet services by using different hardware capacities such as processor speeds and processor cache sizes. Alternatively, Marin et al. [25] used hardware patterns to predict the efficiency of different applications across heterogeneous architectures. These approaches cannot easily be extended to predict Web application performance because they rely on hardware metrics that are not available in the Cloud, where low-level metrics are hidden by the virtualization layer. On the other hand, Silva et al. [26] proposed an algorithm for optimizing the number of machines for a job with a predefined number of independent tasks, so that the maximum speedup can be achieved within a limited budget. Similarly, Lim et al. [27] addressed the problem of creating an automated adaptive scaling system in the cloud by using a proportional thresholding approach, which dynamically adjusts the target range (i.e. high and low thresholds) based on the number of accumulated virtual instances. Although these works present new techniques for adaptive scaling, there is a need for an effective prediction scheme to forecast future SLAs in order to minimize the system overhead. In this regard, Caron et al. [28] created a pattern matching algorithm, based on similar characteristics of web traffic, that predicts the workload from past usage patterns. Even though this approach helps in making dynamic scaling decisions in real time, it is unable to adapt to any new pattern that might appear as a result of the dynamic nature of web traffic. Most of the existing research focuses on resource provisioning from the cloud service provider perspective.
In contrast, our proposed solution presented in Section 4 analyzes the problem of resource provisioning from the application provider’s viewpoint, so that the application is able to take automatic scaling decisions by assessing its own business level agreements in real time and thus request resources beforehand through intelligent prediction models to maximize performance.

VI. CONCLUSION

The concept of a flash crowd is by its nature imprecise, so no ideal definition can be given. However, we proposed a formal definition that is driven by expected system reactions, and thus approximates human intuition. We started this paper with the aim of describing why flash crowds are a problem that needs careful attention, and we argued that predicting a phenomenon with an erratic pattern is challenging since these bursts arrive without any prior notification. Throughout the paper we analyzed whether it is worth investing in a private infrastructure able to handle a FC, and we found that this expensive investment can be sustained mainly by big companies. On the other hand, we also showed that garage innovators can take advantage of Cloud Computing to deploy web applications that meet rigorous requirements in exchange for dynamic and flexible prices.

Another problem that we tackled in this paper is the confusion between DDoS attacks and flash crowds, where the vast majority assume that legitimate traffic can be distinguished from malicious traffic by performing simple checks on the content of the packets, their headers, or their arrival rates. Yet attackers are increasingly disguising their traffic by mimicking the access patterns of legitimate users, as we saw with the Crossfire attack [12], which allows attackers to defy traditional filters. Since there is as yet no proven solution to this attack, we conclude that this race between “white” and “black” authorities is never going to end, because technological evolution does not take sides. As a consequence we proposed some useful techniques to manage DDoS attacks.

Finally, we addressed the problem of predicting a flash event by presenting a learning algorithm used in finance, known as Vector Autoregression. This model correlates the linear interdependencies between multiple historical and real-time values of different metrics (e.g. CPU load) and forecasts their evolution, thereby facilitating dynamic resource management for web-based applications. We claimed that our model behaves well when the training phase is fed with a broad set of input data; we therefore conclude that handling flash crowds and dynamic resource provisioning is not a trivial task. Throughout the paper we supported our claim by presenting practical examples that identify open issues which need to be addressed before assuming that flash crowds can be predicted, identified and controlled.

REFERENCES

[1] International Data Corporation, IDC’s Cloud Computing and Datacenter Roadshow 2013, http://idc-cema.com/eng/events/52888-idc-s-cloud-computing-and-datacenter-roadshow-2013, 2013. [cited: October-2013].
[2] L. Niven, Flash crowd, in The Flight of the Horse, Ballantine Books, 1971.
[3] J. Ioannidis and S. M. Bellovin, Implementing push-back: Router-based defense against DDoS attacks, in Proceedings of Network and Distributed System Security Symposium, San Diego, CA, Feb. 2002.
[4] I. Ari, B. Hong, E. Miller, S. Brandt, D. Long, Managing flash Crowds on the Internet, IEEE/ACM (MASCOTS03), Orlando, FL, USA, October 2003.
[5] I. Ari, B. Hong, E. Miller, S. Brandt, D. Long, Modeling, Analysis and Simulation of Flash Crowds on the Internet, Storage Systems Research Center, Jack Baskin School of Engineering, University of California, Santa Cruz, CA, USA, February 2004.
[6] A. Chandra and P. Shenoy, Effectiveness of Dynamic Resource Allocation for Handling Internet Flash Crowds, Department of Computer Science, University of Massachusetts Amherst, 2003.
[7] A. Marnerides, D. Pezaros, and D. Hutchison, Flash Crowd Detection within the realms of an Internet Service Provider (ISP), Proceedings of the 9th Annual Postgraduate Symposium on The Convergence of Telecommunications, Networking and Broadcasting, pp. 153–158, Liverpool, UK, June 2008.
[8] J. Elson and J. Howell, Handling flash crowds from your garage, Annual Technical Conference, pp. 171–184, Boston, Massachusetts, USA, 2008.
[9] M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia, Above the Clouds: A Berkeley View of Cloud Computing, Technical Report UCB/EECS-2009-28, EECS Department, University of California, Berkeley, Feb. 2009.
[10] Amazon Inc., Amazon Elastic Compute Cloud (Amazon EC2), http://aws.amazon.com/ec2/, 2013. [cited: October-2013].
[11] Google Inc., Google App Engine, https://developers.google.com/appengine/, 2013. [cited: October-2013].
[12] M. Kang, S. Lee, V. Gligor, The crossfire attack, IEEE Symposium on Security and Privacy (SP), pp. 127–141, 2013.

[13] T. Thapngam, S. Yu, W. Zhou and G. Beliakov, Discriminating DDoS Attack Traffic from Flash Crowd through Packet Arrival Patterns, The First IEEE International Workshop on Security in Computers, Networking and Communications, pp. 952–958, 2011.
[14] P. Rajani Reddy, R. Siva and C. Malathi, Techniques to Differentiate DDOS Attacks from Flash Crowd, International Journal of Advanced Research in Computer Science and Software Engineering, ISSN: 2277 128X, Volume 3, Issue 6, June 2013.
[15] K. Munivara Prasad, A. Rama Mohan Reddy and K. Venugopal Rao, Discriminating DDoS Attack traffic from Flash Crowds on Internet Threat Monitors (ITM) Using Entropy variations, Afr. J. of Comp. & ICTs, Vol. 6, No. 3, pp. 53–62, June 2013.
[16] J. Jung, B. Krishnamurthy, and M. Rabinovich, Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites, Proceedings of the 11th International Conference on World Wide Web, Honolulu, Hawaii, USA, May 2002.
[17] StatsModels, Statistics in Python, http://statsmodels.sourceforge.net/devel/index.html, 2013. [cited: October-2013].
[18] I. Morozan, Vector Autoregression Model (VAR) for Flash Crowds, https://bitbucket.org/ionmorozan/vector-autoregression-model-var, 2013. [cited: October-2013].
[19] ConPaaS, Integrated Runtime Environment for Elastic Cloud Applications, http://www.conpaas.eu/, 2013. [cited: October-2013].
[20] T. Abdelzaher, K. Shin, and N. Bhatti, Performance guarantees for Web server end-systems: a control-theoretical application, IEEE Transactions on Parallel and Distributed Systems, January 2002.
[21] R. Doyle, J. Chase, O. Asad, W. Jin, and A. Vahdat, Model-based resource provisioning in a Web service utility, in Proc. USITS, 2003.
[22] B. Urgaonkar, P. Shenoy, A. Chandra, P. Goyal, and T. Wood, Agile dynamic provisioning of multi-tier Internet applications, ACM Transactions on Autonomous Adaptive Systems, January 2008.
[23] D. Villela, P. Pradhan, and D. Rubenstein, Provisioning servers in the application tier for e-commerce systems, ACM Transactions on Internet Technology, July 2007.
[24] C. Stewart, T. Kelly, A. Zhang, and K. Shen, A dollar from 15 cents: Cross-platform management for internet services, in Proceedings of the USENIX Annual Technical Conference, June 2008.
[25] G. Marin and J. Mellor-Crummey, Cross-architecture performance predictions for scientific applications using parameterized models, Proceedings of the SIGMETRICS Conference, June 2004.
[26] J. Silva, L. Veiga, and P. Ferreira, Heuristic for resources allocation on utility computing infrastructures, MGC, Leuven, Belgium, December 2008.
[27] H. Lim, S. Babu, J. Chase, S. Parekh, Automated control in cloud computing: challenges and opportunities, Proceedings of the 1st Workshop on Automated Control for Datacenters and Clouds, Barcelona, Spain, June 2009.
[28] E. Caron, F. Desprez, and A. Muresan, Forecasting for grid and cloud computing on-demand resources based on pattern matching, in Proc. IEEE Cloud Computing, 2010.