Computer Networks
journal homepage: www.elsevier.com/locate/comnet
Article info

Article history:
Received 24 May 2013
Accepted 5 August 2013
Available online 18 September 2013

Keywords:
Energy-saving
Networked data centers
Dynamic online communication–computing resource provisioning
Hard real-time applications

Abstract

In this paper, we develop the optimal minimum-energy scheduler for the dynamic online joint allocation of the task sizes, computing rates, communication rates and communication powers in virtualized Networked Data Centers (NetDCs) that operate under hard per-job delay constraints. The considered NetDC infrastructure is composed of multiple frequency-scalable Virtual Machines (VMs) that are interconnected by a bandwidth- and power-limited switched Local Area Network (LAN). Due to the nonlinear power-vs.-communication-rate relationship, the resulting Computing-Communication Optimization Problem (CCOP) is inherently nonconvex. In order to analytically compute the exact solution of the CCOP, we develop a solving approach that relies on the following two main steps: (i) we prove that the CCOP retains a loosely coupled structure, which allows us to perform the lossless decomposition of the CCOP into the cascade of two simpler sub-problems; and (ii) we prove that the coupling between these sub-problems is provided by a (scalar) constraint that is linear in the offered workload. The resulting optimal scheduler is amenable to scalable and distributed online implementation, and its analytical characterization is in closed form. After numerically testing its actual performance under randomly time-varying, synthetically generated and real-world measured workload traces, we compare the obtained performance with that of some state-of-the-art static and sequential schedulers.

© 2013 Published by Elsevier B.V.
focus of this paper, whose main contributions can be summarized as follows:

i. the contrasting objectives of low consumption of both communication and computing energies in delay- and bandwidth-constrained NetDCs affected by reconfiguration costs are cast in the form of a suitable constrained optimization problem, namely, the Computing and Communication Optimization Problem (CCOP);
ii. due to the nonlinear behavior of the rate-vs.-power-vs.-delay relationship, the CCOP is not a convex optimization problem, and neither guaranteed-convergence iterative algorithms nor closed-form formulas are, to date, available for its solution. Hence, in order to solve the CCOP exactly and in closed form, we prove that it admits a loss-free (i.e., optimality-preserving) decomposition into two simpler sub-problems, namely, the CoMmunication Optimization Problem (CMOP) and the ComPuting Optimization Problem (CPOP). Although the CMOP and CPOP are not guaranteed to be convex problems, their loosely coupled structure (in the sense of [25]) allows us to solve the (nonconvex) CCOP and develop analytical conditions for its feasibility;
iii. we develop a fully autonomic version of the proposed resource scheduler that is capable of quickly adapting to the a priori unknown time-variations of the offered workload, without requiring workload forecasting;
iv. finally, we derive analytical conditions for the (possible) hibernation of the instantiated VMs, which highlight the tight interplay between computing and communication resources.

A remarkable feature of the developed adaptive scheduler is its scalable and distributed structure, which makes the complexity of its online implementation independent of the size of the considered NetDC.

1.2. Related work

Updated surveys of the current technologies and open communication challenges related to the green cloud paradigm have been recently presented in [3,15]. Specifically, power management schemes that exploit Dynamic Voltage and Frequency Scaling (DVFS) techniques [15, Section 5] for performing resource provisioning are the focus of [26,30,31]. Although these contributions consider hard-deadline constraints, they do not consider the performance penalty and the energy-vs.-delay tradeoff stemming from the finite capacity of the utilized LANs. Furthermore, the complexity of the online implementation of the schedulers proposed therein scales as O(M log(M)), where M is the minimum between the number of tasks to be executed in parallel and the number of available computing machines.

Energy-saving dynamic provisioning of the computing resources in virtualized green data centers is the topic of [10,16–20,27]. Specifically, [16] formulates the analytical optimization problem as a feedback control problem that must converge to an a priori known target performance level. While this approach is suitable for tracking problems, it cannot be employed for energy-minimization problems, where the target values are a priori unknown. Roughly speaking, the common approach pursued by [10,17,18,20] is to formulate the afforded minimum-cost problems as sequential optimization problems and, then, solve them by using limited look-ahead control. Hence, the effectiveness of this approach relies on the ability to accurately predict the future workload, and it degrades when the workload exhibits almost unpredictable time fluctuations [15].

Furthermore, the joint provisioning of communication and computing resources is not considered by these contributions, which mainly focus on the computing aspects. In order to avoid the prediction of future workload, [19] resorts to a Lyapunov-based technique that dynamically optimizes the provisioning of the computing resources by exploiting the available queue information. Although the pursued approach is of interest, it relies on an inherent delay-vs.-utility tradeoff that does not allow hard-deadline constraints to be accounted for. The combined exploitation of DVFS and virtualization techniques is the focus of [27]. Although the parallel computing platform considered in [27] is, indeed, multi-core and managed by a Virtual Machine Manager (VMM), the framework developed in [27] does not consider load balancing and neglects both the energy and time overheads induced by the underlying LAN.

The joint analysis of the computing-plus-communication energy consumption in NetDCs is, indeed, the focus of [2,4,6], where delay-tolerant Internet-based applications are considered. Interestingly, the main lesson stemming from these contributions is that the energy consumption due to data communication may represent a large part of the overall energy demand, especially when the utilized network is power- and/or bandwidth-limited. Overall, these works numerically analyze and test the energy performance of some state-of-the-art schedulers for NetDCs, but do not attempt to optimize it through the dynamic joint scaling of the available communication-plus-computing resources. As also recently pointed out in [3] and [15, Section 9], this is still an open research topic, especially when hard-delay requirements are also present.

The rest of this paper is organized as follows. After modeling the considered NetDC infrastructure in Section 2, in Section 3 we formally state the afforded CCOP and then, in Section 4, we solve it and provide the analytical conditions for its feasibility. In Section 5, we present the main structural properties of the resulting optimal scheduler, and analytically characterize the (possible) occurrence of hibernation phenomena of the instantiated VMs. In Section 6, we numerically test the average performance of the proposed scheduler at various values of the Peak-to-Mean Ratio (PMR) of the (randomly time-variant) offered workload, and then we compare the obtained performance against that of some state-of-the-art static and sequential schedulers. The concluding Section 7 recaps the main results and outlines some hints for future research. The final Appendix reports the main proofs.
N. Cordeschi et al. / Computer Networks 57 (2013) 3479–3491 3481
About the adopted notation, $[x]_a^b$ indicates $\min\{\max\{x; a\}; b\}$, $[x]_+$ means $\max\{x; 0\}$, while $1_{[A]}$ is the (binary-valued) indicator function of the event $A$ (that is, $1_{[A]}$ is unity when the event $A$ happens, while it vanishes when the event $A$ does not occur).

2. The considered NetDC infrastructure

In principle, a networked clustered platform for parallel computing is composed of multiple (possibly virtualized) processing units and a central resource controller [1]. Each processing unit executes the currently assigned task as an independent processor by self-managing its own local storage/computing resources. Intra-cluster communication is supported through message passing. When a request for a new job is submitted to the NetDC, the central resource controller dynamically performs both admission control and resource allocation [14,15].

Hence, from an infrastructure perspective, emerging NetDCs are composed of three main components [14,15], i.e., the data storage, the VMM and the switched LAN (see Fig. 1). A new job is initiated by an event that is constituted by the arrival at the instant $t_a$ of a file of size $L_t$ (bit). Due to the real-time nature of the considered application scenario, full processing of the input file must be carried out within a given (i.e., a priori assigned) deterministic working time $T_t$ (s). Hence, in our framework, a real-time job is characterized by [13]: (i) the size $L_t$ (bit) of the file to be processed; (ii) the maximum tolerated processing delay $T_t$ (s); and (iii) the job granularity, that is, the (integer-valued) maximum number $M_T \geq 1$ of independent parallel tasks embedded into the submitted job [13, Section 2.4].

Let $M_V \geq 1$ be the (integer-valued) maximum number of VMs that may be instantiated onto the NetDC of Fig. 1. In principle, each VM may be modeled as a (virtual) server that is capable of processing $f_c$ bits per second [24]. Depending on the size $L$ (bit) of the task T to be currently processed by the VM, the processing rate $f_c$ may be adaptively scaled at run-time, and it may assume values over the interval $[0, f_c^{max}]$, where $f_c^{max}$ (bit/s) is the maximum allowed processing rate.

Furthermore, due to the real-time nature of the considered application scenario, the time allowed for the VM to fully process each submitted task is fixed in advance at $\Delta$ (s), regardless of the actual size $L$ of the task currently submitted to the VM. In addition to the currently submitted task (of size $L$), the VM may also process a background workload of size $L_b$ (bit), which accounts for the Operating System (OS) programs. As in [13,24], this background workload is assumed to be stored in the local memory of the VM, so that the execution of the background task entails computing costs but does not induce communication costs. Hence, by definition, the utilization factor $\eta$ of the VM equates [24]: $\eta \triangleq f_c / f_c^{max} \in [0, 1]$. The analysis reported in the (quite recent) contributions [7, Section 3.4], [14,23, Section 3.1] points out that the reduction of the dynamic component of the computing energy plays a major role in the reduction of the overall computing energy, especially when the supported services are of real-time type. Then, in agreement with these contributions, let $\mathcal{E}_c \equiv \mathcal{E}_c(f_c)$ (Joule) be the overall energy consumed by the VM to process a single task of duration $\Delta$ (s) at the processing rate $f_c$, and let $\mathcal{E}_c^{max} \equiv \mathcal{E}_c(f_c^{max})$ (Joule) be the corresponding maximum energy when the VM operates at the maximum processing rate $f_c^{max}$. Then, by definition, the dimensionless ratio:

$$\Phi(\eta) \triangleq \frac{\mathcal{E}_c(f_c)}{\mathcal{E}_c^{max}} \equiv \Phi\left(\frac{f_c}{f_c^{max}}\right), \tag{1}$$

is the so-called Normalized Energy Consumption of the considered VM [5]. From an analytical point of view, $\Phi(\eta): [0, 1] \to [0, 1]$ is a function of the actual value $\eta$ of the VM utilization factor. Its analytical behavior depends on the specific features of the resource provisioning policy
Fig. 1. The considered NetDC architecture. Continuous lines indicate bidirectional data flows. Dotted lines are bidirectional controlling paths. The LAN is composed of the switch unit and the associated point-to-point links.
actually implemented by the VMM of Fig. 1 [7,11]. However, at least for CMOS-based physical CPUs, the following three (mild) assumptions on $\Phi(\eta)$ are typically met [5,23]: (i) $\Phi(\eta = 0) = 0$ and $\Phi(\eta = 1) = 1$; (ii) $\Phi(\eta)$ is strictly increasing and continuous in $\eta$; and (iii) $\Phi(\eta)$ is strictly convex in $\eta$, i.e., $d^2\Phi(\eta)/d\eta^2 > 0$, $\forall \eta \in [0, 1]$. Just as a practical example, the analytical form assumed by $\Phi(\eta)$ for DVFS-enabled CMOS CPUs is recognized to be well described by the following quadratic one [5,11,23]:

$$\Phi(\eta) = \eta^2, \quad \eta \in [0, 1]. \tag{2}$$

Furthermore, as in [13], we may utilize the (dimensionless) attribute $\omega$ for measuring the relative energy cost incurred by the actuation of the considered VM for the execution of the planned task. Specifically, larger values of $\omega$ make the execution of the task on the considered VM more energy-expensive, and $\omega = +\infty$ forbids altogether the execution of the planned task by the VM. Therefore, a suitable setting of the attributes $\{\omega(i), i = 1, \dots, M_V\}$ may capture task preferences, heterogeneity in the functionality of the available VMs, as well as underlying task-placement constraints [13].

[...]

the (one-way) transmission and switching and the corresponding power $P^{net}_R(i)$ demanded by the receive circuit. In general, the actual value assumed by $P_i^{net}$ depends on the corresponding transmission rate $R_i$, noise spectral power density $N_0^{(i)}$ (W/Hz), bandwidth $W_i$ (Hz) and (nonnegative) gain $g_i$ of the $i$th link [8]. The Shannon-Hartley exponential formula:

$$P_i^{net} \equiv P_i^{net}(R_i) = \zeta_i \left(2^{R_i/W_i} - 1\right), \tag{3}$$

with $\zeta_i \triangleq N_0^{(i)} W_i / g_i$, $i = 1, \dots, M$, and the $\alpha$-powered formula:

$$P_i^{net} \equiv P_i^{net}(R_i) = \Omega_i \left(\frac{R_i}{W_i}\right)^{1/\alpha}, \tag{4}$$

with $\alpha \in\, ]0, 1[$ and $\Omega_i \triangleq \dfrac{N_0^{(i)} W_i}{(\ln(2))^{1/\alpha}\, g_i}$, $i = 1, \dots, M$, are examples of power-rate functions of practical interest [8, Chapter 3]. Hence, the corresponding one-way transmission delay $D(i)$ (s) equates: $D(i) = L_i/R_i$, so that the corresponding one-way communication energy $\mathcal{E}^{net}(i)$ (Joule) is: $\mathcal{E}^{net}(i) = P_i^{net}(R_i)(L_i/R_i)$.

2.3. Reconfiguration cost and machine management
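The computing and communication energy models introduced so far (Eqs. (1)–(4) and the one-way energy $P_i^{net}(R_i)(L_i/R_i)$) can be sketched numerically as follows. This is only a minimal illustration: the function names and the numeric values in the usage note are assumptions chosen for this example, and the coefficient of the α-powered formula is passed in as a plain parameter rather than computed from link parameters.

```python
def normalized_energy(eta: float) -> float:
    """Quadratic normalized energy of Eq. (2): Phi(eta) = eta^2, eta in [0, 1]."""
    if not 0.0 <= eta <= 1.0:
        raise ValueError("the utilization factor must lie in [0, 1]")
    return eta ** 2


def computing_energy(f_c: float, f_max: float, e_max: float) -> float:
    """Eq. (1) rearranged: E_c(f_c) = E_c^max * Phi(f_c / f_c^max), in Joule."""
    return e_max * normalized_energy(f_c / f_max)


def shannon_hartley_power(rate: float, bandwidth: float, zeta: float) -> float:
    """Eq. (3): P(R) = zeta * (2^(R/W) - 1), with zeta = N0 * W / g in Watt."""
    return zeta * (2.0 ** (rate / bandwidth) - 1.0)


def alpha_power(rate: float, bandwidth: float, omega_coef: float, alpha: float) -> float:
    """Eq. (4): P(R) = Omega * (R/W)^(1/alpha), with alpha in ]0, 1[.

    omega_coef stands for the coefficient Omega_i (hypothetical parameter name).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in the open interval ]0, 1[")
    return omega_coef * (rate / bandwidth) ** (1.0 / alpha)


def one_way_comm_energy(task_bits: float, rate: float, power: float) -> float:
    """One-way communication energy: E_net = P(R) * (L / R), in Joule."""
    return power * task_bits / rate
```

For instance, with the illustrative values f_c = 50 Mbit/s, f_c^max = 100 Mbit/s and E_c^max = 60 J, `computing_energy(50e6, 100e6, 60.0)` gives 15 J, since the quadratic model charges a quarter of the maximum energy at half the maximum rate.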
Remark 1. Parallel tasks in clustered NetDCs

We point out that, in our framework, the size $L_t$ of each submitted job remains constant over the corresponding working time $T_t$ and no workload fluctuations take place during the execution of each job. Hence, formally speaking, neither migrations of VMs nor actions for turning the underlying physical CPUs ON/OFF are to be forecast at run-time, so that the scheduling policies considered here are of clairvoyant type [13].¹ Finally, our model assumes that the job inter-arrival delays are larger than the allowed per-job processing time $T_t$, so that no queueing phenomena occur. As detailed in the following Section 3, load balancing and task parallelization are effective means to meet this assumption. This is in agreement with the assumed hard real-time behavior of the overall NetDC of Fig. 1, which, in principle, cannot be guaranteed when randomly time-variant unpredictable queuing/blocking delays occur [30].

¹ Since, at the present time, migration of a (single) VM requires about a minute and turning physical servers ON/OFF requires about 4-5 minutes [24], it may not be appealing (or even not feasible at all) to resort to migrations of VMs during the execution of real-time jobs.

Remark 2. Discrete DVFS

In general, the aforementioned assumption of a continuous-valued utilization factor $\eta$ may be questionable, because, due to the one-to-one mapping in (1), it is equivalent to requiring continuous-valued computing rates $f_c$'s. Actual VMs are instantiated on top of physical CPUs,

[...]

and, then, we may use it as the true energy consumption curve for resource provisioning [21]. Unfortunately, since the virtual curve is of continuous type, it is no longer guaranteed that the resulting optimally scheduled computing rates are still discrete-valued. However, as also explicitly noted in [21,31], any point $(\eta^*, \Phi(\eta^*))$, with $\hat{\eta}^{(j)} < \eta^* < \hat{\eta}^{(j+1)}$, on the virtual curve may be actually attained by time-averaging over $\Delta$ secs (i.e., on a per-job basis) the corresponding surrounding vertex points: $(\hat{\eta}^{(j)}, \Phi(\hat{\eta}^{(j)}))$ and $(\hat{\eta}^{(j+1)}, \Phi(\hat{\eta}^{(j+1)}))$. Due to the piecewise linear behavior of the virtual curve, as in [21,31], it is guaranteed that the average energy cost of the discrete DVFS system equates that of the corresponding virtual one over each time interval of duration $\Delta$ (s) (i.e., on a per-job basis). Furthermore, from a technological point of view, the time-overheads induced by frequency scaling are limited to a few tens of μs in state-of-the-art DVFS-enabled multi-core computing platforms [15,28], so that current technology makes feasible the execution of time-sharing operations at run-time [31]. Hence, in the sequel, we directly focus on the case of continuous (possibly, piecewise linear) DVFS-enabled computing platforms.

3. Optimal communication–computing resource allocation

In this section, we deal with the second service offered by the Manager Module of Fig. 1, namely, the dynamic load balancing and provisioning of the communication-plus-computing resources. Specifically, this service aims at properly tuning the task sizes $\{L_i, i = 1, \dots, M\}$, the communication rates $\{R_i, i = 1, \dots, M\}$ and the computing rates $\{f_i, i = 1, \dots, M\}$ of the DVFS-enabled VMs of Fig. 1. The goal is to minimize (on a per-slot basis) the overall resulting communication-plus-computing energy:

$$\mathcal{E}_{tot} \triangleq \sum_{i=1}^{M} \mathcal{E}_c(i) + \sum_{i=1}^{M} \mathcal{E}^{net}(i) \quad \text{(Joule)}, \tag{9}$$

[...]

$$\text{s.t.:} \quad (L_i + L_b(i)) \leq f_i \Delta, \quad i = 1, \dots, M, \tag{11.2}$$

$$\sum_{i=1}^{M} L_i = L_t, \tag{11.3}$$

$$0 \leq f_i \leq f_i^{max}, \quad i = 1, \dots, M, \tag{11.4}$$

$$L_i \geq 0, \quad i = 1, \dots, M, \tag{11.5}$$

$$\frac{2 L_i}{R_i} + \Delta \leq T_t, \quad i = 1, \dots, M, \tag{11.6}$$

$$\sum_{i=1}^{M} R_i \leq R_t, \tag{11.7}$$

$$R_i \geq 0, \quad i = 1, \dots, M. \tag{11.8}$$

About the stated problem, some explicative remarks are in order. Specifically, the first two terms in the summation in (11.1) account for the computing-plus-reconfiguration energy $\mathcal{E}_c(i)$ consumed by the VM (i), while the last term in (11.1) is the communication energy $\mathcal{E}^{net}(i)$ requested by the corresponding point-to-point link for conveying $L_i$ bits at the transmission rate of $R_i$ (bit/s).
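As a rough aid for reading the constraint set (11.2)–(11.8), the sketch below checks whether a candidate allocation of task sizes, communication rates and computing rates is feasible. It is not the scheduler of this paper: the function name, the tolerance and the treatment of an idle link (zero task size on a zero-rate link is accepted) are assumptions made for this example only.

```python
from typing import List, Sequence


def ccop_violations(L: Sequence[float], R: Sequence[float], f: Sequence[float],
                    Lb: Sequence[float], f_max: Sequence[float],
                    delta: float, T_t: float, L_t: float, R_t: float,
                    tol: float = 1e-9) -> List[str]:
    """Return the labels of the constraints (11.2)-(11.8) violated by (L, R, f)."""
    violated = []
    for i in range(len(L)):
        vm = f"VM {i + 1}"
        if L[i] + Lb[i] > f[i] * delta + tol:           # (11.2) task fits in delta
            violated.append(f"(11.2) {vm}")
        if f[i] < -tol or f[i] > f_max[i] + tol:        # (11.4) rate box constraint
            violated.append(f"(11.4) {vm}")
        if L[i] < -tol:                                 # (11.5) nonnegative task size
            violated.append(f"(11.5) {vm}")
        if R[i] < -tol:                                 # (11.8) nonnegative link rate
            violated.append(f"(11.8) {vm}")
        elif L[i] > tol:
            # (11.6): two-way transfer plus computing time within the deadline.
            if R[i] <= tol or 2.0 * L[i] / R[i] + delta > T_t + tol:
                violated.append(f"(11.6) {vm}")
        elif delta > T_t + tol:
            # Assumption: an idle link (L_i = 0) only needs delta <= T_t.
            violated.append(f"(11.6) {vm}")
    if abs(sum(L) - L_t) > tol:                         # (11.3) whole job is assigned
        violated.append("(11.3)")
    if sum(R) > R_t + tol:                              # (11.7) aggregate LAN rate
        violated.append("(11.7)")
    return violated
```

With sizes in Mbit and rates in Mbit/s, the illustrative allocation `L=[4, 4]`, `R=[50, 50]`, `f=[40, 40]`, `Lb=[0, 0]`, `f_max=[105, 105]`, `delta=0.1`, `T_t=5`, `L_t=8`, `R_t=100` returns an empty list, whereas lowering the first computing rate to 30 Mbit/s triggers (11.2) for VM 1.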
Furthermore, $f_i^0$ and $f_i$ in (11.1) represent the current (i.e., already computed and consolidated) computing rate and the target one, respectively. Formally speaking, $f_i$ is the variable to be optimized, while $f_i^0$ describes the current state of the VM (i), and, then, it plays the role of a known constant.

The constraint in (11.2) guarantees that the VM (i) executes the assigned task within $\Delta$ secs, while the (global) constraint in (11.3) assures that the overall job is partitioned into M parallel tasks. According to (10), the set of constraints in (11.6) forces the NetDC to process the overall job within the assigned deadline $T_t$, and, then, it guarantees that the overall communication–computing platform of Fig. 1 operates in hard real-time. Finally, the global constraint in (11.7) limits to $R_t$ (bit/s) the aggregate transmission rate sustainable by the underlying LAN of Fig. 1, so that $R_t$ is directly dictated by the actually implemented LAN technology (such as, for example, Myrinet, InfiniBand, Fast/Gigabit Ethernet, and so on; see, for example, [8]).

3.1. Generalizations of the CCOP's formulation

Depending on the actually considered NetDC infrastructure, several generalizations of the reported CCOP's formulation are possible. Specifically, as in [30], we assume constant the frequency-switching time-overhead induced by the adopted DVFS technique. However, it is direct to

[...]

The CCOP in (11) is not a convex optimization problem. This is due to the fact that, in general, each function $P_i^{net}(R_i)(L_i/R_i)$ in (11.1) is not jointly convex in $L_i$, $R_i$, even in the simplest case when the power function $P_i^{net}(\cdot)$ reduces to an assigned constant. Therefore, neither guaranteed-convergence iterative algorithms nor closed-form expressions are, to date, available to compute the optimal solution $\{\hat{L}_i, \hat{R}_i, \hat{f}_i, i = 1, \dots, M\}$ of the CCOP. However, a direct examination of Eqs. (11.1)–(11.8) unveils that the CCOP is a loosely coupled optimization problem (in the sense of [25]), where the variables $\{L_i, i = 1, \dots, M\}$ couple the communication and computing sub-problems. In the sequel, we formally develop a solving approach that is based on the (lossless) decomposition of the CCOP into the (aforementioned) CMOP and CPOP.

Formally speaking, for any assigned nonnegative vector $\vec{L}$ of the task sizes, the CMOP is the (generally nonconvex) optimization problem in the communication rate variables $\{R_i, i = 1, \dots, M\}$, so formulated:

$$\min_{\{R_i\}} \sum_{i=1}^{M} \frac{2 P_i^{net}(R_i)}{R_i} L_i, \tag{12.1}$$

$$\text{s.t.: CCOP's constraints in (11.6), (11.7), (11.8).} \tag{12.2}$$

Let $\{R_i(\vec{L}), i = 1, \dots, M\}$ be the optimal solution of the CMOP in (12), and let

$$\mathcal{S} \triangleq \left\{ \vec{L} \in (\mathbb{R}_0^+)^M : L_i / R_i(\vec{L}) \leq (T_t - \Delta)/2, \ i = 1, \dots, M; \ \sum_{i=1}^{M} R_i(\vec{L}) \leq R_t \right\},$$

[...]

4.1. Resolution of the CMOP and closed-form characterization of the $\mathcal{S}$ region

About the feasibility and solution of the CMOP, the following result holds.

² From an application point of view, such kind of jointly convex energy functions may model the coupling effects that are typically present when the employed Virtualization Layer of Fig. 1 fails to provide "perfect" isolation among the instantiated VMs [24].

³ In order to simplify the notation, we leave implicit the dependence of $R_i$ on $\vec{L}$ when it is not strictly needed.
Proposition 2. Let each function $P_i^{net}(R_i)/R_i$ in (12) be continuous and increasing for $R_i \geq 0$. Hence, for any assigned vector $\vec{L}$, the following two properties hold:

[...]

$$L_b(i) \leq \Delta f_i^{max}, \quad i = 1, \dots, M, \tag{20.1}$$

$$L_t \leq \sum_{i=1}^{M} \left( f_i^{max} \Delta - L_b(i) \right), \tag{20.2}$$
(i)) and the corresponding processing rate $f_i$ is strictly larger than the minimum one: $f_i^{min} \triangleq L_b(i)/\Delta$, requested for processing the background workload $L_b(i)$ (see the constraint in (11.2)). In principle, we expect that the hibernation of the VM (i) may lead to energy savings when $k_e$, the $f_i^0$'s and the ratios $P_i^{net}/R_i$'s are large, while the offered workload $L_t$ is small. As proved in the final Appendix A, this is, indeed, the behavior exhibited by the optimal scheduler, which hibernates the VM (i) at the processing frequency $f_i$ in (23.1) when the following hibernation condition is met:

$$\mu \leq TH(i). \tag{26}$$

Interestingly, the $i$th hibernation threshold $TH(i)$ in (22) is fully dictated by the power-rate behavior of the $i$th transmission link of the utilized LAN, while the corresponding hibernated frequency in (23.1) depends on computing-related parameters (see Eq. (21)). This provides additional evidence of the tight computing-communication coupling induced by the green paradigm.

5.2. Online implementation issues and parameter tracking

As also pointed out in [13, Table 6.1], the minimum execution-time scheduling of independent tasks on parallel computing platforms is, in general, an NP-hard Bin Packing optimization problem, which resists closed-form solutions, even when the communication costs are not considered. From this point of view, remarkable features of the optimal scheduler of Proposition 4 are that: (i) it leads to distributed and parallel computation (with respect to the $i$-index) of the 3M variables $\{f_i, L_i, R_i, i = 1, \dots, M\}$; and (ii) its implementation complexity is fully independent of the size $L_t$ of the offered workload and the aggregate capacity $R_t$ of the utilized LAN.

Moreover, in time-varying environments characterized by (possibly, abrupt and random) time-fluctuations of the offered workload $L_t$ (see Section 6), the per-job evaluation and online tracking of the Lagrange multiplier in (25) may be performed by resorting to a gradient-based updating, which assumes the following form [12]:

$$\mu^{(n)} = \left[ \mu^{(n-1)} - a^{(n-1)} \left( \sum_{i=1}^{M} L_i^{(n-1)} - L_t \right) \right]_+, \quad \mu^{(0)} = L_i^{(0)} = 0, \tag{27}$$

where $n \geq 1$ is an integer-valued iteration index, $\{a^{(n-1)}\}$ is a (suitable) $n$-variant nonnegative step-size sequence, and the following dummy iterates in the $n$-index also hold (see Eqs. (23.1), (23.2) and (24)):

$$\nu_i^{(n)} = \left[ \mu^{(n)} - 2 \frac{\partial P_i^{net}}{\partial R_i} \left( \frac{2 L_i^{(n-1)}}{(T_t - \Delta)} \right) \right]_+, \tag{28.1}$$

$$f_i^{(n)} = \left[ \pi_i^{-1} \left( 2 k_e f_i^0 + \nu_i^{(n)} \Delta \right) \right]_{f_i^{min}}^{f_i^{max}}, \tag{28.2}$$

$$L_i^{(n)} = 1_{[\nu_i^{(n)} > 0]} \left( f_i^{(n)} \Delta - L_b(i) \right) + 1_{[\nu_i^{(n)} = 0]} \frac{(T_t - \Delta)}{2} \left[ \left( \frac{\partial P_i^{net}(R_i)}{\partial R_i} \right)^{-1} \left( \frac{\mu^{(n)}}{2} \right) \right]_+. \tag{28.3}$$

Furthermore, an effective means for coping with the unpredictable time-variations of the offered workload is provided by the gradient-descent algorithm of [12] for the adaptive updating of the step-size in (27). In our framework, this updating reads as in [12, Eq. (2.4)]:

$$a^{(n)} = \max\left\{ 0; \min\left\{ b; \ a^{(n-1)} - c V^{(n-1)} \left( \sum_{i=1}^{M} L_i^{(n-1)} - L_t \right) \right\} \right\}, \tag{29}$$

where $b$ and $c$ are positive constants, while $V^{(n-1)}$ is updated as in [12, Eq. (2.5)]:

$$V^{(n)} = \left( 1 - a^{(n-1)} \right) V^{(n-1)} - \left( \sum_{i=1}^{M} L_i^{(n-1)} - L_t \right), \quad V^{(0)} = 0. \tag{30}$$

The convexity of the objective function in (18) and the existence and uniqueness of the solution of the CPOP allow the above iterations to converge to the optimum set $\{\mu, L_i, \nu_i, f_i, i = 1, \dots, M\}$ of Proposition 4 [12].

From an application point of view, a (first) appealing feature of the step-size updating algorithm in (29) is its robustness against the actual tuning of the involved parameters $b$ and $c$. As already pointed out in [12], we anticipate that, also in our framework, the final energy performance of the overall resource provisioning algorithm remains virtually unchanged for $b$ and $c$ ranging over the intervals $[10^{-4}, 10^{-2}]$ and $[0.1, 0.6]$, respectively.

An additional appealing feature of (29) is that, in some cases, a change of the offered workload is equivalent to re-starting the overall iterates (27)–(30) with the current workload as input [12]. Hence, it is expected that the presented gradient-based updating algorithm exhibits adaptive capabilities, and the numerical results confirm, indeed, this expectation.

5.3. Adaptive setting of the computing time $\Delta$

In principle, adapting the value of the allowed computing time $\Delta$ in (11.2), (11.6) to the (possibly, abrupt) time-variations of the offered workload $L_t$ could provide an effective means for attaining additional energy savings. Unfortunately, due to the product form of the constraints in (11.2) and the combined presence of $\Delta$ and $\{L_i, i = 1, \dots, M\}$ in the $\mathcal{X}(\cdot)$ function in (19), the CPOP in (18) is no longer a jointly convex optimization problem when $\Delta$ is included in the set of variables to be optimized. This implies, in turn, that the corresponding Karush–Kuhn–Tucker (KKT) conditions are no longer sufficient for analytically characterizing the global optimum of the CPOP. However, as also remarked in [14, Section 5.1], the Fixed Point Iteration Method (FPIM) (also referred to as the Gauss–Seidel Component Solution method [32, Section 3.1]) provides an effective (albeit suboptimal) formal tool for dealing with nonconvex optimization problems composed of the coupling of convex sub-problems. Specifically, by referring to the CPOP in (18), the FPIM iteratively computes the optimum values of a subset of variables (e.g., $\{f_i, L_i, i = 1, \dots, M\}$ or $\Delta$), while the values of the other ones (e.g., alternatively, $\Delta$ or $\{f_i, L_i, i = 1, \dots, M\}$) are held fixed. The iteration stops when the improvement in the value of the objective function in (18) is less than an assigned (small) threshold over two consecutive iterations [32, Section 3.1.2]. In detail, let the feasibility conditions of
Proposition 3 be met. Then, when $\Delta$ is fixed to a specific previously computed value $\Delta^*$, the CPOP in (18) is solved with respect to $\{f_i, L_i, i = 1, \dots, M\}$ according to Proposition 4. Vice versa, when $\{f_i, L_i, i = 1, \dots, M\}$ are held fixed to specific previously computed values $\{f_i^*, L_i^*, i = 1, \dots, M\}$, the CPOP is solved with respect to $\Delta$. Under the (additional) assumption that the $\mathcal{X}(\cdot)$ function in (19) is convex in $\Delta$ for any assigned $\{f_i, L_i, i = 1, \dots, M\}$, the constrained minimization of the objective function in (18) over $\Delta$ simply reduces to [22, Chapter 4]: (i) equate to zero the derivative $\partial \mathcal{X}(\cdot)/\partial \Delta$, and, then, solve the resulting algebraic equation with respect to $\Delta$; and (ii) project the so obtained solution onto the following closed interval:

$$\left[ \max_{i=1,\dots,M} \left\{ (L_i + L_b(i))/f_i \right\}, \ T_t - (2 L_t / R_t) \right],$$

that formally accounts for the box constraints in (11.2), (11.6) and (11.7) on the feasible values of $\Delta$.

The proposed FPIM allows an adaptive setting of $\Delta$ at run-time. Formally speaking, since, in our framework, the analytical conditions of [32, Proposition 1.7] are not assured to hold, we cannot guarantee that the proposed FPI procedure converges to a global optimum. However, we can state that the FPI procedure always converges to (at least) a local optimum. In fact, since, at each iteration, the optimal solution of each component sub-problem is computed, the current $\Delta^*$ and $\{f_i^*, L_i^*, i = 1, \dots, M\}$ assignments are improved. This formally guarantees that the overall FPI algorithm always approaches a solution that is (at least) locally optimal.

5.4. Cases of study and application examples

From an application point of view, it should be remarked that the Shannon-Hartley and $\alpha$-powered relationships in (3), (4), as well as the quadratic computing energy function in (2), are examples of functions of practical interest that meet all the assumptions of Proposition 4. Specifically, for the case of the Shannon-Hartley formula in (3), the derivative in (24) and its inverse specialize to

$$\frac{\partial P_i^{net}(R_i)}{\partial R_i} = (\ln 2) \frac{N_0^{(i)}}{g_i} 2^{R_i / W_i}, \qquad \left( \frac{\partial P_i^{net}}{\partial R_i} \right)^{-1}(y) = W_i \log_2 \left( \frac{2y}{TH(i)} \right), \tag{31}$$

with (see Eq. (22)): $TH(i) \equiv (2 (\ln 2) N_0^{(i)} / g_i)$.

Finally, for the case of the quadratic computing energy function in (2), the expression in (23.1) for the $i$th optimal processing frequency is directly computable as in

$$f_i = \left[ \frac{2 k_e f_i^0 + \nu_i \Delta}{2 k_e + 2 \omega(i) \mathcal{E}_i^{max} / (f_i^{max})^2} \right]_{f_i^{min}}^{f_i^{max}}. \tag{32}$$

The above examples support the practical effectiveness of the proposed solving approach.

6. Numerical results and performance comparison

We numerically evaluate the per-job average communication-plus-computing (i.e., total) energy $\mathcal{E}_{tot}$ consumed by the proposed optimal scheduler under both synthetically generated and measured realistic workload traces.

6.1. Simulated stochastic setting

Specifically, in order to stress the effects of the reconfiguration costs and the time-fluctuations of the offered workload on the energy performance of the simulated schedulers, as in [19], we model the workload size as an independent identically distributed (i.i.d.) random sequence $\{L_t(m T_t), m = 0, 1, \dots\}$, whose samples are $T_t$-spaced random variables that are uniformly distributed over the interval $[\bar{L}_t - a, \bar{L}_t + a]$, with $\bar{L}_t \equiv 8$ (Mbit). By setting the spread parameter $a$ to 2 (Mbit), 4 (Mbit), 6 (Mbit) and 8 (Mbit), we obtain PMRs of 1.25, 1.5, 1.75 and 2.0, respectively. These PMRs are quite common in large-to-medium size data centers that perform real-time Internet-based content delivery [10]. About the dynamic setting of $f_i^0$ in (11.1), at the first round of each batch of the carried out simulations, all the frequencies $f_i^0$'s are reset. Afterwards, at the $m$th round, each $f_i^0$ is set to the corresponding optimal value $f_i$ computed at the previous $(m-1)$th round. Furthermore, unless otherwise stated, the presented numerical results refer to the Shannon-Hartley power-rate function in (3), together with the quadratic computing energy function in (2). Each simulated point has been numerically evaluated by averaging over 1000 independent runs.

6.2. Impact of the LAN setup, hibernation phenomena and reconfiguration costs

The goal of a first set of numerical tests is to evaluate the effects on the per-job average consumed energy $\mathcal{E}_{tot}$ of the size M of the NetDC and the setting of the bandwidths $\{W_i\}$, noise levels $\{N_0^{(i)}\}$ and link gains $\{g_i\}$ of the employed LAN. For this purpose, we set [28]: $T_t = 5$ (s), $R_t = 100$ (Mb/s), PMR = 1.25, $k_e = 0.05$ (J/(MHz)$^2$), $f_i^{max} = 105$ (Mbit/s), $\omega(i) = 1$, $\mathcal{E}_i^{max} = 60$ (Joule), $\Delta = 0.1$ (s), $W_i = 1$ (MHz) and $L_b(i) = 0$.

Afterwards, since the bandwidth/noise/link-gain effects are captured by the corresponding ratios $\{\zeta_i\}$ (see (3)), we have numerically evaluated the average total energy consumption $\mathcal{E}_{tot}$ of the proposed optimal scheduler under the following settings: (i) $\zeta_i = 0.2$ (mW); (ii) $\zeta_i = 0.5$ (mW); (iii) $\zeta_i = [0.5 + 0.25(i-1)]$ (mW); and (iv) $\zeta_i = [0.5 + 0.5(i-1)]$ (mW), $i = 1, \dots, M$. The obtained numerical plots are drawn in Fig. 2. As could be expected, larger $\zeta_i$'s penalize the overall energy performance of the simulated NetDC. Interestingly, since $\mathcal{E}_{tot}$ is, by definition, the minimum energy when up to M VMs may be instantiated, at fixed positive $P_i^{net}$'s, $\mathcal{E}_{tot}$ decreases for increasing M and, then, it approaches a minimum value that does not vary when M is further increased (see the flat segment of the two uppermost plots of Fig. 2).

An instance of hibernation of the instantiated VMs is exemplified by the plots of Fig. 3. They refer to the considered application scenario with $\zeta_i = [0.5 + 0.5(i-1)]$ (mW), $f_i^0 = 0.2 f_i^{max}$, $i = 1, \dots, M$. Specifically, the upper bars of Fig. 3 report the (numerically evaluated) optimal average processing rates $f_i$'s, while the lower bars refer to the
3488 N. Cordeschi et al. / Computer Networks 57 (2013) 3479–3491
first nine VMs are permanently loaded, while the corresponding upper bars confirm, indeed, that all the available VMs constantly run at positive processing rates. This means that, in the considered application scenario, the last three VMs are hibernated by the optimal scheduler.

Interestingly, these plots show that $E_{tot}$ increases for growing $k_e$'s only for small values of $M$, while the optimal number $M^{*}$ of VMs to be instantiated (i.e., the right size of the NetDC) decreases for increasing $k_e$'s.
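The hibernation behaviour and the role of $k_e$ discussed in this subsection can be illustrated with a minimal Python sketch. It is not the authors' simulator: it only evaluates the closed forms of Eqs. (31), (32), (A.7) and (A.9) for the Shannon-Hartley case, under the simplifying assumptions that $\nu_i = 0$ for every VM (the cap $L_i \le \Delta f_i - L_b(i)$ is ignored) and that $\mu^{*}$ is found by bisection on the constraint $\sum_i L_i = L_t$. The function names, the total load of 40 (Mbit), the value $\nu_i = 1$ and the use of the setting-(iv) values as $N_0^{(i)}/g_i$ ratios are hypothetical choices made for illustration only.

```python
import math


def threshold(n0_over_g):
    """TH(i) = 2 ln(2) N0(i)/g_i, as defined together with Eq. (31)."""
    return 2.0 * math.log(2.0) * n0_over_g


def load(mu, n0_over_g, bandwidth, t_net):
    """Eq. (A.9) with the Shannon-Hartley inverse of Eq. (31), case nu_i = 0.

    t_net stands for (T_t - Delta). The result is 0 when mu <= TH(i),
    which is the hibernation condition of Eq. (A.7).
    """
    th = threshold(n0_over_g)
    if mu <= th:
        return 0.0
    return t_net * bandwidth * math.log2(mu / th)


def solve_mu(total_load, ratios, bandwidths, t_net, rel_tol=1e-12):
    """Find mu such that the loads add up to total_load.

    Bisection is an assumption of this sketch; all ratios must be positive.
    """
    def total(mu):
        return sum(load(mu, r, w, t_net) for r, w in zip(ratios, bandwidths))

    lo = min(threshold(r) for r in ratios)  # every load is zero at this mu
    hi = 2.0 * lo
    while total(hi) < total_load:           # grow the bracket up to the target
        hi *= 2.0
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if total(mid) < total_load:
            lo = mid
        else:
            hi = mid
    return hi


def frequency(f0, nu, delta, k_e, omega, e_max, f_min, f_max):
    """Eq. (32): closed-form processing rate, projected onto [f_min, f_max]."""
    raw = (2.0 * k_e * f0 + nu * delta) / (2.0 * k_e + 2.0 * omega * e_max / f_max ** 2)
    return min(max(raw, f_min), f_max)


M = 12
ratios = [0.5 + 0.5 * i for i in range(M)]  # illustrative N0(i)/g_i values
bandwidths = [1.0] * M                      # W_i = 1 (MHz)
T_NET = 5.0 - 0.1                           # T_t - Delta (s)
TOTAL_LOAD = 40.0                           # hypothetical L_t (Mbit)

mu_star = solve_mu(TOTAL_LOAD, ratios, bandwidths, T_NET)
loads = [load(mu_star, r, w, T_NET) for r, w in zip(ratios, bandwidths)]
hibernated = [i + 1 for i, x in enumerate(loads) if x == 0.0]
print("mu* =", round(mu_star, 4))
print("hibernated VMs:", hibernated)

# Larger reconfiguration costs k_e pull the rate of Eq. (32) towards f_i^0.
for k_e in (0.05, 0.5):
    print(k_e, frequency(f0=21.0, nu=1.0, delta=0.1, k_e=k_e,
                         omega=1.0, e_max=60.0, f_min=0.0, f_max=105.0))
```

In this simplified setting, the VMs whose threshold $TH(i)$ is not below $\mu^{*}$ receive zero load, which mirrors the hibernation condition. Because the computing energy and the per-VM load cap are left out, the number of hibernated VMs is not expected to match Fig. 3.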
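Fig. 3. Hibernation phenomena for the application scenario of Section 6.2 at M = 12 and PMR = 1.25. [Upper bars: $f_i^{*}$ (Mbit/s); lower bars: $L_i^{*}$ (Mbit/s); horizontal axis: VM index from 1 to 12.]

6.3. Performance comparison under synthetic time-uncorrelated workload profiles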
These conclusions are also supported by the numerical results of this subsection. They aim at unveiling the impact of the PMR of the offered workload on the average energy

[Figure panel: $k_e = 0.5$ (Joule/(MHz)$^2$)]

SEquential Scheduler (SES) [15]. Intuitively, we expect that the energy savings attained by dynamic schedulers increase when DVFS-enabled VMs are used, especially at large PMR values. However, we also expect that non-negligible reconfiguration costs (e.g., large $k_e$'s) may reduce the attained energy-saving and that the experienced reductions tend to increase at large PMRs. In order to validate
under the constraints in (11.1)–(11.8). We have evaluated the solution of the above sequential optimization problem by resorting to numerical computing tools, which we have implemented through Matlab routines. In doing so, we have also numerically ascertained that, at least in the simulated application scenarios, $I = 10$ slots suffice for attaining full energy-saving. Since the SES operates off-line, it cannot be really employed in hard real-time applications. However, by design, it fixes the ultimate performance attainable through dynamic resource provisioning policies. Hence, it allows us to (numerically) evaluate the ultimate performance loss suffered by dynamic schedulers that must work in real-time.

Tables 1 and 2 report the average energy savings (in percent) provided by the proposed scheduler and the sequential scheduler over the static one for the cases of medium reconfiguration costs (e.g., $k_e = 0.05$ (Joule/(MHz)$^2$)) and high reconfiguration costs (e.g., $k_e = 0.5$ (Joule/(MHz)$^2$)).

An examination of the numerical results reported in Table 1 leads to two main conclusions. First, the average energy-saving of the proposed dynamic scheduler over the static one approaches 94% at PMR = 2, even when the VMs are equipped with a limited number $Q = 6$ of discrete processing rates. Second, the performance loss suffered by the proposed (online) scheduler with respect to the sequential (off-line) one tends to increase for growing PMRs, but it remains limited to 5–6%. The same conclusions may be drawn from the data of Table 2. Specifically, although they refer to a computing platform with reconfiguration costs that are ten times higher than those of Table 1, the average energy reduction of the proposed scheduler over the static one still approaches 92% at PMR = 2, while the corresponding energy loss with respect to the sequential scheduler is still limited to 4–5%.

In order to test the energy performance of the proposed scheduler when the sequence of the offered workload is time-correlated and the FPI-based dynamic tuning of $\Delta$ of Section 5.3 is also implemented, we have considered the realistic workload trace reported in Fig. 14a of [29], which, as pointed out in [29, Section 6.1], is representative of a 1-h HTTP-type session arrival process actually measured at the Web servers of the 1998 Soccer World Cup site (see [29, Section 6] and references therein). The numerical tests carried out in this sub-section refer to the communication–computing infrastructure of Section 6.3 at $k_e = 0.5$ (Joule/(MHz)$^2$).

Furthermore, in order to maintain the peak workload still fixed at 16 (Mbit/slot), we assume that each arrival of the HTTP sessions in [29, Fig. 14a] carries a workload of 0.533 (Mbit). Hence, by referring to the described application scenario, a first set of numerical trials has been performed by statically setting $\Delta$ at 1.2 (s). We have numerically evaluated that, in this case, the average energy reduction of the proposed scheduler over the static one of Eq. (33) is around 27%, while the corresponding average energy reduction of the sequential scheduler over the proposed one is of the order of 5.5%. Afterwards, in a second set of numerical trials, we have still initialized $\Delta$ at 1.2 (s), and, then, we have included into the implementation of the proposed scheduler the FPI-based procedure of Section 5.3.

Hence, on the basis of the obtained numerical results, we have found that the average energy reduction of the FPI-equipped proposed scheduler over the static one approaches 40–42%, while the corresponding average energy loss with respect to the sequential scheduler scales down to 2.5–3.0%.
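The workload scaling used for the trace-driven tests can be checked with a short worked example. Fixing the peak workload at 16 (Mbit/slot) with 0.533 (Mbit) per arrival corresponds to a busiest slot of about 30 arrivals (16/0.533 ≈ 30). The Python sketch below reproduces this arithmetic on a made-up arrival sequence: the `arrivals` list and the energy values passed to `percent_reduction` are hypothetical and are NOT the trace of [29] or measured results, and PMR is read here as the peak-to-mean ratio of the per-slot workload.

```python
def per_arrival_workload(arrivals_per_slot, peak_workload):
    """Workload carried by each arrival so that the busiest slot equals peak_workload."""
    return peak_workload / max(arrivals_per_slot)


def peak_to_mean_ratio(workloads):
    """PMR, read as the peak workload divided by the mean workload."""
    return max(workloads) / (sum(workloads) / len(workloads))


def percent_reduction(reference_energy, energy):
    """Energy reduction (in percent) of a scheduler over a reference one."""
    return 100.0 * (reference_energy - energy) / reference_energy


arrivals = [12, 30, 18, 24, 9, 27]  # hypothetical arrivals per slot
unit = per_arrival_workload(arrivals, peak_workload=16.0)  # (Mbit) per arrival
workloads = [unit * a for a in arrivals]                   # (Mbit/slot)

print(round(unit, 3))                            # 0.533
print(round(peak_to_mean_ratio(workloads), 3))   # 1.5
print(round(percent_reduction(100.0, 73.0), 1))  # 27.0, with made-up energies
```

The same `percent_reduction` helper applies to any pair of schedulers, for instance the sequential scheduler taken against the proposed one.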
7. Conclusion
proposed dynamic scheduler over the state-of-the-art static one may approach 90%, even when the PMR of the offered workload is limited up to 2 and the number of different processing rates equipping each computing machine is limited up to 5–6. Interestingly, the corresponding average energy loss of the proposed scheduler with respect to the corresponding sequential one is limited to 4–6%, especially when the offered workload exhibits non-negligible time-correlation.

Appendix A. Derivations of Eqs. (23.1)–(25)

Since the constraint in (11.7) is already accounted for by the feasibility condition (20.3), without loss of optimality, we may directly focus on the resolution of the optimization problem in (18) under the constraints in (11.2), (11.3), (11.4), (11.5). Since this problem is strictly convex and all its constraints are linear, Slater's qualification conditions hold [22, Chapter 5], so that the KKT conditions [22, Chapter 4] are both necessary and sufficient for analytically characterizing the corresponding unique optimal global solution. Before applying these conditions, we observe that each power-rate function in (19) is increasing for $L_i \ge 0$, so that, without loss of optimality, we may replace the equality constraint in (11.3) by the following equivalent one: $\sum_{i=1}^{M} L_i \ge L_t$. In doing so, the Lagrangian function of the afforded problem reads as in

$$\mathcal{L}(\{L_i, f_i, \nu_i, \mu\}) \triangleq Z(\{L_i, f_i\}) + \sum_{i=1}^{M} \nu_i \left(L_i - f_i \Delta + L_b(i)\right) + \mu \left(L_t - \sum_{i=1}^{M} L_i\right), \tag{A.1}$$

where $Z(\{L_i, f_i\})$ indicates the objective function in (18.1), the $\nu_i$'s and $\mu$ are nonnegative Lagrange multipliers, and the box constraints in (11.4), (11.5) are managed as implicit ones. The partial derivatives of $\mathcal{L}(\cdot,\cdot)$ with respect to $f_i$, $L_i$ are given by

$$\frac{\partial \mathcal{L}(\cdot)}{\partial f_i} = \frac{\omega(i)\,E_i^{max}}{f_i^{max}}\,\frac{\partial \Phi_i(f_i/f_i^{max})}{\partial \eta_i} + 2k_e\left(f_i - f_i^{0}\right) - \nu_i \Delta, \quad i = 1, \ldots, M, \tag{A.2}$$

$$\frac{\partial \mathcal{L}(\cdot)}{\partial L_i} = 2\,\frac{\partial P_i^{net}}{\partial R_i}\!\left(\frac{L_i}{T_t - \Delta}\right) + \nu_i - \mu, \quad i = 1, \ldots, M, \tag{A.3}$$

while the complementary conditions [22, Chapter 4] associated to the constraints present in (A.1) read as in

$$\nu_i \left[L_i - f_i \Delta + L_b(i)\right] = 0, \quad i = 1, \ldots, M; \qquad \mu \left(L_t - \sum_{i=1}^{M} L_i\right) = 0. \tag{A.4}$$

Hence, by equating to zero Eq. (A.2) and, then, by solving the resulting algebraic equation with respect to $f_i$, we directly arrive at Eq. (23.1), which also accounts for the box constraint: $f_i^{min} \le f_i \le f_i^{max}$ through the corresponding projector operator. Moreover, a direct exploitation of the last complementary condition in (A.4) allows us to compute the optimal $\mu^{*}$ by solving the algebraic equation in Eq. (25). In order to obtain the analytical expressions for $L_i$ and $\nu_i$, we proceed to consider the two cases of $\nu_i > 0$ and $\nu_i = 0$. Specifically, when $\nu_i$ is positive, the ith constraint in (11.2) is binding (see Eq. (A.4)), so that we have

$$L_i = f_i \Delta - L_b(i), \quad \text{at } \nu_i > 0. \tag{A.5}$$

Hence, after equating to zero Eq. (A.3) and solving the resulting algebraic equation with respect to $\nu_i$, we obtain the following expression for the corresponding optimal $\nu_i$:

$$\nu_i = \mu^{*} - 2\,\frac{\partial P_i^{net}}{\partial R_i}\!\left(\frac{L_i}{T_t - \Delta}\right), \quad \text{at } \nu_i > 0. \tag{A.6}$$

Since $L_i$ must fall into the closed interval $[0, \Delta f_i - L_b(i)]$ for feasible CPOPs (see Eqs. (11.2), (11.5)), at $\nu_i = 0$, we must have: $L_i = 0$ or $0 < L_i < \Delta f_i - L_b(i)$. Specifically, we observe that, by definition, vanishing $L_i$ is optimal when $[\partial \mathcal{L}/\partial L_i]_{L_i = 0} \ge 0$. Therefore, by imposing that the derivative in (A.3) is nonnegative at $L_i = \nu_i = 0$, we obtain the following condition for the resulting optimal $\mu^{*}$:

$$\mu^{*} \le 2 \left[\frac{\partial P_i^{net}(R_i)}{\partial R_i}\right]_{R_i = 0} \triangleq TH(i), \quad \text{at } \nu_i = L_i = 0, \; i = 1, \ldots, M, \tag{A.7}$$

which proves, in turn, the validity of the hibernation condition in (26). Passing to consider the case of $\nu_i = 0$ and $L_i$ falling into the open interval $]0, \Delta f_i - L_b(i)[$, we observe that the corresponding KKT condition is unique, it is necessary and sufficient for the optimality and requires that Eq. (A.3) vanishes [22, Chapter 4]. Hence, the application of this condition leads to the following expression for the optimal $L_i$ (see Eq. (A.3)):

$$L_i = (T_t - \Delta) \left(\frac{\partial P_i^{net}(R_i)}{\partial R_i}\right)^{-1}\!\left(\frac{\mu^{*}}{2}\right), \quad \text{at } \nu_i = 0 \text{ and } 0 < L_i < \Delta f_i - L_b(i). \tag{A.8}$$

Eq. (A.8) vanishes at $\mu^{*} = TH(i)$ (see Eq. (A.7)), and this proves that the function $L_i(\mu)$ vanishes and is continuous at $\mu^{*} = TH(i)$. Therefore, since Eq. (A.7) already assures that vanishing $L_i$ is optimal at $\nu_i = 0$ and $\mu^{*} \le TH(i)$ and the (aforementioned) KKT optimality condition leading to Eq. (A.8) is unique, we conclude that the expression in (A.8) for the optimal $L_i$ must hold when $\nu_i = 0$ and $\mu^{*} \ge TH(i)$. This structural property of the optimal scheduler allows us to merge Eqs. (A.7), (A.8) into the following equivalent expression:

$$L_i = \left[(T_t - \Delta) \left(\frac{\partial P_i^{net}(R_i)}{\partial R_i}\right)^{-1}\!\left(\frac{\mu^{*}}{2}\right)\right]_{+}, \quad \text{for } \nu_i = 0, \tag{A.9}$$

so that Eq. (23.2) directly arises from Eqs. (A.5), (A.9). Finally, after observing that $\nu_i$ cannot be negative by definition, from Eq. (A.6) we obtain Eq. (24), where the projector operator accounts for the nonnegative value of $\nu_i$. This completes the proof of Proposition 4.

References

[1] R. Buyya, J. Broberg, A. Goscinski, Cloud Computing-Principles and Paradigms, first ed., Wiley, New York, 2011.
[2] J. Baliga, R.W.A. Ayre, K. Hinton, R.S. Tucker, Green cloud computing: balancing energy in processing, storage, and transport, Proc. IEEE 99 (1) (2011) 149–167.
[3] A. Mishra, R. Jain, A. Durresi, Cloud computing: networking and communication challenges, IEEE Commun. Mag. 50 (9) (2012) 24–25.
[4] A.L.F. Bittencourt, E.R.M. Madeira, N.L.S. de Fonseca, Scheduling in hybrid clouds, IEEE Commun. Mag. 50 (9) (2012) 42–47.
[5] D. Warneke, O. Kao, Exploiting dynamic resource allocation for efficient parallel data processing in the cloud, IEEE Trans. Paral. Distr. Syst. 22 (6) (2011) 985–997.
[6] O. Tamm, C. Hersmeyer, A.M. Rush, Eco-sustainable system and network architectures for future transport networks, Bell Labs. Tech. J. 14 (4) (2010) 311–327.
[7] K.H. Kim, A. Beloglazov, R. Buyya, Power-aware provisioning of cloud resources for real-time services, in: ACM MGC'09, Urbana Champaign, Illinois, USA, 2009.
[8] G. Keiser, Local Area Networks, second ed., McGraw Hill, 2002.
[9] T. Lu, M. Chen, L.L.H. Andrew, Simple and effective dynamic provisioning for power-proportional data centers, IEEE Trans. Paral. Distr. Syst. 24 (6) (2013) 1161–1171.
[10] V. Mathew, R. Sitaraman, A. Rowstrom, Energy-aware load balancing in content delivery networks, in: IEEE INFOCOM'12, Orlando, FL, USA, 2012, pp. 954–962.
[11] D. Zhu, R. Melhem, B.R. Childers, Scheduling with dynamic voltage/rate adjustment using slack reclamation in multiprocessor real-time systems, IEEE Trans. Paral. Distr. Syst. 14 (7) (2003) 686–700.
[12] H.J. Kushner, J. Yang, Analysis of adaptive step-size SA algorithms for parameter tracking, IEEE Trans. Autom. Control 40 (8) (1995) 1403–1410.
[13] O. Sinnen, Task Scheduling for Parallel Systems, first ed., Wiley, New York, 2007.
[14] J. Almeida, V. Almeida, D. Ardagna, I. Cunha, C. Francalanci, Joint admission control and resource allocation in virtualized servers, J. Paral. Distr. Comput. 70 (2010) 344–362.
[15] A. Beloglazov, R. Buyya, A. Zomaya, A taxonomy and survey of energy-efficient data centers and cloud computing systems, Adv. Comput. 82 (2011) 47–111.
[16] P. Padala, K.Y. You, K.G. Shin, X. Zhu, M. Uysal, Z. Wang, S. Singhal, M. Merchant, Automatic control of multiple virtualized resources, in: EuroSys '09, Nuremberg, Germany, 2009, pp. 13–26.
[17] D. Kusic, J.O. Kephart, N. Kandasamy, G. Jiang, Power and performance management of virtualized computing environments via lookahead control, J. Cluster Comput. 12 (1) (2009) 1–15.
[18] S. Govindan, J. Choi, B. Urgaonkar, A. Sasubramanian, A. Baldini, Statistical profiling-based techniques for effective power provisioning in data centers, in: EuroSys'09, New York, NY, USA, 2009, pp. 317–330.
[19] R. Urgaonkar, M.C. Kozat, K. Igarashi, M.J. Neely, Dynamic resource allocation and power management in virtualized data centers, in: IEEE/IFIP NOMS'10, Osaka, Japan, 2010, pp. 479–486.
[20] M. Lin, A. Wierman, L. Andrew, E. Thereska, Dynamic right-sizing for power-proportional data centers, in: IEEE/INFOCOM'11, Shanghai, China, 2011, pp. 1098–1106.
[21] M.J. Neely, E. Modiano, C.E. Rohs, Power allocation and routing in multi beam satellites with time-varying channels, IEEE/ACM Trans. Netw. 19 (1) (2003) 138–152.
[22] M.S. Bazaraa, H.D. Sherali, C.M. Shetty, Nonlinear Programming, third ed., Wiley, New York, 2006.
[23] R. Koller, A. Verma, A. Neogi, WattApp: an application aware power meter for shared data centers, in: ICAC'10, Washington, DC, USA, 2010, pp. 31–40.
[24] M. Portnoy, Virtualization Essentials, first ed., Wiley, New York, 2012.
[25] M. Chiang, S.H. Low, A.R. Calderbank, J.C. Doyle, Layering as optimization decomposition: a mathematical theory of network architectures, Proc. IEEE 95 (1) (2007) 255–312.
[26] J.J. Chen, T.W. Kuo, Multiprocessor Energy-efficient Scheduling for real-time tasks with different power characteristics, in: ICCP'05, Oslo, Norway, 2005, pp. 13–20.
[27] G. Laszewski, L. Wang, A.J. Young, X. He, Power-aware scheduling of virtual machines in DVFS-enabled Clusters, in: IEEE CLUSTER'09, New Orleans, LA, USA, Sep. 2009, pp. 1–9.
[28] H. El-Rewini, M.A. El-Barr, Advanced Computer Architecture and Parallel Processing, first ed., Wiley, New York, 2005.
[29] B. Urgaonkar, G. Pacifici, P. Shenoy, M. Spreitzer, A. Tantawi, Analytic modeling of multitier Internet applications, ACM Trans. Web (TWEB) 1 (1) (2007).
[30] K.H. Kim, R. Buyya, J. Kim, Power aware scheduling of bag-of-tasks applications with deadline constraints on DVS-enabled clusters, in: IEEE CCGRID International Symposium, Rio De Janeiro, Brazil, 2007, pp. 541–548.
[31] K. Li, Performance analysis of power-aware task scheduling algorithms on multiprocessor computers with dynamic voltage and speed, IEEE Trans. Par. Distr. Syst. 19 (11) (2008) 1484–1497.
[32] D.P. Bertsekas, J.N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, first ed., Athena Scientific, 1997.
[33] F. Gebali, Algorithms and Parallel Computing, first ed., Wiley, New York, 2011.

Nicola Cordeschi received the Laurea degree (summa cum laude) in Communication Engineering from the University of Rome "La Sapienza" in 2004. He received the Ph.D. degree in Information and Communication Engineering in 2008. His Ph.D. dissertation was on the adaptive QoS Transport of Multimedia over Wireless Connections via cross-layer approaches based on the Calculus of Variations. He is currently a Contractor-Researcher with the DIET Dept., University of Rome "La Sapienza". His research activity is focused on wireless communications and deals with the design and optimization of high-performance transmission systems for wireless multimedia applications.

Mohammad Shojafar received his B.S. in Computer Engineering-Software major at Iran University of Science and Technology, Tehran, Iran (2001–2006) and M.Sc. at Qazvin Islamic Azad University, Qazvin, Iran (2007–2010). He is currently a Ph.D. student in Information and Communication Engineering at DIET Dept. of the "La Sapienza" University of Rome. His current research focuses on wireless communications, distributed computing and optimization.

Enzo Baccarelli received the Laurea degree (summa cum laude) in electronic engineering and Ph.D. degree in Communication Theory and Systems, both from the University "La Sapienza" in 1989 and 1992, respectively. In 1995, he received the Post-Doctorate degree in Information Theory and Applications from the INFOCOM Dept., University "La Sapienza" where he also served as Research Scientist from 1996 to 1998. Since 1998 he has been an Associate Professor in signal processing and radio communications at the University "La Sapienza". Since 2003, he has been a full Professor in data communication at the University "La Sapienza".