Repair Codes
Francesco De Pellegrini, Rachid El Azouzi, Alonso Silva and Olfa Hassani
Abstract: High availability of containerized applications requires robust storage of the applications' state. Since basic replication techniques are extremely costly at scale, storage space requirements can be reduced by means of erasure and/or

containerized applications, whenever a container fails, such failure can be masked, while the related traffic and tasks are redirected to healthy replicas. Incidentally, this is also the
arXiv:1711.03034v1 [cs.IT] 8 Nov 2017
optimal repairing MBR codes set $\alpha = 2dB/[k(2d-k+1)]$ and $\beta = 2B/[k(2d-k+1)]$ [14].

In order to obey availability constraints, we assume that repairing operations need to complete by time horizon $T$, i.e., it must hold $X_d(T) = n$. Once the regeneration procedure through repair codes is completed, the full set of $n$ operational repairing nodes is restored. We model such procedure as follows. First, new repairing servers are activated, e.g., by adding a new physical node to the datacenter, or by installing dedicated storage virtual machines on servers already part of the fabric. They can be switched on at a maximum rate $\lambda$; the activation process is a Poisson process with rate $\lambda$, i.e., at most $\lambda > 0$ new replacement servers can be activated per second.

Once activated, a repairing server downloads parity information from $d$ operational repairing servers. We assume that each chunk transfer requires an exponential random time with mean $1/\mu > 0$. The regeneration procedure has two cost components:
i. activation cost: activating a new repairing server has a cost $c_1$ per repairing server, due to the usage of legacy hardware in the datacenter and the related setup costs;
ii. transfer cost: data transfer has a cost $c_2$ per bit, hence a chunk transfer has a cost $c_2 \beta$.

During the regeneration process, due to hardware and/or software issues, failure of repairing servers may occur as well; failure instants are modeled as exponential random variables of parameter $\theta$.

The number of newly activated servers is denoted by $X_0(t)$, whereas $X_k(t)$ denotes the number of replacement servers that have $k$ repair chunks, for $k = 1, \ldots, d$. Only nodes retrieving $d$ chunks are operational replacement nodes: for notation's sake, we shall consider $X_d(t)$ the whole set of repairing nodes, i.e., those include the $n - r$ which have not crashed. Restoration of the system using repair codes is possible if and only if $X_d(t) \ge d$ at each point in time (if $k \le X_d(t) < d$, only full restoration is possible; if $X_d(t) < k$, the containers' state is lost).

A. Markov model and fluid approximation

We shall study how to optimally activate new repairing servers in order to successfully restore all $n$ servers within finite time horizon $T$ at minimum cost. We start by assuming a stochastic control, namely, the probability $u$ that a replacement server is activated. The activation rate of new repairing servers is $\lambda u(t)$. The control acts by thinning the maximum activation rate $\lambda$, which can be easily implemented by randomly sampling servers to be activated. Thus, $\lambda u(t)$ is the rate at which replacement servers become active subject to stochastic control $u(t)$. Let us define the state of the system as $X = (X_0, X_1, \ldots, X_d)$, where $X_k$ denotes the number of servers which have retrieved the content from $k$ repairing servers. The state $X(t)$ has a dynamics described by a continuous time Markov decision process (MDP), where we observe that all states $X$ such that $X_d < d$ are absorbing, since no repairing is possible.

Let us assume that once $k$ chunks are acquired, the repair process proceeds by downloading from the remaining $d - k$ repairing servers. Hence, for any initial state $x$, we can write the entries of the transition probability matrix

$$
P_{x,x'}(dt) = P\{X_{t+dt} = x' \mid X_t = x\} =
\begin{cases}
\lambda u(t)\, dt & \text{if } x' = x + e_0 \\
\theta x_0\, dt & \text{if } x' = x - e_0 \\
(d-k+1)\,\mu\, x_{k-1}\, dt & \text{if } x' = x + e_k - e_{k-1} \\
\theta x_k\, dt & \text{if } x' = x - e_k \\
o(dt) & \text{otherwise}
\end{cases}
\tag{1}
$$

where $e_k$ is the $k$-th element of the standard basis. The first row describes the event of activation and the second row the failure of a newly activated repairing server, respectively. The third row describes the acquisition of a repair chunk by a repairing node having $k - 1$ chunks, and the fourth row describes the failure of a node having retrieved $k$ chunks. The last row states that multiple transitions are negligible in the corresponding infinitesimal generator.

The process of regeneration of the servers can be studied using a fluid model. Due to the structure of system (1), the mean-field approximation can be proved tight for $n$ in the order of a few tens [21]. By using the resulting fluid approximation, in the next section we shall obtain an optimal control problem in continuous time.

The control space $\mathcal{U}$ is the set of the piecewise continuous functions taking values in $[0, 1]$. The dynamics of the number of repairing servers thus writes

$$
\begin{aligned}
\dot X_0(t) &= -\theta_0 X_0(t) + \lambda u(t) = f_0(X, u, t) \\
\dot X_1(t) &= -\theta_1 X_1(t) + d\,\mu\, X_0(t) = f_1(X, u, t) \\
&\;\;\vdots \\
\dot X_k(t) &= -\theta_k X_k(t) + (d-k+1)\,\mu\, X_{k-1}(t) = f_k(X, u, t) \\
&\;\;\vdots \\
\dot X_d(t) &= -\theta_d X_d(t) + \mu\, X_{d-1}(t) = f_d(X, u, t)
\end{aligned}
\tag{2}
$$

The ODE system (2) represents the dynamics of the regeneration process. Here, $\theta_k = \theta + (d-k)\mu$ is the rate at which servers with $k$ chunks fail to repair plus the rate at which they receive a new chunk, thus joining those having $k + 1$ chunks. Also, the first equation of the ODE system (2), namely $f_0(\cdot)$, incorporates the activation of new peers at controlled rate $\lambda u(t)$.

IV. OPTIMAL CONTROL PROBLEM

The objective is to minimize the cost to restore the system by deadline $T$: the storage regeneration dynamics (2) is controlled by activation control $u$. Hence, the objective function writes

$$
J(u) = \int_0^T \Big[ c_1 \lambda\, u(v) + c_2 \beta \mu \sum_{i=0}^{d-1} (d-i)\, X_i(v) \Big]\, dv
\tag{3}
$$

where the first term appearing in the integral is the servers activation cost whereas the second one is the cost for trans-
ferring chunks to repair servers. We shall solve the following and n = max{ 0|X d (T ) n} and d = max{
optimization problem: 0| mint[0,T ] X d (t) d}.
Problem 1 (Optimal Storage Regeneration). Find a control In the rest of the paper we assume > 0 and feasibility in
policy u which solves: the sense meant by the previous statement.
System dimensioning. Lemma 1 provides indications for
min J(u)
uU dimensioning the system in order to guarantee feasible re-
s.t. Xd (t) d 0tT (4) generation. In particular, in the worst case we would need to
Xd (T ) = n transfer n d chunks to newly activated repair nodes. In turn,
one would choose the time horizon by which to repair, namely
where d Xd (0) n. T , and , i.e., the rate at which chunks can be transferred, and
In order for the repairing procedure to succeed, at least d the codes triple C = (n, k, d), such in a way to satisfy the
repair nodes must be present at all points in time. We observe assumptions of the above statement.
that, because (2) describes the deterministic dynamics of the
B. Relaxed problem
mean value of the underlying MDP, it is possible that some
sample paths do not satisfy the constraints, an event that should Constraint Relaxation. The terminal state constraint can be
occur with small probability. To this aim, is possible to tighten accounted by relaxing the problem in the form
constraints appearing in (4), in the form J (u) = J(u) + (n Xd (T )) (5)
d = (1 + 1 )d n = (1 + 2 )n, by means of the terminal cost function q(X) := (nXd (T )).
where 1 , 2 > 0 represent relative margins. In the rest of the We note that 0 has the role of a multiplier, and when the
paper, we shall refer to the case 1 = 2 = 0 without loss of constraint is active > 0.
generality. State Augmentation. In order to account for the first con-
Hereafter, we shall determine the conditions when the straint, we operate the augmentation of the state space by
problem is feasible, i.e., the set of solutions of the problem is introducing an auxiliary variable
not empty. Actually, we recall that, as long as k chunks exist in Xd+1 (t) = (Xd (t) d)2 1 {d Xd (t)}
the system, full restoration is still possible. However, we focus
solely on the cases when regeneration is feasible, which can where the indicating function 1 {x} = 1 if x > 0 and 1 {x} =
be determined easily by analysis of the uncontrolled dynamics, 0 if x < 0. Since
Z T
as discussed next. Xd+1 (t) = Xd+1 (v)dv + Xd+1 (0).
0
A. Feasibility and System Dimensioning We impose the auxiliary constraint Xd+1 (T ) = Xd+1 (0) = 0:
because Xd+1 (t) 0 for t [0, T ], when such two constraints
Let us denote X d (t) the dynamics corresponding to u(t) are satisfied, then Xd (t) d all over the interval [0, T ].
1 in the interval [0, T ]. Because the activation control is We denote the problem of minimizing J (u) the relaxed
basically slowing down the maximum activation rate , it holds problem and it will be solved next.
Xd (t) X d (t) for all t [0, T ]. Hence, it is immediate to
observe that the problem is feasible if and only the dynamics C. Hamiltonian formulation and Pontryagin Principle.
of X d is compatible with the constraints. Such condition can Let denote g(X, u, t) the instantaneous cost appearing inside
be derived in closed form. By writing the Laplace transform the integral cost (3). In order to solve the optimal control
of (2), i.e., Xk (s) = L{Xk (t)} we obtain problem, it is possible to write the Hamiltonian for the optimal
control problem in standard form
X 1 (s) Xd1 (s) + X d (0)
X 0 (s) = , X 1 (s) = , . . . , X d (s) = H(X, u, p) = p(t) f (X, u) + g(X)
s + 0 s + 1 s + d
d where p is the vector of co-state variables Hence, according
d (0)
which in turn provides X d (s) = Qd (s+d!
+ Xs+ . As to the Pontryagin Minimum Principle [22], [23], the optimal
k=0 k)
showed in the Appendix, the following closed form expression control u needs to satisfy
for the dynamics of the repairing servers holds:
d u(t) = arg min H(X, u, p)
uU
X d (t) = et 1 et + X d (0)
where the associated Hamiltonian system is
Feasibility conditions can be described in terms of the system
Xk = Hpk (X, u, p) (6)
parameters as follows:
d pk = HXk (X, u, p) (7)
Lemma 1. Problem 1 is feasible if and only if 1eT We have d + 1 terminal conditions in the form pk (T ) =
n eT X d (0) and it is so for any , where qXk (T ) = 0 for k = 0, 1, . . . , d 1, d + 1. Also, terminal
:= min{n , d } condition pd (T ) = qXd (T ) = holds.
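Before turning to the solution, the controlled dynamics (2) and the running cost (3) can be explored numerically. The following is a minimal sketch, not the authors' code: it integrates (2) with a forward Euler scheme and accumulates (3) for a user-supplied control. The function name `simulate_fluid` and the parameter names are labels chosen here for illustration (`lam` for the maximum activation rate, `mu` for the inverse of the mean chunk transfer time, `theta` for the failure rate, `c2_beta` for the cost of one chunk transfer); the step count is an arbitrary assumption.

```python
import numpy as np

def simulate_fluid(u, T, lam, mu, theta, d, xd0, c1, c2_beta, steps=20000):
    """Forward-Euler integration of the fluid ODE (2) and of the cost (3).

    u       : callable t -> control value in [0, 1]
    lam     : maximum activation rate (servers per second)
    mu      : chunk transfer rate, i.e. 1 / mean transfer time
    theta   : failure rate of a repairing server
    xd0     : operational servers at t = 0, i.e. X_d(0)
    c2_beta : cost of transferring one chunk
    Returns the state vector X(T) and the accumulated cost J(u).
    """
    dt = T / steps
    k = np.arange(d + 1)
    theta_k = theta + (d - k) * mu      # total outflow rate of state k
    gain = (d - k[1:] + 1) * mu         # inflow rate into state k from state k-1
    x = np.zeros(d + 1)
    x[d] = xd0
    cost = 0.0
    for i in range(steps):
        ut = u(i * dt)
        dx = -theta_k * x
        dx[0] += lam * ut               # controlled activation of new servers
        dx[1:] += gain * x[:-1]         # servers acquiring one more chunk
        # running cost: activations plus chunk transfers in progress
        transfers = mu * np.sum((d - k[:-1]) * x[:-1])
        cost += (c1 * lam * ut + c2_beta * transfers) * dt
        x = x + dx * dt
    return x, cost

# Example: always-on control, no failures, no transfer cost.
x_T, J = simulate_fluid(lambda t: 1.0, T=1.0, lam=10.0, mu=2.0, theta=0.0,
                        d=3, xd0=5.0, c1=1.0, c2_beta=0.0)
```

With no failures the scheme only moves servers between states, so the total number of servers at time T equals the initial ones plus those activated, which gives a quick sanity check of the implementation.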
V. SOLUTION

In order to solve the storage regeneration problem, we can write the Hamiltonian as

$$
\begin{aligned}
H(X, u, p) ={}& \big(c_1 + p_0(t)\big)\, \lambda\, u(t) - \theta_0 X_0(t)\, p_0(t) + c_2 \beta \mu \sum_{i=0}^{d-1} (d-i)\, X_i(t) \\
&+ \sum_{k=1}^{d} \big[ -\theta_k X_k(t) + (d-k+1)\,\mu\, X_{k-1}(t) \big]\, p_k(t) \\
&+ p_{d+1}(t)\, \big(X_d(t) - d\big)^2\, \mathbb{1}\{d - X_d(t)\}
\end{aligned}
\tag{8}
$$

We can hence derive from (7) the adjoint ODE system in the costate variables

$$
\begin{aligned}
\dot p_0 &= -H_{X_0} = \theta_0 p_0 - d\,\mu\, p_1 - c_2 \beta \mu\, d \\
\dot p_1 &= -H_{X_1} = \theta_1 p_1 - (d-1)\,\mu\, p_2 - c_2 \beta \mu\, (d-1) \\
&\;\;\vdots \\
\dot p_k &= -H_{X_k} = \theta_k p_k - (d-k)\,\mu\, p_{k+1} - c_2 \beta \mu\, (d-k) \\
&\;\;\vdots \\
\dot p_{d-1} &= -H_{X_{d-1}} = \theta_{d-1} p_{d-1} - \mu\, p_d - c_2 \beta \mu \\
\dot p_d &= -H_{X_d} = \theta_d p_d - 2\big(X_d(t) - d\big)\, \mathbb{1}\{d - X_d(t)\}\, p_{d+1} \\
\dot p_{d+1} &= 0
\end{aligned}
\tag{9}
$$

In what follows, we will derive the structure of the solutions of the optimal control problem. A bang-bang policy [22], [23] is one where $u(t)$ takes only extreme values, that is $u(t) = 1$ or $u(t) = 0$ a.e. in $[0, T]$.

Notice that bang-bang policies are very convenient for implementation purposes since they rely only on a set of switching epochs, where the control switches from 1 to 0 or vice versa. A threshold policy is one in the form

$$
u(t) =
\begin{cases}
0 & 0 \le t \le t_{\mathrm{on}} \\
1 & t_{\mathrm{on}} < t \le t_{\mathrm{off}} \\
0 & t_{\mathrm{off}} < t \le T
\end{cases}
\tag{10}
$$

Threshold policies are convenient since they depend on a pair of parameters only, namely thresholds $t_{\mathrm{on}}$ and $t_{\mathrm{off}}$.

Bang-bang structure. We observe that (8) is linear in the control $u$. Hence, because the optimal activation control minimizes the Hamiltonian, the optimal policy has to satisfy

$$
u(t) =
\begin{cases}
1 & \text{if } p_0(t) < -c_1 \\
0 & \text{if } p_0(t) > -c_1
\end{cases}
\tag{11}
$$

which depends on the dynamics of $p_0$, i.e., of the ODE system (9). Actually, in order to prove that the policy is bang-bang and non-degenerate, we need also to prove that the policy has a finite number of switches and that there are no singular arcs, i.e., no arcs where the Hamiltonian is null over an interval of positive measure.

Lemma 2. If the problem is feasible, the optimal policy is bang-bang with no singular arcs.

The dynamics of $p_0$ can be derived in closed form:

Lemma 3. It holds $p_0(t) = F(t) + G(t)$ where

$$
F(t) = -\gamma\, \big(1 - e^{-\mu (T-t)}\big)^{d}\, e^{-\theta (T-t)}
$$

$$
G(t) = c_2 \beta \mu\, d \sum_{k=0}^{d-1} \binom{d-1}{k} \int_0^{T-t} \big(e^{\mu v} - 1\big)^{k}\, e^{-(\theta + d\mu) v}\, dv
$$

Next, we characterize solutions of the relaxed problem which correspond to feasible solutions.

A. Pure Activation Cost

We start our analysis from the simpler case when the transfer cost is negligible compared to the activation cost, i.e., $c_2 = 0$. It is hence possible to derive explicit relations on the structure of the optimal control.

Theorem 1. If $c_2 = 0$, then a solution of the relaxed problem is a threshold policy, in particular:
i. Single switch: $t_{\mathrm{on}} = 0$ and $0 < t_{\mathrm{off}} < T$ iff $\theta \le \theta_0$;
ii. Null control: $0 = t_{\mathrm{on}} = t_{\mathrm{off}}$ iff $\theta > \theta_0$, and $m \ge -c_1$, where $m = \min_{v \in [0,T]} \{p_0(v)\}$;
iii. Double switch: $0 < t_{\mathrm{on}} < t_{\mathrm{off}} \le T$ iff $\theta > \theta_0$, and $m < -c_1$.
The critical value is

$$
\theta_0 := \max\Big\{0,\; \frac{d}{T} \log\Big( \sqrt[d]{\gamma / c_1}\, \big(1 - e^{-\mu T}\big) \Big) \Big\}
$$

while the switching epochs write $t_{\mathrm{on}} = \max\{0,\, T + \frac{1}{\mu} \log z_{\mathrm{on}}\}$, $t_{\mathrm{off}} = T + \frac{1}{\mu} \log z_{\mathrm{off}}$, where $z_{\mathrm{on}} \le z_{\mathrm{off}}$ are the two solutions for $0 \le z \le 1$ of the equation

$$
(1 - z) = \sqrt[d]{c_1 / \gamma}\;\, z^{-\theta / (d \mu)}
$$

B. General case

In the general case, it is sufficient to characterize the dynamics of the multiplier $p_0(t)$ in terms of the extremal points attained in the interior of $[0, T]$.

Lemma 4. Let $S(\gamma)$ be the set of the interior extremal points of $p_0(t)$ for a given choice of the constraint multiplier $\gamma$. Then, $S(\gamma)$ is one of the following forms: $\emptyset$, $\{M\}$, or $\{m, M\}$, where $m := p_0(t_m)$ denotes a minimum and $M := p_0(t_M)$ a maximum, and it holds $0 < t_m < t_M < T$.

Finally, the following result is proved in the Appendix.

Theorem 2. The optimal solution of the relaxed problem is a threshold control.

The optimal control is hence a threshold policy for which the presence of an initial delay, i.e., $t_{\mathrm{on}} > 0$, depends on the parameters of the system. However, as a straightforward application of the optimality principle, given an optimal threshold policy with $t_{\mathrm{on}}$ and $t_{\mathrm{off}}$, for a given pair $T$ and $r$, the new threshold policy where $t'_{\mathrm{on}} = 0$, $t'_{\mathrm{off}} = t_{\mathrm{off}} - t_{\mathrm{on}}$ is optimal for the problem where $r' = n - X_d(t_{\mathrm{on}}) \ge r$ and horizon $T' = T - t_{\mathrm{on}} < T$. Thus we obtain the optimal solution in threshold form with no initial delay for more conservative conditions, i.e., for smaller time horizon and larger number of failed servers, and yet having the same cost.
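The threshold structure can be checked numerically from the closed form of Lemma 3 and the switching rule (11). The sketch below is an illustration under stated assumptions rather than the authors' implementation: the symbol names (`gamma` for the constraint multiplier, `mu` for the chunk transfer rate, `theta` for the failure rate, `c2_beta` for the cost of one chunk transfer) are labels chosen here, the binomial sum inside G(t) is collapsed with the binomial theorem into a single exponential integral, and the sample values (in particular `mu = 1.9375`, i.e. one chunk of about 64.5 Mbytes per 1 Gbit/s link) are assumptions taken from the numerical setting discussed later.

```python
import math

def costate_p0(t, T, gamma, mu, theta, d, c2_beta):
    """Closed-form p0(t) = F(t) + G(t) as in Lemma 3."""
    w = T - t                                   # time left before the deadline
    F = -gamma * (1.0 - math.exp(-mu * w)) ** d * math.exp(-theta * w)
    # sum_k C(d-1,k) (e^{mu v} - 1)^k = e^{(d-1) mu v}, so the integrand of G
    # reduces to e^{-(theta + mu) v}, which integrates in closed form
    G = c2_beta * mu * d * (1.0 - math.exp(-(theta + mu) * w)) / (theta + mu)
    return F + G

def switching_epochs(T, gamma, mu, theta, d, c1, c2_beta, grid=20000):
    """Scan p0 on a grid; return u(0) and the times where rule (11) switches."""
    def u(t):
        return 1 if costate_p0(t, T, gamma, mu, theta, d, c2_beta) < -c1 else 0
    first = u(0.0)
    prev, epochs = first, []
    for i in range(1, grid + 1):
        t = T * i / grid
        cur = u(t)
        if cur != prev:
            epochs.append(t)
            prev = cur
    return first, epochs

# Assumed sample values: pure activation cost (c2 = 0), c1 = 10.
first, epochs = switching_epochs(T=3.5, gamma=12.7719, mu=1.9375, theta=0.001,
                                 d=20, c1=10.0, c2_beta=0.0)
```

Under these assumed values the control starts at 1 and switches off once, close to 1.22 s, which is the single-switch threshold form with no initial delay.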
Algorithm 1: Optimal Regeneration Control
 1: input: $T$, $\lambda$, $\mu$, $c_1$, $c_2$, $\epsilon$
 2: $\gamma_0$ s.t. $u$ from (11) is such that $X_d(T) \ge n$
 3: initialize: $\gamma_R \leftarrow \gamma_0$, $\gamma_L \leftarrow 0$, $i \leftarrow 0$
 4: while $|X_d(T) - n| > \epsilon$ do
 5:   Step $i \leftarrow i + 1$
 6:   $\gamma_i \leftarrow (\gamma_L + \gamma_R)/2$
 7:   Obtain $p_0(t)$, $t \in [0, T]$ solving backwards (9)
 8:   Calculate the optimal control $u_i$ according to (11)
 9:   Obtain $X_d(t)$, $t \in [0, T]$ solving forward (2)
10:   if $X_d(T) > n$ then
11:     $\gamma_R \leftarrow \gamma_i$
12:   else
13:     $\gamma_L \leftarrow \gamma_i$
14:   end if
15: end while
16: return $(u_i, \gamma_i)$

Note that, in the relaxed problem, we cannot exclude the null control $u \equiv 0$, i.e., when $p_0(0) > -c_1$ and $m > -c_1$. But it cannot solve the constrained problem: to do so we need to determine the optimal multiplier $\gamma^*$, as seen next.

C. Optimal multiplier

The discussion so far has addressed the relaxed problem, and the multiplier $\gamma$ has been treated as a constant for the sake of discussion. However, determining the optimal solution requires identifying a pair $(u^*, \gamma^*)$ where $u^*$ solves the original constrained problem.

The main result in this section is that we can calculate the value $\gamma^*$ using a simple bisection search as described in Algorithm 1, under the feasibility assumptions of Lemma 1. The algorithm starts by exploring the interval $\gamma \in [0, \gamma_0]$, where $\gamma_0 > 0$ is a suitably large value such that it holds $X_d(T) \ge n$. At lines 7 to 9 it solves the optimal control problem, determining finally the terminal value $X_d(T)$ within a certain tolerance $\epsilon > 0$.

The search algorithm leverages the fact that the terminal number of repair servers is monotone in $\gamma$. In fact, when the terminal value exceeds $n$, it explores on the left of the current interval, i.e., it searches in $[\gamma_L, \gamma]$. Vice versa, when the terminal value is below $n$, it explores the right interval $[\gamma, \gamma_R]$.

The formal justification of the correctness of the above search strategy, and the optimality of the output of the algorithm, is summarized by the following result, proved in the Appendix.

Theorem 3. Under the assumptions of Lemma 1, the optimal pair $(u^*, \gamma^*)$ which solves the relaxed problem is unique, $u^*$ solves Prob. 1, and $\gamma^*$ can be approximated using a bisection search as in Alg. 1.

VI. NUMERICAL RESULTS

This section presents some numerical results on optimal storage regeneration under a realistic parameter setting. It also serves the purpose of explaining how to make use of the proposed model to characterize limit performance of the regeneration technique under prescribed deadline constraints.

We have assumed a reference $C = (n, k, d)$ MBR repairing code. The parameters of the code are $n = 50$, $k = 10$ and $d = 20$ [17] (footnote 1). Also, the reference container state size is assumed $B = 10$ Gbytes. We recall that, based on the fundamental relation on MBR codes, we can derive the chunk size as $\beta = 2B/(k(2d-k+1))$ [14], which in this case amounts to $\beta = 64.5161$ Mbytes.

Footnote 1: In [17] the code redundancy targets storage availability of 0.99.

The numerical setting is completed by assuming that repairing servers may fail according to rate $\theta = 0.001\ \mathrm{s}^{-1}$ (we recall that in our model server failures during restoration are exponential random variables of parameter $\theta$). Furthermore, the maximum rate at which repairing servers can be activated is set as $\lambda = 10$ servers/s. Also, we need to make assumptions on the available network throughput: in our scenario, the throughput available for repairing operations is 1 Gbit/s. This value matches link speeds of production datacenters: peak bitrates for repair chunks transfer can be attained when performing restoration in priority, i.e., giving highest priority to the traffic operating the transmission of repairing chunks. The resulting target horizon for repairing has been set to $T = 3.5$ s, which is feasible given the setting considered.

Fig. 2a and Fig. 2b depict the results of the optimal activation control in case of simultaneous failure of $r = 11$ servers at time $t = 0$. We have reported the dynamics of the costate variable $p_0(t)$, superimposed on the switching threshold value, namely $-c_1$ (upper graph), the graph of the corresponding optimal control dynamics (middle graph) and the one corresponding to the dynamics of the number of repairing servers $X_d(t)$ (bottom graph).

In both cases, the optimal multiplier has been determined using Algorithm 1 with tolerance $\epsilon = 0.05$. In particular, in Fig. 2a we have considered the case of a null communication cost $c_2 = 0$, which corresponds to $\gamma^* = 12.7719$, whereas in Fig. 2b we have considered $c_2 = 100$ dollars/Gbyte, for which the optimal cost is attained for $\gamma^* = 175.855$. In both cases the threshold policy is such that the pair $t_{\mathrm{on}} = 0$ s and $t_{\mathrm{off}} = 1.22$ s identifies the unique control driving the dynamics to satisfy terminal state constraint $X_d(T) = n$.

Fig. 2c contains two tables calculated for different values of the costs $c_1$ and $c_2$. They report the value of the optimal cost $J(u^*)$. We note that, as expected, it increases with both costs $c_1$ and $c_2$. Also, we observe the same behavior for $\gamma^*$: the optimal multiplier value increases and we ascribe this behavior to the fact that the value of $\gamma^*$ has to enforce the terminal state constraint against augmented running costs $c_1$ and $c_2$.

VII. CONCLUSIONS

In this paper we have presented an analytical framework for the optimal control of state regeneration, a promising technology in order to offer high availability of containerized applications at scale and ease stateful containers migration. The idea is that leveraging the network filesystem, it is possible
[Figure 2. Panels a) and b) each show, from top to bottom, p0(t) (curves labeled "Numerical" and "Theory"), u(t) and Xd(t), all plotted against t from 0 to 3 s. Panel c) consists of the two tables below.]

Optimal cost:
             c2 = 0      c2 = 10     c2 = 100
c1 = 1       12.2        169.0       1580.6
c1 = 10      122.5       279.1       1691.9
c1 = 20      244.9       401.3       1812.9

Optimal multiplier:
             c2 = 0      c2 = 10     c2 = 100
c1 = 1       1.2766      17.5851     164.0627
c1 = 10      12.8000     29.1024     175.8790
c1 = 20      25.5990     41.7977     188.2813
Figure 2. Optimal regeneration control a) zero communication cost b) c2 = 100 dollar/Gbyte c) The optimal cost and the optimal multiplier as function of
costs c1 and c2 .
to decouple the storage of containers state and the execution of application images running in pods.

We have studied optimal time-constrained regeneration, a crucial aspect to ensure high availability in the containers state access. Under failure of a number of servers, regeneration is performed by transferring repairing chunks to newly deployed, clean slate repair servers. This occurs at a communication cost and at a server activation cost. The optimal activation strategy is of threshold-type and can be evaluated in closed form.

This work has been motivated by the limited number of studies on storage regeneration at system level [17] and it is by no means conclusive. Indeed, several research directions are due in order to understand the potential of these novel restoration techniques in cloud systems.

The first one relates to the frequency of updates of the containers state, a design choice required in order to decide how often to dump the containers state onto the network filesystem. Such rate determines how much of the computation already elapsed can be recovered using regeneration.

Another relevant issue is the case of repeated failures. Actually, the information on where faults are more likely becomes available to the administrator over time, e.g., based on direct observation or online learning techniques. The optimal policy may in turn span several cycles of faults/restorations and would account for techniques to learn the a posteriori distribution of faults over which to operate the optimal control.

Also, correlated faults described in this work are simultaneous. In reality, they may be scattered in time, e.g., due to cascading failures. Under such fault dynamics, the optimal control studied in this work may be suboptimal. New models should identify how to counter the effect of later additional faults occurring during regeneration.

REFERENCES

[1] B. Burns, B. Grant, D. Oppenheimer, E. Brewer, and J. Wilkes, "Borg, Omega, and Kubernetes," Comm. of the ACM, vol. 59, no. 5, pp. 1837-1852, May 2016.
[2] W. Li and A. Kanso, "Comparing containers versus virtual machines for achieving high availability," in Proc. of IEEE IC2E, Tempe, US, March 9-12 2015.
[3] V. Salapura, R. Harper, and M. Viswanathan, "ResilientVM: High performance virtual machine recovery in the cloud," in Proc. of ACM AIMC, Bordeaux, France, Apr 21-24 2015, pp. 7-12.
[4] J. Gray and D. P. Siewiorek, "High-availability computer systems," Computer, vol. 24, no. 9, pp. 39-48, 1991.
[5] Infinit International Inc, https://infinit.sh/documentation/reference.
[6] Docker, "Docker: The linux container engine," http://www.docker.io.
[7] D. Borthakur et al., "Apache Hadoop goes realtime at Facebook," in Proc. of ACM SIGMOD PODS, Athens, Greece, June 12-16 2011.
[8] R. J. Chansler, "Data availability and durability with the hadoop distributed file system," The USENIX Magazine, vol. 37, no. 1, February 2012.
[9] S. Ghemawat, H. Gobioff, and S.-T. Leung, "The Google file system," SIGOPS Oper. Syst. Rev., vol. 37, no. 5, pp. 29-43, Oct. 2003.
[10] A. Lakshman and P. Malik, "Cassandra: A decentralized structured storage system," SIGOPS Oper. Syst. Rev., vol. 44, no. 2, pp. 35-40, Apr. 2010.
[11] A. Cidon, S. Rumble, R. Stutsman, S. Katti, J. Ousterhout, and M. Rosenblum, "Copysets: Reducing the frequency of data loss in cloud storage," in Proc. of USENIX ATC, San Jose, US, June 26-28 2013.
[12] A. Cidon, R. Escriva, S. Katti, M. Rosenblum, and E. G. Sirer, "Tiered replication: A cost-effective alternative to full cluster geo-replication," in Proc. of USENIX ATC, Santa Clara, CA, July 8-10 2015.
[13] K. V. Rashmi, N. B. Shah, D. Gu et al., "A solution to the network challenges of data recovery in erasure-coded distributed storage systems: A study on the Facebook warehouse cluster," in Proc. of USENIX HotStorage, San Jose, CA, June 27-28 2013.
[14] N. B. Shah, K. V. Rashmi, P. V. Kumar, and K. Ramchandran, "Distributed storage codes with repair-by-transfer and nonachievability of interior points on the storage-bandwidth tradeoff," IEEE Trans. Information Theory, vol. 58, no. 3, pp. 1837-1852, 2012.
[15] D. S. Papailiopoulos and A. G. Dimakis, "Locally repairable codes," IEEE Trans. Information Theory, vol. 60, no. 10, pp. 5843-5855, Oct 2014.
[16] M. Sathiamoorthy, M. Asteris, D. Papailiopoulos, A. G. Dimakis et al., "Xoring elephants: novel erasure codes for big data," in Proc. of PVLDB, Riva del Garda, Italy, August 26-30 2013.
[17] S. Jiekak, A.-M. Kermarrec, N. Le Scouarnec, G. Straub, and A. Van Kempen, "Regenerating codes: A system perspective," SIGOPS Oper. Syst. Rev., vol. 47, no. 2, pp. 23-32, Jul. 2013.
[18] H. Weatherspoon and J. Kubiatowicz, "Erasure coding vs. replication: A quantitative comparison," in Proc. of IPTPS, Cambridge, MA, USA, March 7-8 2002.
[19] O. Khan, R. C. Burns, J. S. Plank, W. Pierce, and C. Huang, "Rethinking erasure codes for cloud file systems: minimizing I/O for recovery and degraded reads," in Proc. of USENIX FAST, San Jose, US, February 14-17 2012.
[20] C. Huang, H. Simitci, Y. Xu et al., "Erasure coding in Windows Azure storage," in Proc. of USENIX ATC, Boston, MA, June 26-28 2012.
[21] E. Altman, L. Sassatelli, and F. De Pellegrini, "Dynamic control of coding for progressive packet arrivals in DTNs," IEEE Trans. on Wireless Comm., vol. 12, no. 2, pp. 725-735, 2013.
[22] G. Leitmann, An introduction to optimal control. McGraw-Hill, 1966.
[23] D. E. Kirk, Optimal Control Theory. An Introduction, 13th ed. Prentice Hall, 2004.
A PPENDIX Qd (s) = s+ , it follows
P ROOF OF L EMMA . 1 d1
d d! c2 d X d 1 i+1 i!
Q0 (s) = Qd + Q
Proof: Feasibility is indeed equivalent to X d to respect k s i=0 i k h
T d k=0 (s + ) h=0 (s + )
the constraints. Condition 1 e d X d (0) ensures
In order to obtain Qthe closed form of p0 (t), auxiliary expres-
that sup X d (T ) n, which is attained for = 0. We k
sions of the kind h=0 (s + )h have to be inverted. Let us
observe that d is also well defined: g() = inf t[0,T ] X d (t)
is a continuous function of . Because inf g() = 0 and
denote fh (t) := e h t
1 {t 0}, for the sake of notation. By
recalling L{et 1 {t 0}} = 1/(s + ), it is possible to
g(0) = Xd (0) d, there exists a value of that satisfies
calculate
the definition. The statement follows immediately from the ( k )
ei t 1 {t 0}
Y Xn
definition of d and n and from a continuity argument. 1 h Qn
L (s + ) = f1 . . . fn =
j=0 j i
h=0 i=0
P ROOF OF L EMMA . 2 j6=i
ei t 1 {t 0} X (1)ni
n
X n
Proof: Preliminarily, let observe that a feasible solution = Qn = fi (t) (13)
must be such that Xd (t) d, for t [0, T ]. Thus, the dual i=0
n j=0 i j i=0
n i!(n i)!
j6=i
ODE system has to be solved as in the non-augmented case,
The statement follows after some algebraic manipulations of
where it holds pd = d pd . Hence, since the Hamiltonian is
the above expression.
linear in the control, a feasible policy is a bang-bang one.
In order to exclude the presence of singular arcs, we need to
exclude the possibility that c1 +p0 (t) = 0 over an interval I of P ROOF OF T HM . 1
positive measure. We shall prove that multiplier p0 s cannot be Proof: From Lemma 3, if c2 = 0, it follows
a constant over any interval I of positive measure (I) > 0,
and this guarantees that the control is actually bang-bang [22]. p0 (t) = e(T t) (1e(T t) )d1 +(+d)e(T t)
Let assume that p0 is a constant p0 = c1 over interval I: from which it is immediate to observe that the absolute mini-
hence all its k-th order derivatives vanish in I. But, it follows
mum over the real line is attained at tmin = T 1 log 1+ d ;
from (9) that p1 = d0 p0 : thus p1 is also a constant over I, and d d+
1 the minimum writes m := ( d ) /(1 + d ) .
since p2 = (d1) p1 , p2 as well. We hence iteratively obtain
Switching epochs ts are determined by the instants solving
that pi is a constant for i = 1, , 2, . . . , d. However, pd = d pd ,
p0 (ts ) = c1 . First, let observe that p0 (T ) = 0 > c1 and
so that 0 = pd = pd1 = . . . = p1 . Finally, p0 = 0, which is
p0 (T ) = d, so that the control is indeed null in a left interval
a contradiction.
of T . In particular, it is possible to identify three cases: for
P ROOF OF L EMMA . 3 a given value of > 0, there might exist either two, one or
zero switching epochs in the interior of [0, T ]. We consider
Proof: The adjoint ODE system can be solved via Laplace the three cases separately.
transform. We make the replacement qk (v) = pk (T t), thus
Case i: single switch. The condition for a unique switching
considering the backward time variable v = T t. It holds
epoch is p0 (0) < c1 , which writes (1 eT )d eT > c1 ,
qk (v) = pk (t), so that system (9) writes
so that
q0 = 0 q0 + d q1 + c2 d d p
.. > log d /c1 (1 eT ) := 0
T
.
By inspection of (14), due to the continuity of p0 , there exists
qk = k qk1 + (d k) qk+1 + c2 (d k), switching epoch 0 < toff < T such that p0 (toff ) = c1 . Because
k = 1, . . . , d 1 p0 (t) has unimodal structure, such switch is unique so that the
.. corresponding optimal control is in threshold form. Namely,
. u(t) = 1 for 0 t < toff and zero otherwise.
qd = d qd + 2(Xd (t) d) 1 {d Xd (t)} qd+1 Case iii: two switches. Condition p0 (0) > c1 leads to a
qd+1 = 0 non-null control if and only if m < c1 . From the unimodal
structure of p0 , and from classic continuity arguments, there
Let Qk (s) = L{qk (v)} for t = 0, 1, . . . , d be the Laplace
exist two real values, namely ton < tmin < toff where p0 (ton ) =
transform of the k-th variable qk . The corresponding system
c1 = p0 (toff ), so that u(t) = 1 for ton < t < toff and zero
writes
otherwise.
1
sQk (s) = k Qk (s) + (d k)Qk+1 (s) + c2 (d k) , Case ii: no switch. This is the case ton = toff = 0, i.e., the
s
for k = 0, 1, . . . , d 1 optimal control is the null one. It occurs when p0 (0) > c1
and m c1 .
sQd (s) = d Qd (s) (12)
Finally, the explicit expression of the switching epochs is
(dk) c2 (dk)
from which Qk (s) = (s+k ) Qk+1 (s) + s(s+k )
is obtained. obtained by solving equation p0 (t) = c1 , which concludes
By iterative replacement, and by accounting for the fact that the proof.
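The case analysis in the proof of Theorem 1 can be turned into a small classifier for the pure activation cost case. The sketch below is illustrative only and rests on assumptions: the costate for c2 = 0 is taken as the product of the multiplier, a survival factor and the probability that all d chunks arrive in the remaining time, and the names `gamma` (multiplier), `mu` (chunk transfer rate) and `theta` (failure rate, assumed positive) are labels chosen here. It evaluates the location of the unconstrained minimum, the minimum over [0, T], and returns which of the three cases applies.

```python
import math

def p0_hat(t, T, gamma, mu, theta, d):
    """Costate p0(t) when c2 = 0 (assumed closed form)."""
    w = T - t
    return -gamma * math.exp(-theta * w) * (1.0 - math.exp(-mu * w)) ** d

def t_min(T, mu, theta, d):
    """Point on the real line where p0_hat attains its minimum (theta > 0)."""
    return T - math.log(1.0 + d * mu / theta) / mu

def classify(T, gamma, mu, theta, d, c1):
    """Return 'single switch', 'null control' or 'double switch'."""
    tm = t_min(T, mu, theta, d)
    start = p0_hat(0.0, T, gamma, mu, theta, d)
    # p0_hat decreases before tm and increases after it, so the minimum over
    # [0, T] sits at tm when tm is inside the horizon and at t = 0 otherwise
    m = p0_hat(tm, T, gamma, mu, theta, d) if tm > 0.0 else start
    if start < -c1:
        return "single switch"       # control is on at t = 0, off before T
    if m >= -c1:
        return "null control"        # p0 never drops below the threshold
    return "double switch"           # on and off epochs both inside (0, T)

# Assumed sample values, one per case.
case_a = classify(T=3.5, gamma=12.7719, mu=1.9375, theta=0.001, d=20, c1=10.0)
case_b = classify(T=3.0, gamma=5.0, mu=2.0, theta=2.0, d=2, c1=1.0)
case_c = classify(T=3.0, gamma=100.0, mu=2.0, theta=2.0, d=2, c1=1.0)
```

The second and third samples use a deliberately high failure rate so that the minimum of the costate falls inside the horizon, which is the only situation in which an initial delay can appear.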
P ROOF OF L EMMA . 4 vi. S = {m, M } with p0 (0) > c1 with m > c1 implies
Proof: It is possible to write the derivative of the mul- a two-switch control with ton > 0 and 0 < ton < toff < T .
tiplier p0 (t) in a convenient form. For notations sake, we This concludes the proof, since in all cases the optimal bang-
denote p0 (t) the expression of p0 (t) when c2 = 0, and tm bang control is a threshold policy.
the point (on the real line) where the minimum of p0 (t) is
attained. We hence obtain P ROOF OF T HM . 3
p0 (t) = p0 (t) c2 de(+)(T t) (14)
Proof: In this proof we need to make the dependence on
where we know that p0 (tmin ) = 0, p0 (t) < 0 for t < tmin and explicit in the notation: e.g., u is the optimal control when
p0 (t) > 0 for t > tmin . multiplier is adopted in the relaxed objective function J (u).
However, p0 (T ) = 0 and p0 (t) = c2 d < 0, so that i. The fact that pair (u , ) minimizing J (u) is unique
there exists a whole left neighborhood of T where p0 (t) > 0 follows from the expression J (u) = J(u)+(nXd(t)). Let
and decreasing. And, p0 (t) < 0 for t < tmin . assume by contradiction another pair (u, ) is optimal, then it
By taking into account the sign of p0 and the additional must hold J(u ) = J(u). However, this implies that the two
negative term appearing in (14), it is immediate to conclude threshold policies must be identical, i.e., u = u, and so also
that only the following three cases are possible: = , because of the linear dependence with multiplier in
i S = : in this case p0 (t) is strictly decreasing in [0, T ]; (3).
ii S = {M } and the maximum is attained at 0 < tM < ii. The fact that the relaxed problem solves for the optimal
T : in this case p0 (t) strictly increasing in [0, tM ] and solution of the original constrained minimization follows from
decreasing in [tM , T ]; the following argument. Let define Ufn = { u| Xd(T ) =
iii S = {m, M } otherwise, where M is attained at 0 < n, Xd (t) d, t [0, T ]} U, let be the optimal
tM < T and m is attained at 0 < tm < tM < T ; i.e., multiplier and u the optimal solution of the constrained
in this case p0 (t) is decreasing in [0, tm ], increasing in problem.
[tm , tM ] and then decreasing in [tM , T ];
J(u ) = minn J(u) = minn J(u) + (n Xd (T )) = J (u )
which concludes the proof. uUf uUf
P ROOF OF T HM . 2 where the equality follows from the fact that (nXd(u)) = 0
over set Ufn .
Proof: From Lemma. 4, the structure of the control can be
iii. The correctness of the bisection search is due to the fact
analyzed exhaustively counting the possible switches induced
that J (u ) is indeed monotone in . In fact, costate variable
by the dynamics of p0 (t), similarly to what has been done in
Thm. 1: p (t) = Fe(t) + G(t)
0
i. S = implies the null control, i.e., u 0, i.e., ton = where we have made explicit the dependence on appearing
toff = 0; in (3). Now, with respect to switching epoch toff , let us consider
ii. S = {M } and p0 (0) c1 implies the null control; multiplier + , for some > 0. Then we can write
iii. S = {M } and p0 (0) < c1 implies a single switch
p+ (toff ) = Fe (toff ) + G(toff ) F (toff ) < 0
0
control with ton = 0 and 0 < toff < T ;
iv. S = {m, M } and p0 (0) > c1 with m > c1 implies which implies toff < t+ off . Opposite holds for ton : ton >
+
the null control; ton . From direct inspection of the cost function, it follows
v S = {m, M } with p0 (0) < c1 implies a single-switch J (u ) < J+ (u+ ), which proves the claimed monotony
control with ton = 0 and 0 < toff < T ; argument.
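The bisection search whose correctness is argued above can be sketched end to end. This is a hedged illustration, not the authors' implementation: instead of solving the adjoint system backwards it evaluates an assumed closed form of the costate, it integrates the fluid dynamics with a forward Euler scheme, and all names (`gamma` for the multiplier, `lam`, `mu`, `theta` for activation, transfer and failure rates, `c2_beta` for the cost of one chunk transfer, `gamma_hi` for the initial upper bound) as well as the sample values are assumptions made here. An iteration cap is added as a safeguard that the listing of Algorithm 1 does not have.

```python
import math
import numpy as np

def p0(t, T, gamma, mu, theta, d, c2_beta):
    """Assumed closed form of the costate p0(t)."""
    w = T - t
    F = -gamma * (1.0 - math.exp(-mu * w)) ** d * math.exp(-theta * w)
    G = c2_beta * mu * d * (1.0 - math.exp(-(theta + mu) * w)) / (theta + mu)
    return F + G

def terminal_xd(gamma, T, lam, mu, theta, d, xd0, c1, c2_beta, steps=4000):
    """Apply the bang-bang rule and integrate the fluid ODE; return X_d(T)."""
    dt = T / steps
    k = np.arange(d + 1)
    theta_k = theta + (d - k) * mu
    gain = (d - k[1:] + 1) * mu
    x = np.zeros(d + 1)
    x[d] = xd0
    for i in range(steps):
        t = i * dt
        u = 1.0 if p0(t, T, gamma, mu, theta, d, c2_beta) < -c1 else 0.0
        dx = -theta_k * x
        dx[0] += lam * u
        dx[1:] += gain * x[:-1]
        x = x + dt * dx
    return x[d]

def bisect_multiplier(n, gamma_hi, eps, max_iter=60, **model):
    """Bisection on the multiplier until |X_d(T) - n| <= eps."""
    lo, hi = 0.0, gamma_hi
    gamma, xd_T = hi, terminal_xd(hi, **model)
    for _ in range(max_iter):
        if abs(xd_T - n) <= eps:
            break
        gamma = 0.5 * (lo + hi)
        xd_T = terminal_xd(gamma, **model)
        if xd_T > n:
            hi = gamma      # too many servers restored: search on the left
        else:
            lo = gamma      # too few servers restored: search on the right
    return gamma, xd_T

# Assumed sample setting: 11 of 50 servers failed, pure activation cost.
gamma_star, xd_T = bisect_multiplier(
    n=50, gamma_hi=100.0, eps=0.05,
    T=3.5, lam=10.0, mu=1.9375, theta=0.001, d=20, xd0=39.0,
    c1=10.0, c2_beta=0.0)
```

The search relies on the terminal number of servers being monotone in the multiplier; the Euler step count bounds how finely the switching epoch is resolved, so the tolerance should not be chosen below that resolution.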