International Journal of Computer Networks & Communications (IJCNC)

ENHANCING AVAILABILITY FOR DISTRIBUTED REPLICATED SERVICES CONSIDERING NETWORK EDGE AVAILABILITY

Manghui Tu¹, Liangliang Xiao² and Dianxiang Xu³

¹Department of CITG, Purdue University Northwest, Hammond, Indiana

²Department of CSIT, Frostburg State University, Frostburg, Maryland

³Department of Computer Science, Boise State University, Boise, Idaho

Abstract

Mechanism to improve data or service availability is critical for an enterprise to ensure the quality of service in terms of availability. Replication has been used to improve system availability. The number and location of the replicas are two impact factors on availability. In this paper, we will consider the impact of the node and network edge failures on the availability of replicated data or services. The Effective availability modeling approach is designed and efficient availability computing algorithms are developed to model and compute availability of replicated services for systems with the tree topology. The availability enhancement problem (maximizing the objective function) is transformed to a p-median problem (minimizing the objective function) through re-define the availability enhancement problem. An efficient replica allocation algorithm is developed to improve data availability in tree networks, with a runtime complexity of O(K|V|²), where K is the number of replicas and |V| is the number of nodes in the tree network. Finally, experimental studies have been conducted to evaluate how efficient and effective the proposed availability computing algorithm and the availability enhancement algorithm on improving the availability of replicated data or services. The results show that the proposed solutions are efficient and effective on availability computing and availability enhancement.

Keywords

Network Protocols, Wireless Network, Mobile Network, Virus, Worms & Trojon

1. Introduction

It is well known that the utility of the data and service is mainly limited by availability [1, 2, 3, 4], and the dynamic environment and unreliable internet connectivity raise the concerns on availability of distributed systems [5, 6, 7, 8, 9]. Replication and erasure coding techniques [7, 8, 9, 10, 11, 12, 30] have been used to enhance data availability. However, the data object that is replicated also impacts the effective availability. Replicating data at highly available sites will result in high data availability. Also, widely distributing data can result in higher availability than distributing data in local clusters. In [8, 9], it has shown that even 100% local availability does not necessarily provide high availability to end users. The network failures may, for 1.5-2 %of the time, prevent clients from successfully accessing a cluster [5, 6, 7, 13]. Thus, the location of data stored in the system plays a significant role in overall system availability.

Most of the existing research on system availability consider the availability of the storage nodes only [11, 12, 14]. Some other research also considers the impact of data consistency on service availability [8, 9]. All these research do not consider the reach availability of users’ requests to service/data sites. As discussed earlier, availability of the network links also impacts service data availability, especially the reach availability of users’ requests. In [6, 7], network link availability is considered, but the evaluation algorithm itself requires exponential computation time relative to the number of storage nodes even in a tree graph with a single replica. Thus, it is not feasible to compute the availability of distributed systems with replications. Also, the work in [6, 7] does not provide an availability model in a system with more than one replica, i.e, the system has multiple service/data sites available to serve a user’s request.

In this paper, we consider modeling the service/data reach availability (i.e., the availability of service/data to remote requests) in distributed systems in a tree network. Data or services are replicated in the system to improve both performance and availability. Since the system has multiple service/data sites available to serve a user’s request, a single site failure does not lead to the unavailability to user’s request. Therefore, our problem is fundamentally different from network reliability problem (reliable only ensures the access to a single site) [15]. To assess system availability, efficient algorithms are developed to compute system availability considering both node and link failures.

The remainder of the paper is organized as follows. System modeling is described in Section II. An algorithm on computing system availability in a tree network is presented in Section III. Section IV, issues on how to reduce the runtime complexity and how to improve the service/data availability will be discussed. Efficient replica allocation algorithms to improve system availability will be introduced in Section V. Experimental studies conducted will be described in Section VI to give some numerical results of the proposed replica allocation algorithm applied in a system with tree network topology. Section VII briefly discusses some related works in availability modeling and enhancement and Section VIII concludes the paper.

2. System & Availability Modeling

The Study shows that user access requests usually follow a constant path routing to data sites and most routes are stable in days or even weeks [16]. So the data access routes in such a distributed system can be modeled as a tree graph, T = (V, E), as shown in Fig.1. Let e(u, v) denote the edge that directly links two nodes u and v, and d(u, v) denote a single path between u and v. Note that e(u, v) itself is a path between u and v with a single edge. The system hosts a set of data objects replicated to different nodes. For simplicity, we consider a single data object d. The set of nodes hosting replicas is defined as the residence set, R = {u| u Î V and u holds a replica}, and |R| is the number of replicas. A read request needs to access the closest node to the request. If it is not available, then, the request will be further forwarded to another node. Let r denote a read request and A^r(u) denote the number of read access requests from a node u over a time period. Majority of the user accesses are reading accesses, update accesses are not considered in this research.

Figure 1. A sample tree network with replica.

In a distributed system, data or service availability may be affected by other factors such as node up/downtime, system load, data consistency, and network failure or congestion [5, 8, 9]. According to [8, 9], weaker consistency would result in higher availability. Here, we assume that users are allowed to read stale data and, hence, the effect of data inconsistency will have no impact on data availability. Also, due to replication, the overloading problem is not an issue, hence, the impact of system load on data availability is not considered.

2.1. Data Availability Analysis

If the preferred node in R is not available, then the read request will be forwarded and served at the next available node. This process repeats until it is finally served by one of the replicas, or finally none of the replica can serve the read request, by choosing different combinations of paths. The access path of such a read request can be modeled as a directed graph, and thus both the network topology and the location of the read requests originally issued will affect data availability. Therefore, to compute data availability, we need to consider requests originated from different nodes.

2.2. Data Availability Modeling in a Tree Network

Figure 2. A sample tree graph rooted at u.

Subsequently, S_A(u, T_u, R) = 1 – S_UA(u, T_u, R). The data availability for a request from another node x, e.g., S_A(x, T_x, R) (note that the tree should now be rooted at x) can be computed in a similar way. Based on the analysis above, the system availability of d is computed as the average availability for requests in the entire system. Let S_A(G, R) denote the system availability of data d. Then, we have,

The algorithm Tree_S_A(T, R), which is developed to compute S_A(T, R) in a tree graph T = (V, E) is shown in Fig. 3. The algorithm works as what follows. It calls the recursive algorithm T_S_UA(u, T_u, R) for each node u in T to compute the unavailability for request issued at node u. The algorithm T_S_UA(u, T_u, R) recursively compute the data unavailability for the tree rooted at node u in T, by traversing each link and each vertex once. The computation uses the models we proposed earlier. Note that the children of node v are the nodes in the new tree rooted at node u (There are different implemetnations on building a new T_u in the original tree T, e.g., 1) physically re-build the tree rooted at T_u; 2) build a new virtual tree T_u by tracking the neighboring nodes for each node v in T, such that the unvisited neighboring nodes are child of v and the visited neighboring node is the parent node of v in new tree T_u.

Figure 3. The algorithm Tree_SA(T, R) in tree graph T.

In Theorem 1, we show that the availability d in tree T, S_A(T, R), is monotonically increasing.

Theorem 1. Let R₁ and R₂ denote two resident sets of T and R₁ Í R₂, then S_A(T, R₁) £ S_A(T, R₂).

Proof: Consider R₁ = R₂. According to the definition of S_A(T, R₁) and S_A(T, R₂), we know that S_A(T, R₁) = S_A(T, R₂).

Consider R₁ Ì R₂. For each request in a node u, it has at least one more node that can provide a replica of data object d to access. Without loss of generality, assume that |R₂| =|R₁| +1, and R₂ = R₁ È {v}. According to the availability model, the unavailability of v, 1− N_A(v) < 1. According to the availability computing model, for each request r issued at node u, which means S_UA(u, T_v, R₂) < S_UA(u, T_v, R₁), or S_A(u, T_v, R₂) > S_A(u, T_v, R₁). Thus, we have S_A(T, R₁) < S_A(T, R₂). It follows that S_A(T, R₁) £ S_A(T, R₂) if R₁ Í R₂.

We now show that the runtime complexity of our proposed availability computing algorithm in a tree T is in the order of the square of the number of nodes in T.

Theorem 2. The runtime complexity of algorithm Tree_S_A(T, R) is O(|V|²).

Proof. It suffices to prove the correctness of recursive relations.

Let v_j be a leaf node. If v_jÏ{r_j¹, …, r_jⁱ} or q £ 0, the value of G(v_j,q,r_jⁱ) should not be counted since the situation is not included in the definition. Therefore we set G(v_j,q,r_jⁱ)=-∞. On the contrary, if v_j=r_j^sÎ{r_j¹, …, r_jⁱ} and q ³ 1, G(v_j,q,r_jⁱ) equals to the optimal value of the sub-problem defined on the sub-tree T_j, which is the node v_j. Therefore G(v_j,q,r_jⁱ) = A^r(v_j)×A_j^s.

If r_jⁱÎV_j or q < 0, the value of F(v_j,q,r_jⁱ) should not be counted since the situation is not included in the definition. Therefore we set F(v_j,q,r_jⁱ)=-∞. If r_jⁱÏV_j and q = 0, the leaf node v_j has to fetch data from node r_jⁱ. Therefore F(v_j,q,r_jⁱ)=A^r(v_j)×A_jⁱ. If r_jⁱÏV_j and q ³ 1, the leaf node v_j can either fetch data from node r_jⁱ or fetch data from itself. Therefore F(v_j,q,r_jⁱ)=max{F(v_j,0,r_jⁱ),G(v_j,1,v_j)}.

Let v_j be an internal node. If V_j∩{r_j¹, …, r_jⁱ}=Æ or q £ 0 or i = 0, the value of G(v_j,q,r_jⁱ) should not be counted since the situation is not included in the definition. Therefore we set G(v_j,q,r_jⁱ)=-∞. If r_jⁱÏV_j, at least one node must be selected in {r_j¹, …, r_jⁱ} ∩ V_j is equivalent to at least one node must be selected in {r_j¹, …, r_jⁱ^–¹} ∩ V_j. Therefore G(v_j,q,r_jⁱ)=G(v_j,q,r_jⁱ^–¹). The rest situations can be separated into 3 cases: r_jⁱ=v_j and q ³ 1, r_jⁱÎV_j₁ and q ³ 1, and r_jⁱÎV_j₂ and q ³ 1. We discuss them as follows.

In case 1 we consider that r_jⁱ=v_j and q ³ 1. Without loss of generality, we assume that r_jⁱ corresponds to r_j₁ⁱ¹ for v_j₁ (the left child node of v_j) and r_j₂ⁱ² for v_j₂(the right child node of v_j).We claim that G(v_j,q,r_jⁱ)=max{G(v_j,q,r_jⁱ^–¹),A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²)| 0£q₁,q₂£q-1, q₁+q₂=q-1}}.

In order to prove that, we must show (1) the optimal solution appears either in G(v_j,q,r_jⁱ^–¹) or A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁,q₂£q-1, q₁+q₂=q-1}, and (2) the optimal solution is the maximum value.

First, if a node in {r_j¹, …, r_jⁱ^–¹} ∩ V_j is selected in the optimal solution, then the optimal solution appears in G(v_j,q,r_jⁱ^–¹). Otherwise we assume that no node in {r_j¹, …, r_jⁱ^–¹} ∩ V_j is selected but r_jⁱ is selected in the optimal solution. Then it is better for v_j to fetch data from node r_jⁱ=v_j than other nodes. Let q₁ nodes will be selected from T_j₁ and q₂ nodes will be selected from T_j₂ where q₁+q₂=q-1. Hence the optimal solution appears in A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁,q₂£q-1, q₁+q₂=q-1}.

Second, A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁,q₂£q-1, q₁+q₂=q-1} may contain some inconsistent values. For example, if F(v_j₁,q₁,r_j₁ⁱ¹) or F(v_j₂,q₂,r_j₂ⁱ²) selects a node in {r_j¹, …, r_jⁱ^–¹} ∩ V_j, v_j should fetch data from that node instead of r_jⁱ. However since the term A^r(v_j)×A_jⁱ is used, it actually counts that v_j fetches data from node r_jⁱ. Or F(v_j₁,q₁,r_j₁ⁱ¹) selects a node such that v_j₂ should fetch data from that node outside T_j₂ instead of r_j₂ⁱ². However the term F(v_j₂,q₂,r_j₂ⁱ²) indicates that actually it counts that v_j₂ fetches data from node r_j₂ⁱ² outside T_j₂. Fortunately, the inconsistent values are smaller than the corresponding consistent values, and hence are smaller than the optimal solution. Therefore the inconsistent values will not affect the optimal solution to be the maximum value.

In case 2 we consider r_jⁱÎV_j₁ and q ³ 1. Without loss of generality, we assume that r_jⁱ corresponds to r_j₁ⁱ¹ for v_j₁ (the left child node of v_j) and r_j₂ⁱ² for v_j₂ (the right child node of v_j). We claim that G(v_j,q,r_jⁱ)=max{G(v_j,q,r_jⁱ^–¹),A^r(v_j)×A_jⁱ+max{G(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 1£q₁£q, 0£q₂£q, q₁+q₂=q, the associated set of G(v_j₁,q₁,r_j₁ⁱ¹) does not contain r_j¹, …, r_jⁱ^–¹ but contains r_jⁱ}}.

First, if a node in {r_j¹, …, r_jⁱ^-1} ∩ V_j is selected in the optimal solution, then the optimal solution appears in G(v_j,q,r_jⁱ^-1). Otherwise we assume that no node in {r_j¹, …, r_jⁱ^-1} ∩ V_j is selected but r_jⁱis selected in the optimal solution. We claim that v_j will not be selected. Because if v_j is selected, then it cannot be in {r_j1, …, r_ji-1} ∩ V_j. Thus it is better for v_j to fetch data from node r_ji than v_j. Consequently it is better for all nodes to fetch data from node r_ji than v_j. It implies that the selection of v_j will not gain any benefit. Therefore v_j will not be selected. So q₁ nodes will be selected from T_j₁ and q₂ nodes will be selected from T_j₂ where q₁+q₂=q. Hence the optimal solution appears in Ar(v_j)×A_ji+max{G(v_j₁,q₁,r_j₁i1)+F(v_j₂,q₂,r_j₂i2) | 1£q₁£q, 0£q₂£q, q₁+q₂=q, the associated set of G(v_j₁,q₁,r_j₁i1) does not contain r_j1, …, r_ji-1 but contains r_ji}.

Second, if the associated set of G(v_j₁,q₁,r_j₁ⁱ¹) does not contain r_j¹, …, r_jⁱ, then that G(v_j₁,q₁,r_j₁ⁱ¹) should not be counted. If the associated set of G(v_j₁,q₁,r_j₁ⁱ¹) contains r_j¹, …, r_jⁱ^–¹, then the inconsistent value is smaller than the corresponding consistent value, and hence smaller than the optimal value. so that G(v_j₁,q₁,r_j₁ⁱ¹) can be omitted. Therefore it suffices to consider G(v_j₁,q₁,r_j₁ⁱ¹) whose associated set does not contain r_j¹, …, r_jⁱ^–¹ but contains r_jⁱ.

In case 3 we consider r_jⁱÎV_j₂ and q ³ 1. Without loss of generality, we assume that r_jⁱ corresponds to r_j₁ⁱ¹ for v_j₁ (the left child node of v_j) and r_j₂ⁱ² for v_j₂ (the right child node of v_j). Analogous to the proof of case 2, it can be proved that G(v_j,q,r_jⁱ)=max{G(v_j,q,r_jⁱ^–¹),A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+G(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁£q, 1£q₂£q, q₁+q₂=q, the associated set of G(v_j₂,q₂,r_j₂ⁱ²) does not contain r_j¹, …, r_jⁱ^–¹ but contains r_jⁱ}}.

If r_jⁱÎV_j or q < 0, the value of F(v_j,q,r_jⁱ) should not be counted since the situation is not included in the definition. Therefore we set F(v_j,q,r_jⁱ)=-∞. If r_jⁱÏV_j and q ³ 0, without loss of generality, we assume that r_jⁱ corresponds to r_j₁ⁱ¹ for v_j₁ (the left child node of v_j) and r_j₂ⁱ² for v_j₂ (the right child node of v_j). We claim that F(v_j,q,r_jⁱ)=max{G(v_j,q,r_jⁱ),A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁£q, 0£q₂£q, q₁+q₂=q}} First, if the optimal solution selects node in {r_j¹, …, r_jⁱ} ∩ V_j, then the optimal solution appears in G(v_j,q,r_jⁱ).

Otherwise we assume that no nodes in {r_j¹, …, r_jⁱ} ∩ V_j is selected in the optimal solution. Then it is better for v_j to fetch data from node r_jⁱ than other nodes. Additionally v_j will not be selected because of the same reason in the proof of case 2. So q₁ nodes will be selected from T_j₁ and q₂ nodes will be selected from T_j₂ where q₁+q₂=q. Hence the optimal solution appears in

A^r(v_j)×A_jⁱ+ max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁£q, 0£q₂£q, q₁+q₂=q}.

Finally, the inconsistent values in A^r(v_j)×A_jⁱ+max{F(v_j₁,q₁,r_j₁ⁱ¹)+F(v_j₂,q₂,r_j₂ⁱ²) | 0£q₁£q, 0£q₂£q, q₁+q₂=q} are smaller than the corresponding consistent values, and hence are smaller than the optimal solution. Therefore the inconsistent values will not affect the optimal solution to be the maximum value.

3. System Availability Enhancement

In this research, we proposed to develop mechanisms to optimize data availability in a system with a fixed number of replicas, K, i.e., to find a resident set with a size of K such that the overall data availability in a tree graph is optimal. It has been shown in our previous work in [17, 18] that reducing average distance of the paths (number of hops) from user’s requests to service/data sites can result in high availability, as the shorter the distance (the smaller the number of hops), the higher the path availability from a requester to its closest replica. Therefore, a potential approach to improve the overall system availability is to allocate data replicas in the system such the average distance of the paths from user’s requests to service sites is minimal.

3.1 Availability Enhancement Problem Modeling

The replica allocation problem to optimize data availability is much more complex than the traditional optimal resident set problems [19, 20, 21]. For traditional resident set problem, each request will be served by a single site (usually the closest site) and the corresponding communication cost will be computed as the final communication cost for such a request. Therefore, each request will be associated with a single data replica site only. However, with the availability model defined in this research, a request may need to find another available site if the current accessing site is not available. This process will repeat till such access is served by a replica site or is not served after all replica sites have been tried but none of them are available. Thus, each request should be associated with all sites within the resident set for data availability computing. To find the resident set with maximal data availability in a tree network, the computation complexity could be too high to be feasible. For example, an intuitive approach is to try every combination of the resident set in the tree and the runtime complexity would be |V|^|R|. Thus, an approximate solution is to apply the property observed in [17, 18] that the smaller the number of hops from a request to the resident set, the higher the data availability.

First, since a request will be forwarded to a secondary replica only the primary replica node cannot serve such request, and such an event has a very low probability (node availability is usually very high), we can assume that a request is only associated with the primary replica node to reduce the complexity (note that this assumption is only valid for replica allocation purpose, a request is still associated with all replicas when computing availability). Thus, our problem can be reduced to a resident set problem maximizing the total of the product of the number of reading requests and the path availability of each request to the resident set. i.e., Σνεv(A^γ(ν)*(A_δ(ν,R)*N_A(ν))), where is the path availability from a request issued at node v to a replica site in the resident set R.

Second, we all understand that the heterogeneity of the node (replica site) availability can have a big impact on the overall system availability. Here, we will show that the heterogeneity of the edge availability can have big impact on the overall system availability. In a system with line topology shown Fig. 4, the allocation of two replicas to optimize the availability without considering the heterogeneity of edge availability is shown in allocation scheme A, and the allocation considering the heterogeneity of edge availability is shown in allocation scheme B (with homogeneous node availability of 0.999). In Scheme B, the service to requests at node v₁ is available locally with availability of 0.999, and requests need to be forwarded to node v₃only if node v₁ is not available (with a probability of 0.001). Requests at other nodes can be served by the replica at node v₃ with very high probability, and only need to be forwarded to node v₁ if node v₃ is not available or edge e(v₂, v₃) is not available (for requests from node v₂). The overall normalized data availability in system with replica allocation scheme B is around 0.96 by applying the availability computing algorithm described in Fig. 3. In Scheme A, the service to requests at node v₁ is not available locally and thus need to be forwarded to node v₂or v₃ where the availability of edge e(v₁, v₂) is 0.5. The overall normalized data availability in system with replica allocation scheme A will definitely be lower than 0.90.

Figure 4. The impact of the edge availability on system availability.

In our previous work in [18], solutions have been given to availability optimization problem, by considering the homogeneous of node availability or network link availability, or both. However, these solutions are only partial solutions to Problem 1.

3.2 The Availability Enhancement Algorithm

Problem 1 is an optimization problem similar to the p-median problem [22, 15]. In a p-median problem, it is given a graph G(V, E) where V = {v₁, v₂, …,v_n}. Each node v_i is associated with a nonnegative weight c_i, and each edge e(v_i, v_j) is associated with a nonnegative weight d(e(v_i, v_j)). Additionally, it is given a real monotonically increasing function f_i for each node v_i. The goal is to find a subset R Í V containing at most p nodes (p is an arbitrary but pre-determined number) to minimize the objective

Problem 1 and p-median problem have some common issues. They both consider weighted tree networks where each node/edge is assigned value(s). They both look for optimal solutions to the objective functions. Problem 1 looks for the maximum value of an objective function while p-median problem looks for the minimum value of another objective function. It is therefore possible for us to borrow some ideas of the solution to the p-median problem to develop the solution to Problem 1.

However, it is difficult to directly apply the solution to the p-median problem to Problem 1 since they are different. Consider the p-median problem. Suppose that a node v_j is selected and the edges e(v_i, v_i₁), e(v_i₁, v_i₂), …, e(v_ik, v_j) form a path connecting nodes v_i and v_j. Then the computation of the cost for v_i to access v_j includes k+1 piece of information: f_i(d(e(v_i,v_i₁))), f_i₁(d(e(v_i₁,v_i₂))), …, f_ik(d(e(v_ik,v_j))), and c_j. While for Problem 1, the computation of the contribution for v_i to fetch data from v_j includes k+2 piece of information: A^r(v_i), A_e(v_i,v_i₁), A_e(v_i₁,v_i₂), …, A_e(v_ik,v_j), and N_A(v_j). Therefore the solution to Problem 1 should include and handle more information than that to p-median problem. Thus we need to modify the solution to p-median problem to re-develop the solution to Problem 1.

Problem Transformation and Re-definition

A three step solution is provided to Problem 1:

Transformation: reduce Problem 1 from an arbitrary tree network to a binary tree network.
Pre-processing: compute A_jⁱ and r_jⁱ for each node v_j, 1 £ i, j £ n
Computing function values: compute the values of two functions G(v_j,q,r_jⁱ) and F(v_j,q,r_jⁱ) based on their recursive relations.

Step 1. Transformation

Given an arbitrary tree network T(V, E), where V = {v₁, v₂, …,v_n}, E is the set of all edges in T. It can be transformed to that of a binary tree network T’(V’,E’). We show that by considering the following two cases, Case 1 and Case 2.

Case1. Suppose that an internal node v_j has exactly one child node (Fig. 5(a)). Without loss of generality, we assume that v_jhas a left child node v_j1_.Then we introduce a right child node v_j2 and set the availabilities N_A(v_j2) = A_e(v_j, v_j2) = 0 and the number of read requests A^r(v_j2) = 0 (Fig. 5(b)).

Figure 5. Node with a single child node (a) is transformed to (b)

Since N_A(v_j₂) = A_e(v_j, v_j₂) = A^r(v_j₂) = 0, there is no benefit to select v_j₂ and add v_j₂ to the resident set R. Therefore if R’ is an optimal solution to Problem 1 for T’(V’,E’), then R = R’ – {v_j₂ | j} is an optimal solution to Problem 1 for T(V,E).

Case 2. Suppose that an internal node v_j has more than 3 childe nodes v_j1, v_j2, …, v_jt where t ³ 3 (Fig 6(a)). Then we introduct t-2 nodes u_js where 2 £ s £ t − 1, replace edge e(v_j, v_js) by edge e(u_js, v_js) for 2 £ s £ t − 1, replace edge e(v_j, v_jt) by edge e(u_jt₋₁, v_jt), and add edges e(v_j, u_j₂) and e(u_js’, u_js’₊₁) for 2 £ s’ £ t – 2, and set the availabilities N_A(u_js) = 0 for 2 £ s £ t − 1, A_e(u_js, v_js) = A_e(v_j, v_js) for 2 £ s £ t − 1, A_e(u_jt₋₁, v_jt) = A_e(v_j, v_jt), A_e(v_j, u_j₂) = A_e(u_js’, u_js’₊₁) = 1 for 2 £ s’ £ t – 2, and the number of read requests A^r(u_js) = 0 for 2 £ s £ t – 1 (Fig 6 (b)).

Figure 6(a). Case 2, the node with over 3 child nodes.

Figure 6(b). Transformed graph for the graph shown in Fig. 6 (a).

First, it can be verified that the success availability for node v_j (v_js) to fetch data from node v_js (v_j) does not change, 2 £ s £ t – 1. Second, the new assigned availabilities and number of requests guarantee that there is no benefit to select nodes u_js, 2 £ s £ t − 1. Therefore if R’ is an optimal solution to Problem 1 for T’(V’,E’), then R = R’ – {u_js | j, s} is an optimal solution to Problem 1 for T(V,E).

Based on the previous discussion, Problem 1 can be efficiently transformed from an arbitrary tree network to a binary tree network. The solution to the transformed problem can be easily transformed back to the original problem.

Step 2. Pre-processing:

In this step we compute A_jⁱ and the corresponding r_jⁱ for each node v_j, 1 £ i, j £ n. A_jⁱ and r_jⁱ are defined in the following definition.

Definition 1: A_j¹ is the greatest success probability for node v_j to fetch data from all nodes, …, A_jⁿ is the smallest success probability for node v_j to fetch data from all nodes. r_j¹ is the node corresponds to A_j¹, …, r_jⁿ is the node corresponds to A_jⁿ.

Given a binary tree network, we compute the success probability p_ij for node v_i to fetch data from node v_j. Specifically, if the edges e(v_i, v_i₁), e(v_i₁, v_i₂), …, e(v_ik, v_j) form the path connecting v_i and v_j, then the probability p_ij = A_e(v_i,v_i₁)×A_e(v_i₁,v_i₂) … A_e(v_ik,v_j)× N_A(v_j). Then we sort p_1j, …, p_nj from the smallest to the largest, and let the results be A_j¹, …, A_jⁿ. Additionally, let node r_jⁱ corresponds to A_jⁱ, 1 £ i £ n. The equality of success probabilities can be resolved by arbitrary set the order of nodes.

Step 3. Computing recursive function values

Let v_j be a node, q be an integer, and r_jⁱ and A_jⁱ be the values computed in the pre-processing step. The functions G(v_j,q,r_jⁱ) and F(v_j,q,r_jⁱ) are defined in the following definition.

Definition 2: G(v_j,q,r_jⁱ) is defined to be the optimal value of the sub-problem defined on the sub-tree T_j given that total of at least 1 and at most q nodes can be selected in T_j, and at least one node must be selected in {r_j¹, …, r_jⁱ} ∩ V_j. The set of nodes for G(v_j,q,r_jⁱ) to achieve the optimal value is called the associated set of G(v_j,q,r_jⁱ).

Definition 3: F(v_j,q,r_jⁱ) is defined to be the optimal value of the sub-problem defined on the sub-tree T_j under the constraints that a total of at most q nodes can be selected in T_j and there are already some selected nodes in V – V_j, and the best amongst them that v_j can fetch data is r_jⁱ. The set of nodes for F(v_j,q,r_jⁱ) to achieve the optimal value is called the associated set of F(v_j,q,r_jⁱ).

We establish the recursive relations between G(v_j,q,r_jⁱ) and F(v_j,q,r_jⁱ) as what follows.

Leaf node v_i

The Algorithm for the Transformed Problem

We integrate the three steps described in previous subsections to develop the algorithm, Alg_Enhan, to solve the transformed Problem 1.

transforms the problem to a binary tree network if necessary,
pre-processes the binary tree network and compute A_jⁱ and r_jⁱ for each node v_j, 1 £ i, j £ n,
computes the values of G(v_j,q,r_jⁱ) and F(v_j,q,r_jⁱ) based on the recursive relations following the post order of the nodes: first compute the function values for the leaf nodes, then the internal nodes, and finally the root node. The associated sets of G(v_j,q,r_jⁱ) and F(v_j,q,r_jⁱ) are recorded at the same time. If two sets of nodes both achieve the optimal value, then the set containing the node in V_j from which it is better for v_j to fetch data will be recorded, and
Returns G(v₁,K,r₁ⁿ) together with the associated set, where v₁ is the root node.

3.3 A Numerical Example for the Availability Enhancement Algorithm (Alg_Enhan)

We now use a numerical example to demonstrate the algorithm we developed for Problem 1.

Consider the binary tree network T(V, E) as shown in fig. 7, where V = {v₁, v₂, v₃, v₄, v₅} and E = {e(v₁, v₂), e(v₁, v₃), e(v₂, v₄), e(v₂, v₅)}. The nodes availabilities N_A(v₁) = 0.3, N_A(v₂) = 0.1, N_A(v₃) = 0.6, N_A(v₄) = 0.7, N_A(v₅) = 0.9. The edges availabilities A_e(v₁, v₂) = 0.5, A_e(v₁, v₃) = 0.4, A_e(v₂, v₄) = 0.2, A_e(v₂, v₅) = 0.8. The numbers of read requests A^r(v₁) = 100, A^r(v₂) = 200, A^r(v₃) = 300, A^r(v₄) = 400, A^r(v₅) = 500. The tree network is illustrated in the following figure. We further assume that K = 2.

Figure 7. A sample network for algorithm numerical demonstration.

Pre-processing:

Since it is a binary tree network, transformation is not needed. We compute A_jⁱ and r_jⁱ in this subsection.

Root node v₁

The success probability to fetch data from v₁ = 0.3.

The success probability to fetch data from v₂ = 0.5×0.1 = 0.05.

The success probability to fetch data from v₃ = 0.4×0.6 = 0.24.

The success probability to fetch data from v₄ = 0.5×0.2×0.7 = 0.07.

The success probability to fetch data from v₅ = 0.5×0.8×0.9 = 0.36.

We sort the success probabilities and record A₁ⁱ and the corresponding r₁ⁱ in the following table, 1£i£5.

Table 1. Pre-processing the data access for root node v1.

Internal node v₂

The success probability to fetch data from v₁ = 0.5×0.3 = 0.15.

The success probability to fetch data from v₂ = 0.1.

The success probability to fetch data from v₃ = 0.5×0.4×0.6 = 0.12.

The success probability to fetch data from v₄ = 0.2×0.7 = 0.14.

The success probability to fetch data from v₅ = 0.8×0.9 = 0.72.

We sort the success probabilities and record A₂ⁱ and the corresponding r₂ⁱ in the following table, 1£i£5.

Table 2. Pre-processing the data access for internal node v2.

Leaf node v₃

The success probability to fetch data from v₁ = 0.4×0.3 = 0.12.

The success probability to fetch data from v₂ = 0.4×0.5×0.1 = 0.02.

The success probability to fetch data from v₃ = 0.6.

The success probability to fetch data from v₄ = 0.4×0.5×0.2×0.7 = 0.028.

The success probability to fetch data from v₅ = 0.4×0.5×0.8×0.9 = 0.144.

We sort the success probabilities and record A₃ⁱ and the corresponding r₃ⁱ in the following table, 1£i£5.

Table 3. Pre-processing the data access for leaf node v3.

Leaf node v₄

The success probability to fetch data from v₁ = 0.2×0.5×0.3 = 0.03.

The success probability to fetch data from v₂ = 0.2×0.1 = 0.02.

The success probability to fetch data from v₃ = 0.2×0.5×0.4×0.6 = 0.024.

The success probability to fetch data from v₄ = 0.7.

The success probability to fetch data from v₅ = 0.2×0.8×0.9 = 0.144.

We sort the success probabilities and record A₄ⁱ and the corresponding r₄ⁱ in the following table, 1£i£5.

Table 4. Pre-processing the data access for leaf node v4.

Leaf node v₅

The success probability to fetch data from v₁ = 0.8×0.5×0.3 = 0.12.

The success probability to fetch data from v₂ = 0.8×0.1 = 0.08.

The success probability to fetch data from v₃ = 0.8×0.5×0.4×0.6 = 0.096.

The success probability to fetch data from v₄ = 0.8×0.2×0.7 = 0.112.

The success probability to fetch data from v₅ = 0.9.

We sort the success probabilities and record A₅ⁱ and the corresponding r₅ⁱ in the following table, 1£i£5.

Table 5. Pre-processing the data access for leaf node v5.

Recursive function computing:

We now compute the function values based on the pre-processing results and recursive relations developed above. We follow the post order of the nodes, which is v₄ → v₅ → v₂ → v₃ → v₁.

Leaf node v₄

The G and F function values for node v₄ is summarized in the Table 6. The associated set of nodes is also recorded.

Table 6. G and F function values for node v4

Leaf node v₅

The G and F function values for node v₅ is summarized in the Table 7. The associated set of nodes is also recorded.

Table 7. G and F function values for node v5.

Internal node v₂

The G and F function values for node v₂ is summarized in the Table 8. The associated set of nodes is also recorded.

Table 8. G and F function values for node v2

Leaf node v₃

The G and F function values for node v₃ is summarized in the Table 9. The associated set of nodes is also recorded.

Table 9. G and F function values for leaf node v3.

Root node v₁

The G and F function values for node v₁ is summarized in the Table 10. The associated set of nodes is also recorded.

Table 10. G and F function values for root node v1.

According to the entry (G(v₁,2,r₁⁵) = 953.2; v₄,v₅) in Table 10 function values for root node v₁, when selecting nodes v₄ and v₅, the objective function in Problem 2 achieves the maximum value 953.2.

3.4 Complexity and Correctness Analysis for the Availability Enhancement Algorithm (Alg_Enhan)

Complexity Analysis for the Algorithm (Alg_Enhan)

We now analyze the runtime complexity for the availability enhancement algorithm (Alg_Enhan). For simplicity, we assume that the computational complexity for arithmetic operations (+, -, ×, /) and comparison is O(1).

Theorem 1: The computational complexity of the algorithm is O(Kn²).

Proof. In step 1 (transformation), O(n) nodes and edges may be added to the original tree network. The corresponding parameters (e.g. probabilities and number of read requests) may also be added. So the computational complexity of step 1 is O(n).

In step 2 (pre-processing), the success probabilities for each node to fetch data from all nodes are computed and sorted. It is possible that the computation of a success probability costs O(n) multiplications (consider unbalanced binary tree). So the computational complexity of step 2 is O(n)×(O(n)×O(n)+O(nlogn)) = O(n³). However, the computational complexity can be improved to O(n²) based on the similar technique used in [25, 15].

In step 3 (computing function values), the values of two functions are computed based on the recursive relations. O(n) tables are computed for O(n) nodes. Each table contains O(n)×O(K) = O(Kn) entries. In the worst case, the computation of an entry requires O(K) operations. So the computational complexity of step 3 is O(n)×O(Kn)×O(K) = O(K²n²). However, the computational complexity can be improved to O(Kn²) based on the similar technique used in [26, 15].

Therefore the computational complexity of the algorithm is O(n) + O(n²) + O(Kn²) = O(Kn²)

Correctness Analysis for the Algorithm(Alg_Enhan)

We now show the correctness in Theorem 2 for the availability enhancement algorithm (Alg_Enhan).

Theorem 2: The algorithm Alg_Enhan outputs the optimal solution of Problem 1.

Proof. It suffices to prove the correctness of recursive relations.

4. Experimental Studies

4.1 Evaluating the Effectiveness of Availability Modeling in Tree Networks

In section II, we have shown that the proposed availability modeling algorithm in a tree network has a complexity of |V|², where |V| is the number of nodes in the tree network. In this section, we investigate the practical complexity of our proposed algorithm. In this simulation, a variety of random tree topologies are generated. We use four parameters to control the generation of a tree: the total number of nodes (|V|), resident size (|R|), the number of reads issued from each node (A^r(u)), and the maximum node degree (degree). A random tree is generated in a breadth-first manner, i.e., starting from the root node, at each level, a random number of node are generated for the next level based on the randomly generated node degrees, until the number of nodes specified is reached. The root of the tree always hosts a replica and the rest of |R|-1 replicas are allocated to randomly selected |R|-1 nodes. The number read on each node is randomly generated following a uniform distribution, each in the range of (0 10). Link availability (LA) of each edge and node availability (NA) of each node in the system is generated randomly following a uniform distribution, each is in the range of (0.9 1.0).

Table 11. G and F function values for root node v1.

To study actual performance of the algorithm, the actual runtime and memory used are determined. The memory used is dynamic and the maximum memory used is measured in the middle of the recursive algorithm execution process. The runtime is determined as the time between the start of the execution of the algorithm and end of the algorithm. Note that the actual runtime and memory used here do not necessarily mean the lowest runtime and maximum memory used, and the algorithm is implemented in Java. The parameters are set as follows: degree = 5, each node can have 0 to 5 children, |R| = 6, and |V| is ranged from 100 to 100000. As can be seen in Table 11, the memory used by the algorithm linearly increased along the increase of |V|, while the run time of the algorithm increases along with the increase of |V|, governed by a growth function of the square of |V|. The results confirmed our theoretic analysis in Section II such that the availability modeling algorithm in a tree network has a runtime complexity of |V|².

We then test our modeling approach against the simulated distributed environment to see how good it is. The impacts of three factors on Tree_S_A(T, R) are studied: (a) the graph size |V|; (b) the graph degree, which is the average number of neighbors of a node; and (c) the size of the resident set |R|. To avoid biased access patterns and topology structures, we repeat the experimental steps 100 times. The final result is the average of the 100 trials. The results are shown in Fig. 8 (a), Fig. 8 (b), and Fig. 8 (c).

Figure 8(a). The impact of resident set size.

Fig.8(a) shows the impact of resident set size on the effectiveness of the three replica allocation algorithms. The parameters are set as follows: |V| = 100, degree = 5, and |R| is increased from 1 to 20. The result shows that data availabilities of the three replica allocation schemes increase along with the increase of |R|, which confirms the conclusion of Theorem 1. From Fig. 7(a), the two resident sets computed by the two replica allocation algorithms, Alg_Hom and Alg_Het, can always achieve higher availabilities than the availability achieved by the resident set that is computed by the randomized replication allocation algorithm. However, the difference on the availabilities of the resident sets between the randomized and the two algorithms, Alg_Hom and Alg_Het, decreases along with increase of |R|. Also, the availability increase rate decreases along with increase of |R|. The reason is obvious. With a larger |R|, the average distance from a user to service/data site (number of edges in the path) will be reduced and thus the service/data reach availability should be increased. Thus, the space of service availability enhancement will also be reduced when availability reaches a high level.

Figure 8(b). The impact of node degree.

Fig.8(b) shows the impact of node degree on the effectiveness of the three replica allocation algorithms. The parameters are set as follows: |V| = 100, |R| = 6, and degree is increased from 2 to 10. The result shows that data availabilities of the three replica allocation schemes increase along with the increase of node degree. From Fig. 7(b), the two resident sets computed by the two replica allocation algorithms, Alg_Hom and Alg_Het, can always achieve higher availabilities than the availability achieved by the resident set that is computed by the randomized replication allocation algorithm. However, the difference on the availabilities of the resident sets between the randomized and the two algorithms, Alg_Hom and Alg_Het, decreases along with increase of node degree. The availabilities increase rate also slows down when node degree reach a certain size (e.g., degree = 5). The reason is obvious. With a larger node degree, the height of the tree will be reduced. Therefore, the average distance from a user to service/data site (number of edges in the path) will be reduced and thus the service/data reach availability should be increased. Thus, the space of service availability enhancement will also be reduced when availability reaches a high level.

Figure 8(c). The impact of graph size on data availability.

Fig.8(c) shows the impact of graph size on the data availability modeled and computed by Tree_S_A(T, R). The parameters are set as follows: |R| = 6, degree = 5 and |V| is increased from 100 to 1000. The result shows that data availability decreases along with increase of graph size. The reason is obvious. With a larger |V|, the average distance from a user to service/data site (number of edges in the path) will be increased and thus the service/data reach availability should decrease.

4.2 Evaluating the Effectiveness of Availability Enhancement Algorithms in Tree Networks

In this section, we test two replica allocation algorithms we developed in this research in a simulated distributed environment to see how effective they are on enhancing service availabilities. Specifically, we compare the availability enhanced algorithm, Alg_Enhan, with the randomized replica allocation scheme, Alg_Ran. In this simulation, a variety of random tree networks are generated. We use to simulate a tree with 30 nodes which have a variety of resident size (|R. A random tree is generated in a breadth-first manner, i.e., starting from the root node, at each level, a random number of the node are generated for the next level based on the randomly generated node degrees, until the number of nodes specified is reached. The entire tree hosts |R| replicas. The number read on each node is randomly generated following a uniform distribution, each is in the range of (0, 10). Link availabilities of all the edges and node availabilities of all the nodes in the system are generated randomly following a uniform distribution in the range of (0.9, 1.0). For the randomized allocation scheme, replicas are allocated to the randomly selected |R| nodes in the system. For algorithms Alg_Enhan, we will first implement the algorithm based on a tree network, and then use this algorithm to compute the resident sets in the simulated tree network. Finally, we will apply our availability computing algorithm shown in Fig. 3 to compute the service availabilities for the two resident sets in the simulated tree network.

Our goal is to test which replica allocation scheme is more effective in enhancing overall service availability in a tree network. The impacts of the resident set size |R| (from 1 to 6). The experiments are repeated 10 times and the results are shown in Fig. 9.

Figure 9. The impact of resident set size on improved availability.

Fig.9 shows the impact of resident set size on the effectiveness of the three replica allocation algorithms. The parameters are set as follows: |V| = 30, degree = 5, and |R| is increased from 1 to 6. The result shows that data availabilities of the two replica allocation schemes increase along with the increase of |R|, which confirms the conclusion of Theorem 1. From Fig. 8, the resident set computed by Alg_Enhan can always achieve higher availabilities than the availability achieved by the resident set that is computed by Alg_Ran. However, the difference on the availabilities of the resident sets between Alg_Ran and Alg_Enhan decreases along the increase of |R| (from 0.05 to 0.002). Also, the availability increase rate decreases along the increase of |R|. The reason is obvious. With a larger |R|, the average distance from a user to service/data site (number of edges in the path) will be reduced and thus the service/data reach availability should be increased. Thus, the space of service availability enhancement will also be reduced when availability reaches a high level. From the above experimental study, we can conclude that when the replica size is small, the replica location has the bigger impact on availability. Hence, it is more critical to employ a better replica allocation algorithm to improve service availability, such as Alg_Enhan.

5. Related Work

Network reliability has been well studied and a large amount of works have been conducted on this topic [15, 23, 24, 31]. The modeling and computation of network reliability is similar to (but not the same as) a specific case of network availability modeling for a distributed system, i.e., modeling the availability of a distributed system with no replication. For such a system, the data availability can be computed by computing the reach availability from each node to the destination node (the node with the data). The difference between availability and reliability modeling is that the unavailability of a node does not affect the reaching probability in our data availability modeling (in reliability, the node unavailability will affect the reliability). However, the availability modeling in a distributed system with data replication is very different from reliability modeling. The data availability in the system to a node w (actually the clients associated with such node) is affected by multiple nodes (each with a copy of the data) at the same time, i.e., if a copy of data on node u is not available to node w, then another node v can provide availability to w. Therefore, computing and modeling data availability to node w not only needs to compute the reach probability to multiple destination nodes (for example u and v), but also needs to consider their impact on each other.

Availability has long been considered as a critical issue for systems providing replicated services. In [8, 9], the authors explored the benefits of dynamically trading consistency for availability using a continuous consistency model. The availability model is a function of consistency, workload, and fault load. If consistency requirement is greatly relaxed, availability can be very high. In data storage systems such as Ocean store [14, 27] and PASIS [28], both pure replication and data fragmentation approaches are considered. Their availability models only consider node availability. [11] and [12] both present an analytical model for determining the overall availability of data stored in a data grid or P2P system. The availability model captures the characteristics of peer-to-peer systems and grid systems. The availability of replica location service (metadata service) of data grids or P2P systems can be considered independent of data storage availability.

All the above works on data availability computation only consider the availability of the service or data sites and do not consider network link availability. In [6, 7], the authors propose a model for studying the availability of replicated systems. The system availability is computed in a stochastic graph, i.e. each node and each link are assigned a specific failure probability. The overall system availability is computed exhaustively using a state enumeration method. However, the runtime complexity of the state enumeration method is exponential in the number of nodes in the system in a tree network. Also, it does not give a solution to compute data availability with multiple replicas in the system. In our previous research in [17, 29], we proposed effective availability modeling and computing approaches that consider both node and network link failures, for systems with the tree, ring, and general graph topologies. In [18], we provided two availability enhancement algorithms for a tree network with homogeneous node availability and link availability, which is a special case for the problem we consider in this research. Hence, a solution is needed for the general case, i.e., each node availability and link (network edge) availability are heterogeneous.

6. Conclusions

In this paper, we address the system service or data availability modeling and enhancement approaches for distributed systems with data replication. We first provide an efficient availability computing algorithm for tree networks. We then analyze the availability enhancement problem and formalized availability enhancement in a general tree network as Problem 1. Then, the problems are transformed into a special p-median problem and a recursive algorithm-based solution, Alg_Enhan is developed to maximize the service availabilities for distributed systems with a tree topology. The algorithm Alg_Enhan considers the impact of replica location on overall service availability and the impact of network node and edge heterogeneity. Theorems have shown that the algorithm Alg_Enhan is efficient with a runtime complexity of O(K|V|2), where K is the number of replicas and |V| is the number of nodes in the network. Finally, experimental studies show that the algorithm Alg_Enhan can compute resident sets with higher availabilities than those computed by the randomized algorithm, but it is more effective when the number of replicas is small.

References

[1] J. Hennessy, “The future of systems research”, Computer, vol. 32, no. 8, pp. 27-33, 1999.

[2] J. Liu and H. Shen, “A Low-Cost Multi-failure Resilient Replication Scheme for High Data Availability in Cloud Storage”, In High Performance Computing (HiPC), 2016 IEEE 23rd International Conference on (pp. 242-251),2016.

[3] K. Meenakshi and S.B. Singh, “Availability Assessment of Multi-State System by Hybrid Universal Generating Function and Probability Intervals”, International Journal of Performability Engineering, Vol.12, No. 4, pp. 321-339, 2016.

[4] S. Ren, Y. Yang, Y. Chen and Y. Du, “Fluctuation analysis of instantaneous availability under specific distribution”, Neurocomputing, vol. 270, pp. 152-158, 2017.

[5] M. Dahlin, B. Chandra, Lei Gao and A. Nayate, “End-to-end WAN service availability”, IEEE/ACM Transactions on Networking, vol. 11, no. 2, pp. 300-313, 2003.

[6] G. On, J. Schmitt, and R. Steinmetz, “On availability Qos for replicated multimedia service and content”, In Proceedings of International Workshop on Interactive Distributed Multimedia Systems, 2002.

[7] G. On, J. Schmitt, and R. Steinmetz, “Quality of availability: replica placement for widely distributed systems”, In IWQos’03, 2003.

[8] H. Yu and A. Vahdat, “The costs and limits of availability for replicated services”, ACM SIGOPS Operating Systems Review, vol. 35, no. 5, p. 29, 2001.

[9] H. Yu and A. Vahdat, “The costs and limits of availability for replicated services”, ACM Transactions on Computer Systems, vol. 24, no. 1, pp. 70-113, 2006.

[10] M. Aguilera, R. Janakiraman, and L. Xu, “Using Erasure Codes Efficiently for Storage in Distributed System,” 2005 International Conference on Dependable Systems and Networks (DSN05).

[11] K. Ranganathan, A. Iamnitchi, and I. Foster, “Improve data availability through dynamic model-driven replication in large peer-to-peer communities”, In proceedings of 2nd IEEE/ACM International symposium on Cluster Computing and the Grid. 2002.

[12] F. Schintke and A. Reinefeld, “Modeling Replica Availability in Large Data Grids”, Journal of Grid Computing, vol. 1, no. 2, pp. 219-227, 2003.

[13] B. Tang, N. Jaggi, and M. Takahashi, “Achieving data K-Availability in intermittently connected sensor networks”, In Proceedings of the 23rd International Conference on Computer Communication and Networks (ICCCN’14), Shanghai, China, 2014.

[14] H. Weatherspoon and J. Kubiatowicz, “Erasure coding vs. replication: a quantitative comparison”, In proceedings of Peer-to-Peer Systems: First International Workshop (IPTPS), 2002.0

[15] A. Tamir, “An O(pn2) algorithm for the p-median and related problems on tree graphs”, Operations Research Letters, vol. 19, no. 2, pp. 59-64, 1996.

[16] V. Paxson, “End-to-end routing behavior in the Internet”, IEEE/ACM Transactions on Networking, vol. 5, no. 5, pp. 601-615, 1997.

[17] M. Tu, D. Xu, Z. Xia, and J, Fu, “Reach availability of replicated services”, In Proceedings of the 35th IEEE International Computer Software and Application Conference, July 2011.

[18] M. Tu, L. Xiao, and D. Xu, “Maximizing the availability of replicated services in widely distributed systems”, In Proceedings of IEEE International Conference on Software Security and Reliability, Washington DC, June, 2013.

[19] K. Kalpakis, K. Dasgupta and O. Wolfson, “Optimal placement of replicas in trees with read, write, and storage costs”, IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 6, pp. 628-637, 2001.

[20] O. Wolfson and A. Milo, “The multicast policy and its relationship to replicated data placement”, ACM Transactions on Database Systems, vol. 16, no. 1, pp. 181-205, 1991.

[21] O. Wolfson, S. Jajodia and Y. Huang, “An adaptive data replication algorithm”, ACM Transactions on Database Systems, vol. 22, no. 2, pp. 255-314, 1997.

[22] O. Kariv and S. Hakimi, “An Algorithmic Approach to Network Location Problems. II: The p-Medians”, SIAM Journal on Applied Mathematics, vol. 37, no. 3, pp. 539-560, 1979.

[23] L. Fratta and U. Montanari, “A Recursive Method Based on Case Analysis for Computing Network Terminal Reliability”, IEEE Transactions on Communications, vol. 26, no. 8, pp. 1166-1177, 1978.

[24] J. Fussell, “How to Hand-Calculate System Reliability and Safety Characteristics”, IEEE Transactions on Reliability, vol. -24, no. 3, pp. 169-174, 1975.

[25] “On the location of a tree-shaped facility”, Location Science, vol. 5, no. 1, p. 63, 1997.

[26] M. M. Halldórsson, K. Iwano, N. Katoh, and T. Tokuyama, “Finding Subsets Maximizing Minimum Structures,” SIAM Journal on Discrete Mathematics, vol. 12, no. 3, pp. 342–359, 1999.

[27] J. Kubiatowicz, C. Wells, B. Zhao, D. Bindel, Y. Chen, S. Czerwinski, P. Eaton, D. Geels, R. Gummadi, S. Rhea and H. Weatherspoon, “OceanStore”, ACM SIGARCH Computer Architecture News, vol. 28, no. 5, pp. 190-201, 2000.

[28] J. Wylie, M. Bakkaloglu, V. Pandurangan, M. Bigrigg, S. Oguz, K. Tew, C. Williams, G. Ganger, and P. Khosla, “Selecting the right data distribution scheme for a survivable storage system”, Technical Report CMU-CS-01-120, Carnegie Mellon University, 2000.

[29] M. Tu, H. Ma, I. Yen, F. Bastani, and D. Xu, “Availability, security, access performance and load balance in P2P data grid”, Journal of Grid Computing, Vol. 11, No. 1, 2013.

[30] A. Rahmani, Z. Fadaie and A. Chronopoulos, “Data placement using Dewey Encoding in a hierarchical data grid”, Journal of Network and Computer Applications, vol. 49, pp. 88-98, 2015.

[31] O. Theologou and J. Carlier, “Factoring and reductions for networks with imperfect vertices”, IEEE Transactions on Reliability, vol. 40, no. 2, pp. 210-217, 1991.

Authors

Michael Tu is a Ph.D. in Computer Science, Associate Professor of Computer Information Technology, director of the Center for Cyber Security and Infrastructure Protection at Purdue University Northwest. Dr. Tu’s areas of expertise are information assurance, digital forensics, and cybersecurity education. He has published over 40 peer reviewed papers in prestigious journals and peer reviewed conference proceedings. Dr. Tu is a Certified Information System Security Professional (CISSP), Certified Ethical Hacker (CEH), Certified Hacking and Forensics Investigator (CHFI) and AccessData Computer Examiner (ACE).

Dianxiang Xu, received the B.S., M.S., and Ph.D. degrees in computer science from Nanjing University, Nanjing, China. He is a Professor with the Department of Computer Science, Boise State University, Boise, ID, USA. His research interests include software security and safety, software engineering, access control, and software-defined systems. Dr. Xu has published over 100 peer reviewed papers in prestigious journals and peer reviewed conference proceedings.

Liangliang Xiao, received the M.S., and Ph.D. degrees in computer science from The University of Texas at Dallas, Richardson, USA. He is an Associate Professor with the Department of Computer Science & Information Technologies, Frostburg State University, Frostburg, MD, USA. His research interests include data security and distributed systems. Dr. Xiao has published over 30 peer reviewed papers in prestigious journals and peer reviewed conference proceedings.

International Journal of Computer Networks & Communications (IJCNC)

ENHANCING AVAILABILITY FOR DISTRIBUTED REPLICATED SERVICES CONSIDERING NETWORK EDGE AVAILABILITY

Leave a comment Cancel reply

Information

Shortlink

Navigation

International Journal of Computer Networks & Communications (IJCNC)

ENHANCING AVAILABILITY FOR DISTRIBUTED REPLICATED SERVICES CONSIDERING NETWORK EDGE AVAILABILITY

Share this:

Leave a comment Cancel reply

Information

Shortlink

Navigation