AIRCC PUBLISHING CORPORATION
Towards Efficient Privacy-Preserving DataAggregation for Advanced Metering Infrastructure
Navid Alamatsaz1, Arash Boustani2, Nima Alamatsaz3, Ashkan Boustani4
1,2Department of Electrcial Engineering and Computer Science, Wichita State University, Wichita, KS, USA.
3Department of Biomedical Engineering, New Jersey Institute of Technology, Newark, NJ, USA.
4Department of Statistics and Mathematics, University of Red Crescent Society of Iran, Mashad, Iran.
Recent changes to the existing power grid are expected to influence the way energy is provided and consumed by customers. Advanced Metering Infrastructure (AMI) is a tool to incorporate these changes for modernizing the electricity grid. Growing energy needs are forcing government agencies and utility companies to move towards AMI systems as part of larger smart grid initiatives. The smart grid promises to enable a more reliable, sustainable, and efficient power grid by taking advantage of information and communication technologies. However, this information-based power grid can reveal sensitive private information from the user’s perspective due to its ability to gather highly granular power consumption data. This has resulted in limited consumer acceptance and proliferation of the smart grid. Hence, it is crucial to design a mechanism to prevent the leakage of such sensitive consumer usage information in smart grid. Among different solutions for preserving consumer privacy in Smart Grid Networks (SGN), private data aggregation techniques have received a tremendous focus from security researchers. Existing privacy-preserving aggregation mechanisms in SGNs utilize cryptographic techniques, specifically homomorphic properties of public-key cryptosystems. Such homomorphic approaches are bandwidth-intensive (due to large output blocks they generate), and in most cases, are computationally complex. In this paper, we present a novel and efficient CDMA-based approach to achieve privacy-preserving aggregation in SGNs by utilizing random perturbation of power consumption data and with limited use of traditional cryptography. We evaluate and validate the efficiency and performance of our proposed privacy preserving data aggregation scheme through extensive statistical analyses and simulations
Smart Grid; Data-oriented Privacy; Secure data Aggregation; Spread Spectrum.
A Series Of Power Surges Over A Twelve-Second Period Triggered A Cascade Of Shutdowns In The US And Ontario On August 14, 2003. The Result Was The Biggest Blackout In North American history. 61800 megawatts of power were lost to over 50 million people. Studies showed that the outage was because of lack of real-time monitoring and diagnosis and failure in proper load balancing . Recently, Smart Grid has been proposed as the next generation power grid. A Smart Grid is an electrical grid that leverages communication technologies and information processing to gather, process, and act on collected information to improve reliability, efficiency, economics, and sustainability of the power grid in generation, transmission, and distribution . This information-based power grid will help the Utility Companies (UC) to act on consumer information gathered from Smart Meters (SM) at the user’s premises. The two-way communication capability will enable functions such as demand-response, demand-dispatch, self-monitoring, and self-diagnosis for the existing power grid . It also promises reduced prices through dynamic pricing schemes, wide penetration of renewable resources such as wind and solar, and fewer power outages . The topic of smart grid has attracted researchers to study various aspects of modernizing the electricity grid. The research community has been studying miscellaneous subjects such as communication technologies and infrastructure [3, 6, 7, 8, 9], legal and policy concerns [10, 11], reliability, failure diagnosis and recovery [12, 13, 14], demandresponse, demand-dispatch, load shaping, and peak-shaving [15, 16, 17], data aggregation [3, 18, 19, 20, 21, 22, 23] and, last but not the least, security and privacy [4, 5, 3, 24, 25, 26].
Advanced Metering Infrastructure (AMI) are systems that measure, gather, analyze energyusage, and communicate with metering devices such as water meters, gas meters, heat meters, and electricity meters. This communication is either on request or on a predetermined schedule. Government agencies and utilities are adopting AMI systems as part of the deployment of the smart grid. AMI improves current Advanced Meter Reading (AMR) technology by enabling two-way communications between the meter and the utility. This allows UCs to send commands to the meters for different purposes, such as time-of-use pricing information, demand-response actions, or remote disconnects .
Although AMI provides the UC with state-of-the-art capabilities, having access to fine grained consumer usage data can reveal information regarding the private lives of its users. For instance, it can be easily determined if a residential house is vacant or not by observing the fine-grained energy consumption patterns . It is also possible to track the location of the residents of a household based on the appliance they are using . Insurance companies can monitor and track eating, sleeping, and possibly exercise habits of a household [29, 30]. In 2009, the Dutch Parliament prohibited the utilization of smart meters because of privacy issues. It is worth mentioning that in Smart Grid Networks (SGN), data-oriented privacy is more of interest, as opposed to context-oriented privacy, because it deals with private consumer data. There are also many cyber security related challenges for the deployment of the Smart Grid . This “Internet-like distributed power grid” is vulnerable to many known and unknown cyber security attacks . The security threats to the Smart Grid can target the confidentiality and the integrity of the gathered fine-grained user data. They can also threaten the availability of the power grid. Computerworld  reports more than 170 outages caused by cyber-security attacks. It should go without saying that without appropriate security and privacy-preserving techniques, large-scale deployment and consumer-acceptance of the Smart Grid paradigm is difficult.
In general, data aggregation techniques are utilized to significantly reduce the volume of traffic being transmitted in an SGN by compressing data in the intermediate nodes (also called aggregators). Aggregation is an important technique for preserving network re-sources, such as bandwidth and energy . Also, it is deployed as a common approach to preserve data privacy against external adversaries as the aggregation process compresses large inputs to small outputs at the intermediate aggregators. However, this can lead to several new vulnerabilities against potential internal adversaries, such as the aggregator node itself. Thus, it is of paramount importance to design appropriate mechanisms for privacy-preserving data aggregation . Earlier privacy-preserving approaches have primarily used cryptographic techniques such as homomorphic encryption and secure multiparty computation in order to preserve user privacy while aggregating usage data . These approaches, although providing strong guarantees of confidentiality, are very heavy from a computational and communicational stand-point and may not be feasible on lowend smart meters with limited computation capabilities . Considering the huge scale of future smart meter deployment and the granularity of the data being gathered, existing communication networks will have difficulty handling this data because of resource constraints such as network capacity (bandwidth) [66, 67, 68]. Homomorphic cryptosystemsusually generate an output of a huge fixed-length as compared with the data generated by smart meters. This ciphertext can be up to one hundred times larger than the actual smart meter data . Given the frequency of the data being sent and possible bandwidth scarcity, this can lead to unacceptable delay and network overhead .
In this paper, we investigate the feasibility of existing privacy-preserving data aggregation approaches. We devise a novel, efficient, and feasible (from a communications perspective)data aggregation mechanism for SMs using coding theory, spread spectrum communications(SSC), and random perturbation techniques [36, 37]. We also evaluate the privacy protection level of our proposed scheme with well-established information-theoretic and statistical tools [38, 64, 39, 40]. Finally, we validate the performance of our aggregation mechanism by means of simulations.
The rest of the paper is organized as follows. Related work in the literature and backgroundon existing secure aggregation schemes is outlined in Section 2. The network and adversary model assumed in this work along with basics of SSC are presented in Section 3. Our proposed perturbation-based privacy-preserving aggregation utilizing SSC is outlined in Section 4. Evaluation and simulation results are discussed in Section 5. We conclude the paper with a summary of contributions and results in Section 6.
2. Background And Relatedwork
In this section, we outline mechanisms in the literature for privacy-preserving data aggregationin SGNs and also study some data aggregation methods in other networking infrastructure with similar constraints such as Wireless Sensor Networks (WSN).
2.1 Homomorphic Encryption For Data Aggregation
A public-key cryptosystem is known to have homomorphic properties if E(m1 m2) = E(m1) 4 E(m2), where E is the encryption function, and 4 are two mathematical operations, and m1;m2 are two input messages. In other words, a homomorphic property enables certain mathematical operations on the plaintext by performing specific operations on the ciphertext without observing any intermediate results in plaintext. Based on the supported operations, homomorphic cryptosystems fall into two broad categories: par tially homomorphic and fully homomorphic. Partially homomorphic cryptosystems only support either addition or multiplication, or in some cases polynomials up to certain degrees, whereas fully homomorphic cryptosystems support both addition and multiplication [3, 26]. It goes without saying that fully homomorphic cryptosystems provide much more flexibility and have recently received significant attention [41, 42]. However, given their computational complexity, they are not widely used in practical applications yet.Well-known homomorphic cryptosystems include RSA , El Gamal , Paillier ,Naccache-Stern , and Boneh-Goh-Nissim [46, 47].
In general, data aggregation techniques might support different aggregation functions such as sum, max, min, avg, median, and variance. However in SGNs, the UC is mostly interested in total consumption (sum) of a given neighborhood in a specific time period to enable functions such as demand-response, load-shaping, peak-shaving, and self-monitoring [4, 3, 15, 17]. Also, the average (avg) usage of each household might be of interest. Given that sum of consumed electricity of all smart meters in a residential neighborhood is required to be computed in a private fashion, the additive homomorphic property of the Paillier  cryptosystm can be useful. Also, the Boneh-Goh-Nissim cryptosystem [47, 26] (which is an extension of Paillier with bilinear groups) supports the additive homomorphic function. Rather than adding the consumption data in plaintext, one can multiply the encrypted values and then decrypt the result to get the addition of plaintext data. The Paillier encryption system works as explained in Protocol 1 (Key Generation), 2 (Encryption), and 3 (Decryption) . As it can be observed, the sum of plaintext can be computed from multiplication of the ciphertext, i.e. D(E(m1):E(m2) mod N2) = (m1 + m2)mod N or D(C1:C2 mod N2) = (m1 + m2)mod N, where N is the modulus for encryption and decryption.
Protocol 1. Key Generation.
He et al.  present a secure data exchange scheme for the smart grid based on homomorphic properties of Goh cryptosystem . Goh supports an arbitrary number of additions and a single multiplication on the ciphertext. It is worth noting that the aforementioned protocol is only a secure data communication scheme and does not address the problem of secure aggregation. Li et al.  utilize the homomorphic properties of Paillier to propose an incremental data aggregation scheme. In , every node passes its encrypted time-series data to its parent node on the aggregation tree. The parent node multiplies the received value into its own encrypted consumption data and passes the total result to the next parent node. Therefore, all the SMs participate in the aggregation without seeing any intermediate or final result. Garcia and Jacobs  present a privacy-preserving protocol using Paillier based on secret sharing. Their proposal hides consumption data from the UC as it receives random shares of data (instead of the entire data) which it cannot decrypt. The other nodes cannot retrieve meaningful information either since they only receive random shares. Kursawe et al.  propose two approaches to calculate total consumption in SGN. In their first approach, called aggregation protocols, smart metering data are masked in such a way that after summing the data from all smart meters masking values cancel each other out and the UC gets the total consumption information. In their second approach, named comparison protocols, they consider that the UC roughly knows the total consumption. Erkin and Tsudik  propose a cryptographic protocol based on a modified version of the Paillier cryptosystem to calculate the total consumption of all the SMs in a given neighborhood as well as a single SM in the AMI. Acs and Castelluccia  suggest a solution using masking and differential privacy and utilizing the homomorphic properties of a computationally-cheap cryptosystem for private data aggregation. Lu et al.  propose an Efficient and Privacy-Preserving Aggregation (EPPA) for smart grid communications by structuring multidimensional data and encrypting them with the Paillier cryptosystem. Erkin et al.  study different existing secure signal processing mechanisms in SGNs and compare different existing cryptographic methods in terms of computational complexity, efficiency, and imposed overhead.
It is worth noting that in WSNs another non-homomorphic, cryptographic approach has also been utilized; an intermediate node in the aggregation tree has to decrypt the data received from a downstream node, then aggregate the data according to the aggregation function, for instance sum, and finally encrypt the output of the aggregation function before forwarding the result to the up-stream node on the tree. Such schemes have several shortcomings, the most important of which is that they do not protect the privacy of the transmitted data from the neighboring sensor nodes. All neighbors share pairwise keys and are able to decrypt the incoming data. Hence, if the neighboring sensor node is honest-butcurious or if it is compromised and monitored by the adversary, the data in transit can be easily intercepted.
2.2 Non-Homomorphic Private Data Aggregation
A common path to privacy-preserving aggregation in WSNs is perturbing the raw data being transmitted by introducing a random noise [36, 37, 54, 61]. He et al.  propose two approaches to privacy-preserving data aggregation in WSNs. The basic idea of their first approach, Cluster-based Data Aggregation (CPDA), is to introduce noise to the raw data sensed by the sensor node, such that this noise will be cancelled out in the aggregation operation resulting in an accurate aggregate value. The main idea of their second proposed method, Slice-Mix-AggRegaTe (SMART), is to slice original data into pieces and recombine data. However, confidentiality and integrity of the data is not protected. In our work, we propose a secure aggregation scheme based on the properties of OCSs to preserve the confidentiality of the transmitted data without relying on traditional cryptographic techniques.
In the homomorphic encryption-based approaches discussed in [3, 18, 26, 48, 49, 50], we observe that the power-usage information is generally of small size (e.g. 20 bits) [4, 52]. However, the plaintext input size of most existing homomorphic cryptosystems is huge [3, 52], for example 2048 bits for the widely-used Paillier cryptosystem [42, 48, 50, 52]. As a result, the input data has to be padded before encryption and the size of the output is also large. Given the high frequency of data collection and the number of deployed smart meters, this will result in unacceptable communication overhead on the network, and also high processing burden on the smart meters with limited computational capabilities [52, 62]. Aggregation schemes that construct and utilize the spanning-tree, for instance by Li et al. , also do not consider performance issues. The processing and communication overhead makes the protocol less suitable in practical implementations. Moreover, depending on the depth of the spanning tree of the network, there can be large delays between the time power consumption data is reported by the meters and the time the aggregated data is received at the UC. In approaches proposed in [34, 54], the perturbed or the sliced data need to be encrypted before being sent to the neighbors. However, the keydistribution for such symmetric pair-wise encryption is non-trivial. In other words, any two node in the network will share symmetric keys which will result in a key distribution complexity of order O(n2), where n is the number of nodes in the network. Moreover, this encryption can put extra burden on the nodes with limited capabilities. Phulpin et al.  study the efficiency and benefits of network coding in both Power Line Communications (PLC) and wireless SGNs. The authors also show that using coding theory in SGN reduces the delay by decreasing the number of time slots and saves energy by reducing the number of transmissions.
Based on the aforementioned observations, designing an efficient privacy-preserving technique for aggregating SM data without using traditional crypto primitives with homomorphic properties seems to be necessary. We are proposing a privacy-preserving aggregation scheme using coding theory, spread spectrum communications, and statistical perturbation in order to efficiently aggregate power usage while improving network performance and decreasing unnecessary communication and computation loads on the SGN. Our contention-free scheme will also decrease the delay, BER, and interference. Our contributions are twofold: First, we introduce a simple, yet efficient, approach to perturb user data before aggregation in order to preserve user privacy. Second, we propose a secure aggregation scheme, AgSec, using SSC. Finally, we assess the privacy level and the performance of our scheme through analytical evaluations and simulations.
3.1 Network And Communication Model
Communication standards and technology to be used in the future smart grid and AMI is an ongoing debate. There are various communication options proposed for the smart grid including fiber optics, copper-wire line, power line communications, and miscellaneous wireless technologies. We consider the widely used wireless architecture for the deployment of SGN . The wireless communication between SMs, which are organized into groups called clusters, and the aggregator or Cluster Head (CH) uses IEEE 802.15.4 or Zigbee due to characteristics such as low power, short delay, self-organization, scalability, and high security . The aggregated data will be forwarded from the CH to the UC using a dedicated point-to-point link.
Figure 1 depicts the assumed three-level hierarchical network architecture. The communicationbetween the UC and the ith aggregator (CH) is denoted as UAi. Similarly ASi;j represents the communication between the ith aggregator and the jth smart meter in the ith cluster. Also there exists a separate out-of-band control and signaling channel between the ith aggregator and the jth smart meter in the ith cluster referred to as CCi;j . The signaling and control messages, which are used in the initialization phase, are discussed in detail in Section 4.1. The Zigbee medium access protocol on all AS channels is CDMA. Also, all UA communications are on a dedicated point-to-point channel. Our signaling channel uses a low-range wireless technology such as IEEE 802.15.4 or IEEE 802.11. The main advantage of Wi-Fi over Zigbee is its high data rate. However, Wi-Fi’s high energy consumption is an issue that should be considered. The Zigbee and Wi-Fi alliances have been working towards designing a standard that promotes Zigbee to work onWi-Fi, called Smart Energy 2.0 . Finally, the ith aggregator uses a CDMA broadcast channel BCi to distribute the perturbation information. n OCSs are used to broadcast random noise information on BCi. These random numbers will be utilized by SMs to perturb their time-series data.These nrandom numbers are placed in a [ ]ij=n Perturbation Matrix, where n is the number of SMs in the cluster. Every element of this matrix is coded with a unique OCS as described in Section 4.2. Figure 3 illustrates the components implemented in different network entities.
3.2 Communications On The CDMA Channel
All communications take place over four separate channels, as discussed in Section 3.1. All smart meter data from the smart meter to the aggregator are sent over the CDMAbased data channel, represented as the AS channel (in Fig. 1). The OCSs for encoding data transmission on the AS channel are generated using the Golay or PCC code generation
Figure 1 :Network Architecture
algorithms [58, 59]. These OCSs will be used to spread the data as explained later in Section
4.4. Golay OCSs can be generated recursively, as shown in Eqn. 1.
In Eqn. 1, L = 2M is the total number of available OCSs (which is also equal to the OCS length), where M 1 is the number of chips in each OCS. AL and BL are submatrices. In recursive OCS generation algorithms such as Golay (or PCC), OCSs can be organized into groups called flock based on chip pattern similarity and chip distance between OCSs. In Fig. 2-a, we can see the different flocks for 16-chip OCSs. Both Golay and PCC algorithms are able to produce L OCSs with a length of L-chips. The PCC generator matrix is shown in Eqn. 2 and OCSs of 16-chip length generated using PCC are shown in Fig. 2- b. OCSs generated by PCC have a uniform distribution of 1’s and -1’s, in contrast to OCSs generated by Golay. This property, which will result in having equal number of 1’s and -1’s, makes data transmission using PCC more fault tolerant than Golay. We can use any OCS generator algorithm (synchronous or asynchronous) in our proposed method. However, PCC and Golay are preferred because of equality in OCS lengtand number of generated OCSs, and high level of orthogonality .
Let us assume that time is divided into periods of random length denoted by a random variable ψT.. During each period, each smart meter is assigned a subset of OCSs for use in that period by the CH. The assignment happens over the CC signaling channel. The communications over the CC channels are secured, from possible sniffing nodes, using symmetric
Figure 2: a) A 16-chip Golay OCS matrix. b) A 16-chip PCC OCS matrix.
meter are randomly selected by the CH from a large pool of available OCSs. Each smart meter will use the OCSs uniquely assigned to it in the time frame ψT . In order to spread data bits on the AS data channel, the smart meter calculates the inner-product of every data-bit in appropriate OCS. Every single bit of data is coded independently with an OCS different from the previous and next data bit. This will build the foundation of our secure scheme as described in Section 4.4. It should be noted that it is possible for multiple smart meters to use the same OCS for data transmission in different parts of the network as long as their transmission ranges do not overlap and the SMs are in two diffrent clusters. This is required to make sure that the transmissions do not interfere with each other (in general, interference is anything that alters, modifies or disrupts a signal as it travels between a source and a receiver). The same CDMA concepts and principles are also deployed on the BCi channel. This broadcast channel is used by the CH to advertise perturbation data to the SMs, as discussed in Section 4.2.
It should be noted that, before spreading the data on the CDMA channel using the introduced OCSs, a scrambling code is utilized between the sender and receiver for security purposes. This code, which is generally 242 chips long , is referred to as the Long Code. In order to appropriately use this long code, the sender and receiver must be synchronous with a GPS or Coordinated Universal Time (UTC) system .
Figure 3: Entities used in the privacy-preserving aggregation.
3.3 Adversary Model
Based on their behavior, all entities in the proposed smart grid communication network can fall into one of the following three broad categories. (i) honest entities that fully follow the rules of the established protocol. (ii) malicious or cheating nodes that do not follow the protocol. Malicious behavior includes, but is not limitted to, insertion, deletion, and forging of messages in the system. (iii) semi-honest or honest-but-curious nodes that follow the defined protocols but they attempt to infer privacy-sensitive data from the input/output of the protocols and the intermediate data generated due to protocol execution. In our proposed scheme we consider the UC and the CH as honest-but-curious. In other words, they follow the established protocol but they can also try to infer privacy-sensitive information from the time-series data. The neighboring SMs are, generally, semi-honest. Our objective is to completely secure all the communications from malicious and semi-honest SMs and other adversarial nodes against possible sniffing, spoofing, and inference attacks and hence, maintain the consumers’ privacy while still providing the UC with required aggregate values. Particularly, we are interested in protecting the system against the following attacks: (i) inference of individual data by CH and UC. (ii) eavesdropping (sniffing) by external adversaries. (iii) forging (spoofing) of smart meter data.
4. Privacy-Preserving Aggregation
4.1 Initialization Phase
Upon initial deployment, CHi communicates control information to smart meter SMj through CCi;j . For each time duration the CH assigns each smart meter, SMj , a set
Figure 4: Initialization Parameters
4.2. Privacy-Preserving Via Random Noise Perturbation
Before discussing our secure aggregation protocol, we would like to introduce our random
noise perturbation technique. Instead of aggregating the original smart meter data and sending the aggregate value to the UC, every smart meter utilizes a pseudo-random noise to perturb its data before aggregation. This perturbed data (instead of the original data) will be sent for aggregation to the CH. The received perturbed values Pi will be aggregated at the CH given the aggregation function in Section 4.4. Perturbation techniques in the literature usually follow two approaches. The basic idea of one group of such approaches is to add noise to the actual data such that the aggregator, or the CH in our case, can calculate an accurate aggregate value without inferring individual data transmitted by every node . In a second similar direction, the data can be manipulated such that the aggregator can calculate an aggregate value which is an estimate of the histogram of data distribution rather than the actual aggregate value of the original data .
After all SMs are configured with appropriate OCS and ID information; they should start transmitting their readings periodically. Different time intervals for data reporting, ranging from 30 seconds to a few hours, could be found in the literature . However, before transmitting, some noise should be added to this raw data. This random noise should be chosen in such a way that it does not affect the total aggregate value.
As noted earlier, in smart metering systems, the UC is generally interested in the output of two aggregation functions for a given neighborhood in a specific time period ψT . First, the sum of consumed electricity is desired, and second, the average consumption of every smart meter is of interest. These two values can help power companies plan accordingly for demand-response purposes. Based on these assumptions, our perturbation technique must be designed in such a way that the aggregator can calculate an accurate aggregate value while keeping individual meter readings confidential. Assume every SMi,j in cluster i has the data dj to transmit. The sum and average of the data of all the SMs in this n smart meter cluster is:
As an alternative solution, after a smart meter fetches a pseudo-random number, it can send a packet on the control channel back to the CH indicating that pseudo-random number has been used. The sender of the packet has to be anonymized such that CH cannot distinguish which SM is using that pseudo random number. Different anonymization techniques (such as replacing the sender ID with a pseudonym) can be found in the literature . In the anonymization process, the packets sent from SM to CH are anonymized, i.e., the user part (source) of each packet is replaced by a user pseudonym.
4.3 Privacy Protection Evaluation
Many efforts in the past few years have been focused on designing privacy-preserving mechanisms for the smart grid. However, only a limited number of these works have presentedan analytical framework to quantify the privacy leakage before and after the deployment
Figure 5: Perturbation Matrix.
of their privacy-preserving approach. Here, we introduce a simplistic certainty-based privacy analysis. The notion of entropy by Shannon is a well-known measure of uncertainty in information theory . The maximum uncertainty is achieved when entropy is maximized. Let X be a continuous random variable with probability density function (pdf) fX(x), then, theentropy of X is defined as follows:
It should go without saying that increasing the range from which the random numbers are selected (c b) will increase entropy, and thus, decrease certainty of potential inference attacks. Now, we would like to see the result of this perturbation on the entropy of bX. In general, assuming that X and A are two continuous random variables with the same range,we have :
As it can be concluded from the above equation, the entropy of bX is always greater than or equal to the maximum entropy of X and A. Since the uniform distribution has the maximum entropy among all distributions, adding the smart meter data with uniformly distributed pseudo-random numbers will maximize entropy, minimize certainty, and hence, improve privacy. It should be noted that, here, we are not assuming any specific attack functions or adversarial strength. Given the a priori knowledge of the adversary, it might be able to infer information by observing bX.In such a scenario, the entropy of the inferred information, denoted by the random variable Y, should be studied.
4.4 Proposed Secure Aggregation Protocol(Agsec)
CH will receive a signal including all the bits transmitted by all the smart meters. The received signal will be decoded by CH using all valid OCSs that it initially assigned to theSMs. Since CH maintains a table of assigned OCSs (in the same order that was agreed in the initialization phase) and IDs to every single SM in the network, it is able to decode thedata by using appropriate OCS for every bit. Hence, after decoding the received signal,CH has all individual (perturbed) data sent by all the SMs in the cluster. Then, it adds allthe received data and sends the aggregate value to the UC on the point-to-point UA link.It should be noted again, the perturbation noise will be cancelled out upon addition. Ourproposed secure aggregation technique is outlined in protocols 4, 5 and 6. (Even if data in transit could be decoded, it would still not be useful to the adversary as they are alreadyperturbed.)
Protocol 4: UC function.
In protocol 4, the UC receives the aggregated data from the CH on the UA channel.Protocol 5 elaborates how CH generates and distributes OCSs (for aggregation and perturbation)to the SMs. Also, it shows how the data is despread, aggregated, and forwarded to the UC by CH. Finally, protocol 6 elaborates how SM receives the initialization information,perturbs data and transmits to the CH on the AS channel.
4.4.1 Security Analysis
Here, we would like to show that sniffing attacks against our CDMA-based aggregation are not feasible. This claim is based on the following considerations:
Protocol 5: CH function.
Protocol 6: SM function.
5. EVALUATION AND SIMULATION RESULTS
Below, we present a simple analysis that compares end-to-end and hop-by-hop delays in homomorphic approaches versus our proposed CDMA-based aggregation. We evaluatethe performance of our aggregation scheme through extensive simulations.
5.1 Comparative Performance Evaluation By Numerical Analysis As discussed in Section 2.1, existing secure aggregation schemes impose a significant communication and computation overhead on SGNs with limited capabilities. Private aggregationschemes based on the homomorphic properties of cryptosystems require fixed large size input blocks and are not ideally suited for small-sized data generated by SMs.The 20 to 30 bit  output data generated by SMs has to be padded, e.g.,to 2048 bits for Paillier , before encryption. In our approach, by choosing OCSs with appropriate length,this overhead can be significantly reduced. Readers should note that in our scheme each bit will be spread to L bits after encoding.
It is worth mentioning that Saputro and Akkaya  have analyzed the performance of homomorphic aggregation through extensive simulations. Not surprisingly, their resultsconfirm our evaluation. The authors show that homomorphic encryption for data aggregation is very expensive in terms of communication overhead. They have also comparedETE homomorphic data aggregation with Hop-by-Hop (HBH) decrypt, aggregate, encrypt at intermediate aggregator nodes via regular stream-ciphers, such as RC-4. Surprisingly, bothapproaches show similar performance from a computation perspective (One multiplication in homomorphic ETE aggregation is as expensive as three operations in HBH aggregation:decrypt, add, encrypt) . However, as our analysis also confirms, the authors show that ETE aggregation via homomorphic encryption generates extraordinarily large data whichwill result in unacceptable communication overhead on the SGN.
5.2 Simulation Results
Figure 7: OCS Length versus Communication Overhead.
Figure 8: OCS Length versus Delay.
Table 2: Simulation Parameters
Existing approaches to privacy-preserving data aggregation in smart grid generally utilize the homomorphic properties of public-key cryptosystems. However, as we have thoroughlyinvestigated, these approaches are expensive from a communication stand-point. In this paper, we proposed a two-step approach towards efficient private data aggregation inSGNs. First, we introduced a random perturbation technique which is used to statistically alter the time-series data of every SM such that individual consumption patterns could notbe inferred and yet the sum and average values of the reported power consumption in a given neighborhood can be calculated accurately. Second, we proposed an efficient andsecure data aggregation scheme which utilizes the properties of spread spectrum communications.Our evaluation and simulation results confirmed that our approach increasesperformance and decreases communication overhead on SGNs considerably, as compared with existing homomorphic aggregation schemes.
 Alamatsaz, N., Boustani, A., Jadliwala, M., Namboodiri, V. (2014) AgSec: Secure and EfficientCDMA-based Aggregation for Smart Metering Systems,Consumer Communications and NetwokringConference (CCNC), 2014 11th IEEE, 102-108.
 IESO, Blackout 2003, url = http://www.ieso.ca/imoweb/EmergencyPrep/ blackout2003, 2012.
 Erkin, Z., Troncoso-Pastoriza, J.R., Lagendijk, R.L., Perez-Gonzalez, F. (2013) Privacy-preserving data aggregation in smart metering systems: an overview, Signal Processing Magazine, IEEE, volume 30, number 2, pages=75-86, ISSN=1053-5888
 Rouf, Ishtiaq and Mustafa, Hossen and Xu, Miao and Xu, Wenyuan and Miller, Rob and Gruteser, Marco, (2012) NeighborhoodWatch: Security and Privacy Analysis of Automatic Meter Reading Systems, Proceedings of the 2012 ACM Conference on Computer and Communications Security, CCS ’12, isbn = 978-1-4503-1651-4, Raleigh, North Carolina, USA,462–473, ACM,New York, NY, USA.
 Barenghi, A. and Pelosi, G. (2011) Security and Privacy in Smart Grid Infrastructures, Database
and Expert Systems Applications (DEXA), 2011 22nd International Workshop on, 102-108, ISSN=1529-4188
 Van Engelen, A.G., Collins, J.S., (2010) Choices for Smart Grid Implementation, System Sciences (HICSS), 2010 43rd Hawaii International Conference on, ISSN=1530-1605.
 Bose, A., (2010) Smart Transmission Grid Applications and Their Supporting Infrastructure, Smart Grid, IEEE Transactions on, volume 1, number 1, pages 11-19, ISSN 1949-3053.
 Yu, F.R., Zhang, p., Weidong, X., Choudhury, P., (2011), Communication systems for grid integration
of renewable energy resources, Network, IEEE, volume 25, number 5, pages 22-29, ISSN 0890-8044.
 Amin, R., Martin, J., Xuehai Z., (2012) Smart Grid communication using next generation heterogeneous
wireless networks, Smart Grid Communications (SmartGridComm), 2012 IEEE Third International Conference on, pages 229-234.
 Tabors, R.D., Parker, G., Caramanis, M.C., (2010) Development of the Smart Grid: Missing Elements
in the Policy Process, System Sciences (HICSS), 2010 43rd Hawaii International Conference on, pages 1-7, ISSN 1530-1605.
 Schuler, R.E., (2010) Electricity Markets, Reliability and the Environment: Smartening-Up the Grid, System Sciences (HICSS), 2010 43rd Hawaii International Conference on, pages 1-7, ISSN 1530-1605.
Niyato, D., Wang, P., Hossain, E., (2012) Reliability analysis and redundancy design of smartgrid wireless communications system for demand side management, Wireless Communications,IEEE, volume 19, number 3, pages 38-46, ISSN 1536-1284.
 Falahati, B., Yong F., Lei W., (2012) Reliability Assessment of Smart Grid Considering Direct
Cyber-Power Interdependencies, Smart Grid, IEEE Transactions on, volume 3, number 3, pages
1515-1524, ISSN 1949-3053.
 Moslehi, K., Kumar, R., (2010) A Reliability Perspective of the Smart Grid, Smart Grid, IEEE
Transactions on, volume 1, number 1, pages 57-64, ISSN 1949-3053.
 Shao, S., Pipattanasomporn, M., Rahman, S., (2011) Demand Response as a Load Shaping Tool in an Intelligent GridWith Electric Vehicles, Smart Grid, IEEE Transactions on, volume 2, number4, pages 624-631, ISSN 1949-3053.
 Shao, S., Pipattanasomporn, M., Rahman, S., (2012) Grid Integration of Electric Vehicles and
Demand ResponseWith Customer Choice, Smart Grid, IEEE Transactions on,volume 3, number1, pages 543-550, ISSN 1949-3053.
 Paschalidis, I.C., Binbin, L., Caramanis, M.C., (2012) Demand-Side Management for Regulation
Service Provisioning Through Internal Pricing, Power Systems, IEEE Transactions on, volume27, number 3, pages 1531-1539, ISSN 0885-8950.
 Li, F., Luo, B., Liu, P., (2010) Secure Information Aggregation for Smart Grids Using HomomorphicEncryption, Smart Grid Communications (SmartGridComm), 2010 First IEEE InternationalConference on, pages 327-332.
 Yan, Y., Qian, Y., Sharif, H., (2011) A Secure Data Aggregation and Dispatch Scheme for Home
Area Networks in Smart Grid, Global Telecommunications Conference (GLOBECOM 2011),
2011 IEEE, pages 1-6, ISSN 1930-529X.
 Bartoli, A., Hernandez-Serrano, J., Soriano, M., Dohler, M., Kountouris, A., Barthel, D., (2010)
Secure Lossless Aggregation for Smart Grid M2M Networks, Smart Grid Communications
(SmartGridComm), 2010 First IEEE International Conference on, pages 333-338.
 Li, H., Lin, K., Li, K, (2011), Energy-efficient and High-accuracy Secure Data Aggregation in
Wireless Sensor Networks, Elsevier Science Publishers B. V., volume 34, number 4, issn 0140-3664, pages 591–597.
 Weng, C., Li, M., Lu, X., (2008) Data Aggregation with Multiple Spanning Trees in Wireless
Sensor Networks, Embedded Software and Systems, 2008. ICESS ’08. International Conferenceon, pages 355-362.
 Mustafa, M., Zhang, N., Kalogridis, G., Fan, Z., (2014) DESA: A Decentralized, Efficient and
SelectiveAggregation Scheme in AMI, Smart Grid, IEEE Transactions on.
 Fhom, H.S., Bayarou, K.M., (2011) Towards a Holistic Privacy Engineering Approach for Smart
Grid Systems, Trust, Security and Privacy in Computing and Communications (TrustCom), 2011
IEEE 10th International Conference on, pages 234-241.
 Line, M.B., Tondel, I.A., Jaatun, M.G., (2011) Cyber security challenges in Smart Grids, Innovative
Smart Grid Technologies (ISGT Europe), 2011 2nd IEEE PES International Conference and Exhibition on, pages 1-8, ISSN 2165-4816.
 Xingze, H., Pun, M., Kuo, C.-C.J., (2012) Secure and efficient cryptosystem for smart grid using
homomorphic encryption, Innovative Smart Grid Technologies (ISGT), 2012 IEEE PES, pages
 Lisovich, M. A., Wicker, S., (2008) Privacy concerns in upcoming residential and commercial
demand-response systems, Clemson University Power Systems Conference.
 Lisovich, M.A., Mulligan, D.K., Wicker, S.B., (2010) Inferring Personal Information from
Demand-Response Systems, Security Privacy, IEEE, volume 8, number 1, pages 11-20, ISSN
 Molina-Markham, A., Shenoy, P., Fu, K., Cecchet, E., Irwin, D., (2010) Private Memoirs of a Smart Meter, ACM BuildSys Work shop, isbn 978-1-4503-0458-0, Zurich, Switzerland, series
BuildSys ’10, pages 61–66, url= http://doi.acm.org/10.1145/1878431.1878446, New York, NY,USA.
 McDaniel, P., McLaughlin, S., (2009) Security and Privacy Challenges in the Smart Grid, Security Privacy, IEEE, volume 7, number 3, pages 75-77, ISSN 1540-7993.
 Cohen, F., (2010) The Smarter Grid, Security Privacy, IEEE, volume 8, number 1, pages 60-63,ISSN 1540-7993.
 Vijayan, j., (2010) Stuxnet renews power grid security concerns, Computerworld magazine.
 Li, N., Zhang, N., Das, S. K., Thuraisingham, B., (2009) Privacy Preservation in Wireless Sensor Networks: A State-of-the-art Survey, Elsevier Science Publishers B. V., Ad Hoc Network,volume 7, number 8, issn 1570-8705, pages 1501-1514.
 He,W., Nguyen, H., Liu, X., Nahrstedt, K., Abdelzaher, T., (2008) iPDA: An integrity-protecting private data aggregation scheme for wireless sensor networks, IEEE Military Communications
Conference, pages 1-7.
 Lagendijk, R.L. and Erkin, Z. and Barni, M., (2013) Encrypted signal processing for privacy TRANSACTIONS ON DATA PRIVACY ()
24NavidAlamatsaz, ArashBoustani, NimaAlamatsaz, AshkanBoustani protection: Conveying the utility of homomorphic encryption and multiparty computation,
Signal Processing Magazine, IEEE, volume 30, number1, pages 82-105, ISSN 1053-5888.
 Agrawal, R., Srikan, R., (2000) Privacy-preserving data mining, SIGMOD, pages 49-54.
 Huang, Z., Du, W., Chen, B., (2005) Deriving Private Information from Randomized Data, Proceedings of the 2005ACMSIGMOD International Conference on Management of Data SIGMOD’05, isbn 1-59593-060-4, pages 37-48, url = http://doi.acm.org/10.1145/1066157.1066163.
 Ross, S. M. (2009) Introduction to Probability and Statistics for Engineers and Scientists, FourthEdition- Academic Press.
 Koralov, L. B., Sinai, Y. G., Theory of Probability and Random Processes,Second Edition-Springer.
 Cover, T., Thomas J., (2006) Elements of Information Theory, JohnWiley and Sons.
 Van Dijk, M., Gentry, C., Halevi, S., Vaikuntanathan, V., (2010) Fully Homomorphic Encryption over the Integers, Proceedings of the 29th Annual International Conference on Theory and Applications
of Cryptographic Techniques EUROCRYPT’10, isbn 3-642-13189-1, 978-3-642-13189-9,French Riviera, France, pages 24-43, Springer-Verlag.
 Paillier, P., (1999) Public-key cryptosystems based on composite degree residuosity classes, EUROCRYPT.
 Rivest, R. L., Shamir, A., Adleman, L., (1978) A Method for Obtaining Digital Signatures and Public-key Cryptosystems, Commun. ACM, volume 21, number 2, issn 0001-0782, pages 120-
 El Gamal, T., (1985) A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms, Proceedings of CRYPTO 84 on Advances in Cryptology, isbn 0-387-15658-5, Santa Barbara, California, USA, pages 10-18, Springer-Verlag.
 Naccache, D., Stern, J., (1998) A New Public Key Cryptosystem Based on Higher Residues, ACM Conference on Computer and Communications Security, isbn 1-58113-007-4, pages 59-66, ACM.
 Goh, E.J., (2007) Encryption Schemes from Bilinear Maps, Department of Computer Science,Stanford University.
 Boneh, D.,Goh, E., Nissim, K., (2005) Evaluating 2-DNF Formulas on Ciphertexts, Proceedings of the Second International Conference on Theory of Cryptography, TCC’05, isbn 3-540-24573-1,
978-3-540-24573-5, pages 325–341, Springer-Verlag.
 Garcia F. D, Jacobs B., (2010) Privacy-friendly energy-metering via homomorphic encryption.
 Kursawe, K., Danezis, G., Kohlweiss, M., (2011) Privacy-friendly Aggregation for the Smart-grid, Proceedings of the 11th International Conference on Privacy Enhancing Technologies
PETS’11, isbn 978-3-642-22262-7, Waterloo, ON, Canada, pages 175-191, url = http://dl.acm.org/citation.cfm?id=2032162.2032172, Springer-Verlag.
 Erkin, Z., Tsudik G., (2012) Private computation of spatial and temporal power consumption with smart meters ACNS.
 Cs G., Castelluccia C., (2011) I have a DREAM! (Differentially PrivatE smart Metering),ACM IH.
 Rongxing, Lu, Xiaohui, L., Xu, L., Xiaodong, L., Xuemin, S., (2012) EPPA: An Efficient and Privacy-Preserving Aggregation Scheme for Secure Smart Grid Communications, EEE Tran. on
Parallel and Distributed Systems, volume 23, number 9, pages 1621-1631, ISSN=1045-9219.
 Plantard, T., Susilo, W., Zhang, Z., (2013) Fully Homomorphic Encryption Using Hidden Ideal Lattice, Information Forensics and Security, IEEE Transactions on, volume 8, pages 2127-2137,ISSN 1556-6013.
 Wenbo H., Xue L., Hoang N., Nahrstedt, K., Abdelzaher, T., (2007) PDA: Privacy-Preserving Data Aggregation in Wireless Sensor Networks, IEEE INFOCOM, pages 2045-2053, ISSN 0743-166X.
 Zanjani, M.B., Monsefi, R., Boustani, A., (2010) Energy efficient/highly secure data aggregation method using tree-structured orthogonal codes for Wireless Sensor Networks, Software
Technology and Engineering (ICSTE), 2010 2nd International Conference on, volume 2, pages V2-260-V2-265.
 Zanjani, M.B., Boustani, A., (2011) Energy aware and highly secured data aggregation for gridbased asynchronous Wireless Sensor Networks, IEEE PacRim’11, pages 555-560, ISSN 1555-5798.
 Phulpin, Y., Barros, J., Lucani, D., (2011) Network coding in Smart Grids, Smart Grid Communications(SmartGridComm), 2011 IEEE International Conference on, pages=49-54.
 Chen, H. H., (2007) The Next Generation CDMA Technologies, John Wiley and Sons.
 Boustani, A., Sabet, J., Azizi, M., Mirmotahhary, N., Khorsandi, S., (2010) Persian Code: A new orthogonal spreading code generation algorithm for spread spectrum CDMA systems,Wireless
Advanced (WiAD), 2010 6th Conference on, pages 1-5.
 Boustani, A., Alamatsaz, N., Jadliwala, M., Namboodiri, V. (2014) LocJam: A Novel Jammingbased Approach to Secure Localization in Wireless Networks, Consumer Communications and
Netwokring Conference (CCNC), 2014 11th IEEE.
 Zhang, W., Wang, C., Feng, T., (2008) Generic Privacy-Preservation Solutions for Approximate Aggregation of Sensor Data (concise contribution), Pervasive Computing and Communications,
 Saputro, N., Akkaya, K., (2012) Performance Evaluation of Smart Grid Data Aggregation via
Homomorphic Encryption, Wireless Communications and Networking Conference (WCNC),2012 IEEE, pages 2945-2950, ISSN 1525-3511.
 Xu, D.,Wang, Y., Shi, X., Yin, X., (2010) 802.11 User Anonymization, Global Telecommunications Conference (GLOBECOM 2010), 2010 IEEE, pages 1-5.
 Gray, R. M., (2011) Entropy and Information Theory, Second Edition- Springer, isbn 978-1441979698.
 Sankar, L., Rajagopalan, S.R., Mohajer, S., Poor, H.V., (2013) Smart Meter Privacy: A Theoretical Framework, Smart Grid, IEEE Transactions on, volume 4, pages 837-846, ISSN 1949-3053.
 Karimi, B., Namboodiri, V., Jadliwala, M., (2013), On the scalable collection of metering data in smart grids through message concatenation, Smart Grid Communications (Smart-
GridComm), 2013 IEEE International Conference on, pp. 318-323, doi 10.1109/SmartGrid-Comm.2013.6687977.
 Luan, W., Sharp, D., Lancashire, S., (2010) Smart grid communication network capacity planning for power utilities, in Transmission and Distribution Conference and Exposition, IEEE PES,
 Allalouf, M., Gershinsky, G., Lewin-Eytan, L., Naor, J., (2011) Data-qualityaware volume reduction in smart grid networks, in Smart Grid Communications (SmartGridComm), 2011 IEEE
International Conference on, 2011, pp. 120125
 Fazel, K., Kaiser, S., (2008) Multi-Carrier and Spread Spectrum Systems From OFDM and MCCDMA to LTE and WiMAX, A JohnWiley and Sons, Ltd, ISBN 978-0-470-99821-2.