
ABSTRACT

Title of Dissertation:

SECURING MULTI-LAYER COMMUNICATIONS: A SIGNAL PROCESSING APPROACH

Yinian Mao, Doctor of Philosophy, 2006

Dissertation directed by: Professor Min Wu, Department of Electrical and Computer Engineering

Security is becoming a major concern in this information era. The development in wireless communications, networking technology, personal computing devices, and software engineering has led to numerous emerging applications whose security requirements are beyond the framework of conventional cryptography. The primary motivation of this dissertation research is to develop new approaches to the security problems in secure communication systems, without unduly increasing the complexity and cost of the entire system. Signal processing techniques have been widely applied in communication systems. In this dissertation, we investigate the potential, the mechanism, and the performance of incorporating signal processing techniques into various layers along the chain of secure information processing. For example, for application-layer data confidentiality, we have proposed atomic encryption operations for multimedia data that can preserve standard compliance and are friendly to communications and delegate processing. For multimedia authentication, we have discovered a potential key disclosure problem in popular image hashing schemes, and proposed mitigation solutions. In physical-layer wireless communications, we have discovered the threat of signal-garbling attacks from compromised relay nodes in the emerging cooperative communication paradigm, and proposed a countermeasure to trace and pinpoint the adversarial relay. For the design and deployment of secure sensor communications, we have proposed two sensor location adjustment algorithms for mobility-assisted sensor deployment that jointly optimize sensing coverage and secure communication connectivity. Furthermore, for general scenarios of group key management, we have proposed a time-efficient key management scheme that improves the scalability of contributory key management from O(log n) to O(log(log n)) using scheduling and optimization techniques. This dissertation demonstrates that signal processing techniques, along with optimization, scheduling, and beneficial techniques from other related fields of study, can be successfully integrated into security solutions for practical communication systems. The fusion of different technical disciplines can take place at every layer of a secure communication system to strengthen communication security and improve performance-security tradeoffs.

SECURING MULTI-LAYER COMMUNICATIONS: A SIGNAL PROCESSING APPROACH

by Yinian Mao

Dissertation submitted to the Faculty of the Graduate School of the University of Maryland, College Park, in partial fulfillment of the requirements for the degree of Doctor of Philosophy, 2006

Advisory Committee:
Professor Min Wu, Committee Chair
Professor K. J. Ray Liu
Professor Virgil Gligor
Professor Gang Qu
Professor Lawrence C. Washington

© Copyright by Yinian Mao 2006

ACKNOWLEDGEMENTS

I am eternally grateful to my advisor, Professor Min Wu, for her guidance in instilling in me the quintessence of scientific research; for her unwavering support and constant encouragement of my research endeavors; for being my role model in the endless pursuit of academic and professional excellence; and for her immense patience in teaching the unteachable and mentoring the unthinkable. I am also deeply indebted to Professor K. J. Ray Liu, Professor M. Kıvanç Mıhçak, and Professor Yan Lindsay Sun for their help and support during my graduate studies. Professor Liu brought vision and energy that helped propel my research exploration. Professor Mıhçak and Professor Sun are resourceful mentors to learn from and wonderful collaborators to work with. I would like to thank my colleagues in the University of Maryland multimedia group: Guan-Ming Su, Shan He, Hongmei Gou, and Ashwin Swaminathan. They have kept me in good company during the past five years, provided a helping hand when needed, and contributed unselfishly during collaborations. Finally, I give my heartfelt gratitude to my mother, Ms. Youqin Zhou. She gave me unconditional love and support, and made enormous efforts and countless sacrifices so that I could have the opportunity to pursue my Ph.D. study. I dedicate this thesis to her.

TABLE OF CONTENTS

List of Tables
List of Figures

1 Introduction
  1.1 Motivation
  1.2 Background
  1.3 Organization and Contribution

2 Multimedia Encryption and Content Authentication
  2.1 Background and Preliminaries
    2.1.1 Encryption Before and After Coding
    2.1.2 Encryption at Intermediate Stages of Coding
    2.1.3 Content-based Authentication using Image Hashing
  2.2 Generalized Index Mapping with Controlled Overhead
    2.2.1 Generalized Index Mapping
    2.2.2 Analysis and Control of Overhead
    2.2.3 Set-Partitioning in Index Mapping
    2.2.4 Examples and Discussions
  2.3 Constrained Shuffling
    2.3.1 Intra Bitplane Shuffling (IBS)
    2.3.2 Analysis of Overhead
    2.3.3 The Security of IBS
    2.3.4 Other Forms of Constrained Shuffling
  2.4 Attack Modelling and Security Measurement
    2.4.1 Approximation Recovery
    2.4.2 Visual Similarity Scores
    2.4.3 Evaluation Framework for Multimedia-Oriented Security
  2.5 Application Example: Video Encryption System Design
    2.5.1 System Setup
    2.5.2 Bitrate Overhead Study for Index-Mapping Encryption
    2.5.3 Base-Layer Video Encryption
    2.5.4 Protecting FGS Enhancement Layer Video
  2.6 The Security in Media Authentication using Image Hashing
    2.6.1 An Analysis of Image Hashing
    2.6.2 Unicity Distance for Polar Fourier Transform Hash
    2.6.3 Unicity Distance for Hash using Randomized Spatial Statistics
    2.6.4 Discussions
  2.7 Chapter Summary
  2.8 Appendix: Derivations
    2.8.1 Overhead Analysis for Generalized Index Mapping
    2.8.2 Overhead Analysis for Intra-bitplane Shuffling

3 Tracing Malicious Relay in Cooperative Wireless Communications
  3.1 System Model and Proposed Framework
    3.1.1 System Setting
    3.1.2 Attack Modelling
    3.1.3 Proposed Framework
  3.2 Detecting Garbled Signals
    3.2.1 Resolving Ambiguity Using One Receive Antenna
    3.2.2 Resolving Ambiguity Using Two Receive Antennas
    3.2.3 Generalization
  3.3 Cross-Layer Scheme for Tracing Adversarial Relay
    3.3.1 The Tracing Algorithm
    3.3.2 Analysis of the Correlation Statistics ρ
    3.3.3 A Randomized Attack Strategy
    3.3.4 Effects of Channel Estimation Error
  3.4 Simulation Results
    3.4.1 Basic Attack with Perfect Channel Information
    3.4.2 Randomized Attack with Perfect Channel Information
    3.4.3 Randomized Attack with Channel Estimation Error
    3.4.4 Discussions
  3.5 Chapter Summary

4 Coordinated Sensor Deployment for Security and Sensing Performance
  4.1 Background and Related Works
    4.1.1 The Sensing Coverage Problem
    4.1.2 Key Pre-distribution for Sensor Networks
    4.1.3 Adversary Model and Link Compromise Probability
  4.2 Lattice-Structured Deployment
    4.2.1 Fundamental Relations Between Deployment Lattices
    4.2.2 Secure Connectivity Under Perturbed Deployment Lattice
  4.3 Security-Aware Sensor Location Adjustment
    4.3.1 Improving Secure Connectivity Using the Virtual Force Framework
    4.3.2 Sensor Location Adjustment Based on Vector Quantization
  4.4 Chapter Summary
  4.5 Appendix: A Model for Link Compromise Probability

5 Optimizing Time Efficiency in Contributory Group Key Management
  5.1 Efficiency Aspects in Contributory Key Agreement
    5.1.1 Background on Tree-based Contributory Key Management
    5.1.2 Time-Efficiency Issues in Contributory Key Agreements
    5.1.3 Communication and Computation Efficiency
  5.2 Join-Exit Tree (JET) Algorithms
    5.2.1 The Join Tree Algorithm
    5.2.2 The Exit Tree Algorithm
  5.3 Group Key Agreement Based on Join-Exit Tree
    5.3.1 Group Key Establishment
    5.3.2 Join Protocol
    5.3.3 Leave Protocol
  5.4 Experiments and Performance Analysis
    5.4.1 Key Establishment for Sequential User Join
    5.4.2 Experiment Using MBone User Activity Data
    5.4.3 Experiments Using Simulated User Activity Data
  5.5 Discussions
    5.5.1 Extension to Multi-Level Join Tree
    5.5.2 The Implementation of Key Agreement
    5.5.3 Protocol Complexity
  5.6 Chapter Summary
  5.7 Appendix: Derivations
    5.7.1 Derivation for Inequality (5.5)
    5.7.2 Derivation for Inequality (5.11)

6 Conclusions and Future Perspectives

Bibliography

LIST OF TABLES

2.1 Index Mapping Encryption Overhead Comparison
2.2 Encryption and attack settings for security analysis
2.3 Perception based security measures for video encryption
2.4 Relative Compression Overhead of the Encrypted Videos
2.5 Intra Bitplane Shuffling and Approximation Attack
2.6 Twelve sets of increments for parameter update
3.1 Space-Time Code Used by Two Relay Nodes
3.2 Forwarding Rules by Compromised Relay-1
3.3 Channel Conditions with Two Receive Antennas
3.4 Received Signals at the Receive Antennas
3.5 χ² fitting statistics of ρ under H1
3.6 Experimental occurring frequency of miss and false alarm
5.1 Latency of Sequential User Join
5.2 Statistical Parameters for User Behavior
5.3 Simulated Data Experiment: P0 = 0, P1 = 0.1, R = 0.1
5.4 A Comparison of Rekeying Protocol Complexity (group size n)

LIST OF FIGURES

2.1 Typical usage of multimedia content
2.2 Candidate domains to apply encryption to multimedia
2.3 Index mapping within subsets
2.4 Set partition with subsets asymmetric to zero
2.5 Encryption results using Lenna image
2.6 Illustration of intra bitplane shuffling
2.7 Expected histograms of zero-run lengths after intra-bitplane shuffling
2.8 Overhead of each bitplane by intra-bitplane shuffling algorithm
2.9 Approximated Lenna image by setting all DC coefficients to 0
2.10 Eight representative edge directions used in ESS score evaluation
2.11 Video encryption system
2.12 Encryption results for Football video
2.13 Encryption results for Coast-guard video
2.14 Frame-by-frame ESS of Coastguard video
2.15 Encryption results for Foreman FGS video
2.16 Average key estimation error for polar FFT hash
2.17 Illustration of the parameter searching algorithm
2.18 Key estimation results for image hash using regional mean
2.19 Histogram of the width estimation error
3.1 Two nodes, C and D, try to forward the message from node A to B
3.2 Schematic of the tracing scheme for malicious relay
3.3 The pair-wise combined signal constellation
3.4 The mean of ρ under hypothesis H0 and H1
3.5 The variance of tracing statistics ρ under 10 dB SNR
3.6 The distribution of tracing statistics ρ
3.7 Illustration of the impact of channel estimation to symbol detection
3.8 Tracing statistics ρ under 13 dB SNR
3.9 Statistics of ρ under basic attack with perfect channel information
3.10 Tracing statistics ρ under advanced attack strategy, 10 dB SNR
3.11 Statistics of ρ under advanced attack with perfect channel information
3.12 Statistics of ρ under advanced attack with channel estimation error
4.1 Two possible lattice deployment
4.2 Expected number of secure links versus communication radius
4.3 Expected number of secure links per node under deployment deviations
4.4 Impact of location adjustment to the establishment of secure links using VFA
4.5 Comparison of VFA and VFSec
4.6 Comparison of deployment lattices using VFSec
4.7 Comparison of VFA and VFSec using lattice deployment
4.8 Illustration of the weight assignment in the weighted centroid algorithm
4.9 First comparison of the weighted centroid and minmax algorithm
4.10 Second comparison of the weighted centroid and minmax algorithm
4.11 Third comparison of the weighted centroid and minmax algorithm
5.1 Notations for a key tree
5.2 Topology for the proposed join-exit tree
5.3 User join at the join tree root
5.4 Sequential user join strategy (only the join tree is shown)
5.5 Relocation methods for the join tree
5.6 Average time cost for sequential user join
5.7 Average join and leave time for simulations using MBone data
5.8 Average time costs for the first experiment
5.9 A breakdown of the contributions to the user leave time
5.10 Average time costs for the second experiment
5.11 Average time costs for the third experiment

Chapter 1 Introduction
1.1 Motivation

Security is becoming a major concern in this information era. The development in wireless communications, networking technology, personal computing devices, and software engineering has led to numerous emerging applications whose complex security requirements are beyond the framework of end-to-end cryptography. The complexity lies in the fact that emerging electronic devices often contain multiple functional layers and components, each performing a different function while interacting with the others. On the system level, applications that use such electronic devices may involve a large number of users, who are often required to perform tasks in a coordinated way with heterogeneous communication capabilities. The primary motivation of this dissertation research is to discover new approaches to the security problems in emerging communication technologies and multimedia applications, without unduly increasing the complexity and cost of the entire system.

The issue of information security arises in different forms and may cause damage of differing severity. For example, piracy in digital video, audio, and software incurs huge damage to the economy. A 2001 study estimated the annual revenue losses due to piracy reported by the Recording Industry Association of America, the Motion Picture Association of America, and the software industry at 3 billion, 4.2 billion, and 11 billion US dollars, respectively. The impact of identity theft on the economy is even larger. A comprehensive study by the US Federal Trade Commission found that there were nearly 10 million victims of identity theft in 2002, and that identity theft cost businesses and individuals nearly 50 billion dollars in that year alone [19]. In addition to economic damages, security problems in emerging communication and multimedia systems may also lead to the compromise of personal privacy and of corporate or state proprietary information. Such information can be exploited by adversaries for malicious purposes. For example, electronic patient records have the potential to allow health care providers and patients to interact more efficiently. However, such records may also be illegally sold to insurance companies, police departments, employers, and drug companies. Such concerns have recently been raised by academic experts about the patient record system "Connecting for Health", deployed by the National Health Service of the United Kingdom [5]. An astonishing security breach in electronic records recently occurred at the United States Department of Veterans Affairs, which could leak private information of up to 26.5 million veterans and active-duty service members.¹ In addition, security issues also exist during information transmission. For instance, many communication systems rely on wireless radio transmission, which is very flexible in terms of deployment, but can easily be eavesdropped on and interfered with. Applications using radio communications, such as ad hoc networks and sensor networks, may also be deployed in hostile environments. For these applications, a comprehensive security framework is required to ensure information confidentiality and authenticity, detect system intrusion, and trace security breaches.

¹ http://www.firstgov.gov/veteransinfo.shtml

1.2 Background

In this section, we introduce general background related to security in communication systems and multimedia applications. We focus on various aspects where existing security solutions cannot meet the requirements of emerging applications, and provide general philosophies on how to resolve such issues.

In the past, security in communication systems has been addressed mainly by cryptography. To achieve various security goals, cryptographic primitives, such as encryption tools, authentication tools, and cryptographic random number generators, are designed [76][115]. The role of an encryption tool is to protect the confidentiality of data, by either encrypting the data one block at a time (a block cipher) or performing bit-by-bit encryption (a stream cipher) [115]. For authenticating the content of a message, hash functions can be used. A hash function is a one-way function producing a fixed-length binary sequence from a variable-length input message [76]. Based on the cryptographic primitives, cryptographic protocols can be developed that specify how to use the primitives in order to achieve a security objective [52]. Usually a cryptographic protocol specifies a sequence of actions to be taken by the protocol participants, as well as certain conditions to be satisfied for the protocol to be successfully carried out. One such example is the key management protocol. Since most (symmetric-key) cryptographic primitives require a secret key as input, the information sender and receiver usually need to agree upon the same secret key before cryptographic operations such as encryption and decryption can be performed. The key management protocol achieves this goal either by using a trusted third party, called the key distribution center, or by using a public-key cryptographic primitive such as the Diffie-Hellman algorithm [52], sketched below.
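As a concrete illustration of the Diffie-Hellman primitive just mentioned, the following toy Python sketch shows two parties deriving a shared secret over a public channel. The modulus, generator, and variable names are our illustrative choices, and the parameters are far too small for real use:

```python
# Toy Diffie-Hellman key agreement: a sketch of the idea only.
# Real deployments use standardized groups with 2048-bit or larger moduli.
import secrets

p = 2_147_483_647   # public prime modulus (2^31 - 1, a Mersenne prime)
g = 5               # public generator (toy choice)

a = secrets.randbelow(p - 2) + 1   # sender's private exponent
b = secrets.randbelow(p - 2) + 1   # receiver's private exponent

A = pow(g, a, p)    # sender transmits g^a mod p to the receiver
B = pow(g, b, p)    # receiver transmits g^b mod p to the sender

# Each side combines its own secret with the other's public value;
# both arrive at g^(ab) mod p without it ever crossing the channel.
assert pow(B, a, p) == pow(A, b, p)
shared_key = pow(B, a, p)
print(f"shared secret: {shared_key}")
```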

Traditionally, the communication system and upper-layer multimedia applications have been considered separately in system design. With the advances in communications, networking, and multimedia signal processing, more integrated approaches have been proposed for multimedia communications. For instance, unequal error protection techniques for transmitting multimedia over error-prone channels have been proposed to exploit the different importance of various layers of the media representation [121]. For multicasting multimedia content over heterogeneous networks, efficient media transcoding schemes have been proposed to adapt the bit-rate of media content according to the available bandwidth, processing power, or display resolution [138]. At the same time, more communication system designs take into account the nature of the data being transmitted and the applications that utilize the communication channel. For example, different streams of data, such as voice, video, image, or text, are assigned different priorities in the communication system. For delay-sensitive applications, such as interactive voice, more network resources are reserved to achieve better quality of service.

In such an integrated multimedia communication system, incorporating a security component is a challenging task. If we encrypt the multimedia content directly using a contemporary cipher, the encryption will remove the bitstream syntax contained in the compressed media. Doing so may hinder the capability of rate adaptation in network transmission, or of applying unequal error protection when the media is transported through error-prone channels [72]. This is because most bit-rate adaptation schemes and unequal error protection schemes rely on the syntax and structure of the coded media content to identify which parts of the media stream are more important. Verifying the authenticity of the media content is equally challenging. Since the media content can go through rate adaptation or be corrupted by transmission noise and packet loss, the bit-by-bit representation of the media content may change. Ideally, such graceful quality degradation does not indicate malicious modification of the content, and the received content should still be considered authentic in principle. Unfortunately, these goals cannot be achieved by directly applying a cryptographic hash to media authentication. In this dissertation, we will introduce techniques to address these challenges in secure multimedia communications.

One special class of security problems is how to trace an adversary or compromised user within a group of users. We introduce an abstraction of traitor tracing in the cooperative communication scenario as follows. Consider a group of nodes that can communicate with each other. One node s sends a message M to a destination node d through a number of relay nodes r1, r2, ..., rn. Each of these relay nodes obtains a message from sender s containing (almost) the same content, and may or may not forward this message to d. There are one or more adversaries within the group of relay nodes. These adversaries can modify the message M to M′ and send M′ to the destination d. Upon receiving a message M′, d may want to find out: (1) from which relay the message M′ was sent, and (2) whether the message has been tampered with. Combining the answers to both questions, we can detect the tampering and relate it to the compromised relay node(s). This is especially useful in tracing information tampering in relay communications, such as in wireless ad hoc networks. In such networks, the wireless communication channel and the malicious nature of attacks can make the task of traitor tracing very challenging. The message s sends to each relay node ri could contain the same content plus the relay node's unique ID. However, if a relay node is adversarial, it will have every incentive to remove such ID information to avoid being identified. Moreover, the message received at d is a superposition of relay signals from multiple relay nodes through the wireless channel, combined according to the physical laws governing microwave signal propagation. In this situation, it is very hard for the destination node to distinguish which part of the received message was contributed by which relay node. It is clear that solving such a traitor tracing problem not only requires cryptographic tools, but also involves signal processing to take into account the nature of signal transmission over the physical communication channel.

In an electronic system that involves communication, signal processing, and system security, there are tradeoffs among security, performance, and functionality. Since the security subsystem is part of the overall system, it will interact with other components in the system; the functionality and performance of all system components may be mutually constrained and affected. The security-performance and security-functionality tradeoffs are very important because, as secure communication systems become more complex, the interaction and interdependency between the security subsystem and other system components also become more intricate. Sometimes adding a security component may lead to incompatibility with other system components or substantial performance loss. In order to achieve security and improve the performance of the overall system, one can seek to optimize the security protocol when the security subsystem is stand-alone and such optimization will not drastically affect other system components. Otherwise, one should seek to optimize the tradeoff between the security component and other performance aspects in the system design, which can lead to more balanced overall system performance.

One example of the security-performance tradeoff is the tradeoff between the key management protocol and other system performance aspects. Such a tradeoff has an especially significant impact on the overall system performance in group communications. First, the complexity and cost of key management should scale well with the communication group size. Second, the key management scheme should also take into consideration the memory and computation resources available at each individual node, such as in sensor networks or other networked embedded systems. Third, the key management scheme often interacts with other system performance criteria. Given a practical system, how to jointly optimize the performance of key management together with other system performance aspects is an important issue open for investigation.

1.3 Organization and Contribution

This dissertation focuses on identifying, modelling, and solving security problems in a single layer or across several layers of a secure communication system. The main challenge in solving such security problems is that the issues in application security are often coupled with system constraints, and directly applying existing cryptographic tools cannot solve these emerging security problems without violating the constraints. The contributions of this dissertation can be summarized in three aspects. Given an application and its security requirements, we first model and formulate the security problems and system constraints in a common framework. Based on such a framework, we propose new security solutions and evaluate their strength using existing cryptographic tools and signal processing algorithms. When performance-versus-security tradeoffs are involved, we introduce quantitative metrics that capture the tradeoff within the system formulation, and improve the tradeoff using signal processing and optimization techniques.

The nature of this research is to view security issues in secure communication systems from cross-layer, interdisciplinary perspectives and to jointly consider system security, system constraints, and system performance. To that end, not only signal processing theory and algorithms, but also optimization techniques, scientific computing algorithms, and other related knowledge from different disciplines can be employed in solving security problems and optimizing the security-versus-performance tradeoff. In this dissertation, we illustrate this design principle through a number of examples in different layers of secure communication systems. In Chapter 2, we focus on application-layer communication security and introduce a joint signal processing and cryptographic approach to multimedia encryption. We also discuss the security and robustness tradeoff in image hashing by adapting the concept of unicity distance pioneered by Shannon [101]. In Chapter 3, using signal processing techniques for physical-layer wireless communications, we discuss how to trace malicious attacks from adversarial relays in cooperative wireless communications, an emerging communication paradigm for ad hoc networks. In Chapter 4, we look into the joint optimization of key establishment and sensing coverage for secure sensor network design and deployment, and propose a coordinated sensor deployment framework that can improve the sensing versus secure-communication tradeoff. For general scenarios of group key management, in Chapter 5, we investigate how to optimize the time efficiency of rekeying operations in contributory key management, using dynamic scheduling and cost amortization. Through these typical scenarios, we demonstrate the potential and benefits of introducing signal processing in formulating and solving security problems in different layers of a secure communication system. Finally, conclusions and future perspectives are drawn in Chapter 6.

Chapter 2 Multimedia Encryption and Content Authentication
Signal processing has played a significant role in developing coding and communication technologies for digital multimedia, paving the way for many opportunities for people around the world to acquire, utilize, and share multimedia content [93]. To allow for wider availability of multimedia information and successful commercialization of many multimedia-related services, assuring that multimedia information is used only by authorized users for authorized purposes has become essential. This chapter focuses on jointly utilizing signal processing and cryptography to protect the confidentiality and achieve access control of multimedia information, as well as to authenticate the media content received through communication channels.

In a typical use of multimedia illustrated in Fig. 2.1, the owner of the multimedia content wants to distribute the content through networks or archive it for future use. With the sophistication of heterogeneous networks and the growing amount of information being generated, it is becoming less efficient for content owners to manage the distribution or archiving process all by themselves.

Figure 2.1: Typical usage of multimedia content

As a result, third-party service providers, equipped with specialized servers, huge disk space, and abundant bandwidth resources, will serve as delegates for content owners to perform content distribution, archiving, search, and retrieval. On one hand, the delegate service providers often need to process the received media data, such as adapting the rate of the media data according to the available bandwidth. On the other hand, the owner may not want to reveal the media content to these delegates because of security and privacy concerns. One such example is privacy-preserving data retrieval using an untrusted server [40].

A common way to achieve content confidentiality is to encrypt the entire multimedia data using a cipher, such as DES, AES, or RSA [76][115]. However, many types of processing, such as rate adaptation for multimedia transmission in heterogeneous networks [58] and DC-image extraction for multimedia content searching [137], cannot be applied directly to a bitstream encrypted by these generic encryption tools or their simple variations. This implies that the delegates still have to hold the decryption keys to decrypt the content, process the data, and then re-encrypt the content. Since revealing decryption keys to potentially untrustworthy delegates is often not in line with the security requirements of many applications, generic encryption alone is inadequate in the delegate service scenario.

Another unique security issue with multimedia delivery is the authentication of received media content. Since the media content may undergo changes during network communication, using a traditional cryptographic hash for authentication may "reject" media content that has undergone format changes, resolution reduction, or bit/packet errors, none of which is intentional tampering with the media content [117][37]. For this reason, an ideal hash should authenticate based on the media content, not on the exact bit-by-bit representation. At the same time, the hash should also use a secret key as input and contain randomization, so that it cannot be easily forged or attacked by other means. In addition to image authentication, robust image hashing can also be used in non-oblivious watermarking for image and video [24][14]. Instead of using the original image, the hash can provide partial information about the image content to be used in the watermark detection phase. In video fingerprinting, video frame hashes have been used for temporal registration [41]. For content-based multimedia retrieval, image hashes can be used as a concise ID for each image; fast and effective comparison of image content can then be achieved by comparing the distance between hash vectors [62]. In all these applications, the image hashing scheme needs to be resilient to a set of authentic modifications and sometimes minor intentional distortions to the original image content. These modifications include rotation, scaling, noise corruption, and common image filtering.

The first step towards addressing the aforementioned issues is to design flexible multimedia encryption schemes that can handle delegate processing and achieve access control by content and quality [65], as well as content-based multimedia authentication tools. In this chapter, we first focus on the design of encryption tools for multimedia that allow content to be encrypted once and securely processed in many ways using existing multimedia signal processing techniques. To achieve this goal, we jointly consider signal processing and cryptography in our exploration of multimedia encryption. In particular, we investigate the possible domains in which encryption can be applied, including the sample domain, the quantized transform domain, the intermediate bitplanes, and the bitstream domain. We propose and analyze two atomic encryption operations tailored for multimedia signals, namely, a generalized index mapping encryption tool with controlled overhead and an intra-bitplane encryption tool compatible with fine granularity scalable coding. A video encryption system, which incorporates the proposed operations and other relatively straightforward extensions of generic encryption, is then studied. The resulting system takes into consideration the structure and syntax of multimedia sources and protects content confidentiality during delegate processing. We then investigate the security and robustness tradeoff exhibited in image hashing schemes, by adapting the unicity distance concept pioneered by Shannon [101]. Through two examples, we show that the security of typical image hashing schemes can be quantitatively analyzed. Such a security analysis framework can provide guidance in the design of multimedia hashing schemes.

2.1 Background and Preliminaries

We examine in this section the possible domains in which encryption can be applied to multimedia, along with a review of prior work. Using a widely adopted multimedia coding framework, we illustrate the candidate domains for applying encryption to multimedia in Fig. 2.2.

[Figure 2.2 shows a block diagram of a typical multimedia coding chain — Transform/Prediction → Quantization → Run-Length Coding (RLC) → Variable-Length Coding (VLC) entropy coding → Multiplex/Packetize — with the candidate encryption stages (Stages 1-6) marked along the chain.]

Figure 2.2: Candidate domains to apply encryption to multimedia

2.1.1 Encryption Before and After Coding

According to Fig. 2.2, there are two straightforward places to apply generic encryption to multimedia. The first possibility is to encrypt the multimedia samples before any compression (i.e., Stage 1 in Fig. 2.2). The main problem with this approach is that the encryption often significantly changes the statistical characteristics of the original multimedia source, resulting in much reduced compressibility. It is worth noting that a novel approach has recently been proposed to efficiently compress encrypted data [49]. By employing distributed source coding theory, this method achieves the same compression gain as compressing the unencrypted data in the case of an ideal Gaussian source. The compression gain, however, would be reduced for more general sources that are common in practice, and the method cannot easily support many other forms of delegate processing. The second possibility is to apply generic encryption to the encoded bitstream after compression (i.e., Stages 5 and 6 in Fig. 2.2) [94]. This approach introduces little overhead, but may destroy the structures and syntax readily available in the unencrypted bitstream. Such structures, often indicated by special header/marker patterns, enable many kinds of processing at delegate service providers and intermediate network links, such as bandwidth adaptation, unequal error protection, and random access [136, 131, 121, 126].

As headers and markers are special bit patterns serving a variety of purposes in a compressed bitstream [121], a simple way to realize syntax-aware encryption is to encrypt only the content-carrying fields of the compressed multimedia bitstream, such as the fields of motion vectors and DCT coefficients in MPEG video, and keep the structure and headers/markers of the bitstream unchanged [125][140]. However, this method can only preserve limited syntax from the unencrypted media, facilitating a fixed set of processing. Other advanced features in multimedia signal processing, such as fine granular scalable (FGS) coding [102, 95, 58], cannot be easily preserved using this approach. Bitrate overhead also occurs, in the range of 1%-8%, due to block padding and the added side information. Furthermore, each tailored encryption technique may also require different handling by delegate processing units. This is not always realistic, because the processing units are likely to come from different vendors and the barrier to standardization seems insurmountable due to market considerations [7]. After applying generic encryption, some parts of the encrypted media data could become identical to certain headers/markers. This emulation problem can bring potentially serious trouble to delegate service providers and intermediate processing modules [126] when the multimedia data go through certain network protocols, transcoding, and error recovery. One possible remedy to header emulation is bit-stuffing [132], a technique widely adopted in the packetization stage of network communications [111].

2.1.2 Encryption at Intermediate Stages of Coding

Recently, there has been interest in studying how to encrypt multimedia data in such a way that the encrypted data can still be represented in a meaningful, standard-compliant format [86]. Such schemes are particularly useful for secure delegate services and multimedia communications that prefer handling media streams compliant with a multimedia coding standard, such as the JPEG or MPEG-1/2/4 standards [126][127]. The encryption is performed at the intermediate stages illustrated in Fig. 2.2. For example, at Stage 2, the motion vectors in video can be encrypted by applying DES to their codeword indices [126]. At Stage 3, the DC and selected AC coefficients in each block of a JPEG image or an MPEG video frame can be shuffled within the block [112], or across blocks but within the same frequency band [141]. At Stage 4, the entropy codewords can be spatially shuffled within the compressed bitstream [127], or the Huffman codewords of coefficients and/or motion vectors can be encrypted by alternating between several Huffman codebooks in a cryptographically secure fashion [130]. At Stage 5, only the intra-coded frames and blocks of an MPEG video are selected and encrypted, using a classic DES cipher [133] or its variations [94]. Some of these schemes are also known as selective encryption [127, 133, 130]: they encrypt only the portions of the multimedia data stream that carry rich content, in the hope of alleviating the problem of high computational complexity and the potential bitrate overhead. However, we believe that the computational complexity of encryption is not a major concern, given the fast-growing computation power and the efficient implementations of established generic encryption tools. In contrast, the protection of content confidentiality under different processing scenarios, such as delegate services and communications, is of paramount urgency. Unfortunately, few existing works have thoroughly considered these scenarios.

In Section 2.2 and Section 2.3, we propose two general atomic encryption operations using index mapping and constrained shuffling to achieve confidentiality protection in delegate services and other processing scenarios. The rationale is to ensure that the encrypted bitstream still complies with state-of-the-art multimedia coding techniques. We are particularly interested in how much of a price, in terms of security and compressibility, these techniques have to pay to achieve standard-compliant encryption, and we answer this question through analysis and simulations.

2.1.3 Content-based Authentication using Image Hashing

An image hash is a content-based, concise representation of an image. Just like a traditional cryptographic hash, an image hash can be used to verify the authenticity of image content. To this end, the generation of the image hash must be randomized according to a secret key, which ensures that an unauthorized user will not be able to forge a hash without the key. What makes an image hash different from a cryptographic hash is that an image usually allows different representations of approximately the same content. For example, an image may be compressed to a lower bit-rate or corrupted by noise during transmission. After such distortions, the bit-by-bit representation of the image has changed while the content has been preserved. In order to verify the authenticity of the image content under such situations, an ideal image hashing scheme should be robust against moderate distortions, such as rotation, scaling, filtering, compression, and additive noise. In the literature, a number of robust image hashing schemes have been proposed in recent years [117][37][108][79]. In Section 2.6, we show that robust image hashing schemes must be used in a proper way; otherwise an adversary would be able to exploit the robustness constraint in the hashing scheme, deduce the hashing key, or forge a valid hash for a completely new image. We will present our analysis under the framework of unicity distance proposed by Shannon.

The concept of unicity distance was pioneered by Claude Shannon in investigating encryption systems [101]. The basic idea is that, when a secret key is used to encrypt multiple messages, the amount of uncertainty in the encryption key (or in the cipher text for a new message) decreases with the number of observed clear text-cipher text pairs. When the number of observed cryptograms is large enough, one can almost surely determine the secret key from the observations. This number of observations (or some normalized form of it) is called the unicity distance. Thus the unicity distance quantifies how many clear text-cipher text pairs are required to almost surely determine the secret key. Shannon demonstrated the concept using a two-letter substitution cipher in his paper [101]. The field of cryptography has advanced significantly since Shannon's work. Most contemporary block ciphers require a minimum key length of 64 bits or more; for a message block of 128 bits or more, one would require an enormous number of clear text-cipher text pairs to deduce the secret key, so it is hard to demonstrate the unicity distance with contemporary ciphers. However, for perceptual hashes that are constrained by the robustness requirement, we found that demonstrating unicity distance requires only a few dozen observed image-hash pairs. These results are presented in Section 2.6.
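For reference, Shannon's classical estimate of the unicity distance can be stated compactly; this formula is standard textbook background on [101], not a result derived in this dissertation:

$$n_u \approx \frac{H(K)}{D},$$

where $H(K)$ is the entropy of the key and $D$ is the per-symbol redundancy of the plaintext. Intuitively, each observed symbol of redundant plaintext removes about $D$ bits of uncertainty about the key; once roughly $H(K)$ bits have been removed, the key is essentially determined.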


2.2 Generalized Index Mapping with Controlled Overhead

Unlike generic data encryption, where the encryption output can take values over the entire data space, joint signal processing and cryptographic encryption requires that the encrypted output satisfy additional constraints. These constraints are essential to preserve the structure, syntax, and standard compliance that enable delegate processing and lead to communication friendliness. A format-compliant encryption scheme was proposed in [126] that assigns a fixed-length index to each variable-length codeword (VLC), encrypts the concatenated indices, and then maps the encrypted indices back to the codeword domain to form an encrypted bitstream. This prior approach works well with codes such as Huffman codes and Golomb-Rice codes, which associate each symbol from a finite set with a unique codeword of integer length, but it cannot be directly applied to VLCs that allow fractional codeword length per symbol, such as arithmetic codes. In addition, this prior encryption work incurs a substantial amount of bitrate overhead, and no analytic study of that overhead has been provided. In this section, we construct and analyze an encryption tool that overcomes these two problems.

2.2.1 Generalized Index Mapping

We extend the index encryption idea to apply encryption directly to symbols that take values from a finite set, before they enter the VLC codeword domain. Examples include working with quantized coefficients and quantized prediction residues (Stage 3 in Fig. 2.2), as well as run-length coding symbols (Stage 4 in Fig. 2.2).

The encryption process producing a ciphertext symbol $X^{(enc)}$ from a clear-text symbol $X$ is as follows:

$$X^{(enc)} = \mathrm{Enc}(X) \triangleq T^{-1}[E(T(X))], \qquad (2.1)$$

where $E(\cdot)$ is a core encryption primitive such as AES or a one-time pad [115], and $T(\cdot)$ represents a codebook that establishes a bijective mapping between all possible symbol values and indices represented by binary strings. The goal of this bijection is to produce fixed-length indices that will be passed to subsequent encryption or decryption. The decryption process has a similar structure:

$$X = \mathrm{Dec}(X^{(enc)}) \triangleq T^{-1}[D(T(X^{(enc)}))], \qquad (2.2)$$

where $D(\cdot)$ is the core decryption primitive corresponding to $E(\cdot)$.

As a simple example, we consider encrypting a string of symbols from the finite set {A, B, C, D}. The symbol sequence to be encrypted is "ABBDC". We first assign a fixed-length index to each symbol:

A ↔ [00],  B ↔ [01],  C ↔ [10],  D ↔ [11].

We then convert the symbol sequence to the index sequence "00 01 01 11 10", and encrypt the index sequence using an appropriate encryption primitive, such as a stream cipher (the one-time pad) with a random bit-stream [0100 1011 1001 ...]. Finally, we convert the encrypted index sequence "01 01 11 00 00" back to the symbol sequence "BBDAA". After encryption, any appropriate VLC coding can be applied to the encrypted symbol sequence. It is worth noting that in such an encryption, one input symbol can be mapped to different ciphertext symbols. For instance, in the previous example the symbol B appears in the clear-text sequence twice; the first time it is mapped to B and the second time to D.
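The following minimal Python sketch (our illustration, not code from the dissertation) walks through this exact example end to end; the codebook and the keystream bits are the ones given in the text above:

```python
# Generalized index mapping encryption per Eqs. (2.1)-(2.2), with a one-time
# pad as the core primitive E(.). The codebook T maps symbols to 2-bit indices.
CODEBOOK = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}      # T
INV_CODEBOOK = {idx: sym for sym, idx in CODEBOOK.items()}   # T^{-1}

def crypt(symbols, keystream):
    """XOR each symbol's fixed-length index with key bits, then map back.
    XOR is self-inverse, so the same routine encrypts and decrypts."""
    return "".join(INV_CODEBOOK[CODEBOOK[s] ^ k]
                   for s, k in zip(symbols, keystream))

key = [0b01, 0b00, 0b10, 0b11, 0b10]   # keystream bits 01 00 10 11 10
ciphertext = crypt("ABBDC", key)
print(ciphertext)                       # -> "BBDAA", as in the example
print(crypt(ciphertext, key))           # -> "ABBDC"
```

Note that the encrypted sequence stays over the same alphabet {A, B, C, D}, which is precisely what lets a downstream VLC coder process it unchanged.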


When processing a large sequence of symbols, encryption by index mapping tends to make the encrypted symbols uniformly distributed, which is good in terms of security [115]. However, the entropy of the encrypted symbols is increased relative to that of the unencrypted ones. Since the compressibility of a sequence of symbols under entropy coding depends on the entropy of the source symbols [23], index-mapping encryption introduces a bit-rate overhead in compression. Next, we discuss how to quantify and control this overhead.

2.2.2 Analysis and Control of Overhead

In the following analysis, we investigate the impact of the index encryption on the compressibility of the source symbols, which can be quantified by the change in average code length before and after encrypting a sequence of symbols.

Case-1: We consider compressing the source symbols using a default entropy codebook as provided by many multimedia standards. The default codebook is obtained from a set of representative training samples and is used most of the time for simplicity of implementation. We denote the probability mass function of the symbols prior to encryption by $\{p_i\}$, that of the symbols after encryption by $\{q_i\}$, and the code lengths designed for the distribution $\{p_i\}$ by $\{l_i\}$. If encryption is performed on an index drawn from the full range of symbol values, the distribution of ciphertext symbols, $q$, will be uniform over the entire range. Alternatively, if we partition the range of symbol values into mutually exclusive subsets $\{S_j\}$ and restrict the outcome of the encryption of a symbol $x \in S_j$ to be within the subset $S_j$, i.e., $\mathrm{Enc}(x) \in S_j$, the distribution $q$ will be a piecewise uniform approximation of $p$, as illustrated in Fig. 2.3. Consider $\{q_i\}$ to be a piecewise uniform approximation of $\{p_i\}$, i.e., for each subset $S_j$ from a non-overlapping partition of the symbols' range, we have

$$\sum_{i \in S_j} p_i = \sum_{i \in S_j} q_i = |S_j| \, q^{(S_j)}, \quad \text{where } q^{(S_j)} \triangleq q_i \text{ for all } i \in S_j,$$

and $|\cdot|$ denotes the cardinality of a set. Assuming that encryption changes the symbol distribution from $\{p_i\}$ to the above $\{q_i\}$ and the same codebook with code lengths $\{l_i\}$ is used both before and after encryption, we can show that the change of the expected code length $\Delta L$ is

$$\Delta L = \sum_i (q_i - p_i) l_i = D(p\|q) + D(q\|r) - D(p\|r), \qquad (2.3)$$

where $D(\cdot\|\cdot)$ represents the Kullback-Leibler divergence, and $r$ represents the probability distribution $\{r_i \triangleq P(R = i) = 2^{-l_i} / \sum_k 2^{-l_k}\}$. The derivation is presented in Appendix-A. If we partition the symbol range $S$ into more than one subset and restrict the encryption output to be in the same subset as the input symbol, the complexity of a brute-force attack for each symbol is reduced from $2^{|S|}$ to $2^{|S_j|}$, where $S_j$ is the subset to which the symbol belongs.¹ On the other hand, the overhead is also reduced, because in the Kullback-Leibler divergence sense the original distribution $p$ is closer to the piecewise uniform distribution $q$ than to a completely uniform distribution. Thus by controlling the set partitioning, we can adjust the tradeoff between security and overhead. In addition, Eq. (2.3) suggests that the optimality of the code lengths $\{l_i\}$ designed for the probability distribution $\{p_i\}$ affects the change of the expected code length after encryption. If the code is optimal for $\{p_i\}$, i.e., $l_i = -\log p_i$, then $D(p\|r) = 0$ and Eq. (2.3) reduces to $\Delta L = D(p\|q) + D(q\|p)$.

¹ This is equivalent to encrypting fewer bits of the indices. As to be discussed later in this section, symmetric set partitioning can protect the sign bit, which is important to resist attacks.

Figure 2.3: Index mapping within subsets gives a piecewise constant approximation of the distribution. (Panels: original source; encrypted source; encrypted with controlled overhead, with subsets S1 and S2 marked.)

Case-2: We consider compressing the source symbols using adaptive or universal entropy coding that adjusts itself to the source's distribution. Arithmetic coding and Lempel-Ziv coding are two such examples. Assuming that the adaptive entropy coding can achieve the entropy bound for any source distribution, we exploit the piecewise constant property of $\{q_i\}$ and show in Appendix-A that the change of the average codeword length is

$$\Delta L = H\{q_i\} - H\{p_i\} = D(p\|q), \qquad (2.4)$$

where $H\{x_i\}$ is the entropy of a discrete random variable following the distribution $\{x_i\}$. As in the first case, this result indicates that if we partition the symbol range $S$ into more than one subset and restrict the encryption output to be in the same subset as the input symbol, the distribution of the encrypted source, $\{q_i\}$, can better resemble that of the original source, $\{p_i\}$, leading to reduced overhead in compression.

The final relative bitrate overhead ($\beta$) also depends on the ratio of the size of the content to be encrypted ($B_1$) to the overall size of the stream ($B$). That is,

$$\beta = \frac{B_1^{(e)} - B_1}{B} \times 100\% = \beta_e \, \frac{B_1}{B} \times 100\%, \qquad (2.5)$$

where $B_1^{(e)}$ denotes the size of the encrypted part and $\beta_e$ is the relative overhead for the part being encrypted. Even if $\beta_e$ is large, as in the case of prior work [126], the overall overhead can be constrained if only a relatively small part of the stream is encrypted. With our proposed technique of set partitioning, the overall overhead can be controlled both by reducing $\beta_e$ and by maintaining a low $B_1/B$ ratio, i.e., by selectively encrypting only a portion of the stream (such as the perceptually significant coefficients).
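To make the overhead result concrete, the short sketch below (our own toy example; the four-symbol distribution is invented for illustration) computes the Case-2 overhead of Eq. (2.4) for a full-range mapping and for a two-subset partition, confirming that finer partitions shrink the overhead:

```python
# Toy numeric check of Eq. (2.4): with ideal adaptive entropy coding, the
# per-symbol overhead of index-mapping encryption is
# Delta_L = H(q) - H(p) = D(p || q), where q is the piecewise-uniform
# version of p induced by the subset partition {S_j}.
import numpy as np

p = np.array([0.4, 0.3, 0.2, 0.1])   # invented source distribution, 4 symbols
partitions = [
    [[0, 1, 2, 3]],                   # one subset: q is fully uniform
    [[0, 1], [2, 3]],                 # two subsets: q is piecewise uniform
]

def piecewise_uniform(p, subsets):
    q = np.empty_like(p)
    for S in subsets:                 # flatten p within each subset S_j
        q[S] = p[S].sum() / len(S)
    return q

H = lambda d: -(d * np.log2(d)).sum()        # entropy in bits
D = lambda a, b: (a * np.log2(a / b)).sum()  # KL divergence in bits

for subsets in partitions:
    q = piecewise_uniform(p, subsets)
    # Both sides of Eq. (2.4) agree; the finer partition gives lower overhead.
    print(subsets, f"H(q)-H(p) = {H(q) - H(p):.4f}", f"D(p||q) = {D(p, q):.4f}")
```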

2.2.3 Set-Partitioning in Index Mapping

As we have seen, the set partitioning technique can control the bitrate overhead introduced by the index mapping encryption, trading o? the resistance against brute-force attacks. The choice of partition also a?ects the security against estimation attack. Since the encryption ?attens the distribution within each subset, no particular clue of the exact unencrypted source value can be inferred from its encrypted value. The best estimate of an unencrypted value X in terms of the mean square error has to resort to its conditional statistical distribution in the subset Sj . It can be shown that, by observing X (enc) ? Sj , the minimum mean ˆ for X is square error (MMSE) estimate X ˆ = argmint E (|X ? t|2 |X (enc) ) = E (X |Sj ), X and the corresponding mean square error is ˆ ? X |2 |Sj ) = V ar(X |Sj ). E (|X (2.7) (2.6)

It is well known that many variables in efficient multimedia representations, including the quantized transform coefficients and prediction residues, follow a symmetric zero-mean distribution such as a Gaussian or a Laplacian distribution. For these variables, if the subset partition is symmetric around zero, as shown in the example of Fig. 2.3, the MMSE estimate of the original value given the encrypted one will always be zero, regardless of the subset. This means that the sign of the original value is not inferrable. Since the sign, or more generally, the phase, is known to carry important information about a signal [84], an attacker can hardly get any useful information from such MMSE estimation. However, if the partition is not symmetric, for example as in Fig. 2.4, the MMSE estimates still preserve the sign and the approximate energy of the original value for each subset. Such a choice of partition can leak a substantial amount of information to an attacker after estimation, and therefore should be avoided.
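The role of partition symmetry in Eq. (2.6) can be checked with a few lines of code. The sketch below compares the conditional means of a toy Laplacian source (an illustrative stand-in, with an arbitrary decay constant) under a near-symmetric partition in the style of Fig. 2.3 and an asymmetric one in the style of Fig. 2.4.

```python
import math

# Toy discrete Laplacian source on [-63, 64] (illustrative stand-in for
# quantized DC prediction residues, not measured data).
xs = range(-63, 65)
total = sum(math.exp(-abs(x) / 10.0) for x in xs)
p = {x: math.exp(-abs(x) / 10.0) / total for x in xs}

def mmse_estimate(subset):
    """E(X | X in S_j): the attacker's best guess per Eq. (2.6)."""
    mass = sum(p[x] for x in subset)
    return sum(x * p[x] for x in subset) / mass

# Near-symmetric partition (Fig. 2.3 style): estimates stay near zero.
inner = [x for x in xs if -31 <= x <= 32]
outer = [x for x in xs if x < -31 or x > 32]
print(mmse_estimate(inner), mmse_estimate(outer))   # both ~0: sign hidden

# Asymmetric partition (Fig. 2.4 style): sign and rough energy leak.
pos = [x for x in xs if x > 0]
neg = [x for x in xs if x <= 0]
print(mmse_estimate(pos), mmse_estimate(neg))       # clearly nonzero
```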


Figure 2.4: Set partition with subsets asymmetric to zero.

For several popular entropy coding techniques, such as those in JPEG and MPEG, the choice of set partition can be tailored to eliminate the overhead. For example, the JPEG standard employs run-length coding, where the run of zero coefficients is coded before the encoding of a non-zero coefficient. If we only encrypt the values of non-zero coefficients and group all coefficient values with the same code-length into the same subset, the code-length of the encrypted value will be identical to that of the unencrypted one. No bitrate overhead will be introduced in this particular case.

2.2.4 Examples and Discussions

As an example, we encrypt the DC prediction residues of the JPEG representation of the 512 × 512 Lenna image using the index mapping approach. In JPEG, the DC coefficients of the blocks collectively capture the coarse information of an image, and each is differentially encoded to reduce the redundancy. For natural images, DC differential residues are approximately Laplacian distributed, with very small probability outside the range [−63, 64]. We encode both the unencrypted and the encrypted prediction residues with the default Huffman table of the JPEG standard. Without encryption, the average code length for encoding the DCs is 5.78 bits. In the first encryption experiment, we apply the proposed generalized index encryption to the DC differential residues within [−63, 64] without set partitioning. The index encryption is realized via XORing with a one-time pad, resulting in an average code length of 8.60 bits, or an overhead of 2.82 bits. In the second encryption experiment, we partition the symbol range [−63, 64] into two subsets, [−31, 32] and [−63, −32] ∪ [33, 64], and restrict the input and output of the index encryption to be in the same subset. Fig. 2.5 shows the encryption result for the Lenna image.² With set partitioning, the overhead in average code length is reduced from 2.82 bits to 1.53 bits.

² Encrypting DC alone is not secure enough, as an attacker can still get the edge information by setting the DCs to a constant and observing the resulting image. We only encrypt DC in this experiment for the purpose of demonstrating the proposed approach as one potential building block. We will show in Section 2.5 that a complete encryption system should encrypt both DCs and other information.


Figure 2.5: Encryption results using the Lenna image based on generalized index mapping of DC differential residues: (left) original, (right) encrypted.

2.3 Constrained Shuffling

Random permutation, or shuffling, is a common cryptographic primitive operation. The temporal characteristics of audio and video data, as well as the spatial characteristics of visual data, make permutation a natural way to scramble the semantic meaning of multimedia signals. Even before the arrival of digital technology, an early work by Cox et al. built an analog voice privacy system on a subband representation framework and permuted time segments of subband signals across both time and frequency [25]. More sophisticated coding techniques have since been employed by modern digital coding systems. Thus, to control the bitrate overhead and allow for delegate processing, random permutation should be performed in a constrained way and in appropriate domains. In this section, we use the encryption of scalable video with fine granularity as an example to illustrate the proposed constrained shuffling technique, which we call intra bitplane shuffling. This encryption technique is compatible with fine granularity scalable coding and provides a tool for access control of multimedia content at different quality levels.


2.3.1 Intra Bitplane Shuffling (IBS)

Fine granularity scalability (FGS) is desirable in multimedia communications to provide a near-continuous tradeoff between bitrate and quality. FGS is commonly achieved by bitplane coding, as used in the embedded zero-tree wavelet (EZW) coder [102] and the MPEG-4 FGS coder [95]. We shall use MPEG-4 FGS to illustrate the concept; the approach can be extended to other FGS coders. As surveyed in [58], MPEG-4 FGS is a functionality provided by the MPEG-4 streaming video profile. A video is first encoded into two layers, namely, a base layer that provides a basic quality level at a low bit rate and an enhancement layer that provides successive refinement. The enhancement layer is encoded bitplane by bitplane, from the most significant bitplane to the least significant one, to achieve fine granularity scalability. Each bitplane within an image block is represented by (R_i, EOP_i) symbols, where R_i is the run of zeros before the i-th "1", and EOP_i is an end-of-plane flag indicating whether the current "1" is the last bit with value 1 in the current bitplane. The run-EOP symbols are encoded using variable-length codes and interleaved with sign bits.

To provide access control to the FGS encoded enhancement layers, the index-based encryption discussed in Section 2.2 can be applied to each run-EOP symbol, and the overhead can be analyzed using Eq. (2.3) or (2.4). We now present an alternative encryption that shuffles each bitplane according to a set of cryptographically secure shuffle tables. Fig. 2.6 illustrates the proposed intra-bitplane shuffling. We perform random shuffling on each bitplane of n bits, and the shuffled bitplane is then encoded using the run-EOP approach. For example, the first unencrypted bitplane³ in Fig. 2.6, "1 0 1 0 0 0 0 0 0 0", has n_1 = 2 bits of value "1" out of a total of n = 10 bits, which leads to C(10, 2) = 45 different permuted patterns, where C(n, n_1) denotes the binomial coefficient. In addition to bitplane shuffling, the sign bit s_i of each coefficient is randomly flipped according to a pseudo-random bit b_i from a one-time pad; i.e., the sign remains the same when b_i = 0 and changes when b_i = 1.

³ We use "the first bitplane" to denote the MSB bitplane throughout this chapter.
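A minimal sketch of the IBS operation follows. The SHA-256 counter construction that derives the shuffle tables and sign bits is an illustrative assumption standing in for a vetted stream cipher (the shuffle-table generation and the AES-based pseudo-randomness actually used are discussed in Sections 2.3.3 and 2.5).

```python
import hashlib

def keystream(key: bytes, label: bytes):
    """Infinite byte stream derived from a key (illustrative; a real system
    would use a vetted stream cipher such as AES in counter mode)."""
    counter = 0
    while True:
        block = hashlib.sha256(key + label + counter.to_bytes(8, "big")).digest()
        yield from block
        counter += 1

def rand_below(stream, n):
    """Uniform integer in [0, n) by rejection sampling on one byte (n <= 256)."""
    limit = (256 // n) * n
    while True:
        b = next(stream)
        if b < limit:
            return b % n

def shuffle_table(key, label, n):
    """Key-dependent permutation of range(n) via Fisher-Yates."""
    s = keystream(key, label)
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rand_below(s, i + 1)
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def encrypt_bitplane(bits, signs, key, plane_id):
    """Intra-bitplane shuffling plus sign flipping, as in Fig. 2.6."""
    perm = shuffle_table(key, b"plane%d" % plane_id, len(bits))
    shuffled = [bits[perm[i]] for i in range(len(bits))]
    s = keystream(key, b"sign%d" % plane_id)
    flipped = [sg if next(s) % 2 == 0 else -sg for sg in signs]
    return shuffled, flipped
```

Using a distinct label per bitplane realizes the requirement, discussed below, that different bitplanes use different shuffle tables.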
Figure 2.6: Illustration of intra bitplane shuffling. Transform coefficients (9, 4, −10, −5, −5, 3, −4, 5, 3, 2) are represented as sign bits plus bitplanes before coding and encryption; the sign bits are encrypted using a stream cipher and the bitplanes using the proposed intra bitplane shuffling.

2.3.2 Analysis of Overhead

An important property of shuffling is that the set of elements before and after shuffling is identical. For intra-bitplane shuffling, this implies that the number of "1"s is preserved. Thus the number of run-EOP symbols representing the encrypted bitplane is unchanged, ensuring that no overhead comes from an increase in the number of run-EOP symbols. We denote by N_d the number of occurrences in which the run of zeros before a "1" equals d after shuffling. Assuming each of the C(n, n_1) shuffles is equally likely for a specific n-bit bitplane with a total of n_1 bits of "1", we show in Appendix-B that N_d has expected value

E(N_d | n_1) = ∏_{k=0}^{d−1} (1 − n_1/(n−k)) · n_1²/(n−d).    (2.8)

We therefore arrive at an expected histogram {E(N_d | n_1)} versus d, which gives the likelihood of each possible run length. Shuffling can also be done in a block larger than the block used for run-EOP encoding. For example, we can shuffle a macroblock of n = 256 bits and perform run-EOP encoding in smaller blocks of n_B = 64 bits. In this case, the expected zero-run histogram within an encoding block becomes

E(N_d | n_1) = ∏_{k=0}^{d−1} (1 − n_1/(n−k)) · (n_1/(n−d)) · [ n_1 − (n − n_B)(n_1 − 1)/(n − d − 1) ],

which reduces to Eq. (2.8) when n = n_B. From the histograms before and after encryption, we can arrive at the expected overhead per symbol.

As a proof of concept, we experiment on the FGS encoded enhancement layer of two video sequences in QCIF format (176 × 144 pixels per frame): 10 frames from the "Foreman" sequence and 100 frames from the "Carphone" sequence. We use intra-bitplane shuffling to encrypt each bitplane of the enhancement bitstream, where shuffling is performed on a macroblock (n = 256) followed by encoding block by block (n_B = 64). The expected histogram {E(N_d | n_1)} is presented in Fig. 2.7, which also shows that the experimental results match the above analytic results very well.

To compare different encryption approaches, we use the "Foreman" sequence and encrypt the first three most significant bitplanes, which provides sufficient visual scrambling of the enhancement layer. We have found that for the "Foreman" sequence, the intra bitplane shuffling approach gives an overhead of 7.0%, while the overhead of the index-mapping approach of Section 2.2 is 14.3%. In general, the bitrate overhead for each bitplane under the proposed shuffling approach depends on the overhead of each run-length symbol and the number of symbols (n_1), as reflected in the plot of the overhead contributed by each bitplane in Fig. 2.8. The arc shape results from an increase in the number of symbols from MSBs to LSBs and a decrease in the overhead per symbol. From Fig. 2.8 we can also see that some videos, such as "Foreman", have very few "1"s in the MSB; therefore the overhead from encrypting the MSB is very small. This also suggests that the MSB alone does not contribute significantly to the perceptual quality in these sequences, and more bitplanes should be protected.
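Eq. (2.8) is straightforward to cross-check by direct simulation; the following sketch histograms zero-runs over random shuffles of a single bitplane (the n = n_B case).

```python
import math, random

def expected_Nd(n, n1, d):
    """Analytic expectation from Eq. (2.8) (shuffle block = coding block)."""
    prod = 1.0
    for k in range(d):
        prod *= 1.0 - n1 / (n - k)
    return prod * n1 * n1 / (n - d)

def simulated_Nd(n, n1, trials=20000):
    """Monte Carlo: shuffle the bitplane and count zero-runs before each '1'."""
    counts = [0.0] * n
    plane = [1] * n1 + [0] * (n - n1)
    for _ in range(trials):
        random.shuffle(plane)
        run = 0
        for bit in plane:
            if bit:
                counts[run] += 1
                run = 0
            else:
                run += 1
    return [c / trials for c in counts]

n, n1 = 64, 7   # one 8x8 block with the "Foreman" 2nd-bitplane average of 7 ones
sim = simulated_Nd(n, n1)
for d in (0, 2, 5, 10, 20):
    print(d, round(expected_Nd(n, n1, d), 4), round(sim[d], 4))
```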
Figure 2.7: Expected histogram of zero-run lengths after intra-bitplane shuffling within a macroblock: the solid line indicates the analytic result, and circles indicate the experimental result; the number of "1"s per macroblock is n_1 = 28, the average for the 2nd MSB bitplane of "Foreman".


Figure 2.8: Overhead of each bitplane (number of extra bits per macroblock, plotted from MSB to LSB) under the intra-bitplane shuffling algorithm, for "Foreman" and "Carphone".

2.3.3 The Security of IBS

The security of intra bitplane shuffling is affected by how the shuffle tables are generated and used. Shuffle tables can be generated from a cryptographically strong pseudo-random sequence using a classical linear-complexity algorithm [55]. Different shuffle tables should be used for different bitplanes, which forces attackers to resort to a brute-force search that simultaneously guesses the permutation tables used in multiple bitplanes. Similarly, the shuffle tables should be updated frequently without reuse. The number of brute-force trials for finding the exact clear-text of an enhancement frame is proportional to

∏_{i=1}^{N_blk} ∏_{j=1}^{N_bp} C(n, n_1^{(i,j)}),

where n_1^{(i,j)} is the number of "1"s in the j-th bitplane of the i-th block, and N_blk and N_bp represent the number of blocks and bitplanes, respectively.

As an example, we consider recovering exactly the second bitplane of the "Foreman" sequence. This bitplane is important because adding it to the base-plus-MSB video improves the PSNR from 29 dB to 33.4 dB. In the second bitplane of the "Foreman" sequence, the average number of "1"s in each shuffled 8x8 block is about 7. Supposing each shuffled block has 7 bits of "1" in a bitplane of QCIF size (396 blocks), the number of brute-force trials an attacker has to perform is proportional to C(64, 7)^396, which is equivalent to guessing more than 10^4 encrypted bits. In this situation, a brute-force attacker would rather guess the cryptographic key used in generating the shuffle tables, which is 128 bits long.

Another security aspect is how many bitplanes should be encrypted in order to provide sufficient protection. From our study using a number of video sequences, we found that when the sign information and the first three bitplanes are unknown, adding lower bitplanes to the base-layer video degrades the quality of the base-layer video both perceptually and in terms of PSNR. This is because, without the higher bitplanes and the sign information, the lower bitplanes behave like random noise. A similar observation has also been reported in the streaming video literature [12]. Another observation from the FGS literature is that the first three bitplanes, especially the second bitplane, contribute most to the refinement of the quality [58]. Hence we believe that encrypting the first three bitplanes along with the sign information can provide sufficient protection for most multimedia applications.
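The equivalent key length quoted above follows from a two-line computation:

```python
import math

trials_per_block = math.comb(64, 7)            # ~6.2e8 shuffled patterns per 8x8 block
equiv_bits = 396 * math.log2(trials_per_block) # 396 blocks in a QCIF bitplane
print(f"{equiv_bits:.0f} equivalent key bits") # ~11,600 bits, i.e. more than 10^4
```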

2.3.4 Other Forms of Constrained Shuffling

Since shuffling auditory signals temporally or visual signals spatially can easily make the shuffled signal unintelligible, random shuffling among self-contained coding units (such as the macroblocks in compressed video) has been a popular tool for multimedia encryption and appears in various forms [112, 94, 141, 127]. We refer to this encryption method as coded block shuffling (CBS). A major drawback of block shuffling alone lies in the fact that the information within a block is perfectly retained. An attacker can exploit the correlation across the blocks (such as the continuity of edges and the similarity of colors and textures) and reassemble the shuffled blocks with a moderate number of trials.⁴ The reassembly effort can be significantly smaller than a brute-force search, as many unlikely search directions are pruned [85]. Therefore, block shuffling alone is often not a secure encryption operation. We will incorporate block/macroblock shuffling as a complementary building block to our two proposed operations and explore their combinations in the design of an encryption system in Section 2.5.

⁴ This is usually true when the block is large enough. For small blocks, such as blocks of size 2x2, the correlation information between blocks is hard to exploit.

2.4 Attack Modelling and Security Measurement

In this section, we introduce a notion of multimedia-oriented security to evaluate the security against approximation recovery. To illustrate this concept, we shall use the security of visual data as an example and propose two visual security scores. The principles behind these security scores can be extended to auditory and other types of multimedia.

2.4.1 Approximation Recovery

One simple and common way to evaluate security is to count the number of brute-force trials needed to break the encryption, which is proportional to min{|clear-text space|, |key space|}, where |·| denotes cardinality. Aside from the brute-force search, there are also notions of security that quantify the security of a system in terms of the amount of resources needed to break it [9][10]. However, the traditional all-or-nothing situation in generic data security is not always appropriate for measuring the security of multimedia encryption [112, 133, 141]. Beyond the exact recovery from ciphertext, it is also important to ensure that partial information that is perceptually intelligible does not leak from the ciphertext.

Many forms of multimedia data, such as image, video, audio, and speech, contain inherent spatial and/or temporal correlation. The encrypted multimedia content may be approximately recovered based on syntax, context, and statistical information known a priori [70]. This is possible even when the encrypted part is provably secure according to some generic security notions. For example, in MPEG-4 video encryption [127], when motion vector fields are encrypted and cannot be accurately recovered, a default value of 0 can be assigned to all motion vector fields. For frames with a limited amount of motion, this approximation sometimes results in a recovered frame of fairly good quality. Additionally, statistical information, neighborhood patterns, and/or smoothness criteria can help estimate an unknown area in an image [122] and automatically reorder shuffled image blocks [85]. Although these estimations may not be exact, they can reveal perceptually meaningful information once the estimated signal is rendered or visualized. As an example, we have shown in Fig. 2.5 the experimental result of encrypting the DC prediction residues of the Lenna image. Although the directly rendered version of the encrypted Lenna image in Fig. 2.5 is highly obscured, an attacker can obtain edge information by setting the DCs to a constant and observing the resulting image shown in Fig. 2.9. The edges and contours of the approximated Lenna image are clearly comprehensible, which suggests that it is necessary to encrypt other components in addition to the DCs.


Figure 2.9: Approximated Lenna image obtained by setting all DC coefficients to 0.

Since the value of multimedia content is closely tied to its perceptual quality, this value composition should be reflected in access control and confidentiality protection [66][67]. From the considerations presented above, we propose to evaluate the security of multimedia encryption using the following framework. The encrypted media first undergoes some approximation attacks. We then use perceptual similarity scores to measure the amount of information about the original media that leaks through the approximated media. The results indicate the security of the encryption scheme against the approximation attacks. Next, we discuss the methods and tradeoffs of measuring visual similarity for encryption applications.

2.4.2 Visual Similarity Scores

Studies of the human visual system have shown that the optical characteristics of the eyes can be represented by a low-pass filter [47], and that human eyes can extract coarse visual information from images and videos in spite of a small amount of noise and geometric distortion. The important information extracted by the human visual system includes spatial-luminance information and edge and contour information [36]. Motivated by these studies, we design a luminance similarity score and an edge similarity score to reflect the way humans perceive visual information. These scores quantitatively measure the perception-oriented distance between the clear-text copy of the multimedia and the attacker's copy recovered from the encrypted media. The proposed scores are inspired by recent work on automated image quality measurement [123][124] that incorporated human perceptual properties.

Luminance Similarity Score. To capture the coarse luminance information, we introduce a block-based luminance similarity score. We assume that the two given images are preprocessed to be aligned and scaled to the same size. These two images are first divided into blocks in the same way, using 8 × 8 or 16 × 16 non-overlapping blocks. Then the average luminance values of the i-th block from both images, y_{1i} and y_{2i}, are calculated. We define the luminance similarity score LSS as

LSS ≜ (1/N) ∑_{i=1}^{N} f(y_{1i}, y_{2i}).    (2.9)

Here, the function f(x_1, x_2) for each pair of average luminance values is defined as

f(x_1, x_2) ≜ 1 if |x_1 − x_2| < Δ/2, and f(x_1, x_2) ≜ −α · round(|x_1 − x_2| / Δ) otherwise,

where the parameters α and Δ control the sensitivity of the score. Since the images under comparison may be corrupted by noise during transmission or be misaligned by a few pixels, such noise and perturbation should be suppressed during similarity estimation. The resistance to minor perturbation and noise can be achieved by appropriately choosing the scaling factor α and the quantization parameter Δ. In our experiments, α and Δ are set to 0.1 and 3, respectively. A negative LSS value indicates substantial dissimilarity in luminance between the two images.

Edge Similarity Score. The edge similarity score measures the degree of resemblance of the edge and contour information between two images. After the images are partitioned into blocks in the same way as in the LSS evaluation, edge direction classification is performed for each block by extracting the dominant edge direction and quantizing it into one of eight representative directions equally spaced by 22.5 degrees, as shown in Fig. 2.10. We use indices 1 to 8 to represent these eight directions, and index 0 to represent a block without edges. Denoting by e_{1i} and e_{2i} the edge direction indices for the i-th block in the two images, respectively, the edge similarity score (ESS) for a total of N image blocks is computed as

ESS ≜ ∑_{i=1}^{N} w(e_{1i}, e_{2i}) / ∑_{i=1}^{N} c(e_{1i}, e_{2i}).    (2.10)

Here, w(e_1, e_2) is a weighting function defined as

w(e_1, e_2) ≜ 0 if e_1 = 0 or e_2 = 0, and w(e_1, e_2) ≜ |cos(φ(e_1) − φ(e_2))| otherwise,

where φ(e) is the representative edge angle for an index e, and c(e_1, e_2) is an indicator function defined as

c(e_1, e_2) ≜ 0 if e_1 = e_2 = 0, and c(e_1, e_2) ≜ 1 otherwise.

The score ranges from 0 to 1, where 0 indicates that the edge information of the two images is highly dissimilar and 1 indicates a match between the edges of the two images. A special case arises when the denominator in Eq. (2.10) is zero, which happens when both input images are "blank" without any edge; we assign an ESS score of 0.5 to this special case. In our experiments, the input images are partitioned into non-overlapping 8x8 blocks, and the Sobel operator is used for edge detection [47]. The dominant edge direction of a block is determined by majority voting inside the block, according to the number of pixels associated with each representative direction by the Sobel operator.
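A compact implementation of the two scores might look like the following sketch (grayscale images as 2-D numpy arrays). Two simplifications are assumptions of this sketch rather than part of the scheme above: the dominant direction is taken from the per-block mean gradient instead of a pixel-wise majority vote, and the edge-presence threshold is an arbitrary illustrative value.

```python
import numpy as np
from scipy.ndimage import sobel

def block_view(img, b=8):
    """Crop to a multiple of b and expose (rows, cols, b, b) blocks."""
    h, w = img.shape
    return img[:h - h % b, :w - w % b].reshape(h // b, b, -1, b).swapaxes(1, 2)

def lss(img1, img2, b=8, alpha=0.1, delta=3.0):
    """Luminance similarity score, Eq. (2.9)."""
    y1 = block_view(img1, b).mean(axis=(2, 3)).ravel()
    y2 = block_view(img2, b).mean(axis=(2, 3)).ravel()
    d = np.abs(y1 - y2)
    f = np.where(d < delta / 2, 1.0, -alpha * np.round(d / delta))
    return float(f.mean())

def edge_indices(img, b=8, thresh=20.0):
    """Per-block edge-direction index: 0 = no edge, 1..8 = quantized angle."""
    gx, gy = sobel(img.astype(float), 1), sobel(img.astype(float), 0)
    mgx = block_view(gx, b).mean(axis=(2, 3)).ravel()
    mgy = block_view(gy, b).mean(axis=(2, 3)).ravel()
    mag = np.hypot(mgx, mgy)
    ang = np.mod(np.degrees(np.arctan2(mgy, mgx)), 180.0)
    idx = np.mod(np.round(ang / 22.5).astype(int), 8) + 1
    return np.where(mag > thresh, idx, 0)

def ess(img1, img2, b=8):
    """Edge similarity score, Eq. (2.10), with the blank-image special case."""
    e1, e2 = edge_indices(img1, b), edge_indices(img2, b)
    c = np.where((e1 == 0) & (e2 == 0), 0.0, 1.0)
    phi1, phi2 = (e1 - 1) * 22.5, (e2 - 1) * 22.5
    w = np.where((e1 == 0) | (e2 == 0), 0.0,
                 np.abs(np.cos(np.radians(phi1 - phi2))))
    return 0.5 if c.sum() == 0 else float(w.sum() / c.sum())
```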
Figure 2.10: Eight representative edge directions (e_1 through e_8, spaced 22.5 degrees apart) used in the ESS score evaluation.

2.4.3 Evaluation Framework for Multimedia-Oriented Security

When evaluating image similarity, we first calculate the ESS and LSS scores between the attacked/approximated image and the original image, and then compare the scores with two pre-determined thresholds, ESS_th and LSS_th, respectively. An encrypted image/video is said to pass the similarity test against a certain attack if both the ESS and the LSS are lower than the thresholds. In our experiments, we set ESS_th to 0.5 and LSS_th to 0.

The proposed similarity scores exhibit a tradeoff between capturing the coarse semantic information and the texture details of images. The sensitivity of the two similarity scores to image details can be controlled by the size of the block partition. When small blocks (e.g., 4 × 4) are used, a small amount of noise or geometric distortion can result in scores that indicate dissimilarity for two similar images. We refer to this type of misclassification as a miss. From a security point of view, such a miss would lead to a security breach. When larger blocks (e.g., 32 × 32) are used, the scores tend to identify some images as similar even when their details differ. We refer to this type of misclassification as a false alarm. As preventing information leakage is the main concern in many access control applications, there are usually relatively stringent requirements on keeping misses as low as possible, while a moderate amount of false alarms can be tolerated. Given these considerations, we suggest using 8 × 8 or 16 × 16 blocks in the block partition.

The two similarity scores are intended to measure the amount of information leakage through an attacked image. To this end, other types of perception-based similarity scores, such as robust image hashing [108], can also be incorporated to measure image similarity, especially on the luminance aspect. These hashing methods measure image similarity through the similarity of hash vectors, using the normalized Hamming distance [78] and the receiver operating characteristic (ROC) formulation [108]. The design philosophy of the LSS and ESS scores can be extended to other types of multimedia, such as audio and speech. Similar to the design of an audio hash [78], we can first segment an auditory signal into a set of temporal frames and analyze the components in various frequency ranges of each frame. The results from each frequency range of interest (such as low frequency corresponding to LSS, and high frequency corresponding to ESS) can be examined and compared to arrive at an auditory similarity measurement.


2.5 Application Example: Video Encryption System Design

In this section, we present a framework for a video encryption system that employs the building blocks proposed in this chapter and from the literature. Using this encryption system, several example configurations are presented, and the encrypted videos are compared in terms of security against brute-force and approximation attacks, friendliness to delegate processing, and compression overhead.

Figure 2.11: Video encryption system description. The encryption system is divided into two layers, and candidate encryption components and methods are listed for each layer.

2.5.1 System Setup

Designed with scalable video in mind, the video encryption system has two layers, as shown in Fig. 2.11. The base-layer video is coded with the MPEG-4 standard and the enhancement layer with the MPEG-4 FGS standard. The size of the group of pictures (GOP) is set to 15, and all predicted frames are set to P frames. For each layer, we provide candidate encryption methods and components to be encrypted. Our experiments are conducted on a Dell workstation with a 1.8 GHz Pentium IV CPU and 512 MB RAM.

The two encryption operations proposed earlier in this chapter, namely, the generalized index mapping with controlled overhead (GIMCO) and the intra bitplane shuffling (IBS), lend themselves naturally as building blocks for this system. A third building block is the coded block shuffling discussed in Sec. 2.3.4. The index mapping encryption can be applied to intra-block DC/AC coefficients and inter-block motion vector (MV) residues; the intra bitplane shuffling can be applied to FGS bitplanes; and the coded block shuffling can be applied to macroblock (MB) coding units. We use AES [26] with a 128-bit key to generate the pseudo-random numbers for all encryptions.
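The exact keystream construction is not spelled out here; one standard realization consistent with "AES with a 128-bit key generating pseudo-random numbers" is AES in counter mode, sketched below using the third-party `cryptography` package. The per-component nonces are hypothetical labels, not part of the system specification.

```python
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def keystream_bytes(key: bytes, nonce: bytes, n: int) -> bytes:
    """n pseudo-random bytes from AES-128 in counter mode; distinct nonces
    keep the streams for DC, AC, MV, and shuffle-table use separate."""
    enc = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
    return enc.update(b"\x00" * n)

key = bytes(16)                                  # demo only; use a real 128-bit key
dc_stream = keystream_bytes(key, b"\x00" * 15 + b"\x01", 1024)  # e.g. DC encryption
```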

2.5.2 Bitrate Overhead Study for Index-Mapping Encryption

As discussed in Section 2.2, index-mapping based encryption introduces bitrate overhead. The overhead can be controlled by carefully choosing the set of components to encrypt and by using the set partitioning technique. In this part, we study the bitrate overhead under different encryption settings. A test video with 4000 frames in QCIF format is constructed by concatenating nine classic video clips: Carphone, Claire, Foreman, Grandma, Miss America, Mother-Daughter, Salesman, Suzie, and Trevor. After an encryption range is chosen for each component, the range can be further partitioned into two subsets. For example, the encryption range [−63, 64] for the DC residue can be partitioned into [−31, 32] and [−63, −32] ∪ [33, 64]. We also tested encrypting the first two non-zero AC coefficients along the zig-zag scanning order in each intra block. The encryption range of the motion vector residue is [−16, 15.5] with half-pixel accuracy, which is also the MPEG-4 standard coding range. The encryption ranges of the AC coefficients and the MV residue can be partitioned into subsets in a similar fashion to the DC encryption. Since the MVs are predictively coded, encrypting the MVs of one frame propagates the scrambling effect to future frames, suggesting that encrypting the MVs of a subset of frames would be sufficient. For comparison, we include one experiment in which the MVs of the first two P frames in a GOP are encrypted, and another in which all MVs are encrypted; the latter serves as an upper bound on the bitrate overhead of MV encryption.

The above-mentioned components are encrypted individually, and the compressed file sizes are shown in Table 2.1. From Table 2.1, we can see that the compression overhead incurred by encrypting the DC prediction residue ranges from 3% to 7%, depending on the encryption range and whether set partitioning is used. Encrypting AC and MV generally incurs a larger overhead than encrypting DC, especially when the motion vectors of all P frames are encrypted. However, limiting the encrypted MVs to the first two P frames in each GOP can reduce this overhead to around 5%. Also evident throughout Table 2.1 is that set partitioning is an effective way to control the overhead. These results provide a basis for designing a video encryption system, which is presented in the next subsections.

2.5.3 Base-Layer Video Encryption

In this part, we present experimental results for base-layer video encryption and analyze the security of different configurations. Four video clips, ranging from fast motion to slow motion, are used in our experiment: Football, Coastguard, Foreman, and Grandma, each 40 frames long.

Table 2.1: Index Mapping Encryption Overhead Comparison

Component Encrypted   Encryption Range   Set Partition   File Size (MBytes)   Relative Overhead
None                  --                 --              1.88                 --
DC                    [-127, 128]        no              2.01                 6.9%
DC                    [-127, 128]        2 sets          1.98                 5.3%
DC                    [-63, 64]          no              1.98                 5.3%
DC                    [-63, 64]          2 sets          1.94                 3.2%
AC                    [-64, 64]          no              2.12                 12.8%
AC                    [-64, 64]          2 sets          2.04                 8.5%
AC                    [-32, 32]          no              2.04                 8.5%
AC                    [-32, 32]          2 sets          1.99                 5.9%
MV                    [-16, 15.5]        no              2.36                 25.5%
MV                    [-16, 15.5]        2 sets          2.26                 20.2%
MV of 2 P frames      [-16, 15.5]        no              1.98                 5.3%
MV of 2 P frames      [-16, 15.5]        2 sets          1.96                 4.3%

Encryption is performed under the settings shown in the left column of Table 2.2, where the encryption of DC, AC, and/or MV is based on the proposed generalized index mapping. The DC and AC encryption ranges are chosen as [−63, 64] and [−32, 32] with set partitioning, respectively; the motion vector encryption is applied to the first two P frames in each GOP. Additionally, macroblock shuffling in the compressed bitstream is applied to every frame. Each encrypted video complies with the syntax prescribed in the MPEG-4 standard. We also consider approximation attacks against these settings to emulate an adversary's actions, and list them in the right column of Table 2.2. The security of these encryption configurations is discussed in detail below.

Security Against Exact Recovery by Exhaustive Search. To accurately recover an original I frame from the encrypted one, an attacker needs to recover all the DC coefficients. For P frames, recovering the values of the motion vectors is also necessary. In the above configuration, each DC coefficient and motion vector component has 6 bits encrypted. From the discussion in Section 2.4, each I frame in QCIF format has 2376 equivalent DC bits encrypted, and the encrypted motion vectors in a P frame are equivalent to 1188 bits. Since a 128-bit key is used, the security against exact recovery by exhaustive search is determined by the cryptographic primitive with a 128-bit key.

Visual Security Against Approximation Recovery. To evaluate the visual security of our encryption system, we first encrypt the test video and then apply approximation attacks to the encrypted video.

Table 2.2: Encryption and attack settings for security analysis

Encryption System Settings:
(E1) encrypt intra-block DC residues by index mapping;
(E2) encrypt inter-block MV residues in the first two P frames of a GOP, and all intra-block DC residues;
(E3) encrypt all the components in E2, plus the first two (in zig-zag scan order) non-zero AC coefficients of each intra block;
(E4)–(E6) correspond to E1–E3 plus macroblock shuffling in the compressed bitstream, respectively.

Approximation Attack Settings:
(A1) set all intra-block DC coefficients to 0;
(A2) set all intra-block DC coefficients to 0 and set the encrypted motion vector values to 0;
(A3) all the approximations in A2, plus setting the encrypted AC coefficients to 0;
(A4)–(A6) the same as A1–A3, respectively.

After that, we obtain the ESS and LSS scores of the approximated video and compare them with the thresholds ESS_th = 0.5 and LSS_th = 0. An encryption is considered not secure enough when either score is above the corresponding threshold. Fig. 2.12 and Fig. 2.13 show the video encryption results under different settings for the Football and Coastguard clips. The results presented are for the Y components, as they carry most of the information in the video. Visual examination suggests that encrypting DC alone still leaks contour information after approximation attacks, while extending the encryption to MV and/or some ACs helps diffuse the contours and reduce the information leakage. Furthermore, shuffling self-contained coding units such as macroblocks, coupled with the above value encryption, can scramble the content to a completely unintelligible level. Table 2.3 lists the average ESS and LSS (averaged over the 40 frames) of the videos after approximation recovery. From the average LSS and ESS scores we can see that, when coded block shuffling is not used as an encryption tool, only the LSS score is below its security threshold of 0, while the ESS score is around or above its threshold of 0.5. This indicates that the encryption leaks shape information and is not secure enough, which we can also observe in Fig. 2.12 and Fig. 2.13. Once coded block shuffling is incorporated into the encryption, the ESS and LSS indicate that the encryption is secure against approximations. These results concur with the visual examination.

Figure 2.12: Encryption results for Football. Approximation attacks are performed after encryption. The encryption-approximation settings are: (top left) unencrypted; (top right) E1-A1; (bottom left) E2-A2; (bottom right) E5-A5.

To examine the ESS scores in detail, we plot the frame-by-frame ESS scores of Coastguard under different encryption-attack settings in Fig. 2.14. The top curve is from the attacked video with only the DC encrypted, which confirms that encrypting DC alone still leaves some contour information unprotected. The two middle curves are the results involving MV encryption for inter blocks and AC encryption for intra blocks, where the ESS scores are low at the beginning of a GOP and increase substantially toward the end of the GOP. This is because, as it approaches the end of a GOP, motion compensation becomes less effective and the compensation residue provides a significant amount of edge information. This observation suggests that if we can only afford the bitrate overhead to encrypt two P frames in a GOP, the two encrypted P frames should be interleaved, for example by choosing the 1st and the 8th P frames in a GOP of 15 frames. On the other hand, by incorporating the shuffling of macroblock coding units, the resulting ESS measurements are consistently around 0.1 or lower.

Figure 2.13: Encryption results for Coastguard. The encryption-approximation settings are: (top row, left to right) unencrypted, E1, E1-A1, E2-A2; (bottom row, left to right) E3-A3, E4-A4, E5-A5, E6-A6.


Table 2.3: Perception-based security measures for video encryption

            Football        Grandma         Coastguard      Foreman
Settings    ESS    LSS      ESS    LSS      ESS    LSS      ESS    LSS
E1-A1       0.70   -0.78    0.64   -2.13    0.79   -1.18    0.71   -1.42
E2-A2       0.53   -0.85    0.46   -2.13    0.43   -1.19    0.43   -1.48
E3-A3       0.53   -0.86    0.30   -2.13    0.40   -1.20    0.40   -1.48
E4-A4       0.12   -0.93    0.05   -2.13    0.07   -1.20    0.07   -1.47
E5-A5       0.13   -0.92    0.05   -2.13    0.06   -1.21    0.06   -1.45
E6-A6       0.12   -0.92    0.04   -2.13    0.04   -1.20    0.05   -1.47

Relative Overhead. Table 2.4 lists the compression overhead for the four videos under each encryption setting. We can see that the overhead is low for high-complexity, fast-motion videos such as Football and Foreman, and relatively high for low-complexity, slow-motion videos such as Grandma and Coastguard. As we go from setting E1 to E3, more components are encrypted and thus the overhead increases. We also see that the coded block shuffling approach introduces no overhead, as shown in settings E4 to E6. Overall, the overhead of 4–11% under E1, E2, E4, and E5 is comparable to that of a direct adaptation of generic encryption to multimedia, as discussed in Section 2.1.1. Considering both security and compression overhead, we have found that the E5 setting provides a very good tradeoff. This setting is a combination of block shuffling and selective value encryption via generalized index mapping.


Figure 2.14: Frame-by-frame ESS of the Coastguard video sequence under different settings. The corresponding settings are listed in Table 2.2.

2.5.4 Protecting FGS Enhancement Layer Video

We use 10 frames from the Foreman video sequence to demonstrate the protection of the enhancement data while preserving the FGS characteristics of the source coding. The proposed intra bitplane shuffling encryption is applied within each 8x8 block, and the sign bit of each coefficient is encrypted using a stream cipher. To allow a better visual examination of the protection effects on the enhancement data, we combine the encrypted FGS bitplanes with a clear-text base layer. For most natural images, the coded DCT coefficients have a decreasing dynamic range versus frequency. As such, we emulate an approximation attack whereby, for each significant bitplane, all the "1"s of the bitplane are put at the lowest possible frequency bins. A total of six encryption-attack settings are used, namely:


Table 2.4: Relative Compression Overhead of the Encrypted Videos

Settings    Football   Foreman   Coastguard   Grandma
E1 and E4   1.29%      1.75%     3.15%        6.96%
E2 and E5   3.88%      6.41%     8.74%        11.11%
E3 and E6   6.47%      9.62%     11.54%       24.61%

(a) to shuffle the first bitplane, with a clear-text base layer;
(b) to approximate the bitplane of (a) with the correct signs;
(c) to approximate the bitplane of (a) with random signs;
(d) to shuffle the first two bitplanes;
(e) to approximate the bitplanes of (d) with the correct signs;
(f) to approximate the bitplanes of (d) with random signs.

Fig. 2.15 shows the encrypted and attacked versions of the Foreman FGS video. The first row shows the encryption and approximation results using the MSB only (settings (a), (b), and (c)), and the second row shows the results using the first two bitplanes (settings (d), (e), and (f)). The blocky artifacts in settings (d), (e), and (f) are clearly more pronounced than in settings (a), (b), and (c). This suggests that involving more encrypted bitplanes worsens the approximation attack results. Within each row, the visual differences are very subtle. This observation verifies that, without knowing the decryption key for the enhancement layer, an attacker cannot obtain a more refined video than the base-layer video by approximating the encrypted enhancement layer. Table 2.5 lists the corresponding average PSNR, LSS, and ESS of the videos under the six encryption-attack settings.

Figure 2.15: Encryption results for Foreman FGS video. Top row, left to right: base-layer picture; encryption-attack settings (a), (b), (c). Bottom row, left to right: base layer plus 2 clear-text FGS bitplanes; encryption-attack settings (d), (e), (f).

From the table, we can see that the approximation recovery only slightly reduces the luminance error, in terms of LSS and PSNR, of the approximated video compared to the encrypted video, and that the edge similarity in terms of ESS remains imperfect and improves little after the attack. This demonstrates that the proposed method can protect the premium-quality version of the content in an FGS-compatible way, with a key separate from the base-layer video encryption.

2.6 The Security of Media Authentication Using Image Hashing

In addition to media encryption, multimedia content authentication is equally important in secure multimedia communications. In this section, we focus on the security of image hashing schemes for content authentication.

Table 2.5: Intra Bitplane Shuffling and Approximation Attack

            (a)     (b)     (c)     (d)     (e)     (f)
PSNR (dB)   28.59   28.76   28.74   27.39   27.87   27.50
LSS         0.28    0.34    0.34    0.28    0.34    0.29
ESS         0.85    0.85    0.85    0.85    0.85    0.85

We first adapt the unicity distance concept to evaluate the security of image hashing. Using the unicity distance framework, we analyze the security of two recently proposed image hashing schemes [110][117]. Our analysis shows that the secret hashing key can be disclosed with high accuracy when the key is re-used several dozen times. This result determines the maximum number of key reuses in the investigated hashing schemes. One way to mitigate such key disclosure is to generate a new hashing key from a master key for each different image.

2.6.1 An Analysis of Image Hashing

Most image hashing schemes consist of two steps: randomized feature extraction and post-processing of the image features. To achieve robustness and security, the feature extraction stage is particularly important. If the image feature can be generated without the secret key, an attacker can try to create a different image that produces the same image feature. Since this feature will result in the same image hash, the forged image becomes an artificial pre-image of the given hash. Hence, randomization techniques must be incorporated into the feature extraction stage to prevent hash forgery [37][110]. On the other hand, the robustness of image hashing relies largely on the robustness of the image features. A robust image feature extraction scheme must withstand minor distortions to the image, such as filtering, compression, noise corruption, and moderate geometric changes involving rotation, scaling, and translation.

The post-processing of the image feature involves quantization and compression of the feature. Further randomization, such as a permutation of the hash vector values, can be introduced in this step. However, to fully preserve the robustness obtained in the image feature, the newly introduced randomness at this stage is usually limited [109]. For example, if a cryptographic cipher is applied to the image feature, the ultimate hash will change dramatically even if some feature values change by only one bit of information. Various image hashing schemes differ mainly in the way randomized features are extracted, because similar post-processing techniques can usually be applied to almost all randomized feature extraction schemes. For these reasons, we concentrate on the feature extraction stage and assume that the randomized image feature can be observed as the hash output. At the end of this section, we will further discuss how to model and estimate the randomization in the post-processing stage.

Let us denote an image hash function by h(·), the input image by I, the secret key by K, and the resulting hash vector by v. When the same key K is used to generate n image hashes, we observe the pairs (I_1, v_1), (I_2, v_2), ..., (I_n, v_n). The conditional entropy of the secret key K with respect to these observations is H(K | {(I_j, v_j)}_{j=1}^{n}). According to information theory, such conditional entropy generally decreases as n increases. In our investigation of image hashing security, we mainly focus on how the uncertainty in the secret key decreases in robust image hashing schemes. Our exploration takes the following form. We randomly choose a hashing key and generate multiple hashes using some natural images. The image-hash pairs are then provided to a key estimation algorithm, which outputs an estimated hashing key. This estimate is gradually refined as the number of observed image-hash pairs increases. We expect the estimated key to move closer and closer to the actual key, until finally the two keys can be regarded as functionally identical for hash generation purposes. This "functional identicality" of two keys that differ in only a few bits is a unique property of perceptual hashing schemes with a robustness constraint.

The secret key for most image hashing schemes can be considered a random variable/vector. Due to the robustness constraint in constructing the image hash, for the same input image, adding a small perturbation vector to the key vector usually results in a hash similar to the original hash. This observation indicates that we can measure the distance d_K between the estimated hashing key and the actual hashing key, and use d_K to predict the distance d_H between the hash vectors produced by these two keys from the same input image. Conversely, we can also use d_H to predict the distance d_K. These two observations form the basis of the key estimation algorithms presented later. In the subsequent discussions, we use two vector norms to measure the distances between key vectors and between hash vectors, namely, the L1 norm and the L2 norm.

Next, we analyze the security of two image hashing schemes. The first scheme was proposed by us in [110]. This scheme is among the image hashing schemes with the best performance in terms of robustness and security, and we were the first to propose an analytical framework for quantitatively evaluating hash security based on a single input image and output hash [110]. The second scheme we analyze was proposed by Venkatesan et al. in [117]. The authors of [117] are among the first to propose a perceptual hashing algorithm and to recognize that randomized feature generation is important to the robustness and security of image hashing. Analyzing these two representative schemes can provide insight into the security and limitations of robust image hashing algorithms.

2.6.2 Unicity Distance for the Polar Fourier Transform Hash

Our recently proposed robust and secure image hashing scheme [110] is based on the Fourier transform of the image in polar coordinates. To obtain the hash, the FFT of an image is first taken, and the Fourier coefficients are represented in polar coordinates with zero frequency at the origin. Along the normalized magnitude coordinate ρ, m equally spaced magnitudes ρ_1, ρ_2, ..., ρ_m are chosen, ranging from 0 to 0.4. Then, for each chosen magnitude ρ_i, the Fourier coefficients are circularly summed to obtain one component of the feature vector, f_i. The circularly summed Fourier coefficients form a vector f = [f_1, f_2, ..., f_m]^T. To randomize this feature vector, a linear transformation is performed on f to obtain the randomized feature

r = [r_1, r_2, ..., r_m]^T = K f.

Here the matrix K is an m-by-m key matrix. Each element of K is a Gaussian distributed random variable with mean zero and variance one. We can also decompose this key matrix into m row vectors, where each row k_i^T is a component key of dimension 1-by-m. Thus we have

K = [k_1, k_2, ..., k_m]^T,   r_i = k_i^T f.

When n image-hash pairs are observed, we have (f_1, r_1), (f_2, r_2), ..., (f_n, r_n). Defining F = [f_1, f_2, ..., f_n] and R = [r_1, r_2, ..., r_n], we obtain

R_{m×n} = K_{m×m} F_{m×n}.    (2.11)


There are two situations when solving the above linear system. When n ≥ m and m linearly independent columns of F exist, we can extract those m columns of F as F̃ and the corresponding m columns of R as R̃. Since F̃ is invertible, we solve the linear equation R̃ = K F̃ to obtain K = R̃ F̃^{−1}. Ignoring numerical inaccuracy in the solution, this gives an exact estimate of the secret key. When n < m, or when the columns of F do not contain an invertible submatrix, Eq. (2.11) becomes an under-determined system, and a number of existing solutions can be applied; for example, we can use the least-squares method to obtain an estimate K̂ of the key matrix. Each row k̂_i^T of K̂ corresponds to a hash component key. The L1 and L2 norms of the estimation error for a component key are d_1 = ||k̂_i − k_i||_1 and d_2 = ||k̂_i − k_i||_2, respectively. We expect both d_1 and d_2 to decrease as n increases. Since each column f_i of the matrix F is generated from a different image, we expect F to have full column rank with high probability; thus, when n = m, the key is usually uniquely determined.

Experimental Validations

To verify our analysis, we conduct key estimation experiments using the method discussed above. We set m = 64 as in [110]. According to statistics gathered from several hundred natural images, we generate the circular sums of Fourier transform coefficients at different radii, f = [f(ρ_1), f(ρ_2), ..., f(ρ_m)]^T. For each randomly generated key vector k, we perform key estimation using different numbers (1 ≤ n ≤ 64) of observed image-hash pairs. Then we compute the key estimation errors d_1(n) and d_2(n). We repeat the above experiment 5000 times and compute the average key estimation error as a function of n. Fig. 2.16 shows that the key estimation error remains approximately constant for 1 ≤ n ≤ 40. When n > 40 the error gradually decreases; when n = 64, the error becomes zero and we obtain the exact key. This experiment shows that the unicity distance of the polar Fourier transform hash is equal to m = 64.
Figure 2.16: Average key estimation error (L1 and L2 norms) for the polar FFT hash, as a function of the number of observed hashes n; results are averaged over 5000 runs.
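The estimation step itself is a few lines of linear algebra. Below is a minimal numpy sketch in which synthetic Gaussian features stand in for the circular Fourier sums of real images, so the error curve will differ in shape from Fig. 2.16, but the collapse of the error at n = m is reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 64
K = rng.standard_normal((m, m))          # secret key matrix (unknown to attacker)

def estimate_key(n):
    """Least-squares key estimate from n observed feature/hash pairs."""
    F = rng.standard_normal((m, n))      # synthetic features (stand-in for f_i)
    R = K @ F                            # observed randomized features r_i = K f_i
    # Solve K_hat F = R in the least-squares sense via F^T K_hat^T = R^T.
    K_hat_T, *_ = np.linalg.lstsq(F.T, R.T, rcond=None)
    return K_hat_T.T

for n in (10, 40, 64):
    err = np.abs(estimate_key(n) - K).mean()
    print(n, err)                        # error collapses to ~0 once n = m
```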

2.6.3 Unicity Distance for a Hash Using Randomized Spatial Statistics

The second representative image hashing scheme that we investigate was proposed by Venkatesan et al. in [117]. This hashing scheme uses randomized spatial statistics of the input image as the image feature. To generate one feature component, the secret key is used to produce a four-tuple of rectangle parameters, p = (x, y, h, w). Here (x, y) is the coordinate of the upper-left corner of the rectangle, and h and w are its height and width, respectively. These parameters determine a rectangular region in the input image. The randomized spatial statistic can be either the mean or the variance of this region. To obtain a feature vector, the parameters of multiple random rectangles are generated. In this scheme, the equivalent secret key can be regarded as the four-tuple parameter set (x, y, h, w). Next, we show how to formulate the estimation of these parameters as a numerical optimization problem.

Suppose the mean (or variance) of the luminance values of the selected image region is used as the randomized image feature, as in [117]. We denote the process of computing this statistic as a function v = h(p, I), where p denotes the secret key (the rectangle parameters), I the input image, and v the regional statistic. Our goal is to estimate the secret parameter set p based on observed image-hash pairs. As the content of natural images is mostly smoothly varying, the function h(p, I) varies smoothly with the parameter p for any given I. If an estimated parameter p̂ produces the hash value v̂ = h(p̂, I), we can use the distance between v and v̂ as an indication of how accurate the estimate p̂ is. With a single image-hash pair this may not be conclusive, since many image regions can produce very close regional statistics. When multiple hashes are generated using the same key but different images, we can observe (I_1, v_1), (I_2, v_2), ..., (I_n, v_n). Let v = [v_1, v_2, ..., v_n]^T. The hash vector obtained using the estimated key p̂ is v̂ = [v̂_1, v̂_2, ..., v̂_n]^T, where v̂_i = h(p̂, I_i). We formulate the key estimation problem as follows: given (I_1, v_1), (I_2, v_2), ..., (I_n, v_n), find p̂ that maximizes the normalized correlation ρ between v and v̂, where

ρ = v^T v̂ / (||v|| · ||v̂||).    (2.12)


To e?ciently solve the above problem, we devised an iterative search algorithm that consists of two stages: the initialization stage and search re?nement stage. We consider all the input images are of the same size; otherwise, a preprocessing step can be used to achieve this e?ect. Table 2.6: Twelve sets of increments for parameter update
1 ?x ?y ?w ?h ?d 0 0 0 2 ?d ?s 0 0 3 0 ?s 0 0 4 d ?s 0 0 5 d 0 0 0 6 d s 0 0 7 0 s 0 0 8 ?d s 0 0 9 ?d/2 0 d/2 0 10 d/2 0 ?d/2 0 11 0 ?s/2 0 s/2 12 0 s/2 0 ?s/2

Iterative Search Algorithm

To efficiently solve the above problem, we devised an iterative search algorithm consisting of an initialization stage and a search refinement stage. We assume all input images are of the same size; otherwise, a preprocessing step can be used to achieve this. Our search algorithm is similar in spirit to the hierarchical block matching algorithm used in video compression [90]. Different spatial regions are tested to find an initial rectangle that approximates the actual rectangle used in hash generation; the initial rectangle is then successively refined through a hierarchical search to obtain the final rectangle estimate.

Initialization: The initialization stage obtains the approximate location and size of the rectangle. We partition the entire image area in several ways. The first partition is a non-overlapping partition into 4-by-4 equal-size rectangles. The second is an 8-by-8 non-overlapping uniform partition. The third is a 4-by-4 overlapping partition, with the width and height of each partition rectangle being 2/5 of the image width and height, respectively; the horizontal and vertical overlaps between neighboring rectangles are 1/5 of the image width and height, respectively. These partitions give a collection of 96 initial parameter sets. Next, for each initial parameter set, we compute the regional statistics using all input images and the normalized correlation with the given image hashes as in Eq. (2.12). The parameter set t = (t_x, t_y, t_w, t_h) that achieves the maximum correlation among the 96 partitions is chosen as the initial input to the next stage.

Iterations: In this stage, we update the rectangle parameters iteratively. Each iteration tries to update the current parameters to obtain a larger normalized correlation ρ. The parameters are updated by adding an increment or decrement:

x = x + δ_x,   y = y + δ_y,   w = w + δ_w,   h = h + δ_h.

We use twelve sets of parameter increments in our algorithm, as shown in Table 2.6. These increments correspond to translating the current rectangle in eight directions toward its neighboring areas, combined with horizontal (or vertical) expansion (or shrinking) of the current rectangle. The incremental parameters d and s are obtained from the current rectangle parameters as d = λw/2 and s = λh/2, where λ is a step-size factor. We start with λ = 1. If none of the updated parameter sets yields a larger normalized correlation ρ, we halve λ and recompute the updated parameters. This process is repeated until we find an updated parameter set that increases ρ in (2.12). The iteration process is illustrated in Fig. 2.17.

Termination conditions: The iteration stage terminates when one of the following three conditions is met. The first condition is that the normalized correlation approaches 1 within a pre-determined margin ε. The second is that the incremental parameter d or s becomes smaller than 1, which indicates that we cannot find a better parameter value within one-pixel accuracy. The third is that the number of iterations reaches a pre-determined maximum.
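One way to implement the inner loop efficiently is with an integral image, so that each candidate rectangle's mean costs O(1). The sketch below shows the objective evaluation; the search driver over Table 2.6's increments is omitted for brevity, and the candidate-enumeration interface is an illustrative assumption.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row/column prepended."""
    return np.pad(img.astype(np.float64).cumsum(0).cumsum(1),
                  ((1, 0), (1, 0)))

def region_mean(ii, x, y, w, h):
    """Mean luminance of the rectangle with upper-left corner (x, y)."""
    total = ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]
    return total / (w * h)

def rho(p_hat, integrals, v):
    """Normalized correlation of Eq. (2.12) for candidate parameters p_hat."""
    x, y, w, h = p_hat
    v_hat = np.array([region_mean(ii, x, y, w, h) for ii in integrals])
    return float(v @ v_hat / (np.linalg.norm(v) * np.linalg.norm(v_hat)))

# Usage sketch: images is a list of equal-size grayscale arrays, and v holds
# the observed hash values produced by the unknown rectangle.
# integrals = [integral_image(im) for im in images]
# best = max(candidates, key=lambda p: rho(p, integrals, v))
```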

Figure 2.17: Illustration of the search algorithm for the unknown rectangle parameters.


Experimental Validations

We collected 4000 images from 11 video sequences in QCIF format (frame size 176 × 144) and applied the proposed key estimation test. In each experiment, we randomly choose n images from this collection, generate hashes for these images using a secret key, and then perform key estimation. For each fixed n, we repeat the experiment 400 times using different keys and calculate the average estimation errors in the rectangle parameters, (e_x, e_y, e_w, e_h). Fig. 2.18 shows the absolute values of the average estimation errors for different values of n. As we can see, the error decreases quickly as n increases. In Fig. 2.19 we also plot a histogram of the width estimation error for n = 40; about half of the estimation errors have absolute value smaller than 3. Within a range of several pixels, the estimated rectangle will produce nearly identical hash values. Since most perceptual hash verification relies on a soft hash distance metric that measures the normalized Hamming distance [79], the probability of forging a valid hash for a new image becomes very high. Therefore, the unicity distance of the hashing scheme using randomized spatial statistics is approximately 40.

Figure 2.18: Key estimation results for image hash using regional mean.



Figure 2.19: Histogram of the width estimation error.

2.6.4 Discussions

Our investigation into the security of robust image hashing shows that the secret hashing key, or its equivalent form, can be revealed with high accuracy when the key is re-used a few dozen times. We believe such a key disclosure problem exists for all robust image hashing schemes, mainly due to the design requirement that perceptually similar images should have similar hashes, which is a drastically different criterion from cryptographic hash design. One way to mitigate such key disclosure is to use a new hashing key for each different image, because the unicity distance is usually more than 10 observed image-hash pairs. To reduce the increase in key storage cost, the hashing key can be generated by a key derivation function (a hash function) using a master key and a random number (the "salt") as input [52]. The salt value can be transmitted with the image content in clear-text.
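To make this mitigation concrete, here is a minimal Python sketch of the salt-based key derivation, assuming HMAC-SHA256 as the key derivation function (the text only requires some keyed hash function); all names are illustrative.

```python
# Sketch of per-image hashing-key derivation (master key + salt), as
# suggested above. HMAC-SHA256 serves as the key derivation function;
# names and parameter sizes are illustrative assumptions.
import hmac, hashlib, os

def derive_image_hash_key(master_key: bytes, salt: bytes) -> bytes:
    """Derive a fresh hashing key from the master key and a public salt."""
    return hmac.new(master_key, salt, hashlib.sha256).digest()

master_key = os.urandom(32)   # long-term secret
salt = os.urandom(16)         # fresh random salt per image; can be sent
                              # in clear-text alongside the image content
image_key = derive_image_hash_key(master_key, salt)
```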
We have demonstrated in this section how to employ the unicity distance concept in analyzing the feature extraction stage of image hashing. For operations in the post-processing stage of hash generation, such as the randomized quantization [79] and hash vector permutation, the concept of unicity distance can also be applied. In a unified framework, the randomization parameters in both the feature generation stage and the post-processing stage can be searched at the same time. The exhaustive search space may become very large. However, by exploiting the robustness constraint, incorrect search directions can be quickly pruned and the effective search space becomes much smaller. We believe that through correct mathematical modelling and by using signal processing techniques, the proposed analytical framework based on the unicity distance concept can be extended to other robust perceptual hashing schemes.

2.7 Chapter Summary

In this chapter, we discussed the unique security issues related to multimedia encryption and content authentication. For media encryption, we proposed two atomic encryption operations that can preserve standard compliance and are friendly to delegate processing. We also provided quantitative analysis and demonstrated that a good tradeoff can be made between security and bitrate overhead. We pointed out the need to quantify the security against approximation attacks that are unique to multimedia, and proposed a set of multimedia-oriented security scores to complement the security metrics for generic data. Using video as an example, we presented a systematic study on how to integrate different atomic operations to build a video encryption system that achieves a good tradeoff among security, friendliness to delegate processing, and bitrate overhead. For media authentication, we adapted the concept of unicity distance to evaluate the security of image hashing. Our exploration of two recent image hashing schemes shows that the secret hashing key can be disclosed with high accuracy when the key is re-used a few dozen times. One way to mitigate such key
disclosure is to generate a new hashing key from a master key for each different image.

2.8 Appendix: Derivations

2.8.1 Overhead Analysis for Generalized Index Mapping

In this appendix section, we present the derivations for the overhead analysis of generalized index mapping as discussed in Section 2.2. We start with a default entropy codebook being used to encode encrypted data and prove Eq. (2.3). Recall that r denotes the probability distribution {r_i = P(R = i) = 2^{-l_i} / \sum_k 2^{-l_k}}. Representing the constant in the denominator as c, we have l_i = -\log r_i - \log c. Expanding \Delta L leads to

\Delta L = \sum_i (q_i - p_i) l_i   (2.13)
         = -\sum_i (q_i - p_i) \log r_i - \sum_i (q_i - p_i) \log c   (2.14)
         = \Big(-\sum_i q_i \log r_i + \sum_i q_i \log q_i\Big) + \Big(-\sum_i q_i \log q_i + \sum_i p_i \log p_i\Big) + \Big(-\sum_i p_i \log p_i + \sum_i p_i \log r_i\Big)   (2.15)
         = D(q||r) + D(p||q) - D(p||r).   (2.16)

In the above derivation, the second summation in (2.14) is zero; and the second term in (2.16) is obtained by exploiting the piece-wise constant property of q, i.e., \sum_{i \in S_j} p_i = \sum_{i \in S_j} q_i = |S_j| q^{(S_j)}, leading to

-\sum_i q_i \log q_i = -\sum_j \Big(\sum_{i \in S_j} q_i\Big) \log q^{(S_j)}
                     = -\sum_j \Big(\sum_{i \in S_j} p_i\Big) \log q^{(S_j)}
                     = -\sum_i p_i \log q_i.

Using the same technique, we can prove Eq. (2.4) for the overhead in the adaptive entropy coding case. That is,

\Delta L = H\{q_i\} - H\{p_i\}
         = -\sum_i q_i \log q_i + \sum_i p_i \log p_i
         = -\sum_i p_i \log q_i + \sum_i p_i \log p_i
         = D(p||q).

2.8.2 Overhead Analysis for Intra-bitplane Shuffling

In this appendix section, we analyze the zero-run symbol distribution after intra-bitplane shuffling as discussed in Section 2.3. As we have seen in the main text, a macroblock in MPEG-4 FGS coding consists of four 8 × 8 luminance blocks, and zero-run symbols are formed within each block. Intra-bitplane shuffling can be done within a block or within a macroblock. For generality, we refer to the block to which shuffling is applied as the shuffling block and denote its size by n; similarly, the block to which run-length encoding is applied is the coding block, with size n_B. In the case of shuffling within an 8 × 8 block, the two blocks are identical and n = n_B = 64; while for the case of shuffling within a
macroblock and run-length encoding within an 8 × 8 block, we have n_B = 64 and n = 256. We assume that there are n_1 "1"s in the shuffling block, hence the total number of zero-run symbols is n_1 both before and after shuffling. We derive the zero-run symbol histogram conditioned on n_1. We use an indicator function I_k^{(d)} to denote the following event on the bitplane of interest for coefficients in a coding block: when ones appear in both the k-th and (k+d+1)-th positions with zeros in between (i.e., the bits from the (k+1)-th position to the (k+d)-th position form a zero-run symbol with run-length d), I_k^{(d)} = 1; otherwise, I_k^{(d)} = 0. The range of k to be considered for I_k^{(d)} is between 0 and (n_B - d - 1), where I_0^{(d)} = 1 indicates that the first zero-run symbol of the block has run d. The expected number of symbols with zero-run d in a coding block can be obtained as follows:

E(N_{B,d} \mid n_1) = E\Big[\sum_{k=0}^{n_B-d-1} I_k^{(d)} \,\Big|\, n_1\Big]
                    = \sum_{k=1}^{n_B-d-1} \Pr(I_k^{(d)} = 1 \mid n_1) + \Pr(I_0^{(d)} = 1 \mid n_1)
                    = (n_B - d - 1) \times \binom{n-d-2}{n_1-2} \Big/ \binom{n}{n_1} + \binom{n-d-1}{n_1-1} \Big/ \binom{n}{n_1}
                    = \prod_{i=0}^{d-1} \Big(1 - \frac{n_1}{n-i}\Big) \times \Big[\frac{n_1 \times n_1}{n-d} - \frac{(n-n_B)(n_1-1)}{n-d-1} \times \frac{n_1}{n-d}\Big].

When n = n_B, the result simplifies to

E(N_d \mid n_1) = \prod_{i=0}^{d-1} \Big(1 - \frac{n_1}{n-i}\Big) \times \frac{n_1^2}{n-d}.
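As a sanity check on this expectation, the following Python sketch compares the closed-form expression for the n = n_B case against a Monte-Carlo simulation of random intra-bitplane shuffling; the parameter values are arbitrary.

```python
# Monte-Carlo check of the zero-run expectation E(N_d | n1) for n = nB
# (shuffling block and coding block coincide). Parameter values assumed.
import random

def expected_runs_formula(n, n1, d):
    prod = 1.0
    for i in range(d):
        prod *= 1 - n1 / (n - i)
    return prod * n1 * n1 / (n - d)

def empirical_runs(n, n1, d, trials=200_000):
    total = 0
    bits = [1] * n1 + [0] * (n - n1)
    for _ in range(trials):
        random.shuffle(bits)            # intra-bitplane shuffling
        run = 0
        for b in bits:
            if b == 1:
                total += (run == d)     # zero-run symbol terminated by a one
                run = 0
            else:
                run += 1
    return total / trials

n, n1, d = 64, 10, 3
print(expected_runs_formula(n, n1, d), empirical_runs(n, n1, d))
```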


Chapter 3

Tracing Malicious Relay in Cooperative Wireless Communications
Wireless communication systems have enjoyed tremendous growth in the past decade [96]. Recently, cooperative communication has been proposed as an emerging wireless communication paradigm that explores a new dimension of diversity to combat the unfriendly wireless communication environment. Such a new communication paradigm also introduces new information security issues, especially in areas related to detecting and verifying signals simultaneously transmitted from multiple cooperating nodes. In this chapter, we show that using signal processing techniques and cryptographic tools, we are able to trace adversarial relay nodes that try to garble the relay signals in a cooperative communication system. This shows the promising potential of signal processing in cooperative wireless communication security. Consider a wireless communication example where node A is transmitting to
node B, and the direct link may be obstructed by large objects or hilly terrain. In such a scenario, two other nodes C and D that are near nodes A and B can serve as relay nodes to improve the communication quality. In a cooperative communication system, each node may simultaneously be an information source, a destination, and a relay node. To achieve cooperation, the transmission of a node can usually be divided into two phases. The first phase is to transmit information generated by itself while listening to other source nodes. The second phase is to transmit relay information [56][99][48]. Among the strategies employed by the relay nodes, amplify-and-forward and decode-and-forward are the two most common [56]. In amplify-and-forward, the relay nodes simply boost the energy of the signal received from the sender and re-transmit it to the receiver. Such a strategy may also amplify the noise in the received signal at the relay nodes. In decode-and-forward, the relay nodes perform physical-layer decoding (demodulation plus signal detection) and then forward the decoding result to the destination. When multiple relays are available, more sophisticated relay strategies can be employed. For example, multiple relay nodes can serve as multiple transmit antennas to employ a space-time code [2][113] in transmitting the relay signals [57][99][100][48]. Such schemes can improve the communication quality by exploiting space and time diversity. While most of this prior art focused on the communication aspects of such cooperative systems, there have also been serious concerns regarding the cooperation motivation and security in such systems. The first concern is from the autonomous ad hoc network perspective, where centralized control does not exist and each node is an autonomous entity. In such a scenario, nodes may lack motivation to cooperate and behave selfishly, such as avoiding packet forwarding in order to preserve
its own energy and prolong its lifetime [139]. We classify such "selfish" behavior as passive non-cooperation. In the literature, passive non-cooperation has been addressed by using credit systems [142] or reputation propagation systems [77]. The second concern is from the wireless communication perspective. Wireless communications use a shared medium and are susceptible to malicious denial-of-service attacks. For example, a powerful attack against wireless communications is signal jamming, where an adversary tries to corrupt communication by actively transmitting radio signals and interfering with the receiver [134]. We classify such attacks as active malicious attacks, which are the focus of this chapter.

Figure 3.1: Two nodes, C and D, try to forward the message from node A to B.

In recent years, a number of works have addressed the problem of malicious signal jamming in general wireless communications. Xu et al. proposed a scheme to detect jamming attacks based on packet delivery ratio and signal strength [134], and proposed to use channel surfing and spatial retreat to avoid jamming attacks [135]. Noubir et al. proposed to improve node connectivity and transmission reliability under jamming attacks using directional antennas, node mobility, and error correction codes [61][83]. Another traditional anti-jamming technique is spread-spectrum, which spreads the energy of one bit of information over a wide frequency spectrum. Most of these counter-jamming techniques have one common attribute: they try to prevent the attacker from getting into the sender/receiver's
multiple access channel (MAC). In frequency division multiple access, the MAC channel can be defended by frequency hopping; in code division multiple access, the MAC channel is protected by each user's spread-spectrum code; and node mobility and the use of directional antennas provide a spatial multiple access channel that is designated to the source/destination in a particular spatial region. The result is that the jamming signal energy is significantly attenuated in the MAC channel, and thus an improved signal-to-noise ratio (SNR) can be obtained at the receiver. The involvement of multiple relay nodes with sophisticated relay rules in cooperative communications poses a challenge to detecting and avoiding malicious attacks [73]. The complication brought by user cooperation is that most schemes require nodes to share their multiple access channels with other nodes [57][2][99][48]. This is because using an exclusive MAC channel between a sender and a receiver would make it difficult to benefit from node cooperation. However, in an adversarial environment, some relay nodes could be compromised. Using the captured communication device, the compromised nodes can maliciously modify the relay information, inject falsified information, or choose not to relay. In such a situation, counter-attack techniques based on MAC channel exclusion will not be effective, and security enforcement for a cooperative wireless system becomes a challenging and delicate task. In this chapter, we are interested in counter-attack strategies against malicious relay nodes in a cooperative wireless system. The goal of the malicious relays is to corrupt the communications between the information source and the receiver. These malicious relay nodes would exploit weaknesses in the user cooperation rules, especially in the multi-node relay situation, and disguise themselves as legitimate relays. We first study the attack behavior and investigate the impact of
such attacks on the receiver. To defend against such malicious attacks, we propose a cross-layer scheme for detecting and tracing adversarial nodes, which involves both inserting pseudo-random tracing markers into the transmitted symbol stream at the sender side and adaptive signal detection and processing at the receiver side. Note that in addition to detecting the jamming, our scheme can also pinpoint the adversarial relay node with very high probability. This allows further actions to be taken to exclude the adversarial node from the network.

3.1 System Model and Proposed Framework

3.1.1 System Setting

Let us consider the simple decode-and-forward situation introduced in Fig. 3.1. Sender node A transmits signals to both the receiving node B and the relay nodes C and D. The relay nodes obtain the signal y_r = x + n_r. Assuming the signal-to-noise ratio (SNR) at the relay nodes is high enough, they decode the received signal symbols and normally should follow the forwarding rule. In this chapter, we mainly consider relay nodes using coded cooperation, such as the user cooperation schemes described in [57][48] using space-time coding [2][113]. Nonetheless, the analysis of attack and defense can also be applied to non-coded forwarding strategies such as in [99][100], i.e., where the relay nodes directly transmit the received signal without coding. As an example, we consider two relay nodes employing an orthogonal space-time code [2], described in Table 3.1, for message forwarding. After decoding, the relay nodes obtain the message bits. Suppose the signal constellation set M consists of M = 2^k symbols. Each relay node takes two decoded symbols s0 and


Table 3.1: Space-Time Code Used by Two Relay Nodes

            relay node 0    relay node 1
t           s0              s1
t + T       -s1*            s0*

s1 from consecutive symbol durations, and transmits according to the cooperation rule in Table 3.1. Denote the symbol duration by T. At the receiving side, the received signals at two consecutive symbol durations t and t + T are

r_0 = h_0 s_0 + h_1 s_1 + n_0,
r_1 = -h_0 s_1^* + h_1 s_0^* + n_1.   (3.1)

Here h0 = α0 e^{jθ0} and h1 = α1 e^{jθ1} are complex channel gains, and n0 and n1 are complex Gaussian noise. Throughout this chapter, we will use such base-band equivalent expressions for analysis. We assume the channel conditions h0 and h1 are known at the receiver side, but not at the relay nodes. This is achieved by using proper channel estimation at the receiver side with pilot symbols that are inserted frequently enough relative to the channel variations [2].

3.1.2 Attack Modelling

We now examine the attack strategies employed by a malicious relay node against the cooperative communication system. The goal of the malicious relay node is to corrupt the communication between the sender and receiver. In the meantime, the attacker also tries to disguise his/her identity as a malicious adversary and to corrupt communication for as long as possible [39]. This can be done by cleverly choosing the attack technique and the portion of the relay signals to attack. Generally, we assume that the adversarial node uses the same transmission device as a
normal relay node. The attacker will only corrupt communications when there is a relay message. This strategy preserves the energy of the attacker's device as well as disguises the attacker's identity. When there is a relay message, the adversarial relay node can do one of the following: (1) transmit nothing, which is passive non-cooperation; (2) transmit garbled signals, which is the active malicious attack. We discuss the details of these two cases subsequently.

Transmit Nothing

When both relay nodes are adversarial and choose not to transmit, such an event can be detected by comparing the received signal energy with the ambient noise energy. Thus we shall focus on the situation where only one relay node (say, relay node 1) is adversarial and does not forward the message. When relay node 1 does not transmit at all, the received signals from Eqn. (3.1) become r0 = h0 s0 + n0 and r1 = -h0 s1* + n1. This situation is also described in [2] as soft failure, i.e., one of the relay nodes fails to function. In this situation, the receiver can still detect the received signals. Here we describe the maximum likelihood detector, which is the optimal detector in Gaussian noise. First, the receiver builds the following two combined signals:

c_0 = h_0^* r_0 = \alpha_0^2 s_0 + h_0^* n_0,
c_1 = -h_0 r_1^* = \alpha_0^2 s_1 - h_0 n_1^*.

It is clear that c0 only contains signal s0, and c1 only contains signal s1. Thus the optimal detector for s0 only depends on c0:

\hat{s}_0 = \arg\min_{x_i \in \mathcal{M}} \left\{ (\alpha_0^2 - 1)|x_i|^2 + d^2(c_0, x_i) \right\}.   (3.2)


Table 3.2: Forwarding Rules by Compromised Relay-1

            relay node 0    relay node 1
t           s0              s2 (instead of s1)
t + T       -s1*            s3* (instead of s0*)

Here M is the entire signal constellation set and d(·, ·) denotes Euclidean distance. The optimal detector for signal s1 takes the same form as (3.2), with c0 replaced by c1. From the above discussion we can see that the attacker's strategy of transmitting nothing will not prevent communication, because the transmitted signals can still be detected. However, it will degrade the receiver's performance because of the loss of diversity.

Transmit Garbled Signal

Instead of transmitting the valid information, the adversarial relay node can arbitrarily change some signal symbols and transmit such garbled signals. In order not to be detected as a malicious attacker, the adversarial node will transmit valid symbols from the signal constellation and according to the rule stipulated by the space-time (ST) code. Without loss of generality, we analyze the damage of such attacks for the situation where relay node 1 is adversarial. Table 3.2 describes the strategy used by relay node 1. In this table, the adversarial relay node randomly picks two signal symbols s2 and s3* to transmit, while the cooperative relay (node 0) transmits the information according to the relay message and the ST code. There are two difficulties at the receiver side. First, at the physical layer that uses the conventional signal detector, the receiver may not be able to correctly perform signal detection. Second, even if the receiver can separately
decode the signals transmitted from relay nodes 0 and 1, by looking only at the physical-layer symbols it is very hard to tell which node is a friendly relay and which is a malicious attacker. It is possible that either relay node 0 or 1 is garbling the transmitted signals, or even both.

Ambiguity in the Conventional Signal Detector

To analyze the difficulty of signal detection using the conventional signal detector, we first review the detection rule for the space-time code in Table 3.1. When both relay nodes send signals according to the ST code, the conventional maximum likelihood signal detector is equivalent to minimizing the following quantity:

\min_{x_i, x_j \in \mathcal{M}} \left\{ (\alpha_0^2 + \alpha_1^2 - 1)(|x_i|^2 + |x_j|^2) + d^2(c_0, x_i) + d^2(c_1, x_j) \right\}.   (3.3)

Here c0 and c1 are computed from the received signals in (3.1) and the channel gains:

c_0 = h_0^* r_0 + h_1 r_1^*, \quad c_1 = h_1^* r_0 - h_0 r_1^*.

When the transmitted signal is garbled as in Table 3.2, the combined signal under the conventional decoding rule produces the following result:

c_0(g) = h_0^* (h_0 s_0 + h_1 s_2 + n_0) + h_1 (-h_0^* s_1 + h_1^* s_3 + n_1^*)
       = \alpha_0^2 s_0 + h_0^* h_1 s_2 - h_0^* h_1 s_1 + \alpha_1^2 s_3 + (h_0^* n_0 + h_1 n_1^*).   (3.4)

Comparing with c_0 = (\alpha_0^2 + \alpha_1^2) s_0 + (h_0^* n_0 + h_1 n_1^*), which is the combined signal without garbling, we can see that c0(g) contains the same noise term (h_0^* n_0 + h_1 n_1^*), but the deterministic part has been significantly changed from (\alpha_0^2 + \alpha_1^2) s_0 to (\alpha_0^2 s_0 + h_0^* h_1 s_2 - h_0^* h_1 s_1 + \alpha_1^2 s_3). Such garbling can easily lead to

a detection error, as shown in the following example. Example: Suppose the signalling scheme is QPSK, and the information symbols are chosen such that s0 = -s3 and s1 = -s2. Furthermore, consider identical
channel coefficients h0 = h1 = α e^{jθ}. From (3.4) we obtain the combined signal as

c_0(g) = \alpha^2 (s_0 + s_2 - s_1 + s_3) + (h_0^* n_0 + h_1 n_1^*)
       = 2\alpha^2 s_2 + (h_0^* n_0 + h_1 n_1^*).   (3.5)

Under the maximum likelihood detection rule, the receiver outputs the signal constellation point that is closest to c0(g) in Euclidean distance. Hence the detection result will most probably be s2, while the actual signal sent is s0. This illustrates the ambiguity in the conventional signal detector when encountering an adversarial relay node.
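The following Python snippet reproduces this worked example numerically under stated assumptions (noise-free reception and an illustrative channel value): the conventional combiner output lands on s2 even though s0 was sent.

```python
# Numeric illustration of the garbling ambiguity in Eqn. (3.5): with
# h0 = h1 and s3 = -s0, s1 = -s2, the conventional Alamouti combiner
# points at s2 rather than the sent s0. Noise-free, values assumed.
import cmath

qpsk = [1, 1j, -1, -1j]
s0, s2 = 1, 1j
s1, s3 = -s2, -s0
h0 = h1 = 0.9 * cmath.exp(1j * 0.7)           # identical channel coefficients

# received signals under the compromised-relay rule of Table 3.2
r0 = h0 * s0 + h1 * s2
r1 = -h0 * s1.conjugate() + h1 * s3.conjugate()

# conventional combining: c0 = h0* r0 + h1 r1*
c0 = h0.conjugate() * r0 + h1 * r1.conjugate()
alpha2 = abs(h0) ** 2
print(c0, 2 * alpha2 * s2)                    # c0 equals 2*alpha^2*s2
print(min(qpsk, key=lambda x: abs(c0 - x)))   # minimum-distance output: s2
```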

3.1.3 Proposed Framework

There are two main challenges in distinguishing the malicious relay from cooperative nodes at the information receiver. First, traditional cryptography at the application layer may be able to detect the attack, but it cannot pinpoint the source of the attack. This is because the received signals from possibly multiple relays are superimposed on each other and corrupted by noise. At the physical layer, we need to separately detect the signal symbols from each of the relay paths. Second, we notice that such symbol-by-symbol detection is only feasible under certain conditions, and the detected symbols may have low reliability. Therefore it is necessary to aggregate multiple symbol detection results to reliably distinguish an adversary from a cooperator. At the same time, the receiver needs to obtain some side information about the "ground truth" of the relay signals. This side information is compared with the received signals to identify the malicious relay. We propose a cross-layer framework for tracing the malicious relay, as shown in Fig. 3.2. The sender and receiver set up a secret key before sending messages. We refer to this key as the tracing key. This key is unknown to the relay nodes.


[Figure 3.2 block diagram. Sender side: source data → channel coding → insert tracing symbols → modulation → to channel. Receiver side: from channel → demodulation → extract tracing symbols → tracing symbol processing (tracing symbol detection + error probability estimation; symbol aggregation: pre-screening + correlation decision) → channel decoding → data to application layer.]

Figure 3.2: Schematic of the tracing scheme for malicious relay.

During communication, the information sender inserts a small number of pseudo-random signalling symbols at random locations in the symbol stream. We refer to the inserted symbols as the tracing symbols. Both the insertion locations and the inserted tracing symbols are generated using a cryptographically secure function with the tracing key. Upon receiving the relay signals, the receiver uses the tracing key to find the locations of the tracing symbols, extracts them, and applies signal detection. To process the detected tracing symbols, the receiver also computes the "ground truth" of the tracing symbols using the tracing key and compares it with the detected tracing symbols from each relay path. Such a comparison can tell whether a relay node is adversarial or cooperative. The tracing scheme consists of two parts: (1) how to detect the garbled tracing symbols, and (2) how to aggregate the detection results from multiple tracing symbols to achieve a reliable decision. These two components are presented in the next two sections.


Figure 3.3: The pair-wise combined signal constellations. Notation (xi, xj) indicates the signal point formed by the combination h0 xi + h1 xj, where h0 = 1/2 and h1 = e^{jπ/4}.

3.2 Detecting Garbled Signals

3.2.1 Resolving Ambiguity Using One Receive Antenna

In this part, we discuss how a receiver with a single receive antenna can resolve the ambiguity in the garbled signal and estimate the information sent by each of the two relay nodes. Consider the following example. The channel gains are h0 = 1/2 and h1 = e^{jπ/4}. QPSK is chosen as the signalling scheme with constellation M = {x0, x1, x2, x3}. The received signal, combining the signals from the two relay nodes and the additive noise, takes the form r = h0 s0 + h1 s1 + n, where s0, s1 ∈ M and n is complex Gaussian noise. If we are only concerned with the deterministic part of the signal, let y = h0 s0 + h1 s1; we can see that y can take 16 distinct constellation points in the complex signalling plane, as shown in Fig. 3.3. In this figure, the combined signal constellations are shown together with the original QPSK constellation. Rewriting the received signal as r = y + n, we can detect the signal y (corrupted
in noise) as if the original signal came from a 16-point constellation. Under the complex Gaussian noise assumption, the maximum likelihood detector is equivalent to the minimum distance detector. Thus the detection result in the two-dimensional signal plane is the constellation point closest to the received signal r. Due to the smaller separation and geometric irregularity of the combined signal constellation (Fig. 3.3), the probability of error in signal detection naturally increases, and a closed-form expression for the error probability cannot be obtained in general. Here we outline a procedure for detecting the combined signal and computing the probability of error:

(a) Suppose the original signal constellation is M and the conditions of the two channels are known as h0 and h1. We first find the combined signal constellation Y = {y_k : y_k = h0 x_i + h1 x_j, x_i, x_j ∈ M}.

(b) In the two-dimensional signal plane, find the Voronoi diagram V associated with the constellation Y. The Voronoi cell V_i delimits the area that is closer to signal y_i than to any other constellation point. If the received signal r ∈ V_j, the detection output is y_j.

(c) Estimate the error probability P_e for each received signal r.

This procedure is presented in detail below.

Error Probability in the Combined Signal Constellation

Denote the probability that a signal y_i ∈ Y is sent while y_j ∈ Y is detected by Pr(y_j | y_i). According to Bayes' rule, the probability that an error occurred when y_j is detected is

\Pr(e \mid y_j) = \frac{\Pr(\text{error and detect } y_j)}{\Pr(\text{detect } y_j)} = \frac{\sum_{y_i \in \mathcal{Y}, i \neq j} \Pr(y_j \mid y_i) \Pr(y_i)}{\sum_{y_i \in \mathcal{Y}} \Pr(y_j \mid y_i) \Pr(y_i)}.   (3.6)

We first simplify the above equation by noticing that each constellation point y_i ∈ Y is sent with equal prior probability in most practical applications, i.e., Pr(y_i) = 1/|Y| for all i. Second, if the Voronoi cells V_i and V_j are not neighbors, the probabilities Pr(y_j | y_i) and Pr(y_i | y_j) are most likely small; such terms can be ignored in the estimation of (3.6). Third, if V_i and V_j are neighbors, then their shared boundary must be perpendicular to and bisect the line segment from y_i to y_j. Without considering the interference from other constellation points and under complex Gaussian noise, we can approximate Pr(y_j | y_i) ≈ Pr(y_i | y_j). Such an approximation also holds when the changes in the probabilities Pr(y_j | y_i) and Pr(y_i | y_j) introduced by the geometry of other constellation points are small. This allows us to rewrite (3.6) as

\Pr(e \mid y_j) = \sum_{y_i \in \mathcal{Y}, i \neq j} \Pr(y_i \mid y_j).   (3.7)

In the above equation, y_j is the detected signal, and y_i runs through all constellation points. The conditional error probability in (3.7) can be computed using the Monte-Carlo method in two steps. Step 1: Generate samples t = y_j + e, where e is drawn from the noise distribution with variance set according to the SNR. Use the detection algorithm proposed above to obtain the detected symbol and compare it with the ground truth y_j. Step 2: Repeat Step 1 N times and count the number of incorrect detections N_e. When N is large, the probability of error due to receiver-side noise can be estimated by P_e^N ≈ N_e / N.

Error Probability in a Single Relay Stream

Suppose the symbols transmitted by the two relays are x0 and x1, respectively. Denote the combined signal by y = h0 x0 + h1 x1. The detected combination symbol ŷ corresponds to a pair of detected symbols x̂0 and x̂1. A detection error in ŷ can be due to one of two cases: (1) either x̂0 or x̂1 is incorrect, but not both; (2) both x̂0 and x̂1 are incorrect. We observe that case (1) occurs with much higher probability than case (2). In most cases, the two channel coefficients h0 and h1 have different magnitudes. Usually the signal symbol associated with the channel having the smaller-magnitude coefficient (i.e., stronger attenuation) has a much higher chance of being incorrectly detected. We assume the relay channels are independent, and the probability for each relay channel to be attenuated more than the other is 1/2. Thus for two relay channels, we can approximate the error probability of the tracing symbols from a single relay stream as

p_e = P_e^N / 2.   (3.8)

Table 3.3: Channel Conditions with Two Receive Antennas

        relay node 0    relay node 1
Rx 0    h0              h1
Rx 1    h2              h3

In the above detection algorithm, the detection complexity increases to O(M²), where M is the number of signals in the original constellation. The detection rule forgoes the time diversity originally present in the ST-code scheme. The error probability depends on the channel coefficients h0 and h1, which influence the geometry of the combined signal constellation.
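A minimal Python sketch of steps (a)-(c) and the Monte-Carlo estimate of P_e^N follows, using the QPSK example with h0 = 1/2 and h1 = e^{jπ/4}; the noise model (unit-energy symbols) and trial count are assumptions.

```python
# Sketch of the combined-constellation detector of Section 3.2.1 and the
# Monte-Carlo error-probability estimate (Steps 1-2). Parameters assumed.
import cmath, random

M = [1, 1j, -1, -1j]                            # QPSK constellation
h0, h1 = 0.5, cmath.exp(1j * cmath.pi / 4)

# 16-point combined constellation Y = {h0*xi + h1*xj}
Y = [(h0 * xi + h1 * xj, (xi, xj)) for xi in M for xj in M]

def detect(r):
    """Minimum-distance (ML under Gaussian noise) detection over Y."""
    return min(Y, key=lambda p: abs(r - p[0]))

def error_prob(snr_db, trials=50_000):
    """Monte-Carlo estimate of the combined-symbol error probability."""
    sigma = (10 ** (-snr_db / 10)) ** 0.5       # assumes unit symbol energy
    errors = 0
    for _ in range(trials):
        y, pair = random.choice(Y)              # equiprobable point
        n = complex(random.gauss(0, sigma / 2**0.5),
                    random.gauss(0, sigma / 2**0.5))
        if detect(y + n)[1] != pair:
            errors += 1
    return errors / trials                      # P_e^N; then p_e ~ P_e^N / 2

print(error_prob(14))
```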

3.2.2 Resolving Ambiguity Using Two Receive Antennas

When the receiving side has more than one receive antenna, the signal detection for garbled signals can take advantage of the additional antenna. We assume that


Table 3.4: Received Signals at the Receive Antennas

        Rx 0    Rx 1
t       r0      r2
t + T   r1      r3

the channel conditions between the two relay nodes and the two receive antennas are known at the receiver, as shown in Table 3.3, and the channel variation is negligible over adjacent symbol durations. The signals sent by the two relay nodes follow Table 3.2. The received signals at the two time slots are shown in Table 3.4. The signals received at the first time slot are

r_0 = h_0 s_0 + h_1 s_2 + n_0,
r_2 = h_2 s_0 + h_3 s_2 + n_2.   (3.9)

The signals received at the second time slot are

r_1 = -h_0 s_1^* + h_1 s_3^* + n_1,
r_3 = -h_2 s_1^* + h_3 s_3^* + n_3.   (3.10)

We observe that only the signals s0 and s2 appear in (3.9), which corresponds to the received signals at the first time slot. Similarly, the equation array (3.10) has the same structure as (3.9). From now on we will focus only on the signal detector for s0 and s2 in (3.9); the detector for s1 and s3 can be obtained similarly. Let us define

d_1 = d(r_0, h_0 x_i + h_1 x_j), \quad d_2 = d(r_2, h_2 x_i + h_3 x_j).   (3.11)

It can be shown that under uncorrelated Gaussian noise, the maximum likelihood detector for s0 and s2 chooses the signal constellation points x_i, x_j ∈ M that minimize the
sum (d_1^2 + d_2^2). Expanding (d_1^2 + d_2^2) using (3.9) and (3.11) leads to

(d_1^2 + d_2^2) = |r_0|^2 + |r_2|^2 + d^2(w_0, x_i) + d^2(w_1, x_j) - |w_0|^2 - |w_1|^2 - d^2(v, x_i^* x_j) + |v|^2 + |x_i|^2 |x_j|^2 + (\alpha_0^2 + \alpha_2^2 - 1)|x_i|^2 + (\alpha_1^2 + \alpha_3^2 - 1)|x_j|^2.   (3.12)

Here three auxiliary variables w0, w1, and v are defined as

w_0 = h_0^* r_0 + h_2^* r_2, \quad w_1 = h_1^* r_0 + h_3^* r_2, \quad v = h_0 h_1^* + h_2 h_3^*.   (3.13)

Thus minimizing (d_1^2 + d_2^2) is equivalent to minimizing the following quantity:

T = d^2(w_0, x_i) + d^2(w_1, x_j) - d^2(v, x_i^* x_j) + |x_i|^2 |x_j|^2 + (\alpha_0^2 + \alpha_2^2 - 1)|x_i|^2 + (\alpha_1^2 + \alpha_3^2 - 1)|x_j|^2.   (3.14)
For PSK signals, the last three terms in (3.14) can be removed since their values are fixed. This reduces (3.14) to

T_{PSK} = d^2(w_0, x_i) + d^2(w_1, x_j) - d^2(v, x_i^* x_j).   (3.15)

The optimum detector then has the following structure:

(\hat{s}_0, \hat{s}_2) = \arg\min_{(x_i, x_j) \in \mathcal{M} \times \mathcal{M}} T(x_i, x_j).   (3.16)

The complexity of computing T in (3.14) is O(M²). This is because the signal norms {|x_i|} can be pre-computed, and computing d²(w_0, x_i) and d²(w_1, x_j) takes linear time (w.r.t. M); only computing d²(v, x_i^* x_j) takes O(M²) time, which is the dominating factor in the detector complexity. Similar to the case of one receive antenna, the symbol error probability in the case of two receive antennas can be obtained using the Monte-Carlo method. Generally
speaking, the symbol error probability using two receive antennas is lower than that using only one receive antenna. To see this, consider the combined signal constellation sets Y1 = {h0 x_i + h1 x_j | x_i, x_j ∈ M} and Y2 = {h2 x_i + h3 x_j | x_i, x_j ∈ M} from the two combining channels in (3.9). With one receive antenna, detection errors are usually caused by a constellation point u1 that is close to t1 in Y1, with a small Euclidean distance d(u1, t1); noise may then drag the received signal closer to u1. Recall that u1 is the image of a pair of sent signals (x_i, x_j) ∈ M × M under the channel coefficients h0 and h1. At the second receive antenna, the same (x_i, x_j) pair is mapped by the uncorrelated channel coefficients h2 and h3 to a signal u2 in Y2. The probability that the distance from u2 to t2, d(u2, t2), is also small will be low. Therefore, with two receive antennas the signal detection accuracy naturally increases. We have shown that with two receive antennas, the receiver is able to detect the potentially garbled signalling symbols from each relay node with higher accuracy than with one receive antenna. The only "luck" required by such a signal detection scheme is that the channel condition matrix H be non-singular, i.e.,

\det(H) = \det \begin{pmatrix} h_0 & h_1 \\ h_2 & h_3 \end{pmatrix} \neq 0.

Otherwise, it is easy to see that the situation described in (3.9) reduces to the situation of having only one receive antenna, as discussed in Section 3.2.1.
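The following sketch implements the reduced metric T_PSK of (3.15)-(3.16) for the two-antenna case; the channel values are illustrative and a small noise level is assumed.

```python
# Sketch of the two-antenna detector for (s0, s2) via the reduced PSK
# metric of Eqns. (3.13), (3.15), (3.16). Channel values are assumed.
import random

M = [1, 1j, -1, -1j]                                 # QPSK (|x| = 1)
h0, h1, h2, h3 = 0.82+0.15j, -0.36-0.19j, 0.40+0.90j, 0.70-0.30j

def detect(r0, r2):
    w0 = h0.conjugate() * r0 + h2.conjugate() * r2   # Eqn. (3.13)
    w1 = h1.conjugate() * r0 + h3.conjugate() * r2
    v = h0 * h1.conjugate() + h2 * h3.conjugate()
    def t_psk(xi, xj):                               # Eqn. (3.15)
        return (abs(w0 - xi)**2 + abs(w1 - xj)**2
                - abs(v - xi.conjugate() * xj)**2)
    return min(((xi, xj) for xi in M for xj in M), key=lambda p: t_psk(*p))

s0, s2 = random.choice(M), random.choice(M)          # s2 may be garbled freely
n0 = 0.05 * complex(random.gauss(0, 1), random.gauss(0, 1))
n2 = 0.05 * complex(random.gauss(0, 1), random.gauss(0, 1))
r0 = h0 * s0 + h1 * s2 + n0                          # Eqn. (3.9)
r2 = h2 * s0 + h3 * s2 + n2
print(detect(r0, r2) == (s0, s2))                    # True when det(H) != 0
```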


3.2.3 Generalization

In this part, we generalize the proposed signal detection approach to the scenario of R relay nodes and K receive antennas. Denote the channel coefficient between the i-th receive antenna and the j-th relay node by h_{ij}. At any given symbol duration, the transmitted signal symbol from the j-th relay node is denoted by s_j. Since each relay node can potentially be adversarial, we do not impose any constraint on the signal that can be transmitted by a relay node. The received signal at the i-th receive antenna is
r_i = \sum_{j=1}^{R} h_{ij} s_j + n_i \quad (1 \le i \le K).   (3.17)

The maximum likelihood detector for the signal (s_1, s_2, ..., s_R) is

(\hat{s}_1, \hat{s}_2, ..., \hat{s}_R) = \arg\min_{(z_1, z_2, ..., z_R) \in \mathcal{M}^R} \left\{ \sum_{i=1}^{K} d^2\Big(r_i, \sum_{j=1}^{R} h_{ij} z_j\Big) \right\}.   (3.18)

Define

w_j = \sum_{i=1}^{K} r_i h_{ij}^*, \quad \alpha_j^2 = \sum_{i=1}^{K} \alpha_{ij}^2, \quad \text{and} \quad v_{jt} = \sum_{i=1}^{K} h_{ij} h_{it}^* \quad (1 \le j \le R, \; 1 \le t \le R).

After some algebraic manipulations, it can be shown that minimizing (3.18) is equivalent to minimizing

T = \sum_{j=1}^{R} d^2(w_j, z_j) + \sum_{j=1}^{R} (\alpha_j^2 - 1)|z_j|^2 - \sum_{j=1}^{R-1} \sum_{t=j+1}^{R} \left[ d^2(v_{jt}, z_j^* z_t) - |z_j|^2 |z_t|^2 \right].   (3.19)

The formulation of the minimization objective (3.19) has the advantage that it reduces the computational complexity of searching for the detection output. Because z_j takes values only from the signal constellation set M, an R-by-M table can be pre-computed for d(w_j, z) (1 ≤ j ≤ R, z ∈ M). The magnitudes |z_j|²|z_t|² (z_j, z_t ∈ M) can be pre-computed to obtain an M-by-M table. The term d(v_{jt}, x_1^* x_2) can be pre-computed for every possible combination of v_{jt} (1 ≤ j ≤ R-1, j+1 ≤ t ≤ R) and x_1^* x_2 (x_1, x_2 ∈ M), yielding a table of size R(R-1)/2 × M × M. Since computing each v_{jt} (or w_j) term takes K multiplications, the total number of multiplications needed is about R²M²K/2. To search for the optimal (z_1, z_2, ..., z_R) ∈ M^R, a total of M^R possible combinations need to be evaluated; with the help of the pre-computed tables, the remaining computation cost is only that of performing additions. If we directly compute the general formulation (3.18), the total number of multiplications is about M^R RK, which is significantly larger than R²M²K/2 when R is large. Therefore the efficiency of our proposed algorithm is superior to brute-force computation of (3.18).
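For reference, a brute-force implementation of the general ML detector (3.18) is sketched below, without the table-based speedup of (3.19) (which would replace the inner multiplications with table lookups); R, K, and the random channels are illustrative assumptions.

```python
# Sketch of the general ML detector of Eqn. (3.18) for R relays and K
# receive antennas, by exhaustive search over M^R. Values are assumed.
from itertools import product
import random

M = [1, 1j, -1, -1j]                                  # QPSK
R, K = 3, 3
H = [[complex(random.gauss(0, 1), random.gauss(0, 1))  # H[i][j]: antenna i,
      for _ in range(R)] for _ in range(K)]            # relay j

def detect(r):
    """argmin over (z1,...,zR) of sum_i |r_i - sum_j h_ij z_j|^2."""
    return min(product(M, repeat=R),
               key=lambda z: sum(abs(r[i] - sum(H[i][j] * z[j]
                                                for j in range(R))) ** 2
                                 for i in range(K)))

s = [random.choice(M) for _ in range(R)]              # unconstrained symbols
r = [sum(H[i][j] * s[j] for j in range(R)) for i in range(K)]
print(detect(r) == tuple(s))                          # noise-free check
```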

3.3 Cross-Layer Scheme for Tracing Adversarial Relay

The physical-layer signal detection scheme, working on a symbol-by-symbol basis, inevitably has low reliability and detection accuracy. In this section, we present the adversary-tracing scheme outlined in Sec. 3.1.3 from a system perspective; it aggregates the detection results of a stream of tracing symbols and significantly improves the accuracy of tracing the malicious relay.

3.3.1 The Tracing Algorithm

In a secure cooperative communication system, it is expected that the sender applies application-layer encryption and message authentication to protect data
confidentiality and integrity. In our proposed scheme, the sender also inserts pseudo-random tracing symbols into the data symbol stream. The inserted symbol values and locations can be computed using a cryptographic one-way function with the secret key as input. Such a scheme enables the receiver to recompute the ground truth of the tracing symbols and their locations. At the receiver side, the receiver extracts the tracing symbols according to the signal detection algorithm presented in the previous section. Since there may be multiple relay nodes transmitting according to a cooperative relay scheme (either coded or non-coded), the detected tracing symbols and the ground truth for each relay stream should be treated separately. Next, we present the tracing scheme using aggregated information from multiple tracing symbols for one relay stream; this scheme can be applied repeatedly to all relay streams. Denote the detected tracing symbols from one relay node at the physical layer by [s1, s2, ..., sn], and the ground truth by [t1, t2, ..., tn]. Additionally, a confidence value p_i for each detected tracing symbol s_i, indicating its probability of correct detection, is also provided. The algorithm for determining whether the relay node is cooperative or adversarial consists of the following steps: (1) Pre-processing: Remove the detected tracing symbols s_i (and the corresponding ground truth t_i) whose confidence value p_i is below a pre-determined threshold β. Hence the two sequences may be shortened in this step. (2) Symbol mapping: Assign a binary Gray code to the two-dimensional signal constellation. After the mapping, use antipodal signals to represent the results, i.e., represent binary one by "1" and binary zero by "-1". Thus the mapping results are two sequences [s1, s2, ..., sm] and [t1, t2, ..., tm], whose elements take values in {-1, +1}.


(3) Correlation decision: Compute the normalized correlation

\rho = \frac{\sum_i s_i t_i}{\sqrt{\sum_i s_i^2 \sum_i t_i^2}}.   (3.20)

Then compare it with a threshold value τ to make a decision:

decision = cooperative if ρ ≥ τ; adversarial if ρ < τ.   (3.21)

In the above algorithm, the pre-processing ensures that each tracing symbol involved in the final decision is reliable enough, i.e., has probability of correct detection P_c ≥ β. The mapping from signal constellations to binary data is conventional in digital communication systems [92]. Gray coding ensures that constellation points that are close in Euclidean distance are mapped to binary strings with small Hamming distance. The above algorithm can be implemented sequentially, i.e., with each received tracing symbol, the correlation value ρ can be updated with relatively low computational overhead. Suppose k reliable tracing symbols have been received and each tracing symbol maps to one binary bit (i.e., BPSK). Let C_k = \sum_{i=1}^{k} s_i t_i, S_k = \sum_{i=1}^{k} s_i^2, and T_k = \sum_{i=1}^{k} t_i^2. When the (k+1)-th tracing symbol arrives, the receiver updates C_{k+1} = C_k + s_{k+1} t_{k+1}, S_{k+1} = S_k + s_{k+1}^2, T_{k+1} = T_k + t_{k+1}^2, and computes the updated \rho = C_{k+1} / \sqrt{S_{k+1} T_{k+1}}. Thus as more tracing symbols arrive, the receiver can gradually improve the accuracy of the tracing scheme; this property will be discussed in detail in the next subsection. The main costs of such a tracing scheme include: (1) the computation at the receiver side; (2) the bandwidth cost of inserting the tracing symbols into the data stream, which incurs about 1-3% overhead; and (3) the cost of setting up the secret key for the tracing scheme, which can be done at the same time as setting up
the application-layer secret keys for cryptographic tools. The mechanism for tracing the adversarial relay is only activated when the receiver detects abnormal behavior in the relay signals, for example, when application-layer cryptographic authentication frequently fails or decryption produces meaningless data.
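The following Python sketch pulls the three steps together for one relay stream; the Gray mapping, the threshold values β and τ, and all names are assumptions for illustration.

```python
# Sketch of the tracing decision of Section 3.3.1: pre-screening by
# confidence, Gray mapping to antipodal bits, sequential correlation
# update, and thresholding. Mapping and thresholds are assumed.

# Gray-coded QPSK -> antipodal bit pairs (+1/-1); an assumed mapping
GRAY = {1: (1, 1), 1j: (-1, 1), -1: (-1, -1), -1j: (1, -1)}

def trace_decision(detected, truth, confidence, beta=0.9, tau=0.4):
    """Return 'cooperative' or 'adversarial' for one relay stream."""
    C = S = T = 0.0
    for s_sym, t_sym, p in zip(detected, truth, confidence):
        if p < beta:                  # pre-processing: drop unreliable symbols
            continue
        for s, t in zip(GRAY[s_sym], GRAY[t_sym]):
            C += s * t                # sequential updates of C_k, S_k, T_k
            S += s * s
            T += t * t
    rho = C / (S * T) ** 0.5 if S and T else 0.0       # Eqn. (3.20)
    return "cooperative" if rho >= tau else "adversarial"  # Eqn. (3.21)
```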

3.3.2 Analysis of the Correlation Statistics ρ

In this part, we analyze the statistical properties of the correlation statistics ρ. Consider that N tracing symbols have been received, each with probability of correct detection p_c = 1 - p_e, where p_e is the single-stream error probability in Eqn. (3.8). The result of symbol detection is independent for each symbol, because each tracing symbol is randomly chosen and the time slots for transmitting the tracing symbols are sparsely and pseudo-randomly spaced. We assume that if an error happens during symbol detection, each of the (2^m - 1) erroneous detection results is equally probable, i.e., each occurs with probability 1/(2^m - 1). We rewrite Eqn. (3.20) as

\rho = \frac{1}{N} \sum_{k=1}^{N} C_k, \quad \text{where} \quad C_k = \frac{1}{m} \sum_{i=1}^{m} S_i^{(k)} T_i^{(k)}.   (3.22)

Here {S_i^{(k)}}_{i=1}^{m} is the binary expansion of the k-th tracing symbol, represented by antipodal signals ±1, and {T_i^{(k)}}_{i=1}^{m} is the corresponding ground truth generated using cryptographic tools with the secret tracing key. Since {C_k}_{k=1}^{N} are independent and identically distributed, we denote their common mean and variance by µ_c and σ_c². Thus we have

E(\rho) = \mu_c \quad \text{and} \quad Var(\rho) = \sigma_c^2 / N.   (3.23)

Although ρ only takes discrete values, when N approaches infinity the distribution of ρ converges to a normal distribution according to the central limit theorem. As the mean and variance suffice to describe such distributions, we derive µ_c and σ_c² under the two hypotheses, i.e., whether the relay node is cooperative (H0) or adversarial (H1). To make the presentation concise, we will drop the index k in C_k, S_i^{(k)}, and T_i^{(k)}.

Mean and Variance Analysis

Hypothesis H0 (relay node is cooperative): When the relay node is cooperative, the received S_i will only differ from the ground truth T_i due to noise corruption.
When a detection error occurs, the bit sequence {S_i}_{i=1}^m can differ from {T_i}_{i=1}^m by flipping j bits, where j ∈ {1, 2, ..., m}. Since each of the (2^m - 1) detection errors occurs with equal probability, the probability that exactly j bits are flipped from {T_i}_{i=1}^m to {S_i}_{i=1}^m is p_j = \binom{m}{j} / (2^m - 1). Thus the mean of C is

E(C \mid H_0) = \frac{1}{m} \left\{ (1 - p_c) \left[ \sum_{j=1}^{m} \frac{\binom{m}{j}}{2^m - 1} (m - 2j) \right] + p_c m \right\} = 1 - \frac{2^m}{2^m - 1} (1 - p_c).   (3.24)

The variance of C can be obtained as

Var(C) = E(C^2) - [E(C)]^2.   (3.25)

The second moment of C can be computed as

E(C^2 \mid H_0) = \frac{1}{m^2} \left\{ (1 - p_c) \sum_{j=1}^{m} \frac{\binom{m}{j}}{2^m - 1} (m - 2j)^2 + p_c m^2 \right\} = (1 - p_c) \frac{2^m - m}{(2^m - 1) m} + p_c.   (3.26)

Hypothesis H1 (relay node is adversarial): When the relay node is adversarial, the transmitted tracing symbol takes value from each of the 2^m constellation points with equal probability 1/2^m. (Here we have made another simplifying assumption. The adversarial relay can certainly choose to transmit the correct tracing symbol from time to time to avoid being identified. However, in order to interrupt communications, the adversary needs to transmit at least a fair portion of garbled signals. The simplified analysis here can be utilized in subsequent discussions when we consider more sophisticated attacks and system issues such as channel coding.) Thus the mean of C is

E(C \mid H_1) = \frac{1}{m} \sum_{j=0}^{m} \frac{\binom{m}{j}}{2^m} (m - 2j) = 0.   (3.27)

Again, the variance of C can be obtained from its second moment, which is

E(C^2 \mid H_1) = \frac{1}{m^2} \sum_{j=0}^{m} \frac{\binom{m}{j}}{2^m} (m - 2j)^2 = \frac{1}{m}.   (3.28)

Thus the variance of C under hypothesis H1 is also 1/m.
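The closed forms (3.24), (3.26), and (3.28) can be checked by direct enumeration over the equiprobable error patterns; a small Python check follows (m and p_c are arbitrary values).

```python
# Exact enumeration check of the moments (3.24), (3.26), and (3.28),
# assuming the equiprobable-error model of the text. Values assumed.
from math import comb

def moments_H0(m, pc):
    mean = (1/m) * ((1-pc) * sum(comb(m, j) * (m - 2*j)
                                 for j in range(1, m+1)) / (2**m - 1) + pc*m)
    second = (1/m**2) * ((1-pc) * sum(comb(m, j) * (m - 2*j)**2
                                      for j in range(1, m+1)) / (2**m - 1)
                         + pc * m**2)
    return mean, second

m, pc = 2, 0.9
mean, second = moments_H0(m, pc)
print(mean, 1 - 2**m / (2**m - 1) * (1 - pc))               # matches (3.24)
print(second, (1 - pc) * (2**m - m) / (m * (2**m - 1)) + pc)  # matches (3.26)
# Under H1 the second moment equals 1/m, per (3.28):
print(sum(comb(m, j) * (m - 2*j)**2 for j in range(m+1)) / (2**m * m**2))
```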

Since the variance of the correlation statistics ρ decreases with more received tracing symbols, the two hypotheses can be gradually separated, resulting in improved tracing accuracy. The threshold value τ in (3.21) can be dynamically adjusted according to the number of tracing symbols N and the probability p_c.

Distribution Analysis

The closed-form probability mass function of the tracing statistics ρ can be derived for some scenarios. We note that such a distribution can be expressed in a unified framework for both H0 and H1. Here we give an example for m = 2, i.e., the signal plane contains four constellation points. When m = 2, C_k in (3.22) can take the three values -1, 0, and 1. Denote the probability that C_k takes value 1 by p_1, and the probability that it takes value 0 by p_0. When N symbols are reliably detected, ρ can take discrete values from the set {-1, -1 + 1/N, ..., 0, ..., 1 - 1/N, 1}. Denote the number of C_k's with value 1 by N_1, with value -1 by N_{-1}, and with value 0 by N_0. When ρ = k/N, this means N_1 - N_{-1} = k. We thus can obtain the probability mass function accordingly. When k ≥ 0:

\Pr[\rho = k/N] = \sum_{t=0}^{\lfloor (N-k)/2 \rfloor} \Pr(N_1 = k + t;\; N_{-1} = t;\; N_0 = N - (k + 2t))
               = \sum_{t=0}^{\lfloor (N-k)/2 \rfloor} \binom{N}{k+t} \binom{N-(k+t)}{t} p_1^{k+t} p_0^{N-(k+2t)} (1 - p_1 - p_0)^t.   (3.29)

When k < 0:

\Pr[\rho = k/N] = \sum_{t=0}^{\lfloor (N-|k|)/2 \rfloor} \binom{N}{|k|+t} \binom{N-(|k|+t)}{t} p_1^t p_0^{N-(|k|+2t)} (1 - p_1 - p_0)^{|k|+t}.   (3.30)
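As a numerical sanity check on (3.29) and (3.30), the sketch below evaluates the pmf and verifies that it sums to one and has zero mean under the H1 parameters used in the text (p_1 = 0.25, p_0 = 0.5); N is arbitrary.

```python
# Numeric sketch of the pmf (3.29)-(3.30) of rho, under the trinomial
# model for C_k. Parameter values are assumptions for illustration.
from math import comb

def pmf(N, p1, p0):
    """Pr[rho = k/N] for k = -N..N."""
    probs = {}
    for k in range(-N, N + 1):
        a = abs(k)
        total = 0.0
        for t in range((N - a) // 2 + 1):
            n_plus = (k + t) if k >= 0 else t           # count of C_k = +1
            n_minus = t if k >= 0 else (a + t)          # count of C_k = -1
            n_zero = N - n_plus - n_minus               # count of C_k = 0
            total += (comb(N, n_plus) * comb(N - n_plus, n_minus)
                      * p1**n_plus * (1 - p1 - p0)**n_minus * p0**n_zero)
        probs[k] = total
    return probs

p = pmf(100, 0.25, 0.5)                    # H1 parameters for m = 2
print(sum(p.values()))                     # ~1.0
print(sum(k/100 * v for k, v in p.items()))  # mean ~0 under H1
```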

When applying this unified distribution to the two hypotheses H0 and H1, the probabilities p_1 and p_0 are adjusted accordingly. For example, under hypothesis H0, p_1 = p_c and p_0 = 2(1 - p_c)/3; under hypothesis H1, p_1 = 0.25 and p_0 = 0.5.

Validations and Discussions

In this part, we experimentally validate the mean, variance, and distribution of ρ under hypotheses H0 and H1. When the relay node is adversarial (hypothesis H1), we experimentally compute the mean and variance of ρ using QPSK signalling. The results for the mean and variance are shown in Fig. 3.4(b) and Fig. 3.5, respectively. From Fig. 3.4(b) we can see that the experimental mean approaches the analytical value 0; and in Fig. 3.5 we can see that the variance of ρ decreases as the number of tracing symbols increases. Approximating the mean and variance of ρ under hypothesis H0 is a more intricate task. Our analytical mean and variance are obtained for a fixed value of p_c, whereas in practice ρ is computed from a large number of tracing symbols, each with a different p_c value. Nonetheless, we observe that the analytical expression of
[Figure 3.4 plots: (a) mean of ρ under H0 (two Rx, QPSK, m = 2) versus SNR (dB), experimental and analytical; (b) mean of ρ under H1 (two Rx, 200 tracing symbols) versus SNR (dB), experimental and analytical.]

Figure 3.4: The mean of ρ under hypotheses H0 and H1.

E(ρ|H0) in Eqn. (3.24) is a linear function of p_c. This allows us to average the estimated p_c over a large number of tracing symbols to obtain the analytical value of E(ρ|H0). We compare the analytical estimate obtained using Eqn. (3.24) with the experimental results in Fig. 3.4(a). Fig. 3.4 shows that, for SNR from 12 dB to 17 dB, our analytical approximations closely fit the experimental results, with an error margin on the order of 10^{-3}. We plot the analytical distribution of ρ in Fig. 3.6 for N = 100. To validate our analysis that the distribution of ρ approaches Gaussian as N becomes large enough, we also performed a χ² test to quantitatively measure how close the distribution of ρ is to a Gaussian distribution. The results in Table 3.5 indicate that, with the probability of miss equal to 0.1, we can classify ρ as a Gaussian random variable. Such distributions of the tracing statistics ρ under the two hypotheses clearly indicate that we are able to distinguish cooperative relays from malicious ones with high accuracy.


[Figure 3.5 plots: variance of ρ under hypothesis H1 versus N, the number of reliably detected symbols, experimental and analytical; top panel: one Rx, QPSK (m = 2); bottom panel: two Rx, QPSK (m = 2).]
Figure 3.5: The variance of the tracing statistics ρ under 10 dB SNR.

[Figure 3.6 plot: analytical distribution Pr(ρ) under the two hypotheses for N = 100; hypothesis H0 shown with p_c = 0.95, together with hypothesis H1.]
Figure 3.6: The distribution of the tracing statistics ρ.


Table 3.5: χ² fitting statistics of ρ under H1

N        20      40      60      80      100     120     140     160
One Rx   3.9847  3.2885  3.3828  4.1831  1.7825  1.2650  4.3765  3.4841
Two Rx   4.2556  6.0017  5.2737  0.4385  1.6241  1.3392  1.7076  0.5940

Note: χ² < χ²_α indicates that the experimental data fits the analytical Gaussian distribution with miss probability α. Degrees of freedom = 5; χ²_{0.05} = 11.07; χ²_{0.1} = 9.236.

3.3.3 A Randomized Attack Strategy

Since adversarial relays have every incentive to avoid being caught, a more sophisticated strategy is to "comply" with the transmission rule for part of the symbol stream and randomly jam the rest, as long as the goal of corrupting communications can be achieved. Such a strategy results in a correlation statistic (from the partially garbled tracing symbols) that consists of two parts:

\rho = \left( \sum_{k=1}^{N_1} C_k^{(c)} + \sum_{l=1}^{N_2} C_l^{(g)} \right) \Big/ (N_1 + N_2).   (3.31)

Here N_1 is the number of symbols that comply with the ground truth and N_2 is the number of garbled symbols. The superscripts (c) and (g) indicate whether a symbol is compliant or garbled. Let N = N_1 + N_2; it is straightforward to decompose (3.31) into two parts:

\rho = \frac{N_1}{N} \sum_{k=1}^{N_1} C_k^{(c)} / N_1 + \frac{N_2}{N} \sum_{l=1}^{N_2} C_l^{(g)} / N_2 = \frac{N_1}{N} \rho^{(c)} + \frac{N_2}{N} \rho^{(g)}.   (3.32)

We can see that ρ^{(c)} follows the distribution under hypothesis H0 and ρ^{(g)} that under H1. The mean (resp. variance) of the overall correlation statistic ρ is a weighted combination of the means (resp. variances) of ρ^{(c)} and ρ^{(g)}. Therefore, in general, the mean of such an attacked ρ increases and the variance decreases. The increase in the mean results in a smaller "gap" between the attacked statistic and the fully compliant statistic. However, the gap cannot be made indefinitely small in practice, because the adversary's primary goal is to corrupt communications. When channel coding is applied before symbol formation, if the attacker garbles only a very small portion of the symbol stream, the garbled information will be recovered by the FEC at the receiver side. The minimum portion of the symbol stream that the attacker needs to garble in order to corrupt communications is thus determined by the channel coding rate. Suppose the channel code can correct 1/4 of the symbol errors on average; then the attacker needs to corrupt at least 1/4 of the symbol stream, i.e., N_2/N = 1/4 in (3.32). Using QPSK signalling (m = 2) and p_c = 0.9, we can estimate the mean and variance of the tracing statistic as E(ρ) = 0.65 and Var(ρ) = 0.261/N.
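This estimate can be reproduced from the earlier moment formulas; a short Python check follows, treating ρ as the weighted mixture in (3.32) with N_2/N = 1/4.

```python
# Worked check of the randomized-attack estimate E(rho) = 0.65 and
# Var(rho) = 0.261/N for QPSK (m = 2), pc = 0.9, N2/N = 1/4.
m, pc, frac_garbled = 2, 0.9, 0.25

mean_H0 = 1 - 2**m / (2**m - 1) * (1 - pc)                  # Eqn. (3.24)
second_H0 = (1 - pc) * (2**m - m) / (m * (2**m - 1)) + pc   # Eqn. (3.26)
var_H0 = second_H0 - mean_H0**2                             # Eqn. (3.25)
var_H1 = 1 / m                                              # Eqn. (3.28)

mean_rho = (1 - frac_garbled) * mean_H0        # weighted mean; E(C|H1) = 0
var_rho_times_N = (1 - frac_garbled) * var_H0 + frac_garbled * var_H1
print(mean_rho, var_rho_times_N)               # prints ~0.65 and ~0.262
```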

3.3.4 Effects of Channel Estimation Error

In the previous discussions, we have assumed that exact channel information can be obtained at the receiver side through channel estimation. In practice, the channel estimate will deviate from the actual channel, which requires the tracing scheme to be adapted to such deviations. In this part, we examine the effect of channel estimation error on the tracing statistics ρ. The existing literature on wireless channel estimation [32][59][98] indicates that with training symbols, the normalized mean square error (NMSE) of the channel estimate can approach the Cramer-Rao bound (CRB) at high receiver SNR, or when the number of training symbols is large enough. With a moderate number of training symbols and high enough receiver SNR, we assume that the NMSE of channel estimation (in dB) is equal to the negative of the receiver SNR. We note that such a result can be obtained using various channel estimation methods, such as the Minimum Mean Square Error (MMSE) method [59], the least squares (LS) method [32], or the Maximum Likelihood (ML) method [98]. For example, if the receiver side has an SNR of 14 dB, the NMSE of channel estimation will be about -14 dB. The deviation in channel estimation results in a deviation in the estimated combined signal constellation, which is used to perform symbol detection. Since the energy of the error is relatively small compared to the energy of the channel coefficients, the estimated signal constellation Y_E can be seen as a slightly shifted and rotated version of the true signal constellation Y. This effect is illustrated in Fig. 3.7, where the actual constellation points are drawn as × and the deviated ones as ◦. To see how a deviated signal constellation affects symbol detection, we consider two received signal points, C and D, shown in the figure. Point C is generated by sending signal point A1 with additive noise. Using minimum distance detection, C is detected as point B1 in the estimated constellation. Since B1 ∈ Y_E and A1 ∈ Y correspond to the same pair of transmitted symbols in M × M, such a result is a correct detection. However, for received signal point D, generated by A2 and detected as B1, the detection result is incorrect, because the constellation point in Y_E corresponding to A2 ∈ Y is B2. When the channel estimation error is considered, the received signal model can be written as

r = (H + n_H) s + n_T = H s + n_T + n_H s,   (3.33)

where n_H is the uncertainty in channel estimation and n_T represents receiver thermal noise. It can be seen that the second noise term n_H s scales with the signal magnitude |s|. Thus we refer to the noise introduced by channel estimation error
Figure 3.7: Illustration of channel estimation error and symbol detection in the combined signal constellation plane. QPSK signalling, M = {1, j, -1, -j}. The true signal constellation (indicated by ×) is Y = {y : y = h0 x_i + h1 x_j}, where h0 = 0.82 + 0.15j, h1 = -0.36 - 0.19j, x_i, x_j ∈ M. Because of the error in channel estimation, the estimates are ĥ0 = 0.84 - 0.12j and ĥ1 = -0.24 - 0.22j. The estimated signal constellation (indicated by ◦) is Y_E = {y : y = ĥ0 x_i + ĥ1 x_j}.
as "self-noise." (The term "self-noise" was introduced by Dr. Rajiv Laroia during his talk at the University of Maryland on Jan. 13, 2006.) We now quantitatively analyze the effect of inaccurate channel information on the tracing statistics ρ under hypotheses H0 and H1. The analysis under hypothesis H1 is straightforward: since the adversary randomly chooses a symbol in M to transmit, at the correlation phase (Eqn. (3.22)), even if the transmitted symbol is incorrectly detected, the effect is still the same as receiving a randomly chosen symbol in M. Thus the mean, variance, and distribution of ρ under H1 will not change with inaccurate channel information. Under hypothesis H0, if we know the mean square error (MSE) of the channel estimate, we can estimate the energy of the self-noise term in (3.33). Assuming the self-noise and thermal noise are independent, we can obtain the aggregate noise energy and the equivalent SNR. This SNR leads to an equivalent error probability P_e for symbol estimation in the combined signal plane, and from here on the rest of the previous analysis of the tracing statistics ρ can be applied. In practical implementations, the total noise energy including the self-noise can be measured.

3.4 Simulation Results

In this section we present a number of simulation results using the tracing statistic ρ for detecting a malicious relay under different signal-to-noise ratios (SNR). We consider three simulation scenarios. In the first scenario, the malicious relay uses the basic attack strategy and we assume the receiver obtains perfect channel information. In the second scenario, the malicious relay employs the advanced attack strategy, and the receiver has perfect channel information. In the third scenario,


the malicious relay uses the advanced attack strategy, and the receiver's channel information is inaccurate, where the inaccuracy is quantified by the normalized MSE of the channel estimation. In our simulations, the channel conditions are generated using the modified Jakes model in [27]. Each channel follows time-correlated slow Rayleigh fading, and different channels are uncorrelated. QPSK is chosen as the signalling scheme. Every second, a total of 20K physical-layer symbols are transmitted. Since the fading is slow, the channel coefficient for each symbol duration is considered constant. Every 64 symbols are grouped into a frame. In each frame the sender sends one, two, or three tracing symbols with equal probability. The time slots for the tracing symbols are drawn uniformly within each frame, from position 1 to 64. We consider two types of decision threshold η. The first is a simple average of the analytical expected values under the two hypotheses:

    η = (E(ρ|H0) + E(ρ|H1))/2.      (3.34)

The second decision threshold setting we consider is the Neyman-Pearson setup, where we require that the probability of a miss be smaller than ε and minimize the probability of a false alarm. The decision threshold can be computed using the distribution analysis results in Section 3.3.2. Here we demonstrate that even a simple threshold setting as in (3.34) is very effective in detecting a malicious relay. We note that these threshold settings are inherently adaptive, because the mean and distribution of ρ under the two hypotheses depend on the receiver SNR, the attacker's strategy, and the channel estimation error. Another important system parameter is the threshold probability of correct detection for each tracing symbol, β (cf. Section 3.3.1). Generally speaking, when β is large, each tracing symbol has higher accuracy, so we need fewer


tracing symbols to obtain a converged tracing statistic ρ. However, a high β also reduces the chance that a received tracing symbol is admitted in computing the final tracing statistic. Since ρ converges with more admitted tracing symbols, a higher β means the tracing statistic takes longer to converge. This is especially a problem at very low SNR, where most received tracing symbols have a low probability of correct detection. In our simulations, we set β = 0.9 when the channel estimation is considered perfect, and β = 0.75 otherwise.
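As an illustration of the two threshold settings, the sketch below computes the simple-average threshold of Eqn. (3.34) and a Neyman-Pearson threshold under a Gaussian approximation of ρ under H0. The Gaussian approximation and all numeric inputs are our own placeholders; the dissertation derives the actual distributions in Section 3.3.2.

# Sketch: the two threshold settings for the tracing statistic rho.
# Approximating rho under H0 as Gaussian is an assumption made here for
# illustration only; the numeric example values are placeholders.

from statistics import NormalDist

def simple_threshold(mean_h0: float, mean_h1: float) -> float:
    """Simple average of the expected values under H0 and H1 (Eqn. 3.34)."""
    return (mean_h0 + mean_h1) / 2.0

def neyman_pearson_threshold(mean_h0: float, std_h0: float,
                             error_bound: float) -> float:
    """Largest eta with P(rho < eta | H0) <= error_bound, the construction
    used with the bound of 1e-5 in Section 3.4.3."""
    return NormalDist(mean_h0, std_h0).inv_cdf(error_bound)

# Example: E(rho|H0) = 1, E(rho|H1) = 0 (basic attack), sigma_H0 = 0.05
print(simple_threshold(1.0, 0.0))                    # 0.5
print(neyman_pearson_threshold(1.0, 0.05, 1e-5))     # ~0.79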

3.4.1 Basic Attack with Perfect Channel Information

In this experiment, we simulate the tracing scheme with one adversarial relay node randomly transmitting signalling symbols and the other relay node transmitting cooperatively. The simulations are performed for SNR in the range of 12-17 dB. In Fig. 3.8 we present one realization of the tracing statistic ρ with respect to the number of received tracing symbols, under 13 dB SNR. The adversarial relay randomly chooses one symbol to send, and the simulation runs for a duration of 200 frames. We can see that using one receive antenna (left figure), only 40% of the tracing symbols are considered reliable, because the threshold probability β is set rather high at 0.9. However, using two receive antennas (right figure) significantly improves the symbol detection probability: all the received tracing symbols are reliable. We can also observe that the tracing statistic from the adversarial relay converges to its mean of 0, and that from the cooperative relay converges to its mean close to 1. In both cases the convergence is fast, requiring only 50-100 reliable tracing symbols. We note that the simulated time duration is about 0.6 second, which indicates that the proposed tracing scheme can obtain a highly confident result


Figure 3.8: Tracing statistic ρ under 13 dB SNR, symbol Pc >= 0.9. Left: one receive antenna (T = 200 frames, symbol recovery rate 40%). Right: two receive antennas (T = 100 frames, symbol recovery rate 100%).


Figure 3.9: Statistics of ρ under the basic attack with perfect channel information. (a) Histogram of ρ under 14 dB SNR over 200 experiments, showing ρ under H0, ρ under H1, and the decision threshold η. (b) min(ρ) under H0 and max(ρ) under H1 versus SNR from 12 dB to 17 dB (one Rx, 200 runs), together with the decision threshold η.


within half a second. For SNR values ranging from 12 dB to 17 dB, we perform the simulation 200 times for each integer SNR value. In each run, one randomly chosen relay node is adversarial and the other is cooperative, and we use the proposed tracing scheme to identify the adversarial and cooperative nodes. The number of received tracing symbols ranges from 100 to 300 in our simulations. Across all of our simulations, we achieve 100% correctness in detecting both malicious and cooperative nodes. We note that a tracing error can happen either by classifying a cooperative node as adversarial (false alarm) or by classifying a malicious node as cooperative (miss). In Fig. 3.9(a) we plot the histogram of ρ under 14 dB SNR. We can see that the decision threshold η computed according to Eqn. (3.34) clearly separates the two classes of ρ values. We note that although the values of η over the 200 experiments are not exactly the same, they are very close, as can be seen in the histogram. In Fig. 3.9(b) we show the minimum and maximum values of ρ from 12 dB to 17 dB SNR. We also show the average value of the decision threshold η. Again, these decision thresholds perfectly separate the cooperative nodes from the malicious ones.

3.4.2 Randomized Attack with Perfect Channel Information

In this subsection we evaluate the tracing scheme under the advanced attack strategy discussed in Section 3.3.3. Similar to [61], we consider that the information sender employs an RS code to protect the transmitted information. Suppose the RS code has parameters (40, 20, 21). This forces the malicious relay to jam at least 1/4 of the symbol stream on average in order to successfully corrupt the communication link. In Fig. 3.10 we plot one realization of the tracing statistic under


Figure 3.10: Tracing statistic ρ under the advanced attack strategy, 10 dB SNR, symbol Pc >= 0.9. Left: one receive antenna (T = 400 frames, symbol recovery rate 18%). Right: two receive antennas (T = 200 frames, symbol recovery rate 85%).


Figure 3.11: Statistics of ρ under the advanced attack with perfect channel information. (a) Histogram of ρ under the advanced attack over 300 experiments, SNR = 12 dB, showing ρ under H0, ρ under H1, and the decision threshold η. (b) min(ρ) under H0 and max(ρ) under H1 versus SNR (300 runs), together with the decision threshold η.


such a situation. The receiver SNR is 10 dB. For each signalling symbol, the malicious relay randomly chooses to transmit a garbled signal with probability 1/4, or the correct signal symbol with probability 3/4. We can see from the figure that the separation of the tracing statistic ρ between the adversarial relay and the cooperative relay becomes smaller, but the two are still separable. The convergence of the tracing statistic ρ is slower than under the basic attack. Under the advanced attack, the decision threshold η is adaptively calculated according to Eqn. (3.32). For example, when the adversarial relay garbles 1/4 of the signalling symbols, the expected value E(ρ|H1) = 3E(ρ|H0)/4. A threshold obtained by simply averaging E(ρ|H0) and E(ρ|H1) then gives η = 7E(ρ|H0)/8. We plot the histogram of ρ under the advanced attack in Fig. 3.11(a), together with that from a cooperative relay. The histogram of the decision threshold η is also shown in the figure. Together these histograms indicate that the tracing statistic ρ under the two hypotheses is well separated by the adaptive threshold η. In Fig. 3.11(b) we present the minimum of ρ under hypothesis H0 and the maximum of ρ under H1. Each data point (min or max) in the figure is obtained from 300 experiments. In all the experiments under all SNR values, we obtained perfect classification, i.e., no false alarm or miss event occurred.

3.4.3 Randomized Attack with Channel Estimation Error

In Eqn. (3.33), we modelled the channel estimation error as an additional noise term in the received signal. In our simulations, we intentionally add noise to the actual channel coefficients and provide the inaccurate channel coefficients to the tracing algorithm. We assume the mean square error of the channel estimation is a known parameter. In this setting, we explore the Neyman-Pearson threshold setting by


Table 3.6: Experimental frequency of miss and false alarm events (advanced attack with channel estimation error; Neyman-Pearson setting with P_m <= 10^-5; 300 runs)

    SNR (dB)                 12   13   14   15   16   17
    Number of misses          0    0    0    0    0    0
    Number of false alarms    0    0    0    0    0    0

analytically computing the distribution of ρ under H0. The threshold η ensures that the miss probability P_m <= 10^-5. As in the previous simulation setting, the malicious relay randomly garbles 25% of the relayed signals. In Fig. 3.12 we show the histogram of ρ from 300 experiments. Two methods to compute the Neyman-Pearson threshold are tested. The first method is to compute the threshold based on the estimated symbol error probability from each experiment run, as shown in Fig. 3.12(a). The second method is to use the average symbol error probability from all experiments, as shown in Fig. 3.12(b). In both cases, no miss or false alarm occurred. In addition, we plot the analytical distribution of ρ under hypothesis H0 in Fig. 3.12(b), which is the basis for computing the threshold η. Due to the inaccuracy in estimating the symbol detection error, the analytical histogram slightly deviates from the actual one by a mean shift of about 3 x 10^-3. In spite of this slight deviation in the estimated distribution, the tracing algorithm performs very well, achieving perfect detection accuracy. If the tracing scheme can allow a training stage, even such deviation in the analysis can be corrected using the results from training.

3.4.4 Discussions

The cost of the tracing scheme is the extra bandwidth for transmitting the tracing symbols and the computing resources used to detect garbled symbols and aggregate the


Figure 3.12: Statistics of ρ under the advanced attack with channel estimation error, SNR = 13 dB. (a) Histograms of ρ under H0 and H1 with the decision threshold η computed for each experiment run. (b) Histograms with the Neyman-Pearson threshold η computed using collective information from 300 experiments, together with the histogram envelope according to the analytical distribution of ρ under H0.


tracing symbols. The performance of the proposed tracing scheme is quantified by the detection probability. Generally speaking, improving the receiver SNR improves both the efficiency and the performance of the tracing scheme. This can be achieved in several ways, such as increasing the transmit power or using multiple receive antennas. With improved SNR, the receiver can achieve a better detection probability using a smaller number of tracing symbols, improving the efficiency and performance of tracing simultaneously. Another way to improve the detection probability of the tracing scheme is to use stronger channel coding at the sender side, which forces the malicious relay to garble more signals in order to corrupt communications. Throughout this chapter, we have assumed that the pilot symbols for channel estimation are not manipulated by the malicious relay node. In reality, the malicious node can also garble these pilot symbols. However, in cooperative communications, a relay node is chosen only when the channel condition between itself and the receiving node is good enough. As such, in order to be chosen as relays and have the potential to corrupt communications, the malicious nodes cannot significantly manipulate the pilot symbols; otherwise the channel estimation result will indicate high fluctuation in the channel condition, and the malicious node will not be chosen as a relay.

3.5 Chapter Summary

In this chapter, we identify the threat of signal garbling by malicious relay nodes in cooperative wireless communication systems. To counter such malicious attacks, we propose a cross-layer tracing scheme that can pinpoint the malicious relay with high accuracy. The proposed tracing scheme employs an adaptive signal detection algorithm, coupled with a statistical tracing-symbol aggregation scheme. Our


analysis and simulation results show that the proposed tracing scheme can identify the malicious relay with probabilities of miss and false alarm as low as 10^-5, requires only 1-3% of the bandwidth when activated, and obtains the tracing result within a few seconds of signal transmission.


Chapter 4 Coordinated Sensor Deployment for Security and Sensing Performance
Wireless sensor networks are an emerging area that integrates sensing, wireless communication, internetworking, distributed signal processing, embedded systems, and information security, all through a set of tiny sensor nodes deployed in a designated area. They have great potential to be widely used in environmental monitoring, building surveillance, industrial manufacturing, and military combat [1, 50]. To achieve security and other functionality using sensor nodes, the protocols for sensor networks must take into consideration the computation, memory, and power constraints of the sensor nodes. In addition, one aspect of sensor network design is often intricately involved with a number of other aspects. Since sensor network systems have inherently complex criteria and objectives, optimizing a single objective may impair the system performance in other aspects. One such example is sensor node deployment. The main jobs for most sensor nodes include sensing


and communications. In the sensor deployment literature, there are works concerning the efficient sensing coverage issue [74, 145, 120] or the secure communication problem [30, 64], but not yet both. As a first step toward developing the theory and algorithms of more coordinated sensor deployment, this chapter focuses on jointly considering two important aspects, namely, sensing coverage and communication security. A sensor network usually performs its task in the following way. Depending on the application, the appropriate type of environmental information in the field is first gathered by the individual sensor nodes and processed; the necessary information is then relayed to and/or collected by other nodes [35, 46]. The physical characteristics of the sensing and communication devices on board a sensor impose limits on both the sensing range and the communication range. Therefore, the placement of sensor nodes has a substantial impact on both the sensing coverage [74] and the communication connectivity [60]. Recently, the security of sensor networks has been brought to the attention of the research community [18, 15, 39]. As the sensor nodes rely on wireless transmission for communications, malicious adversaries could intercept the communications, modify the data packets, or inject falsified packets. To ensure secure sensor communications, cryptographic mechanisms can be employed to encrypt the data and produce message authentication codes (MAC). Since symmetric-key cryptography, which employs the same cryptographic key at the sending and receiving ends, generally has substantially lower computational complexity than public-key cryptography, symmetric-key cryptographic tools are generally preferred in practice because of the limited computational resources at each sensor node. Furthermore, resource-constrained sensor networks impose stringent constraints on the key establishment


scheme. Conventional key management schemes are either centralized, employing a key distribution server, or contributory, using public-key cryptography; both often require a non-trivial amount of communications. These conventional schemes are not suitable in sensor network scenarios [34]. To meet the challenge of designing secure sensor networks, key pre-distribution schemes have been introduced to address these special needs [34, 16]. There are two main scenarios in which sensor deployment is modelled and studied, depending on whether the locations of sensor nodes can be adjusted after the initial deployment. The first scenario is static deployment, where the locations of the sensors do not change once deployed. When efficient sensing coverage is the sole concern, the existing literature suggests that different deployment topologies lead to substantially different efficiency in sensing coverage [20, 53]. Meanwhile, researchers focusing on secure sensor communications have recently shown that, if the key pre-distribution can be adapted according to the sensors' locations, we can substantially improve the probability of establishing secure communication links between sensors as well as the security against compromised nodes [30, 64, 143]. However, there is very limited analysis on how the topology of sensor locations affects both security and coverage [71]. In Section 4.2 of this chapter, we present analytic models and experimental validations for several practical topologies in terms of both sensing coverage and the ability to establish secure communication links. We consider both the case where each sensor can be accurately placed at any desired location and the case where the actual deployment deviates from the desired location. This investigation provides important guidelines for sensor deployment in a variety of applications. The second scenario of sensor deployment considers more advanced sensing


devices, where the sensors have the capability of adjusting their locations after being deployed in the field. This is particularly useful when the actual deployment deviates from the desired locations. The current literature primarily concerns the development of adjustment algorithms to optimize the sensing coverage. As we shall show in Section 4.3, such adjustment may negatively affect the probability for sensors to establish secure communication links. In [91] and [22], the authors have proposed general frameworks for constrained sensor location adjustment that jointly consider sensing coverage and other performance aspects, such as sensor communications. For communication connectivity, the distance between nodes is a primary factor. However, factors more complicated than distance also play important roles in determining the secure connectivity between nodes. As such, the extension of prior work to incorporate secure communication is not straightforward. The nature of secure sensor communication requires comprehension of the key establishment schemes, investigation into the deployment topology, and a special formulation of the key-sharing constraint as part of the location adjustment algorithm. This motivates us to develop new location adjustment algorithms that can jointly optimize sensing coverage and communication security. We further relate to the first scenario by examining how different topologies for the desired deployment locations affect the overall performance under these security-aware adjustment algorithms.

4.1 Background and Related Works

In this section, we review the background on sensing coverage and secure communications in sensor networks, and briefly survey related prior works. Throughout the discussion we adopt a simplified mathematical model for sensing coverage.


A sensor node located at x0 has the sensing capability S for a given location x:

    S(x0, x) = 1  if d(x0, x) <= Rs;
    S(x0, x) = 0  if d(x0, x) > Rs,                     (4.1)

where Rs is referred to as the sensing radius, and the distance metric d(., .) is usually the Euclidean distance. S = 1 indicates the sensor has the capability to sense the location and S = 0 otherwise. Analogous to the sensing capability, we can model the existence of a communication link between two sensor nodes n0 (located at x0) and n1 (located at x1) as

    T(x0, x1) = 1  if d(x0, x1) <= Rc;
    T(x0, x1) = 0  if d(x0, x1) > Rc,                   (4.2)

where Rc is referred to as the communication radius. T = 1 indicates the link exists and T = 0 otherwise.
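The two binary disk models translate directly into code. The sketch below encodes Eqns. (4.1) and (4.2); the function and variable names are our own.

# Sketch: the binary disk models of Eqns. (4.1) and (4.2).
# Function and variable names are ours; the model itself is from the text.

import math

def dist(a: tuple, b: tuple) -> float:
    """Euclidean distance d(., .) used by both models."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def can_sense(x0: tuple, x: tuple, rs: float) -> bool:
    """S(x0, x) of Eqn. (4.1): node at x0 senses location x iff d <= Rs."""
    return dist(x0, x) <= rs

def can_communicate(x0: tuple, x1: tuple, rc: float) -> bool:
    """T(x0, x1) of Eqn. (4.2): a link exists iff d <= Rc."""
    return dist(x0, x1) <= rc

# Example: with Rs = 5 and Rc = 9 (the values used later in Fig. 4.4),
# a node at the origin senses (3, 4) and can reach a neighbor at (6, 6).
print(can_sense((0, 0), (3, 4), 5.0), can_communicate((0, 0), (6, 6), 9.0))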

4.1.1 The Sensing Coverage Problem

Efficient Sensing in Static Deployment

Suppose the sensor nodes with sensing radius Rs can be hand-placed in the field at the exact locations of our choice. We are interested in the optimal way to place the sensors so that: (1) any location in the field is covered by at least one sensor; and (2) the nodes perform sensing in an efficient way. To quantify the efficiency of sensing coverage, we define the sensing efficiency ratio θ as the ratio of two areas,

    θ = A_sep / A_col.

Here A_col is the actual area covered by all the sensor nodes, and A_sep is the sum of the areas covered by each individual sensor. Apparently, we have A_sep >= A_col, as


Figure 4.1: Two possible lattice deployments: (a) the square lattice; (b) the hexagon lattice. Under the full coverage requirement, the hexagon lattice has the lowest node density.

the coverage of different sensors may overlap; thus θ >= 1. The closer the efficiency ratio gets to 1, the higher the efficiency. So the problem of optimizing coverage efficiency can be formulated as minimizing the efficiency coefficient θ subject to the whole area being fully covered. This problem is traditionally known as the circle covering problem [128], where a number of equivalent circles (i.e., circles with the same radius) are placed in a field to completely cover the field area. A sensor node is located at the center of a circle, and the radius of the circle corresponds to the sensing radius. If the circles are placed in repeated regular patterns, the circle centers form a lattice and the dissecting lines among the centers form a cell pattern. In Fig. 4.1 we show two possible covering layouts using the square lattice and the hexagon lattice, respectively. Each layout leads to a specific efficiency ratio θ, which is also known as the covering density or covering thickness in the mathematics literature [20].
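The covering density of a given lattice can be checked numerically. The sketch below verifies full coverage by grid sampling and evaluates θ = (node density) x πRs² for the two lattices of Fig. 4.1, using the smallest sensing radius that still covers the plane. The lattice constructions and the 10 x 10 test field are our own choices.

# Sketch: numerical check of the covering densities discussed below.
# For each lattice we use the minimal sensing radius that still yields
# full coverage, then evaluate theta = density * pi * Rs^2.

import math

def covered(nodes, rs, field=10.0, step=0.1):
    """Grid-sample the field; every point must be within Rs of some node."""
    n = int(field / step) + 1
    return all(
        any((i*step - x)**2 + (j*step - y)**2 <= rs*rs + 1e-9 for x, y in nodes)
        for i in range(n) for j in range(n))

def square_lattice(d, field=10.0):
    m = int(field / d) + 2
    return [(i*d, j*d) for i in range(-1, m) for j in range(-1, m)]

def hex_lattice(d, field=10.0):
    rows = int(field / (d * math.sqrt(3) / 2)) + 2
    return [((d/2 if r % 2 else 0) + c*d, r * d * math.sqrt(3) / 2)
            for r in range(-1, rows) for c in range(-1, int(field/d) + 2)]

d = 1.0
cases = [("square",  square_lattice(d), d / math.sqrt(2), 1 / d**2),
         ("hexagon", hex_lattice(d),    d / math.sqrt(3), 2 / (math.sqrt(3) * d**2))]
for name, nodes, rs, density in cases:
    theta = density * math.pi * rs * rs
    print(name, covered(nodes, rs), round(theta, 3))
# -> square True 1.571 (= pi/2); hexagon True 1.209 (= 2*pi/sqrt(27))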
For the simplified sensing model of Eqn. (4.1), A_sep = n·πRs², where n is the total number of sensors. Kershner [53] has shown that a lower bound for θ is 2π/sqrt(27) ≈ 1.21, which is achieved when the centers of the circles (i.e., the sensor nodes) form a hexagon lattice. In this case, the distance between any two neighboring nodes is D = sqrt(3)·Rs. Fig. 4.1(b) illustrates the geometry of such a placement. Compared to the square lattice placement, which has efficiency ratio θ = π/2, the hexagon lattice placement gives much more efficient coverage. Further, sensor placement can be viewed as spatial sampling from a signal processing perspective. The literature there also suggests the superiority of the hexagonal sampling lattice over the square lattice when the spatial spectrum of a 2-D signal being measured (such as a temperature field) is bandlimited with a circular support. For the convenience of discussion, we define the following notation. In the square lattice deployment, we denote the distance from a node to its horizontal/vertical neighbor by D1, and the distance to its diagonal neighbor by D2. Thus we have D2 = sqrt(2)·D1 ≈ 1.41·D1, and the covering density θ = πRs²/D1². In the hexagon lattice, we denote the distance from a node to its six neighbors by D3. Further, if we require that the two lattices have the same node density, we have D3 = sqrt(2/sqrt(3))·D1 ≈ 1.07·D1. Throughout this chapter, we use the normalized distance with respect to D1 as the distance metric, and study the impact of deployment topology on the performance of sensing coverage and secure communications.

Sensor Location Adjustment Algorithms

In recent years, advances in micro-electromechanical systems (MEMS) have made it possible for tiny sensor devices to walk as microrobots [89]. The locomotion capabilities of sensor nodes make it possible to improve the sensing coverage after the initial sensor deployment. Consequently, a number of prior works have studied how to adjust the locations of sensor nodes to maximize the total coverage in


a given area [75, 44, 74, 120, 145]. The total sensing coverage, τ, is the percentage of the total field area covered within the sensing range. We can see that τ <= 1, and a larger τ represents better coverage. Most existing algorithms for sensor location adjustment use an iterative framework. In each iteration, sensors (or a cluster head) obtain their current locations and the relative locations of their neighbors. Based on this information, each node computes a new location using the location updating algorithm. The general strategy is to spread out the sensor nodes as evenly as possible. For example, the virtual force algorithm (VFA) proposed in [145] compares the distance between a sensor and its neighbor nodes with a threshold distance. An attractive (resp. repulsive) virtual force is applied to the sensor node if the distance is greater (resp. smaller) than the threshold. The Minmax algorithm proposed in [120] employs the Voronoi cell concept and moves each sensor to the center of the minimum-radius circum-circle of its Voronoi cell. Further, for calculating the sensing coverage, the authors of [74] and [75] have proposed polynomial-time algorithms to calculate the worst-case and average-case coverage. From a secure communication point of view, however, location adjustment intended only to maximize sensing coverage may reduce the secure communication connectivity. This is because the secure links established before location adjustment may no longer exist afterwards, and some sensor nodes can be moved to locations that are unfavorable for establishing secure communications using pre-distributed keys. In Section 4.3, we will present a detailed example to illustrate the limitation of the existing adjustment methods and discuss how to balance the tradeoff between the sensing coverage and the node connectivity using secure links.


4.1.2 Key Pre-distribution for Sensor Networks

As reviewed earlier, one of the critical issues for secure sensor communication is to establish a cryptographic key between two sensors. To accommodate the stringent resource constraints in sensor network systems, an increasingly popular approach is to preload each sensor with a set of keys from a large collection of keys. This entire collection of keys is referred to as the key pool, and the set of keys loaded by each sensor is referred to as its key ring. Once the sensors are deployed in the field, neighboring sensors follow a protocol to discover whether they share some secret keys. If so, they use these shared secret keys to establish a secure communication link. There are two requirements for establishing a secure communication link between two sensor nodes: (1) the two nodes should be within communication range; and (2) the two nodes should share at least one secret key. The first work on random key pre-distribution [34] considered sensor nodes randomly deployed in the field. Later, Du et al. and Liu et al. proposed to incorporate sensor location knowledge into key pre-distribution [30, 64]. The deployment model in these works considers the sensors being deployed at the centers of evenly partitioned square cells. Each cell has its own key pool, and only neighboring cells have overlap between their key pools. The sensors in each cell randomly pre-load keys from the key pool of their own cell. Since the key pool of each cell is much smaller than the key pool for the entire sensor node collection, neighboring sensors have a higher chance of sharing keys. Most recently, Zhou et al. identified that the hexagon cell has better circular symmetry than the square one, reflecting the common shape of a sensor's communication range, and proposed to use the hexagon lattice in location-based key pre-distribution [143]. In location-based key pre-distribution, each sensor has a designed deployment


location for establishing secure communication links. In practice, these designed locations may not be the same as the locations determined according to the sensing performance. This motivates us to study the impact of practical sensor deployment on establishing secure communications.
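Under the basic scheme, the probability that two key rings intersect follows from a standard hypergeometric argument; the sketch below evaluates the closed form and cross-checks it by Monte Carlo. The parameter values echo those used later in Fig. 4.3 (100 keys per ring, 1200-key pool), though there the pool is per cell.

# Sketch: probability that two sensors share at least one key when each
# key ring is a random k-subset of a pool of P keys. The closed form is
# the standard hypergeometric result; the Monte Carlo check is our own.

import math
import random

def p_share(pool_size: int, ring_size: int) -> float:
    """P(two random key rings intersect) = 1 - C(P-k, k) / C(P, k)."""
    return 1.0 - math.comb(pool_size - ring_size, ring_size) / \
                 math.comb(pool_size, ring_size)

def p_share_mc(pool_size: int, ring_size: int, trials: int = 20000) -> float:
    """Monte Carlo estimate of the same probability."""
    hits = 0
    for _ in range(trials):
        a = set(random.sample(range(pool_size), ring_size))
        b = set(random.sample(range(pool_size), ring_size))
        hits += bool(a & b)
    return hits / trials

# With a 1200-key per-cell pool and 100-key rings, sharing is nearly
# certain (~0.9999); with a 10000-key global pool it drops to ~0.43.
print(p_share(1200, 100), p_share_mc(1200, 100))
print(p_share(10000, 100))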

4.1.3 Adversary Model and Link Compromise Probability

In addition to the ability to establish secure links between nodes, resilience against node compromise is another important security aspect to be examined. As sensor nodes may be deployed in adversarial environments, a deployed node could be captured by adversaries. We assume the adversaries will try to capture deployed sensor nodes and use the pre-loaded keys in the captured nodes to eavesdrop on secure links among sensor nodes that are not compromised. In measuring such a potential threat, the probability of link compromise due to node compromise is an important security metric considered by previous key pre-distribution works [34, 16, 63, 30]. To allow fair comparison of different deployment topologies, we should require each topology to have the same link compromise probability when the same number of nodes are compromised, and then compare the connectivity of secure communication. We have constructed a probabilistic model to compute and approximate the link compromise probability in the location-based key pre-distribution scheme [30] using lattice deployment. It can be shown that if the compromised nodes are statistically uniformly distributed among all nodes and each node carries the same number of keys, then the link compromise probability is approximately the same for the location-based scheme using the square lattice, the hexagon lattice, and the basic scheme using random deployment, up to a first-order Taylor expansion [34]. The detailed derivation can be found in the Appendix. With this finding, we can


construct a fair comparison between deployment topologies. Our study shows that, for a fixed-size key ring, the group-based scheme using structured/lattice deployment can usually achieve better connectivity than the random deployment.
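For the basic scheme, the first-order behaviour referenced above is commonly summarized by the approximation that a link protected by one pool key is compromised with probability about 1 - (1 - k/P)^x when x nodes are captured. The sketch below evaluates this approximation; treating it as shared across the compared topologies is exactly the first-order claim made in the text.

# Sketch: approximate link-compromise probability for the basic scheme,
# where each of the x captured nodes carries k of the P pool keys.
# This closed form is the standard approximation, not a derivation from
# the dissertation's appendix.

def link_compromise_prob(k: int, pool: int, captured: int) -> float:
    return 1.0 - (1.0 - k / pool) ** captured

# Example: k = 100, P = 1200; capturing 5 nodes compromises ~35% of links.
print(round(link_compromise_prob(100, 1200, 5), 3))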

4.2 Lattice-Structured Deployment

In this section, we jointly examine the sensing coverage and communication security under the static sensor deployment scenario. Given the very limited amount of study in the literature on how the sensor topology affects both security and coverage, we focus on analyzing the impact of deployment topology on the performance of sensing coverage and efficiency, as well as on the ability to establish secure communications. We will consider two main deployment topologies, namely, the square lattice and the hexagon lattice.

4.2.1 Fundamental Relations Between Deployment Lattices

As a first example to illustrate the impact of sensor deployment topology on the establishment of secure communication links, we consider the simple case of sensors being placed exactly at the desired locations. We deploy sensors under a square lattice and a hexagon lattice, respectively, and employ the basic key pre-distribution scheme [34], where each node has the same probability of sharing a secret key with any other node. We denote the key sharing probability by Pshare, and use the same node density, i.e., the number of nodes per unit area, in the two lattice deployments. As the communication radius Rc increases from 0 to 1.6·D1, each sensor gradually gains more reachable neighbors, and this in turn affects the number of secure links per node. Because the distance


Figure 4.2: Expected number of secure links per node, normalized by Pshare, versus the normalized communication radius Rc/D1, using the basic key pre-distribution scheme.

between a node and its eight neighboring nodes in a square lattice is not circularly symmetric [143], the number of neighbor nodes that can be reached is a two-step function of the communication range. That is, as the communication range increases, the four vertical and horizontal neighbors of the center node (also known as the 4-way connection [47]) are reached first, before the other four neighbors in the diagonal directions. This can be seen from Fig. 4.2, where we show the expected number of secure links per node, normalized by the key sharing probability, with respect to the normalized communication radius. The result for the hexagon lattice, on the other hand, is a one-step function, owing to the circular symmetry between a center node and its six neighbors. Under the same node density, Fig. 4.2 shows that the hexagon lattice achieves a better topology when the communication radius Rc is between 1.07 and 1.41 times D1; outside this range, the square lattice achieves better connectivity. Later in this chapter, we shall see several more examples reflecting this fundamental relation between the square and hexagon deployment lattices.
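The step functions in Fig. 4.2 can be reproduced by simply counting lattice neighbors within the communication radius, since the expected number of secure links under the basic scheme is this count scaled by Pshare. The sketch below uses the D1/D2/D3 relations defined earlier; the function names are our own.

# Sketch: the step-function behaviour of Fig. 4.2, via neighbor counting.

import math

def reachable_neighbors(rc_over_d1: float, lattice: str) -> int:
    d1 = 1.0
    if lattice == "square":
        # four neighbors at D1 and four diagonal neighbors at sqrt(2)*D1
        dists = [d1] * 4 + [math.sqrt(2) * d1] * 4
    else:
        # six hexagon-lattice neighbors at D3 = sqrt(2/sqrt(3)) * D1
        # (equal node density as the square lattice)
        dists = [math.sqrt(2 / math.sqrt(3)) * d1] * 6
    return sum(d <= rc_over_d1 for d in dists)

for rc in (1.0, 1.2, 1.5):
    print(rc, reachable_neighbors(rc, "square"), reachable_neighbors(rc, "hexagon"))
# -> 1.0: 4 vs 0;  1.2: 4 vs 6;  1.5: 8 vs 6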


4.2.2 Secure Connectivity Under Perturbed Deployment Lattice

While Fig. 4.2 depicts the trend of the secure communication connectivity using square and hexagon lattices in the ideal situation, in practice sensors may not be deployed with high accuracy at the designed lattice locations. Such inaccuracy may be caused by measurement error (if sensors are deployed by humans) or by wind (if sensors are deployed by vehicle or airborne methods). Suppose a sensor node is designed to be deployed at location (x0, y0) in the field. The actual deployment location (x, y) can be modelled as

    x = x0 + rx;   y = y0 + ry.

Here the deviation terms rx and ry are zero-mean random variables. One can model these deviation terms as Gaussian distributed [30] or uniformly distributed [63] random variables, with the variance σ² as the deviation parameter. Taking the deployment deviation into consideration, we investigate the impact of deployment topology on the connectivity of secure communication. Here we choose the location-based key pre-distribution in [30] with the Gaussian deployment deviation model, and compare the square lattice deployment used in [30] with the hexagon lattice deployment. In the hexagon lattice deployment, each node is surrounded by six neighbor nodes. Under location-based key pre-distribution, the key pool of any given node, referred to as the center node, has 1/6 overlap with each of its six neighbors' key pools. Thus the center node has equal probability of sharing keys with each neighbor node. We denote the probability that the center node can still be a neighbor of one of its designed neighbor nodes under Gaussian deployment deviation by Pr(neighbor), and the probability that the two nodes


can share a key by Pr(share). As the deployment deviation is independent of the key distribution, the probability that a designed neighbor in the hexagon lattice can establish a secure link with the center node is Pr(hexlink) = Pr(neighbor)·Pr(share). Because of the geometrical symmetry, the expected number of secure links for the center node is

    E(N_sec^hex) = 6·Pr(hexlink).

Similarly, we can compute the expected number of secure links per node in the square lattice deployment. In the square lattice, we refer to the horizontal/vertical neighbors of a node as type-A neighbors and the diagonal neighbors as type-B neighbors. Denote the probability that a node can establish a secure link with one of its type-A neighbors by Pr(sqlinkA), and with one of its type-B neighbors by Pr(sqlinkB). The expected number of secure links per node is

    E(N_sec^sq) = 4·Pr(sqlinkA) + 4·Pr(sqlinkB).
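A Monte Carlo version of these expressions is sketched below. Each node is perturbed by independent per-axis Gaussian deviations, so the relative displacement of two nodes has per-axis variance 2σ²; key sharing between designed neighbors is modelled as an independent coin flip with probability p_share, a simplification of the key-pool overlap model in [31] used in the text. All numeric values are placeholders.

# Sketch: Monte Carlo estimate of E(N_sec) under Gaussian deployment
# deviation, combining Pr(neighbor) and Pr(share) as above.

import math
import random

def expected_secure_links(neighbor_dists, rc, sigma, p_share, trials=20000):
    """E(N_sec) for a center node whose designed neighbors sit at the
    given distances; every node is perturbed by N(0, sigma^2) per axis."""
    total = 0
    for _ in range(trials):
        for d in neighbor_dists:
            # relative displacement of two perturbed nodes: variance 2*sigma^2
            dx = d + random.gauss(0, sigma * math.sqrt(2))
            dy = random.gauss(0, sigma * math.sqrt(2))
            if math.hypot(dx, dy) <= rc and random.random() < p_share:
                total += 1
    return total / trials

d1, sigma, p_share = 1.0, 0.25, 0.3
hexagon = [math.sqrt(2 / math.sqrt(3)) * d1] * 6
square = [d1] * 4 + [math.sqrt(2) * d1] * 4
for rc in (1.1, 1.3, 1.5):
    print(rc, round(expected_secure_links(hexagon, rc, sigma, p_share), 2),
              round(expected_secure_links(square, rc, sigma, p_share), 2))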

In Fig. 4.3 we show the expected number of secure links per node under Gaussian deployment deviation. Each node carries 100 keys and each key pool contains 1200 keys. The neighbor probability and key sharing probability can be computed using the model in [31]. In Fig. 4.3, in both the square and hexagon lattice deployments, the expected number of secure links increases with the normalized communication radius Rc/D1. The hexagon lattice achieves a slightly higher connectivity over the range of 0.9 to 1.7 in the normalized communication radius, exhibiting a similar trend as in Fig. 4.2. This suggests that the communication radius, deployment topology, deviation parameter, and number of pre-loaded keys per sensor all play a role in establishing secure links. It is also worth noticing that while the numerical gain in connectivity by the hexagon lattice over the above-mentioned communication range is small, its practical impact is non-trivial. Within this


Figure 4.3: Expected number of secure links per node versus communication radius under Gaussian deployment deviation (σ/D1 = 0.25). Shown here are the analytical values.

communication range, the average node degree increases from 0.5 to around 4, and the sensors gradually change from isolated nodes to connected components, where the boundary value for the average number of links per node is around 2. This phenomenon will be illustrated later in Section 4.3. To achieve the same connectivity, the square lattice would require a larger communication range. As the power consumption of wireless communications is related to the communication range by a power law (from the 2nd to the 4th power, depending on the propagation environment [97]), a lower requirement on communication range, with its lower power consumption while maintaining communication connectivity, is more desirable in many sensor network designs. This makes the hexagon deployment lattice attractive for power-limited applications.


4.3 Security-Aware Sensor Location Adjustment

The static deployment strategy described in the previous section treats sensor deployment as a one-time task. Once the sensors are deployed in the field, their locations are fixed and cannot be further adjusted. In recent years, a number of works on practical sensor deployment have considered movement-adjusted sensor deployment for improving sensing coverage [120, 145, 44]. In this section, we investigate the impact of location adjustment in sensor deployment on secure communications. We propose two new location updating algorithms for sensor deployment that jointly consider sensing coverage and secure communications.

4.3.1 Improving Secure Connectivity Using the Virtual Force Framework

Effect on Secure Connectivity by the Existing Approach

When secure communication is required for sensor nodes, the existing location adjustment algorithms may negatively affect the establishment of secure communication links. As an example, we examine the establishment of secure communication links when the sensors are moved by the Virtual Force location updating algorithm [145]. The Virtual Force algorithm adjusts the sensor locations based on the distance from a sensor to its neighbors relative to a pre-determined threshold d_th. Suppose a node ni has a neighbor node nj and their distance is d_ij. The virtual force applied by nj to ni is

    F_ij = w_A·(d_ij - d_th)·v_ij   if d_ij >= d_th;
    F_ij = (w_R/d_ij)·v_ji          if d_ij < d_th.      (4.3)


Figure 4.4: Impact of location adjustment on the establishment of secure links using VFA. (a) Initial deployment locations (coverage ratio τ = 0.7; average secure links per node N_sec = 2.5). (b) Locations after 4 VFA iterations (τ = 0.85; N_sec = 1.6).


Here v_ij is the unit-length vector pointing from the location of ni to that of nj. The total virtual force F_i on ni is the aggregated virtual force from all of its Nb neighbors, i.e.,

    F_i = sum over j = 1..Nb of F_ij.      (4.4)
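A minimal sketch of the force computation in Eqns. (4.3)-(4.4) is given below; one VFA iteration then moves the node by the resulting vector, as the text describes next. The constants w_A, w_R, and d_th are placeholders ([145] tunes them for VFA).

# Sketch: the distance-based virtual force of Eqns. (4.3)-(4.4).
# The constants w_a, w_r, and d_th below are placeholder values.

import math

def vfa_force(ni, neighbors, d_th=6.0, w_a=1.0, w_r=4.0):
    """Aggregate virtual force on node ni (Eqn. 4.4) from pairwise
    attractive/repulsive terms (Eqn. 4.3)."""
    fx, fy = 0.0, 0.0
    for nj in neighbors:
        dx, dy = nj[0] - ni[0], nj[1] - ni[1]
        d = math.hypot(dx, dy)
        if d == 0:
            continue  # coincident nodes: direction undefined, skip
        ux, uy = dx / d, dy / d          # unit vector v_ij from ni to nj
        if d >= d_th:
            mag = w_a * (d - d_th)       # attractive, toward nj
        else:
            mag = -w_r / d               # repulsive, along v_ji = -v_ij
        fx, fy = fx + mag * ux, fy + mag * uy
    return fx, fy

# Moving ni by the force vector is one VFA iteration step:
ni = (10.0, 10.0)
f = vfa_force(ni, [(14.0, 10.0), (10.0, 18.0)])
print((ni[0] + f[0], ni[1] + f[1]))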

After computing the virtual force for each node, the node ni is moved in the direction specified by the aggregated virtual force F_i with a step size equal to its magnitude |F_i|. Fig. 4.4 shows an example of location updating using the VFA and its impact on secure communications. Initially, 49 sensors are deployed in a 60 x 60 area in a hexagon lattice pattern. A uniformly distributed deployment deviation is applied to the initial locations, with deployment variance σ² = 4/3. This initial deployment is shown in Fig. 4.4(a), with the established secure links marked as lines connecting the sensor nodes. In this example, the sensing radius is 5 and the communication radius is 9. The sensing coverage achieved by the initial deployment is τ = 0.7 and the average number of secure links per node is N_sec = 2.5. Next, we apply the VFA to update the sensor locations. After four iterations, the sensing coverage has improved to τ = 0.85, while N_sec has been reduced significantly to 1.6, implying that most of the nodes are no longer connected. This is illustrated in Fig. 4.4(b). At the initial deployment, most nodes form a connected component using the secure links; after four iterations, about half of the nodes are no longer connected to the largest connected group, which reduces the capability of secure communications between the sensor nodes. Our study shows that such a phenomenon is common in sensor location updates using virtual-force type schemes. To maintain a comparable sensing coverage while improving secure connectivity, we propose a modified sensor location updating algorithm


based on the virtual force framework. We call the modified algorithm VFSec, indicating that secure communication is one of the main factors in updating the sensor locations.

VFSec Algorithm

As we have seen, there is a tradeoff between sensing coverage and secure connectivity. To balance this tradeoff, we define an additional performance metric

    ξ = w1·τ + w2·N_sec.      (4.5)

The weights w1 and w2 are chosen such that w1·τ and w2·N_sec are approximately in the same value range, so as to achieve a desired tradeoff. Since the sensing coverage is always within [0, 1], and the average number of secure links per node is around 3 in most of our experiments, we choose w1 = 1 and w2 = 1/3 in our experiments. Our algorithm uses the combined performance metric ξ to measure the optimality of sensor locations, balancing the tradeoff between coverage and secure communications. During each iteration of location adjustment, the algorithm tries to keep those nodes that can establish secure links closer together. To achieve this, we add a new term of virtual force, F_sec, to the total virtual force. The virtual force F_sec_ij applied to a node ni by a neighbor node nj is

    F_sec_ij = w_s·(d_ij - D_sec)·v_ij   if D_sec < d_ij < Rc and (ni, nj) share a key;
    F_sec_ij = 0                         otherwise.      (4.6)

In computing F_sec_ij, D_sec is a threshold distance smaller than the communication radius Rc, d_ij is the distance between nodes ni and nj, w_s is the weight assigned


to the added virtual force, and v_ij is the unit-length vector pointing from the location of ni to that of nj. The total secure-link virtual force applied on ni is F_sec_i = sum over j of F_sec_ij. This force is added to the distance-based forces F_ij in Eqn. (4.4) to compute the total virtual force F_i. To update the sensor location, the sensor node ni is moved along the direction of F_i by a distance equal to the magnitude |F_i|. The complete algorithm is described in Algorithm 1.

Simulation Results and Discussions

To study the performance of VFSec, we have performed three experiments comparing VFA and VFSec in terms of sensing coverage and secure link establishment. Throughout these experiments, we set the communication radius Rc to twice the sensing radius Rs. This ensures that even when the sensing range is very small and two neighboring sensors are barely disjointly placed (i.e., the distance between two neighboring sensor nodes is 2Rs), it is still possible to establish a communication link between the two sensors. In all the experiments, we choose w_s = 0.2 and D_sec = 0.6·Rc based on heuristics. Both the VFA and VFSec algorithms are run for seven iterations. In the first experiment, we place 36 sensor nodes uniformly in a 50 x 50 area. Using VFA and VFSec, the locations of the sensors are adjusted, and the sensing coverage and the number of secure links per node are recorded. We repeat this experiment 400 times and compute the average coverage and number of secure links per node under different sensing and communication radii. From the results shown in Fig. 4.5 we can see that the proposed VFSec algorithm improves the average number of secure links by approximately 15-20%, with a small reduction in sensing coverage of approximately 2-5%.


Algorithm 1 VFSec algorithm
Input: sensor locations {(xi, yi)}, i = 1..n, and key index sets {Ki}, i = 1..n
Output: new locations (x1_opt, y1_opt), ..., (xn_opt, yn_opt)
/* Initialization */
Compute ξ_opt using Eqn. (4.5) with ({(xi, yi)}, {Ki})
(xi_opt, yi_opt) <- (xi, yi) for 1 <= i <= n
/* Iteration */
for iter = 1 to MAX-ITERATION do
    for each sensor node ni do
        Calculate F_ij using the formulation in [145]
        Calculate F_sec_ij using (4.6)
        F_i <- sum over j of (F_ij + F_sec_ij)
    end for
    /* Update sensor locations */
    (xi, yi) <- (xi + F_i^x, yi + F_i^y) for 1 <= i <= n
    Compute ξ using Eqn. (4.5) with ({(xi, yi)}, {Ki})
    if ξ > ξ_opt then
        (xi_opt, yi_opt) <- (xi, yi) for 1 <= i <= n
        ξ_opt <- ξ
    end if
end for
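The sketch below shows the secure-link force term of Eqn. (4.6) that Algorithm 1 adds to the VFA force, together with the metric ξ of Eqn. (4.5) used in its accept step. The values w_s = 0.2 and D_sec = 0.6·Rc follow the experiments described above, while share_key stands in for the shared-key discovery outcome.

# Sketch: VFSec's additional force term (Eqn. 4.6) and the combined
# metric xi (Eqn. 4.5). share_key is a placeholder for key discovery.

import math

def vfsec_force(ni, neighbors, share_key, rc, w_s=0.2):
    """F_sec on ni: pull toward key-sharing neighbors in (D_sec, Rc)."""
    d_sec = 0.6 * rc
    fx, fy = 0.0, 0.0
    for nj, shared in zip(neighbors, share_key):
        dx, dy = nj[0] - ni[0], nj[1] - ni[1]
        d = math.hypot(dx, dy)
        if shared and d_sec < d < rc:
            mag = w_s * (d - d_sec)          # keep key-sharing pairs close
            fx, fy = fx + mag * dx / d, fy + mag * dy / d
    return fx, fy

def xi(coverage, n_sec, w1=1.0, w2=1/3):
    """Combined metric xi = w1*coverage + w2*N_sec (Eqn. 4.5)."""
    return w1 * coverage + w2 * n_sec

# A layout is kept in Algorithm 1 only if it improves xi. With the
# Fig. 4.4 numbers, VFA's move would be rejected:
print(xi(0.85, 1.6) > xi(0.70, 2.5))   # False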


In addition, VFSec consistently achieves better performance in terms of the overall performance metric ξ in Eqn. (4.5). In the second experiment, we compare VFSec and VFA using the square lattice deployment under Gaussian deployment deviation. We fix the node density and the deviation parameter σ = 0.4·D1. From Fig. 4.7, we can see that VFSec improves the average number of secure links per node, with a small compromise in the sensing coverage. In this experiment, we have excluded the boundary nodes of the square deployment area in computing the sensing coverage and number of secure links; thus the results can be viewed as if the performance were evaluated in an infinitely large area. In spite of this difference in accounting for performance, the results in Fig. 4.7 show the same trend as in Fig. 4.5. In the third experiment, we compare the square and hexagon lattice deployments using the corresponding location-based key pre-distribution. We use the proposed VFSec algorithm for location updating, and the results are obtained for different communication and sensing radii under small deployment deviation (σ/D1 = 0.2). Fig. 4.6 shows that the hexagon and square lattices achieve comparable sensing coverage. In terms of the average number of secure links per node, the hexagon lattice achieves better performance when the normalized communication radius Rc/D1 is in the approximate range [1, 1.5]; outside this range, the square lattice performs better. Such a result again shows that there is no all-time winner among deployment lattices, as shown by the step-function connectivity graph in Fig. 4.2 for the ideal hexagon and square lattices. When designing secure sensor networks, the deployment lattice as well as system parameters, such as the communication radius, should be taken into consideration.


Implementation Issues

Similar to the VFA algorithm, VFSec can be performed by the cluster head of the sensor nodes. As indicated in Algorithm 1, the cluster head needs to collect the sensors' key index sets and their current locations. The key indices are IDs assigned to secret keys and are used in the shared-key discovery phase of key pre-distribution schemes [34, 16]. The iterations for updating sensor locations are performed by the cluster head, and only the final results are sent back to the sensor nodes. The actual location adjustment is performed only once by each sensor. When a cluster head is not available, the algorithm can be performed by the individual sensors based only on their neighborhood information. In this case, Algorithm 1 must be adjusted to suit the distributed implementation. The sensors need to perform movement adjustment after each iteration. At the same time, computing and comparing the global performance metric ξ_opt as in Algorithm 1 would not be feasible, and the number of iterations must be limited to reduce the power consumption of sensor movement.

4.3.2 Sensor Location Adjustment Based on Vector Quantization

One limitation of the VFSec algorithm is that a better secure connectivity is achieved at some sacrifice of sensing coverage. The reason is that the virtual-force based approach reduces the problem in a two-dimensional area to a set of vectors. In this part, we propose a new approach for updating sensor locations that explores more freedom in the two-dimensional space to jointly optimize sensing coverage and secure communications.


Figure 4.5: Comparison of VFA and VFSec with uniform random initial deployment: coverage ratio versus sensing radius Rs; average number of secure links per node versus communication radius Rc = 2Rs; and overall performance ξ versus sensing radius.


Figure 4.6: Comparison of deployment lattices using VFSec under Gaussian deployment deviation (σ/D1 = 0.2): coverage ratio τ versus normalized sensing radius Rs/D1, and number of secure links N_sec versus normalized communication radius Rc/D1.


The problem of covering a region using distributed sensor nodes is analogous to the vector quantization problem in signal compression [38]. In the sensing coverage problem, each node can sense its nearby region with certain accuracy; the goal is to maximize the total coverage given a limited number of sensors. In the quantization problem, each point in the k-dimensional space is associated with a representative point in the codebook; the goal is to use a limited number of points to represent all points in the region with minimum error. In two-dimensional space, if the input signal is statistically uniformly distributed, the minimum-error quantization lattice and the most efficient covering lattice are the same hexagon lattice [20], as we have seen in Fig. 4.1(b). This has motivated us to employ insights from the vector quantization literature to explore solutions for sensor deployment.

The Weighted Centroid Algorithm

Several prior works have proposed location updating algorithms that are similar to the two-dimensional vector quantization solution [74, 120]. In particular, the MinMax algorithm proposed in [120] computes the Voronoi cell V for each sensor node n, and moves the sensor to the minmax location x_minmax so that the maximum distance from the new location to any point in the cell V is minimized, i.e.,

    x_minmax = arg min over x of { max over y in V of d(x, y) }.

It has been shown in [120] that the minmax location is the center of the minimum-radius circum-circle of the Voronoi cell associated with each node. Inspired by these works, we propose a new approach for updating sensor locations based on the Lloyd-Max quantization algorithm [38]. We consider that each sensor has a communication range Rc and knows the locations of its neighbors and its own location [74]. Furthermore, the proposed approach allows the


Figure 4.7: Comparison of VFA and VFSec using the square deployment lattice under Gaussian deployment deviation (σ/D1 = 0.4): coverage ratio τ versus normalized sensing radius Rs/D1, and number of secure links N_sec versus normalized communication radius Rc/D1.


Figure 4.8: Illustration of the weight assignment in the weighted centroid algorithm. The rings of radii 0.7Rc and 0.95Rc are drawn around the neighbor node that shares a key with the center node; the weighted centroid marks the updated location.

sensors to take secure communication as a factor in updating locations. Our proposed algorithm aims at minimizing the weighted average distance of a sensor node to the points in its Voronoi cell. We choose a weighted square distance as the distance metric. Suppose that in the two-dimensional space there are N points uniformly distributed at locations {(xi, yi)}, i = 1..N, inside the Voronoi cell formed by a sensor node located at (x0, y0) and its neighbors. Each point is associated with a weight wi. Then the weighted square distance Dw is

    Dw = (1/N) · sum over i = 1..N of wi·[(x0 - xi)² + (y0 - yi)²].      (4.7)

From classic vector quantization results [38], we know that given the set of N points {(xi, yi)} and the weights {wi}, the optimal value of (x0, y0) that minimizes the weighted distance Dw is

    x0_opt = (sum of wi·xi) / (sum of wi);   y0_opt = (sum of wi·yi) / (sum of wi).      (4.8)


For the dual problem, we know directly from the definition of a Voronoi cell that, for any point p located inside the Voronoi cell Vi of sensor ni, the node ni is closer to the point p than any other node outside the Voronoi cell Vi. Thus we have naturally obtained an iterative algorithm for location updating. The proposed algorithm works as follows. In each iteration, each sensor ni discovers its neighbors and generates its Voronoi cell Vi according to the neighbor locations. Next, the sensor node generates a set of uniformly distributed grid points {(xi, yi)}, i = 1..N, inside Vi and assigns a weight to each point. Then the node computes its new location that minimizes the weighted square distance Dw according to Eqn. (4.8). The simplest weight assignment is to assign an equal weight of one to all points. When different weights are assigned to the sampling grid points, the solution (x0_opt, y0_opt) is the centroid of the Voronoi cell with respect to the weight assignment {wi}. Therefore we refer to the algorithm as the weighted centroid (WTC) algorithm and describe it in Algorithm 2. To jointly consider secure communication and sensing coverage, we propose the following weight assignment procedure. For each sensor node ni, after the Voronoi cell has been formed and the grid points are generated, the base weight for each grid point inside the cell is 1. If the node ni already has a secure link with a neighbor node nj, each grid point that falls into the ring area centered at nj between radii 0.7·Rc and 0.95·Rc is assigned an extra weight of w_sec = 0.5. An algorithmic description of the weight assignment procedure is presented in Algorithm 3, in which Nb refers to the number of neighbors of a center node nc, and Sec(·,·) is an indicator function that returns true if two nodes have a secure communication link and false otherwise.

Algorithm 2 The Weighted Centroid Algorithm
Input: sensor location (x_0, y_0), neighbor locations {(x_i, y_i)}, i = 1..N_b
Output: movement vector v
1: Compute the Voronoi cell V
2: Generate uniform grid points {(x_i, y_i)}, i = 1..N, inside V
3: Assign weights {w_i}, i = 1..N, using Alg. 3
4: Compute the updated location (x_0', y_0') using (4.8)
5: v ← (x_0', y_0') − (x_0, y_0)
6: /* adjustment for stability */
7: if |v| > R_s/2 then v ← R_s · v / (2|v|) end if

Algorithm 3 The Weight Assignment Procedure
Input: neighbor locations {x_i}, i = 1..N_b; sampling points {p_j}, j = 1..N; R_c; and w_sec
Output: weight vector {w_j}, j = 1..N
1: w_j ← 1 for 1 ≤ j ≤ N
2: for i = 1 to N_b do
3:   for j = 1 to N do
4:     if Sec(n_i, n_c) and 0.7R_c ≤ d(x_i, p_j) ≤ 0.95R_c then
5:       w_j ← w_j + w_sec
6:     end if
7:   end for
8: end for
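To make the procedure concrete, the following is a minimal Python sketch of one WTC iteration combining Algorithms 2 and 3. It is a sketch under stated assumptions, not the dissertation's implementation: the Voronoi cell is approximated by keeping only the grid points closer to the center node than to any neighbor, and the name wtc_step and all parameter defaults are illustrative.

import numpy as np

def wtc_step(center, neighbors, secure, Rc, Rs, grid_step=0.25, w_sec=0.5):
    """One weighted-centroid (WTC) location update; a sketch of Algs. 2-3.

    center:    (2,) array, current sensor location (x0, y0)
    neighbors: (Nb, 2) array of neighbor locations
    secure:    length-Nb boolean list, True if a secure link exists
    """
    center = np.asarray(center, dtype=float)
    neighbors = np.asarray(neighbors, dtype=float)

    # Sample a uniform grid around the sensor and keep the points that lie
    # in its Voronoi cell (closer to the center than to every neighbor).
    span = np.arange(-Rc, Rc + grid_step, grid_step)
    gx, gy = np.meshgrid(center[0] + span, center[1] + span)
    pts = np.column_stack([gx.ravel(), gy.ravel()])
    d_center = np.linalg.norm(pts - center, axis=1)
    d_nbrs = np.linalg.norm(pts[:, None, :] - neighbors[None, :, :], axis=2)
    in_cell = d_center <= d_nbrs.min(axis=1)
    pts, d_nbrs = pts[in_cell], d_nbrs[in_cell]

    # Alg. 3: base weight 1; add w_sec for points in the ring
    # [0.7 Rc, 0.95 Rc] around each secure-link neighbor.
    w = np.ones(len(pts))
    for j, is_sec in enumerate(secure):
        if is_sec:
            ring = (0.7 * Rc <= d_nbrs[:, j]) & (d_nbrs[:, j] <= 0.95 * Rc)
            w += w_sec * ring

    # Eqn. (4.8): weighted centroid of the cell, then the stability clamp
    # from Alg. 2 (move at most Rs/2 per iteration).
    new_loc = (w[:, None] * pts).sum(axis=0) / w.sum()
    v = new_loc - center
    if np.linalg.norm(v) > Rs / 2:
        v = v * (Rs / (2 * np.linalg.norm(v)))
    return center + v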


In Fig. 4.8 we illustrate the weight assignment procedure. In this figure, the center node is shown as a square, its neighbor nodes are shown as circles, and the node that already has a secure link with the center node is shown as a plus sign. The Voronoi cell is shown as the shaded area. The grid points are shown either as crosses or as dots, where a dot indicates that the grid point is inside the ring area between radii 0.7R_c and 0.95R_c of its secure-communication neighbor. The weighted centroid is shown as a diamond in Fig. 4.8. We can see that the updated location lies within the center of the ring area, while at the same time it tends to cover more area in the Voronoi cell. The choice of the ring area [0.7R_c, 0.95R_c] is due to the joint consideration of sensing and communications. When the center node is far away from its neighbor, the ring-based weighting tends to pull the center node towards its neighbor. When the center node is too close to its neighbor, the ring-based weighting tends to push the center node away from its neighbor. Thus this weight assignment maintains the communication connectivity between the center node and its neighbors, and at the same time avoids too much wasteful overlap between their sensing regions.

Simulation Results and Discussions

We study the performance of the weighted centroid (WTC) algorithm using several experiments. We compare it with the MinMax algorithm proposed in [120], which is known as one of the best schemes in sensor location updating for improving sensing coverage. In Fig. 4.9 and Fig. 4.10 we compare the sensing coverage ratio and the average number of secure links per node achieved by the proposed WTC algorithm and by MinMax, with respect to the normalized sensing/communication radius.

We set the communication radius R_c = 2R_s. The initial deployment uses the hexagon lattice and the key pre-distribution uses the location-based scheme. Each node has preloaded 100 keys. Both algorithms are run locally by each sensor for four iterations. In these experiments we have excluded all boundary nodes, which allows the results to be interpreted as the expected performance in an infinitely large deployment field. The comparison results can be summarized as follows: (1) when the sensing/communication radius is small, MinMax outperforms WTC in both sensing coverage and the average number of secure links; (2) as the sensing/communication radius becomes moderately large, WTC outperforms MinMax in both performance categories; (3) when the sensing/communication radius becomes large enough, the performances of the two schemes converge. These results can be interpreted from resource allocation and optimization perspectives [103]. The minmax criterion employed by the MinMax algorithm emphasizes fairness, i.e., even when a point in the Voronoi cell is very far away from the current sensor location, the location adjusting algorithm tries to cover that point. In contrast, the criterion employed by WTC emphasizes efficiency. It tries to minimize the weighted square distance from the sensor node to all points in its Voronoi cell, which is a more greedy philosophy compared to the minmax criterion. In sensor networks, the sensing and communication ranges are valuable resources to be allocated to the deployment field. The results shown in Fig. 4.9 and Fig. 4.10 indicate that in a resource-scarce situation (relative to the resources needed for full coverage/connectivity), MinMax is the better criterion, while with a moderate amount of resources, WTC outperforms MinMax. To quantify the demarcation between resource-scarce and resource-abundant situations, we note that in the ideal hexagon lattice, the normalized sensing radius needs to be

[Plots: sensing coverage vs. R_s/D_1 (top) and average number of secure links N_sec vs. R_c/D_1 (bottom) for WTC and minmax; hexagon lattice, σ/D_1 = 0.2.]

Figure 4.9: Comparison of the weighted centroid and minmax algorithm, small Gaussian deployment deviation, hexagon lattice deployment and location-based key pre-distribution.

at least $R_s/D_1 = \sqrt{2/\sqrt{27}} \approx 0.62$ to achieve full coverage, and the normalized communication radius needs to be at least $R_c/D_1 = \sqrt{2/\sqrt{3}} \approx 1.07$ to achieve full connectivity with the neighbors, which is a pre-requisite for establishing secure links. As shown in Fig. 4.9 and Fig. 4.10, the proposed WTC usually outperforms the MinMax algorithm when the normalized sensing and communication radii are beyond their respective thresholds of 0.62 and 1.07. In practical situations, since the resource budgets are known prior to the design of sensor networks, dynamically determining which criterion to use will best serve the purpose of improving sensing coverage and establishing secure links. In a separate experiment, we simulated the WTC and MinMax algorithms under random deployment with uniform distribution over the entire field. We place a total of 49 sensors into a 60×60 area and use the basic key pre-distribution scheme for establishing secure links. In this experiment, as it is not possible to exclude the boundary nodes in calculating the average node degree, the average number of secure links drops significantly when compared to the previous lattice-based experiments. In spite of the change in how the performance is accounted, the simulation results presented in Fig. 4.11 show the same trend as Fig. 4.9 and Fig. 4.10: the WTC algorithm achieves better performance in both performance categories in resource-abundant situations, and performs worse than MinMax in resource-scarce situations.

Implementation Issues

In the WTC algorithm, the grid points are chosen to discretize the computation of the centroid instead of using a continuous integration over the Voronoi cell. As power consumption is a major concern in sensor networks, the grid points

[Plots: sensing coverage vs. R_s/D_1 and average number of secure links N_sec vs. R_c/D_1 for WTC and minmax; σ/D_1 = 0.4.]

Figure 4.10: Comparison of the weighted centroid and minmax algorithm, large Gaussian deployment deviation, hexagon lattice deployment and location-based key pre-distribution.

[Plots: coverage ratio vs. R_s/D_1 and average number of secure links vs. R_c/D_1 under uniform random deployment, for WTC and MinMax with basic key pre-distribution.]

Figure 4.11: Comparison of the weighted centroid and minmax algorithm, uniform random deployment with basic key pre-distribution.

can be chosen sparsely or densely according to the power budget, thus trading off computation accuracy against energy. If the only goal of location adjustment is to maximize the coverage, the same weight can be assigned to all points, i.e., w_i = 1 for all i. Unlike the VFSec algorithm, WTC is more suitable to be performed by individual sensors, because both the Voronoi cell and the weighted centroid can be computed locally. The simulation results have shown that the locally computed location updates using WTC and MinMax can outperform the location updates run by a cluster head using VFA and VFSec. However, using the grid-based method in computing the weighted centroid, performing WTC generally requires more computation than performing schemes based on virtual force. One way to reduce the computation is to decompose the weighted centroid in Algorithm 2 into a weighted summation of several uniform-weight centroid problems, and to compute the centroid of each uniform-weighted polygon (or ring area) using the closed-form solution provided in [22] and other related work. We plan to further look into this aspect in our future work.

4.4 Chapter Summary

In this chapter, we have investigated the impact of sensor deployment on the performance of sensing coverage and secure connectivity. For static sensor deployment, we have investigated the hexagon and square lattice topologies and compared them with random deployment. We show that the two lattice topologies exhibit range-dependent performance and there is no all-time winner in the context of secure connectivity. For designing secure sensor networks, the system parameters, such as sensing and communication range, should be jointly considered with the

deployment topology. When sensor locations can be adjusted after the initial deployment, we have proposed two sensor location updating algorithms, the VFSec and the WTC algorithm, to jointly optimize sensing coverage and secure connectivity. The simulation results show that the WTC algorithm outperforms the existing algorithms in both secure connectivity and sensing coverage under moderate to abundant node density, while VFSec achieves a better tradeoff between the two performance categories than the existing virtual-force-based algorithms.

4.5 Appendix: A Model for Link Compromise Probability

In this appendix, we analyze the link compromise probability of the location-based key pre-distribution scheme in [30] and compare it with that of the basic scheme [34]. Let us denote the size of the key pool in the basic scheme by P, and suppose that x nodes are compromised. For a given link e in the basic scheme, it has been shown in [34] that the probability that the link e is compromised is

$$P_{basic} = \Pr(e \mid x) = 1 - \left(1 - \frac{m}{P}\right)^x,$$

where m is the key ring size.

Next, we consider the group-based scheme [30], where the size of the group key pool is S and the number of sensor groups is N. We require the total number of distinct keys in the group-based scheme to be the same as in the basic scheme. As each distinct key appears in exactly two group key pools, we have P = NS/2. In the group-based scheme, suppose the given link e uses key K_e. This key K_e is in the key pools of exactly two groups, denoted by G_1 and G_2. Only when a compromised node n^(c) is from one of these two groups can the link e potentially be compromised by n^(c). When the compromised nodes are i.i.d. uniformly distributed among all groups, denoting the probability that a compromised node n^(c) falls into group G_1 or G_2 by p, we have

$$p = \Pr(n^{(c)} \in \{G_1 \cup G_2\}) = 2/N.$$

In the group-based scheme, the probability that link e is compromised given that x nodes are compromised is

$$P_{group} = \Pr(e \mid x) = \sum_{k=0}^{x} \Pr(e \mid k \text{ of the } x \text{ nodes} \in \{G_1 \cup G_2\}) \cdot \Pr(k \text{ of the } x \text{ nodes} \in \{G_1 \cup G_2\}) = \sum_{k=0}^{x} \left[1 - \left(1 - \frac{m}{S}\right)^k\right] \binom{x}{k} p^k (1-p)^{x-k}.$$

For the function $f(\epsilon) = (1 - \epsilon)^a$ with $\epsilon \ge 0$, the first-order approximation at $\epsilon = 0$ using a Taylor expansion is $f(\epsilon) \approx 1 - a\epsilon$. As the key pool size P and the group key pool size S are much larger than the key ring size m, both m/P and m/S are close to zero, so we can apply the first-order approximation to both $(1 - m/P)^x$ and $(1 - m/S)^k$:

$$\left(1 - \frac{m}{P}\right)^x \approx 1 - \frac{m}{P}x \quad \text{and} \quad \left(1 - \frac{m}{S}\right)^k \approx 1 - \frac{m}{S}k.$$

Thus we arrive at

$$P_{basic} \approx 1 - \left(1 - x \cdot \frac{m}{P}\right) = x \cdot \frac{m}{P}, \tag{4.9}$$

$$P_{group} \approx \sum_{k=0}^{x} \frac{m}{S} k \binom{x}{k} p^k (1-p)^{x-k} = \frac{mpx}{S}.$$

Since we have S = 2P/N and p = 2/N, substituting these into Eqn. (4.9), we obtain

$$P_{group} \approx x \cdot \frac{m}{P} \approx P_{basic}.$$

This shows that the link compromise probabilities of the basic scheme and the group-based scheme are approximately the same for a fixed key ring size m.
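As a numerical sanity check on this approximation (an illustration added here, not part of the original analysis), the short Python script below evaluates the exact expressions for P_basic and P_group side by side; the parameter values are arbitrary.

from math import comb

def p_basic(m, P, x):
    # Exact basic-scheme link compromise probability: 1 - (1 - m/P)^x.
    return 1 - (1 - m / P) ** x

def p_group(m, P, N, x):
    # Exact group-based probability from the binomial mixture above,
    # with S = 2P/N and p = 2/N.
    S, p = 2 * P / N, 2 / N
    return sum((1 - (1 - m / S) ** k) * comb(x, k) * p**k * (1 - p) ** (x - k)
               for k in range(x + 1))

# Illustrative parameters: 100-key rings, 100,000-key pool, 20 groups.
m, P, N = 100, 100_000, 20
for x in (10, 50, 100):
    print(x, round(p_basic(m, P, x), 4), round(p_group(m, P, N, x), 4))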

Chapter 5

Optimizing Time Efficiency in Contributory Group Key Management
The complexity of communication security protocols is often determined by multiple factors, such as the type of cryptographic primitives used, the connectivity and reliability of the communication links, and the number of users who intend to communicate. Among these factors, the scalability of a communication security protocol with regard to the user group size is especially important. Such scalability greatly influences system efficiency and user satisfaction, and emerging communication applications may involve user groups ranging from thousands to millions [42][69][88][129]. In this chapter, we introduce scheduling and optimization techniques into the design phase of a special class of communication security protocols, group key management protocols. We demonstrate that with these techniques, we can significantly improve the scalability of contributory group key management, and hence improve system efficiency and satisfy users' demands.

An important aspect of communication security is content confidentiality and access control [82], which becomes a necessity in a wide range of applications, including bank transactions, teleconferencing, and data collection in sensor networks [87][34]. For secure group-oriented applications, access control is a challenging task due to the potentially large group size and dynamic membership. To achieve confidentiality in group communications, usually a key known to all group members is used to encrypt the communication content [13][51]. This key is usually referred to as the group key [42, 88, 129]. In a group with dynamic membership, the group key needs to be updated upon each user's join to prevent the new user from accessing the past communications. Similarly, upon each user's departure, the group key needs to be updated to prevent the leaving user from accessing the future communications. Thus group members need to agree upon the same key management protocol for key establishment and update. Sometimes the group key management protocol is also referred to as the group key agreement. In Chapter 4, we have seen probabilistic key pre-distribution in sensor networks, which does not guarantee that two nodes who try to communicate share a secret key. In deterministic key management, a group key management scheme follows either a centralized or a contributory approach and can maintain a shared key between the information sender and recipient. The centralized approach uses a central key server to generate and distribute keys for all group members [17][119][129], whereas in the contributory approach, each group member contributes his/her own share to the group key [105, 54, 29]. Since contributory schemes do not rely on a central key server, they become necessary in situations where: (a) a central key server cannot be established, such as in ad hoc networks, (b) group members do not trust another entity to manage their private keys, or (c) members and server do

not share any common knowledge about each other's secret keys beforehand. Contributory schemes remove the need for a key server at the expense of performing computationally expensive cryptographic primitives, such as modular multiplication and exponentiation [51][33]. This poses a challenge to the design of efficient key agreements. In the literature, many group key management protocols have been proposed [129, 88, 42, 119, 17, 105, 54, 29, 107, 81, 80, 6, 118, 114, 106, 144, 68]. The early designs of contributory key agreements mostly consider the efficiency of key establishment [45][104][8]. Among them, Ingemarsson et al. first introduced a conference key distribution system (CKDS) based on a ring topology [45]. Later, Burmester and Desmedt proposed a key distribution system (BD) that takes only three rounds to generate a group key [11]. Steiner et al. extended the two-party Diffie-Hellman (DH) protocol and proposed the group Diffie-Hellman protocols GDH.1/2/3 [104]. Becker and Wille studied the minimum communication complexity of contributory key agreements and proposed the octopus and 2^d-octopus protocols [8], which have proven optimality for key establishment. While achieving efficiency in key establishment, most of these early schemes encounter high rekeying complexity upon either member join or departure. Recent research on key management has become more aware of the scalability issue. As a means to improve scalability, a tree-based approach for group rekeying was first presented in the centralized scenario by Wallner et al. in [119] and Wong et al. in [129], independently. Later, tree-based schemes were also proposed for the contributory setting by Kim et al. in their TGDH scheme [54], and by Dondeti et al. in their DISEC scheme [29]. The tree-based schemes use a logical key tree to organize the keys belonging to the group members and achieve a rekeying complexity of O(log n) [129][54][29][114], where

n is the group size. In addition, [54] and [29] also pointed out that the rekeying cost is related to both the key tree structure and the location of member join or departure in the key tree, and suggested a balanced key tree to reduce the rekeying cost based on heuristics. In [144], Zhu et al. proposed two schemes to optimize the rekeying cost in centralized key management, where the key tree structure is reorganized according to the temporal patterns of the group members, or the packet loss probability along the route from the key server to each member. In this chapter, we investigate the time efficiency of contributory key agreement. The time efficiency is measured by the processing time in group key establishment and update. In order to participate in the group communications, a joining user has to wait until the group keys are updated. Since computing cryptographic primitives and exchanging rekeying messages are time-consuming, such waiting time is not negligible. Similarly, the amount of time needed to recompute a new group key reflects the latency in user revocation. Thus, from a quality of service (QoS) perspective, the rekeying time cost is directly related to users' satisfaction and a system's performance. Traditionally, the rekeying time complexity is analyzed only for one join or departure event. The design rationale of our scheme is to look into the combination of multiple events, and to optimize the time cost over the dynamics of group membership. To improve the time efficiency, we design a new key tree topology with join and exit subtrees, which are small subtrees located close to the root of the key tree. With this key tree topology, we propose a set of algorithms to handle the key update for join and leave events. In particular, we show through analysis that the sizes of the join and exit trees should be on the order of the logarithm of the group size. The resulting scheme is called the Join-Exit Tree (JET) Group Key Agreement. Analytical results show that the proposed scheme achieves an average asymptotic

time cost of O(log(log n)) for a join event, and also O(log(log n)) for a departure event when group dynamics are known a priori. In addition to the improved time efficiency, our scheme also has low communication and computation complexity.

5.1 Efficiency Aspects in Contributory Key Agreement

5.1.1 Background on Tree-based Contributory Key Management

We briefly review rekeying operations for join and leave events in tree-based contributory key agreements [29][54], which use the two-party DH protocol [28] as a basic module. In a tree-based key agreement, three types of keys are organized in a logical key tree, as illustrated in Fig. 5.1(a). The leaf nodes in a key tree represent the private keys held by individual group members. The root of the tree corresponds to the group key. All other inner nodes represent subgroup keys, each of which is held by the group members that are descendants of the corresponding inner node. We denote the i-th group member by M_i, and the key associated with the j-th node in the key tree by K_j. In addition, g and p are the exponentiation base and the modular base for the DH protocol, respectively. To establish a group key, the keys in the key tree are computed in a bottom-up fashion. Users are first grouped into pairs and each pair performs a two-party DH to form a subgroup. These subgroups again pair up and perform the two-party DH to form larger subgroups. Continuing in this way, the final group key can be obtained.

Figure 5.1: Notations for a key tree. [Panel (a): a key tree with members M1-M4 at leaf nodes 4-7; panel (b): user join, where new member M5 (node 9) pairs with M4 (node 7) to form a new intermediate node 8.]

An example is shown in Fig. 5.1(a) with four group members, where member M_i has private key r_i. The group key K_1 corresponding to node 1 is computed in two rounds as

$$K_1 = g^{(g^{r_1 r_2} \bmod p)(g^{r_3 r_4} \bmod p)} \bmod p.$$

In a user join event, the new user will first pair up with an insertion node, which could be either a leaf node or an inner node, to perform a two-party DH. Then all the keys on the path from the insertion node to the tree root are updated recursively. An example is shown in Fig. 5.1. When member M5 joins the group, node 7 in Fig. 5.1(a) is chosen as the insertion node. Then M4 (node 7) and M5 (node 9) perform a DH key exchange to generate a new inner node 8 in Fig. 5.1(b), followed by the key updates on the path node 8 → node 3 → node 1. Upon a user's departure, the leaving user's node and its parent node are deleted from the key tree, and its sibling node assumes the position of its parent node. Then all the keys on the path from the leaving user's grandparent node to the tree root are updated from the bottom to the top.
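The bottom-up computation can be illustrated with a small Python sketch of a balanced binary DH key tree. This is a simplified, centralized simulation of the outcome, assuming a power-of-two number of members and toy parameters; in the real protocol each party only learns its sibling's blinded key g^k mod p, never the other side's exponent.

# Toy parameters for illustration only; a real system would use a large,
# cryptographically chosen prime p and generator g.
P, G = 2_147_483_647, 5   # 2^31 - 1, a Mersenne prime

def dh_shared(k_left, k_right):
    # Two-party DH outcome: with private exponents k_left and k_right,
    # both parties derive g^(k_left * k_right) mod p (each raises the
    # other's blinded key g^k mod p to its own exponent).
    return pow(G, k_left * k_right, P)

def tree_key(keys):
    """Group key of a balanced binary key tree, computed bottom-up.

    `keys` holds the leaf (member) private keys; its length is assumed
    to be a power of two. In each round, adjacent subgroups pair up and
    the subgroup key becomes the private exponent at the next level,
    as in K1 = g^{(g^{r1 r2} mod p)(g^{r3 r4} mod p)} mod p.
    """
    while len(keys) > 1:
        keys = [dh_shared(keys[i], keys[i + 1]) for i in range(0, len(keys), 2)]
    return keys[0]

# Example: four members with private keys r1..r4; two rounds suffice.
print(tree_key([11, 13, 17, 19]))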

5.1.2 Time-Efficiency Issues in Contributory Key Agreements

The time efficiency of DH-based contributory group key agreement is usually evaluated by the number of rounds needed to perform the protocol during a key update [45, 104, 54, 29]. However, in some schemes, the number of operations may differ across rounds. For example, in GDH.2 [104], i modular exponentiations are performed in the i-th round. To address this problem, the notion of "simple round" was introduced in [8], where every party can send and receive at most one message in each round. In our work, we apply the notion of simple round to the tree-based contributory schemes: in each round, each user can perform at most one two-party DH operation. With this definition of a round, we propose performance metrics for time efficiency below.

Average Join/Leave Time

We define the user join time as the number of rounds needed to process key updates for a user join event. The average user join time, denoted by T_join, is defined as

$$T_{join} = \frac{R_{join}}{N_{join}}, \tag{5.1}$$

where R_join is the total number of DH rounds performed for N_join join events. Similarly, the user leave time is defined as the number of rounds needed to process key updates for a user leave event. The average user leave time, denoted by T_leave, is defined as

$$T_{leave} = \frac{R_{leave}}{N_{leave}}, \tag{5.2}$$

where R_leave is the total number of DH rounds performed for N_leave leave events. Let N = N_join + N_leave and R = R_join + R_leave. The overall average processing time T is defined as

$$T = \frac{R}{N}, \tag{5.3}$$

where T can also be interpreted as a weighted average of T_join and T_leave:

$$T = \frac{N_{join}}{N} T_{join} + \frac{N_{leave}}{N} T_{leave}.$$

5.1.3 Communication and Computation Efficiency

The communication efficiency of a contributory key agreement refers to the number of messages sent for a key update during a join or leave event. The underlying assumption is that sending each message incurs about the same communication cost. In practice, the size of each message could be different. However, the main cost in sending a rekeying message is the cost in software and hardware to go through the protocol stack and form a packet, along with the cost in networks for routing and transmitting the packet. Similar to the case of time efficiency, we choose the average number of messages per user join or departure as the performance metric for communication efficiency. In a DH-based contributory group key agreement, the computation of modular exponentiation dominates the total computation cost. Therefore we use the average number of exponentiations per join or departure event as the performance metric for computation efficiency.

5.2 Join-Exit Tree (JET) Algorithms

In this section, we present a new logical key tree topology and the associated algorithms to achieve better time efficiency in contributory key agreement.

Figure 5.2: Topology for the proposed join-exit tree. [Panel (a): join, exit, and main tree; panel (b): join and main tree.]

As shown in Fig. 5.2(a), the proposed logical key tree consists of three parts: the join tree, the exit tree, and the main tree. The proposed key tree is a binary tree built upon the two-party DH protocol. We refer to the key tree in Fig. 5.2(a) as a join-exit tree, and to a key tree without special structures as a simple key tree. Prior works have shown that if a user joins the group at a location closer to the tree root, fewer keys need to be updated, and thus the join time will be shorter. Similar reasoning applies to user departures. Hence the join and exit trees should be much smaller than the main tree. We define the join tree capacity and the exit tree capacity, denoted by C_J and C_E, as the maximum numbers of users that can be accommodated in the join and exit tree, respectively. The numbers of users in the join tree and the main tree are denoted by N_J and N_M, respectively. In the proposed scheme, a joining user is first added to the join tree. Later on, when the join tree reaches its capacity, all users in the join tree are relocated together into the main tree. In addition, when users' departure times are known, users who are most likely to leave in the near future are moved in batch from the main tree to the exit tree. The design rationale of the join and exit trees resembles that of the memory hierarchy in computer design [43]. Furthermore, the


capacities of the join and exit trees can change over time, resulting in a dynamic key tree structure. For example, when there is no user in the exit tree, the key tree reduces to a main tree and join tree topology, as shown in Fig. 5.2(b).

5.2.1 The Join Tree Algorithm

The join tree algorithm consists of four parts: the join tree activation, the insertion strategy, the relocation strategy, and the join tree capacity update. When the group has only a few members, the join tree is not activated. As the group size increases and exceeds a threshold, we activate the join tree and choose an initial join tree capacity. Such a threshold condition is referred to as the activation condition for the join tree. After the activation, any user joining the group is first inserted at a node in the join tree. The insertion node is chosen according to the insertion strategy. When the join tree is full, the members in the join tree are merged into the leaf nodes of the main tree. Such a process is called the batch relocation. Since the number of users in the main tree changes after the batch relocation, the join tree capacity is updated according to a rule that relates the join tree capacity to the number of main tree users. According to this rule, the optimal join tree capacity in the sense of time efficiency can be computed. We explain these four parts in detail below.

User Insertion in the Join Tree

When the join tree is empty and a new user wants to join, the root of the current key tree is chosen as the insertion node. The insertion is done by treating the entire existing group as one logical user, and performing a two-party DH between this logical user and the new user. This process is illustrated in Fig. 5.3, where

Figure 5.3: User join at the join tree root. Note that the new user M5 becomes the root of the join tree.

the new user M5 becomes node 9, the root of the join tree. Member M5 is paired up with the original root of the key tree (node 1) to perform a DH key exchange, and the new group key is established at node 8. When the join tree is not empty, the insertion node is determined by Algorithm 4, where usernumber(x) returns the number of users under a given node x in the key tree. After the insertion node is found, the new member node performs a two-party DH key exchange with the insertion node. Then the keys on the path from the insertion node to the tree root are updated through a series of DH key exchanges. Fig. 5.4 illustrates the growth of the join tree from one user to eight users using this insertion strategy.

Algorithm 4 Finding the insertion node
x ← join-tree-root
while usernumber(x) ≠ 2^k for any integer k do
  x ← rightchild(x)
end while
insertion-node ← x
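A small Python rendering of Algorithm 4, assuming a node object with `users` (subtree user count) and `right` child fields; the names are illustrative. The loop descends along right children until it reaches a subtree holding a power-of-two number of users, where the new member is paired in.

def is_power_of_two(n: int) -> bool:
    # True for 1, 2, 4, 8, ...
    return n > 0 and n & (n - 1) == 0

def find_insertion_node(join_tree_root):
    """Algorithm 4: walk right children until the subtree holds a
    power-of-two number of users; the new member is inserted there."""
    x = join_tree_root
    while not is_power_of_two(x.users):
        x = x.right
    return x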


Figure 5.4: Sequential user join strategy (only the join tree is shown).

The Batch Relocation

We present two relocation methods that differ in whether the subgroup keys in the join tree are preserved. In the first method, all users in the join tree are viewed as one logical user during relocation, and this logical user is inserted at the shortest-depth leaf node of the main tree. Thus, the subgroup keys among the users in the join tree are preserved. This process is shown in Fig. 5.5(a). Then all keys along the path from the insertion node to the tree root are updated, as indicated by the dashed line in Fig. 5.5(a). The reason to choose the shortest-branch leaf node in the main tree as the insertion node is to guarantee that the relocation time is at most the log of the main tree size (log N_M), because the shortest branch must be smaller than or equal to the average length of the branches, which is log N_M.¹ The only exception comes when the main tree is a completely balanced tree, in which case the relocation time is log N_M + 1, because one more level of the key tree must be created to accommodate the new logical user. In the second relocation method, we find the N_J shortest-depth leaf nodes in the main tree as the insertion nodes for the N_J join tree users.

¹ Throughout this chapter, log stands for base-2 logarithm and ln stands for natural logarithm.

Figure 5.5: Relocation methods for the join tree. [Panel (a): Method 1, the join tree is inserted as one logical user at the shortest-depth leaf of the main tree; panel (b): Method 2, the join tree users are relocated individually to the shortest-depth leaf nodes.]

These insertion nodes are chosen so that the imbalance of the key tree can be alleviated by the relocation process. Then we relocate the join tree users simultaneously to the insertion nodes. The keys on the branches from all original join tree users to the tree root are updated in parallel, and finally a new group key is obtained. This process is illustrated in Fig. 5.5(b). To analyze the time complexity, we note that this relocation may fill up the empty positions at the shortest-depth leaf nodes of the main tree. The maximum depth of any relocation path would not exceed ⌈log(N_M + N_J)⌉. Since the join tree is much smaller than the main tree, the relocation time is upper bounded by log N_M + 1. Although the two relocation methods have similar time complexity, the first method will generally produce a skewed main tree. Since users may leave from a branch longer than the average depth of the key tree, an unbalanced key tree may cause the user departure time to be longer than when a balanced key tree is used. The second relocation method helps maintain the balance of the key tree, which reduces the expected cost of leave events [54]. We choose the second relocation method in this work because it takes into consideration both the join and leave time costs.

Table 5.1: Latency of Sequential User Join

k    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | ...
r(k) | 1 | 2 | 2 | 3 | 2 | 3 | 3 | 4 | 2 | 3  | ...

The Optimal Join Tree Capacity

Using the proposed insertion strategy, the user join latency for the k-th user in the join tree is measured as r(k) rounds, which is listed in Table 5.1. We observe a special property of the sequence r(k), namely,

$$r(2^p + q) = 1 + r(q), \quad p \ge 0, \; 0 < q \le 2^p, \tag{5.4}$$

where p is a non-negative integer, and q a positive integer. For the user join latency r(k) in (5.4), the following inequality holds for any positive integer n, with equality achieved when n is a power of 2:

$$\frac{1}{n} \sum_{k=1}^{n} r(k) \le \frac{1}{2} \log n + 1. \tag{5.5}$$

The proof is presented in Appendix 5.7.1. Consider the average join time for x users joining the group starting from an empty join tree. These x users are inserted into the join tree one by one; then they are relocated together into the main tree. From the previous analysis we can see that, when the main tree has N_M users, the average join tree relocation time is log N_M, where we relax the integer value of the tree height to a continuous value to simplify the analysis. Taking into account the relocation time, the average join time for these x users is

$$T_{join} = \frac{1}{x} \left( \sum_{k=1}^{x} r(k) + \log N_M \right). \tag{5.6}$$

Using (5.5), we obtain

$$T_{join} \le \frac{1}{2} \log x + \frac{1}{x} \log N_M + 1. \tag{5.7}$$

Since it is not easy to minimize T_join directly, we try to minimize its upper bound over x. The optimal join tree capacity C_J that minimizes the upper bound is given by

$$C_J = \arg\min_{x>0} \left\{ \frac{1}{2}\log x + \frac{1}{x}\log N_M + 1 \right\} = 2 \ln N_M. \tag{5.8}$$

The above analysis shows that, for a given number of main tree users N_M and the insertion rule specified by Algorithm 4, the optimal join tree capacity C_J is 2 ln N_M. Since between two consecutive join tree relocations the main tree size is fixed at N_M, the join tree capacity should also be fixed during this time, at C_J = ⌈2 ln N_M⌉, and the average join time is upper bounded by

$$T_{join} \le \frac{1}{2}\log\log N_M + \frac{3}{2} + \frac{1}{2\ln 2} - \frac{1}{2}\log\log e. \tag{5.9}$$

This upper bound indicates that, on average, a user needs to spend only O(log(log n)) rounds for a rekeying operation in user join, where n is the group size. We note that this asymptotic performance is not affected by the variation of the relocation time, because the relocation time of around log N_M rounds is averaged over log N_M join events, contributing approximately only one round to the average join cost. This validates the use of the approximate average relocation time log N_M in the above analysis. For the joining users, since they can start to communicate once they are inserted into the join tree, their waiting time does not include the relocation time of log N_M rounds. We refer to the waiting time for the joining users as the user join latency. We

can see that the average user join latency, L_join, is also upper bounded:

$$L_{join} \le \frac{1}{2} \log(\log N_M) - \frac{1}{2} \log\log e + \frac{3}{2}.$$

The Join Tree Activation

To decide whether to activate the join tree, we compare the average join time with and without employing the join tree. For a key tree structure with a join tree, adding each user to the join tree incurs at most a time cost of log C_J rounds. Consider the average user join time for C_J users as the join tree changes from empty to full, followed by a batch relocation of log N_M rounds. The average join time for these C_J users satisfies

$$T_{join} \le \log C_J + (\log N_M)/C_J. \tag{5.10}$$

If a simple key tree with only a main tree is used, the average join time would be at least log N_M. Consequently, a reduction in time cost can be obtained by using the join tree when the following inequality holds:

$$\log C_J + (\log N_M)/C_J \le \log N_M,$$

or equivalently,

$$\log N_M \ge \frac{C_J \log C_J}{C_J - 1}. \tag{5.11}$$

We can see that when the number of users in the group is large enough, a join tree should be activated to reduce the average join time. In Appendix 5.7.2, we show that when C_J = 2 ln N_M, the inequality (5.11) is satisfied for any N_M > 8. Thus we have found a threshold group size TH_join = 8. When the group size is smaller than or equal to 8, a simple key tree is used. Otherwise, the join tree is activated.
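The recursion (5.4), the bound (5.5), and the capacity rule (5.8) are easy to check numerically. The sketch below, with illustrative names only, generates r(k), verifies the bound for the first 1024 group sizes, and evaluates C_J = 2 ln N_M for a few main tree sizes.

import math

def r(k: int) -> int:
    # Join latency sequence from (5.4): r(1) = 1, r(2^p + q) = 1 + r(q).
    if k == 1:
        return 1
    if k & (k - 1) == 0:                  # k = 2^p exactly: take q = 2^(p-1)
        return 1 + r(k // 2)
    q = k - (1 << (k.bit_length() - 1))   # strip the leading power of two
    return 1 + r(q)

# Table 5.1: expect 1 2 2 3 2 3 3 4 2 3 for k = 1..10.
print([r(k) for k in range(1, 11)])

# Bound (5.5): the running average of r(k) is at most (log2 n)/2 + 1,
# with equality when n is a power of two.
total = 0
for n in range(1, 1025):
    total += r(n)
    assert total / n <= 0.5 * math.log2(n) + 1 + 1e-9

# Optimal join tree capacity (5.8): C_J = 2 ln N_M.
print({n_m: round(2 * math.log(n_m), 1) for n_m in (100, 1000, 10000)})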

5.2.2 The Exit Tree Algorithm

In some group applications, users can estimate the duration of their stay according to their own schedules. Such information can help reduce the time cost of rekeying operations upon user departure. In the following analysis, we assume that we can obtain accurate information about users' durations of stay; the cases of inaccurate or unavailable staying time are discussed in later sections. Similar to the join tree algorithm, the exit tree algorithm consists of four parts, namely, the activation condition, the batch movement, the user insertion in the exit tree, and the optimization of the exit tree capacity.

The Batch Movement

The batch movement refers to the operations that move the users who are likely to leave in the near future from the main tree to the exit tree. The group communication is not interrupted, since the old group key can still be used before the batch movement is completed. A batch movement takes place when there is a user leaving from the exit tree and a batch movement condition is satisfied. Denoting the number of users in the exit tree after the last batch movement by U_p, and the current number of users in the exit tree by U_c, we propose the batch movement condition

$$U_c \le \beta U_p, \tag{5.12}$$

where β ∈ [0, 1) is the exit tree residual rate (residual rate for short), a predetermined parameter that controls the timing of batch movements. In a batch movement, the first B users who are most likely to leave soon are moved to the exit tree, where B is referred to as the batch movement size. Starting from an empty exit tree (U_p = 0), the number of users in the exit tree after the k-th batch movement is upper bounded by $\sum_{i=0}^{k-1} \beta^i B$. As k goes to infinity, this bound converges to B/(1 − β). Therefore the exit tree capacity C_E is related to the batch movement size by

$$C_E = B/(1 - \beta). \tag{5.13}$$

We propose to use a priority queue [21] to keep the departure times of all the users in the main tree. This queue is referred to as the leaving queue. The users' departure times are obtained from their arrival times and their estimated staying times. The leaving queue is updated under two circumstances. First, after a batch relocation of the join tree, the departure information of the join tree users is added to the leaving queue. Second, after a batch movement to the exit tree, the departure information of the moved users is removed from the leaving queue.

User Insertion in the Exit Tree

The insertion locations for the users being moved into the exit tree are chosen to maintain the balance of the exit tree. For each user insertion, the leaf node with the minimum depth in the exit tree is chosen as the insertion node.

Optimal Exit Tree Capacity

Here we derive the optimal exit tree capacity that minimizes an upper bound on the average leave time. Suppose that b users are moved together into the exit tree. A batch movement of these b users incurs a time cost of (log N_M + 2), where log N_M is the average height of the main tree, and the addition of 2 accounts for the two extra levels above the main tree due to the use of the join tree and the exit tree (refer to Fig. 5.2(a)). If the exit tree capacity is x, each user leaving from the exit tree incurs at most a time cost of (log x + 2). Thus the average

user leave time for these b users is bounded by

$$T_{leave} \le \frac{1}{b}(\log N_M + 2) + (\log x + 2). \tag{5.14}$$

Using (5.13), b = x(1 − β), and minimizing the right-hand side of (5.14), we obtain

$$C_E = \arg\min_x \left\{ \frac{1}{(1-\beta)x}(\log N_M + 2) + (\log x + 2) \right\} = \frac{\ln N_M + 2\ln 2}{1-\beta}. \tag{5.15}$$

When the capacity of the exit tree is computed as in (5.15), the average leave time is bounded by

$$T_{leave} \le \log(\log N_M + 2) + \delta, \tag{5.16}$$

where δ = 2 − log(1 − β) + log e − log log e. Combining (5.15) and (5.13), we have

$$B = \ln N_M + 2\ln 2. \tag{5.17}$$

A few comments provide more insight into the above analysis. First, the batch movement size B is determined only by the number of users in the main tree, and is independent of the residual rate β. Second, there are actually only two free parameters, B and β, in our system, since the exit tree capacity is a function of B and β as in (5.13). Third, with perfect departure information, the average leave time is bounded by O(log log n), where n is the group size, and the residual rate β should be set to 0 to minimize the upper bound in (5.16). However, in practice, the choice of β is a tradeoff. When β is 0, a batch movement cannot be performed unless the exit tree is completely vacant. If some users inaccurately estimate their departure times and stay in the exit tree for a long period of time, no other users can utilize the exit tree during that period. When β is close to 1, batch movements are performed frequently, resulting in a large overhead. Based on experimental heuristics, we suggest setting β to around 0.5.
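The leaving-queue bookkeeping and the batch movement trigger can be sketched with Python's heapq as the priority queue [21]. This is a structural sketch under the conventions above, with β for the residual rate and B from (5.17); the class and method names are invented for illustration.

import heapq
import math

class ExitTreeManager:
    """Sketch of the leaving queue and batch movement rule (5.12)-(5.17)."""

    def __init__(self, beta=0.5):
        self.beta = beta          # residual rate, suggested ~0.5
        self.leaving_queue = []   # min-heap of (departure_time, user_id)
        self.exit_users = set()
        self.users_after_last_move = 0   # U_p in (5.12)

    def add_departure_info(self, user_id, departure_time):
        # Called after a join tree batch relocation, for users reporting EST.
        heapq.heappush(self.leaving_queue, (departure_time, user_id))

    def on_exit_tree_leave(self, user_id, n_main):
        self.exit_users.discard(user_id)
        # Batch movement condition (5.12): U_c <= beta * U_p.
        if len(self.exit_users) <= self.beta * self.users_after_last_move:
            self.batch_move(n_main)

    def batch_move(self, n_main):
        # Batch movement size (5.17): B = ln N_M + 2 ln 2.
        b = max(1, round(math.log(n_main) + 2 * math.log(2)))
        for _ in range(min(b, len(self.leaving_queue))):
            _, user_id = heapq.heappop(self.leaving_queue)
            self.exit_users.add(user_id)   # earliest departures move first
        self.users_after_last_move = len(self.exit_users)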

The Activation of the Exit Tree

The average leave time using a simple key tree with N_M users is log N_M. Comparing this result with the upper bound in (5.14), a reduction in the average leave time can be obtained if

$$\frac{1}{(1-\beta)C_E}(\log N_M + 2) + (\log C_E + 2) \le \log N_M. \tag{5.18}$$

Using (5.15), we simplify the above condition as

$$\log N_M \ge \log C_E + \log e + 2. \tag{5.19}$$

Similar to the case of the join tree activation, we can prove that when the exit tree capacity is chosen as in (5.15), the inequality (5.19) is satisfied for any N_M > 256. Thus we have found a threshold group size TH_leave = 256. When the group size is larger than this threshold, activating the exit tree can reduce the average leave time.

5.3 Group Key Agreement Based on Join-Exit Tree

In this section we present a protocol suite of the Join-Exit Tree (JET) Group Key Agreement, which consists of a key establishment protocol, a user join protocol, and a user leave protocol. These protocols are based on the algorithms we discussed in the previous section.

5.3.1 Group Key Establishment

Many prior works [114][45] assume that all group members are available before starting the group communications, so that parallel computation can take place to

establish a group key. We refer to this situation as concurrent user join. In reality, there are situations where members join the group sequentially, and we refer to them as sequential user join. The proposed JET scheme treats the key establishment in these two types of situations differently. For concurrent user join, subgroup keys in the key tree are computed in a bottom-up fashion in parallel to obtain the final group key, as in [114]. For sequential user join, we use the join protocol (as discussed below) to handle the sequential key updates. The join tree is activated when the group size exceeds the activation threshold TH_join = 8, but the exit tree is not activated during the key establishment stage.

5.3.2 Join Protocol

The key update for a user join event follows these steps:

1. Choose an insertion node in the key tree.
(a) Before the join tree is activated, Algorithm 4 is used in the simple key tree to choose the insertion node.
(b) After the join tree is activated, when inserting the new user according to Algorithm 4 will not make the join tree height exceed log C_J, the insertion strategy in Algorithm 4 is followed. Otherwise, the insertion node is chosen as the leaf node with the minimum depth in the join tree. (When there are user departures from the join tree, this helps keep the join tree balanced.)

2. The insertion node and the new member perform a two-party DH key exchange. Then all the keys on the path from the insertion node to the root are updated subsequently.

3. Adjust the key tree topology and parameters according to the following rules:
(a) When the group size is larger than TH_join = 8, the join tree is activated. When the group size is larger than TH_leave = 256, the exit tree is activated.
(b) When the join tree becomes full after a join event, the users in the join tree are relocated into the main tree using the relocation strategy in Section 5.2. Additionally, the departure information of those users who can report their staying time is stored in the leaving queue.
(c) Update the join and exit tree capacities according to Eqn. (5.8) and Eqn. (5.15), respectively.

5.3.3 Leave Protocol

The exit tree residual rate is set to β = 0.5. The key update for a user leave event follows these steps:

1. Delete the leaving user's node and its parent node. Promote the leaving user's sibling node to the parent node's position. Mark the keys on the path from the leaving user's grandparent node to the tree root as to be updated later.

2. When the user is leaving from the main tree and there are also users in the join tree, perform a join tree relocation. Mark the keys to be updated for the relocation.

3. Update all the keys marked in steps 1 and 2 from the bottom to the top of the key tree.

4. When the user is leaving from the exit tree and the batch movement condition is satisfied, perform a batch movement as specified in Section 5.2.

5. Perform the updates for key tree management as follows (a skeletal sketch of this bookkeeping is given after the list):
(a) Remove the leaving user's departure information if it is in the leaving queue.
(b) Compute the new join and exit tree capacities according to Eqn. (5.8) and Eqn. (5.15), respectively. If the newly computed join/exit tree capacity is larger than the current number of users in the join/exit tree, the capacity is updated immediately. Otherwise, no update is done.
(c) When the main tree user number N_M falls below the threshold for join/exit tree activation and the join/exit tree is empty, the join/exit tree is deactivated.
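Putting the join and leave protocols together, the following Python skeleton shows how the activation thresholds and the capacity rules (5.8) and (5.15) interact on each membership event. It is a structural sketch only: the tree surgery and DH rounds are elided, and all names are invented rather than taken from the dissertation.

import math

TH_JOIN, TH_LEAVE = 8, 256   # activation thresholds (Sections 5.2.1-5.2.2)

class JETState:
    """Skeletal bookkeeping for the JET join/leave protocols (sketch)."""

    def __init__(self, beta=0.5):
        self.beta = beta
        self.n_main = 0           # N_M, users in the main tree
        self.n_join = 0           # users currently in the join tree
        self.join_active = False
        self.exit_active = False

    def capacities(self):
        # C_J = 2 ln N_M (5.8); C_E = (ln N_M + 2 ln 2) / (1 - beta) (5.15).
        n = max(self.n_main, 2)
        c_j = max(1, round(2 * math.log(n)))
        c_e = max(1, round((math.log(n) + 2 * math.log(2)) / (1 - self.beta)))
        return c_j, c_e

    def on_join(self):
        size = self.n_main + self.n_join + 1
        self.join_active = self.join_active or size > TH_JOIN
        self.exit_active = self.exit_active or size > TH_LEAVE
        if not self.join_active:
            self.n_main += 1      # simple key tree: insert via Algorithm 4
            return
        self.n_join += 1
        c_j, _ = self.capacities()
        if self.n_join >= c_j:    # join tree full: batch relocation
            self.n_main += self.n_join   # tree surgery and rekeying elided
            self.n_join = 0

    def on_leave(self, from_join_tree=False):
        if from_join_tree:
            self.n_join -= 1
        else:
            # A departure from the main/exit tree also triggers a join tree
            # relocation when the join tree is non-empty (Section 5.3.3).
            self.n_main += self.n_join - 1
            self.n_join = 0
        if self.n_main <= TH_JOIN and self.n_join == 0:
            self.join_active = False   # deactivate per step 5(c)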

5.4 Experiments and Performance Analysis

In this section, we present three simulations. The first simulation focuses on group key establishment, in which we consider sequential user join. The second and third simulations have both join and departure activities. In each simulation, the performance of our proposed scheme is compared with that of the TGDH scheme [54], a typical tree-based contributory key agreement.

5.4.1 Key Establishment for Sequential User Join

For sequential user join, the proposed JET protocol uses a simple key tree for small group sizes, and activates the join tree when the group size is larger than 8. The exit tree is not activated. We compare the average join time for sequential user join using the proposed JET and TGDH [54] in Fig. 5.6. It can be seen that JET achieves the same performance as TGDH when the group size is

[Plot: average join time vs. group size (log scale) for JET, the analytical upper bound for JET, and TGDH.]

Figure 5.6: Average time cost for sequential user join.

small, and outperforms TGDH when the group size becomes large. Regarding the asymptotic performance, TGDH achieves an average time cost of O(log n), while the proposed JET scheme achieves O(log (log n)). The dashed line in Fig. 5.6 shows the theoretical upper bound for the average time cost from (5.9).

5.4.2 Experiment Using MBone User Activity Data

In this simulation, we choose three user activity log files from three Multicast Backbone (MBone) multicast sessions [116] as the user activity for our simulation. Two of these three sessions are NASA space shuttle coverage and the other one is a CBC News World online test.²

² The sources of these MBone sessions are: (1) NASA space shuttle STS-80 coverage, video, starting time 11/14/1996, 16:14:09; (2) NASA space shuttle STS-80 coverage, audio, starting time 12/4/1996, 10:54:49; (3) CBC Newsworld on-line test, audio, starting time 10/29/1996, 12:35:15.

[Bar charts: average join time and average leave time for JET and TGDH on the NASA1, NASA2, and CBC News sessions.]

Figure 5.7: Average join and leave time for simulations using MBone data.

Fig. 5.7 shows the experimental results using the JET and TGDH schemes, where we can see that JET has about 50% improvement over TGDH in user join, and about 20% improvement in user departure. It is worth noting that the improvement in user departure does not result from the use of the exit tree, since all three sessions have maximum group sizes below 100 and the exit tree is not activated. From the study of the MBone multicast sessions, Almeroth et al. observed that the MBone multicast group size is usually small (typically 100-200), and users either stay in the group for a short period of time or a very long time [4][3]. Using the proposed JET scheme, the exit tree will not be activated for a small group size. However, when a user stays in the group for only a short period of time, it is highly likely that this user joins and leaves the group in the join tree without

Table 5.2: Statistical Parameters for User Behavior

Duration  | λ_i | m_i  | Characteristic
0-199     | 7   | 2500 | long stay
200-499   | 5   | 500  |
500-4499  | 2   | 500  | short stay
4500-5000 | 1   | 500  |

getting to the main tree. Thus the use of the join tree reduces both the user join time and the user leave time.

5.4.3 Experiments Using Simulated User Activity Data

In this experiment, we generate user activities according to the probabilistic model suggested in [4]. The duration of the simulation is 5000 time units, divided into four non-overlapping segments, T1 to T4. In each time segment T_i, users' arrival times are modelled as a Poisson process with mean arrival rate λ_i, and users' staying times follow an exponential distribution with mean value m_i. The values of λ_i and m_i are listed in Table 5.2. The initial group size is 0. The simulated user activities consist of about 12000 join and 10900 leave events. The maximum group size is approximately 2800, and the group size at the end of the simulation is about 1100. In practice, users' accurate staying times will not always be available. To model the inaccuracy in users' estimated staying time (EST), we consider three classes of users. The first class of users does not report EST, the second class reports accurate EST, and the third class reports inaccurate EST. In the third class, the EST for user i is modelled as a random variable with Gaussian distribution N(µ_i, σ_i²), where the mean value µ_i is the actual staying time (because of the Gaussian distribution, a user could report a negative staying time; such a case is treated as EST unavailable). We also

[Plot: average join, leave, and overall operation time vs. P0 (the probability that a user does not report estimated staying time) for JET, with TGDH performance shown as horizontal reference lines.]

Figure 5.8: Average join, leave and overall time costs for the first experiment using simulated data. A user either does not report EST, with probability P0, or reports accurate EST, with probability P1 = 1 − P0.

assume that in the third class, the ratio of the standard deviation σ_i to the mean µ_i of the EST is constant across users, and denote it by R = σ_i/µ_i. The probabilities that a user is in the first, second, and third classes are denoted by P0, P1, and (1 − P0 − P1), respectively. In the first experiment, we consider that a user either does not report EST or reports an accurate EST, i.e., P1 = 1 − P0. By varying the value of P0, the average join and leave time costs are shown in Fig. 5.8, where the average leave time increases almost linearly with P0. The only exception is the data point at P0 = 1, i.e., when no user reports EST. When P0 = 1, the average join, leave, and overall time costs are: T_join = 1.87, T_leave = 10.97, and T = 6.22. This is the


Figure 5.9: A breakdown of the contributions to the user leave time in Fig. 5.8. Shown in the figure are the contributions to the user leave time from users leaving from the exit tree, the main tree, and the join tree. These contributions (in rounds) are plotted against P0, with P1 = 1 − P0.


situation when the exit tree is not activated during the group lifetime. From these data, we can see that when the exit tree is not used, the average overall time cost is equal to or lower than the costs for P0 ≥ 0.8. This is because activating the exit tree increases the depth of the key tree by one. When P0 is large, a large portion of users without departure information cannot take advantage of the exit tree, and the overhead of the exit tree structure outweighs its benefit. The benefit of the exit tree is substantial as long as 30% (corresponding to P0 = 0.7) or more of the users report accurate EST. We have also compared the performance of JET with that of TGDH in Fig. 5.8, where the performance of TGDH is shown as horizontal lines because it does not vary with the probability P0. We can see that JET always outperforms TGDH in terms of the overall time cost and the join time cost. For the user leave time cost, JET outperforms TGDH as long as more than 35% of the users report accurate EST. The average leave time presented in Fig. 5.8 consists of three parts, namely, the costs of users leaving from the exit tree, from the main tree, and from the join tree, respectively. We illustrate these three parts in Fig. 5.9. In particular, to obtain the first part, we compute the average leave time for users leaving from the exit tree, and then multiply it by the percentage of user departures from the exit tree with respect to the total number of user departure events. The other two parts are computed similarly, and the average leave cost is the summation of these three parts. We can see that when P0 is small, the user leave time is dominated by the time cost of users leaving from the exit tree. As P0 increases, the user leave time is gradually dominated by the time cost of users leaving from the main tree. In this experiment, most users stay in the group for a non-trivial period of time; therefore the percentage of users leaving from the join tree is very small.

[Plot: average join, leave, and overall operation time vs. (1 − P1), the probability that a user reports inaccurate EST, for R = 0.1, 0.2, 0.3.]

Figure 5.10: Average join, leave and overall time costs for the second experiment using simulated data. A user either reports accurate EST with probability P1, or reports inaccurate EST with probability 1 − P1. For inaccurate EST, the deviation parameter R ∈ {0.1, 0.2, 0.3}.

In the second experiment, we consider that all users report EST (P0 = 0). The deviation parameter takes values R ∈ {0.1, 0.2, 0.3}, and P1 varies in the range [0, 1]. The average join and leave times under different P1 values are plotted in Fig. 5.10. We can see that when the proportion of inaccurate estimates (1 − P1) is small, the proposed JET scheme can achieve good time efficiency in both join and leave events. However, the average leave time is sensitive to changes in the R value, especially in the range where (1 − P1) is small, where the gain obtained by using the exit tree diminishes quickly as R increases. In the third experiment, all users report inaccurate EST, which corresponds to P0 = P1 = 0. The average join and leave time costs are simulated when the value

[Plot: average join, leave, and overall operation time vs. R, the ratio of standard deviation to mean of the estimated staying time (log scale, 10⁻⁴ to 1).]

Figure 5.11: Average join, leave and overall time costs for the third experiment using simulated data. All users report inaccurate EST. The deviation parameter R ∈ [10⁻⁴, 1].

of R is in the range of [0, 1]. Fig. 5.11 shows that, when the standard deviation is two orders of magnitude smaller than the true staying time (R ≤ 0.01), the proposed JET scheme can efficiently manage both user join and leave events. We also note from Fig. 5.10 and Fig. 5.11 that, when a large portion of users do not report accurate estimates, the advantage of the exit tree diminishes. Table 5.3 lists the average and worst-case time costs for JET with both join and exit trees (when P0 = 0, P1 = 0.1 and R = 0.1), JET using only the join tree, and TGDH. All the worst-case time costs do not change with the simulation parameters (P0, P1, and R). Comparing the performance of JET using only the join tree, which does not depend on users' reported EST, with that of JET using


Table 5.3: Simulated Data Experiment: P0 = 0, P1 = 0.1, R = 0.1

                          Average time cost        Worst-case time cost
                         join    leave   overall      join    leave
  JET (join tree only)   1.87    10.97    6.22         13       13
  JET (join-exit tree)   3.25     6.34    4.73         14       14
  TGDH                  10.83     9.96   10.41         12       12

Comparing the performance of JET using only the join tree, which does not depend on users' reported EST, with that of JET using both the join and exit trees, we can see that the exit tree indeed provides a reduction in the average overall time cost, even when around 10% of the users report inaccurate EST. Comparing these time costs with those of TGDH, we can see that the proposed JET scheme improves the average time costs while tolerating a small amount of inaccuracy in EST. However, for a group of size n, the worst-case operation time of JET using only the join tree is log n + 1, and that of JET using both the join and exit trees is log n + 2. These worst-case time costs are one and two rounds more than that of TGDH, respectively. This is due to the extra depth introduced by the join and exit tree structures, which we refer to as the structural overhead of the JET scheme.

Two observations can be made from the above experiments. First, regardless of the accuracy in EST, the join tree scheme can improve the time efficiency of join events. Second, although the inaccuracy in EST comes in different forms (no EST or inaccurate EST), the overall operation time is not very sensitive to changes in the experiment parameters P0, P1, and R. This is because inaccurate EST leads to user departures from the main tree. When users leave from the main tree, we simultaneously relocate users from the join tree to the main tree (refer to Section 5.3.3). As such, part of the join tree relocation cost can be amortized by


the leave cost. Such an amortized cost can be counted either toward the join time or toward the leave time; in this chapter, we have counted it toward the leave time. Therefore, when we see a cost increase in user leave events, we will often see a cost reduction in user join events, which partially offsets the increase in the overall cost.

5.5 Discussions

5.5.1 Extension to Multi-Level Join Tree

The idea of caching the joining and leaving users, as in the design of a memory hierarchy, can be extended to multiple levels. Here we illustrate an extension of the one-level join tree to a two-level join tree. We consider a new join tree topology in which a smaller join tree is attached directly to the tree root, and a larger join tree is attached one level below the tree root with the main tree as its sibling subtree. We refer to the smaller join tree as the Level-1 join tree and the larger one as the Level-2 join tree. Such a topology can be visualized as in Fig. 5.2(a), with the exit tree replaced by the Level-2 join tree; note that the exit tree does not exist in this topology. We compare the average join time using a two-level join tree with that using a one-level join tree, seeking the condition under which the two-level join tree has an advantage over the one-level join tree. From our analysis, the smallest group size that can benefit from a two-level join tree is approximately 180 or greater. Compared to the activation group size of 8 for the one-level join tree, this result shows that the two-level join tree would improve the rekeying time efficiency as the group grows larger.


5.5.2 The Implementation of Key Agreement

In this subsection, we discuss two implementation methods for the proposed JET protocol, with and without a group coordinator, and show that the two implementations incur the same time cost.

In the first case, we consider using a group controller in the implementation of JET. In [105], the group controller was suggested to be one of the group members, who knows all group membership information and facilitates adding and excluding members. However, the group controller does not have knowledge of the other members' secret keys and therefore does not violate the security requirements of contributory key management. The joining/leaving user sends a request to the group controller. The group controller then sends a broadcast message specifying the change in the logical key tree, including the insertion node ID, the adjustment of the key tree structure, etc. For each join tree relocation or batch movement to the exit tree, the group controller also sends a broadcast message specifying where each user will be relocated. The group controller has a storage overhead proportional to the group size, as it needs to store the topology of the key tree. In addition, the group controller has a communication overhead of one broadcast message per join or leave event and one broadcast message for each batch relocation, which takes place infrequently. When the current group controller leaves the group, another member is chosen as the new group controller, and the related information is passed from the leaving group controller to the new one. The group controller can also be a non-member entity.

The second implementation does not require a group controller. Instead, each group member stores the topology of the key tree and follows the JET protocol, so all users can take consistent actions for key updates.


During a join event, the joining user broadcasts a request to the whole group. All members find the same insertion node according to the insertion strategy; then a sponsor is chosen as the rightmost leaf node in the subtree rooted at the insertion node [54]. Since the sponsor knows all the subgroup keys along the path from the insertion node to the root, the sponsor performs a two-party DH key exchange with the new user, computes the keys on the path from the insertion node to the tree root, and then broadcasts the blinded keys on this path. The blinded keys are the results of exponentiation, the base g raised to the power of the secret keys, and can be publicly known. Such a procedure is detailed in [54]. When a leave event occurs, the sponsor is chosen as the rightmost leaf node of the subtree rooted at the leaving user's sibling node, and a key update procedure similar to that of a join event takes place. Since all users have the same view of the key tree structure, they can also cooperate in update operations on the key tree, such as batch relocations and movements.

An important factor in the rekeying cost is the depth d of the joining/leaving node. In the first implementation, d rounds of DH key exchange are performed in a key update, with 2d exponentiations and d message transmissions in total. In the second implementation, the sponsor needs to compute d un-blinded keys and d blinded keys, and send d messages. Therefore, the two implementations have the same total computation and communication cost, as well as the same time complexity. In the second implementation, however, half of the computation load and almost all of the communication load fall on the sponsor. As for the storage overhead, in the first implementation only the group controller needs to store the structure of the key tree, whose size is proportional to the group size; in the second implementation, each user stores a copy of the key tree structure, and the total storage overhead grows proportionally to the square of the group size.
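As an illustration of the sponsor's computation, the following Python sketch derives the secret keys on the path to the root and their blinded versions, in the style of the tree-based DH key derivation of [54]. This is a simplified sketch under assumed toy parameters, not the dissertation's implementation; the modulus p, the generator g, and all key values are placeholders and far too small for real use.

```python
# A simplified sketch (toy parameters, not the dissertation's code) of how a
# sponsor in a TGDH-style key tree derives the secret keys on the path from
# the insertion node to the root, plus the blinded keys it broadcasts.

p = 2**64 - 59   # toy prime modulus (placeholder; real groups are far larger)
g = 5            # toy generator (placeholder)

def blind(key: int) -> int:
    """Blinded key: g raised to the secret key mod p; safe to make public."""
    return pow(g, key, p)

def path_keys(leaf_secret: int, sibling_blinded_keys: list[int]) -> list[int]:
    """Walk from the sponsor's leaf toward the root.  Each level is one
    two-party DH step: the parent secret is the sibling's blinded key
    raised to the current secret, i.e. K_parent = (g^K_sibling)^K_current."""
    keys = []
    secret = leaf_secret
    for bk in sibling_blinded_keys:   # one sibling blinded key per level
        secret = pow(bk, secret, p)
        keys.append(secret)
    return keys

# The sponsor knows the blinded keys of the sibling subtrees along its path
# (fabricated here from toy secrets) and broadcasts the blinded path keys.
secrets = path_keys(123456789, [blind(111), blind(222), blind(333)])
broadcast = [blind(k) for k in secrets]
print(broadcast)
```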


Table 5.4: A Comparison of Rekeying Protocol Complexity (group size n)

Key update time in group lifetime:
                  for one user       for entire system
  JET             O(n)               O(n log(log n))
  TGDH            O(n log n)         O(n log n)
  GDH.2           O(n^2)             O(n^2)

Rekeying message overhead (# of messages):
                  with multicast     without multicast
  JET             O(log n)           O(n)
  TGDH            O(log n)           O(n)
  GDH.2           O(n^2)             O(n^2)

Computation overhead (# of exponentiations):
                  for one user       for all users
  JET             O(log(log n))      O(n)
  TGDH            O(log n)           O(n)
  GDH.2           O(n)               O(n^2)


5.5.3 Protocol Complexity

In this subsection, we provide a more comprehensive comparison of the protocol complexity of JET and TGDH. The complexity aspects we consider are the rekeying time cost during a group lifetime, the messaging overhead, and the computation overhead. The results are summarized in Table 5.4.

Time Complexity from Other Perspectives

Our previous discussions have focused on the time cost from an individual user's perspective and on a per-event basis. Two additional aspects can help evaluate the overall efficiency of key management from a system perspective.


One is the amount of time a user spends on key updates during his/her lifetime in the group, and the other is the amount of time the whole group spends on key updates during the lifetime of the group.

Consider a sequence of n join events followed by n leave events, and assume that the first user joining the group is also the last one to leave. In JET, such a user spends most of its lifetime in the main tree for key management purposes. On average, this user spends 2 rounds on key updates for each user join event and 3 rounds for each user leave event, assuming all users report their staying times accurately. Therefore, this user spends Θ(n) rounds in total on key updates during its lifetime. Since this user has the longest lifetime among all users, O(n) is an upper bound on any user's total key update time. For tree-based key agreement using a simple key tree, this first-come-last-leave user would spend O(n log n) rounds in total on key updates.

If we consider the group of n users as a whole, for the same sequence of events, the group spends O(n log(log n)) rounds on key updates using JET, whereas a key agreement using a simple key tree would cost O(n log n). Compared to TGDH, we note that the improvement from the system perspective is not as significant as that from a user's perspective.
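The following Python snippet is an added numeric illustration of this gap; it ignores constant factors and exact round counts, and simply tabulates how the n log n and n log(log n) totals compare as the group size grows.

```python
# An added numeric illustration (constant factors ignored) of the gap between
# the O(n log n) system-wide rekeying time of a simple key tree and the
# O(n log(log n)) total achieved by JET.
import math

for n in (2**7, 2**10, 2**15, 2**20):
    simple_tree = n * math.log2(n)        # simple key tree: n events x log n rounds
    jet = n * math.log2(math.log2(n))     # JET: n events x log(log n) rounds
    print(f"n = {n:>7}: n log n = {simple_tree:>12.0f}, "
          f"n log(log n) = {jet:>12.0f}, ratio = {simple_tree / jet:4.1f}")
```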


Communication Complexity

During member joins and leaves, the joining/leaving member must send a join/leave request. Afterwards, in the rekeying process, at least two messages are sent for each DH key exchange. This messaging overhead is the communication cost of the rekeying protocol.

We now discuss the average number of messages for user join and leave events in the JET protocol. In the first scenario, we consider that multicast is available; in particular, if a message needs to be sent to m users, sending one multicast message suffices. In this case, the average number of messages is O(log n) for both join and leave events. In the second scenario, we consider that multicast is not available: if a message needs to be sent to m users, m duplicate copies of the same message must be sent. In this case, the average number of messages is O(n) for both user join and leave events. From Table 5.4 we can see that the rekeying message overhead of JET is comparable to that of TGDH.

Computation Complexity

In the proposed JET protocol, the total number of exponentiations performed by all users during the key update for a join or leave event is O(n). This measurement captures the overall computation load of the entire group. During each join or leave event, the number of exponentiations performed by any individual member is at most twice the number of DH rounds. Therefore, for any single user, the average number of exponentiations is also O(log(log n)) per join/departure event.

5.6 Chapter Summary

In this chapter, we have presented a new contributory key agreement, the Join-Exit-Tree (JET) group key agreement, for secure group communications. Built upon tree-based group key management, the proposed scheme employs a main tree as well as join and exit subtrees that serve as temporary buffers for joining and leaving users. To achieve time efficiency, we have shown that the optimal subtree capacity is on the log scale of the group size, and we have designed an adaptive algorithm to activate and update the join and exit subtrees.


As a result, the proposed JET scheme achieves an average time cost of O(log(log n)) for user join and leave events in a group of n users, and reduces the total time cost of key updates over a system's lifetime from the O(n log n) of prior works to O(n log(log n)). Meanwhile, the proposed scheme also achieves low communication and computation overhead. Our experimental results on both simulated user activities and real MBone data show that the proposed scheme outperforms existing tree-based schemes by a large margin for group key establishment, user joins, and user departures in large and dynamic groups, without sacrificing time efficiency for small groups.

5.7 Appendix: Derivations

5.7.1 Derivation for Inequality (5.5)

In this appendix, we prove the inequality (5.5) in Section 5.2:

\[ \frac{1}{A} \sum_{k=1}^{A} r(k) \le \frac{1}{2}\log A + 1, \tag{5.20} \]

where r(1) = 1, r(2^p + q) = 1 + r(q), p is a non-negative integer, and q ∈ [1, 2^p] is a positive integer. The equality holds when A is a power of 2.

We first use induction to show that the equality holds when A = 2^p, p = 0, 1, 2, .... When A = 1, LHS = RHS = 1. Next, we assume the equality holds for A = 2^p, namely,

\[ \frac{1}{2^p} \sum_{k=1}^{2^p} r(k) = \frac{1}{2}\log 2^p + 1. \tag{5.21} \]

Consider the case of A = 2^{p+1}:

\[
\mathrm{LHS} = \frac{1}{2^{p+1}} \sum_{k=1}^{2^{p+1}} r(k)
= \frac{1}{2^{p+1}} \Big[ \sum_{k=1}^{2^p} r(k) + \sum_{k=1}^{2^p} \big(r(k)+1\big) \Big] \tag{5.22}
\]
\[
= \frac{1}{2^{p+1}} \Big[ 2\Big(\frac{1}{2}\log 2^p + 1\Big)2^p + 2^p \Big]
= \frac{1}{2}\log 2^{p+1} + 1 = \mathrm{RHS},
\]

where (5.22) is evaluated using the induction assumption (5.21).

We now prove the inequality for any positive integer A. The inequality is clearly true for A = 1, 2. By induction, suppose that the inequality is true for all 1 ≤ A < 2^p + q, and consider A = 2^p + q, where 0 < q ≤ 2^p:

\[
\mathrm{LHS} = \frac{1}{A} \sum_{k=1}^{A} r(k)
= \frac{1}{A} \Big[ \sum_{k=1}^{2^p} r(k) + \sum_{k=1}^{q} \big(r(k)+1\big) \Big] \tag{5.23}
\]
\[
\le \frac{1}{A} \Big[ \Big(\frac{1}{2}\log 2^p + 1\Big)2^p + q\Big(\frac{1}{2}\log q + 1\Big) + q \Big] \tag{5.24}
\]
\[
= \frac{1}{2}\cdot\frac{1}{A}\big( 2^p \log 2^p + q \log q + 2q \big) + 1,
\]

where (5.23) is bounded using the induction assumption. Showing that (5.24) is at most (1/2) log A + 1 is equivalent to proving

\[ \frac{2^p}{A}\log 2^p + \frac{q}{A}\log(4q) \le \log A. \tag{5.25} \]

Applying the identity log k = log e · ln k and the integral form ln k = \int_1^k (1/x)\,dx, (5.25) can be written as

\[
\log e\,\Big[ \frac{2^p}{A}\int_1^{2^p}\frac{1}{x}\,dx + \frac{q}{A}\int_1^{4q}\frac{1}{x}\,dx \Big]
\le \log e \int_1^{A}\frac{1}{x}\,dx
\]
\[
\Longleftrightarrow\quad
2^p\int_{2^p}^{A}\frac{1}{x}\,dx + q\int_1^{A}\frac{1}{x}\,dx - q\int_1^{4q}\frac{1}{x}\,dx \ge 0. \tag{5.26}
\]

We denote B = 2^p and fix p (hence B is fixed), so that A = B + q. It is straightforward to see that (5.26) holds when B + q ≥ 4q, i.e., when 1 ≤ q ≤ B/3. When B/3 ≤ q ≤ B, (5.26) is equivalent to

\[ \frac{2^p}{A}\int_{2^p}^{A}\frac{1}{x}\,dx - \frac{q}{A}\int_{A}^{4q}\frac{1}{x}\,dx \ge 0. \tag{5.27} \]

Since q is the only variable in (5.27), let f(q) denote the LHS of (5.27), viewed as a continuous function of q on [B/3, B]:

\[ f(q) = \frac{B}{B+q}\int_{B}^{B+q}\frac{1}{x}\,dx - \frac{q}{B+q}\int_{B+q}^{4q}\frac{1}{x}\,dx. \]

Taking the derivative of f(q), we get

\[ \frac{d}{dq} f(q) = -\frac{B}{(B+q)^2}\int_{B}^{4q}\frac{1}{x}\,dx < 0. \tag{5.28} \]

In the earlier part of the proof, we showed that the equality in (5.20) holds when A is a power of 2, which gives f(B) = 0. Since f(B) = 0 and, by (5.28), f is strictly decreasing on [B/3, B], we have f(q) ≥ f(B) = 0 for all q ∈ [B/3, B]. Thus (5.26) also holds for B/3 ≤ q ≤ B. This completes the proof.
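As a sanity check on this derivation (an added illustration, not part of the dissertation), the following Python snippet evaluates r(k) directly from its recursive definition and verifies (5.20) numerically, including equality at powers of two.

```python
# An added numerical check of inequality (5.20), evaluating r(k) from its
# recursive definition: r(1) = 1 and r(2^p + q) = 1 + r(q), q in [1, 2^p].
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def r(k: int) -> int:
    if k == 1:
        return 1
    p = (k - 1).bit_length() - 1   # largest p with 2^p < k, so q = k - 2^p in [1, 2^p]
    return 1 + r(k - 2**p)

for A in range(1, 1025):
    lhs = sum(r(k) for k in range(1, A + 1)) / A
    rhs = 0.5 * math.log2(A) + 1
    assert lhs <= rhs + 1e-9
    if A & (A - 1) == 0:           # A is a power of two: the bound is tight
        assert abs(lhs - rhs) < 1e-9
print("inequality (5.20) holds for A = 1..1024")
```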

5.7.2 Derivation for Inequality (5.11)

In this appendix, we prove that the inequality (5.11) holds for any x > 8:

\[ \log x > \frac{2\ln x}{2\ln x - 1}\,\log(2\ln x). \tag{5.29} \]

Let y = ln x. We consider the case when the group size is larger than 1, so x ∈ (1, +∞) and y ∈ (0, +∞). Under this condition, (5.29) becomes

\[ 2y - 1 > 2\ln 2\,(1 + \log y). \tag{5.30} \]

Let g(y) = (2y − 1) − 2 ln 2 (1 + log y). The function g(y) has two zeros, at y0 = 1/2 and y1 ≈ 1.7564, and g(y) > 0 for any y > y1. Therefore (5.29) holds for any x > e^{y1} ≈ 5.7917. In our proposed protocol, we choose the larger threshold value of 8, as eight users lead to a balanced main tree.
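As a quick numerical cross-check (an added illustration, not part of the original derivation), the following Python snippet evaluates the difference between the two sides of (5.29) around the zero at e^{y1} ≈ 5.79 and for several values above the chosen threshold of 8.

```python
# An added numerical cross-check of inequality (5.29): the margin is negative
# just below x1 = e^{y1} ~ 5.79 and positive above it, so the protocol's
# threshold of 8 sits safely in the valid range.
import math

def margin(x: float) -> float:
    """log x minus (2 ln x / (2 ln x - 1)) * log(2 ln x), in base 2."""
    y = math.log(x)
    return math.log2(x) - (2 * y / (2 * y - 1)) * math.log2(2 * y)

for x in (5.0, 5.79, 5.80, 8.0, 64.0, 1e6):
    print(f"x = {x:>9}: margin = {margin(x):+.4f}")
```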


Chapter 6 Conclusions and Future Perspectives
In this dissertation, we have investigated the potential, the mechanism, and the performance of incorporating signal processing techniques, as well as beneficial knowledge from other related fields of study, into different layers of a secure communication system. The fusion of different technical disciplines can take place at the physical layer for tracing an adversary, at the application layer for confidentiality protection and content authentication of multimedia, at the system deployment phase of sensor networks for jointly optimizing sensing coverage and secure communication, and at the protocol design phase for improving key management efficiency and user satisfaction. Such an interdisciplinary and cross-layer approach to application security is important in meeting the demands of emerging communication paradigms and multimedia applications.

In Chapter 2, we have looked into the unique issues related to multimedia encryption and content authentication in the multimedia communication scenario. We have proposed atomic encryption operations for multimedia that preserve standard compliance and are friendly to delegate processing, provided quantitative analysis of the security and bit-rate overhead, and presented a systematic study of how to strategically integrate different encryption operations to build a video encryption system.


For media authentication, we have adapted the concept of unicity distance to evaluate the security of image hashing, discovered the potential key disclosure problem for popular image hashing schemes, and proposed mitigation solutions.

In Chapter 3, we have focused on the security issues in cooperative wireless communication systems. We have discovered the threat of the signal garbling attack from compromised relay nodes, and proposed a countermeasure to trace and pinpoint the adversarial relay. The proposed tracing scheme employs an adaptive signal detection algorithm, coupled with a statistical tracing-symbol aggregation scheme, and can pinpoint the malicious relay with very high accuracy and low bandwidth overhead.

In Chapter 4, we have investigated the impact of sensor deployment on the performance of sensing coverage and secure connectivity in sensor networks. We have provided an analysis for the static sensor deployment scenario, and proposed two sensor location adjustment algorithms for mobility-assisted sensor deployment. Our results show that by jointly optimizing sensing coverage and secure communication connectivity, we can achieve a better and more balanced performance in both categories.

In Chapter 5, we have studied the scalability of contributory key management and proposed a new scheme, the Join-Exit-Tree (JET) key management scheme, for time-efficient key management. To improve time efficiency, optimization and scheduling techniques, together with cryptographic primitives, are integrated into the proposed key management protocol.


The time efficiency for user join and departure events is significantly improved from O(log n) to O(log(log n)), while high communication and computation efficiency is maintained.

Throughout this dissertation, we have shown that the various security issues in a multi-layer communication system can be better solved by instilling signal processing concepts into the design of application security. Such a design paradigm can have a broad impact and wide applications, including in related areas such as confidentiality-preserving search and retrieval for multimedia content, digital forensics for communications and multimedia, and high-performance trusted computing.

The possible future extensions of this dissertation include the following aspects. In the multimedia security area, the content encryption and authentication techniques introduced in this dissertation can be jointly employed, and combined with feature-domain obfuscation techniques, to build a privacy-preserving search and retrieval engine for multimedia data. To ensure efficiency and scalability, such a system can employ database techniques such as hashing and indexing. Such a media search and retrieval system would have the following benefits. Since the data stored in the database are encrypted, an attacker who compromises the database cannot observe the clear-text content. Search and retrieval can be performed by comparing hash distances without decrypting the content, which is computation- and time-efficient. With obfuscation and randomization techniques applied to the hash vector, the content owner can share the search capability with authorized users without revealing the media content to unauthorized users.

In the wireless communication security area, our adversary-tracing scheme for cooperative wireless communications focuses on the one-hop relay situation.


It would be fruitful to extend this framework to multi-hop relays and to incorporate network-layer routing information into the adversary-tracing scheme. One security problem similar to the adversary-tracing scenario is the security of channel estimation. Usually, channel estimation is achieved by the sender transmitting pre-determined pilot symbols to the receiver. When the sender node is compromised and sends garbled pilot symbols to confuse the receiver, the receiver should be able to detect such malicious behavior. In our preliminary study, we have seen that the statistical behavior of the channel condition estimated from garbled pilot symbols deviates from that of the normal estimation result. Therefore, the receiver can use multiple garbled pilot symbols as a "self-reference" to detect a channel estimation attack in a probabilistic manner. We envision such an approach to be especially effective under slow-fading channel conditions.


BIBLIOGRAPHY

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. Computer Networks, 38:393–422, November 2002.
[2] S. M. Alamouti. A simple transmit diversity technique for wireless communications. IEEE Journal on Selected Areas in Communications, 16(8):1451–1458, Oct. 1998.
[3] K. C. Almeroth. A long-term analysis of growth and usage patterns in the multicast backbone (MBone). In Proceedings of the IEEE INFOCOM'00, volume 2, pages 824–833, March 2000.
[4] K. C. Almeroth and M. H. Ammar. Multicast group behavior in the Internet's multicast backbone (MBone). IEEE Communications Magazine, pages 124–129, June 1997.
[5] R. Anderson, J. Backhouse, et al. Open letter to the health select committee. http://www.dawba.net/healthselectcommittee20020327.pdf, April 2006.
[6] S. Banerjee and B. Bhattacharjee. Scalable secure group communication over IP multicast. IEEE Journal on Selected Areas in Communications, 20(8):1511–1527, Oct. 2002.
[7] J. Baumgartner. Deciphering the CA conundrum. Communication Engineering and Design, 2003.
[8] K. Becker and U. Wille. Communication complexity of group key distribution. In Proceedings of the 5th ACM Conference on Computer and Communications Security, pages 1–6. ACM Press, 1998.
[9] M. Bellare. Practice-oriented provable security. In Proc. of the First International Workshop on Information Security (ISW 97), 1997.
[10] M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security treatment of symmetric encryption. In Proc. of the 38th Symposium on Foundations of Computer Science, 1997.


[11] M. Burmester and Y. Desmedt. A secure and efficient conference key distribution system. In Proceedings of the EUROCRYPT'94, pages 275–286. LNCS 950, 1994.
[12] H. Cai, B. Zeng, G. Shen, and S. Li. Error-resilient unequal protection of fine granularity scalable video bitstreams. In Proceedings of IEEE ICC 2004, 2004.
[13] R. Canetti, J. Garay, G. Itkis, D. Micciancio, M. Naor, and B. Pinkas. Multicast security: A taxonomy and some efficient constructions. In Proceedings of the IEEE INFOCOM'99, pages 708–716, 1999.
[14] J. Cannons and P. Moulin. Design and statistical analysis of a hash-aided image watermarking system. IEEE Transactions on Image Processing, 13(10):1393–1408, Oct. 2004.
[15] H. Chan and A. Perrig. Security and privacy in sensor networks. IEEE Computer Magazine, 36(10), Oct. 2003.
[16] H. Chan, A. Perrig, and D. Song. Random key predistribution schemes for sensor networks. In Proc. of the 2003 IEEE Symposium on Security and Privacy, May 2003.
[17] I. Chang, R. Engel, D. Kandlur, D. Pendarakis, and D. Saha. Key management for secure internet multicast using boolean function minimization techniques. In Proceedings of the IEEE INFOCOM'99, volume 2, pages 689–698, 1999.
[18] C.-Y. Chong and S. P. Kumar. Sensor networks: evolution, opportunities, and challenges. Proceedings of the IEEE, 91(8):1247–1256, Aug. 2003.
[19] Federal Trade Commission. Identity theft survey report. http://www.consumer.gov/idtheft/pdf, Sept. 2003.

[20] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and Groups. Springer, 3rd edition, 1991.
[21] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. The MIT Press and McGraw-Hill Book Company, second edition, 2001.
[22] J. Cortes, S. Martinez, T. Karatas, and F. Bullo. Coverage control for mobile sensing networks. IEEE Transactions on Robotics and Automation, 20(2):243–255, Apr. 2004.
[23] T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley & Sons, second edition, 1991.


[24] I. J. Cox and J.-P. M. G. Linnartz. Public watermarks and resistance to tampering. In Proceedings of the International Conference on Image Processing, 1997.
[25] R. V. Cox, D. E. Bock, K. B. Bauer, J. D. Johnston, and J. H. Snyder. The analog voice privacy system. AT&T Technical Journal, Jan.-Feb. 1987.
[26] J. Daemen and V. Rijmen. AES proposal: Rijndael. http://csrc.nist.gov/CryptoToolkit/aes/rijndael/Rijndael-ammended.pdf.
[27] P. Dent, G. E. Bottomley, and T. Croft. Jakes fading model revisited. IEEE Electronics Letters, 29(13):1162–1163, June 1993.
[28] W. Diffie and M. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6):644–654, November 1976.
[29] L. R. Dondeti and S. Mukherjee. DISEC: a distributed framework for scalable secure many-to-many communication. In Proceedings of the 5th IEEE Symposium on Computers and Communications, pages 693–698, 2000.
[30] W. Du, J. Deng, Y. S. Han, S. Chen, and P. K. Varshney. A key management scheme for wireless sensor networks using deployment knowledge. In Proc. of the 2004 IEEE INFOCOM, Mar. 2004.
[31] W. Du, J. Deng, Y. S. Han, and P. K. Varshney. A pairwise key predistribution scheme for wireless sensor networks. In Proc. of the 2003 ACM CCS, Oct. 2003.


[37] J. Fridrich and M. Goljan. Robust hash functions for digital watermarking. In Proceedings of International Conference on Information Technology: Coding and Computing, pages 178–183, March 2000. [38] A. Gersho and R. M. Gray. Vector Quantization and Signal Compression. Springer, 1991. [39] V. Gligor. Security of emergent properties in ad-hoc networks. In Lecture Notes in Computer Science, 2005. [40] H. Hacig¨ um¨ us, B. R. Iyer, and S. Mehrotra. E?cient execution of aggregation queries over encrypted relational databases. Lecture Notes in Computer Science 2973, pages 125–136, 2004. [41] O. Harmanci and M. K. Mihcak. Temporal synchronization of watermarked video using image hashing. In Proceedings SPIE - Security and Watermarking of Multimedia Contents, volume 5681, 2005. [42] H. Harney and C. Muckenhirn. Group key management protocol (GKMP) architecture. RFC 2094, July 1997. [43] J. L. Hennessy and D. A. Patterson. Computer Architecture: a Quantitative Approach. Morgan Kaufmann publishers, second edition, 1996. [44] A. Howard, M. J. Matari´ c, and G. S. Sukhatme. Mobile sensor network deployment using potential ?elds: a distributed, scalable solution to the area coverage problem. In Proc. of DARS’02, June 2002. [45] I. Ingemarsson, D. T. Tang, and C. K. Wong. A conference key distribution system. IEEE Transactions on Information Theory, IT-28(5):714–720, September 1982. [46] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva. Directed di?usion for wireless sensor networking. IEEE/ACM Tran. on Networking, 11(1):2–16, November 2003. [47] A.K. Jain. Fundamentals of Digital Image Processing. Prentice Hall, 1998. [48] M. Janani, A. Hedayat, T. E. Hunter, and A. Nosratinia. Coded cooperation in wireless communications: Space-time transmission and iterative decoding. IEEE Trans. on Signal Processing, 52(2):362 – 371, Feb. 2004. [49] M. Johnson, P. Ishwar, V. Prabhakaran, D. Schonberg, and K. Ramchandran. On compressing encrypted data. IEEE Transactions on Signal Processing, 52(10):2992–3006, 2004.


[50] P. Juang, H. Oki, Y. Wang, M. Martonosi, L. Peh, and D. Rubenstein. Energy e?cient computing for wildlife tracking: design rradeo?s and early experiences with zebranet. In Proc. of ACM ASPLOS’02, Oct. 2002. [51] P. Judge and M. Ammar. Gothic: A group access control architecture for secure multicast and anycast. In Proceedings of the IEEE INFOCOM’02, pages 1547–1556, 2002. [52] C. Kaufman, R. Perlman, and M. Speciner. Network Security: Private Communication in a Public World. Prentice Hall, 2003. [53] R. Kershner. The number of circles covering a set. American Journal of Mathematics, 60:665–671, 1939. [54] Y. Kim, A. Perrig, and G. Tsudik. Simple and fault-tolerant key agreement for dynamic collaborative groups. In Proceedings of the 7th ACM Conference on Computer and Communications Security, pages 235–244. ACM Press, 2000. [55] D.E. Knuth. The Art of Computer Programming. Addison-Wesley, 3rd edition, 1997. [56] J. N. Laneman, D. N. C. Tse, and G. W. Wornell. Cooperative diversity in wireless networks: E?cient protocols and outage behavior. IEEE Transactions on Information Theory, 50(12):3062 – 3080, Dec. 2004. [57] J. N. Laneman and G. W. Wornell. Distributed space-time-coded protocols for exploiting cooperative diversity in wireless networks. IEEE Transactions on Information Theory, 49(10):2415 – 2425, Oct. 2003. [58] W. Li. Overview of ?ne granularity scalability in mpeg-4 video standard. IEEE Trans. on Circuits & Systems for Video Technology, 11(3):301–317, March 2001. [59] Y. Li. Pilot-symobl-aided channel estimation for OFDM in wireless systems. IEEE Trans. on Vehicular Technology, 49(4):1207 – 1215, July 2000. [60] K. Lieska, E. Laitinen, and J. Lahteenmaki. Radio coverage optimization with genetic algorithms. In Proceedings of the IEEE Inter. Symp. on Personal, Indoor and Mobile Radio Communications, pages 318 – 322, Sep. 1998. [61] G. Lin and G. Noubir. On link layer denial of service in data wireless lans. Wiley Journal on Wireless Communications and Mobile Computing, August 2004.


[62] S. Lin, M. T. Ozsu, V. Oria, and R. Ng. An extendible hash for multiprecision similarity querying of image databases. In Proceedings of 27th VLDB Conference, 2001. [63] D. Liu and P. Ning. Establishing pairwise keys in distributed sensor networks. In Proc. of the 2003 ACM CCS, Oct. 2003. [64] D. Liu and P. Ning. Location-based pairwise key establishment for static sensor networks. In Proc. of the ACM Workshop on Security of Ad Hoc and Sensor Networks, Oct. 2003. [65] T. Lookabaugh and D.C. Sicker. Selective encryption for consumer applications. IEEE Communications Magazine, 42(5):124–129, May 2004. [66] Y. Mao and M. K. Mihcak. Collusion-resistant desynchronization for digital video ?ngerprinting. In Proc. of the IEEE International Conference on Image Processing (ICIP), Sept. 2005. [67] Y. Mao and M. K. Mihcak. Digital video desynchronization for collusionresistant ?ngerprinting. IEEE Tran. on Image Processing (under revision), June 2006. [68] Y. Mao, Y. Sun, M. Wu, and K. J. R. Liu. Dynamic join-exit amortization and scheduling for time-e?cient group key agreement. In Proceedings of the IEEE INFOCOM 2004, volume 4, pages 2617 – 2627. [69] Y. Mao, Y. Sun, M. Wu, and K. J. R. Liu. Jet: Dynamic join-exit-tree amortization and scheduling for contributory key management. to appear in IEEE/ACM Transactions on Networking, Dec. 2006. [70] Y. Mao and M. Wu. Security evaluation for communication-friendly multimedia encryption. In Proc. of the IEEE International Conference on Image Processing (ICIP), Oct. 2004. [71] Y. Mao and M. Wu. Coordinated sensor deployment for secure communications and sensing coverage. In Proc. of ACM CCS/SASN Workshop, Nov. 2005. [72] Y. Mao and M. Wu. A joint signal processing and cryptographic approach to multimedia encryption. IEEE Tran. on Image Processing, 15(7):2061 – 2076, July 2006. [73] Y. Mao and M. Wu. Security issues in cooperative communications: Tracing adversarial relay. In Proc. of IEEE ICASSP, May 2006.


[74] S. Megerian, F. Koushanfar, M. Potkonjak, and M. B. Srivastava. Coverage problems in wireless ad-hoc sensor networks. In Proc. of the 2001 IEEE INFOCOM, 2001. [75] S. Megerian, F. Koushanfar, G. Qu, and M. Potkonjak. Exposure in wireless ad-hoc sensor networks. In Proc. of the 2001 ACM MobiCom, 2001. [76] A. J. Menezes, P. C. van Oorschot, and S .A Vanstone. Handbook of Applied Cryptography. CRC Press, ?rst edition, 1996. [77] P. Michiardi and R. Molva. Core: A collaborative reputation mechanism to enforce node cooperation in mobile ad hoc networks. In Proc. IFIPCommun. Multimedia Security Conf., pages 107–121, Sep. 2002. [78] M. K. Mihcak and R. Venkatesan. A tool for robust audio information hiding: A perceptual audio hashing algorithm. In Proceedings of 4th Intl. Information Hiding Workshop, April 2001. [79] M. K. Mih¸ cak and R. Venkatesan. New iterative geometric methods for robust perceptual image hashing. In Proceedings of ACM Workshop on Security and Privacy in Digital Rights Management, 2001. [80] S. Mittra. Iolus: a framework for scalable secure multicasting. In Proceedings of the ACM SIGCOMM’97, pages 277–288. ACM Press, 1997. [81] R. Molva and A. Pannetrat. Scalable multicast security in dynamic groups. In Proceedings of the 6th ACM conference on Computer and Communications Security, pages 101–112, 1999. [82] M. J. Moyer, J. R. Rao, and P. Rohatgi. A survey of security issues in multicast communications. IEEE Network, pages 12–23, Nov./Dec. 1999. [83] G. Noubir. On connectivity in ad hoc networks under jamming using directional antennas and mobility. In Proc. of International Conference on Wired /Wireless Internet Communications. [84] A.V. Oppenheim and J.S. Lim. The importance of phase in signals. Proc. of the IEEE, 69(5), May 1981. [85] A. Pal, K. Shanmugasundaram, and N. Memon. Automated reassembly of fragmented images. In Proc. of International Conference on Multimedia and Expo, 2003. [86] MPEG-21 Part-4. Intellectual Property Management and Protection (working document). 2001.


[87] S. Paul. Multicast on the Internet and its applications. Kluwer academic Publishers, 1998. [88] A. Perrig, D. Song, and J. D. Tygar. ELK, a new protocol for e?cient largegroup key distribution. In Proceedings of the IEEE Symposium on Security and Privacy, pages 247–262, 2001. [89] D. Pescovitz. Robugs: smart dust http://www.coe.berkeley.edu/labnotes/0903/pister.html. has legs.

[90] L.-M. Po and W.-C. Ma. A novel four-step search algorithm for fast block motion estimation. IEEE Transactions on Circuits and Systems for Video Technology, 6(3):313–317, June 1996.
[91] S. Poduri and G. S. Sukhatme. Constrained coverage for mobile sensor networks. In Proc. of the IEEE Inter. Conf. on Robotics and Automation, Apr. 2004.
[92] J. G. Proakis. Digital Communications. McGraw-Hill, fourth edition, 2001.
[93] A. Puri and T. Chen. Multimedia Systems, Standards, and Networks. Marcel Dekker, first edition, 2000.
[94] L. Qiao and K. Nahrstedt. Comparison of MPEG encryption algorithms. Inter. Journal on Computers & Graphics, 22(3), 1998.
[95] H. M. Radha, M. van der Schaar, and Y. Chen. The MPEG-4 fine-grained scalable video coding method for multimedia streaming over IP. IEEE Transactions on Multimedia, 3(1):53–68, Mar. 2001.
[96] T. S. Rappaport. Wireless Communications: Principles and Practice. Prentice Hall, 1996.
[97] T. S. Rappaport. Wireless Communications: Principles and Practice. Prentice Hall, 1996.
[98] O. Rousseaux, G. Leus, P. Stoica, and M. Moonen. Gaussian maximum-likelihood channel estimation with short training sequences. IEEE Trans. on Wireless Communications, 4(6):2945–2955, Nov. 2005.
[99] A. Sendonaris, E. Erkip, and B. Aazhang. User cooperation diversity – part I: System description. IEEE Trans. on Communications, 51(11):1927–1938, Nov. 2003.
[100] A. Sendonaris, E. Erkip, and B. Aazhang. User cooperation diversity – part II: Implementation aspects and performance analysis. IEEE Trans. on Communications, 51(11):1939–1948, Nov. 2003.


[101] C. E. Shannon. Communication theory of secrecy systems. Bell System Technical Journal, 1949.
[102] J. M. Shapiro. Embedded image coding using zerotrees of wavelet coefficients. IEEE Transactions on Signal Processing, 41(12):3445–3462, 1993.
[103] G. Song and Y. Li. Cross-layer optimization for OFDM wireless networks - part I: theoretical framework. IEEE Transactions on Wireless Communications, 4(2):614–624, Mar. 2005.
[104] M. Steiner, G. Tsudik, and M. Waidner. Diffie-Hellman key distribution extended to group communication. In Proceedings of the 3rd ACM Conference on Computer and Communications Security, pages 31–37. ACM Press, 1996.
[105] M. Steiner, G. Tsudik, and M. Waidner. CLIQUES: a new approach to group key agreement. In Proceedings of the 18th International Conference on Distributed Computing Systems, pages 380–387, 1998.
[106] B. Sun, W. Trappe, Y. Sun, and K. J. R. Liu. A time-efficient contributory key agreement scheme for secure group communications. In Proceedings of the IEEE International Conference on Communications, pages 1159–1163, 2002.
[107] Y. Sun, W. Trappe, and K. J. R. Liu. A scalable multicast key management scheme for heterogeneous wireless networks. IEEE/ACM Transactions on Networking, 12(4):653–666.
[108] A. Swaminathan, Y. Mao, and M. Wu. Image hashing resilient to geometric and filtering operations. In Proc. of the Multimedia Signal Processing Workshop, Oct. 2004.
[109] A. Swaminathan, Y. Mao, and M. Wu. Security of feature extraction in image hashing. In Proc. of IEEE ICASSP, Mar. 2005.
[110] A. Swaminathan, Y. Mao, and M. Wu. Robust and secure image hashing. IEEE Transactions on Information Forensics and Security, 1(2):215–230, June 2006.
[111] A. S. Tanenbaum. Computer Networks. Prentice Hall, 2003.
[112] L. Tang. Methods for encrypting and decrypting MPEG video data efficiently. In Proceedings of the 4th ACM Inter. Conf. on Multimedia, pages 219–229, Nov. 1996.
[113] V. Tarokh, H. Jafarkhani, and A. R. Calderbank. Space-time block codes from orthogonal designs. IEEE Transactions on Information Theory, 45(5):1456–1467, July 1999.


[114] W. Trappe, Y. Wang, and K. J. R. Liu. Resource-aware conference key establishment for heterogeneous networks. IEEE/ACM Transactions on Networking, 13(1):134–146, Feb. 2005.
[115] W. Trappe and L. C. Washington. Introduction to Cryptography with Coding Theory. Prentice Hall, first edition, 2001.
[116] MBone user activity data. ftp://ftp.cc.gatech.edu/people/kevin/release-data, March 2003.

[117] R. Venkatesan, S. M. Koon, M. H. Jakubowski, and P. Moulin. Robust image hashing. In Proceedings of the IEEE International Conference on Image Processing (ICIP), pages 664–666, Sep. 2000.
[118] M. Waldvogel, G. Caronni, D. Sun, N. Weiler, and B. Plattner. The VersaKey framework: Versatile group key management. IEEE Journal on Selected Areas in Communications, pages 1614–1631, Sept. 1999.
[119] D. Wallner, E. Harder, and R. Agee. Key management for multicast: Issues and architecture. Internet-Draft draft-wallner-key-arch-00.txt, June 1997.
[120] G. Wang, G. Cao, and T. L. Porta. Movement-assisted sensor deployment. In Proc. of the 2004 IEEE INFOCOM, Mar. 2004.
[121] Y. Wang, S. Wenger, J. Wen, and A. Katsaggelos. Error resilient video coding techniques. IEEE Signal Processing Magazine, July 2000.
[122] Y. Wang, Q.-F. Zhu, and L. Shaw. Maximally smooth image recovery in transform coding. IEEE Transactions on Communications, 41(10):1544–1551, Oct. 1993.
[123] Z. Wang and A. C. Bovik. A universal image quality index. IEEE Signal Processing Letters, 9(3):81–84, March 2002.
[124] Z. Wang, L. Lu, and A. C. Bovik. Video quality assessment based on structural distortion measurement. Signal Processing: Image Communication, 19(1), Jan. 2004.
[125] S. Wee and J. Apostolopoulos. Secure scalable streaming enabling transcoding without decryption. In Proc. of the IEEE Inter. Conf. on Image Processing, Oct. 2001.
[126] J. Wen, M. Muttrell, and M. Severa. Access control of standard video bitstreams. In Proc. of the Inter. Conf. on Media Future, May 2001.


[127] J. Wen, M. Severa, W. Zeng, M. H. Luttrell, and W. Jin. A format-compliant configurable encryption framework for access control of video. IEEE Trans. on Circuits & Systems for Video Technology, 12(6):545–557, June 2002.
[128] R. Williams. The Geometrical Foundations of Natural Structure: a Book of Design. Dover Publications, 1979.
[129] C. K. Wong, M. Gouda, and S. S. Lam. Secure group communications using key graphs. IEEE/ACM Transactions on Networking, 8(1):16–30, Feb. 2000.
[130] C.-P. Wu and C.-C. Kuo. Efficient multimedia encryption via entropy codec design. In Proc. of the SPIE Inter. Symposium on Electronic Imaging, Jan. 2001.
[131] M. Wu, R. Joyce, H.-S. Wong, L. Guan, and S.-Y. Kung. Dynamic resource allocation via video content and short-term traffic statistics. IEEE Trans. on Multimedia, Jan. 2001.
[132] M. Wu and Y. Mao. Communication-friendly encryption of multimedia. In Proc. of the IEEE Multimedia Signal Processing Workshop (MMSP'02), Dec. 2002.
[133] T.-L. Wu and S. F. Wu. Selective encryption and watermarking of MPEG video. In Proc. of the Inter. Conf. on Image Science, Systems, and Technology, 1997.
[134] W. Xu, W. Trappe, Y. Zhang, and T. Wood. The feasibility of launching and detecting jamming attacks in wireless networks. In Proc. of ACM MobiHoc'05, May 2005.
[135] W. Xu, T. Wood, W. Trappe, and Y. Zhang. Channel surfing and spatial retreats: Defenses against wireless denial of service. In Proc. of ACM WiSe'04, Oct. 2004.
[136] N. Yeadon, F. Garcia, D. Hutchison, and D. Shepherd. Continuous media filters for heterogeneous internetworking. In Proceedings of SPIE, Multimedia Computing and Networking (MMCN'96), Jan. 1996.
[137] B.-L. Yeo and B. Liu. Rapid scene analysis on compressed video. IEEE Trans. on Circuits & Systems for Video Technology, 5(6):533–544, Dec. 1995.
[138] P. Yin, A. Vetro, B. Liu, and H. Sun. Drift compensation for reduced spatial resolution transcoding. IEEE Trans. on Circuits & Systems for Video Technology, 12(11):1009–1020, Nov. 2002.
[139] W. Yu and K. J. R. Liu. Attack-resistant cooperation stimulation in autonomous ad hoc networks. IEEE Journal on Selected Areas in Communications, 23(12):2260–2271, Dec. 2005.


[140] C. Yuan, B. Zhu, Y. Wang, S. Li, and Y. Zhong. Efficient and fully scalable encryption for MPEG-4 FGS. In Proc. of the IEEE Inter. Symp. on Circuits and Systems, May 2003.
[141] W. Zeng and S. Lei. Efficient frequency domain video scrambling for content access control. In Proc. of ACM Multimedia, Nov. 1999.
[142] S. Zhong, J. Chen, and Y. R. Yang. Sprite: A simple, cheat-proof, credit-based system for mobile ad-hoc networks. In Proc. of IEEE INFOCOM, pages 1987–1997, 2003.
[143] Y. Zhou, Y. Zhang, and Y. Fang. LLK: a link-layer key establishment scheme for wireless sensor networks. In Proc. of the IEEE Wireless Communications and Networking Conference, Mar. 2005.
[144] S. Zhu, S. Setia, and S. Jajodia. Performance optimizations for group key management schemes. In Proceedings of the 23rd International Conference on Distributed Computing Systems, pages 163–171, 2003.
[145] Y. Zou and K. Chakrabarty. Sensor deployment and target localization based on virtual forces. In Proc. of the 2003 IEEE INFOCOM, April 2003.



