Security and Deduplication in the Cloud
Danny Harnik, IBM Research
The talk will discuss security and privacy issues that arise from
deduplication in the cloud. Deduplication is a popular form of compression
for large storage systems in which duplicate copies of files are replaced
by links to a single copy. This technique is also used to reduce the
bandwidth of incoming data to a storage cloud and has a significant effect
on the cost of maintaining such clouds (especially when deployed across
different users).
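
As a rough illustration of the mechanism described above, the following is a minimal sketch of whole-file, cross-user deduplication. The class name DedupStore, the use of SHA-256 as the file fingerprint, and the in-memory dictionaries are all assumptions made for this example; they are not taken from the talk or from any particular storage product.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store (hypothetical, for illustration only).

    Identical files uploaded by any user share one stored copy; each
    user's file name is just a link (the digest) to that copy.
    """

    def __init__(self):
        self.blobs = {}          # digest -> file bytes (single physical copy)
        self.links = {}          # (user, filename) -> digest
        self.bytes_received = 0  # bytes that actually had to be transferred

    def upload(self, user, name, data):
        """Store a file and return True if it was deduplicated."""
        digest = hashlib.sha256(data).hexdigest()
        duplicate = digest in self.blobs
        if not duplicate:
            # First copy of this content: keep it and count the transfer.
            self.blobs[digest] = data
            self.bytes_received += len(data)
        # In both cases the user's name is recorded as a link to the copy.
        self.links[(user, name)] = digest
        return duplicate


store = DedupStore()
first = store.upload("alice", "report.pdf", b"quarterly numbers")
second = store.upload("bob", "copy.pdf", b"quarterly numbers")
```

In this sketch the second upload is recognized as a duplicate even though it comes from a different user, so only one copy is stored and only one transfer is counted. That saving is the reason cross-user deduplication is attractive to a storage provider.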
We study the privacy implications of cross-user deduplication. We
demonstrate how deduplication can be used as a side channel which reveals
information about the contents of files of other users, as a covert channel
by which malicious software can communicate with its control center, or as
a method to retrieve files about which you have only partial information.
In our work we propose mechanisms that enable cross-user deduplication
while ensuring meaningful privacy guarantees.
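
The side channel mentioned above can be sketched as follows, under the assumption that a client can observe whether its upload was deduplicated (for example, because the file did not need to be transferred). The store class, the document template, and the 4-digit secret are hypothetical and exist only to make the example concrete; they are not the setup used in the work itself.

```python
import hashlib


class CrossUserDedupStore:
    """Hypothetical store that deduplicates across all users."""

    def __init__(self):
        self.digests = set()  # fingerprints of every file stored so far

    def upload(self, data):
        """Return True if the content was already stored by anyone."""
        digest = hashlib.sha256(data).hexdigest()
        already_stored = digest in self.digests
        self.digests.add(digest)
        return already_stored


def guess_secret(store, template, candidates):
    """Probe the store with candidate files until one is deduplicated.

    The attacker knows the file up to one unknown field and tries each
    possible value. Note that every probe is itself stored, so the
    probing is not invisible to the store.
    """
    for candidate in candidates:
        probe = template.format(candidate).encode()
        if store.upload(probe):
            return candidate
    return None


store = CrossUserDedupStore()
template = "Dear customer, your PIN is {}."  # assumed known to the attacker

# The victim uploads a document containing a secret 4-digit value.
store.upload(template.format("4821").encode())

# The attacker enumerates all 4-digit values and watches for deduplication.
recovered = guess_secret(store, template, (f"{i:04d}" for i in range(10000)))
```

Here the attacker never reads the victim's file directly. The only signal used is whether each probe was deduplicated, which is enough to recover the unknown field when the set of possible values is small.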
Based on joint work with Shai Halevi, Benny Pinkas, and Alexandra Shulman-Peleg.