Abstract
Private data sometimes must be made public. A corporation may keep its customer sales data secret but reveal totals by sector for marketing reasons. A hospital keeps individual patient data secret but might reveal outcome information about the treatment of particular illnesses over time to support epidemiological studies. In these and many other situations, aggregate or partial data is revealed while other data remains private. Moreover, the aggregate data may depend not only on private data but also on public data, e.g., commodity prices or general health statistics. Our GhostDB platform allows queries that combine private and public data, deliver aggregates to data warehouses for OLAP purposes, and reveal exactly what is desired, neither more nor less. We call this functionality “revelation on demand”.
Additional information
Communicated by: Ladjel Bellatreche.
About this article
Cite this article
Anciaux, N., Benzine, M., Bouganim, L. et al. Revelation on demand. Distrib Parallel Databases 25, 5–28 (2009). https://doi.org/10.1007/s10619-009-7035-x
DOI: https://doi.org/10.1007/s10619-009-7035-x