PODS - Accepted Research Papers
Get the Most out of Your Sample: Optimal Unbiased Estimators using Partial Information
Edith Cohen and Haim Kaplan
On Provenance Minimization
Yael Amsterdamer, Daniel Deutch, Tova Milo and Val Tannen
Finding a Minimal Tree Pattern Under Neighborhood Constraints
Benny Kimelfeld and Yehoshua Sagiv
Maximizing Conjunctive Views in Deletion Propagation
Benny Kimelfeld, Jan Vondrak and Ryan Williams
Rewrite Rules for Search Database Systems
Ronald Fagin, Benny Kimelfeld, Yunyao Li, Sriram Raghavan and Shivakumar Vaithyanathan
Space-efficient Substring Occurrence Estimation
Alessio Orlandi and Rossano Venturini
Provenance for Aggregate Queries
Yael Amsterdamer, Daniel Deutch and Val Tannen
Finding Skylines in External Memory: Worst-case Efficient, Finally
Cheng Sheng and Yufei Tao
Beyond Simple Aggregates: Indexing for Summary Queries
Zhewei Wei and Ke Yi
New Results on Two-dimensional Orthogonal Range Aggregation in External Memory
Cheng Sheng and Yufei Tao
The complexity of text-preserving XML transformations
Timos Antonopoulos, Wim Martens and Frank Neven
Relational transducers for declarative networking
Tom Ameloot, Frank Neven and Jan Van den Bussche
On the Complexity of Privacy-Preserving Complex Event Processing
Yeye He, Siddharth Barman, Di Wang and Jeffrey Naughton
Querying Graph Patterns
Pablo Barceló, Leonid Libkin and Juan L. Reutter
Incomplete Information and Certain Answers in General Data Models
Determining Relevance of Accesses at Runtime
Michael Benedikt, Georg Gottlob and Pierre Senellart
A rule-based language for Web data management
Serge Abiteboul, Meghyn Bienvenu, Alban Galland and Emilien Antoine
Efficient evaluation for a temporal logic on changing XML documents
Mikolaj Bojanczyk and Diego Figueira
Determining the Currency of Data
Wenfei Fan, Floris Geerts and Jef Wijsen
Provenance Views for Module Privacy
Susan Davidson, Sanjeev Khanna, Tova Milo, Debmalya Panigrahi and Sudeepa Roy
Parallel Evaluation of Conjunctive Queries
Paraschos Koutris and Dan Suciu
Cheng Sheng and Yufei Tao
Data Exchange beyond Complete Data
Marcelo Arenas, Jorge Pérez and Juan L. Reutter
Pan-Private Algorithms Via Statistics on Sketches
Darakhshan Mir, S. Muthukrishnan, Aleksandar Nikolov and Rebecca Wright
Tight Bounds for Lp Samplers, Finding Duplicates in Streams, and Related Problems
Hossein Jowhari, Mert Saglam and Gabor Tardos
SIGMOD Contributions Award
For strengthening and humanizing the database community by originating and developing the "Distinguished Profile in Database Research" series.
Marianne Winslett (University of Illinois)
Marianne Winslett has been a professor in the Department of Computer Science at the University of Illinois since 1987. She is an ACM Fellow and the recipient of a Presidential Young Investigator Award from the US National Science Foundation. She is the former vice-chair of ACM SIGMOD and has served on the editorial boards of ACM Transactions on the Web, ACM Transactions on Database Systems, IEEE Transactions on Knowledge and Data Engineering, ACM Transactions on Information and Systems Security, and the Very Large Data Bases Journal. She has received two best paper awards for research on managing regulatory compliance data (VLDB, SSS), one best paper award for research on analyzing browser extensions to detect security vulnerabilities (Usenix Security), and one for keyword search (ICDE). Her PhD is from Stanford University.
SIGMOD Test-of-Time Award
Executing SQL over Encrypted Data in the Database-Service-Provider Model
Hakan Hacigumus, Bala Iyer, Chen Li, Sharad Mehrotra
This paper from the SIGMOD 2002 Conference remarkably anticipated the world of "Database as a Service", which did come about and continues to grow in importance. To get a sense of how visionary the work was, consider that this paper was published in June 2002 (and thus accepted in Jan 2002), even a couple of months before Amazon EC2 and S3 services were launched (of course, Amazon RDS and SQL Azure came much later). The core of the paper focuses on the challenges of how to leverage cloud services while keeping some of the information (at the discretion of the enterprise/user) hidden from the service provider. Beyond the specific algorithmic details, the key contribution is the framework: (i) the introduction of a mapping function, and (ii) query splitting logic that determines how the work can be distributed across cloud and client when some information is encrypted. Is this framework used by enterprises today? As best as we can tell, the answer is perhaps no. But is the framework interesting, and does it have real possibilities of adoption, further impact, and more follow-on work by the research community? Absolutely. In summary, this paper is one of the early papers to foresee the world of Database as a Service (before any of us was working on that problem). The specific technical focus was treated in reasonable depth. The impact of the technical focus has not yet been seen by the industry, but this paper has the possibility of inspiring much more follow-on work/thinking (beyond the 140+ citations it already has in the ACM Digital Library).
Hakan Hacigumus is the head of Data Management Research at NEC Labs America. His current interests include data management in the cloud, big data, data analytics, mobility, and service-oriented business models. Prior to NEC Labs, he was a researcher at IBM Almaden Research Center, where he worked on a wide range of areas in data management and services research. He received his Ph.D. in Computer Science from the University of California, Irvine.
Balakrishna (Bala) Iyer works for IBM as a Distinguished Engineer for Database Technology. He earned his B.Tech from IIT Bombay and his MS and PhD degrees from Rice University. He previously worked for Bell Labs, Murray Hill, NJ. Bala has made contributions to the database field in the areas of temporal data, database as a service, compression, sorting, query processing, data mining, and encoded vector representation and processing. Many of his innovations are used every day, having been incorporated in IBM's data management products like VSAM, IMS, DB2 and IBM Intelligent Miner, and products from other leading vendors. His work on the temporal data model led to the standardization of temporal function in SQL 2011.
Chen Li is an associate professor in the Department of Computer Science at the University of California, Irvine. He received his Ph.D. degree in Computer Science from Stanford University in 2001, and his M.S. and B.S. in Computer Science from Tsinghua University, China, in 1996 and 1994, respectively. He received a National Science Foundation CAREER Award in 2003 and many other NSF grants and industry gifts. He was once a part-time Visiting Research Scientist at Google.
His research interests are in the fields of data management and information search, including text search, data-intensive computing, and data integration. He is the founder of Bimaple Technology Inc., a company providing powerful search for enterprises and developers.
Sharad Mehrotra is a Professor in the School of Information and Computer Science at the University of California, Irvine and founding Director of the Center for Emergency Response Technologies (CERT) at UCI. From 2002 to 2009 he served as the Director and PI of the RESCUE project (Responding to Crisis and Unexpected Events) which, funded by NSF through its large ITR program, spanned 7 schools and consisted of 60 members. He received the Outstanding Graduate Student Mentor Award in 2005. Prior to joining UCI, he was a member of the faculty at the University of Illinois, Urbana Champaign in the Department of Computer Science, where he was the recipient of the C. W. Gear Outstanding Junior Faculty Award. Mehrotra also served as a Scientist at Matsushita Information Technology Laboratory immediately after graduating with a Ph.D. from the University of Texas at Austin (1988-1993).
Mehrotra's research expertise is in the data management and distributed systems areas, in which he has made many pioneering contributions. Two such contributions are the concept of "database as a service" and the "use of information retrieval techniques, particularly relevance feedback, in multimedia search". Mehrotra is a recipient of numerous best paper nominations and awards, including the SIGMOD Best Paper award in 2001 for a paper entitled "Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases", and the best paper award at DASFAA 2004 for the paper entitled "Efficient Execution of Aggregation Queries over Encrypted Databases". Another of his papers, entitled "Concurrency Control in Hierarchical Multidatabase System", was selected as one of the best VLDB 1994 submissions invited for the VLDB Journal. Mehrotra's recent research focuses on data quality and data privacy, particularly in the context of cloud computing and sensor-driven situational awareness systems.
SIGMOD Best Paper Award
High-Performance Complex Event Processing over XML Streams
Barzan Mozafari, University of California, Los Angeles; Kai Zeng, University of California, Los Angeles; Carlo Zaniolo, University of California, Los Angeles
Much research attention has been given to delivering high-performance systems that are capable of complex event processing (CEP) in a wide range of applications. However, many current CEP systems focus on efficiently processing data with a simple structure, and are otherwise limited in their ability to efficiently support complex continuous queries on structured or semi-structured information. Yet XML streams represent a very popular form of data exchange, comprising large portions of social network and RSS feeds, financial records, configuration files, and similar applications requiring advanced CEP queries. In this paper, we present the XSeq language and system that support CEP on XML streams, via an extension of XPath that is both powerful and amenable to an efficient implementation. Specifically, the XSeq language extends XPath with natural operators to express sequential and Kleene-* patterns over XML streams, while remaining highly amenable to efficient implementation. XSeq is designed to take full advantage of recent advances in the field of Visibly Pushdown Automata (VPA), where higher expressive power can be achieved without compromising efficiency (whereas the amenability to efficient implementation was not demonstrated in XPath extensions previously proposed). We illustrate XSeq's power for CEP applications through examples from different domains, and provide formal results on its expressiveness and complexity. Finally, we present several optimization techniques for XSeq queries. Our extensive experiments indicate that XSeq brings outstanding performance to CEP applications: two orders of magnitude improvement are obtained over the same queries executed in general-purpose XML engines.
Barzan Mozafari is currently a Postdoctoral Associate at the Massachusetts Institute of Technology. He earned his PhD in Computer Science from the University of California at Los Angeles, where he worked on scalable solutions for pattern discovery and detection from large volumes of data, meeting several system, language and algorithmic challenges. His research interests include distributed databases, machine learning, crowdsourcing and cloud computing.
Kai Zeng received the bachelor's degree in computer science from Zhejiang University, China, in 2009. He is currently working toward the PhD degree in database systems, under the supervision of Professor Carlo Zaniolo. He is also a research assistant. His research interests include query processing, pattern matching in data streams and massive data.
Carlo Zaniolo is a professor of Computer Science at UCLA where he occupies the N.E. Friedmann chair in Knowledge Science. His research interests include Data Stream Management Systems, Data Mining, Logic Based Languages, and Web Information Systems.
Edgar F. Codd Innovations Award
For innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases.
Bruce Lindsay has been a leader and inventor in many of the key systems initiatives in the data management field. As a member of the original System R team, the R* project, the Starburst project, and then several content management projects, Bruce has created fundamental technologies in a broad set of database areas, including core relational databases (authorization, high performance transactions, locking and deadlock detection), extensible databases (object management, type management, production rules for query processing), distributed databases (snapshots, distributed DDL, presumed commit, presumed abort, distributed query processing), and management of unstructured data (XML, novel indexing). He thinks broadly and has an uncanny intuition for system-level issues, which has led his innovations to have a lasting impact on commercial database products.
SIGMOD Jim Gray Doctoral Dissertation Award
ACM SIGMOD is pleased to present the 2012 SIGMOD Jim Gray Doctoral Dissertation Award to F. Ryan Johnson. Johnson completed his dissertation titled "Scalable Storage Managers for the Multicore Era" at Carnegie Mellon University. Johnson's dissertation is a tour de force in identifying bottlenecks when scaling OLTP systems to many cores and proposing innovative solutions to each of them. The ideas in the thesis, such as speculative lock inheritance, new techniques for combining log requests, and data-oriented transaction execution, are highly innovative, and the work is remarkable for its breadth, depth, thorough implementation, and evaluation.
Ryan Johnson is an Assistant Professor at the University of Toronto specializing in systems aspects of database engines, particularly in the context of modern hardware. He graduated with M.S. and PhD degrees in Computer Engineering from Carnegie Mellon University in 2010, after completing a B.S. in Computer Engineering at Brigham Young University in 2004. In addition to his work with database systems, Johnson has interests in computer architecture, operating systems, compilers, and hardware design.
SIGMOD Jim Gray Doctoral Dissertation Honorable Mention
ACM SIGMOD is also pleased to recognize Bogdan Alexe for an Honorable Mention for the 2012 SIGMOD Jim Gray Doctoral Dissertation Award. Alexe completed his dissertation titled "Interactive and Modular Design of Schema Mappings" at the University of California, Santa Cruz. Alexe's dissertation makes substantial contributions to the important problem of designing schema mappings through novel principled algorithms and the first benchmark in this area.
Bogdan Alexe is a researcher at IBM Research - Almaden. His work focuses on large scale entity resolution and integration. His past research covered topics in information integration, data exchange and schema mappings. Bogdan graduated with a Ph.D. from University of California at Santa Cruz, and an M.Sc. from Ecole Polytechnique/Telecom ParisTech, both in Computer Science.