Enabling search by experience


Today's online shopping systems enable consumers to sift through a vast amount of information by manipulating combinations of predefined filters. These filters, such as travel dates, price range, and location, are objective attributes that lead to an indisputable set of answers. However, we show that users' search criteria are often subjective and experientially expressed. Hence, to provide consumers with an enhanced search experience, online shopping systems should directly support both subjective and objective search. I will describe how this is done in an experiential search engine that we are currently developing at Megagon Labs: by harnessing information "outside the box", in the text of online reviews or social media, and by interpreting subjective queries.

Speaker Bio

Wang-Chiew Tan leads the research efforts at Megagon Labs. Prior to joining Megagon Labs, she was a professor of Computer Science at the University of California, Santa Cruz and she also spent two years at IBM Almaden Research Center. Her research interests include data provenance, data integration, and very recently, natural language processing. She is the recipient of an NSF CAREER award, a Google Faculty Award, and an IBM Faculty Award. She is a co-recipient of the 2014 ACM PODS Alberto O. Mendelzon Test-of-Time Award and a co-recipient of the 2018 ICDT Test-of-Time Award. She was the program committee chair of ICDT 2013 and PODS 2016. She is currently on the VLDB Board of Trustees and she is a Fellow of the ACM.

Schemas for graphs and other forms of semi-structured data

Juan Reutter

Semistructured data is on the rise: graph databases are now offered by most major database vendors, and JSON is used everywhere on the web. And while these data paradigms are commonly described as “having less structure than relational databases”, the industry is also recognising the advantages of pairing the data with some notion of schema, in the form of metadata that describes both what is in the database and how the data is structured.

In this talk we focus on the SHACL/ShEx proposal for RDF graph data and JSON Schema for JSON data, two of the most widely adopted proposals for semi-structured data. Interestingly, even though these proposals emerged in two completely different worlds, we show that they share the same foundations and the same spirit. We will also discuss the challenges that arise from features the community has demanded in these proposals, which give rise to a number of interesting open problems.

Speaker Bio

Juan works as an assistant professor in the Department of Computer Science of Pontificia Universidad Católica de Chile, and as an associate researcher in the Millennium Institute for Foundational Research on Data. He received his PhD from the University of Edinburgh in 2013. His research interests are in the mathematical foundations of data management, web data, and data science. He received the Cor Baayen Young Researcher Award from the European Research Consortium for Informatics and Mathematics and a BCS Distinguished Dissertation Award for his thesis entitled “Graph Patterns: Structure, Query Answering and Applications in Schema Mappings and Formal Language Theory”. His research has also been recognized with the Best Paper Award at PODS 2011.

From complete to incomplete data and back in ontology-enriched databases

Magdalena Ortiz

Enriching a database with a background theory expressing domain knowledge, usually called an ontology, has been proposed as a tool to overcome the incompleteness of data. In ontology-mediated querying, the theory is used to infer answers that may involve implied facts not present in the data. This and other related reasoning problems have been extensively studied over the last decade, mostly for ontologies written in description logics and in dialects of Datalog±. But the usual first-order semantics used in this setting, which assumes that all data is incomplete, can sometimes be too weak and fail to give all expected answers. I will discuss some alternatives that have been explored for combining complete and incomplete data in the presence of description logic ontologies, and the challenges that they pose, including increased computational complexity of reasoning and non-monotonicity of the ontology-mediated query languages they induce. I will discuss a few interesting reasoning problems that arise in these settings, and some translations from these query languages into variants of Datalog.

Speaker Bio

Magdalena Ortiz studied computer science in Mexico before moving to Europe to study computational logic in Italy and Austria. She is an assistant professor for Knowledge Representation and Reasoning at the Vienna University of Technology, where she works on the boundary between artificial intelligence and databases. Most of her research aims at using knowledge to make data-centric systems smarter and more reliable, especially using formalisms based on description logics.

Query Optimization by Quantifier Elimination

Christoph Koch

Many of us who teach database query languages have seen creative students who lack training in formal logic come up with surprising ways of using aggregation to express challenging queries in SQL: ways that do not feel natural to those trained in logic but that nevertheless command admiration. In this talk, I show how quantifier elimination can be used to optimize SQL queries in surprising ways, with results that coincide with and generalize these apparently creative tricks. The new query optimization technique, apart from being potentially useful for practical query engines, suggests a particular way in which the logically untrained mind synthesizes queries (not quantifier elimination, though). This observation is at best based on an amateur’s understanding of brain science, but it is potentially still useful for teaching databases.

Speaker Bio

Christoph Koch is a professor of Computer Science at EPFL, specializing in data management. Until 2010, he was an Associate Professor in the Department of Computer Science at Cornell University. Before that, from 2005 to 2007, he was an Associate Professor of Computer Science at Saarland University. Earlier, he obtained his PhD in Artificial Intelligence from TU Vienna and CERN (2001), was a postdoctoral researcher at TU Vienna and the University of Edinburgh (2001-2003), and was an assistant professor at TU Vienna (2003-2005). He obtained his Habilitation degree in 2004. He has won Best Paper Awards at PODS 2002, ICALP 2005, SIGMOD 2011, VLDB 2014, and GPCE 2017, an Outrageous Ideas and Vision Paper Award at CIDR 2013, a Google Research Award (in 2009), and an ERC Grant (in 2011). He (co-)chaired the program committees of DBPL 2005, WebDB 2008, ICDE 2011, VLDB 2013, and PODS 2019, and was PC vice-chair of ICDE 2008 and ICDE 2009. He has served on the editorial board of ACM Transactions on Internet Technology and as Editor-in-Chief of PVLDB.
