Efficient Insights Discovery through Conditional Generative Model based Query Approximation

International Conference on Management of Data (SIGMOD)

Publication date: June 12, 2022

Vibhor Porwal, Subrata Mitra, Fan Du, John Anderson, Nikhil Sheoran, Anup Rao, Tung Mai, Gautam Kowshik, Sapthotharan Nair, Sameeksha Arora, Saurabh Mahapatra

Many scenarios require insights to be extracted from massive amounts of data very quickly and under time-critical conditions. These may be fresh insights, or a re-examination of why previous insights did not work and how to fix them. A marketing campaign is one real-world scenario in which a non-programmer needs to dig through such large data in a very short period of time (a few hours) in order to hit a revenue target. In this demo paper, we describe Electra, a system that integrates an automated data-insight discovery mechanism with a novel machine-learning (ML) driven approximate query processing (AQP) engine that can answer complex queries containing a large number of predicates or conditions with high accuracy. The AQP engine uses a conditional generative model to generate a very small sample (~1000 rows) corresponding to the actual query to be answered, and computes a highly accurate approximate answer from that sample instead of running the query against the original data. The insight discovery workflow bootstraps insights using ML algorithms based on the statistical characteristics of the data, and further offers a no-code interface to drill down for deeper insights. Queries from this interface are answered by the AQP engine, which runs locally on the client side to offer low-latency interactions.
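
The core idea in the abstract, answering a query from a small generated sample rather than from the original data, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not describe Electra's model or API, so `ToyConditionalGenerator` is a hypothetical stand-in that stores a per-condition mean and standard deviation and samples from a Gaussian, and the `country`, `device`, and `revenue` columns and the helper names are invented. Only the AVG aggregate is shown.

```python
import random
import statistics
from collections import defaultdict


class ToyConditionalGenerator:
    """Hypothetical stand-in for a conditional generative model.

    For every combination of predicate values it keeps the mean and
    standard deviation of one numeric column, then samples new values
    from a Gaussian. This is NOT the model used by Electra.
    """

    def fit(self, rows, condition_keys, target):
        groups = defaultdict(list)
        for row in rows:
            groups[tuple(row[k] for k in condition_keys)].append(row[target])
        self.condition_keys = list(condition_keys)
        self.params = {
            cond: (statistics.fmean(vals), statistics.pstdev(vals))
            for cond, vals in groups.items()
        }
        return self

    def sample(self, condition, n, rng):
        # Generate n synthetic values for rows matching the predicates.
        key = tuple(condition[k] for k in self.condition_keys)
        mean, std = self.params[key]
        return [rng.gauss(mean, std) for _ in range(n)]


def approximate_avg(generator, condition, n=1000, seed=0):
    """Answer AVG(target) WHERE <condition> from n generated rows only."""
    rng = random.Random(seed)
    return statistics.fmean(generator.sample(condition, n, rng))


def exact_avg(rows, condition, target):
    """Reference answer: scan the original data."""
    matches = [r[target] for r in rows
               if all(r[k] == v for k, v in condition.items())]
    return statistics.fmean(matches)


def make_toy_data(n_rows, seed=1):
    """Invented data set used only to exercise the sketch."""
    rng = random.Random(seed)
    base = {("US", "mobile"): 20.0, ("US", "desktop"): 35.0,
            ("IN", "mobile"): 10.0, ("IN", "desktop"): 15.0}
    rows = []
    for _ in range(n_rows):
        country = rng.choice(["US", "IN"])
        device = rng.choice(["mobile", "desktop"])
        revenue = rng.gauss(base[(country, device)], 5.0)
        rows.append({"country": country, "device": device, "revenue": revenue})
    return rows


if __name__ == "__main__":
    data = make_toy_data(20000)
    model = ToyConditionalGenerator().fit(data, ["country", "device"], "revenue")
    query = {"country": "US", "device": "mobile"}
    print("exact :", exact_avg(data, query, "revenue"))
    print("approx:", approximate_avg(model, query))
```

Once fitted, the toy model answers the query without touching the original rows, which is what would allow an engine of this kind to run on the client side. A real system would also need to handle other aggregates and predicate types, which this sketch does not attempt.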
