Volume 18, No. 12

Unify: A System For Unstructured Data Analytics

Authors:
Jiayi Wang, Yuan Li, Jianming Wu, Shihui Xu, Guoliang Li

Abstract

Unstructured data comprises over 80% of today’s information, yet no specialized system effectively supports semantic analytics over it. Traditional SQL-based approaches rely on predefined schemas, which makes them unsuitable for such data. While large language models (LLMs) enable semantic analysis of unstructured data, manually orchestrating execution plans remains inefficient. This raises a critical question: how can we automate unstructured data analytics? In this demonstration, we present Unify, a system that automates unstructured data analytics for natural language queries. Unify defines a set of core operators for unstructured data processing, each with preprogrammed and LLM-based implementations. It guides LLMs to decompose queries into logical steps and to map each step to an appropriate operator for accurate execution. Our demonstration showcases Unify through real-world scenarios, highlighting its ability to bridge the gap between unstructured data and actionable analytics.
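The operator-and-plan idea described in the abstract can be illustrated with a minimal Python sketch. This is not Unify's code or API: the names `Operator`, `OPERATORS`, `plan`, `execute`, and `fake_llm_judge` are hypothetical, the "LLM" is a keyword-matching stub, and the planner returns a hard-coded two-step plan where a real system would ask an LLM to decompose the query.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple


def fake_llm_judge(text: str, condition: str) -> bool:
    """Hypothetical stand-in for an LLM call: case-insensitive keyword check."""
    return condition.lower() in text.lower()


@dataclass
class Operator:
    """A core operator with an optional preprogrammed and LLM-based variant."""
    name: str
    preprogrammed: Optional[Callable[[Any, Any], Any]] = None
    llm_based: Optional[Callable[[Any, Any], Any]] = None

    def run(self, data: Any, arg: Any = None, use_llm: bool = False) -> Any:
        # Prefer the requested variant, fall back to whichever one exists.
        if use_llm and self.llm_based is not None:
            impl = self.llm_based
        else:
            impl = self.preprogrammed or self.llm_based
        if impl is None:
            raise ValueError(f"operator {self.name!r} has no implementation")
        return impl(data, arg)


# Assumed operator set, chosen for illustration only.
OPERATORS = {
    "filter": Operator(
        "filter",
        preprogrammed=lambda docs, kw: [d for d in docs if kw in d],
        llm_based=lambda docs, cond: [d for d in docs if fake_llm_judge(d, cond)],
    ),
    "count": Operator("count", preprogrammed=lambda docs, _: len(docs)),
}

Step = Tuple[str, Any, bool]  # (operator name, argument, use LLM variant?)


def plan(query: str) -> List[Step]:
    """Stand-in for LLM-guided decomposition: a fixed filter-then-count plan.

    The filter condition is simply the last word of the query.
    """
    topic = query.rstrip("?").split()[-1]
    return [("filter", topic, True), ("count", None, False)]


def execute(steps: List[Step], docs: List[str]) -> Any:
    """Run each logical step in order, feeding its output to the next step."""
    result: Any = docs
    for name, arg, use_llm in steps:
        result = OPERATORS[name].run(result, arg, use_llm)
    return result


reviews = ["Battery life is great", "Screen is dim", "battery drains fast"]
answer = execute(plan("How many reviews mention battery?"), reviews)
print(answer)
```

In this sketch the two variants of `filter` behave differently: the preprogrammed one does an exact, case-sensitive substring match, while the stubbed LLM variant matches regardless of case. That difference is only a toy proxy for the trade-off between cheap deterministic operators and semantic, model-backed ones.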

PVLDB is part of the VLDB Endowment Inc.
