PVLDB vol 15 - Contributions


PVLDB, established in 2008, is a scholarly journal for short and timely research papers with a strict quality assurance process. PVLDB is distinguished by a monthly submission cycle with rapid reviews, and issues are published regularly throughout the year. A paper will appear in PVLDB soon after acceptance, possibly in advance of the VLDB conference. All papers accepted for Volume 15 by June 15, 2022 will form the Research Track of the VLDB 2022 conference, together with any rollover papers from Volume 14. Papers accepted to Volume 15 after June 15, 2022 will be rolled over to the VLDB 2023 conference. At least one author of each accepted paper must attend the VLDB 2022 conference. PVLDB is the only submission channel for research papers to appear in the VLDB 2022 conference. Please see the submission guidelines for paper submission instructions. The submission process for other VLDB 2022 tracks, such as demos or tutorials, is different and is described in the respective calls for papers.

New for PVLDB vol 15

Transparency and Reproducibility: Authors are expected to submit supplemental material, such as code, data and other implementation artifacts used to produce the results reported in the paper. Reviewers will have access to the supplemental material and consider it in their evaluation of the scientific quality of the contribution. If authors are not able to submit the supplemental material, they must explain why. 

Authors of accepted papers are 1) expected to include the supplementary material with the camera-ready version, which will be assigned an official ACM badge (https://www.acm.org/publications/policies/artifact-review-and-badging-current), and 2) strongly encouraged to participate in the Reproducibility Evaluation (http://vldb.org/pvldb/reproducibility) and compete for the VLDB Best Reproducible Paper Award.

Scope of PVLDB

The Proceedings of the VLDB Endowment (PVLDB) welcomes original research papers on a broad range of topics related to all aspects of data management where systems issues play a significant role, including data management system technology and information management infrastructures, their very-large-scale experimentation, novel architectures, and demanding applications, as well as their underpinning theory. The scope of a submission is further described by the subject areas given below. Moreover, the scope of PVLDB is restricted to scientific areas covered by the combined expertise of the journal's editorial board on the submission's topic. Finally, the contributions in a submission should build on work already published in data management outlets, e.g., PVLDB, VLDBJ, ACM SIGMOD, IEEE ICDE, EDBT, ACM TODS, and IEEE TKDE, and go beyond a merely syntactic citation of that work.

Four Paper Categories

There are four equally-important categories of papers in the research track:

  • Regular Research Papers (up to 12 pages excluding references)
  • Scalable Data Science Papers (up to 8 pages excluding references)
  • Experiment, Analysis & Benchmark Papers (up to 12 pages excluding references)
  • Vision Papers (up to 6 pages excluding references)

For Experiment, Analysis & Benchmark Papers, Vision Papers, and Scalable Data Science Papers, you must append the category tag as a suffix to the title of the submission, e.g., “Data Management in the Year 3000 [Vision]”; “Comparing Spatial Database Systems [Experiment, Analysis & Benchmark]”; “Data Cleaning in the Wild [Scalable Data Science]”.

Regular Research papers

PVLDB invites papers with different flavors. The primary contribution of foundations papers and algorithms papers lies in their formal underpinnings expressed through precise pseudocode or theoretical formalism. These papers are encouraged to include a prototype implementation and evaluation, including comparisons with alternate approaches, but these are typically limited to demonstrating the conceptual ideas. The primary contribution of systems papers lies in the development of novel and practical approaches. They typically have no proofs and no theoretical formalism, but include a solid prototype implementation and empirical evaluation, typically in a working system. Finally, the key contribution of information system architectures papers lies in an innovative architecture for a new type of data management system. These papers include an initial prototype implementation and evaluation, but their main contribution lies in the breadth and impact of the overall vision. The details of design goals (e.g., the class of workload to be supported), systems architecture, new abstractions, and design justifications are expected.

You may optionally append the flavor(s) of your paper as a suffix to the title, e.g., “Paper Title [Systems]”, “Paper Title [Information system architectures and Systems]”.

Scalable Data Science (SDS) papers

We solicit submissions of papers describing design, implementation, experience, or evaluation of solutions and systems for practical data science and data engineering tasks, including data management, data engineering, data analytics, data visualization, data quality, data integration, data mining, and machine learning on large-scale data.

Distinct from Regular Research papers, papers in this category need not propose new algorithms or models; instead, they emphasize solutions that either solve or advance the understanding of issues related to data science technologies in the real world. Note that SDS is not a "tech-lite" avenue to "bypass" the Regular Research category for publishing new research: it is a new avenue for research that offers other forms of valuable novelty (not just novelty of techniques) and that has more readily apparent potential for practical impact or has already had practical impact.

We seek two types of submissions: deployed solutions and evaluated solutions.

Papers about deployed solutions describe the implementation of a system that solves a significant real-world problem and is (or was) in use for an extended period of time in industry, science, medicine, education, government, nonprofit organizations, or as open source. The paper should present the problem, its significance to the application domain, the design choices for the solution, the implementation challenges, and the lessons learned from successes and failures, including post-launch performance analysis. Papers that describe enabling infrastructure for the deployment of applied machine learning also fall into this category. One example is an open-source, general-purpose entity linkage tool that takes data from any two data sources and links records that refer to the same real-world entity. Another is a low-latency system that automatically monitors online model predictions on streaming data at scale, detects concept drift, and recommends how to react.

Papers about evaluated but not necessarily deployed solutions should describe fundamental insights derived from addressing a real-world problem. This might include papers that provide significant insights into an applied area/domain or papers that provide strong baselines thoroughly tested on real data. We also encourage papers that conclude that a problem is solved under particular conditions or is infeasible with current techniques. In addition to insights, the paper should explain what milestones were reached, what the practical impact is, and (if applicable) what the obstacles to deployment are. Straightforward improvements over trivial baseline solutions tested on small datasets are unlikely to qualify. Continuing with the previous example, a paper might present an entity linkage model that applies state-of-the-art deep learning techniques and obtains high performance on several real-world datasets, showing that adaptations of recent techniques succeed in helping solve an important and practical data science problem. Similarly, a paper on a system that handles concept drift in streaming prediction applications might apply or extend recent statistical or ML approaches but demonstrate their efficacy and scalability convincingly on real-world datasets.

Submissions should be up to eight pages long, with unlimited pages for references. The papers need not cover all aspects of an application or give all details. Instead, we encourage papers with key insights supported by solid data points.

This new category helps bridge the gap between the Regular Research papers and the Industrial Track papers, especially given the fast-evolving nature of data science. In particular, it differs from the Industrial Track in both scope and the level of impact expected. This category focuses specifically on new technology for data science-oriented workloads, while the Industrial Track is more general and covers all aspects of database technology. The Industrial Track focuses on already commercialized technology, while this category also welcomes work that is not yet commercial or deployed but is still at the proof-of-concept stage, as long as it is convincingly validated and has good potential for impact. With regard to concurrent submissions, authors may not submit papers on the same work to any other category or track of VLDB, except for the Demonstrations Track.

Scalability is an important aspect of data science research at the cusp of practical impact. But scalability can refer to different axes and metrics in different data science contexts: axes such as the number of data examples, attributes/features, data sources, models, users, or concurrent requests, and metrics such as response latency, system throughput, machine resource footprint, and monetary cost. It is not possible to enumerate a comprehensive list. Reviewers will assess whether the submission is sound on the scalability aspect based on the merits of the work and its target application setting.

It is our hope that this new category will attract more of the cutting-edge and impactful real-world work in the scalable data science arena to VLDB for the benefit of the VLDB community, including spurring new technical connections, inspiring new follow-on research on scalable data science, and enhancing the impact of the VLDB community on data science practice.

Experiment, Analysis & Benchmark papers

EA&B papers focus on the extensive evaluation of algorithms, data structures, and systems that are of wide interest. The scientific contribution of an EA&B paper lies in providing:

  • fundamentally new insights into the strengths and weaknesses of existing methods, or
  • new ways to evaluate existing methods.

Such contributions are essential because they can springboard new follow-up research: they let the research community see new design possibilities for new methods, new metrics that matter, and new but problematic corner cases, and they provide common infrastructure for judging methods.

Some examples of paper types suitable for this category are:

  • Experimental survey: Experimental surveys that compare multiple existing solutions (including open source solutions) to a problem and, through extensive experiments, provide a comprehensive perspective on their strengths and weaknesses.
  • Analysis: Papers that focus on relevant problems or phenomena and through analysis and/or experimentation provide insights on the nature or characteristics of these phenomena.
  • Benchmark: Papers that present new benchmarks, characteristics of the benchmark data, methods in generating the data and the gold standard, usage of the benchmarks, and optionally example experimental results on the benchmarks.
  • Reproducibility: Papers that verify or refute results published in the past and that, through a renewed performance evaluation, help to advance the state of the art.

For papers that identify negative or contradictory results for published results by third parties, the Program Committee may ask the third party to comment on the submission and even allow a short rebuttal/explanation to be published along with the submission in the event of acceptance. 

Authors of accepted EA&B papers are required to: 1) make available all experimental data and related software; and 2) submit their experiments for reproducibility evaluation by the PVLDB Reproducibility Committee (http://vldb.org/pvldb/reproducibility). Both requirements must be fulfilled prior to final acceptance of the paper.

Vision papers

Vision papers outline futuristic information systems and architectures or anticipate new challenges. Submissions should describe novel projects that are at an early stage but hold strong promise of eventual high impact. The focus should be on the key insight behind the project (e.g., a new set of ground rules or a novel technology), as well as on how that insight can be leveraged in building a system. The paper should describe the success criteria for the vision project. The length of a submission in the Vision Papers category is up to 6 pages, excluding references (see the submission guidelines for details).

Topics of Interest

PVLDB welcomes original research papers on a broad range of topics related to all aspects of data management. The themes and topics listed below are intended to serve primarily as indicators of the kinds of data-centric subjects that are of interest to PVLDB – they do not represent an exhaustive list.

  • Data Mining and Analytics
    • Data Warehousing, OLAP, Parallel and Distributed Data Mining
    • Mining and Analytics for Scientific and Business data, Social Networks, Time Series, Streams, Text, Web, Graphs, Rules, Patterns, Logs, and Spatio-temporal Data
  • Data Privacy and Security
    • Blockchain
    • Access Control and Privacy
  • Database Engines
    • Access Methods, Concurrency Control, Recovery and Transactions
    • Hardware Accelerators
    • Query Processing and Optimization
    • Storage Management, Multi-core Databases, In-memory Data Management
    • Views, Indexing and Search
  • Database Performance
    • Tuning, Benchmarking and Performance Measurement
    • Administration and Manageability
  • Distributed Database Systems
    • Content Delivery Networks, Database-as-a-service, and Resource Management
    • Cloud Data Management
    • Distributed Analytics
    • Distributed Transactions
  • Graphs, Networks, and Semistructured Data
    • Graph Data Management, Recommendation Systems, Social Networks
    • Hierarchical, Non-relational, and other Modern Data Models
  • Information Integration and Data Quality
    • Data Cleaning, Data Discovery and Data Exploration
    • Heterogeneous and Federated DBMS, Metadata Management
    • Web Data Management and Semantic Web
    • Knowledge Graphs and Knowledge Management
  • Languages
    • Data Models and Query Languages
    • Schema Management and Design
  • Machine Learning, AI and Databases
    • Data Management Issues and Support for Machine Learning and AI
    • Machine Learning and Applied AI for Data Management
  • Novel DB Architectures
    • Embedded and Mobile Databases
    • Data management on novel hardware
    • Real-time databases, Sensors and IoT, Stream Databases
    • Crowdsourcing
  • Provenance and Workflows
    • Profile-based and Context-Aware Data Management
    • Process Mining
    • Provenance analytics
    • Debugging
  • Specialized and Domain-Specific Data Management
    • Spatial Databases and Temporal Databases
    • Crowdsourcing
    • Ethical Data Management
    • Fuzzy, Probabilistic and Approximate Data
    • Image and Multimedia Databases
    • Scientific and Medical Data Management
  • Text, Semi-Structured Data, and IR
    • Information Retrieval
    • Text in Databases
    • Data Extraction
  • User Interfaces
    • Database Usability
    • Database support for Visual Analytics
    • Visualization