PVLDB Reproducibility

Starting with PVLDB 2018, PVLDB joins SIGMOD in encouraging the database community to develop a culture of sharing and cross-validation. PVLDB's reproducibility effort is being developed in coordination with SIGMOD's.

News

Submissions to PVLDB Reproducibility observe the submission deadlines of PVLDB and should be submitted using the PVLDB submission site in CMT for the reproducibility track.

Recent Reproducibility Highlights

  • Optimal Algorithms for Ranked Enumeration of Answers to Full Conjunctive Queries, Tziavelis, Nikolaos (Northeastern University); Ajwani, Deepak (Northeastern University); Gatterbauer, Wolfgang (Northeastern University); Riedewald, Mirek (Northeastern University); Yang, Xiaofeng (Northeastern University)

  • Prefix Filter: Practically and Theoretically Better Than Bloom, Tomer Even (Tel Aviv University); Guy Even (Tel Aviv University); Adam Morrison (Tel Aviv University)

  • A Critical Analysis of Recursive Model Indexes, Marcel Maltry (Saarland University), Jens Dittrich (Saarland University)

  • ByShard: Sharding in a Byzantine Environment, Jelle Hellings (McMaster University); Mohammad Sadoghi (University of California, Davis)

  • A four-dimensional Analysis of Partitioned Approximate Filters, Tobias Schmidt (TUM); Maximilian Bandle (TUM); Jana Giceva (TU Munich)

  • SetSketch: Filling the Gap between MinHash and HyperLogLog, Otmar Ertl (Dynatrace Research)

  • ORBITS: Online Recovery of Missing Values in Multiple Time Series Streams, Mourad Khayati (University of Fribourg)

  • Don't Be a Tattle-Tale: Preventing Leakages through Data Dependencies on Access Control Protected Data, Primal Pappachan (Penn State University); Shufan Zhang (University of Waterloo); Xi He (University of Waterloo); Sharad Mehrotra (University of California, Irvine)

  • Cardinality Estimation of Approximate Substring Queries using Deep Learning, Suyong Kwon (Seoul National University), Woohwan Jung (Hanyang University), Kyuseok Shim (Seoul National University)

  • Containerized Execution of UDFs: An Experimental Evaluation, Karla Saur (Microsoft); Konstantinos Karanasos (Meta); Tara Mirmira (University of California, San Diego); Jesús Camacho-Rodríguez (Microsoft)

  • What Is the Price for Joining Securely? Benchmarking Equi-Joins in Trusted Execution Environments, Kajetan Maliszewski (Technische Universität Berlin); Jorge-Arnulfo Quiané-Ruiz (Technische Universität Berlin & German Research Center for Artificial Intelligence (DFKI)); Jonas Traub (Technische Universität Berlin); Volker Markl (Technische Universität Berlin)

    What is PVLDB Reproducibility?

    PVLDB Reproducibility has three goals:

      • Increase the impact of database research papers.
      • Enable easy dissemination of research results.
      • Enable easy sharing of code and experimentation set-ups.

    In short, the goal is to help build a culture where sharing the results, code, and scripts of database research is the norm rather than the exception. The challenge is to do this efficiently, which means building technical expertise in making research repeatable and sharable. The PVLDB Reproducibility committee is here to help you with this.

    Submission

    Submit your accepted PVLDB papers for reproducibility through CMT (Reproducibility Track). To submit, you'll need the following information:

    The title and abstract of your original, accepted PVLDB paper.

    A link to your original, accepted PVLDB paper.

    A short description of how the reviewer may retrieve your reproducibility submission. This should include at least the following information: a link to the code and instructions for using the scripts for (a) code compilation, (b) data generation, and (c) experimentation (a sketch of such a driver script follows this list).

    A short description of the hardware needed to run your code and reproduce the experiments included in the paper, with a detailed specification of any unusual or not commercially available hardware. If your hardware is sufficiently specialized, please have a plan for giving the reviewers access to it.

    A short description of any software or data necessary to run your code and reproduce experiments included in the paper, particularly if it is restricted-access (e.g., commercial software without a free demo or academic version). If this is the case, please have plans to allow the reviewers access to any necessary software or data.
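
    For illustration, reproducibility packages often expose these three steps through a single driver script. The following Python sketch shows what such a driver might look like; every file name, binary, and flag in it (reproduce.py, ./bin/datagen, ./bin/bench, experiments/all.toml, and so on) is a hypothetical placeholder, not a required layout.

        #!/usr/bin/env python3
        """Hypothetical reproducibility driver covering the three steps
        reviewers need: (a) compilation, (b) data generation, (c) experiments.
        All paths and commands are illustrative placeholders."""
        import argparse
        import subprocess

        def compile_code():
            # (a) Build the prototype; assumes a Makefile at the repo root.
            subprocess.run(["make", "all"], check=True)

        def generate_data():
            # (b) Produce the input data sets used in the paper.
            subprocess.run(["./bin/datagen", "--scale", "10", "--out", "data/"],
                           check=True)

        def run_experiments():
            # (c) Run all experiments and write raw results to results/.
            subprocess.run(["./bin/bench", "--config", "experiments/all.toml",
                            "--out", "results/"], check=True)

        if __name__ == "__main__":
            parser = argparse.ArgumentParser(
                description="Reproduce the paper's experiments")
            parser.add_argument("step", choices=["compile", "generate", "run", "all"])
            args = parser.parse_args()
            if args.step in ("compile", "all"):
                compile_code()
            if args.step in ("generate", "all"):
                generate_data()
            if args.step in ("run", "all"):
                run_experiments()

    With such an entry point, a reviewer can go from a clean checkout to raw results with a single command, e.g., python reproduce.py all.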

    In keeping with PVLDB itself, the PVLDB Reproducibility effort uses a rolling, monthly deadline. Papers received by 5 PM EST on the first of each month will be distributed for that month's round of reviews. We aim to complete a reproducibility review within two months.

    Why should I be part of this?

    You will be making it easy for other researchers to compare against your work and to adopt and extend your research. This immediately means more recognition for your work and higher impact.

    How much overhead is it?

    At first, making research sharable seems like an extra overhead for authors. You just had your paper accepted in a major conference; why should you spend more time on it? The answer is to have more impact!

    If you ask any experienced researcher in academia or industry, they will tell you that they already follow the reproducibility principles on a daily basis: not as an afterthought, but as a way of doing good research.

    Maintaining easily reproducible experiments makes working on hard problems much easier: you can repeat your analysis for different data sets, different hardware, different parameters, and so on. Like other leading system designers, you will save a significant amount of time because you will minimize the set-up and tuning effort for your experiments. In addition, such practices help bring new students up to speed after a project has lain dormant for a few months.

    Ideally, reproducibility should require close to zero extra effort.

    Criteria and Process

    Availability

    Each submitted experiment should contain:

      1. A prototype system, provided either as a white box (source, configuration files, build environment) or as a fully specified black box.
      2. Input data: either the process that generates the input data or, when the data is not generated, the actual data itself or a link to it.
      3. The set of experiments (system configuration and initialization, scripts, workload, measurement protocol) used to produce the raw experimental data.
      4. The scripts needed to transform the raw data into the graphs included in the paper (a sketch of such a script follows this list).
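
    As an illustration of item (4), the following Python sketch turns a raw result file into a figure. The CSV schema (results/runtime.csv with columns algorithm, input_size, seconds) and the output file name are hypothetical assumptions made for this example, not part of the submission requirements.

        #!/usr/bin/env python3
        """Illustrative plotting script: raw measurements -> paper figure.
        The CSV schema and figure name are hypothetical examples."""
        import csv
        import os
        from collections import defaultdict
        import matplotlib.pyplot as plt

        # Assumed raw-data schema: one row per measurement.
        series = defaultdict(list)
        with open("results/runtime.csv") as f:
            for row in csv.DictReader(f):
                series[row["algorithm"]].append(
                    (int(row["input_size"]), float(row["seconds"])))

        fig, ax = plt.subplots()
        for algorithm, points in sorted(series.items()):
            points.sort()  # order by input size before plotting
            xs, ys = zip(*points)
            ax.plot(xs, ys, marker="o", label=algorithm)
        ax.set_xlabel("Input size (tuples)")
        ax.set_ylabel("Runtime (s)")
        ax.legend()
        os.makedirs("figures", exist_ok=True)
        fig.savefig("figures/figure3.pdf")  # hypothetical figure name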

    Reproducibility

    The central results and claims of the paper should be supported by the submitted experiments.
    Therefore, an independent team should be able to recreate result data and graphs that demonstrate behavior similar to that shown in the paper, using the authors' own artifacts.

    Please note that, for some results (e.g., response times), the exact numbers will typically depend on the underlying hardware. We do not expect results identical to those in the paper unless we happen to get access to identical hardware. Instead, we expect the overall behavior to match the conclusions drawn in the paper, e.g., that a given algorithm is significantly faster than another, or that a given parameter affects the behavior of a system positively or negatively.

    Process

    Each paper is reviewed by one database group. The process happens in communication with the reviewers, so that authors and reviewers can iron out any technical issues that arise. The end result is a short report describing the outcome of the process.

    The goal of the committee is to properly assess and promote database research! While we expect authors to do their best to prepare a submission that works out of the box, we know that unexpected problems sometimes appear and that, in certain cases, experiments are very hard to fully automate. The committee will not dismiss submissions if something does not work out of the box; instead, they will contact the authors to get their input on how to properly evaluate their work.

    Reproducibility Committee

    Co-Chairs

    Torsten Grust (University of Tuebingen)
    Gokhan Kul (University of Massachusetts, Dartmouth)

    Committee

    TU Delft: Asterios Katsifodimos
    Huazhong University of Science and Technology: Bolong Zheng
    Seoul National University: Bongki Moon
    Illinois Institute of Technology: Boris Glavic
    ETH Zurich: Cedric Renggli
    Lyon 1 University: Chao Zhang
    UC Irvine: Chen Li
    Tsinghua University: Chengliang Chai
    PUC Chile: Cristian Riveros
    Osaka University: Daichi Amagata
    TU Dresden: Dirk Habich
    Rutgers University - New Brunswick: Dong Deng
    National University of Singapore: Dumitrel Loghin
    Concordia University: Essam Mansour
    Guangzhou University: Fan Zhang
    ATHENA Research Center: George Papastefanatos
    University of Modena and Reggio Emilia: Giovanni Simonini
    University of Massachusetts Dartmouth: Gokhan Kul
    FORTH-ICS: Haridimos Kondylakis
    Tsinghua University: Huanchen Zhang
    Google: Ingo Müller
    Washington State University: Jia Yu
    Zhejiang University: Jinfei Liu
    The Hong Kong University of Science and Technology: Jing Tang
    Boston University: John Liagouris
    University of New South Wales: Kai Wang
    Mohammed VI Polytechnic University: Karima Echihabi
    Seoul National University: Kunsoo Park
    The University of Sydney: Lijun Chang
    Nanjing University of Science and Technology: Long Yuan
    Aalborg University: Matteo Lissandrini
    Humboldt-Universität zu Berlin: Matthias Weidlich
    IIT Delhi: Maya Ramanath
    University of Zurich: Michael H Böhlen
    University of Helsinki: Michael Mathioudakis
    University of Edinburgh: Milos Nikolic
    The University of Western Ontario: Mostafa Milani
    Huawei Technologies R&D (UK) Ltd: Nikos Ntarmos
    Microsoft: Raghav Kaushik
    Universität Mannheim: Rainer Gemulla
    The University of Hong Kong, China: Reynold Cheng
    MIT: Ryan C Marcus
    Zhejiang University: Sai Wu
    University of Auckland: Sebastian Link
    University of Cincinnati: Seokki Lee
    The Chinese University of Hong Kong: Sibo Wang
    Nanyang Technological University: Siqiang Luo
    Nanyang Technological University: Sourav S Bhowmick
    Università degli Studi di Bergamo: Stefano Paraboschi
    DFKI Berlin: Steffen Zeuch
    Simon Fraser University: Tianzheng Wang
    Aalborg University: Torben Bach Pedersen
    Harvard University: Utku Sirin
    University of New South Wales: Wenjie Zhang
    Microsoft Research: Wentao Wu
    Kent State University: Xiang Lian
    Northeastern University: Xiaochun Yang
    University of New South Wales: Xiaoyang Wang
    Aalborg University: Yan Zhao
    Kyoto University: Yang Cao
    BUPT: Yingxia Shao
    University of New South Wales: You Peng
    University at Buffalo - SUNY: Zhuoyue Zhao
