Accepted Papers
Accepted Industrial Papers

  • MRTuner: A Toolkit to Enable Holistic Optimization for MapReduce Jobs
    Juwei Shi* (IBM Research China)*;Jia Zou (IBM Research-China);Jiaheng Lu (RUC);Zhao Cao (IBM Research China);Shi Qiang Li (IBM Research China);Chen Wang (IBM China Research Lab)

  • Reducing Database Locking Contention Through Multi-version Concurrency
    Mohammad Sadoghi* (IBM T.J. Watson Research Center)*;Mustafa Canim (IBM T.J. Watson Research Center);Bishwaranjan Bhattacharjee (IBM T.J. Watson Research Center);Fabian Nagel (University of Edinburgh);Kenneth Ross (Columbia University)

  • Changing Engines in Midstream: A Java Stream Computational Model for Big Data Processing
    Xueyuan Su* (Oracle Corporation)*;Garret Swart (Oracle Corporation);Brian Goetz (Oracle Corporation);Brian Oliver (Oracle Corporation);Paul Sandoz (Oracle Corporation)

  • Joins on Encoded and Partitioned Data
    Jae-Gil Lee* (KAIST)*;Gopi Attaluri (IBM Software Group);Ronald Barber (IBM Almaden Research Center);Naresh Chainani (IBM Software Group);Oliver Draese (IBM Software Group);Frederick Ho (IBM Informix);Stratos Idreos (Harvard University);Min-Soo Kim (DGIST);Sam Lightstone (IBM Software Group);Guy Lohman (IBM Almaden Research Center);Konstantinos Morfonios (Oracle);Keshava Murthy (IBM Informix);Ippokratis Pandis (IBM Almaden);Lin Qiao (LinkedIn);Vijayshankar Raman (IBM Almaden Research Center);Vincent Kulandai Samy (IBM Almaden Research Center);Richard Sidle (IBM Almaden Research Center);Knut Stolze (IBM Software Group);Liping Zhang (IBM Software Group)

  • TPC-DI: The First Industry Benchmark for Data Integration
    Meikel Poess* (Oracle)*;Tilmann Rabl (University of Toronto);Hans-Arno Jacobsen (University of Toronto);Brian Caufield (IBM)

  • Real-Time Twitter Recommendation: Online Motif Detection in Large Dynamic Graphs
    Pankaj Gupta (Twitter);Venu Satuluri (Twitter);Ajeet Grewal (Twitter);Siva Gurumurthy (Twitter);Volodymyr Zhabiuk (Twitter);Quannan Li (Twitter);Jimmy Lin* (Twitter)*

  • Interval Disaggregate: A New Operator for Business Planning
    Sang Cha (SAP Labs Korea);Kunsoo Park* (SAP Labs Korea)*;Chang Song (SAP Labs Korea);Ki Kim (SAP Labs Korea);Cheol Ryu (SAP Labs Korea);Sunho Lee (SAP Labs Korea)

  • Fuxi: Fault Tolerant Resource Management and Job Scheduling System at Internet Scale
    Zhuo Zhang (Alibaba Cloud Computing);Chao Li (Alibaba Cloud Computing);Yangyu Tao (Alibaba Cloud Computing);Renyu Yang* (Beihang University)*;Jie Xu (Beihang University, University of Leeds)

  • Large-Scale Graph Analytics in Aster 6: Bringing Context to Big Data Discovery
    David Simmen* (Teradata Aster)*;Karl Schnaitter (Teradata Aster);Jeff Davis (Teradata Aster);Yingjie He (Teradata Aster);Sangeet Lohariwala (Teradata Aster);Ajay Mysore (Teradata Aster);Vinayak Shenoi (Teradata Aster);Mingfeng Tan (Teradata Aster);Yu Xiao (Teradata Aster)

  • Fast Foreign-Key Detection in Microsoft SQL Server PowerPivot for Excel
    Zhimin Chen (Microsoft Research);Vivek Narasayya* (Microsoft Research)*;Surajit Chaudhuri (Microsoft Research)

  • Big Data Small Footprint: The Design of A Low-Power Classifier for Detecting Transportation Modes
    Meng-Chieh Yu* (HTC, Studio Engineering)*;Tong Yu (National Taiwan University);ShaoChen Wang (HTC);Chih-Jen Lin (National Taiwan University);Edward Y. Chang (HTC)

  • Summingbird: A Framework for Integrating Batch and Online MapReduce Computations
    Oscar Boykin (Twitter);Sam Ritchie (Twitter);Ian O'Connell (Twitter);Jimmy Lin* (Twitter)*

  • Of Snowstorms and Bushy Trees
    Rafi Ahmed* (Oracle)*;Rajkumar Sen (Oracle USA);Meikel Poess (Oracle);Sunil Chakkappen (Oracle USA)

  • Execution Primitives for Scalable Joins and Aggregations in Map Reduce
    Srinivas Vemuri (LinkedIn);Maneesh Varshney (LinkedIn);Krishna Puttaswamy* (LinkedIn)*;Rui Liu (LinkedIn)

  • CAP limits in telecom subscriber database design
    Javier Arauz* (Ericsson)*

  • Advanced Join Strategies for Large-Scale Distributed Computation
    Nico Bruno* (Microsoft)*;YongChul Kwon (Microsoft);Ming-Chuan Wu (Microsoft)

  • DGFIndex for Smart Grid: Enhancing Hive with a Cost-Effective Multidimensional Range Index
    Liu Yue* (Chinese Academy of Sciences)*;Songlin Hu (Chinese Academy of Sciences);Tilmann Rabl (University of Toronto);Wantao Liu (Chinese Academy of Sciences);Hans-Arno Jacobsen (University of Toronto);Kaifeng Wu (State Grid Electricity Science Research Institute);Jian Chen (Zhejiang Electric Power Corporation);Jintao Li (Chinese Academy of Sciences)

  • Error-bounded Sampling for Analytics on Big Sparse Data
    Ying Yan* (Microsoft Research)*;Liang Chen (Microsoft Research);Zheng Zhang (MSRA)

  • Indexing HDFS Data in PDW: Splitting the data from the index
    Vinitha Gankidi (University of Wisconsin, Madison);Nikhil Teletia* (Microsoft)*;Jignesh Patel (University of Wisconsin);Alan Halverson (Microsoft Jim Gray Systems Lab);David DeWitt (Microsoft Jim Gray Systems Lab)

  • Chimera: Large Scale Classification using Machine Learning, Rules, and Crowdsourcing
    AnHai Doan* (Univ. of Wisconsin Madison)*;Chong Sun (WalmartLabs);Narasimhan Rampalli (WalmartLabs)

Selected Papers from Local Industry

  • Realization of the Low Cost and High Performance MySQL Cloud Database
    Wei Cao (Alibaba Cloud Computing Inc.), Feng Yu (Alibaba Cloud Computing Inc.), Jiasen Xie (Alibaba Cloud Computing Inc.)

  • Fatman: Cost-saving and reliable archival storage based on volunteer resources
    An Qin (Baidu, Inc), Dianming Hu (Baidu, Inc), Jun Liu (Baidu, Inc), Wenjun Yang (Baidu, Inc), Dai Tan (Baidu, Inc)

  • Design and Implementation of a Real-Time Interactive Analytics System for Large Spatio-Temporal Data
    Shiming Zhang (Huawei Noah's Ark Lab);Yin Yang (University of Illinois at Urbana-Champaign);Wei Fan (Huawei Noah's Ark Lab);Marianne Winslett (University of Illinois at Urbana-Champaign)

  • A Personalized Recommendation System for NetEase Dating Site
    Chaoyue Dai (NetEase Inc.), Feng Qian (NetEase Inc.), Wei Jiang (NetEase Inc.), Zhoutian Wang (NetEase Inc.), Zenghong Wu (NetEase Inc.)

  • GEMINI: An Integrative Healthcare Analytics System
    Zheng Jye Ling (National University Health System);Quoc Trung Tran (National University of Singapore);Ju Fan (National University of Singapore);Gerald C.H. Koh (National University Health System);Thi Nguyen (National University of Singapore);Chuen Seng Tan (National University Health System);James W. L. Yip (National University Health System);Meihui Zhang (National University of Singapore)

  • Mariana: Tencent Deep Learning Platform and its Applications
    Yongqiang Zou (Tencent Inc.), Xing Jin (Tencent Inc.), Yi Li (Tencent Inc.), Zhimao Guo (Tencent Inc.), Eryu Wang (Tencent Inc.), Bin Xiao (Tencent Inc.)

  • YZStack: Provisioning Customizable Solution for Big Data
    Sai Wu (Zhejiang University), Chun Chen (Zhejiang University), Gang Chen (Zhejiang University), Ke Chen (Zhejiang University), Lidan Shou (Zhejiang University), Hui Cao (yzBigData Co., Ltd.), He Bai (City Cloud Technology (Hangzhou) Co., Ltd.)