15. KDD 2009: Paris, France
John F. Elder IV, Françoise Fogelman-Soulié, Peter A. Flach, Mohammed Javeed Zaki (Eds.):
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, June 28 - July 1, 2009.
ACM 2009, ISBN 978-1-60558-495-9
Keynote talks
- David J. Hand:
Mismatched models, wrong results, and dreadful decisions: on choosing appropriate data mining tools.
1-2
- Ravi Kumar:
Mining web logs: applications and challenges.
3-4
- Heikki Mannila:
Randomization methods in data mining.
5-6
- Ashok N. Srivastava:
Data mining at NASA: from theory to applications.
7-8
- Stanley Wasserman:
Network science: an introduction to recent statistical approaches.
9-10
Panel
Research track papers
- Deepak Agarwal, Bee-Chung Chen:
Regression-based latent factor models.
19-28
- Charu C. Aggarwal, Yan Li, Jianyong Wang, Jing Wang:
Frequent pattern mining with uncertain data.
29-38
- Amr Ahmed, Eric P. Xing, William W. Cohen, Robert F. Murphy:
Structured correspondence topic models for mining captioned figures in biological literature.
39-48
- Anurag Ambekar, Charles B. Ward, Jahangir Mohammed, Swapna Male, Steven Skiena:
Name-ethnicity classification from open sources.
49-58
- Shin Ando, Einoshin Suzuki:
Detection of unique temporal segments by information theoretic meta-clustering.
59-68
- Mafruz Zaman Ashrafi, See-Kiong Ng:
Collusion-resistant anonymous data collection method.
69-78
- Sitaram Asur, Srinivasan Parthasarathy:
A viewpoint-based approach for interaction graph analysis.
79-88
- Lars Backstrom, Jon M. Kleinberg, Ravi Kumar:
Optimizing web traffic via the media scheduling problem.
89-98
- Ron Bekkerman, Martin Scholz, Krishnamurthy Viswanathan:
Improving clustering stability with combinatorial MRFs.
99-108
- Michele Berlingerio, Fabio Pinelli, Mirco Nanni, Fosca Giannotti:
Temporal mining for interactive workflow data analysis.
109-118
- Thomas Bernecker, Hans-Peter Kriegel, Matthias Renz, Florian Verhein, Andreas Züfle:
Probabilistic frequent itemset mining in uncertain databases.
119-128
- Alina Beygelzimer, John Langford:
The offset tree for learning with partial labels.
129-138
- Albert Bifet, Geoffrey Holmes, Bernhard Pfahringer, Richard Kirkby, Ricard Gavaldà:
New ensemble methods for evolving data streams.
139-148
- Christian Böhm, Katrin Haegler, Nikola S. Müller, Claudia Plant:
CoCo: coding cost for parameter-free outlier detection.
149-158
- Yingyi Bu, Lei Chen, Ada Wai-Chee Fu, Dawei Liu:
Efficient anomaly monitoring over moving object trajectory streams.
159-168
- Jonathan Chang, Jordan L. Boyd-Graber, David M. Blei:
Connections between the lines: augmenting social networks with text.
169-178
- Bo Chen, Wai Lam, Ivor Tsang, Tak-Lam Wong:
Extracting discriminative concepts for domain adaptation in text mining.
179-188
- Minmin Chen, Yixin Chen, Michael R. Brent, Aaron E. Tenney:
Constrained optimization for validation-guided conditional random field learning.
189-198
- Wei Chen, Yajun Wang, Siyu Yang:
Efficient influence maximization in social networks.
199-208
- Ye Chen, Dmitry Pavlov, John F. Canny:
Large-scale behavioral targeting.
209-218
- Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Michael Mitzenmacher, Alessandro Panconesi, Prabhakar Raghavan:
On compressing social networks.
219-228
- Erick Delage:
Regret-based online ranking for a growing digital library.
229-238
- Hongbo Deng, Michael R. Lyu, Irwin King:
A generalized Co-HITS algorithm and its application to bipartite graphs.
239-248
- Meghana Deodhar, Joydeep Ghosh:
Mining for the most certain predictions from dyadic data.
249-258
- Pinar Donmez, Jaime G. Carbonell, Jeff Schneider:
Efficiently learning the accuracy of labeling sources for selective sampling.
259-268
- Nan Du, Christos Faloutsos, Bai Wang, Leman Akoglu:
Large human communication networks: patterns and a utility-driven generator.
269-278
- Murat Dundar, E. Daniel Hirleman, Arun K. Bhunia, J. Paul Robinson, Bartek Rajwa:
Learning with a non-exhaustive training dataset: a case study: detection of bacteria cultures using optical-scattering technology.
279-288
- Khalid El-Arini, Gaurav Veda, Dafna Shahaf, Carlos Guestrin:
Turning down the noise in the blogosphere.
289-298
- George Forman, Martin Scholz, Shyamsundar Rajaram:
Feature shaping for linear SVM classifiers.
299-308
- Richard Frank, Martin Ester, Arno Knobbe:
A multi-relational approach to spatial classification.
309-318
- Antonino Freno, Edmondo Trentin, Marco Gori:
Scalable pseudo-likelihood estimation in hybrid random fields.
319-328
- João Gama, Raquel Sebastião, Pedro Pereira Rodrigues:
Issues in evaluation of stream learning algorithms.
329-338
- Jing Gao, Wei Fan, Yizhou Sun, Jiawei Han:
Heterogeneous source consensus learning via decision propagation and negotiation.
339-348
- Yong Ge, Hui Xiong, Wenjun Zhou, Ramendra K. Sahoo, Xiaofeng Gao, Weili Wu:
Multi-focal learning and its application to customer service support.
349-358
- Quanquan Gu, Jie Zhou:
Co-clustering on manifolds.
359-368
- Lei Guo, Enhua Tan, Songqing Chen, Xiaodong Zhang, Yihong Eric Zhao:
Analyzing patterns of user content generation in online social networks.
369-378
- Sami Hanhijärvi, Markus Ojala, Niko Vuokko, Kai Puolamäki, Nikolaj Tatti, Heikki Mannila:
Tell me something I don't know: randomization strategies for iterative data mining.
379-388
- Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, Xiaohua Zhou:
Exploiting Wikipedia as external knowledge for document clustering.
389-396
- Mohsen Jamali, Martin Ester:
TrustWalker: a random walk model for combining trust-based and item-based recommendation.
397-406
- Shuiwang Ji, Lei Yuan, Ying-Xin Li, Zhi-Hua Zhou, Sudhir Kumar, Jieping Ye:
Drosophila gene expression pattern annotation using sparse features and term-term interactions.
407-416
- Ruoming Jin, Yang Xiang, Lin Liu:
Cartesian contour: a concise representation for a collection of frequent sets.
417-426
- Aleksander Kolcz, Gordon V. Cormack:
Genre-based decomposition of email class noise.
427-436
- Arne Koopman, Arno Siebes:
Characteristic relational patterns.
437-446
- Yehuda Koren:
Collaborative filtering with temporal dynamics.
447-456
- Sayali Kulkarni, Amit Singh, Ganesh Ramakrishnan, Soumen Chakrabarti:
Collective annotation of Wikipedia entities in web text.
457-466
- Theodoros Lappas, Kun Liu, Evimaria Terzi:
Finding a team of experts in social networks.
467-476
- Theodoros Lappas, Benjamin Arai, Manolis Platakis, Dimitrios Kotsakos, Dimitrios Gunopulos:
On burstiness-aware search for document sequences.
477-486
- Mark Last:
Improving data mining utility with projective sampling.
487-496
- Jure Leskovec, Lars Backstrom, Jon M. Kleinberg:
Meme-tracking and the dynamics of the news cycle.
497-506
- Lei Li, James McCann, Nancy S. Pollard, Christos Faloutsos:
DynaMMo: mining and summarization of coevolving sequences with missing values.
507-516
- Tiancheng Li, Ninghui Li:
On the tradeoff between privacy and utility in data publishing.
517-526
- Yu-Ru Lin, Jimeng Sun, Paul Castro, Ravi B. Konuru, Hari Sundaram, Aisling Kelliher:
MetaFac: community discovery via relational hypergraph factorization.
527-536
- Chao Liu, Fan Guo, Christos Faloutsos:
BBM: Bayesian browsing model from petabyte-scale data.
537-546
- Jun Liu, Jianhui Chen, Jieping Ye:
Large-scale sparse logistic regression.
547-556
- David Lo, Hong Cheng, Jiawei Han, Siau-Cheng Khoo, Chengnian Sun:
Classification of software behaviors for failure detection: a discriminative pattern mining approach.
557-566
- Steven Loscalzo, Lei Yu, Chris H. Q. Ding:
Consensus group stable feature selection.
567-576
- Aurelie C. Lozano, Naoki Abe, Yan Liu, Saharon Rosset:
Grouped graphical Granger modeling methods for temporal causal modeling.
577-586
- Aurelie C. Lozano, Hongfei Li, Alexandru Niculescu-Mizil, Yan Liu, Claudia Perlich, Jonathan R. M. Hosking, Naoki Abe:
Spatial-temporal causal modeling for climate change attribution.
587-596
- Sofus A. Macskassy:
Using graph-based metrics with empirical risk minimization to speed up active learning on networked data.
597-606
- R. Dean Malmgren, Jake M. Hofman, Luis A. N. Amaral, Duncan J. Watts:
Characterizing individual communication patterns.
607-616
- Andreas Maunz, Christoph Helma, Stefan Kramer:
Large-scale graph mining using backbone refinement classes.
617-626
- Frank McSherry, Ilya Mironov:
Differentially private recommender systems: building privacy into the net.
627-636
- Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti:
WhereNext: a location predictor on trajectory pattern mining.
637-646
- Siegfried Nijssen, Tias Guns, Luc De Raedt:
Correlated itemset mining in ROC space: a constraint programming approach.
647-656
- Kensuke Onuma, Hanghang Tong, Christos Faloutsos:
TANGENT: a novel, 'Surprise me', recommendation algorithm.
657-666
- Rong Pan, Martin Scholz:
Mind the gaps: weighting the unknown in large-scale one-class collaborative filtering.
667-676
- Gaurav Pandey, Gowtham Atluri, Michael Steinbach, Chad L. Myers, Vipin Kumar:
An association analysis approach to biclustering.
677-686
- Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan:
CP-summary: a concise representation for browsing frequent itemsets.
687-696
- Ardian Kristanto Poernomo, Vivekanand Gopalkrishnan:
Towards efficient mining of proportional fault-tolerant frequent itemsets.
697-706
- Foster J. Provost, Brian Dalessandro, Rod Hook, Xiaohan Zhang, Alan Murray:
Audience selection for on-line brand advertising: privacy-friendly social network targeting.
707-716
- Zijie Qi, Ian Davidson:
A principled and flexible framework for finding alternative clusterings.
717-726
- Steffen Rendle, Leandro Balby Marinho, Alexandros Nanopoulos, Lars Schmidt-Thieme:
Learning optimal ranking with tensor factorization for tag recommendation.
727-736
- Venu Satuluri, Srinivasan Parthasarathy:
Scalable graph clustering using stochastic flows: applications to community discovery.
737-746
- Jerry Scripps, Pang-Ning Tan, Abdol-Hossein Esfahanian:
Measuring the effects of preprocessing decisions and network forces in dynamic network analysis.
747-756
- Bao-Hong Shen, Shuiwang Ji, Jieping Ye:
Mining discrete patterns via binary matrix factorization.
757-766
- Lei Shi, Vandana Pursnani Janeja:
Anomalous window discovery through scan statistics for linear intersecting paths (SSLIP).
767-776
- Xiaolin Shi, Jun Zhu, Rui Cai, Lei Zhang:
User grouping behavior in online forums.
777-786
- Takashi Shibuya, Tatsuya Harada, Yasuo Kuniyoshi:
Causality quantification and its applications: structuring and modeling of multivariate time series.
787-796
- Yizhou Sun, Yintao Yu, Jiawei Han:
Ranking-based clustering of heterogeneous information networks with star network schema.
797-806
- Jie Tang, Jimeng Sun, Chi Wang, Zi Yang:
Social influence analysis in large-scale networks.
807-816
- Lei Tang, Huan Liu:
Relational learning via latent social dimensions.
817-826
- Chayant Tantipathananandh, Tanya Y. Berger-Wolf:
Constant-factor approximation algorithms for identifying dynamic communities.
827-836
- Charalampos E. Tsourakakis, U. Kang, Gary L. Miller, Christos Faloutsos:
DOULION: counting triangles in massive graphs with a coin.
837-846
- Pavan Vatturi, Weng-Keen Wong:
Category detection using hierarchical mean shift.
847-856
- Ting Wang, Mudhakar Srivatsa, Dakshi Agrawal, Ling Liu:
Learning, indexing, and diagnosing network faults.
857-866
- Xuanhui Wang, Deepayan Chakrabarti, Kunal Punera:
Mining broad latent query aspects from search sessions.
867-876
- Junjie Wu, Hui Xiong, Jian Chen:
Adapting the right measures for K-means clustering.
877-886
- Mingxi Wu, Xiuyao Song, Chris Jermaine, Sanjay Ranka, John Gums:
A LRT framework for fast spatial anomaly detection.
887-896
- Jack Chongjie Xue, Gary M. Weiss:
Quantification and semi-supervised classification methods for handling changes in class distribution.
897-906
- Donghui Yan, Ling Huang, Michael I. Jordan:
Fast approximate spectral clustering.
907-916
- Bishan Yang, Jian-Tao Sun, Tengjiao Wang, Zheng Chen:
Effective multi-label active learning for text classification.
917-926
- Tianbao Yang, Rong Jin, Yun Chi, Shenghuo Zhu:
Combining link and content for community detection: a discriminative approach.
927-936
- Limin Yao, David M. Mimno, Andrew McCallum:
Efficient methods for topic model inference on streaming document collections.
937-946
- Lexiang Ye, Eamonn J. Keogh:
Time series shapelets: a new primitive for data mining.
947-956
- Zhijun Yin, Rui Li, Qiaozhu Mei, Jiawei Han:
Exploring social tagging graph for web object classification.
957-966
- Shinjae Yoo, Yiming Yang, Frank Lin, Il-Chul Moon:
Mining social networks for personalized email prioritization.
967-976
- Chang Hun You, Lawrence B. Holder, Diane J. Cook:
Learning patterns in the dynamics of biological networks.
977-986
- Xiangliang Zhang, Cyril Furtlehner, Julien Perez, Cécile Germain-Renaud, Michèle Sebag:
Toward autonomic grids: analyzing the job flow with affinity streaming.
987-996
- Yuzhou Zhang, Jianyong Wang, Yi Wang, Lizhu Zhou:
Parallel community detection on large networks with propinquity dynamics.
997-1006
- Elena Zheleva, Hossam Sharara, Lise Getoor:
Co-evolution of social and affiliation networks.
1007-1016
- Lei Zheng, Shaojun Wang, Yan Liu, Chi-Hoon Lee:
Information theoretic regularization for semi-supervised boosting.
1017-1026
- ErHeng Zhong, Wei Fan, Jing Peng, Kun Zhang, Jiangtao Ren, Deepak S. Turaga, Olivier Verscheure:
Cross domain distribution adaptation via kernel mapping.
1027-1036
- Guangyu Zhu, Gilad Mishne:
Mining rich session context to improve web search.
1037-1046
- Jun Zhu, Eric P. Xing, Bo Zhang:
Primal sparse Max-margin Markov networks.
1047-1056
- Qiang Zhu, Xiaoyue Wang, Eamonn J. Keogh, Sang-Hee Lee:
Augmenting the generalized Hough transform to enable the mining of petroglyphs.
1057-1066
Industrial track papers
- Josh Attenberg, Sandeep Pandey, Torsten Suel:
Modeling and predicting user behavior in sponsored search.
1067-1076
- Indrajit Bhattacharya, Shantanu Godbole, Ajay Gupta, Ashish Verma, Jeff Achtermann, Kevin English:
Enabling analysts in managed services for CRM analytics.
1077-1086
- Ludmila Cherkasova, Kave Eshghi, Charles B. Morrey, Joseph Tucek, Alistair C. Veitch:
Applying syntactic similarity algorithms for enterprise information management.
1087-1096
- Wei Chu, Seung-Taek Park, Todd Beaupre, Nitin Motgi, Amit Phadke, Seinjuti Chakraborty, Joe Zachariah:
A case study of behavior-driven conjoint analysis on Yahoo!: front page today module.
1097-1104
- Thomas Crook, Brian Frasca, Ron Kohavi, Roger Longbotham:
Seven pitfalls to avoid when running controlled experiments on the web.
1105-1114
- Srivatsava Daruru, Nena M. Marin, Matt Walker, Joydeep Ghosh:
Pervasive parallelism in data mining: dataflow solution to co-clustering large and sparse Netflix data.
1115-1124
- Xiaowen Ding, Bing Liu, Lei Zhang:
Entity discovery and assignment for opinion mining applications.
1125-1134
- Xiaoxi Du, Ruoming Jin, Liang Ding, Victor E. Lee, John H. Thornton Jr.:
Migration motif: a spatial-temporal pattern mining approach for financial markets.
1135-1144
- Ariel Fuxman, Anitha Kannan, Andrew B. Goldberg, Rakesh Agrawal, Panayiotis Tsaparas, John C. Shafer:
Improving classification accuracy using automatically extracted training data.
1145-1154
- Honglei Guo, Huijia Zhu, Zhili Guo, Xiaoxun Zhang, Zhong Su:
Address standardization with latent semantic association.
1155-1164
- Sonal Gupta, Mikhail Bilenko, Matthew Richardson:
Catching the drift: learning broad matches from clickthrough data.
1165-1174
- Mohammad Al Hasan, W. Scott Spangler, Thomas D. Griffin, Alfredo Alba:
COA: finding novel patents through text analysis.
1175-1184
- Shunsuke Hirose, Kenji Yamanishi, Takayuki Nakata, Ryohei Fujimaki:
Network anomaly detection based on Eigen equation compression.
1185-1194
- Wei Jin, Hung Hay Ho, Rohini K. Srihari:
OpinionMiner: a novel machine learning system for web opinion mining and extraction.
1195-1204
- Jongwuk Lee, Seung-won Hwang, Zaiqing Nie, Ji-Rong Wen:
Query result clustering for object-level search.
1205-1214
- Ming Li, M. Benjamin Dias, Ian H. Jarman, Wael El-Deredy, Paulo J. G. Lisboa:
Grocery shopping recommendations based on basket-sensitive random walk.
1215-1224
- Yan Liu, Jayant R. Kalagnanam, Oivind Johnsen:
Learning dynamic temporal graphs for oil-production equipment monitoring system.
1225-1234
- Ping Luo, Fen Lin, Yuhong Xiong, Yong Zhao, Zhongzhi Shi:
Towards combining web classification and web information extraction: a case study.
1235-1244
- Justin Ma, Lawrence K. Saul, Stefan Savage, Geoffrey M. Voelker:
Beyond blacklists: learning to detect malicious web sites from suspicious URLs.
1245-1254
- Adetokunbo Makanju, A. Nur Zincir-Heywood, Evangelos E. Milios:
Clustering event logs using iterative partitioning.
1255-1264
- Mary McGlohon, Stephen Bay, Markus G. Anderle, David M. Steier, Christos Faloutsos:
SNARE: a link analytic system for graph labeling and risk detection.
1265-1274
- Prem Melville, Wojciech Gryc, Richard D. Lawrence:
Sentiment analysis of blogs by combining lexical knowledge with text classification.
1275-1284
- Noman Mohammed, Benjamin C. M. Fung, Patrick C. K. Hung, Cheuk-kwong Lee:
Anonymizing healthcare data: a case study on the blood transfusion service.
1285-1294
- Kivanc M. Ozonat, Donald Young:
Towards a universal marketplace over the web: statistical multi-label classification of service provider forms with simulated annealing.
1295-1304
- Debprakash Patnaik, Manish Marwah, Ratnesh K. Sharma, Naren Ramakrishnan:
Sustainable operation and management of data center chillers using temporal data mining.
1305-1314
- B. Aditya Prakash, Nicholas Valler, David Andersen, Michalis Faloutsos, Christos Faloutsos:
BGP-lens: patterns and anomalies in internet routing updates.
1315-1324
- D. Sculley, Robert G. Malkin, Sugato Basu, Roberto J. Bayardo:
Predicting bounce rates in sponsored search advertisements.
1325-1334
- Liang Sun, Rinkal Patel, Jun Liu, Kewei Chen, Teresa Wu, Jing Li, Eric Reiman, Jieping Ye:
Mining brain region connectivity for Alzheimer's disease study via sparse inverse covariance estimation.
1335-1344
- Junfeng Wang, Chun Chen, Can Wang, Jian Pei, Jiajun Bu, Ziyu Guan, Wei Vivian Zhang:
Can we learn a template-independent wrapper for news article extraction from a single training site?
1345-1354
- Kuansan Wang, Toby Walker, Zijian Zheng:
PSkip: estimating relevance ranking quality from web search clickthrough data.
1355-1364
- Gu Xu, Shuang-Hong Yang, Hang Li:
Named entity mining from click-through data using weakly supervised latent Dirichlet allocation.
1365-1374
- Jiang-Ming Yang, Rui Cai, Chunsong Wang, Hua Huang, Lei Zhang, Wei-Ying Ma:
Incorporating site-level knowledge for incremental crawling of web forums: a list-wise strategy.
1375-1384
- Yanfang Ye, Tao Li, Qingshan Jiang, Zhixue Han, Li Wan:
Intelligent file scoring system for malware detection from the gray list.
1385-1394
- Bin Zhou, Daxin Jiang, Jian Pei, Hang Li:
OLAP on search logs: an infrastructure supporting data-driven applications in search engines.
1395-1404