Wen Gao, Yong Rui, Alan Hanjalic, Changsheng Xu, Eckehard G. Steinbach, Abdulmotaleb El-Saddik, Michelle X. Zhou (Eds.):
Proceedings of the 17th International Conference on Multimedia 2009, Vancouver, British Columbia, Canada, October 19-24, 2009.
 ACM 2009, ISBN 978-1-60558-608-3 
 
 
 
 
 
Keynote addresses
 
- HongJiang Zhang:
Multimedia content analysis and search: new perspectives and approaches.
1-2
  
 
 
 
 
 - Touradj Ebrahimi:
Quality of multimedia experience: past, present and future.
3-4
  
 
 
 
 
 
Best Paper Session
 
- Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui:
Short-term audio-visual atoms for generic video concept classification.
5-14
  
 
 
 
 
 - Zheng-Jun Zha, Linjun Yang, Tao Mei, Meng Wang, Zengfu Wang:
Visual query suggestion.
15-24
  
 
 
 
 
 - Hao Yin, Xuening Liu, Tongyu Zhan, Vyas Sekar, Feng Qiu, Chuang Lin, Hui Zhang, Bo Li:
Design and deployment of a hybrid CDN-P2P system for live video streaming: experiences with LiveSky.
25-34
  
 
 
 
 
 - Mauro Cherubini, Rodrigo de Oliveira, Nuria Oliver:
Understanding near-duplicate videos: a user-centric approach.
35-44
  
 
 
 
 
 
Content track C1: image retrieval
 
- Lijun Zhang, Chun Chen, Wei Chen, Jiajun Bu, Deng Cai, Xiaofei He:
Convex experimental design using manifold structure for image retrieval.
45-54
  
 
 
 
 
 - Yiming Liu, Dong Xu, Ivor W. Tsang, Jiebo Luo:
Using large-scale web data to facilitate textual query based retrieval of consumer photos.
55-64
  
 
 
 
 
 - Yin-Hsi Kuo, Kuan-Ting Chen, Chien-Hsing Chiang, Winston H. Hsu:
Query expansion for hash-based image object retrieval.
65-74
  
 
 
 
 
 - Shiliang Zhang, Qi Tian, Gang Hua, Qingming Huang, Shipeng Li:
Descriptive visual words and visual phrases for image applications.
75-84
  
 
 
 
 
 
Content track C2: content analysis applications
 
Content track C3: image annotation and tagging
 
- Xiaobai Liu, Bin Cheng, Shuicheng Yan, Jinhui Tang, Tat-Seng Chua, Hai Jin:
Label to region by bi-layer sparsity priors.
115-124
  
 
 
 
 
 - Liangliang Cao, Jie Yu, Jiebo Luo, Thomas S. Huang:
Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression.
125-134
  
 
 
 
 
 - Lei Wu, Steven C. H. Hoi, Rong Jin, Jianke Zhu, Nenghai Yu:
Distance metric learning from uncertain side information with application to automated photo tagging.
135-144
  
 
 
 
 
 
Content track C4: video analysis
 
- Hung-Khoon Tan, Chong-Wah Ngo, Richang Hong, Tat-Seng Chua:
Scalable detection of partial near-duplicate videos by visual-temporal consistency.
145-154
  
 
 
 
 
 - Yu-Gang Jiang, Chong-Wah Ngo, Shih-Fu Chang:
Semantic context transfer across heterogeneous sources for domain adaptive video search.
155-164
  
 
 
 
 
 - Guangyu Zhu, Ming Yang, Kai Yu, Wei Xu, Yihong Gong:
Detecting video events based on action recognition in complex scenes using spatio-temporal descriptor.
165-174
  
 
 
 
 
 - Yi Yang, Dong Xu, Feiping Nie, Jiebo Luo, Yueting Zhuang:
Ranking with local regression and global alignment for cross media retrieval.
175-184
  
 
 
 
 
 
Content track C5: audio and music
 
- Thilo Stadelmann, Bernd Freisleben:
Unfolding speaker clustering potential: a biomimetic approach.
185-194
  
 
 
 
 
 - Gerald Friedland, Chuohao Yeo, Hayley Hung:
Visual speaker localization aided by acoustic models.
195-202
  
 
 
 
 
 - Naoki Yasuraoka, Takehiro Abe, Katsutoshi Itoyama, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno:
Changing timbre and phrase in existing musical performances as you like: manipulations of single part using harmonic and inharmonic models.
203-212
  
 
 
 
 
 - Bingjun Zhang, Qiaoliang Xiang, Huanhuan Lu, Jialie Shen, Ye Wang:
Comprehensive query-dependent fusion using regression-on-folksonomies: a case study of multimodal music search.
213-222
  
 
 
 
 
 
Content track C6: learning and concept detection
 
Application track A1: interactive applications
 
- Jin-Yao Lin, Yen-Yu Chen, Ju-Chun Ko, HuiShan Kao, Wei-Han Chen, Tsun-Hung Tsai, Su-Chu Hsu, Yi-Ping Hung:
i-m-Tube: an interactive multi-resolution tubular display.
253-260
  
 
 
 
 
 - Wei-Hao Lin, Alexander G. Hauptmann:
Identifying news videos' ideological perspectives using emphatic patterns of visual concepts.
261-270
  
 
 
 
 
 - Micah T. Taylor, Anish Chandak, Lakulish Antani, Dinesh Manocha:
RESound: interactive sound rendering for dynamic virtual environments.
271-280
  
 
 
 
 
 - Luiz Fernando Gomes Soares, Romualdo M. R. Costa, Márcio Ferreira Moreno, Marcelo Ferreira Moreno:
Multiple exhibition devices in DTV systems.
281-290
  
 
 
 
 
 
Application track A2: context awareness
 
- Meng Wang, Bo Liu, Xian-Sheng Hua:
Accessible image search.
291-300
  
 
 
 
 
 - Liang Shi, Jinqiao Wang, Lingyu Duan, Hanqing Lu:
Consumer video retargeting: context assisted spatial-temporal grid optimization.
301-310
  
 
 
 
 
 - Yi Yang, Yueting Zhuang, Dong Xu, Yunhe Pan, Dacheng Tao, Stephen J. Maybank:
Retrieval based interactive cartoon synthesis via unsupervised bi-distance metric learning.
311-320
  
 
 
 
 
 - Stephan Kopf, Johannes Kiess, Hendrik Lemelson, Wolfgang Effelsberg:
FSCAV: fast seam carving for size adaptation of videos.
321-330
  
 
 
 
 
 
Application track A3: information summarization
 
System track S1: mobile devices and hardware/sensor support
 
- Yen-Lin Huang, Yun-Chung Shen, Ja-Ling Wu:
Scalable computation for spatially scalable video coding using NVIDIA CUDA and multi-core CPU.
361-370
  
 
 
 
 
 - Nan Wu, Mei Wen, Wei Wu, Ju Ren, Huayou Su, Changqing Xun, Chunyuan Zhang:
Streaming HD H.264 encoder on programmable processors.
371-380
  
 
 
 
 
 - XiaoMing Chen, Zhendong Zhao, Ahmad Rahmati, Ye Wang, Lin Zhong:
SaVE: sensor-assisted motion estimation for efficient H.264/AVC video encoding.
381-390
  
 
 
 
 
 - Shu Shi, Won J. Jeon, Klara Nahrstedt, Roy H. Campbell:
Real-time remote rendering of 3D video for mobile devices.
391-400
  
 
 
 
 
 
System track S2: media streaming and media content distribution
 
System track S3: 3D mesh streaming + HCM track H2
 
HCM track H1: human-centered multimedia
 
- Ioannis Arapakis, Ioannis Konstas, Joemon M. Jose:
Using facial expressions and peripheral physiological signals as implicit indicators of topical relevance.
461-470
  
 
 
 
 
 - Frank R. Bentley, Michael Groble:
TuVista: meeting the multimedia needs of mobile sports fans.
471-480
  
 
 
 
 
 - Wanmin Wu, Md. Ahsan Arefin, Raoul Rivas, Klara Nahrstedt, Renata M. Sheppard, Zhenyu Yang:
Quality of experience in distributed interactive multimedia environments: toward a theoretical framework.
481-490
  
 
 
 
 
 - Kuan-Ta Chen, Chen-Chi Wu, Yu-Chun Chang, Chin-Laung Lei:
A crowdsourceable QoE evaluation framework for multimedia content.
491-500
  
 
 
 
 
 
Short papers session 1: content analysis
 
- Bo Han, Yan Yan, Zhenghua Chen, Chang Liu, Weiguo Wu:
A general framework for automatic on-line replay detection in sports video.
501-504
  
 
 
 
 
 - Xiao Wu, Chong-Wah Ngo, Jintao Li, Yongdong Zhang:
Localizing volumetric motion for action recognition in realistic videos.
505-508
  
 
 
 
 
 - Wei-Ta Chu, Chia-Hung Lin, Jen-Yu Yu:
Feature classification for representative photo selection.
509-512
  
 
 
 
 
 - Yuyu Liu, Yoichi Sato:
Visual localization of non-stationary sound sources.
513-516
  
 
 
 
 
 - Zhong Li, Hangzai Luo, Jianping Fan:
Incorporating camera metadata for attended region detection and consumer photo classification.
517-520
  
 
 
 
 
 - Yuanlong Shao, Yuan Zhou, Xiaofei He, Deng Cai, Hujun Bao:
Semi-supervised topic modeling for image annotation.
521-524
  
 
 
 
 
 - Boqing Gong, Chunjing Xu, Jianzhuang Liu, Xiaoou Tang:
Boosting 3D object retrieval by object flexibility.
525-528
  
 
 
 
 
 - Zhuoyuan Chen, Lifeng Sun, Shiqiang Yang:
Auto-cut for web images.
529-532
  
 
 
 
 
 - Xi Liu, Zhiping Shi, Zhixin Li, Zhongzhi Shi:
Coboost learning of visual categories with 1st and 2nd order features from Google images.
533-536
  
 
 
 
 
 - Ming Liu, Shifeng Chen, Jianzhuang Liu, Xiaoou Tang:
Video completion via motion guided spatial-temporal global optimization.
537-540
  
 
 
 
 
 - Xiaoshuai Sun, Hongxun Yao, Rongrong Ji, Shaohui Liu:
Photo assessment based on computational visual attention model.
541-544
  
 
 
 
 
 - Saloua Litayem, Alexis Joly, Nozha Boujemaa:
Interactive objects retrieval with efficient boosting.
545-548
  
 
 
 
 
 - Zhipeng Wu, Shuqiang Jiang, Qingming Huang:
Near-duplicate video matching with transformation recognition.
549-552
  
 
 
 
 
 - Yezhou Yang, Mingli Song, Na Li, Jiajun Bu, Chun Chen:
Visual attention analysis by pseudo gravitational field.
553-556
  
 
 
 
 
 - Lei Bao, Juan Cao, Tian Xia, Yong-Dong Zhang, Jintao Li:
Locally non-negative linear structure learning for interactive image retrieval.
557-560
  
 
 
 
 
 - Junyong You, Andrew Perkis, Miska M. Hannuksela, Moncef Gabbouj:
Perceptual quality assessment based on visual attention analysis.
561-564
  
 
 
 
 
 - Go Irie, Kota Hidaka, Takashi Satou, Akira Kojima, Toshihiko Yamasaki, Kiyoharu Aizawa:
Latent topic driving model for movie affective scene classification.
565-568
  
 
 
 
 
 - Boqing Gong, Yueming Wang, Jianzhuang Liu, Xiaoou Tang:
Automatic facial expression recognition on a single 3D face by exploring shape deformation.
569-572
  
 
 
 
 
 - Hao Xu, Jingdong Wang, Xian-Sheng Hua, Shipeng Li:
Tag refinement by regularized LDA.
573-576
  
 
 
 
 
 - Yang Liu, Yan Liu:
Tensor distance based multilinear multidimensional scaling for image and video analysis.
577-580
  
 
 
 
 
 - Alexis Joly, Olivier Buisson:
Logo retrieval with a contrario visual query expansion.
581-584
  
 
 
 
 
 - Sarah Favre, Alfred Dielmann, Alessandro Vinciarelli:
Automatic role recognition in multiparty recordings using social networks and probabilistic sequential models.
585-588
  
 
 
 
 
 - Yingyu Liang, Jianmin Li, Bo Zhang:
Vocabulary-based hashing for image search.
589-592
  
 
 
 
 
 - Tong Zhang, Chee Keat Fong, Linxing Xiao, Jie Zhou:
Automatic and instant ring tone generation based on music structure analysis.
593-596
  
 
 
 
 
 - Tong Zhang, Jun Xiao, Di Wen, Xiaoqing Ding:
Face based image navigation and search.
597-600
  
 
 
 
 
 - Christian Jansohn, Adrian Ulges, Thomas M. Breuel:
Detecting pornographic video content by combining image features with motion information.
601-604
  
 
 
 
 
 - Asaad Hakeem, Mun Wai Lee, Omar Javed, Niels Haering:
Semantic video search using natural language queries.
605-608
  
 
 
 
 
 - George Toderici, Jay Yagnik:
Automatic, efficient, temporally-coherent video enhancement for large scale applications.
609-612
  
 
 
 
 
 - Xianming Liu, Hongxun Yao, Rongrong Ji, Pengfei Xu, Xiaoshuai Sun:
What is a complete set of keywords for image description & annotation on the web.
613-616
  
 
 
 
 
 - Xinyi Cui, Qingshan Liu, Dimitris N. Metaxas:
Temporal spectral residual: fast motion saliency detection.
617-620
  
 
 
 
 
 - Naveed Imran, Jingen Liu, Jiebo Luo, Mubarak Shah:
Event recognition from photo collections via PageRank.
621-624
  
 
 
 
 
 - Wei-Ta Chu, Ya-Lin Lee, Jen-Yu Yu:
Visual language model for face clustering in consumer photos.
625-628
  
 
 
 
 
 - Xin Geng, Kate Smith-Miles, Zhi-Hua Zhou, Liang Wang:
Face image modeling by multilinear subspace analysis with missing values.
629-632
  
 
 
 
 
 
Short papers session 2: content analysis and HCM
 
- Mei-Chen Yeh, Kwang-Ting Cheng:
A compact, effective descriptor for video copy detection.
633-636
  
 
 
 
 
 - Chao-Yung Hsu, Chun-Shien Lu, Soo-Chang Pei:
Secure and robust SIFT.
637-640
  
 
 
 
 
 - Minh-Son Dao, Sharma Ishan Nath, Noboru Babaguchi:
Preserving topological information in sub-trajectories-based representation for spatio-temporal trajectories indexing and retrieval.
641-644
  
 
 
 
 
 - Juan Cao, HongFang Jing, Chong-Wah Ngo, Yongdong Zhang:
Distribution-based concept selection for concept-based video retrieval.
645-648
  
 
 
 
 
 - Fariza Fauzi, Jer-Lang Hong, Mohammed Belkhatir:
Webpage segmentation for extracting images and their surrounding contextual information.
649-652
  
 
 
 
 
 - Lingfang Li, Ning Zhang, Ling-Yu Duan, Qingming Huang, Jun Du, Ling Guan:
Automatic sports genre categorization and view-type classification over large-scale dataset.
653-656
  
 
 
 
 
 - Adrian Popescu, Pierre-Alain Moëllic, Ioannis Kanellos, Rémi Landais:
Lightweight web image reranking.
657-660
  
 
 
 
 
 - Xirong Li, Cees G. M. Snoek:
Visual categorization with negative examples for free.
661-664
  
 
 
 
 
 - Panagiotis Sidiropoulos, Vasileios Mezaris, Ioannis Kompatsiaris, Hugo Meinedo, Isabel Trancoso:
Multi-modal scene segmentation using scene transition graphs.
665-668
  
 
 
 
 
 - Masashi Nishiyama, Takahiro Okabe, Yoichi Sato, Imari Sato:
Sensation-based photo cropping.
669-672
  
 
 
 
 
 - Bart Thomee, Mark J. Huiskes, Erwin M. Bakker, Michael S. Lew:
Deep exploration for experiential image retrieval.
673-676
  
 
 
 
 
 - Hrishikesh Aradhye, George Toderici, Jay Yagnik:
Adaptive, selective, automatic tonal enhancement of faces.
677-680
  
 
 
 
 
 - Giulia Garau, Sileye O. Ba, Hervé Bourlard, Jean-Marc Odobez:
Investigating the use of visual focus of attention for audio-visual speaker diarisation.
681-684
  
 
 
 
 
 - Saman Cooray, Hervé Bredin, Li-Qun Xu, Noel E. O'Connor:
An interactive and multi-level framework for summarising user generated videos.
685-688
  
 
 
 
 
 - Yanhua Chen, Ming Dong, Wanggen Wan:
Image co-clustering with multi-modality features and user feedbacks.
689-692
  
 
 
 
 
 - Danzhou Liu, Kien A. Hua:
Transfer non-metric measures into metric for similarity search.
693-696
  
 
 
 
 
 - Christian Beecks, Merih Seran Uysal, Thomas Seidl:
Signature quadratic form distances for content-based similarity.
697-700
  
 
 
 
 
 - Feng Tang, Yuli Gao:
Fast near duplicate detection for personal image collections.
701-704
  
 
 
 
 
 - Steven R. Ness, Anthony Theocharis, George Tzanetakis, Luis Gustavo Martins:
Improving automatic music tag annotation using stacked generalization of probabilistic SVM outputs.
705-708
  
 
 
 
 
 - Peng Wu, Daniel Tretter:
Close & closer: social cluster and closeness from photo collections.
709-712
  
 
 
 
 
 - Seungmin Rho, Byeong-jun Han, Eenjun Hwang:
SVR-based music mood classification and context-based music recommendation.
713-716
  
 
 
 
 
 - Yingzhen Yang, Yin Zhu, Qunsheng Peng:
Image completion using structural priority belief propagation.
717-720
  
 
 
 
 
 - Chandrasekar Ramachandran, Rahul Malik, Xin Jin, Jing Gao, Klara Nahrstedt, Jiawei Han:
VideoMule: a consensus learning approach to multi-label classification from noisy user-generated videos.
721-724
  
 
 
 
 
 - Li Wang, Linjun Yang, Xinmei Tian:
Query aware visual similarity propagation for image search reranking.
725-728
  
 
 
 
 
 - Subramanian Ramanathan, Harish Katti, Raymond Huang, Tat-Seng Chua, Mohan S. Kankanhalli:
Automated localization of affective objects and actions in images via caption text-cum-eye gaze analysis.
729-732
  
 
 
 
 
 - Bruno Lepri, Nadia Mana, Alessandro Cappelletti, Fabio Pianesi:
Automatic prediction of individual performance from "thin slices" of social behavior.
733-736
  
 
 
 
 
 - Driss Choujaa, Naranker Dulay:
Routine classification through sequence alignment.
737-740
  
 
 
 
 
 - Werner Bailer, Herwig Rehatschek:
Comparing fact finding tasks and user survey for evaluating a video browsing tool.
741-744
  
 
 
 
 
 - Jinjing Xie, Yiqiang Chen, Junfa Liu, Chunyan Miao, Xingyu Gao:
Interactive 3D caricature generation based on double sampling.
745-748
  
 
 
 
 
 - Xiaojuan Ma, Sonya S. Nikolova, Perry R. Cook:
W2ANE: when words are not enough: online multimedia language assistant for people with aphasia.
749-752
  
 
 
 
 
 - Luming Zhang, Mingli Song, Na Li, Jiajun Bu, Chun Chen:
Feature selection for fast speech emotion recognition.
753-756
  
 
 
 
 
 - Kan Ren, Janko Calic:
FreeEye: interactive intuitive interface for large-scale image browsing.
757-760
  
 
 
 
 
 - A. S. M. Mahfujur Rahman, M. Anwar Hossain, Jorge Parra, Abdulmotaleb El-Saddik:
Motion-path based gesture interaction with smart home services.
761-764
  
 
 
 
 
 - Liyue Zhao, Gita Sukthankar:
An active learning approach for segmenting human activity datasets.
765-768
  
 
 
 
 
 
Short papers session 3: applications and systems
 
- Andrew T. Sabin, Bryan Pardo:
A method for rapid personalization of audio equalization parameters.
769-772
  
 
 
 
 
 - Guangda Li, Zhaoyan Ming, Haojie Li, Tat-Seng Chua:
Video reference: question answering on YouTube.
773-776
  
 
 
 
 
 - Hui Tian, Ke Zhou, Hong Jiang, Dan Feng:
Digital logic based encoding strategies for steganography on voice-over-IP.
777-780
  
 
 
 
 
 - Richard J. Anderson, Devy Pranowo, Craig Prince, Fred Videon:
Integrating corrections into digital ink playback.
781-784
  
 
 
 
 
 - Gamhewage C. de Silva, Kiyoharu Aizawa:
Retrieving multimedia travel stories using location data and spatial queries.
785-788
  
 
 
 
 
 - Wei-Chao Chen, Agathe Battestini, Natasha Gelfand, Vidya Setlur:
Visual summaries of popular landmarks from community photo collections.
789-792
  
 
 
 
 
 - Dechao Liu, Matthew R. Scott, Rongrong Ji, Wei Jiang, Hongxun Yao, Xing Xie:
Location sensitive indexing for image-based advertising.
793-796
  
 
 
 
 
 - Sharmeen Shahabuddin, Razib Iqbal, Ali Nazari, Shervin Shirmohammadi:
Compressed domain spatial adaptation for H.264 video.
797-800
  
 
 
 
 
 - Qiang Hao, Rui Cai, Xin-Jing Wang, Jiang-Ming Yang, Yanwei Pang, Lei Zhang:
Generating location overviews with images and tags by mining user-generated travelogues.
801-804
  
 
 
 
 
 - Dijun Luo, Heng Huang:
Link prediction of multimedia social network via unsupervised face recognition.
805-808
  
 
 
 
 
 - Dong Liu, Meng Wang, Xian-Sheng Hua, Hong-Jiang Zhang:
Smart batch tagging of photo albums.
809-812
  
 
 
 
 
 - Radu Andrei Negoescu, Brett Adams, Dinh Q. Phung, Svetha Venkatesh, Daniel Gatica-Perez:
Flickr hypergroups.
813-816
  
 
 
 
 
 - Chih-Yu Yan, Ming-Chun Tien, Ja-Ling Wu:
Interactive background blurring.
817-820
  
 
 
 
 
 - Ming-Hsiu Chang, Ming-Chun Tien, Ja-Ling Wu:
WOW: wild-open warning for broadcast basketball video based on player trajectory.
821-824
  
 
 
 
 
 - Nguyen Thi Nhat Anh, Wenxian Yang, Jianfei Cai:
Seam carving extension: a compression perspective.
825-828
  
 
 
 
 
 - Peter Bajcsy, Kenton McHenry, Hye-Jung Na, Rahul Malik, Andrew Spencer, Suk-Kyu Lee, Rob Kooper, Mike Frogley:
Immersive environments for rehabilitation activities.
829-832
  
 
 
 
 
 - Joan-Isaac Biel, Daniel Gatica-Perez:
Wearing a YouTube hat: directors, comedians, gurus, and user aggregated behavior.
833-836
  
 
 
 
 
 - Graham Healy, Alan F. Smeaton:
An outdoor spatially-aware audio playback platform exemplified by a virtual zoo.
837-840
  
 
 
 
 
 - Yuxin Peng, Zhiwu Lu, Jianguo Xiao:
Semantic concept annotation based on audio PLSA model.
841-844
  
 
 
 
 
 - Anan Liu, Yongdong Zhang, Jintao Li:
Personalized movie recommendation.
845-848
  
 
 
 
 
 - Benoit Baccot, Omar Choudary, Romulus Grigoras, Vincent Charvillat:
On the impact of sequence and time in rich media advertising.
849-852
  
 
 
 
 
 - Tongwei Ren, Yan Liu,