Wen Gao, Yong Rui, Alan Hanjalic, Changsheng Xu, Eckehard G. Steinbach, Abdulmotaleb El-Saddik, Michelle X. Zhou (Eds.):
Proceedings of the 17th International Conference on Multimedia 2009, Vancouver, British Columbia, Canada, October 19-24, 2009.
ACM 2009, ISBN 978-1-60558-608-3
Keynote addresses
- HongJiang Zhang:
Multimedia content analysis and search: new perspectives and approaches.
1-2
- Touradj Ebrahimi:
Quality of multimedia experience: past, present and future.
3-4
Best Paper Session
- Wei Jiang, Courtenay V. Cotton, Shih-Fu Chang, Dan Ellis, Alexander C. Loui:
Short-term audio-visual atoms for generic video concept classification.
5-14
- Zheng-Jun Zha, Linjun Yang, Tao Mei, Meng Wang, Zengfu Wang:
Visual query suggestion.
15-24
- Hao Yin, Xuening Liu, Tongyu Zhan, Vyas Sekar, Feng Qiu, Chuang Lin, Hui Zhang, Bo Li:
Design and deployment of a hybrid CDN-P2P system for live video streaming: experiences with LiveSky.
25-34
- Mauro Cherubini, Rodrigo de Oliveira, Nuria Oliver:
Understanding near-duplicate videos: a user-centric approach.
35-44
Content track C1: image retrieval
- Lijun Zhang, Chun Chen, Wei Chen, Jiajun Bu, Deng Cai, Xiaofei He:
Convex experimental design using manifold structure for image retrieval.
45-54
- Yiming Liu, Dong Xu, Ivor W. Tsang, Jiebo Luo:
Using large-scale web data to facilitate textual query based retrieval of consumer photos.
55-64
- Yin-Hsi Kuo, Kuan-Ting Chen, Chien-Hsing Chiang, Winston H. Hsu:
Query expansion for hash-based image object retrieval.
65-74
- Shiliang Zhang, Qi Tian, Gang Hua, Qingming Huang, Shipeng Li:
Descriptive visual words and visual phrases for image applications.
75-84
Content track C2: content analysis applications
Content track C3: image annotation and tagging
- Xiaobai Liu, Bin Cheng, Shuicheng Yan, Jinhui Tang, Tat-Seng Chua, Hai Jin:
Label to region by bi-layer sparsity priors.
115-124
- Liangliang Cao, Jie Yu, Jiebo Luo, Thomas S. Huang:
Enhancing semantic and geographic annotation of web images via logistic canonical correlation regression.
125-134
- Lei Wu, Steven C. H. Hoi, Rong Jin, Jianke Zhu, Nenghai Yu:
Distance metric learning from uncertain side information with application to automated photo tagging.
135-144
Content track C4: video analysis
- Hung-Khoon Tan, Chong-Wah Ngo, Richang Hong, Tat-Seng Chua:
Scalable detection of partial near-duplicate videos by visual-temporal consistency.
145-154
- Yu-Gang Jiang, Chong-Wah Ngo, Shih-Fu Chang:
Semantic context transfer across heterogeneous sources for domain adaptive video search.
155-164
- Guangyu Zhu, Ming Yang, Kai Yu, Wei Xu, Yihong Gong:
Detecting video events based on action recognition in complex scenes using spatio-temporal descriptor.
165-174
- Yi Yang, Dong Xu, Feiping Nie, Jiebo Luo, Yueting Zhuang:
Ranking with local regression and global alignment for cross media retrieval.
175-184
Content track C5: audio and music
- Thilo Stadelmann, Bernd Freisleben:
Unfolding speaker clustering potential: a biomimetic approach.
185-194
- Gerald Friedland, Chuohao Yeo, Hayley Hung:
Visual speaker localization aided by acoustic models.
195-202
- Naoki Yasuraoka, Takehiro Abe, Katsutoshi Itoyama, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno:
Changing timbre and phrase in existing musical performances as you like: manipulations of single part using harmonic and inharmonic models.
203-212
- Bingjun Zhang, Qiaoliang Xiang, Huanhuan Lu, Jialie Shen, Ye Wang:
Comprehensive query-dependent fusion using regression-on-folksonomies: a case study of multimodal music search.
213-222
Content track C6: learning and concept detection
Application track A1: interactive applications
- Jin-Yao Lin, Yen-Yu Chen, Ju-Chun Ko, HuiShan Kao, Wei-Han Chen, Tsun-Hung Tsai, Su-Chu Hsu, Yi-Ping Hung:
i-m-Tube: an interactive multi-resolution tubular display.
253-260
- Wei-Hao Lin, Alexander G. Hauptmann:
Identifying news videos' ideological perspectives using emphatic patterns of visual concepts.
261-270
- Micah T. Taylor, Anish Chandak, Lakulish Antani, Dinesh Manocha:
RESound: interactive sound rendering for dynamic virtual environments.
271-280
- Luiz Fernando Gomes Soares, Romualdo M. R. Costa, Márcio Ferreira Moreno, Marcelo Ferreira Moreno:
Multiple exhibition devices in DTV systems.
281-290
Application track A2: context awareness
- Meng Wang, Bo Liu, Xian-Sheng Hua:
Accessible image search.
291-300
- Liang Shi, Jinqiao Wang, Lingyu Duan, Hanqing Lu:
Consumer video retargeting: context assisted spatial-temporal grid optimization.
301-310
- Yi Yang, Yueting Zhuang, Dong Xu, Yunhe Pan, Dacheng Tao, Stephen J. Maybank:
Retrieval based interactive cartoon synthesis via unsupervised bi-distance metric learning.
311-320
- Stephan Kopf, Johannes Kiess, Hendrik Lemelson, Wolfgang Effelsberg:
FSCAV: fast seam carving for size adaptation of videos.
321-330
Applications track A3: information summarization
System track S1: mobile devices and hardware/sensor support
- Yen-Lin Huang, Yun-Chung Shen, Ja-Ling Wu:
Scalable computation for spatially scalable video coding using NVIDIA CUDA and multi-core CPU.
361-370
- Nan Wu, Mei Wen, Wei Wu, Ju Ren, Huayou Su, Changqing Xun, Chunyuan Zhang:
Streaming HD H.264 encoder on programmable processors.
371-380
- XiaoMing Chen, Zhendong Zhao, Ahmad Rahmati, Ye Wang, Lin Zhong:
SaVE: sensor-assisted motion estimation for efficient H.264/AVC video encoding.
381-390
- Shu Shi, Won J. Jeon, Klara Nahrstedt, Roy H. Campbell:
Real-time remote rendering of 3D video for mobile devices.
391-400
System track S2: media streaming and media content distribution
System track S3: 3D mesh streaming + HCM track H2
HCM track H1: human-centered multimedia
- Ioannis Arapakis, Ioannis Konstas, Joemon M. Jose:
Using facial expressions and peripheral physiological signals as implicit indicators of topical relevance.
461-470
- Frank R. Bentley, Michael Groble:
TuVista: meeting the multimedia needs of mobile sports fans.
471-480
- Wanmin Wu, Md. Ahsan Arefin, Raoul Rivas, Klara Nahrstedt, Renata M. Sheppard, Zhenyu Yang:
Quality of experience in distributed interactive multimedia environments: toward a theoretical framework.
481-490
- Kuan-Ta Chen, Chen-Chi Wu, Yu-Chun Chang, Chin-Laung Lei:
A crowdsourceable QoE evaluation framework for multimedia content.
491-500
Short papers session 1: content analysis
- Bo Han, Yan Yan, Zhenghua Chen, Chang Liu, Weiguo Wu:
A general framework for automatic on-line replay detection in sports video.
501-504
- Xiao Wu, Chong-Wah Ngo, Jintao Li, Yongdong Zhang:
Localizing volumetric motion for action recognition in realistic videos.
505-508
- Wei-Ta Chu, Chia-Hung Lin, Jen-Yu Yu:
Feature classification for representative photo selection.
509-512
- Yuyu Liu, Yoichi Sato:
Visual localization of non-stationary sound sources.
513-516
- Zhong Li, Hangzai Luo, Jianping Fan:
Incorporating camera metadata for attended region detection and consumer photo classification.
517-520
- Yuanlong Shao, Yuan Zhou, Xiaofei He, Deng Cai, Hujun Bao:
Semi-supervised topic modeling for image annotation.
521-524
- Boqing Gong, Chunjing Xu, Jianzhuang Liu, Xiaoou Tang:
Boosting 3D object retrieval by object flexibility.
525-528
- Zhuoyuan Chen, Lifeng Sun, Shiqiang Yang:
Auto-cut for web images.
529-532
- Xi Liu, Zhiping Shi, Zhixin Li, Zhongzhi Shi:
Coboost learning of visual categories with 1st and 2nd order features from Google images.
533-536
- Ming Liu, Shifeng Chen, Jianzhuang Liu, Xiaoou Tang:
Video completion via motion guided spatial-temporal global optimization.
537-540
- Xiaoshuai Sun, Hongxun Yao, Rongrong Ji, Shaohui Liu:
Photo assessment based on computational visual attention model.
541-544
- Saloua Litayem, Alexis Joly, Nozha Boujemaa:
Interactive objects retrieval with efficient boosting.
545-548
- Zhipeng Wu, Shuqiang Jiang, Qingming Huang:
Near-duplicate video matching with transformation recognition.
549-552
- Yezhou Yang, Mingli Song, Na Li, Jiajun Bu, Chun Chen:
Visual attention analysis by pseudo gravitational field.
553-556
- Lei Bao, Juan Cao, Tian Xia, Yong-Dong Zhang, Jintao Li:
Locally non-negative linear structure learning for interactive image retrieval.
557-560
- Junyong You, Andrew Perkis, Miska M. Hannuksela, Moncef Gabbouj:
Perceptual quality assessment based on visual attention analysis.
561-564
- Go Irie, Kota Hidaka, Takashi Satou, Akira Kojima, Toshihiko Yamasaki, Kiyoharu Aizawa:
Latent topic driving model for movie affective scene classification.
565-568
- Boqing Gong, Yueming Wang, Jianzhuang Liu, Xiaoou Tang:
Automatic facial expression recognition on a single 3D face by exploring shape deformation.
569-572
- Hao Xu, Jingdong Wang, Xian-Sheng Hua, Shipeng Li:
Tag refinement by regularized LDA.
573-576
- Yang Liu, Yan Liu:
Tensor distance based multilinear multidimensional scaling for image and video analysis.
577-580
- Alexis Joly, Olivier Buisson:
Logo retrieval with a contrario visual query expansion.
581-584
- Sarah Favre, Alfred Dielmann, Alessandro Vinciarelli:
Automatic role recognition in multiparty recordings using social networks and probabilistic sequential models.
585-588
- Yingyu Liang, Jianmin Li, Bo Zhang:
Vocabulary-based hashing for image search.
589-592
- Tong Zhang, Chee Keat Fong, Linxing Xiao, Jie Zhou:
Automatic and instant ring tone generation based on music structure analysis.
593-596
- Tong Zhang, Jun Xiao, Di Wen, Xiaoqing Ding:
Face based image navigation and search.
597-600
- Christian Jansohn, Adrian Ulges, Thomas M. Breuel:
Detecting pornographic video content by combining image features with motion information.
601-604
- Asaad Hakeem, Mun Wai Lee, Omar Javed, Niels Haering:
Semantic video search using natural language queries.
605-608
- George Toderici, Jay Yagnik:
Automatic, efficient, temporally-coherent video enhancement for large scale applications.
609-612
- Xianming Liu, Hongxun Yao, Rongrong Ji, Pengfei Xu, Xiaoshuai Sun:
What is a complete set of keywords for image description & annotation on the web.
613-616
- Xinyi Cui, Qingshan Liu, Dimitris N. Metaxas:
Temporal spectral residual: fast motion saliency detection.
617-620
- Naveed Imran, Jingen Liu, Jiebo Luo, Mubarak Shah:
Event recognition from photo collections via PageRank.
621-624
- Wei-Ta Chu, Ya-Lin Lee, Jen-Yu Yu:
Visual language model for face clustering in consumer photos.
625-628
- Xin Geng, Kate Smith-Miles, Zhi-Hua Zhou, Liang Wang:
Face image modeling by multilinear subspace analysis with missing values.
629-632
Short papers session 2: content analysis and HCM
- Mei-Chen Yeh, Kwang-Ting Cheng:
A compact, effective descriptor for video copy detection.
633-636
- Chao-Yung Hsu, Chun-Shien Lu, Soo-Chang Pei:
Secure and robust SIFT.
637-640
- Minh-Son Dao, Sharma Ishan Nath, Noboru Babaguchi:
Preserving topological information in sub-trajectories-based representation for spatio-temporal trajectories indexing and retrieval.
641-644
- Juan Cao, HongFang Jing, Chong-Wah Ngo, Yongdong Zhang:
Distribution-based concept selection for concept-based video retrieval.
645-648
- Fariza Fauzi, Jer-Lang Hong, Mohammed Belkhatir:
Webpage segmentation for extracting images and their surrounding contextual information.
649-652
- Lingfang Li, Ning Zhang, Ling-Yu Duan, Qingming Huang, Jun Du, Ling Guan:
Automatic sports genre categorization and view-type classification over large-scale dataset.
653-656
- Adrian Popescu, Pierre-Alain Moëllic, Ioannis Kanellos, Rémi Landais:
Lightweight web image reranking.
657-660
- Xirong Li, Cees G. M. Snoek:
Visual categorization with negative examples for free.
661-664
- Panagiotis Sidiropoulos, Vasileios Mezaris, Ioannis Kompatsiaris, Hugo Meinedo, Isabel Trancoso:
Multi-modal scene segmentation using scene transition graphs.
665-668
- Masashi Nishiyama, Takahiro Okabe, Yoichi Sato, Imari Sato:
Sensation-based photo cropping.
669-672
- Bart Thomee, Mark J. Huiskes, Erwin M. Bakker, Michael S. Lew:
Deep exploration for experiential image retrieval.
673-676
- Hrishikesh Aradhye, George Toderici, Jay Yagnik:
Adaptive, selective, automatic tonal enhancement of faces.
677-680
- Giulia Garau, Sileye O. Ba, Hervé Bourlard, Jean-Marc Odobez:
Investigating the use of visual focus of attention for audio-visual speaker diarisation.
681-684
- Saman Cooray, Hervé Bredin, Li-Qun Xu, Noel E. O'Connor:
An interactive and multi-level framework for summarising user generated videos.
685-688
- Yanhua Chen, Ming Dong, Wanggen Wan:
Image co-clustering with multi-modality features and user feedbacks.
689-692
- Danzhou Liu, Kien A. Hua:
Transfer non-metric measures into metric for similarity search.
693-696
- Christian Beecks, Merih Seran Uysal, Thomas Seidl:
Signature quadratic form distances for content-based similarity.
697-700
- Feng Tang, Yuli Gao:
Fast near duplicate detection for personal image collections.
701-704
- Steven R. Ness, Anthony Theocharis, George Tzanetakis, Luis Gustavo Martins:
Improving automatic music tag annotation using stacked generalization of probabilistic SVM outputs.
705-708
- Peng Wu, Daniel Tretter:
Close & closer: social cluster and closeness from photo collections.
709-712
- Seungmin Rho, Byeong-jun Han, Eenjun Hwang:
SVR-based music mood classification and context-based music recommendation.
713-716
- Yingzhen Yang, Yin Zhu, Qunsheng Peng:
Image completion using structural priority belief propagation.
717-720
- Chandrasekar Ramachandran, Rahul Malik, Xin Jin, Jing Gao, Klara Nahrstedt, Jiawei Han:
VideoMule: a consensus learning approach to multi-label classification from noisy user-generated videos.
721-724
- Li Wang, Linjun Yang, Xinmei Tian:
Query aware visual similarity propagation for image search reranking.
725-728
- Subramanian Ramanathan, Harish Katti, Raymond Huang, Tat-Seng Chua, Mohan S. Kankanhalli:
Automated localization of affective objects and actions in images via caption text-cum-eye gaze analysis.
729-732
- Bruno Lepri, Nadia Mana, Alessandro Cappelletti, Fabio Pianesi:
Automatic prediction of individual performance from "thin slices" of social behavior.
733-736
- Driss Choujaa, Naranker Dulay:
Routine classification through sequence alignment.
737-740
- Werner Bailer, Herwig Rehatschek:
Comparing fact finding tasks and user survey for evaluating a video browsing tool.
741-744
- Jinjing Xie, Yiqiang Chen, Junfa Liu, Chunyan Miao, Xingyu Gao:
Interactive 3D caricature generation based on double sampling.
745-748
- Xiaojuan Ma, Sonya S. Nikolova, Perry R. Cook:
W2ANE: when words are not enough: online multimedia language assistant for people with aphasia.
749-752
- Luming Zhang, Mingli Song, Na Li, Jiajun Bu, Chun Chen:
Feature selection for fast speech emotion recognition.
753-756
- Kan Ren, Janko Calic:
FreeEye: interactive intuitive interface for large-scale image browsing.
757-760
- A. S. M. Mahfujur Rahman, M. Anwar Hossain, Jorge Parra, Abdulmotaleb El-Saddik:
Motion-path based gesture interaction with smart home services.
761-764
- Liyue Zhao, Gita Sukthankar:
An active learning approach for segmenting human activity datasets.
765-768
Short papers session 3: applications and systems
- Andrew T. Sabin, Bryan Pardo:
A method for rapid personalization of audio equalization parameters.
769-772
- Guangda Li, Zhaoyan Ming, Haojie Li, Tat-Seng Chua:
Video reference: question answering on YouTube.
773-776
- Hui Tian, Ke Zhou, Hong Jiang, Dan Feng:
Digital logic based encoding strategies for steganography on voice-over-IP.
777-780
- Richard J. Anderson, Devy Pranowo, Craig Prince, Fred Videon:
Integrating corrections into digital ink playback.
781-784
- Gamhewage C. de Silva, Kiyoharu Aizawa:
Retrieving multimedia travel stories using location data and spatial queries.
785-788
- Wei-Chao Chen, Agathe Battestini, Natasha Gelfand, Vidya Setlur:
Visual summaries of popular landmarks from community photo collections.
789-792
- Dechao Liu, Matthew R. Scott, Rongrong Ji, Wei Jiang, Hongxun Yao, Xing Xie:
Location sensitive indexing for image-based advertising.
793-796
- Sharmeen Shahabuddin, Razib Iqbal, Ali Nazari, Shervin Shirmohammadi:
Compressed domain spatial adaptation for H.264 video.
797-800
- Qiang Hao, Rui Cai, Xin-Jing Wang, Jiang-Ming Yang, Yanwei Pang, Lei Zhang:
Generating location overviews with images and tags by mining user-generated travelogues.
801-804
- Dijun Luo, Heng Huang:
Link prediction of multimedia social network via unsupervised face recognition.
805-808
- Dong Liu, Meng Wang, Xian-Sheng Hua, Hong-Jiang Zhang:
Smart batch tagging of photo albums.
809-812
- Radu Andrei Negoescu, Brett Adams, Dinh Q. Phung, Svetha Venkatesh, Daniel Gatica-Perez:
Flickr hypergroups.
813-816
- Chih-Yu Yan, Ming-Chun Tien, Ja-Ling Wu:
Interactive background blurring.
817-820
- Ming-Hsiu Chang, Ming-Chun Tien, Ja-Ling Wu:
WOW: wild-open warning for broadcast basketball video based on player trajectory.
821-824
- Nguyen Thi Nhat Anh, Wenxian Yang, Jianfei Cai:
Seam carving extension: a compression perspective.
825-828
- Peter Bajcsy, Kenton McHenry, Hye-Jung Na, Rahul Malik, Andrew Spencer, Suk-Kyu Lee, Rob Kooper, Mike Frogley:
Immersive environments for rehabilitation activities.
829-832
- Joan-Isaac Biel, Daniel Gatica-Perez:
Wearing a YouTube hat: directors, comedians, gurus, and user aggregated behavior.
833-836
- Graham Healy, Alan F. Smeaton:
An outdoor spatially-aware audio playback platform exemplified by a virtual zoo.
837-840
- Yuxin Peng, Zhiwu Lu, Jianguo Xiao:
Semantic concept annotation based on audio PLSA model.
841-844
- Anan Liu, Yongdong Zhang, Jintao Li:
Personalized movie recommendation.
845-848
- Benoit Baccot, Omar Choudary, Romulus Grigoras, Vincent Charvillat:
On the impact of sequence and time in rich media advertising.
849-852
- Tongwei Ren, Yan Liu,