2. AND 2008:
Singapore (SIGIR Workshop)
 Daniel P. Lopresti, Shourya Roy, Klaus U. Schulz, L. Venkata Subramaniam (Eds.):
Proceedings of the Second Workshop on Analytics for Noisy Unstructured Text Data, AND 2008, Singapore, July 24, 2008.
ACM International Conference Proceeding Series 303, ACM 2008, ISBN 978-1-60558-196-5

- Donna Harman:
 Some thoughts on failure analysis for noisy data.
             
- John Tait:
 Noise and information.
             
- Laurianne Sitbon, Patrice Bellot:
 How to cope with questions typed by dyslexic users.
1-8
             
- Daniel P. Lopresti:
 Optical character recognition errors and their effects on natural language processing.
9-16
             
- Ulrich Reffle, Annette Gotscharek, Christoph Ringlstetter, Klaus U. Schulz:
 Successfully detecting and correcting false friends using channel profiles.
17-22
             
- Valentin Jijkoun, Mahboob Alam Khalid, Maarten Marx, Maarten de Rijke:
 Named entity normalization in user generated content.
23-30
             
- Rema Ananthanarayanan, Vijil Chenthamarakshan, Prasad M. Deshpande, Raghuram Krishnapuram:
 Rule based synonyms for entity extraction from noisy text.
31-38
             
- Jiyin He, Wouter Weerkamp, Martha Larson, Maarten de Rijke:
 Blogger, stick to your story: modeling topical noise in blogs with coherence measures.
39-46
             
- Robert McArthur:
 Uncovering deep user context from blogs.
47-54
             
- Jinfeng Zhuang, Steven C. H. Hoi, Aixin Sun:
 On profiling blogs with representative entries.
55-62
             
- Soumya Datta, Sudeshna Sarkar:
 A comparative study of statistical features of language in blogs-vs-splogs.
63-66
             
- Sreangsu Acharyya, Sumit Negi, L. Venkata Subramaniam, Shourya Roy:
 Unsupervised learning of multilingual short message service (SMS) dialect from noisy examples.
67-74
             
- Antti Järvelin, Tuomas Talvensaari, Anni Järvelin:
 Data driven methods for improving mono- and cross-lingual IR performance in noisy environments.
75-82
             
- Lipika Dey, S. K. Mirajul Haque:
 Opinion mining from noisy text data.
83-90
             
- Rachit Arora, Balaraman Ravindran:
 Latent Dirichlet allocation based multi-document summarization.
91-97
             
- Amaresh Kumar Pandey, Tanveer J. Siddiqui:
 An unsupervised Hindi stemmer with heuristic improvements.
99-105
             
- Anurag Bhardwaj, Faisal Farooq, Huaigu Cao, Venu Govindaraju:
 Topic based language models for OCR correction.
107-112
             
- Eiman Al-Shammari, Jessica Lin:
 A novel Arabic lemmatization algorithm.
113-118
             
Copyright © Mon Mar 15 03:54:22 2010
 by Michael Ley (ley@uni-trier.de)