ICS 1999: Rhodes, Greece
ICS '99, Proceedings of the 1999 International Conference on Supercomputing, June 20-25, 1999, Rhodes, Greece. ACM, 1999
- Francisca Quintana, Jesús Corbal, Roger Espasa, Mateo Valero:
Adding a vector unit to a superscalar processor.
1-10
- Huy Nguyen, Lizy Kurian John:
Exploiting SIMD parallelism in DSP and multimedia algorithms using the AltiVec technology.
11-20
- Kunle Olukotun, Lance Hammond, Mark Willey:
Improving the performance of speculatively parallel applications on the Hydra CMP.
21-30
- Jeffrey B. Rothman, Alan Jay Smith:
The pool of subsectors cache design.
31-42
- Peter J. Keleher:
Symmetry and performance in consistency protocols.
43-50
- F. Jesús Sánchez, Antonio González:
A locality sensitive multi-module cache with explicit management.
51-59
- Jeeraporn Srisawat, Nikitas A. Alexandridis:
A new "quad-tree-based" sub-system allocation technique for mesh-connected parallel machines.
60-67
- Andrei Radulescu, Arjan J. C. van Gemund:
On the complexity of list scheduling algorithms for distributed-memory systems.
68-75
- Daniel Jiménez-González, Josep-Lluis Larriba-Pey, Juan J. Navarro:
Communication conscious radix sort.
76-82
- Martin C. Rinard, Pedro C. Diniz:
Eliminating synchronization bottlenecks in object-based programs using adaptive replication.
83-92
- Kyung Dong Ryu, Jeffrey K. Hollingsworth, Peter J. Keleher:
Mechanisms and policies for supporting fine-grained cycle stealing.
93-100
- Dejan Perkovic, Peter J. Keleher:
Responsiveness without interrupts.
101-108
- Yuan C. Chou, Jason Fung, John Paul Shen:
Reducing branch misprediction penalties via dynamic control independence detection.
109-118
- Alex Ramírez, Josep-Lluis Larriba-Pey, Carlos Navarro, Josep Torrellas, Mateo Valero:
Software trace cache.
119-126
- Chi-Hung Chi, Jun-Li Yuan, Chin-Ming Cheung:
Cyclic dependence based data reference prediction.
127-134
- Xiaowei Shen, Arvind, Larry Rudolph:
CACHET: an adaptive cache coherence protocol for distributed shared-memory systems.
135-144
- Alexander V. Veidenbaum, Weiyu Tang, Rajesh K. Gupta, Alexandru Nicolau, Xiaomei Ji:
Adapting cache line size to application behavior.
145-154
- Timothy Sherwood, Brad Calder, Joel S. Emer:
Reducing cache misses using hardware and software page placement.
155-164
- Dongming Jiang, Brian O'Kelley, Xiang Yu, Sanjeev Kumar, Angelos Bilas, Jaswinder Pal Singh:
Application scaling under shared virtual memory on a cluster of SMPs.
165-174
- Liviu Iftode, Matthias A. Blumrich, Cezary Dubnicki, David L. Oppenheimer, Jaswinder Pal Singh, Kai Li:
Shared virtual memory with automatic update support.
175-183
- Evan Speight, Hazim Abdel-Shafi, John K. Bennett:
Realizing the performance potential of the virtual interface architecture.
184-192
- Valentin Puente, José A. Gregorio, Cruz Izu, Ramón Beivide, Fernando Vallejo:
Low-level router design and its impact on supercomputer system performance.
193-201
- José F. Martínez, Josep Torrellas, José Duato:
Improving the performance of bristled CC-NUMA systems using virtual channels and adaptivity.
202-209
- Daniel Franco, I. Garcés, Emilio Luque:
A new method to make communication latency uniform: distributed routing balancing.
210-219
- Francisco Corbera, Rafael Asenjo, Emilio L. Zapata:
New shape analysis techniques for automatic parallelization of C codes.
220-227
- Amy W. Lim, Gerald I. Cheong, Monica S. Lam:
An affine partitioning algorithm to maximize parallelism and minimize communication.
228-237
- Claudia Roberta Calidonna, Maurizio Giordano, Mario Mango Furnari:
A graphic parallelizing environment for user-compiler interaction.
238-245
- Masato Oguchi, Masaru Kitsuregawa:
Dynamic remote memory acquisition for parallel data mining on ATM-connected PC cluster.
246-252
- Yong E. Cho, Marianne Winslett, Szu-Wen Kuo, Jonghyun Lee, Ying Chen:
Parallel I/O for scientific applications on heterogeneous clusters: a resource-utilization approach.
253-259
- Shinji Sumimoto, Hiroshi Tezuka, Atsushi Hori, Hiroshi Harada, Toshiyuki Takahashi, Yutaka Ishikawa:
The design and evaluation of high performance communication using a Gigabit Ethernet.
260-267
- Donald Yeung:
The scalability of multigrain systems.
268-277
- Nandini Mukherjee, John R. Gurd:
A comparative analysis of four parallelisation schemes.
278-285
- Thomas L. Sterling, Larry A. Bergman:
A design analysis of a hybrid technology multithreaded architecture for petaflops scale computation.
286-293
- Xavier Martorell, Eduard Ayguadé, Nacho Navarro, Julita Corbalán, Marc González, Jesús Labarta:
Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors.
294-301
- Suvas Vajracharya, Steve Karmesin, Peter H. Beckman, James Crotinger, Allen D. Malony, Sameer Shende, R. R. Oldehoeft, Stephen Smith:
SMARTS: exploiting temporal locality and parallelism through vertical execution.
302-310
- Bradford L. Chamberlain, E. Christopher Lewis, Lawrence Snyder:
Problem space promotion and its evaluation as a technique for efficient parallel computation.
311-318
- Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou:
A quantitative architectural evaluation of synchronization algorithms and disciplines on ccNUMA systems: the case of the SGI Origin2000.
319-328
- Hongzhang Shan, Jaswinder Pal Singh:
A comparison of MPI, SHMEM and cache-coherent shared address space programming models on the SGI Origin2000.
329-338
- Ravi R. Iyer, Nancy M. Amato, Lawrence Rauchwerger, Laxmi N. Bhuyan:
Comparing the memory system performance of the HP V-class and SGI Origin 2000 multiprocessors using microbenchmarks and scientific applications.
339-347
- Ivan Martel, Daniel Ortega, Eduard Ayguadé, Mateo Valero:
Increasing effective IPC by exploiting distant parallelism.
348-355
- Amir Roth, Andreas Moshovos, Gurindar S. Sohi:
Improving virtual function call target prediction via dependence-based pre-computation.
356-364
- Pedro Marcuello, Antonio González:
Clustered speculative multithreaded processors.
365-372
- Yuanyuan Zhou, Peter M. Chen, Kai Li:
Fast cluster failover using virtual memory-mapped communication.
373-382
- Michael D. Beynon, Alan Sussman, Joel H. Saltz:
Performance impact of proxies in data intensive client-server applications.
383-390
- A. Ferre-Vilaplana, José M. Bernabéu-Aubán:
A comparison of two approaches for independent scaling up of processing and communication capacities in multicomputer networks.
391-398
- Glenn Reinman, Brad Calder, Dean M. Tullsen, Gary S. Tyson, Todd M. Austin:
Classifying load and store instructions for memory renaming.
399-407
- Gang Chen, Michael D. Smith:
Reorganizing global schedules for register allocation.
408-416
- V. Janaki Ramanan, Ramaswamy Govindarajan:
Resource usage models for instruction scheduling: two new models and a classification.
417-424
- John M. Mellor-Crummey, David B. Whalley, Ken Kennedy:
Improving memory hierarchy performance for irregular applications.
425-433
- Vijay Menon, Keshav Pingali:
High-level semantic optimization of numerical codes.
434-443
- Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Lebeck, Shyam Mundhra, Mithuna Thottethodi:
Nonlinear array layouts for hierarchical memory systems.
444-453
- Jay B. Brockman, Peter M. Kogge, Thomas L. Sterling, Vincent W. Freeh, Shannon K. Kuntz:
Microservers: a new memory semantics for massively parallel computing.
454-463
- Ashley Saulsbury, Su-Jaen Huang, Fredrik Dahlgren:
Efficient management of memory hierarchies in embedded DRAM systems.
464-473
- Carlos Molina, Antonio González, Jordi Tubella:
Dynamic removal of redundant computations.
474-481
- Induprakas Kodukula, Keshav Pingali, Robert Cox, Dror E. Maydan:
An experimental evaluation of tiling and shackling for memory hierarchy management.
482-491
- Jacqueline Chame, Sungdo Moon:
A tile selection algorithm for data locality and cache interference.
492-499
- Mahmut T. Kandemir, Prithviraj Banerjee, Alok N. Choudhary, J. Ramanujam, Eduard Ayguadé:
An integer linear programming approach for optimizing cache locality.
500-509
Copyright © Sun Mar 14 23:09:00 2010 by Michael Ley (ley@uni-trier.de)