Nithya Vijayakumar's Oral Qualifier Reading List

Advisory Committee: Beth Plale, Dennis Gannon, Ed Robertson

Completed on April 23, 2004

I. Books

Andrew S. Tanenbaum, Maarten van Steen
Distributed Systems: Principles and Paradigms, Chapters 4-8
Prentice Hall
 

Foster & Kesselman 
The Grid 2: Blueprint for a New Computing Infrastructure, Chapter: Database Access and Integration
Second Edition, Morgan Kaufmann
 

II. Data Management in the Grid
 

1. Leanne Guy, Peter Kunszt, Erwin Laure, Heinz Stockinger, Kurt Stockinger
Replica Management in Data Grids
Global Grid Forum 5, 2002

Describes the architecture and design of a high level system for replica management in data grids.
 

2. Vijayshankar Raman, Inderpal Narang, Chris Crone, Laura Haas, Susan Malaika, Tina Mukai, Dan Wolfson, Chaitan Baru
Data Access and Management Services on the Grid
Global Grid Forum 5, Edinburgh, Scotland, July 2002 

Presents a layer of grid data virtualization services that makes the complexities of data management transparent and eases information access on a grid.
 

3. M. Ripeanu, I. Foster
A Decentralized, Adaptive, Replica Location Service

11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11), Edinburgh, Scotland, July 24-26, 2002

Argues that a replica location mechanism that combines probabilistic representations of replica location information with soft-state protocols and a flat overlay network of nodes brings important benefits: genuine decentralization, low query latency, and flexibility to introduce adaptive communication schedules.
 
4. Ian Foster, Jens Vockler, Michael Wilde, Yong Zhao
The Virtual Data Grid: A New Model and Architecture for Data-Intensive Collaboration
Proceedings of CIDR, Biennial Conference on Innovative Data Systems Research, 2003

Defines the Virtual Data Grid model, a data system architecture based on an integrated treatment of data, the procedures used to manipulate data, and the computations that apply those procedures to data.
 

5. Gurmeet Singh, Shishir Bharathi, Ann Chervenak, Ewa Deelman, Carl Kesselman, Mary Manohar, Sonal Patil, Laura Pearlman
A Metadata Catalog Service for Data Intensive Applications
Supercomputing 2003, Phoenix, USA

Presents the design of a Metadata Catalog Service (MCS) that provides a mechanism for storing and accessing descriptive metadata and allows users to query for data items based on desired attributes.
 
6. Beth Plale
Architecture for Accessing Data Streams on the Grid
2nd European Across Grids Conference (AxGrids 2004), January 2004

Defines the architecture of a data stream store, a grid-enabled data resource that delivers data stream results to clients through the grid service infrastructure defined by OGSI.
 

7. Beth Plale, Craig Jacobs, Scott Jensen, Ying Liu, Charlie Moad, Rupali Parab, and Prajakta Vaidya
Understanding Grid Resource Information Management through a Synthetic Database Benchmark/Workload
4th IEEE/ACM International Symposium on Cluster Computing and the Grid (CCGrid2004), April 2004 

Contributes to the understanding of resource information representation and retrieval in grid computing through the development of a synthetic database benchmark/workload for grid resource information repositories and its application to MySQL, Xindice, and LDAP.
 


III. Data Streaming
 

1. Douglas Terry, David Goldberg, David Nichols, Brian Oki
Continuous Queries over Append-Only Databases
ACM SIGMOD International Conference on Management of Data, 1992

Converts a continuous query into an incremental query that efficiently finds new matches to the original query as new messages are added to the database.
 

2. Shivnath Babu, Jennifer Widom
Continuous Queries over Data Streams
ACM SIGMOD International Conference on Management of Data, 2001

Defines a framework for query processing in the presence of continuous data streams.
 

3. Sam Madden and Michael J. Franklin
Fjording the Stream: An Architecture for Queries over Streaming Sensor Data
International Conference on Data Engineering (ICDE), 2002

Presents an architecture for managing multiple queries over many sensors, focusing on an efficient, adaptive, and power-sensitive infrastructure upon which new query processing approaches can be built.
  

4. Brian Babcock, Shivnath Babu, Mayur Datar, Rajeev Motwani, Jennifer Widom
Models and Issues in Data Stream Systems
ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, 2002

Discusses fundamental models and issues in developing a general-purpose Data Stream Management System.
 

5. Stratis D. Viglas, Jeffrey F. Naughton
Rate-Based Query Optimization for Streaming Information
ACM SIGMOD International Conference on Management of Data, 2002

Proposes a rate-based approach to estimate the cost of query plans for relational query optimizers and gives an optimization framework that aims at maximizing the output rate of query evaluation plans.
 

6. Beth Plale and Karsten Schwan
Dynamic Querying of Streaming Data with the dQUOB System
IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 4, April 2003

Provides a conceptual relational data model and SQL query access over streaming data.
 
7. Nithya Vijayakumar and Beth Plale
Run Time Adaptations for Improved Performance on Asynchronous Data Streams
Manuscript submitted to a conference, February 2004

Reports results on the experimental evaluation of optimizations on the dQUOB system that address memory management, a performance and scalability bottleneck.
 
Supplementary Reading