Neighbor Discovery in P2P Systems: Preparing For Failures

Neighbor Discovery in P2P Systems: Preparing For Failures

Pedram Keyani Brian Larson Muthukumar Senthil

Computer Science Department

Stanford University

Stanford, CA 94305

USA

Tel (650) 723-1963

Fax (650) 725-2588

{ pkeyani, balarson, msenthil} @cs.stanford.edu

Proposal Abstract: Peer-to-Peer protocols such as Gnutella define the way in which nodes can share information. There is a major lacking in the protocol because it does not enforce an algorithm for discovery that makes the Overlay network resilient to failures of a large portion of the network. We propose a project to collect relevant information about Gnutella from Steven Gribble and from this statistical information create a P2P simulator where we can explore different discovery algorithms and produce numbers on how they perform from failure. We believe different types of discovery algorithms can produce different overlay topologies that are more resilient to failure from failing nodes, malicious nodes and physical network failure. Our goal is bring the philosophy of AME (Availability, Maintainability and Evolutionary growth) to P2P systems in a general way that will be applicable to a wide range of P2P systems.

Proposal Steps:

Collect statistical data on the current Gnutella landscape about how many nodes are alive at any one time, how many neighbors they have, how often they ping and pong, what percentage of network traffic is used for what types of messages… etc. We will either acquire data from past crawls of Gnutella network or create our own client to crawl network and use that information
Simulate system based on statistical data by doing (a) or (b)

Find a simulator suitable for our purposes which will allow us to setup a set of nodes and plug in different algorithms for them to discover one another ß We will contact Petros Maniatis about a P2P simulator he is working on
Create our own simulator with the same functionality that allows us to plug in discovery algorithms and allow us to kill or disable some percentage of the nodes and model the affects of this.

Implement different discovery algorithms to increase the reliability of the entire system. We have some ideas about different

Selectively pick neighbors by trying to increase the redundant paths between a set of neighbors
Pick nodes so that you have a low hop count between a majority of the other nodes
Use a reputation scheme to help nodes pick neighbors