SCS Undergraduate Thesis Topics

2006-2007
Student Advisor(s) Thesis Topic
Seng Keat Teh Bruce Maggs/Dave Andersen Efficient Algorithms for Similarity-Enhanced Transfer in Peer-to-Peer Systems

Our study focuses on the development of efficient algorithms for data delivery and file distribution within the context of Similarity-Enhanced Transfer (SET), an enhancement to peer-to-peer(P2P) file distribution systems in which peers that hold similar files that overlap with the target file being downloaded are made available as additional peer sources for these overlapping data segments. Current P2P systems do not take advantage of the possible benefits of enlarging the set of peer sources from which the original target file can be downloaded with additional peers that hold similar but not exactly equivalent copies of the target file. Augmenting the initial set of exact file sources with sources of similar files and taking advantage of exploitable similarity at the level of file chunks has the potential of speeding up file download times while improving the robustness and longevity of a given P2P file distribution process.

Our work in particular contributes to the development of different file distribution algorithms than those currently employed by popular P2P file distribution systems. Current P2P file distribution systems such as BitTorrent employ game theoretic motivated algorithms and replication heuristics in the face of distributing files over the Internet to participating peers that are autonomous, self-serving and possibly dishonest. Our end goal is to instead develop file distribution algorithms for SET employed in more trustworthy peer-to-peer networks such as internal or private networks, storage area networks and networks of parallel storage devices where a degree of centralization, coordination and dependability is available. Our approach utilizes gossiping algorithms to broadcast a file being downloaded to all peer participants, with focus on optimal algorithms for the arbitrary multi-source broadcast and disjoint multi-source broadcast situations.


Close this window