Revision 5 (Greg Burri, 04/05/2011 11:19 PM) → Revision 6/18 (Greg Burri, 04/05/2011 11:25 PM)
h1. General approach

h2. Generalities

Each file the user wants to share is cut sequentially into pieces called _chunks_. All chunks have the same size, except the last one, which can be smaller. Each chunk has a fingerprint called a _hash_ which identifies it. We assume that two chunks with the same hash contain the same data. The hash is computed with a cryptographic hash function like "SHA-1":http://en.wikipedia.org/wiki/Sha-1.

h2. To know who are the other peers

Each peer periodically sends a message to tell the other peers that it exists. This message can contain various information, such as the peer name, the amount of shared data and the wanted chunk hashes (see the _Download_ section below). Multicast UDP is used to send the message to all other peers simultaneously.

h2. Browsing

Remote entries can be browsed by asking for the content of a given remote directory. A listed file can carry one or more of its hashes. There is a special case where the peer doesn't know any remote entries yet and wants to retrieve the root directories.

h2. Search for entries (by terms)

A multicast UDP datagram containing the search terms is sent to all other peers. The remote peers which have one or more matching entries send them back in a unicast UDP datagram. A file result can contain one or more of its hashes.

h2. Download

A chunk can only be downloaded if we know its hash. The hashes of a specific file can be requested explicitly. If the remote peer doesn't yet know the hashes of one of its files when they are requested, it computes them on the fly and sends them one by one as soon as possible.

The list of files to download is ordered by the user, so the top files must be downloaded first. We first try to obtain the hashes of the top files. As soon as we know a hash we can try to download the associated chunk. There can be many parallel downloads, but only from different remote peers. The number of parallel downloads is limited.
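
The on-the-fly hash computation mentioned above could look like the following minimal sketch. It is only an illustration, not the actual implementation: the function name @iter_chunk_hashes@ and the @chunk_size@ parameter are hypothetical, and the real chunk size is not specified on this page. The generator yields each SHA-1 digest as soon as its chunk has been read, so a peer could send hashes one by one without waiting for the whole file.

```python
import hashlib
import io
from typing import BinaryIO, Iterator


def iter_chunk_hashes(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield the SHA-1 digest of each chunk as soon as it is computed.

    chunk_size is an assumption of this sketch; the page does not give a value.
    Assumes a buffered stream, where read(n) returns n bytes unless the end
    of the data is reached.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of file
        # Every chunk has chunk_size bytes, except possibly the last one.
        yield hashlib.sha1(chunk).digest()


# Demo with a tiny chunk size: 10 bytes give two chunks of 4 bytes and one of 2.
data = b"abcdefghij"
hashes = list(iter_chunk_hashes(io.BytesIO(data), chunk_size=4))
```

Because the function is a generator, the caller decides when to pull the next hash, which matches the idea of sending hashes as soon as they are available.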
To choose a chunk to download, we take the first file having at least one un-downloaded chunk for which at least one free peer is available. Among these chunks we choose the one held by the fewest peers (rarest first); if several chunks are eligible, we choose randomly among them.

To know which remote peer has which wanted chunk, we periodically ask all remote peers whether they have the next chunks we want to download by sending them the corresponding hashes. The number of hashes sent is limited. This can be achieved by putting these hashes in the message described in the section _To know who are the other peers_.

To initiate a download, we ask a remote peer by sending a hash and an offset relative to the beginning of the chunk.

h2. Communicate by chatting

h2. Glossary

Remote <thing>:
An entry:
A file:
A peer:
A free peer:
A chunk:
A hash:
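
As an illustration of the chunk selection rule from the _Download_ section, here is a minimal sketch. All names (@Chunk@, @choose_chunk@, the peer identifiers) are hypothetical. It also assumes that rarity is measured by the total number of peers known to hold a chunk, which the text does not state explicitly.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Chunk:
    digest: bytes                                 # hash identifying the chunk
    downloaded: bool = False
    peers: Set[str] = field(default_factory=set)  # peers known to hold the chunk


def choose_chunk(files: List[List[Chunk]], free_peers: Set[str],
                 rng: Optional[random.Random] = None) -> Optional[Chunk]:
    """Pick the next chunk to download, or None if nothing is eligible.

    files is the download queue in the order chosen by the user; each file
    is represented by the list of its chunks.
    """
    rng = rng if rng is not None else random.Random()
    for chunks in files:
        # Un-downloaded chunks that at least one free peer can serve.
        candidates = [c for c in chunks
                      if not c.downloaded and c.peers & free_peers]
        if not candidates:
            continue  # nothing eligible in this file, try the next one
        # Rarest first (assumption: rarity counts all known peers).
        fewest = min(len(c.peers) for c in candidates)
        rarest = [c for c in candidates if len(c.peers) == fewest]
        return rng.choice(rarest)  # random choice among equally rare chunks
    return None


files = [
    [Chunk(b"a", downloaded=True, peers={"p1"}),
     Chunk(b"b", peers={"p1", "p2"}),
     Chunk(b"c", peers={"p2"})],
    [Chunk(b"d", peers={"p3"})],
]
chosen = choose_chunk(files, free_peers={"p2", "p3"})
```

In this example the first file still has two eligible chunks, and the one held by a single peer is preferred over the one held by two.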