Algorithms » History » Revision 22
Theses steps are valid independently of the language.
- Each file is split in word, there is a set of characters which will use as delimiter between the word like [' ', '#', '.', '?',..].
- Some character are replaced by a another, for example : ['é', 'ë',..] => 'e'. The goal is to remove all accent.
For a functional description see here : Functional definition
The match of a word can be partial from the beginning, for example train will match training.
This schema depicts how the results are sorted from one peer. Each peer result are then merged.
Each peer owns a peer id which is unique and generated during the first start. This ID is used to identify a peer, it's better than the previous usage of peer IP, considering this situation :
- A put in queue a file entry f from B, B doesn't know the hashes of this file entry.
- B change his IP address.
- A want to download f, it can ask B for the hashes even B's IP changed.
Core threads¶There are three kind of threads in the core in addition to the main thread :
- Downloading thread :
- Uploading thread :
- Updating file cache thread :
Updating the file cache¶
Here is the algorithm for the thread (
FileManager::FileUpdater) which will periodically update the file cache and persist it.
D : The set of shared directories T : Time during which the hashes are computed (for example 30s) F : A set containing file with unknown hashes, initially empty W : the directory watcher // First synchronize (at start) For each d in D (recursively) : - Add d to W - Synchronize physical folders and files with d content - Add in F the files which don't have hashes Loop : t : Time.now For each f in F : - Compute the unknown hash of files f - Remove f from F If (Time.now - t) > T : break - Wait for changes for a period of (if F is empty then INFINITE else 0) - When a modification occurs synchronize the file/folder - Add each new file in F - Persist the entire cache in a file (only every ~30min)
See here : Protocol_core-core