Project

General

Profile

Algorithms » History » Version 22

Greg Burri, 08/11/2009 11:44 PM

1 1 Greg Burri
h1. Algorithms
2
3 20 Greg Burri
4
h2. Word indexing
5
6
Theses steps are valid independently of the language.
7
8
# Each file is split in word, there is a set of characters which will use as delimiter between the word like [' ', '#', '.', '?',..].
9
# Some character are replaced by a another, for example : ['é', 'ë',..] => 'e'. The goal is to remove all accent.
10
11
12 1 Greg Burri
h2. Searching
13 2 Greg Burri
14 3 Greg Burri
For a functional description see here : [[Functional definition#The-search-window]]
15 15 Greg Burri
16 21 Greg Burri
The match of a word can be partial from the beginning, for example _train_ will match _training_.
17
18 1 Greg Burri
!aybabtu_search.png!
19 18 Greg Burri
_This schema depicts how the results are sorted from one peer. Each peer result are then merged._
20 4 Greg Burri
21 5 Greg Burri
h2. Peer ID
22
23 7 Greg Burri
Each peer owns a peer id which is unique and generated during the first start. This ID is used to identify a peer, it's better than the previous usage of peer IP, considering this situation :
24 5 Greg Burri
25
* _A_ put in queue a file entry _f_ from _B_, _B_ doesn't know the hashes of this file entry.
26
* _B_ change his IP address.
27
* _A_ want to download _f_, it can ask _B_ for the hashes even _B_'s IP changed.
28
29 8 Greg Burri
h2. Core threads
30
31 9 Greg Burri
There are three kind of threads in the core in addition to the main thread :
32
* [[Protocol_core-core#Downloading-thread|Downloading thread]] : @DownloadManager::ChunkDownloader@
33 8 Greg Burri
* Uploading thread : @UploadManager::Uploader@
34 11 Greg Burri
* [[Algorithms#Updating-the-file-cache|Updating file cache thread]] : @FileManager::FileUpdater@
35 5 Greg Burri
36 22 Greg Burri
h2. Updating the file cache
37 10 Greg Burri
38
Here is the algorithm for the thread (@FileManager::FileUpdater@) which will periodically update the file cache and persist it.
39
40
<pre>
41 1 Greg Burri
D : The set of shared directories
42 22 Greg Burri
T : Time during which the hashes are computed (for example 30s)
43
F : A set containing file with unknown hashes, initially empty
44
W : the directory watcher
45 13 Greg Burri
46
// First synchronize (at start)
47
For each d in D (recursively) :
48 22 Greg Burri
   - Add d to W
49 14 Greg Burri
   - Synchronize physical folders and files with d content
50 22 Greg Burri
   - Add in F the files which don't have hashes
51 13 Greg Burri
52 10 Greg Burri
Loop :   
53 1 Greg Burri
   t : Time.now
54 13 Greg Burri
   For each f in F :
55 1 Greg Burri
      - Compute the unknown hash of files f
56 10 Greg Burri
      - Remove f from F
57
      If (Time.now - t) > T : break
58 22 Greg Burri
   - Wait for changes for a period of (if F is empty then INFINITE else 0)
59
      - When a modification occurs synchronize the file/folder
60 13 Greg Burri
      - Add each new file in F
61 12 Greg Burri
   - Persist the entire cache in a file (only every ~30min)
62 10 Greg Burri
</pre>
63
64 5 Greg Burri
65 4 Greg Burri
h2. Downloading
66
67
See here : [[Protocol_core-core#Downloading-threads]]