Project

General

Profile

Algorithms » History » Version 21

Greg Burri, 08/11/2009 09:04 AM

1 1 Greg Burri
h1. Algorithms
2
3 20 Greg Burri
4
h2. Word indexing
5
6
Theses steps are valid independently of the language.
7
8
# Each file is split in word, there is a set of characters which will use as delimiter between the word like [' ', '#', '.', '?',..].
9
# Some character are replaced by a another, for example : ['é', 'ë',..] => 'e'. The goal is to remove all accent.
10
11
12 1 Greg Burri
h2. Searching
13 2 Greg Burri
14 3 Greg Burri
For a functional description see here : [[Functional definition#The-search-window]]
15 15 Greg Burri
16 21 Greg Burri
The match of a word can be partial from the beginning, for example _train_ will match _training_.
17
18 1 Greg Burri
!aybabtu_search.png!
19 18 Greg Burri
_This schema depicts how the results are sorted from one peer. Each peer result are then merged._
20 4 Greg Burri
21 5 Greg Burri
h2. Peer ID
22
23 7 Greg Burri
Each peer owns a peer id which is unique and generated during the first start. This ID is used to identify a peer, it's better than the previous usage of peer IP, considering this situation :
24 5 Greg Burri
25
* _A_ put in queue a file entry _f_ from _B_, _B_ doesn't know the hashes of this file entry.
26
* _B_ change his IP address.
27
* _A_ want to download _f_, it can ask _B_ for the hashes even _B_'s IP changed.
28
29 8 Greg Burri
h2. Core threads
30
31 9 Greg Burri
There are three kind of threads in the core in addition to the main thread :
32
* [[Protocol_core-core#Downloading-thread|Downloading thread]] : @DownloadManager::ChunkDownloader@
33 8 Greg Burri
* Uploading thread : @UploadManager::Uploader@
34 11 Greg Burri
* [[Algorithms#Updating-the-file-cache|Updating file cache thread]] : @FileManager::FileUpdater@
35 5 Greg Burri
36 19 Greg Burri
h2. Updating the file cache (Obsolete, must be rewritten after the seventh prototype ended)
37 10 Greg Burri
38
Here is the algorithm for the thread (@FileManager::FileUpdater@) which will periodically update the file cache and persist it.
39
40
<pre>
41
D : The set of shared directories
42
T : Time during the hashes are computed (for example 30s)
43
F : A set of file initialy empty
44
45 13 Greg Burri
- Add a watcher to the shared directories
46
47
// First synchronize (at start)
48
For each d in D (recursively) :
49 14 Greg Burri
   - Synchronize physical folders and files with d content
50 13 Greg Burri
   - Add in F the files which doesn't have computed hashes
51
52
Loop :   
53 10 Greg Burri
   t : Time.now
54 1 Greg Burri
   For each f in F :
55 13 Greg Burri
      - Compute the unknown hash of files f
56 1 Greg Burri
      - Remove f from F
57 10 Greg Burri
      If (Time.now - t) > T : break
58
   - Wait T - (Time.now - t)
59 13 Greg Burri
   - Wait for changes for a period of P
60
      - When a modification occurs synchronise the file/folder
61
      - Add each new file in F
62 12 Greg Burri
   - Persist the entire cache in a file (only every ~30min)
63 10 Greg Burri
</pre>
64
65 5 Greg Burri
66 4 Greg Burri
h2. Downloading
67
68
See here : [[Protocol_core-core#Downloading-threads]]