Greg Burri, 09/10/2009 11:00 AM
h1. Word index prototype

See [[Algorithms#Word-indexing]].

h2. Measurements

Each file and folder name is split into words, and the item is indexed under each of these words.

The stored item is an integer that simulates a pointer to a more complex structure, so the measurements count only the memory used by the index itself (including the item pointer). In real usage, the item would hold the path to the file or folder, its size and other information.

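The indexing scheme described above can be sketched roughly as follows. This is only an illustration: it uses a plain Python dictionary rather than the structure described in [[Algorithms#Word-indexing]], and the names @split_words@ and @WordIndex@, the tokenization rule (lowercase, split on anything that is not an ASCII letter or digit) and the sample file names are all assumptions, not taken from the prototype.

```python
import re
from collections import defaultdict


def split_words(name):
    """Split a file or folder name into lowercase words (assumed rule)."""
    return [w for w in re.split(r"[^0-9a-z]+", name.lower()) if w]


class WordIndex:
    """Hypothetical index: maps each word to the set of integer item ids
    whose name contains that word."""

    def __init__(self):
        self._items_by_word = defaultdict(set)

    def add(self, name, item_id):
        # The integer id stands in for a pointer to the real item structure.
        for word in split_words(name):
            self._items_by_word[word].add(item_id)

    def search(self, query):
        """Return the ids of the items whose name contains every query word."""
        words = split_words(query)
        if not words:
            return set()
        result = None
        for word in words:
            ids = self._items_by_word.get(word, set())
            result = set(ids) if result is None else result & ids
        return result


index = WordIndex()
index.add("Music/Pink Floyd - Echoes.mp3", 1)
index.add("Music/Pink Floyd", 2)
index.add("Documents/report_2009.pdf", 3)
print(index.search("pink floyd"))
```

Storing only an integer per item, as in the measurements here, keeps the index size independent of whatever the real item structure would contain.
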
h3. Case 1

* 8'531 files/folders (some MP3 files)
* Time to index: ~1 s (~8'000 items/s)
* Average size per indexed item: 464 bytes
* Total size in memory: ~3.7 MB
* Time per search: < 1 ms

h3. Case 2

* 309'269 files/folders (various files)
* Time to index: ~30 s (~10'000 items/s)
* Average size per indexed item: 140 bytes (the file names in case 1 are probably longer and therefore contain more words)
* Total size in memory: ~42 MB
* Time per search: < 1 ms

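The figures of the two cases can be cross-checked with a little arithmetic: total memory is the item count multiplied by the average size per item, and throughput is the item count divided by the indexing time. The sketch below assumes that "MB" means 2^20 bytes and treats the approximate indexing times (~1 s and ~30 s) as exact, so its results are only indicative. The names @cases@ and @summarize@ are made up for this example.

```python
# Figures copied from the two cases; the indexing times are approximate.
cases = {
    "Case 1": {"items": 8531, "bytes_per_item": 464, "index_seconds": 1},
    "Case 2": {"items": 309269, "bytes_per_item": 140, "index_seconds": 30},
}


def summarize(case):
    """Derive total index size and indexing throughput for one case."""
    total_bytes = case["items"] * case["bytes_per_item"]
    return {
        # Assumption: 1 MB = 2**20 bytes.
        "total_mib": total_bytes / 2**20,
        "items_per_second": case["items"] / case["index_seconds"],
    }


for name, case in cases.items():
    s = summarize(case)
    print(f"{name}: {s['total_mib']:.1f} MiB, {s['items_per_second']:.0f} items/s")
```

Under these assumptions the derived totals land close to the reported ~3.7 MB and ~42 MB, and the throughputs close to the reported ~8'000 and ~10'000 items/s.
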
h2. Conclusion

This algorithm is very fast for both indexing and searching, but it uses a lot of memory. For the moment it will be used unchanged; some space optimizations may be made in the future.