Word index prototype
See: Algorithms
Measure
Each file and folder name is split into words and indexed by those words.
The stored item is an integer that simulates a pointer to a complex structure, so only the memory of the index itself (including the item pointer) is counted. In real usage, the item would contain the path to the file or folder, its size, and various other information.
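The idea can be sketched as follows. This is only a minimal illustration, assuming a plain hash map from word to a set of item ids; the structure actually used by the prototype is described on the Algorithms page and may well differ. The names `split_words`, `WordIndex`, `add` and `search` are hypothetical, and the tokenizer (lower-casing, splitting on anything that is not an ASCII letter or digit) is an assumption.

```python
import re
from collections import defaultdict


def split_words(name):
    """Hypothetical tokenizer: lower-case the name and split on every
    character that is not an ASCII letter or digit."""
    return [w for w in re.split(r"[^0-9a-z]+", name.lower()) if w]


class WordIndex:
    """Maps each word to the set of items whose name contains it.
    An item is a plain integer standing in for a pointer, as in the measurements."""

    def __init__(self):
        self._items_by_word = defaultdict(set)

    def add(self, name, item):
        # Index the item under every word of its name.
        for word in split_words(name):
            self._items_by_word[word].add(item)

    def search(self, query):
        # Return the items whose name contains every word of the query.
        words = split_words(query)
        if not words:
            return set()
        result = None
        for word in words:
            items = self._items_by_word.get(word, set())
            result = set(items) if result is None else result & items
        return result


index = WordIndex()
index.add("Pink Floyd - Money.mp3", 1)
index.add("Pink Panther.avi", 2)
index.add("money_report.pdf", 3)
```

With this sketch, `index.search("pink money")` yields only item 1, because the result sets of the individual words are intersected.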
Case 1
- 8'531 files/folders (some mp3)
- Time to index: ~1 s (~8'000 items/s)
- Average size per indexed item: 464 bytes
- Total size in memory: ~3.7 MB
- Search time: < 1 ms
Case 2
- 309'269 files/folders (various files)
- Time to index: ~30 s (~10'000 items/s)
- Average size per indexed item: 140 bytes (the filenames in case 1 are probably longer and thus contain more words)
- Total size in memory: ~42 MB
- Search time: < 1 ms
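As a quick sanity check, the total memory in both cases should be roughly the item count multiplied by the average size per item. The sketch below recomputes it, assuming that "MB" above means 2^20 bytes (an assumption, the page does not say); `total_mib` is a hypothetical helper. Since the averages are rounded, the results only land close to the reported totals, not exactly on them.

```python
MIB = 1024 * 1024  # assumption: "MB" in the measurements means 2^20 bytes

# (number of indexed items, average bytes per item) as reported above
cases = {
    "case 1": (8531, 464),
    "case 2": (309269, 140),
}


def total_mib(item_count, bytes_per_item):
    """Estimated index size in MiB from the item count and the average size."""
    return item_count * bytes_per_item / MIB


for name, (count, avg) in cases.items():
    print(f"{name}: {total_mib(count, avg):.1f} MiB")
```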
Conclusion
This algorithm is very time-efficient for both searching and indexing, but it uses a lot of memory. For the moment it will be used unchanged, but some space optimizations may be made in the future.
Updated by Greg Burri about 15 years ago · 6 revisions