Greg Burri, 09/10/2009 10:56 AM


Word index prototype

See also: Algorithms

Measurements

The name of each file and folder is split into words, and the item is indexed by those words.
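As an illustration of what "indexed by words" can mean, here is a minimal sketch of an inverted word index. It is not the prototype's actual data structure (that is described on the Algorithms page); the names `split_words` and `WordIndex`, the tokenization rule and the use of a plain dictionary are all assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed tokenization rule: a word is a run of letters or digits;
# spaces, dots, dashes and underscores all act as separators.
_WORD_RE = re.compile(r"[^\W_]+")


def split_words(name):
    """Split a file or folder name into lowercase words."""
    return [word.lower() for word in _WORD_RE.findall(name)]


class WordIndex:
    """Hypothetical inverted index: maps each word to the paths containing it."""

    def __init__(self):
        self._items_by_word = defaultdict(set)

    def add(self, path):
        # Only the last path component (the file or folder name) is indexed.
        name = path.rstrip("/").rsplit("/", 1)[-1]
        for word in split_words(name):
            self._items_by_word[word].add(path)

    def search(self, query):
        """Return the paths whose name contains every word of the query."""
        words = split_words(query)
        if not words:
            return set()
        result = None
        for word in words:
            matches = self._items_by_word.get(word, set())
            result = set(matches) if result is None else result & matches
        return result


index = WordIndex()
index.add("/music/01 - Daft_Punk - Around the World.mp3")
index.add("/music/Daft Punk")
index.add("/docs/world map.pdf")
print(sorted(index.search("daft punk")))
```

In a structure like this, a search costs one dictionary lookup per query word plus a set intersection, which is in line with the sub-millisecond search times reported below. Every word of every name is stored, which is one plausible reason for the memory cost per item.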

Case 1

  • 8'531 files/folders (some mp3)
  • Time to index: ~1 s (~8'000 items/s)
  • Average size per indexed item: 464 bytes
  • Total size in memory: ~3.7 MB
  • Search time: < 1 ms

Case 2

  • 309'269 files/folders (various files)
  • Time to index: ~30 s (~10'000 items/s)
  • Average size per indexed item: 140 bytes (the file names in case 1 are probably longer and therefore contain more words)
  • Total size in memory: ~42 MB
  • Search time: < 1 ms
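The figures of both cases can be cross-checked with a little arithmetic: throughput is the item count divided by the indexing time, and total memory is the item count multiplied by the average size per item. The sketch below does this, assuming that "MB" means 2^20 bytes and taking the approximate times at face value; the helper names are made up for the example.

```python
def items_per_second(item_count, seconds):
    """Indexing throughput."""
    return item_count / seconds


def total_mib(item_count, avg_bytes_per_item):
    """Total index size, assuming 1 MB = 1024 * 1024 bytes."""
    return item_count * avg_bytes_per_item / (1024 * 1024)


# (item count, indexing time in seconds, average bytes per item), as reported above.
cases = {
    "Case 1": (8531, 1, 464),
    "Case 2": (309269, 30, 140),
}

for name, (count, seconds, avg_bytes) in cases.items():
    rate = items_per_second(count, seconds)
    size = total_mib(count, avg_bytes)
    print(f"{name}: {rate:.0f} items/s, {size:.1f} MiB")
```

Under these assumptions the computed totals come out at about 3.8 and 41.3, within a few percent of the reported ~3.7 MB and ~42 MB, and the throughputs are close to the rounded ~8'000 and ~10'000 items/s.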

Conclusion

This algorithm is very time-efficient for both indexing and searching, but it uses a lot of memory (140 to 464 bytes per indexed item in the cases above). For now it will be used unchanged; some space optimizations may be made in the future.

Updated by Greg Burri over 15 years ago · 5 revisions