Need efficient data access:
- yes -> table size > 10k rows
- yes -> what type of data are you querying
- full text search -> inverted index
- location data -> geospatial index (geohashing, quad-trees,r)
- In-memory exact matches -> has index
- Everything else -> B-tree
- No -> Full table scan
- yes -> what type of data are you querying
- No -> Full Table scan
How to find a row in a db without an index?
- Pull a page into memory and search for the row
- pull another page into memory and search for the row
- repeat until the row is found
How do indexes speed up queries?
- First load index into memory
- Use it to find out exactly which page has our row
- Then load just that page
Hash index not used much in production b/c they do not support range queries like b-trees. Useful more for caching.
B-trees good for prefix searches but bad for wildcards