FNV-1a Hashing Paradigm: Implementing High-Speed Textual Retrieval in Relational Databases

Nino Arsov
2 min readOct 25, 2023

--

Managing the surge of digital data remains a colossal challenge within the computational domain. FNV-1a, an offspring of the esteemed Fowler-Noll-Vo hashing family, emerges as a refined tool tailored for intricate textual data retrievals. Diving deeper, we’ll decipher its mechanics, coupled with real-world Intel processor optimizations and SQL insights for its application.

Inside the FNV-1a Mechanics

Synthesizing verbose textual data into compact hash values, FNV-1a operates iteratively using multiplication and bitwise XOR operations. This blend ensures a balance in hash value distribution while curtailing collision probabilities.

Intel-Optimized FNV-1a Hash Computation

Utilizing low-level Intel processor directives can further speed up the hashing operation:


#include <immintrin.h>

uint64_t fnv1a_hash(const char* text) {
uint64_t hash = 14695981039346656037ULL;
const uint64_t fnv_prime = 1099511628211ULL;
while (*text) {
hash ^= (uint64_t)*text++;
hash = _mm_mullo_epi64(_mm_set_epi64x(hash, 0), _mm_set_epi64x(fnv_prime, 0)).m64_u64[0];
}
return hash;
}

This C snippet leverages Intel’s SIMD (Single Instruction, Multiple Data) instructions to optimize the multiplication operation.

Integrating FNV-1a in SQL Routines

Having obtained a hash value, the subsequent phase involves leveraging it in SQL operations.

  1. Exact Match Search: Direct alignment is the essence here. Post computing the hash for a search term using FNV-1a, it’s used directly in SQL for data retrieval.

SQL Example

SELECT *
FROM articles
WHERE hash_value = fnv1a_hash('quantum');

2. Prefix Searches: A tad more intricate, as users start typing, hash values for potential prefixes are computed, guiding the SQL search.

SQL Example

Consider searching for words starting with “comp”. First, hash the term, then:


SELECT *
FROM articles
WHERE hash_value BETWEEN fnv1a_hash(‘comp’) AND fnv1a_hash(‘compz’);Here, ‘compz’ serves as an upper limit, ensuring the retrieval of all relevant prefix matches.

Here, ‘compz’ serves as an upper limit, ensuring the retrieval of all relevant prefix matches.

Applying FNV-1a: Wikipedia’s Microcosm

Using Wikipedia as a canvas, imagine millions of search queries flooding in. The backend, empowered by FNV-1a, swiftly computes, checks, and retrieves. The results? Near-instantaneous, precise matches for user queries.

Wrapping Up

The intricate fusion of FNV-1a within contemporary frameworks, further amplified by Intel optimizations, epitomizes computational efficiency. These tools serve as our compass, ensuring seamless navigation through our expansive digital realm.

--

--

Nino Arsov
Nino Arsov

Written by Nino Arsov

Nino Arsov holds an MSc degree in machine learning and statistical learning theory. He combines research experience from world-class universities with industry

No responses yet