Skip to content
Advertisement

How to identify best Max_bucket and Seed_value for Oracle ORA_Hash function?

I am new to Oracle Hash function. I know that this function is for encryption purpose. It is actually convert a very large paragraph into one single hash value.

The Ora_hash function have three different parameters:

  • Expression
  • Max_bucket
  • Seed_value

For the Max_bucket and seed value, the document says I can specify between 0 to 429496725. Max_bucket is default to 429496725 and Seed_Value is default to 0.

However, does anyone know what is the difference between 0 and 429496725 for those values?

I am actually planning to use it to compare two columns from two different tables, each rows in each columns have close to 3000 characters, and 1 table will have close to 1 million of records while the other will have close to billions of records. Of course both table can be joined with an ID columns.

As a result of this, I think using a hash value will be a better option than simply using A = B.

However, could anyone teach me how to identify best Max_bucket and Seed_value for Oracle ORA_Hash function?

Thanks in advance!

Advertisement

Answer

ORA_HASH is not intended for generating unique hash values. You probably want to use a function like STANDARD_HASH instead.

ORA_HASH is intended for situations where you want to quickly throw a bunch of values into a group of buckets, and hash collisions are useful. ORA_HASH is useful for hash partitioning; for example, you might want to split a table into 64 segments to improve manageability.

STANDARD_HASH can be used to generate practically-unique hashes, using algorithms like MD5 or SHA. These hash algorithms are useful for cryptographic purposes, whereas ORA_HASH would not be suitable. For example:

select standard_hash('asdf') the_hash from dual;

THE_HASH
--------
3DA541559918A808C2402BBA5012F6C60B27661C
User contributions licensed under: CC BY-SA
1 People found this is helpful
Advertisement