Commit 52ac4c6b authored by Richard Biener

[libiberty] remove TBAA violation in iterative_hash, improve code-gen

The following removes the TBAA violation present in iterative_hash.
Since we eventually LTO this code, it is important to fix.  This also
improves code generation for the >= 12 bytes loop by using | to compose
the 4-byte words: at least GCC 7 and up recognize that pattern and
perform a 4-byte load, while the variant with + is not recognized
(not on trunk either).  I think we have an enhancement bug for this
somewhere.
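
To illustrate the two byte-composition forms discussed above, here is a minimal sketch.  It is not the libiberty code: hashval_t is assumed to be a 32-bit unsigned type, and the helper names load_le32_or, load_le32_add, load_native32 and check_equivalence are hypothetical.  Both forms yield the same value because the shifted bytes occupy disjoint bits; the memcpy variant is shown only as a TBAA-safe point of comparison and matches the others on little-endian hosts only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: hashval_t is a 32-bit unsigned type.  */
typedef uint32_t hashval_t;

/* Compose a little-endian word with |, the form the commit switches to.  */
static hashval_t
load_le32_or (const unsigned char *k)
{
  return (k[0] | ((hashval_t)k[1] << 8) | ((hashval_t)k[2] << 16)
	  | ((hashval_t)k[3] << 24));
}

/* The previous form using +.  The shifted bytes never overlap, so no
   carries occur and the result equals the | form.  */
static hashval_t
load_le32_add (const unsigned char *k)
{
  return (k[0] + ((hashval_t)k[1] << 8) + ((hashval_t)k[2] << 16)
	  + ((hashval_t)k[3] << 24));
}

/* TBAA-safe native load through memcpy instead of a pointer cast.
   Agrees with the two functions above only on a little-endian host.  */
static hashval_t
load_native32 (const unsigned char *k)
{
  hashval_t v;
  memcpy (&v, k, sizeof v);
  return v;
}

/* Hypothetical helper: check that both compositions agree for K.  */
static void
check_equivalence (const unsigned char *k)
{
  assert (load_le32_or (k) == load_le32_add (k));
}
```

Whether a given compiler turns either form into a single 4-byte load depends on its version and target; inspecting the generated assembly is the way to confirm.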

Given that we now reliably merge the byte loads, and that the bogus
"optimized" path might only be relevant for archs that cannot do
misaligned loads efficiently, I've chosen to keep a specialization
for aligned accesses.
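
The shape of that specialization can be sketched as follows.  This is a simplified illustration under stated assumptions, not the libiberty function: sum_words and load_le32 are hypothetical names, the real mix(a,b,c) step is replaced by a plain sum, and uintptr_t is used for the alignment test where the original casts to size_t.  Both branches run the same byte-composing body; the duplication only gives the compiler a loop in which the pointer is known to be 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: hashval_t is a 32-bit unsigned type.  */
typedef uint32_t hashval_t;

/* Compose a little-endian 32-bit word from four bytes.  */
static inline hashval_t
load_le32 (const unsigned char *k)
{
  return (k[0] | ((hashval_t)k[1] << 8) | ((hashval_t)k[2] << 16)
	  | ((hashval_t)k[3] << 24));
}

/* Hypothetical stand-in for the hashing loop: sums the little-endian
   words of the key.  Trailing bytes (len % 4) are ignored here.  */
static hashval_t
sum_words (const void *k_in, size_t len)
{
  const unsigned char *k = (const unsigned char *) k_in;
  hashval_t acc = 0;

  if ((((uintptr_t) k) & 3) == 0)
    while (len >= 4)		/* aligned: a merged load is cheap */
      {
	acc += load_le32 (k);
	k += 4; len -= 4;
      }
  else				/* unaligned: same code, no alignment known */
    while (len >= 4)
      {
	acc += load_le32 (k);
	k += 4; len -= 4;
      }
  return acc;
}
```

Because both branches read the key only through unsigned char, the result is the same regardless of alignment or host endianness, and no aliasing rule is violated.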

libiberty/
	* hashtab.c (iterative_hash): Remove TBAA violating handling
	of aligned little-endian case in favor of just keeping the
	aligned case special-cased.  Use | for composing a larger word.
parent 5266f930
@@ -940,26 +940,23 @@ iterative_hash (const void *k_in /* the key */,
   c = initval;           /* the previous hash value */
   /*---------------------------------------- handle most of the key */
-#ifndef WORDS_BIGENDIAN
-  /* On a little-endian machine, if the data is 4-byte aligned we can hash
-     by word for better speed. This gives nondeterministic results on
-     big-endian machines.  */
-  if (sizeof (hashval_t) == 4 && (((size_t)k)&3) == 0)
-    while (len >= 12)    /* aligned */
+  /* Provide specialization for the aligned case for targets that cannot
+     efficiently perform misaligned loads of a merged access.  */
+  if ((((size_t)k)&3) == 0)
+    while (len >= 12)
       {
-	a += *(hashval_t *)(k+0);
-	b += *(hashval_t *)(k+4);
-	c += *(hashval_t *)(k+8);
+	a += (k[0] | ((hashval_t)k[1]<<8) | ((hashval_t)k[2]<<16) | ((hashval_t)k[3]<<24));
+	b += (k[4] | ((hashval_t)k[5]<<8) | ((hashval_t)k[6]<<16) | ((hashval_t)k[7]<<24));
+	c += (k[8] | ((hashval_t)k[9]<<8) | ((hashval_t)k[10]<<16)| ((hashval_t)k[11]<<24));
 	mix(a,b,c);
 	k += 12; len -= 12;
       }
   else    /* unaligned */
-#endif
     while (len >= 12)
       {
-	a += (k[0] +((hashval_t)k[1]<<8) +((hashval_t)k[2]<<16) +((hashval_t)k[3]<<24));
-	b += (k[4] +((hashval_t)k[5]<<8) +((hashval_t)k[6]<<16) +((hashval_t)k[7]<<24));
-	c += (k[8] +((hashval_t)k[9]<<8) +((hashval_t)k[10]<<16)+((hashval_t)k[11]<<24));
+	a += (k[0] | ((hashval_t)k[1]<<8) | ((hashval_t)k[2]<<16) | ((hashval_t)k[3]<<24));
+	b += (k[4] | ((hashval_t)k[5]<<8) | ((hashval_t)k[6]<<16) | ((hashval_t)k[7]<<24));
+	c += (k[8] | ((hashval_t)k[9]<<8) | ((hashval_t)k[10]<<16)| ((hashval_t)k[11]<<24));
 	mix(a,b,c);
 	k += 12; len -= 12;
       }