Lines Matching +full:phase +full:- +full:shift
1 /* SPDX-License-Identifier: GPL-2.0 */
6 * HP-PA only implements integer multiply in the FPU. However, for
7 * integer multiplies by constant, it has a number of shift-and-add
8 * (but no shift-and-subtract, sigh!) instructions that a compiler
20 * PA7100 pairing rules. This is an in-order 2-way superscalar processor.
21 * Only one instruction in a pair may be a shift (by more than 3 bits),
22 * but other than that, simple ALU ops (including shift-and-add by up
25 * PA8xxx processors also dual-issue ALU instructions, although with
28 * This 6-step sequence was found by Yevgen Voronenko's implementation
36 * Phase 1: Compute a = (x << 19) + x, in __hash_32()
43 /* Phase 2: Return (b<<11) + (c<<6) + (a<<3) - c */ in __hash_32()
45 a += c << 3; b -= c; in __hash_32()
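The matched lines show only part of the `__hash_32()` chain, so here is a minimal, hedged sketch of how such a two-phase shift-and-add sequence can stand in for a multiply by a constant. Only `a = (x << 19) + x` and the Phase 2 formula come from the listing. The definitions of `b` and `c`, the final `return`, the function name `hash_32_shift_add()`, and the helper `self_check()` are assumptions made for illustration. They were chosen so that the result equals `x * 0x61C88647` (the kernel's `GOLDEN_RATIO_32`) modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Value of GOLDEN_RATIO_32 in the Linux kernel's <linux/hash.h>. */
#define GOLDEN_RATIO_32 0x61C88647u

/*
 * Shift-and-add chain (sketch). Only "a" and the Phase 2 formula are
 * taken from the listing; b, c and the return line are ASSUMED.
 */
static uint32_t hash_32_shift_add(uint32_t x)
{
	uint32_t a, b, c;

	/* Phase 1: a = (x << 19) + x (from the listing) */
	a = (x << 19) + x;
	b = (x << 9) + a;	/* assumed: not shown in the listing */
	c = (x << 23) + b;	/* assumed: not shown in the listing */

	/* Phase 2: return (b<<11) + (c<<6) + (a<<3) - c */
	b <<= 11;
	a += c << 3;	b -= c;
	return (a << 3) + b;	/* assumed final step */
}

/* Hypothetical helper: compare the chain with a plain multiply. */
static void self_check(void)
{
	static const uint32_t samples[] = {
		0u, 1u, 2u, 0x12345678u, 0xDEADBEEFu, 0xFFFFFFFFu
	};
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		uint32_t x = samples[i];

		/* Unsigned arithmetic wraps modulo 2^32 on both sides. */
		assert(hash_32_shift_add(x) == (uint32_t)(x * GOLDEN_RATIO_32));
	}
}
```

Calling `self_check()` from a test program exercises the chain against an ordinary multiply. Expanding Phase 2 with the assumed `b` and `c` gives 8(2^19 + 1) + 63(2^23 + 2^19 + 2^9 + 1) + 2048(2^19 + 2^9 + 1) = 1640531527 = 0x61C88647, which is why the two agree.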
54 * Finding a good shift-and-add chain for GOLDEN_RATIO_64 is tricky,
59 * However, Jason Thong pointed out a work-around. The Hcub software
61 * constant multiplication, and is good at finding shift-and-add chains
68 * you can see the non-zero bits are divided into several well-separated
78 * and with one more small shift than alternatives.
81 * as needing one extra cycle to shift left 31 bits before the final
91 * This prevents it from mis-optimizing certain sequences.
98 * usefully portable across all GCC platforms, and so can be test-compiled
99 * on non-PA systems.
103 * Because the PA-8xxx is out of order, I'm not sure how much this matters,
110 * optimized shift-and-add sequence.
112 * Without the final shift, the multiply proper is 19 instructions,
123 * Encourage GCC to move a dynamic shift to %sar early, in hash_64()
127 asm("" : "=q" (bits) : "0" (64 - bits)); in hash_64()
129 bits = 64 - bits; in hash_64()
135 d = a - d; _ASSIGN(a, a << 4, "X" (d)); in hash_64()
137 d -= c; c += a << 1; in hash_64()
143 #undef _ASSIGN /* We're a widely-used header file, so don't litter! */