Jakub Jelinek authored
As mentioned in the _BitInt support thread, _BitInt(N) is currently limited
by the wide_int/widest_int maximum precision, which depending on the target
is 191, 319, 575 or 703 bits (one less than WIDE_INT_MAX_PRECISION).
That is a fairly low limit for _BitInt, especially on the targets with the
191-bit limitation.

The following patch bumps that limit to 16319 bits on all arches (which support
_BitInt at all), which is the limit imposed by INTEGER_CST representation
(unsigned char members holding number of HOST_WIDE_INT limbs).
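
The limits quoted above follow from simple limb arithmetic.  A minimal
sketch, assuming 64-bit HOST_WIDE_INT limbs; the lower-case names are
illustrative and are not the macros from wide-int.h:

```cpp
#include <climits>

// Illustrative constants only, assuming a 64-bit HOST_WIDE_INT.
constexpr unsigned bits_per_limb = 64;

// Largest _BitInt width usable with a given limb count: one bit less than
// the full precision, as stated in the commit message.
constexpr unsigned max_bits_for_limbs(unsigned limbs) {
  return limbs * bits_per_limb - 1;
}

// An unsigned char limb count caps INTEGER_CST at UCHAR_MAX (255) limbs.
constexpr unsigned max_elts = UCHAR_MAX;
constexpr unsigned new_limit = max_bits_for_limbs(max_elts);

// widest_int is described as having twice the INTEGER_CST precision.
constexpr unsigned widest_precision = 2 * max_elts * bits_per_limb;

static_assert(new_limit == 16319, "255 limbs give 16319 usable bits");
static_assert(widest_precision == 32640, "twice 255 limbs of 64 bits");
```

The former per-target limits of 191, 319, 575 and 703 bits correspond to
3, 5, 9 and 11 limbs in the same formula.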

In order to achieve that, wide_int is changed from a trivially copyable type
which contained just an inline array of WIDE_INT_MAX_ELTS (3, 5, 9 or
11 limbs depending on target) limbs into a non-trivially copy constructible,
copy assignable and destructible type which for the usual small cases (up
to WIDE_INT_MAX_INL_ELTS which is the former WIDE_INT_MAX_ELTS) still uses
an inline array of limbs, but for larger precisions uses a heap allocated
limb array.  This makes wide_int unusable in GC structures, so for dwarf2out,
which was the only place that needed it there, there is a new rwide_int type
(restricted wide_int) which supports only up to RWIDE_INT_MAX_ELTS limbs
inline and is trivially copyable (dwarf2out should never deal with large
_BitInt constants; those should have been lowered earlier).
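
The inline-or-heap layout described above can be sketched as follows.  This
is a simplified illustration, not the wide-int.h code: limb_storage,
restricted_storage, inline_elts and the 64-bit limb width are assumptions
made for the example, and only the union of an inline array and a pointer
mirrors the ChangeLog entry for wide_int_storage.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for wide_int_storage; not the GCC implementation.
class limb_storage {
public:
  static constexpr unsigned inline_elts = 3;     // role of WIDE_INT_MAX_INL_ELTS
  static constexpr unsigned bits_per_limb = 64;  // assumed HOST_WIDE_INT width

  explicit limb_storage(unsigned precision) : precision_(precision) {
    assert(precision > 0);
    if (uses_heap())
      u_.valp = new std::int64_t[num_limbs()]();  // zero-initialized
    else
      for (unsigned i = 0; i < inline_elts; ++i)
        u_.val[i] = 0;
  }

  limb_storage(const limb_storage &other) : precision_(other.precision_) {
    if (uses_heap())
      u_.valp = new std::int64_t[num_limbs()];
    copy_limbs_from(other);
  }

  limb_storage &operator=(const limb_storage &other) {
    if (this == &other)
      return *this;
    if (precision_ != other.precision_) {
      // Allocate first so that a failed allocation leaves *this untouched.
      std::int64_t *fresh
        = other.uses_heap() ? new std::int64_t[other.num_limbs()] : nullptr;
      if (uses_heap())
        delete[] u_.valp;
      precision_ = other.precision_;
      if (fresh)
        u_.valp = fresh;
    }
    copy_limbs_from(other);
    return *this;
  }

  ~limb_storage() {
    if (uses_heap())
      delete[] u_.valp;
  }

  // Precision alone decides which union member is in use.
  bool uses_heap() const { return precision_ > inline_elts * bits_per_limb; }
  unsigned num_limbs() const {
    return (precision_ + bits_per_limb - 1) / bits_per_limb;
  }
  std::int64_t *get_val() { return uses_heap() ? u_.valp : u_.val; }
  const std::int64_t *get_val() const { return uses_heap() ? u_.valp : u_.val; }

private:
  // Both objects must already have the same precision.
  void copy_limbs_from(const limb_storage &other) {
    if (uses_heap())
      std::copy(other.u_.valp, other.u_.valp + num_limbs(), u_.valp);
    else
      for (unsigned i = 0; i < inline_elts; ++i)
        u_.val[i] = other.u_.val[i];
  }

  union {
    std::int64_t val[inline_elts];  // small precisions
    std::int64_t *valp;             // large precisions, heap allocated
  } u_;
  unsigned precision_;
};

// Hypothetical stand-in for the restricted variant: inline limbs only.
struct restricted_storage {
  unsigned precision;
  std::int64_t val[limb_storage::inline_elts];
};

static_assert(!std::is_trivially_copyable<limb_storage>::value,
              "owning a heap array requires user-provided copy operations");
static_assert(std::is_trivially_copyable<restricted_storage>::value,
              "the inline-only variant can still be copied bytewise");
```

The two static assertions show the trade-off the paragraph describes: once
the type may own heap memory it can no longer be copied bytewise, which is
why a separate inline-only type is kept for GC structures.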

Similarly, widest_int has been changed from a trivially copyable type which
also contained an inline array of WIDE_INT_MAX_ELTS limbs (but unlike
wide_int didn't contain precision and assumed that to be
WIDE_INT_MAX_PRECISION) into a non-trivially copy constructible, copy
assignable and destructible type which always has WIDEST_INT_MAX_PRECISION
precision (32640 bits currently, twice as much as the INTEGER_CST limitation
allows) and which, unlike wide_int, decides based on the get_len () value
whether it uses an inline array (again, up to WIDE_INT_MAX_INL_ELTS limbs)
or a heap allocated one.  In wide-int.h this means we need to estimate an
upper bound on how many limbs wide-int.cc (usually, sometimes wide-int.h)
will need to write, and heap allocate based on that estimate if needed.
set_len is then called at the end; if the estimate was above
WIDE_INT_MAX_INL_ELTS and we allocated dynamically, but the actual length
fits into the inline array, set_len copies the limbs there and deallocates
the heap array.  The inexact guesses are needed because the exact
computation of the length in wide-int.cc is sometimes quite complex and
especially the canonicalization at the end can decrease it.

Because of this, widest_int is again not usable in GC structures.  cfgloop.h
has therefore been changed to use
fixed_wide_int_storage <WIDE_INT_MAX_INL_PRECISION> and punt if we'd have
larger _BitInt based iterators; programs with more than 128-bit iterators
will hopefully be rare and I think it is fine to treat loops with more than
2^127 iterations as effectively possibly infinite.  omp-general.cc is
changed to use fixed_wide_int_storage <1024>, as it should support scores
with the same precision on all arches.
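
The estimate-then-shrink protocol described for widest_int above can be
sketched like this.  It is an illustration under assumptions, not the
wide-int.h code: len_based_storage, canonical_len and inline_elts are
hypothetical names, and only the write_val/set_len split and the
length-based choice between inline and heap storage come from the text.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for widest_int_storage; not the GCC implementation.
class len_based_storage {
public:
  static constexpr unsigned inline_elts = 3;  // role of WIDE_INT_MAX_INL_ELTS

  len_based_storage() : len_(0) {}
  len_based_storage(const len_based_storage &) = delete;
  len_based_storage &operator=(const len_based_storage &) = delete;
  ~len_based_storage() {
    if (is_heap())
      delete[] u_.valp;
  }

  // Reserve room for up to ESTIMATE limbs and return where to write them.
  std::int64_t *write_val(unsigned estimate) {
    assert(estimate > 0);
    std::int64_t *fresh
      = estimate > inline_elts ? new std::int64_t[estimate] : nullptr;
    if (is_heap())
      delete[] u_.valp;
    len_ = estimate;
    if (fresh) {
      u_.valp = fresh;
      return fresh;
    }
    u_.val[0] = 0;  // make the inline array the active union member
    return u_.val;
  }

  // Record the real length; move back inline if the estimate was too high.
  void set_len(unsigned len) {
    assert(len <= len_);
    if (is_heap() && len <= inline_elts) {
      std::int64_t *heap = u_.valp;
      for (unsigned i = 0; i < len; ++i)
        u_.val[i] = heap[i];
      delete[] heap;
    }
    len_ = len;
  }

  bool is_heap() const { return len_ > inline_elts; }
  unsigned get_len() const { return len_; }
  const std::int64_t *get_val() const { return is_heap() ? u_.valp : u_.val; }

private:
  union {
    std::int64_t val[inline_elts];
    std::int64_t *valp;
  } u_;
  unsigned len_;
};

// Drop top limbs that merely sign-extend the limb below them; this is the
// kind of canonicalization that makes the final length hard to predict.
inline unsigned canonical_len(const std::int64_t *val, unsigned len) {
  while (len > 1 && val[len - 1] == (val[len - 2] < 0 ? -1 : 0))
    --len;
  return len;
}
```

A caller would request write_val with its upper bound, fill in the limbs,
compute the canonical length and pass that to set_len, at which point a
result that turned out to be small ends up in the inline array again.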

Code which used WIDE_INT_PRINT_BUFFER_SIZE sized buffers for printing
wide_int/widest_int into buffer had to be changed to use XALLOCAVEC for
larger lengths.
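
The buffer handling can be sketched as below.  This is only an illustration:
to_hex, print_hex_into and the buffer size formula are hypothetical, and
std::vector stands in for XALLOCAVEC (which allocates on the stack) so that
the example stays portable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Illustrative sizes; not the values used by wide-int-print.h.
constexpr unsigned inline_elts = 3;
constexpr unsigned hex_digits_per_limb = 16;
// Room for "0x", all digits of the inline limbs and the trailing NUL.
constexpr unsigned print_buffer_size = inline_elts * hex_digits_per_limb + 3;

// Print LEN limbs (least significant first in memory) as one hex number.
// BUF must hold at least LEN * hex_digits_per_limb + 3 characters.
inline void print_hex_into(const std::uint64_t *limbs, unsigned len, char *buf) {
  assert(len > 0);
  int pos = std::sprintf(buf, "0x%llx", (unsigned long long) limbs[len - 1]);
  for (unsigned i = len - 1; i-- > 0;)
    pos += std::sprintf(buf + pos, "%016llx", (unsigned long long) limbs[i]);
}

inline std::string to_hex(const std::uint64_t *limbs, unsigned len) {
  char fixed[print_buffer_size];  // enough for the common small case
  std::vector<char> big;          // stand-in for an XALLOCAVEC buffer
  char *p = fixed;
  if (len > inline_elts) {
    big.resize(len * hex_digits_per_limb + 3);
    p = big.data();
  }
  print_hex_into(limbs, len, p);
  return std::string(p);
}
```

The point of the pattern is that the fixed-size buffer keeps covering the
common case, and only values longer than the inline limit pay for a larger
allocation.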

On x86_64, in a bootstrapped cc1plus configured with
--enable-checking=yes,rtl,extra, the patch enlarges the .text section by a
factor of about 1.01 (from 0x25725a5 to 0x25e5555) and, at least when
compiling insn-recog.cc with the usual bootstrap options, slows compilation
down by a similar factor of about 1.01:
user 4m22.046s and 4m22.384s on vanilla trunk vs.
4m25.947s and 4m25.581s on patched trunk.  I'm afraid some code size growth
and compile time slowdown is unavoidable in this case: we use wide_int and
widest_int everywhere, and while the rare cases are marked with UNLIKELY
macros, they still mean extra checks.

The patch also regresses
+FAIL: gm2/pim/fail/largeconst.mod,  -O
+FAIL: gm2/pim/fail/largeconst.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst.mod,  -Os
+FAIL: gm2/pim/fail/largeconst.mod,  -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O
+FAIL: gm2/pim/fail/largeconst2.mod,  -O -g
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer
+FAIL: gm2/pim/fail/largeconst2.mod,  -O3 -fomit-frame-pointer -finline-functions
+FAIL: gm2/pim/fail/largeconst2.mod,  -Os
+FAIL: gm2/pim/fail/largeconst2.mod,  -g
tests, which previously were rejected with
error: constant literal ‘12345678912345678912345679123456789123456789123456789123456789123456791234567891234567891234567891234567891234567912345678912345678912345678912345678912345679123456789123456789’ exceeds internal ZTYPE range
kind of errors, but now are accepted.  It seems the FE tries to parse
constants into widest_int in that case and only diagnoses if widest_int
overflows.  That seems wrong; it should at least punt if the value doesn't
fit into WIDE_INT_MAX_PRECISION, but perhaps at far less than that.  If it
wants middle-end support for precisions above 128 bits, it should be using
BITINT_TYPE.  Will file a PR and defer to the Modula2 maintainer.

2023-10-12  Jakub Jelinek  <jakub@redhat.com>

	PR c/102989
	* wide-int.h: Adjust file comment.
	(WIDE_INT_MAX_INL_ELTS): Define to former value of WIDE_INT_MAX_ELTS.
	(WIDE_INT_MAX_INL_PRECISION): Define.
	(WIDE_INT_MAX_ELTS): Change to 255.  Assert that WIDE_INT_MAX_INL_ELTS
	is smaller than WIDE_INT_MAX_ELTS.
	(RWIDE_INT_MAX_ELTS, RWIDE_INT_MAX_PRECISION, WIDEST_INT_MAX_ELTS,
	WIDEST_INT_MAX_PRECISION): Define.
	(WI_BINARY_RESULT_VAR, WI_UNARY_RESULT_VAR): Change write_val callers
	to pass 0 as a new argument.
	(class widest_int_storage): Likewise.
	(widest_int, widest2_int): Change typedefs to use widest_int_storage
	rather than fixed_wide_int_storage.
	(enum wi::precision_type): Add INL_CONST_PRECISION enumerator.
	(struct binary_traits): Add partial specializations for
	INL_CONST_PRECISION.
	(generic_wide_int): Add needs_write_val_arg static data member.
	(int_traits): Likewise.
	(wide_int_storage): Replace val non-static data member with a union
	u of it and HOST_WIDE_INT *valp.  Declare copy constructor, copy
	assignment operator and destructor.  Add unsigned int argument to
	write_val.
	(wide_int_storage::wide_int_storage): Initialize precision to 0
	in the default ctor.  Remove unnecessary {}s around STATIC_ASSERTs.
	Assert in non-default ctor T's precision_type is not
	INL_CONST_PRECISION and allocate u.valp for large precision.  Add
	copy constructor.
	(wide_int_storage::~wide_int_storage): New.
	(wide_int_storage::operator=): Add copy assignment operator.  In
	assignment operator remove unnecessary {}s around STATIC_ASSERTs,
	assert ctor T's precision_type is not INL_CONST_PRECISION and
	if precision changes, deallocate and/or allocate u.valp.
	(wide_int_storage::get_val): Return u.valp rather than u.val for
	large precision.
	(wide_int_storage::write_val): Likewise.  Add an unused unsigned int
	argument.
	(wide_int_storage::set_len): Use write_val instead of writing val
	directly.
	(wide_int_storage::from, wide_int_storage::from_array): Adjust
	write_val callers.
	(wide_int_storage::create): Allocate u.valp for large precisions.
	(wi::int_traits <wide_int_storage>::get_binary_precision): New.
	(fixed_wide_int_storage::fixed_wide_int_storage): Make default
	ctor defaulted.
	(fixed_wide_int_storage::write_val): Add unused unsigned int argument.
	(fixed_wide_int_storage::from, fixed_wide_int_storage::from_array):
	Adjust write_val callers.
	(wi::int_traits <fixed_wide_int_storage>::get_binary_precision): New.
	(WIDEST_INT): Define.
	(widest_int_storage): New template class.
	(wi::int_traits <widest_int_storage>): New.
	(trailing_wide_int_storage::write_val): Add unused unsigned int
	argument.
	(wi::get_binary_precision): Use
	wi::int_traits <WI_BINARY_RESULT (T1, T2)>::get_binary_precision
	rather than get_precision on get_binary_result.
	(wi::copy): Adjust write_val callers.  Don't call set_len if
	needs_write_val_arg.
	(wi::bit_not): If result.needs_write_val_arg, call write_val
	again with upper bound estimate of len.
	(wi::sext, wi::zext, wi::set_bit): Likewise.
	(wi::bit_and, wi::bit_and_not, wi::bit_or, wi::bit_or_not,
	wi::bit_xor, wi::add, wi::sub, wi::mul, wi::mul_high, wi::div_trunc,
	wi::div_floor, wi::div_ceil, wi::div_round, wi::divmod_trunc,
	wi::mod_trunc, wi::mod_floor, wi::mod_ceil, wi::mod_round,
	wi::lshift, wi::lrshift, wi::arshift): Likewise.
	(wi::bswap, wi::bitreverse): Assert result.needs_write_val_arg
	is false.
	(gt_ggc_mx, gt_pch_nx): Remove generic template for all
	generic_wide_int, instead add functions and templates for each
	storage of generic_wide_int.  Make functions for
	generic_wide_int <wide_int_storage> and templates for
	generic_wide_int <widest_int_storage <N>> deleted.
	(wi::mask, wi::shifted_mask): Adjust write_val calls.
	* wide-int.cc (zeros): Decrease array size to 1.
	(BLOCKS_NEEDED): Use CEIL.
	(canonize): Use HOST_WIDE_INT_M1.
	(wi::from_buffer): Pass 0 to write_val.
	(wi::to_mpz): Use CEIL.
	(wi::from_mpz): Likewise.  Pass 0 to write_val.  Use
	WIDE_INT_MAX_INL_ELTS instead of WIDE_INT_MAX_ELTS.
	(wi::mul_internal): Use WIDE_INT_MAX_INL_PRECISION instead of
	MAX_BITSIZE_MODE_ANY_INT in automatic array sizes, for prec
	above WIDE_INT_MAX_INL_PRECISION estimate precision from
	lengths of operands.  Use XALLOCAVEC allocated buffers for
	prec above WIDE_INT_MAX_INL_PRECISION.
	(wi::divmod_internal): Likewise.
	(wi::lshift_large): For len > WIDE_INT_MAX_INL_ELTS estimate
	it from xlen and skip.
	(rshift_large_common): Remove xprecision argument, add len
	argument with len computed in caller.  Don't return anything.
	(wi::lrshift_large, wi::arshift_large): Compute len here
	and pass it to rshift_large_common, for lengths above
	WIDE_INT_MAX_INL_ELTS using estimations from xlen if possible.
	(assert_deceq, assert_hexeq): For lengths above
	WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
	(test_printing): Use WIDE_INT_MAX_INL_PRECISION instead of
	WIDE_INT_MAX_PRECISION.
	* wide-int-print.h (WIDE_INT_PRINT_BUFFER_SIZE): Use
	WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION.
	* wide-int-print.cc (print_decs, print_decu, print_hex): For
	lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC allocated buffer.
	* tree.h (wi::int_traits<extended_tree <N>>): Change precision_type
	to INL_CONST_PRECISION for N == ADDR_MAX_PRECISION.
	(widest_extended_tree): Use WIDEST_INT_MAX_PRECISION instead of
	WIDE_INT_MAX_PRECISION.
	(wi::ints_for): Use int_traits <extended_tree <N> >::precision_type
	instead of hard coded CONST_PRECISION.
	(widest2_int_cst): Use WIDEST_INT_MAX_PRECISION instead of
	WIDE_INT_MAX_PRECISION.
	(wi::extended_tree <N>::get_len): Use WIDEST_INT_MAX_PRECISION rather
	than WIDE_INT_MAX_PRECISION.
	(wi::ints_for::zero): Use
	wi::int_traits <wi::extended_tree <N> >::precision_type instead of
	wi::CONST_PRECISION.
	* tree.cc (build_replicated_int_cst): Formatting fix.  Use
	WIDE_INT_MAX_INL_ELTS rather than WIDE_INT_MAX_ELTS.
	* print-tree.cc (print_node): Don't print TREE_UNAVAILABLE on
	INTEGER_CSTs, TREE_VECs or SSA_NAMEs.
	* double-int.h (wi::int_traits <double_int>::precision_type): Change
	to INL_CONST_PRECISION from CONST_PRECISION.
	* poly-int.h (struct poly_coeff_traits): Add partial specialization
	for wi::INL_CONST_PRECISION.
	* cfgloop.h (bound_wide_int): New typedef.
	(struct nb_iter_bound): Change bound type from widest_int to
	bound_wide_int.
	(struct loop): Change nb_iterations_upper_bound,
	nb_iterations_likely_upper_bound and nb_iterations_estimate type from
	widest_int to bound_wide_int.
	* cfgloop.cc (record_niter_bound): Return early if wi::min_precision
	of i_bound is too large for bound_wide_int.  Adjustments for the
	widest_int to bound_wide_int type change in non-static data members.
	(get_estimated_loop_iterations, get_max_loop_iterations,
	get_likely_max_loop_iterations): Adjustments for the widest_int to
	bound_wide_int type change in non-static data members.
	* tree-vect-loop.cc (vect_transform_loop): Likewise.
	* tree-ssa-loop-niter.cc (do_warn_aggressive_loop_optimizations): Use
	XALLOCAVEC allocated buffer for i_bound len above
	WIDE_INT_MAX_INL_ELTS.
	(record_estimate): Return early if wi::min_precision of i_bound is too
	large for bound_wide_int.  Adjustments for the widest_int to
	bound_wide_int type change in non-static data members.
	(wide_int_cmp): Use bound_wide_int instead of widest_int.
	(bound_index): Use bound_wide_int instead of widest_int.
	(discover_iteration_bound_by_body_walk): Likewise.  Use
	widest_int::from to convert it to widest_int when passed to
	record_niter_bound.
	(maybe_lower_iteration_bound): Use widest_int::from to convert it to
	widest_int when passed to record_niter_bound.
	(estimate_numbers_of_iteration): Don't record upper bound if
	loop->nb_iterations has too large precision for bound_wide_int.
	(n_of_executions_at_most): Use widest_int::from.
	* tree-ssa-loop-ivcanon.cc (remove_redundant_iv_tests): Adjust for
	the widest_int to bound_wide_int changes.
	* match.pd (fold_sign_changed_comparison simplification): Use
	wide_int::from on wi::to_wide instead of wi::to_widest.
	* value-range.h (irange::maybe_resize): Avoid using memcpy on
	non-trivially copyable elements.
	* value-range.cc (irange_bitmask::dump): Use XALLOCAVEC allocated
	buffer for mask or value len above WIDE_INT_PRINT_BUFFER_SIZE.
	* fold-const.cc (fold_convert_const_int_from_int, fold_unary_loc):
	Use wide_int::from on wi::to_wide instead of wi::to_widest.
	* tree-ssa-ccp.cc (bit_value_binop): Zero extend r1max from width
	before calling wi::udiv_trunc.
	* lto-streamer-out.cc (output_cfg): Adjustments for the widest_int to
	bound_wide_int type change in non-static data members.
	* lto-streamer-in.cc (input_cfg): Likewise.
	(lto_input_tree_1): Use WIDE_INT_MAX_INL_ELTS rather than
	WIDE_INT_MAX_ELTS.  For length above WIDE_INT_MAX_INL_ELTS use
	XALLOCAVEC allocated buffer.  Formatting fix.
	* data-streamer-in.cc (streamer_read_wide_int,
	streamer_read_widest_int): Likewise.
	* tree-affine.cc (aff_combination_expand): Use placement new to
	construct name_expansion.
	(free_name_expansion): Destruct name_expansion.
	* gimple-ssa-strength-reduction.cc (struct slsr_cand_d): Change
	index type from widest_int to offset_int.
	(class incr_info_d): Change incr type from widest_int to offset_int.
	(alloc_cand_and_find_basis, backtrace_base_for_ref,
	restructure_reference, slsr_process_ref, create_mul_ssa_cand,
	create_mul_imm_cand, create_add_ssa_cand, create_add_imm_cand,
	slsr_process_add, cand_abs_increment, replace_mult_candidate,
	replace_unconditional_candidate, incr_vec_index,
	create_add_on_incoming_edge, create_phi_basis_1,
	replace_conditional_candidate, record_increment,
	record_phi_increments_1, phi_incr_cost_1, phi_incr_cost,
	lowest_cost_path, total_savings, ncd_with_phi, ncd_of_cand_and_phis,
	nearest_common_dominator_for_cands, insert_initializers,
	all_phi_incrs_profitable_1, replace_one_candidate,
	replace_profitable_candidates): Use offset_int rather than widest_int
	and wi::to_offset rather than wi::to_widest.
	* real.cc (real_to_integer): Use WIDE_INT_MAX_INL_ELTS rather than
	2 * WIDE_INT_MAX_ELTS and for words above that use XALLOCAVEC
	allocated buffer.
	* tree-ssa-loop-ivopts.cc (niter_for_exit): Use placement new
	to construct tree_niter_desc and destruct it on failure.
	(free_tree_niter_desc): Destruct tree_niter_desc if value is non-NULL.
	* gengtype.cc (main): Remove widest_int handling.
	* graphite-isl-ast-to-gimple.cc (widest_int_from_isl_expr_int): Use
	WIDEST_INT_MAX_ELTS instead of WIDE_INT_MAX_ELTS.
	* gimple-ssa-warn-alloca.cc (pass_walloca::execute): Use
	WIDE_INT_MAX_INL_PRECISION instead of WIDE_INT_MAX_PRECISION and
	assert get_len () fits into it.
	* value-range-pretty-print.cc (vrange_printer::print_irange_bitmasks):
	For mask or value lengths above WIDE_INT_MAX_INL_ELTS use XALLOCAVEC
	allocated buffer.
	* gimple-ssa-sprintf.cc (adjust_range_for_overflow): Use
	wide_int::from on wi::to_wide instead of wi::to_widest.
	* omp-general.cc (score_wide_int): New typedef.
	(omp_context_compute_score): Use score_wide_int instead of widest_int
	and adjust for those changes.
	(struct omp_declare_variant_entry): Change score and
	score_in_declare_simd_clone non-static data member type from widest_int
	to score_wide_int.
	(omp_resolve_late_declare_variant, omp_resolve_declare_variant): Use
	score_wide_int instead of widest_int and adjust for those changes.
	(omp_lto_output_declare_variant_alt): Likewise.
	(omp_lto_input_declare_variant_alt): Likewise.
	* godump.cc (go_output_typedef): Assert get_len () is smaller than
	WIDE_INT_MAX_INL_ELTS.
gcc/c-family/
	* c-warn.cc (match_case_to_enum_1): Use wi::to_wide just once instead
	of 3 times, assert get_len () is smaller than WIDE_INT_MAX_INL_ELTS.
gcc/testsuite/
	* gcc.dg/bitint-38.c: New test.
0d00385e