Commits · e1e127de18dbee47b88fa0ce74a1c7f4d658dc68 · COBOLworx / gcc-cobol

Oct 12, 2023

x86: set spincount 1 for x86 hybrid platform · e1e127de

Zhang, Jun authored 1 year ago

Testing shows that a spincount of 1 performs better on hybrid platforms.

Use '-march=native -Ofast -funroll-loops -flto',
results as follows:

spec2017 speed   RPL     ADL
657.xz_s         0.00%   0.50%
603.bwaves_s     10.90%  26.20%
607.cactuBSSN_s  5.50%   72.50%
619.lbm_s        2.40%   2.50%
621.wrf_s        -7.70%  2.40%
627.cam4_s       0.50%   0.70%
628.pop2_s       48.20%  153.00%
638.imagick_s    -0.10%  0.20%
644.nab_s        2.30%   1.40%
649.fotonik3d_s  8.00%   13.80%
654.roms_s       1.20%   1.10%
Geomean-int      0.00%   0.50%
Geomean-fp       6.30%   21.10%
Geomean-all      5.70%   19.10%

omp2012          RPL     ADL
350.md           -1.81%  -1.75%
351.bwaves       7.72%   12.50%
352.nab          14.63%  19.71%
357.bt331        -0.20%  1.77%
358.botsalgn     0.00%   0.00%
359.botsspar     0.00%   0.65%
360.ilbdc        0.00%   0.25%
362.fma3d        2.66%   -0.51%
363.swim         10.44%  0.00%
367.imagick      0.00%   0.12%
370.mgrid331     2.49%   25.56%
371.applu331     1.06%   4.22%
372.smithwa      0.74%   3.34%
376.kdtree       10.67%  16.03%
GEOMEAN          3.34%   5.53%

include/ChangeLog:

	PR target/109812
	* spincount.h: New file.

libgomp/ChangeLog:

	* env.c (initialize_env): Use do_adjust_default_spincount.
	* config/linux/x86/spincount.h: New file.

e1e127de

RISC-V: Support FP llrint auto vectorization · 6a3302a4

Pan Li authored 1 year ago


This patch would like to support the FP llrint auto vectorization.

* long long llrint (double)

This is the CVT from DF => DI from the standard name's perspective,
which has been covered by previous patches. Thus, this patch only adds
some test cases.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/test-math.h: Add type int64_t.
	* gcc.target/riscv/rvv/autovec/unop/math-llrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-llrint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-llrint-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

6a3302a4

[APX] Support Intel APX PUSH2POP2 · 180b08f6

Mo, Zewei authored 2 years ago


This feature requires the stack to be aligned to 16 bytes. Therefore, in
the prologue/epilogue, a standalone push/pop is emitted before any
push2/pop2 if the stack is not aligned to 16 bytes.
Also, the current implementation only supports push2/pop2 in the
function prologue/epilogue for callee-saved registers.
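
The alignment rule can be pictured with a small planning function. This is only a sketch under stated assumptions: `plan_saves` and `struct save_plan` are hypothetical names that do not exist in GCC, the stack pointer is assumed to be at least 8-byte aligned, and saving a leftover odd register with a trailing single push is an assumption the message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of how many instructions of each kind a
   prologue would use to save the callee-saved registers.  */
struct save_plan
{
  int leading_push;   /* standalone push emitted to reach 16-byte alignment */
  int push2;          /* paired saves */
  int trailing_push;  /* assumed: one leftover register saved on its own */
};

/* ALIGNED16 says whether the stack pointer is 16-byte aligned before the
   first save.  Each push moves the stack pointer by 8 bytes, so a single
   push restores 16-byte alignment when it is missing.  */
static struct save_plan
plan_saves (int nregs, bool aligned16)
{
  struct save_plan plan = { 0, 0, 0 };

  if (!aligned16 && nregs > 0)
    {
      plan.leading_push = 1;
      nregs--;
    }
  plan.push2 = nregs / 2;
  plan.trailing_push = nregs % 2;
  return plan;
}

/* Usage: five registers, with and without initial alignment.  */
static void
check_plan_saves (void)
{
  struct save_plan unaligned = plan_saves (5, false);
  struct save_plan aligned = plan_saves (5, true);

  assert (unaligned.leading_push == 1 && unaligned.push2 == 2
          && unaligned.trailing_push == 0);
  assert (aligned.leading_push == 0 && aligned.push2 == 2
          && aligned.trailing_push == 1);
}
```

The epilogue would mirror this plan with pop2/pop in the reverse order.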

gcc/ChangeLog:

	* config/i386/i386.cc (gen_push2): New function to emit push2
	and adjust cfa offset.
	(ix86_pro_and_epilogue_can_use_push2_pop2): New function to
	determine whether push2/pop2 can be used.
	(ix86_compute_frame_layout): Adjust preferred stack boundary
	and stack alignment needed for push2/pop2.
	(ix86_emit_save_regs): Emit push2 when available.
	(ix86_emit_restore_reg_using_pop2): New function to emit pop2
	and adjust cfa info.
	(ix86_emit_restore_regs_using_pop2): New function to loop
	through the saved regs and call above.
	(ix86_expand_epilogue): Call ix86_emit_restore_regs_using_pop2
	when push2pop2 available.
	* config/i386/i386.md (push2_di): New pattern for push2.
	(pop2_di): Likewise for pop2.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/apx-push2pop2-1.c: New test.
	* gcc.target/i386/apx-push2pop2_force_drap-1.c: Likewise.
	* gcc.target/i386/apx-push2pop2_interrupt-1.c: Likewise.

Co-authored-by: Hu Lin1 <lin1.hu@intel.com>
Co-authored-by: Hongyu Wang <hongyu.wang@intel.com>

180b08f6

RISC-V: Support FP irintf auto vectorization · d6b7fe11

Pan Li authored 1 year ago


This patch would like to support the FP irintf auto vectorization.

* int irintf (float)

Because the vectorizer only allows data types of the same size,
the standard name lrintmn2 only acts on SF => SI.

Given we have code like:

void
test_irintf (int *out, float *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_irintf (in[i]);
}

Before this patch:
.L3:
  ...
  flw      fa5,0(a1)
  fcvt.w.s a5,fa5,dyn
  sw       a5,-4(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vle32.v     v1,0(a1)
  vfcvt.x.f.v v1,v1
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The remaining cases, such as DF => SI and HF => SI, will be covered by the hook
TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint<mode><vlconvert>2): Rename from.
	(lrint<mode><v_i_l_ll_convert>2): Rename to.
	* config/riscv/vector-iterators.md: Rename and remove TARGET_64BIT.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/math-irint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-irint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-irint-0.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

d6b7fe11

Daily bump. · 6febf76c
GCC Administrator authored 1 year ago

6febf76c

Oct 11, 2023

RISC-V: Add TARGET_MIN_VLEN_OPTS to fix the build · 06f36c1d
Kito Cheng authored 1 year ago
```
gcc/ChangeLog:

	* config/riscv/riscv-opts.h (TARGET_MIN_VLEN_OPTS): New.
```
06f36c1d

RISC-V Adjust long unconditional branch sequence · a3e50ee9

Jeff Law authored 1 year ago

Andrew and I independently noted the long unconditional branch sequence was
using the "call" pseudo op.  Technically it works, but it's a bit odd.  This
patch flips it to use the "jump" pseudo-op.

This was tested with a hacked-up local compiler which forced all branches/jumps
to be long jumps.  Naturally it triggered some failures for scan-asm tests but
no execution regressions (which is mostly what I was testing for).

I've updated the long branch support item in the RISE wiki to indicate that we
eventually want a register scavenging approach with a fallback to $ra in the
future so that we don't muck up the return address predictors.  It's not
super-high priority and shouldn't be terrible to implement given we've got the
$ra fallback when a suitable register can not be found.

gcc/
	* config/riscv/riscv.md (jump): Adjust sequence to use a "jump"
	pseudo op instead of a "call" pseudo op.

a3e50ee9

RISC-V: Extend riscv_subset_list, preparatory for target attribute support · faae30c4

Kito Cheng authored 1 year ago

Previously, riscv_subset_list only accepted a full arch string, but
supporting the target attribute requires parsing a single extension. We may
also want to set a riscv_subset_list directly rather than re-parsing the
ISA string.

gcc/ChangeLog:

	* config/riscv/riscv-subset.h (riscv_subset_list::parse_single_std_ext):
	New.
	(riscv_subset_list::parse_single_multiletter_ext): Ditto.
	(riscv_subset_list::clone): Ditto.
	(riscv_subset_list::parse_single_ext): Ditto.
	(riscv_subset_list::set_loc): Ditto.
	(riscv_set_arch_by_subset_list): Ditto.
	* common/config/riscv/riscv-common.cc
	(riscv_subset_list::parse_single_std_ext): New.
	(riscv_subset_list::parse_single_multiletter_ext): Ditto.
	(riscv_subset_list::clone): Ditto.
	(riscv_subset_list::parse_single_ext): Ditto.
	(riscv_subset_list::set_loc): Ditto.
	(riscv_set_arch_by_subset_list): Ditto.

faae30c4

RISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC] · 9452d13b

Kito Cheng authored 1 year ago

Allow those functions to apply to a local gcc_options rather than the
global options.

Preparatory for the target attribute; this change is separated for easier review
since it is NFC.

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_convert_vector_bits): Get setting
	from argument rather than get setting from global setting.
	(riscv_override_options_internal): New, split from
	riscv_override_options, also takes a gcc_options argument.
	(riscv_option_override): Split most parts out to
	riscv_override_options_internal.

9452d13b

options: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask · 0363bba8

Kito Cheng authored 1 year ago

We have the TARGET_<NAME>_P macro to test a Mask and InverseMask against a
user-specified target_variable; however, we may want to test against a specific
gcc_options variable rather than a target_variable.

For example, RISC-V defines many Masks with TargetVariable, which are not
easy to use because we need to know which Mask is associated with
which TargetVariable. Taking a gcc_options variable is a better interface
for such use cases.
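
A minimal sketch of the two macro shapes may help. Everything named here is hypothetical: `struct options_sketch`, its `ext_flags` field, and `MASK_FOO` stand in for the generated `gcc_options` structure, a TargetVariable, and a Mask bit, and the macro bodies are an assumption about the general shape rather than the text that opth-gen.awk emits.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's option structure; the real struct
   gcc_options is generated and has many more fields.  */
struct options_sketch
{
  unsigned int ext_flags;   /* hypothetical TargetVariable */
};

#define MASK_FOO (1u << 0)  /* hypothetical Mask(FOO) bit */

/* Test the mask against an explicitly supplied flags variable.  */
#define TARGET_FOO_P(flags) (((flags) & MASK_FOO) != 0)

/* Test the mask against a given options structure, so the caller need
   not know which variable holds the bit.  */
#define TARGET_FOO_OPTS_P(opts) (((opts)->ext_flags & MASK_FOO) != 0)

/* Usage: the same bit queried through both interfaces.  */
static void
check_target_macros (void)
{
  struct options_sketch on = { MASK_FOO };
  struct options_sketch off = { 0 };

  assert (TARGET_FOO_P (on.ext_flags));
  assert (TARGET_FOO_OPTS_P (&on));
  assert (!TARGET_FOO_OPTS_P (&off));
}
```

With the second form, code that handles a local options structure does not have to spell out the variable name behind each Mask.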

gcc/ChangeLog:

	* doc/options.texi (Mask): Document TARGET_<NAME>_P and
	TARGET_<NAME>_OPTS_P.
	(InverseMask): Ditto.
	* opth-gen.awk (Mask): Generate TARGET_<NAME>_P and
	TARGET_<NAME>_OPTS_P macro.
	(InverseMask): Ditto.

0363bba8

MATCH: [PR111282] Simplify `a & (b ^ ~a)` to `a & b` · e8d418df

Andrew Pinski authored 1 year ago

While `a & (b ^ ~a)` is optimized to `a & b` at the RTL level,
it is worth optimizing at the gimple level too, which allows
us to match a few extra cases, including where a is a comparison.

Note I had to update/change the testcase and-1.c to avoid matching
this case as we can match -2 and 1 as bitwise inversions.
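
The identity behind this simplification can be checked exhaustively on 8-bit values. The sketch below is independent of GCC; the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The expression before simplification.  */
static uint8_t
and_original (uint8_t a, uint8_t b)
{
  return (uint8_t) (a & (b ^ (uint8_t) ~a));
}

/* The expression after simplification.  */
static uint8_t
and_simplified (uint8_t a, uint8_t b)
{
  return (uint8_t) (a & b);
}

/* Compare both forms for every pair of 8-bit inputs.  */
static void
check_and_identity (void)
{
  for (unsigned int a = 0; a < 256; a++)
    for (unsigned int b = 0; b < 256; b++)
      assert (and_original ((uint8_t) a, (uint8_t) b)
              == and_simplified ((uint8_t) a, (uint8_t) b));
}
```

Wherever a bit of `a` is set, the matching bit of `~a` is clear, so `b ^ ~a` reduces to `b` in exactly the positions that survive the `&`.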

	PR tree-optimization/111282

gcc/ChangeLog:

	* match.pd (`a & ~(a ^ b)`, `a & (a == b)`,
	`a & ((~a) ^ b)`): New patterns.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/and-1.c: Update testcase to avoid
	matching `~1 & (a ^ 1)` simplification.
	* gcc.dg/tree-ssa/bitops-6.c: New test.

e8d418df

modula2: Narrow subranges to int or unsigned int if ZTYPE is the base type. · acfca27e

Gaius Mulley authored 1 year ago


This patch narrows the subrange base type to INTEGER or CARDINAL
providing the range is satisfied.  It only does this when the subrange
base type is the ZTYPE.
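
The narrowing decision can be sketched as a range test. This is an illustration only: `narrow_subrange` and `enum base_type` are hypothetical names, INTEGER and CARDINAL are modelled as C `int` and `unsigned int` as the commit title suggests, and trying INTEGER before CARDINAL is an assumption, since the message does not say which is preferred when both fit.

```c
#include <assert.h>
#include <limits.h>

enum base_type { BASE_ZTYPE, BASE_INTEGER, BASE_CARDINAL };

/* Pick a base type for the subrange [LOW..HIGH] whose declared base
   type is the ZTYPE.  Falls back to the ZTYPE when neither fits.  */
static enum base_type
narrow_subrange (long long low, long long high)
{
  if (low >= INT_MIN && high <= INT_MAX)
    return BASE_INTEGER;        /* assumed to be tried first */
  if (low >= 0 && high <= (long long) UINT_MAX)
    return BASE_CARDINAL;
  return BASE_ZTYPE;
}

/* Usage: a small range, a large non-negative range, and a range that
   fits neither type (assuming a 32-bit int).  */
static void
check_narrow_subrange (void)
{
  assert (narrow_subrange (1, 10) == BASE_INTEGER);
  assert (narrow_subrange (0, 4000000000LL) == BASE_CARDINAL);
  assert (narrow_subrange (-1, 4000000000LL) == BASE_ZTYPE);
}
```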

gcc/m2/ChangeLog:

	* gm2-compiler/M2GCCDeclare.mod (DeclareSubrange): Check
	the base type of the subrange against the ZTYPE and call
	DeclareSubrangeNarrow if necessary.
	(DeclareSubrangeNarrow): New procedure function.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

acfca27e

[PATCH v4 2/2] RISC-V: Add support for XCValu extension in CV32E40P · 5ef248c1

Mary Bennett authored 1 year ago

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett <mary.bennett@embecosm.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add the XCValu
	extension.
	* config/riscv/constraints.md: Add builtins for the XCValu
	extension.
	* config/riscv/predicates.md (immediate_register_operand):
	Likewise.
	* config/riscv/corev.def: Likewise.
	* config/riscv/corev.md: Likewise.
	* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
	(RISCV_ATYPE_UHI): Likewise.
	* config/riscv/riscv-ftypes.def: Likewise.
	* config/riscv/riscv.opt: Likewise.
	* config/riscv/riscv.cc (riscv_print_operand): Likewise.
	* doc/extend.texi: Add XCValu documentation.
	* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Add proc for the XCValu extension.
	* gcc.target/riscv/cv-alu-compile.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-addn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-addrn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-addun.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-addurn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-clip.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-clipu.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-subn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-subrn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-subun.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile-suburn.c: New test.
	* gcc.target/riscv/cv-alu-fail-compile.c: New test.

5ef248c1

[PATCH v4 1/2] RISC-V: Add support for XCVmac extension in CV32E40P · 400efddd

Mary Bennett authored 1 year ago

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett <mary.bennett@embecosm.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>

gcc/ChangeLog:

	* common/config/riscv/riscv-common.cc: Add XCVmac.
	* config/riscv/riscv-ftypes.def: Add XCVmac builtins.
	* config/riscv/riscv-builtins.cc: Likewise.
	* config/riscv/riscv.md: Likewise.
	* config/riscv/riscv.opt: Likewise.
	* doc/extend.texi: Add XCVmac builtin documentation.
	* doc/sourcebuild.texi: Likewise.
	* config/riscv/corev.def: New file.
	* config/riscv/corev.md: New file.

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Add new effective target check.
	* gcc.target/riscv/cv-mac-compile.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mac.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-machhsn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-machhsrn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-machhun.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-machhurn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-macsn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-macsrn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-macun.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-macurn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-msu.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulhhsn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulhhsrn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulhhun.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulhhurn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulsn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulsrn.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulun.c: New test.
	* gcc.target/riscv/cv-mac-fail-compile-mulurn.c: New test.
	* gcc.target/riscv/cv-mac-test-autogeneration.c: New test.

400efddd

MAINTAINERS: Fix write after approval name order · 70b02dfd
Filip Kastl authored 1 year ago
```
ChangeLog:

	* MAINTAINERS: Fix name order.

Signed-off-by: Filip Kastl <fkastl@suse.cz>
```
70b02dfd

PR modula2/111675 Incorrect packed record field value passed to a procedure · 2b783fe2

Gaius Mulley authored 1 year ago


This patch allows a packed field to be extracted and passed to a
procedure.  It ensures that the subrange type is the same for both the
procedure and record field.  It also extends the <* bytealignment (0) *>
to cover packed subrange types.

gcc/m2/ChangeLog:

	PR modula2/111675
	* gm2-compiler/M2CaseList.mod (appendTree): Replace
	InitStringCharStar with InitString.
	* gm2-compiler/M2GCCDeclare.mod: Import AreConstantsEqual.
	(DeclareSubrange): Add zero alignment test and call
	BuildSmallestTypeRange if necessary.
	(WalkSubrangeDependants): Walk the align expression.
	(IsSubrangeDependants): Test the align expression.
	* gm2-compiler/M2Quads.mod (BuildStringAdrParam): Correct end name.
	* gm2-compiler/P2SymBuild.mod (BuildTypeAlignment): Allow subranges
	to be zero aligned (packed).
	* gm2-compiler/SymbolTable.mod (Subrange): Add Align field.
	(MakeSubrange): Set Align to NulSym.
	(PutAlignment): Assign Subrange.Align to align.
	(GetAlignment): Return Subrange.Align.
	* gm2-gcc/m2expr.cc (noBitsRequired): Rewrite.
	(calcNbits): Rename ...
	(m2expr_calcNbits): ... to this and test for negative values.
	(m2expr_BuildTBitSize): Replace calcNBits with m2expr_calcNbits.
	* gm2-gcc/m2expr.def (calcNbits): Export.
	* gm2-gcc/m2expr.h (m2expr_calcNbits): New prototype.
	* gm2-gcc/m2type.cc (noBitsRequired): Remove.
	(m2type_BuildSmallestTypeRange): Call m2expr_calcNbits.
	(m2type_BuildSubrangeType): Create range_type from
	build_range_type (type, lowval, highval).

gcc/testsuite/ChangeLog:

	PR modula2/111675
	* gm2/extensions/run/pass/packedrecord3.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

2b783fe2

RISC-V: Fix incorrect index(offset) of gather/scatter · f6c5e247

Juzhe-Zhong authored 1 year ago

I discovered a mistake I had made that, by luck, had not been exposed.

https://godbolt.org/z/c3jzrh7or

GCC is using 32 bit index offset:

        vsll.vi v1,v1,2
        vsetvli zero,a5,e32,m1,ta,ma
        vluxei32.v      v1,(a1),v1

This is wrong since v1 may overflow 32 bits after vsll.vi.

After this patch:

vsext.vf2	v8,v4
vsll.vi	v8,v8,2
vluxei64.v	v8,(a1),v8

Same as Clang.
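
The overflow can be reproduced in scalar C. The sketch below assumes 4-byte elements, matching the shift by 2 in the sequences above; `offset_32` and `offset_64` are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset computed in 32 bits, as in the sequence before the fix:
   the shift can wrap around.  */
static uint32_t
offset_32 (int32_t index)
{
  return (uint32_t) index << 2;
}

/* Byte offset computed after sign-extending the index to 64 bits,
   as in the sequence after the fix.  */
static int64_t
offset_64 (int32_t index)
{
  return (int64_t) index * 4;
}

/* Usage: both agree for small indices and diverge once the scaled
   index no longer fits in 32 bits.  */
static void
check_offsets (void)
{
  int32_t big = INT32_C (1) << 30;

  assert (offset_32 (5) == 20 && offset_64 (5) == 20);
  assert (offset_32 (big) == 0);
  assert (offset_64 (big) == INT64_C (4294967296));
}
```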

Regression passed. Ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md: Fix index bug.
	* config/riscv/riscv-protos.h (gather_scatter_valid_offset_mode_p): New function.
	* config/riscv/riscv-v.cc (expand_gather_scatter): Fix index bug.
	(gather_scatter_valid_offset_mode_p): New function.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/gather-scatter/offset_extend-1.c: New test.

f6c5e247

RISC-V: Support FP lrint/lrintf auto vectorization · d1e55666

Pan Li authored 1 year ago


This patch would like to support the FP lrint/lrintf auto vectorization.

* long lrint (double) for rv64
* long lrintf (float) for rv32

Because the vectorizer only allows data types of the same size,
the standard name lrintmn2 only acts on DF => DI for
rv64, and SF => SI for rv32.

Given we have code like:

void
test_lrint (long *out, double *in, unsigned count)
{
  for (unsigned i = 0; i < count; i++)
    out[i] = __builtin_lrint (in[i]);
}

Before this patch:
.L3:
  ...
  fld      fa5,0(a1)
  fcvt.l.d a5,fa5,dyn
  sd       a5,-8(a0)
  ...
  bne      a1,a4,.L3

After this patch:
.L3:
  ...
  vsetvli     a3,zero,e64,m1,ta,ma
  vfcvt.x.f.v v1,v1
  vsetvli     zero,a2,e64,m1,ta,ma
  vse32.v     v1,0(a0)
  ...
  bne         a2,zero,.L3

The remaining cases, such as SF => DI, HF => DI, DF => SI and HF => SI, will be covered
by TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION.

gcc/ChangeLog:

	* config/riscv/autovec.md (lrint<mode><vlconvert>2): New pattern
	for lrint/lrintf.
	* config/riscv/riscv-protos.h (expand_vec_lrint): New func decl
	for expanding lrint.
	* config/riscv/riscv-v.cc (emit_vec_cvt_x_f): New helper func impl
	for vfcvt.x.f.v.
	(expand_vec_lrint): New function impl for expanding lrint.
	* config/riscv/vector-iterators.md: New mode attr and iterator.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/unop/test-math.h: New define for
	CVT like test case.
	* gcc.target/riscv/rvv/autovec/vls/def.h: Ditto.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-1.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-0.c: New test.
	* gcc.target/riscv/rvv/autovec/unop/math-lrint-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lrint-0.c: New test.
	* gcc.target/riscv/rvv/autovec/vls/math-lrint-1.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

d1e55666

RISC-V: Remove XFAIL of ssa-dom-cse-2.c · d4de593d

Juzhe-Zhong authored 1 year ago

Confirmed that RISC-V is able to CSE this case whether or not RVV is enabled.

Remove the XFAIL to fix:
XPASS: gcc.dg/tree-ssa/ssa-dom-cse-2.c scan-tree-dump optimized "return 28;"

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Remove riscv.

d4de593d

tree-ssa-strlen: optimization skips clobbering store [PR111519] · e75bf198

Jakub Jelinek authored 1 year ago

The following testcase is miscompiled, because count_nonzero_bytes incorrectly
uses get_strinfo information on a pointer from which an earlier instruction
loads SSA_NAME stored at the current instruction.  get_strinfo shows a state
right before the current store though, so if there are some stores in between
the current store and the load, the string length information might have
changed.

The patch passes around gimple_vuse from the store and punts instead of using
strinfo on loads from MEM_REF which have different gimple_vuse from that.

2023-10-11  Richard Biener  <rguenther@suse.de>
	    Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/111519
	* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes): Add vuse
	argument and pass it through to recursive calls and
	count_nonzero_bytes_addr calls.  Don't shadow the stmt argument, but
	change stmt for gimple_assign_single_p statements for which we don't
	immediately punt.
	(strlen_pass::count_nonzero_bytes_addr): Add vuse argument and pass
	it through to recursive calls and count_nonzero_bytes calls.  Don't
	use get_strinfo if gimple_vuse (stmt) is different from vuse.  Don't
	shadow the stmt argument.

	* gcc.dg/torture/pr111519.c: New testcase.

e75bf198

Optimize (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) as (and:SI x 1). · c4149242

Roger Sayle authored 1 year ago

This patch is the middle-end piece of an improvement to PRs 101955 and
106245, that adds a missing simplification to the RTL optimizers.
This transformation is to simplify (char)(x << 7) != 0 as x & 1.
Technically, the cast can be any truncation, where shift is by one
less than the narrower type's precision, setting the most significant
(only) bit from the least significant bit.
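
The equivalence can be checked directly in C. This sketch uses unsigned types to sidestep signed-overflow and implementation-defined conversion issues, so it illustrates the arithmetic rather than the exact RTL; the function names are made up.

```c
#include <assert.h>

/* Shift the low bit into the top bit of the narrow type, truncate,
   and test for non-zero.  */
static int
original_form (unsigned int x)
{
  return (unsigned char) (x << 7) != 0;
}

/* The simplified form: just the low bit.  */
static int
simplified_form (unsigned int x)
{
  return (int) (x & 1u);
}

/* Compare both forms over a range of inputs plus two extremes.  */
static void
check_truncated_shift (void)
{
  for (unsigned int x = 0; x < 65536; x++)
    assert (original_form (x) == simplified_form (x));
  assert (original_form (0xFFFFFFFEu) == simplified_form (0xFFFFFFFEu));
  assert (original_form (0xFFFFFFFFu) == simplified_form (0xFFFFFFFFu));
}
```

After the shift, bits 0 to 6 of the truncated value are always zero, so the result is non-zero exactly when bit 0 of `x` was set.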

This transformation applies to any target, but it's easy to see
(and add a new test case) on x86, where the following function:

int f(int a) { return (a << 31) >> 31; }

currently gets compiled with -O2 to:

foo:    movl    %edi, %eax
        sall    $7, %eax
        sarb    $7, %al
        movsbl  %al, %eax
        ret

but with this patch, we now generate the slightly simpler:

foo:    movl    %edi, %eax
        sall    $31, %eax
        sarl    $31, %eax
        ret

2023-10-11  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR middle-end/101955
	PR tree-optimization/106245
	* simplify-rtx.cc (simplify_relational_operation_1): Simplify
	the RTL (ne:SI (subreg:QI (ashift:SI x 7) 0) 0) to (and:SI x 1).

gcc/testsuite/ChangeLog
	* gcc.target/i386/pr106245-1.c: New test case.

c4149242

RISC-V: Enable full coverage vect tests · 23aabded

Juzhe-Zhong authored 1 year ago

I have analyzed all existing FAILs.

The following FAILs still need to be addressed:
FAIL: gcc.dg/vect/slp-reduc-7.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/slp-reduc-7.c execution test
FAIL: gcc.dg/vect/vect-cond-arith-2.c -flto -ffat-lto-objects  scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"
FAIL: gcc.dg/vect/vect-cond-arith-2.c scan-tree-dump optimized " = \\.COND_(LEN_)?SUB"

All other FAILs are dump FAILs that can be ignored (ARM SVE has the same FAILs, and they were not fixed in either the tests or the implementation).

It is now time to enable full coverage vect tests, including vec_unpack, vec_pack, vec_interleave, etc.

To see what we are still missing:

Before this patch:

                === gcc Summary ===

# of expected passes            182839
# of unexpected failures        79
# of unexpected successes       11
# of expected failures          1275
# of unresolved testcases       4
# of unsupported tests          4223

After this patch:

                === gcc Summary ===

# of expected passes            183411
# of unexpected failures        93
# of unexpected successes       7
# of expected failures          1285
# of unresolved testcases       4
# of unsupported tests          4157

One important new issue appears after this patch:

FAIL: gcc.dg/vect/vect-gather-1.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-1.c scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-3.c -flto -ffat-lto-objects  scan-tree-dump vect "Loop contains only SLP stmts"
FAIL: gcc.dg/vect/vect-gather-3.c scan-tree-dump vect "Loop contains only SLP stmts"

It has a related PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111721

I will fix this first in the middle-end after committing this patch.

Ok for trunk ?

gcc/testsuite/ChangeLog:

	* lib/target-supports.exp: Add RVV.

23aabded

Refine predicate of operands[2] in divv4hf3 with register_operand. · 4efe9085

liuhongt authored 1 year ago

The expander emits the insn below:

rtx tmp = gen_rtx_VEC_CONCAT (V4SFmode, operands[2],
			force_reg (V2SFmode, CONST1_RTX (V2SFmode)));

but *vec_concat<mode> only allows register_operand.

gcc/ChangeLog:

	PR target/111745
	* config/i386/mmx.md (divv4hf3): Refine predicate of
	operands[2] with register_operand.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr111745.c: New test.

4efe9085

RISC-V Regression: Make pattern match more accurate of vect-live-2.c · de04f73e

Juzhe-Zhong authored 1 year ago

Like previous patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632400.html
https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-live-2.c: Make pattern match more accurate.

de04f73e

RISC-V Regression: Fix FAIL of vect-multitypes-16.c for RVV · cfe89942

Juzhe-Zhong authored 1 year ago

As Richard suggested: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632288.html

Add vect_ext_char_longlong to fix FAIL for RVV.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/vect-multitypes-16.c: Adapt check for RVV.
	* lib/target-supports.exp: Add vect_ext_char_longlong property.

cfe89942

Daily bump. · 69e3072c
GCC Administrator authored 1 year ago

69e3072c

Oct 10, 2023

RISC-V: far-branch: Handle far jumps and branches for functions larger than 1MB · 71f90649

Andrew Waterman authored 1 year ago


On RISC-V, branches further than +/-1MB require a longer instruction
sequence (3 instructions): we can reuse the jump-construction in the
assembler (which clobbers $ra) and a temporary to set up the jump
destination.
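
The reach limit comes from the JAL encoding, which holds a signed 21-bit byte offset that is a multiple of 2. The sketch below only illustrates that range test; `needs_far_jump` is a hypothetical helper, not the way riscv.md decides (the ChangeLog below indicates that the real logic lives in the length attribute).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reach of a single JAL instruction, in bytes relative to the jump.  */
#define JAL_MIN_OFFSET (-(INT64_C (1) << 20))
#define JAL_MAX_OFFSET ((INT64_C (1) << 20) - 2)

/* Return true when OFFSET cannot be encoded in one JAL and a longer
   sequence is required.  */
static bool
needs_far_jump (int64_t offset)
{
  return offset < JAL_MIN_OFFSET || offset > JAL_MAX_OFFSET;
}

/* Usage: probe both edges of the range.  */
static void
check_far_jump (void)
{
  assert (!needs_far_jump (0));
  assert (!needs_far_jump (1048574));
  assert (needs_far_jump (1048576));
  assert (!needs_far_jump (-1048576));
  assert (needs_far_jump (-1048578));
}
```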

gcc/ChangeLog:

	* config/riscv/riscv.cc (struct machine_function): Track if a
	far-branch/jump is used within a function (and $ra needs to be
	saved).
	(riscv_print_operand): Implement 'N' (inverse integer branch).
	(riscv_far_jump_used_p): Implement.
	(riscv_save_return_addr_reg_p): New function.
	(riscv_save_reg_p): Use riscv_save_return_addr_reg_p.
	* config/riscv/riscv.h (FIXED_REGISTERS): Update $ra.
	(CALL_USED_REGISTERS): Update $ra.
	* config/riscv/riscv.md: Add new types "ret" and "jalr".
	(length attribute): Handle long conditional and unconditional
	branches.
	(conditional branch pattern): Handle case where jump can not
	reach the intended target.
	(indirect_jump, tablejump): Use new "jalr" type.
	(simple_return): Use new "ret" type.
	(simple_return_internal, eh_return_internal): Likewise.
	(gpr_restore_return, riscv_mret): Likewise.
	(riscv_uret, riscv_sret): Likewise.
	* config/riscv/generic.md (generic_branch): Also recognize jalr & ret
	types.
	* config/riscv/sifive-7.md (sifive_7_jump): Likewise.

Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-authored-by: Jeff Law <jlaw@ventanamicro.com>

71f90649

c++: mangle multiple levels of template parms [PR109422] · bd5719bd

Jason Merrill authored 1 year ago

This becomes more important with concepts, but can also be seen with
generic lambdas.

	PR c++/109422

gcc/cp/ChangeLog:

	* mangle.cc (write_template_param): Also mangle level.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp2a/lambda-generic-mangle1.C: New test.
	* g++.dg/cpp2a/lambda-generic-mangle1a.C: New test.

bd5719bd

MATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)` · 975da6fa

Andrew Pinski authored 1 year ago

Currently we have a simplification for `a | ~(a ^ b)`, but
it does not match the case where we originally had `(~a) | (a ^ b)`,
so we need to add a new pattern that matches that and uses bitwise_inverted_equal_p,
which also catches comparisons.
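
The message does not state the simplified form. By Boolean algebra, `a | ((~a) ^ b)` equals `a | ~b`, and the sketch below checks that identity exhaustively on 8-bit values. It is independent of GCC, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* The expression named in the commit title.  */
static uint8_t
or_original (uint8_t a, uint8_t b)
{
  return (uint8_t) (a | ((uint8_t) ~a ^ b));
}

/* The equivalent shorter form (an algebraic result, assumed here to be
   the target of the new pattern).  */
static uint8_t
or_simplified (uint8_t a, uint8_t b)
{
  return (uint8_t) (a | (uint8_t) ~b);
}

/* Compare both forms for every pair of 8-bit inputs.  */
static void
check_or_identity (void)
{
  for (unsigned int a = 0; a < 256; a++)
    for (unsigned int b = 0; b < 256; b++)
      assert (or_original ((uint8_t) a, (uint8_t) b)
              == or_simplified ((uint8_t) a, (uint8_t) b));
}
```

Where a bit of `a` is set the result is 1 either way; where it is clear, `~a ^ b` is `1 ^ b`, which is `~b`.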

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

	PR tree-optimization/111679

gcc/ChangeLog:

	* match.pd (`a | ((~a) ^ b)`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/bitops-5.c: New test.

975da6fa

RISC-V Regression: Make match patterns more accurate · 5bb6a876

Juzhe-Zhong authored 1 year ago

This patch fixes 2 FAILs in the RVV regression tests where the check was not accurate.

It's inspired by Robin's previous patch:
https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac11b@gmail.com/

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern.
	* gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.

5bb6a876

RISC-V Regression: Fix FAIL of predcom-2.c · 0b0fcb27

Juzhe-Zhong authored 1 year ago

Like GCN, add -fno-tree-vectorize.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/predcom-2.c: Add riscv.

0b0fcb27

RISC-V Regression: Fix FAIL of pr65947-8.c for RVV · 8a361405

Juzhe-Zhong authored 1 year ago

This test exercises the fold_extract_last pattern, so it is more reasonable to use
vect_fold_extract_last instead of listing targets.

This is the vect_fold_extract_last property:
proc check_effective_target_vect_fold_extract_last { } {
    return [expr { [check_effective_target_aarch64_sve]
		   || [istarget amdgcn*-*-*]
		   || [check_effective_target_riscv_v] }]
}

It includes ARM SVE/GCN/RVV.

It matches exactly what we want and is easier to maintain.

gcc/testsuite/ChangeLog:

	* gcc.dg/vect/pr65947-8.c: Use vect_fold_extract_last.

8a361405

MAINTAINERS: Add myself to write after approval · ddf17b6d

Christoph Müllner authored 1 year ago


Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

ChangeLog:

	* MAINTAINERS: Add myself.

ddf17b6d

RISC-V: Add VLS BOOL mode vcond_mask[PR111751] · 5255273e

Juzhe-Zhong authored 1 year ago

Richard's patch resolves PR111751: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=7c76c876e917a1f20a788f602cc78fff7d0a2a65

which causes an ICE in the RISC-V regression tests:

FAIL: gcc.dg/torture/pr53144.c   -O2  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O2  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O2 -flto -fno-use-linker-plugin -flto-partition=none  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: gcc.dg/torture/pr53144.c   -O3 -g  (internal compiler error: in gimple_expand_vec_cond_expr, at gimple-isel.cc:328)
FAIL: gcc.dg/torture/pr53144.c   -O3 -g  (test for excess errors)

The VLS BOOL mode vcond_mask is needed to fix this ICE regression.

More details: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

Tested and Committed.

	PR target/111751

gcc/ChangeLog:

	* config/riscv/autovec.md: Add VLS BOOL modes.

5255273e

tree-optimization/111751 - support 1024 bit vector constant reinterpretation · 70b5c698

Richard Biener authored 1 year ago

The following ups the limit in fold_view_convert_expr to handle
1024bit vectors as used by GCN and RVV.  It also robustifies
the handling in visit_reference_op_load to properly give up when
constants cannot be re-interpreted.

	PR tree-optimization/111751
	* fold-const.cc (fold_view_convert_expr): Up the buffer size
	to 128 bytes.
	* tree-ssa-sccvn.cc (visit_reference_op_load): Special case
	constants, giving up when re-interpretation to the target type
	fails.

70b5c698

ada: Fix internal error on too large representation clause for small component · 2f150833

Eric Botcazou authored 1 year ago

This is a small bug present on strict-alignment platforms for questionable
representation clauses.

gcc/ada/

	* gcc-interface/decl.cc (inline_status_for_subprog): Minor tweak.
	(gnat_to_gnu_field): Try harder to get a packable form of the type
	for a bitfield.

2f150833

ada: Tweak internal subprogram in Ada.Directories · 42c46cfe

Ronan Desplanques authored 1 year ago

The purpose of this patch is to work around false-positive warnings
emitted by GNAT SAS (also known as CodePeer). It does not change
the behavior of the modified subprogram.

gcc/ada/

	* libgnat/a-direct.adb (Start_Search_Internal): Tweak subprogram
	body.

42c46cfe

ada: Remove superfluous setter procedure · 25c253e6

Eric Botcazou authored 1 year ago

It is only called once.

gcc/ada/

	* sem_util.ads (Set_Scope_Is_Transient): Delete.
	* sem_util.adb (Set_Scope_Is_Transient): Likewise.
	* exp_ch7.adb (Create_Transient_Scope): Set Is_Transient directly.

25c253e6

ada: Fix bad finalization of limited aggregate in conditional expression · e05e5d6b

Eric Botcazou authored 1 year ago

This happens when the conditional expression is immediately returned, for
example in an expression function.

gcc/ada/

	* exp_aggr.adb (Is_Build_In_Place_Aggregate_Return): Return true
	if the aggregate is a dependent expression of a conditional
	expression being returned from a build-in-place function.

e05e5d6b

ada: Fix infinite loop with multiple limited with clauses · 6bd83c90

Eric Botcazou authored 1 year ago

This occurs when one of the types has an incomplete declaration in addition
to its full declaration in its package. In this case AI05-129 says that the
incomplete type is not part of the limited view of the package, i.e. only
the full view is. Now, in the GNAT implementation, it's the opposite in the
regular view of the package, i.e. the incomplete type is the visible one.

That's why the implementation needs to also swap the types on the visibility
chain while it is swapping the views when the clauses are either installed
or removed. This works correctly for the installation, but does not for the
removal, so this change rewrites the code doing the latter.

gcc/ada/
	PR ada/111434
	* sem_ch10.adb (Replace): New procedure to replace an entity with
	another on the homonym chain.
	(Install_Limited_With_Clause): Rename Non_Lim_View to Typ for the
	sake of consistency.  Call Replace to do the replacements and split
	the code into the regular and the special cases.  Add debugging
	output controlled by -gnatdi.
	(Install_With_Clause): Print the Parent_With and Implicit_With flags
	in the debugging output controlled by -gnatdi.
	(Remove_Limited_With_Unit.Restore_Chain_For_Shadow (Shadow)): Rewrite
	using a direct replacement of E4 by E2.   Call Replace to do the
	replacements.  Add debugging output controlled by -gnatdi.

6bd83c90