  1. Sep 20, 2024
    • Daily bump. · 442db842
      GCC Administrator authored
    • OpenMP: Add get_device_from_uid/omp_get_uid_from_device routines · bf4a5efa
      Tobias Burnus authored
      These TR13/OpenMP 6.0 routines permit reproducible offloading to
      a specific device by mapping an OpenMP device number to a
      unique ID (UID). The GPU device UIDs should be universally unique;
      the one for the host is not.
      
      gcc/ChangeLog:
      
      	* omp-general.cc (omp_runtime_api_procname): Add
      	get_device_from_uid and omp_get_uid_from_device routines.
      
      include/ChangeLog:
      
      	* cuda/cuda.h (cuDeviceGetUuid): Declare.
      	(cuDeviceGetUuid_v2): Add prototype.
      
      libgomp/ChangeLog:
      
      	* config/gcn/target.c (omp_get_uid_from_device,
      	omp_get_device_from_uid): Add stub implementation.
      	* config/nvptx/target.c (omp_get_uid_from_device,
      	omp_get_device_from_uid): Likewise.
      	* fortran.c (omp_get_uid_from_device_,
      	omp_get_uid_from_device_8_): New functions.
      	* libgomp-plugin.h (GOMP_OFFLOAD_get_uid): Add prototype.
      	* libgomp.h (struct gomp_device_descr): Add 'uid' and 'get_uid_func'.
      	* libgomp.map (GOMP_6.0): New, including the new UID routines.
      	* libgomp.texi (OpenMP Technical Report 13): Mark UID routines as 'Y'.
      	(Device Information Routines): Document new UID routines.
      	(Offload-Target Specifics): Document UID format.
      	* omp.h.in (omp_get_device_from_uid, omp_get_uid_from_device):
      	New prototype.
      	* omp_lib.f90.in (omp_get_device_from_uid, omp_get_uid_from_device):
      	New interface.
      	* omp_lib.h.in: Likewise.
      	* plugin/cuda-lib.def: Add cuDeviceGetUuid and cuDeviceGetUuid_v2 via
      	CUDA_ONE_CALL_MAYBE_NULL.
      	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_uid): New.
      	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_uid): New.
      	* target.c (str_omp_initial_device): New static var.
      	(STR_OMP_DEV_PREFIX): Define.
      	(gomp_get_uid_for_device, omp_get_uid_from_device,
      	omp_get_device_from_uid): New.
      	(gomp_load_plugin_for_device): DLSYM_OPT the function 'get_uid'.
      	(gomp_target_init): Set the device's 'uid' field to NULL.
      	* testsuite/libgomp.c/device_uid.c: New test.
      	* testsuite/libgomp.fortran/device_uid.f90: New test.
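
The device-number/UID round trip described in this commit can be pictured with a small host-only model. This is only a sketch under stated assumptions, not libgomp's implementation: the table `model_uids`, the UID strings in it, and the functions `model_get_uid_from_device` and `model_get_device_from_uid` are all hypothetical. The routines the commit adds are `omp_get_uid_from_device` and `omp_get_device_from_uid`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical UID table: index = device number.  Real UIDs come from the
   offload plugins; these strings are made up for illustration.  */
static const char *const model_uids[] = {
  "GPU-00000000-0000-0000-0000-000000000001",
  "GPU-00000000-0000-0000-0000-000000000002",
};
enum { MODEL_NUM_DEVICES = sizeof model_uids / sizeof model_uids[0] };

/* Device number -> UID, or NULL for an invalid device number
   (the NULL result is this model's choice).  */
const char *
model_get_uid_from_device (int device_num)
{
  if (device_num < 0 || device_num >= MODEL_NUM_DEVICES)
    return NULL;
  return model_uids[device_num];
}

/* UID -> device number, or -1 when no device has that UID
   (the -1 result is this model's choice).  */
int
model_get_device_from_uid (const char *uid)
{
  assert (uid != NULL);
  for (int i = 0; i < MODEL_NUM_DEVICES; i++)
    if (strcmp (model_uids[i], uid) == 0)
      return i;
  return -1;
}
```

A program that stores the UID string instead of the device number can look the device up again on a later run, even if the numbering has changed in between.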
  2. Sep 14, 2024
  3. Sep 13, 2024
    • Fortran: Fixes to OpenMP 'interop' directive parsing support · 99988464
      Tobias Burnus authored
      Handle lists as arguments to 'fr' and 'attr'; fix parsing corner cases.
      Additionally, 'fr' values are now stored internally as integers, permitting
      a diagnostic (warning) for values not defined in the OpenMP additional
      definitions document.
      
      	PR fortran/116661
      
      gcc/fortran/ChangeLog:
      
      	* gfortran.h (gfc_omp_namelist): Rename 'init' members for clarity.
      	* match.cc (gfc_free_omp_namelist): Handle renaming.
      	* dump-parse-tree.cc (show_omp_namelist): Update for new format
      	and features.
      	* openmp.cc (gfc_match_omp_prefer_type): Parse list to 'fr' and 'attr';
      	store 'fr' values as integer.
      	(gfc_match_omp_init): Rename variable names.
      
      gcc/ChangeLog:
      
      	* omp-api.h (omp_get_fr_id_from_name, omp_get_name_from_fr_id): New
      	prototypes.
      	* omp-general.cc (omp_get_fr_id_from_name, omp_get_name_from_fr_id):
      	New.
      
      include/ChangeLog:
      
      	* gomp-constants.h (GOMP_INTEROP_IFR_LAST,
      	GOMP_INTEROP_IFR_SEPARATOR, GOMP_INTEROP_IFR_NONE): New.
      
      gcc/testsuite/ChangeLog:
      
      	* gfortran.dg/gomp/interop-1.f90: Extend, update dg-*.
      	* gfortran.dg/gomp/interop-2.f90: Update dg-error.
      	* gfortran.dg/gomp/interop-3.f90: Add dg-warning.
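
Storing 'fr' values as integers amounts to a name-to-ID lookup with a fallback for unknown names. The following is a rough sketch of that idea only: `fr_table`, `FR_ID_UNKNOWN` and `sketch_fr_id_from_name` are hypothetical names, and the numeric values are this sketch's reading of the OpenMP additional definitions document, so they should be checked there before being relied on. The functions the commit adds are `omp_get_fr_id_from_name` and `omp_get_name_from_fr_id`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Foreign-runtime names and integer IDs.  The values are an assumption
   based on the OpenMP additional definitions document.  */
static const struct { const char *name; int id; } fr_table[] = {
  { "cuda", 1 }, { "cuda_driver", 2 }, { "opencl", 3 }, { "sycl", 4 },
  { "hip", 5 }, { "level_zero", 6 }, { "hsa", 7 },
};

/* Hypothetical sentinel for names that are not in the table.  */
#define FR_ID_UNKNOWN 0

/* Return the integer ID for NAME, or FR_ID_UNKNOWN so that the caller
   can warn about a value the document does not define.  */
int
sketch_fr_id_from_name (const char *name)
{
  assert (name != NULL);
  for (size_t i = 0; i < sizeof fr_table / sizeof fr_table[0]; i++)
    if (strcmp (fr_table[i].name, name) == 0)
      return fr_table[i].id;
  return FR_ID_UNKNOWN;
}
```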
  4. Aug 10, 2024
  5. Aug 09, 2024
  6. Jul 03, 2024
  7. Jul 02, 2024
    • ctf: use pointers instead of IDs internally · 36774cec
      David Faust authored
      This patch replaces all inter-type references in the ctfc internal data
      structures with pointers, rather than the references-by-ID which were
      used previously.
      
      A couple of small updates in the BPF backend are included to make it
      compatible with the change.
      
      This change is only to the in-memory representation of various CTF
      structures to make them easier to work with in various cases.  It is
      outwardly transparent; there is no change in emitted CTF.
      
      gcc/
      	* btfout.cc (BTF_VOID_TYPEID, BTF_INIT_TYPEID): Move defines to
      	include/btf.h.
      	(btf_dvd_emit_preprocess_cb, btf_emit_preprocess)
      	(btf_dmd_representable_bitfield_p, btf_asm_array, btf_asm_varent)
      	(btf_asm_sou_member, btf_asm_func_arg, btf_init_postprocess):
      	Adapt to structural changes in ctf_* structs.
      	* ctfc.h (struct ctf_dtdef): Add forward declaration.
      	(ctf_dtdef_t, ctf_dtdef_ref): Move typedefs earlier.
      	(struct ctf_arinfo, struct ctf_funcinfo, struct ctf_sliceinfo)
      	(struct ctf_itype, struct ctf_dmdef, struct ctf_func_arg)
      	(struct ctf_dvdef): Use pointers instead of type IDs for
      	references to other types and use typedefs where appropriate.
      	(struct ctf_dtdef): Add ref_type member.
      	(ctf_type_exists): Use pointer instead of type ID.
      	(ctf_add_reftype, ctf_add_enum, ctf_add_slice, ctf_add_float)
      	(ctf_add_integer, ctf_add_unknown, ctf_add_pointer)
      	(ctf_add_array, ctf_add_forward, ctf_add_typedef)
      	(ctf_add_function, ctf_add_sou, ctf_add_enumerator)
      	(ctf_add_variable): Likewise. Return pointer instead of ID.
      	(ctf_lookup_tree_type): Return pointer to type instead of ID.
      	* ctfc.cc: Analogous changes.
      	* ctfout.cc (ctf_asm_type, ctf_asm_slice, ctf_asm_varent)
      	(ctf_asm_sou_lmember, ctf_asm_sou_member, ctf_asm_func_arg)
      	(output_ctf_objt_info): Adapt to changes.
      	* dwarf2ctf.cc (gen_ctf_type, gen_ctf_void_type)
      	(gen_ctf_unknown_type, gen_ctf_base_type, gen_ctf_pointer_type)
      	(gen_ctf_subrange_type, gen_ctf_array_type, gen_ctf_typedef)
      	(gen_ctf_modifier_type, gen_ctf_sou_type, gen_ctf_function_type)
      	(gen_ctf_enumeration_type, gen_ctf_variable, gen_ctf_function)
      	(gen_ctf_type, ctf_do_die): Likewise.
      	* config/bpf/btfext-out.cc (struct btf_ext_core_reloc): Use
      	pointer instead of type ID.
      	(bpf_core_reloc_add, bpf_core_get_sou_member_index)
      	(output_btfext_core_sections): Adapt to above changes.
      	* config/bpf/core-builtins.cc (process_type): Likewise.
      
      include/
      	* btf.h (BTF_VOID_TYPEID, BTF_INIT_TYPEID): Move defines here,
      	from gcc/btfout.cc.
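
The benefit of pointer-based references can be shown with a deliberately simplified model. The struct below is hypothetical and much smaller than the real `struct ctf_dtdef`; only the idea of a `ref_type` pointer is taken from the ChangeLog above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type record; not the real struct ctf_dtdef.  */
struct sketch_type
{
  unsigned long id;             /* ID that would be emitted in the output.  */
  const char *name;
  struct sketch_type *ref_type; /* Referenced type, NULL for base types.  */
};

/* Follow the reference chain to the underlying base type.  With pointers
   this is a plain walk; with IDs every step would need a table lookup.  */
struct sketch_type *
sketch_resolve (struct sketch_type *t)
{
  assert (t != NULL);
  while (t->ref_type != NULL)
    t = t->ref_type;
  return t;
}
```

The ID stays in the record because the emitted format still needs it; only the in-memory links change, which matches the commit's statement that the emitted CTF is unchanged.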
  8. May 30, 2024
  9. May 29, 2024
    • libgomp: Enable USM for AMD APUs and MI200 devices · 18f47798
      Tobias Burnus authored
      If HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true,
      all GPUs on the system support unified shared memory. That's
      the case for APUs and MI200 devices when XNACK is enabled.
      
      XNACK can be enabled on supported devices by setting the environment
      variable HSA_XNACK=1; otherwise, if it is disabled, USM code will
      use host fallback.
      
      gcc/ChangeLog:
      
      	* config/gcn/gcn-hsa.h (gcn_local_sym_hash): Fix typo.
      
      include/ChangeLog:
      
      	* hsa.h (HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT): Add
      	enum value.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (gcn): Update USM handling.
      	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_num_devices): Handle
      	USM if HSA_AMD_SYSTEM_INFO_SVM_ACCESSIBLE_BY_DEFAULT is true.
    • libgomp: Enable USM for some nvptx devices · 4ccb3366
      Tobias Burnus authored
      A few high-end nvptx devices support the attribute
      CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS; for those, unified shared
      memory is supported in hardware. This patch enables support for those -
      if all installed nvptx devices have this feature (as the capabilities
      are per device type).
      
      This exposes a bug in gomp_copy_back_icvs, which previously used
      omp_get_mapped_ptr to find mapped variables; that function returns
      the unchanged pointer in case of shared memory. But in this case,
      we have a few actually mapped pointers, like the ICV variables.
      Additionally, there was a mismatch with regard to '-1' for the
      device number as gomp_copy_back_icvs and omp_get_mapped_ptr count
      differently. Hence, do the lookup manually.
      
      include/ChangeLog:
      
      	* cuda/cuda.h (CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS): Add.
      
      libgomp/ChangeLog:
      
      	* libgomp.texi (nvptx): Update USM description.
      	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_get_num_devices):
      	Claim support when requesting USM and all devices support
      	CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS.
      	* target.c (gomp_copy_back_icvs): Fix device ptr lookup.
      	(gomp_target_init): Set GOMP_OFFLOAD_CAP_SHARED_MEM if the
      	device supports USM.
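
The "all installed devices must have the feature" rule can be modelled in a few lines. This is a sketch under assumptions, not the plugin code: `sketch_num_usable_devices` is a hypothetical function, the per-device flags stand in for what the CUDA driver reports for `CU_DEVICE_ATTRIBUTE_PAGEABLE_MEMORY_ACCESS`, and returning -1 when the requirement cannot be met is this sketch's own convention.

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the "all devices must support it" rule.  pageable_access[i] is
   true when device i reports pageable memory access.  Returns the device
   count, or -1 when USM was requested but some device lacks support.  */
int
sketch_num_usable_devices (const bool *pageable_access, int num_devices,
                           bool usm_requested)
{
  assert (num_devices >= 0);
  if (usm_requested)
    for (int i = 0; i < num_devices; i++)
      if (!pageable_access[i])
        return -1;
  return num_devices;
}
```

A single device without the attribute is enough to withdraw the claim for the whole device type, because the capability is reported per device type rather than per device.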
  10. Apr 09, 2024
  11. Apr 08, 2024
    • GCN, nvptx: Errors during device probing are fatal · a02d7f0e
      Thomas Schwinge authored
      Currently, we silently disable libgomp GCN and nvptx plugins/devices in
      the presence of certain error conditions during device probing, thus typically
      silently resorting to host-fallback execution.  Make such errors fatal, as
      for any other device access later on, so that we notice early and reliably
      when things go wrong.  (Keep just two cases non-fatal: (a) libgomp GCN or nvptx
      plugins are available but 'libhsa-runtime64.so.1' or 'libcuda.so.1' are not,
      and (b) those are available, but the corresponding devices are not.)
      
      This resolves the issue that we've got execution test cases unexpectedly
      PASSing, despite:
      
          libgomp: GCN fatal error: Run-time could not be initialized
          Runtime message: HSA_STATUS_ERROR_OUT_OF_RESOURCES: The runtime failed to allocate the necessary resources. This error may also occur when the core runtime library needs to spawn threads or create internal OS-specific events.
      
      ..., and therefore they were not offloaded to the GCN device, but ran in
      host-fallback execution mode.  What happened in that scenario is that in
      'init_hsa_context' during the initial 'GOMP_OFFLOAD_get_num_devices' we ran
      into 'HSA_STATUS_ERROR_OUT_OF_RESOURCES', which wasn't fatal, but just
      silently disabled the libgomp plugin/device.
      
      Especially "entertaining" were cases where such unintended host-fallback
      execution happened during effective-target checks like
      'offload_device_available' (host-fallback execution there meaning: no offload
      device available), but actual test cases then were running with an offload
      device available, and therefore mis-configured.
      
      	include/
      	* cuda/cuda.h (CUresult): Add 'CUDA_ERROR_NO_DEVICE'.
      	libgomp/
      	* plugin/plugin-gcn.c (init_hsa_context): Add and handle
      	'bool probe' parameter.  Adjust all users; errors during device
      	probing are fatal.
      	* plugin/plugin-nvptx.c (nvptx_get_num_devices): Aside from
      	'CUDA_ERROR_NO_DEVICE', errors during device probing are fatal.
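
The policy in this commit boils down to a classification of probe outcomes. The sketch below is a hypothetical model of that classification: the enum and `sketch_probe_is_fatal` are invented for illustration, while the real plugins work with HSA and CUDA status codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical probe outcomes.  */
enum probe_status
{
  PROBE_OK,
  PROBE_NO_RUNTIME_LIBRARY, /* Case (a): runtime shared library absent.  */
  PROBE_NO_DEVICE,          /* Case (b): library present, no device.  */
  PROBE_OTHER_ERROR         /* Anything else, e.g. out of resources.  */
};

/* True when the probe result must abort instead of silently falling
   back to host execution.  */
bool
sketch_probe_is_fatal (enum probe_status status)
{
  switch (status)
    {
    case PROBE_OK:
    case PROBE_NO_RUNTIME_LIBRARY:
    case PROBE_NO_DEVICE:
      return false;
    case PROBE_OTHER_ERROR:
      return true;
    }
  assert (!"unknown probe status");
  return true;
}
```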
  12. Mar 01, 2024
  13. Feb 29, 2024
    • Fix PR libcc1/113977 · bc0e18a9
      Tom Tromey authored
      PR libcc1/113977 points out a case where a simple expression is
      rejected with a compiler error message.  The bug here is that gdb does
      not inform the plugin of the correct alignment -- in fact, there is no
      way to do that.
      
      This patch adds a new method to allow the alignment to be set, and
      bumps the C front end protocol version.
      
      It also includes some updates to various comments in 'include', done
      here to simplify the merge to binutils-gdb.
      
      include
      
      	* gcc-cp-interface.h (gcc_cp_fe_context_function): Update
      	comment.
      	* gcc-c-interface.h (enum gcc_c_api_version) <GCC_C_FE_VERSION_2>:
      	New constant.
      	(gcc_c_fe_context_function): Update comment.
      	* gcc-c-fe.def (finish_record_with_alignment): New method.
      	Update documentation.
      
      libcc1
      
      	PR libcc1/113977
      	* libcc1plugin.cc (plugin_finish_record_or_union): New function.
      	(plugin_finish_record_or_union): Rewrite.
      	(plugin_init): Use GCC_C_FE_VERSION_2.
      	* libcc1.cc (c_vtable): Use GCC_C_FE_VERSION_2.
      	(gcc_c_fe_context): Check for GCC_C_FE_VERSION_2.
      
      
  14. Jan 14, 2024
  15. Jan 13, 2024
    • c++, demangle: Implement https://github.com/itanium-cxx-abi/cxx-abi/issues/148 non-proposal · 65388b28
      Jakub Jelinek authored
      The following patch attempts to implement what apparently clang++
      implemented for explicit object member function mangling, but nobody
      actually proposed in patch form in
      https://github.com/itanium-cxx-abi/cxx-abi/issues/148
      
      2024-01-13  Jakub Jelinek  <jakub@redhat.com>
      
      gcc/cp/
      	* mangle.cc (write_nested_name): Mangle explicit object
      	member functions with H as per
      	https://github.com/itanium-cxx-abi/cxx-abi/issues/148 non-proposal.
      gcc/testsuite/
      	* g++.dg/abi/mangle79.C: New test.
      include/
      	* demangle.h (enum demangle_component_type): Add
      	DEMANGLE_COMPONENT_XOBJ_MEMBER_FUNCTION.
      libiberty/
      	* cp-demangle.c (FNQUAL_COMPONENT_CASE): Add case for
      	DEMANGLE_COMPONENT_XOBJ_MEMBER_FUNCTION.
      	(d_dump): Handle DEMANGLE_COMPONENT_XOBJ_MEMBER_FUNCTION.
      	(d_nested_name): Parse H after N in nested name.
      	(d_count_templates_scopes): Handle
      	DEMANGLE_COMPONENT_XOBJ_MEMBER_FUNCTION.
      	(d_print_mod): Likewise.
      	(d_print_function_type): Likewise.
      	* testsuite/demangle-expected: Add tests for explicit object
      	member functions.
  16. Jan 10, 2024
  17. Jan 09, 2024
    • [committed] Adding missing prototype for __clzhi2 to xstormy port · 9f7afa99
      Jeff Law authored
      xstormy16 has failed since the C99 transition due to a missing prototype for
      __clzhi2 in the implementation of stormy16_count_leading_zeros.
      
      This fixes the missing prototype.  Pushed to the trunk.
      
      include/
      	* longlong.h (__stormy16_count_leading_zeros): Add prototype for
      	__clzhi2.
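
Why a missing prototype breaks the build can be seen in a small stand-alone example. Since C99, calling a function with no declaration in scope is invalid, so a macro that expands to such a call needs the prototype first. The names `sketch_clz16` and `SKETCH_COUNT_LEADING_ZEROS` are hypothetical stand-ins, and the sketch assumes a 16-bit `unsigned short`.

```c
#include <assert.h>

/* Prototype first: since C99 a call to an undeclared function is invalid,
   so the macro below needs this declaration in scope.  */
int sketch_clz16 (unsigned short x);

/* Hypothetical macro in the style of a count_leading_zeros helper.  */
#define SKETCH_COUNT_LEADING_ZEROS(count, x) \
  ((count) = sketch_clz16 ((unsigned short) (x)))

/* Portable reference: number of leading zero bits in a 16-bit value.  */
int
sketch_clz16 (unsigned short x)
{
  int n = 0;
  assert (x != 0);  /* The result for zero is left undefined here.  */
  for (unsigned mask = 0x8000u; (x & mask) == 0; mask >>= 1)
    n++;
  return n;
}
```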
  18. Jan 03, 2024
  19. Dec 16, 2023
  20. Dec 15, 2023
    • OpenMP/OpenACC: Unordered/non-constant component offset runtime diagnostic · f5745dc1
      Julian Brown authored
      This patch adds support for non-constant component offsets in "map"
      clauses for OpenMP (and the equivalents for OpenACC), which are not able
      to be sorted into order at compile time.  Normally struct accesses in
      such clauses are gathered together and sorted into increasing address
      order after a "GOMP_MAP_STRUCT" node: if we have variable indices,
      that is no longer possible.
      
      This version of the patch scales back the previously-posted version to
      merely add a diagnostic for incorrect usage of component accesses with
      variably-indexed arrays of structs: the only permitted variant is where
      we have multiple indices that are the same, but we could not prove so
      at compile time.  Rather than silently producing the wrong result for
      cases where the indices are in fact different, we error out (e.g.,
      "map(dtarr(i)%arrptr, dtarr(j)%arrptr(4:8))", for different i/j).
      
      For now, multiple *constant* array indices are still supported (see
      map-arrayofstruct-1.c).  That could perhaps be addressed with a follow-up
      patch, if necessary.
      
      This version of the patch renumbers the GOMP_MAP_STRUCT_UNORD kind to
      avoid clashing with the OpenACC "non-contiguous" dynamic array support
      (though that is not yet applied to mainline).
      
      2023-08-18  Julian Brown  <julian@codesourcery.com>
      
      gcc/
      	* gimplify.cc (extract_base_bit_offset): Add VARIABLE_OFFSET parameter.
      	(omp_get_attachment, omp_group_last, omp_group_base,
      	omp_directive_maps_explicitly): Add GOMP_MAP_STRUCT_UNORD support.
      	(omp_accumulate_sibling_list): Update calls to extract_base_bit_offset.
      	Support GOMP_MAP_STRUCT_UNORD.
      	(omp_build_struct_sibling_lists, gimplify_scan_omp_clauses,
      	gimplify_adjust_omp_clauses, gimplify_omp_target_update): Add
      	GOMP_MAP_STRUCT_UNORD support.
      	* omp-low.cc (lower_omp_target): Add GOMP_MAP_STRUCT_UNORD support.
      	* tree-pretty-print.cc (dump_omp_clause): Likewise.
      
      include/
      	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_STRUCT_UNORD.
      
      libgomp/
      	* oacc-mem.c (find_group_last, goacc_enter_data_internal,
      	goacc_exit_data_internal, GOACC_enter_exit_data): Add
      	GOMP_MAP_STRUCT_UNORD support.
      	* target.c (gomp_map_vars_internal): Add GOMP_MAP_STRUCT_UNORD support.
      	Detect incorrect use of variable indexing of arrays of structs.
      	(GOMP_target_enter_exit_data, gomp_target_task_fn): Add
      	GOMP_MAP_STRUCT_UNORD support.
      	* testsuite/libgomp.c-c++-common/map-arrayofstruct-1.c: New test.
      	* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: New test.
      	* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: New test.
      	* testsuite/libgomp.fortran/map-subarray-5.f90: New test.
  21. Dec 11, 2023
  22. Dec 10, 2023
    • Add some new DW_IDX_* constants · 748766b8
      Tom Tromey authored
      I've reimplemented the .debug_names code in GDB -- it was quite far
      from being correct, and the new implementation is much closer to what
      is specified by DWARF.
      
      However, the new writer in GDB needs to emit some symbol properties,
      so that the reader can be fully functional.  This patch adds a few new
      DW_IDX_* constants, and tries to document the existing extensions as
      well.  (My patch series adds more documentation of these to the GDB
      manual as well.)
      
      include/ChangeLog
      2023-12-10  Tom Tromey  <tom@tromey.com>
      
      	* dwarf2.def (DW_IDX_GNU_internal, DW_IDX_GNU_external): Comment.
      	(DW_IDX_GNU_main, DW_IDX_GNU_language, DW_IDX_GNU_linkage_name):
      	New constants.
  23. Dec 02, 2023
  24. Dec 01, 2023
    • c++: mangle function template constraints · c3f281a0
      Jason Merrill authored
      Per https://github.com/itanium-cxx-abi/cxx-abi/issues/24 and
      https://github.com/itanium-cxx-abi/cxx-abi/pull/166
      
      We need to mangle constraints to be able to distinguish between function
      templates that only differ in constraints.  From the latter link, we want to
      use the template parameter mangling previously specified for lambdas to also
      make explicit the form of a template parameter where the argument is not a
      "natural" fit for it, such as when the parameter is constrained or deduced.
      
      I'm concerned about how the latter link changes the mangling for some C++98
      and C++11 patterns, so I've limited template_parm_natural_p to avoid two
      cases found by running the testsuite with -Wabi forced on:
      
      template <class T, T V> T f() { return V; }
      int main() { return f<int,42>(); }
      
      template <int i> int max() { return i; }
      template <int i, int j, int... rest> int max()
      {
        int sub = max<j, rest...>();
        return i > sub ? i : sub;
      }
      int main() {  return max<1,2,3>(); }
      
      A third C++11 pattern is changed by this patch:
      
      template <template <typename...> class TT, typename... Ts> TT<Ts...> f();
      template <typename> struct A { };
      int main() { f<A,int>(); }
      
      I aim to resolve these with the ABI committee before GCC 14.1.
      
      We also need to resolve https://github.com/itanium-cxx-abi/cxx-abi/issues/38
      (mangling references to dependent template-ids where the name is fully
      resolved) as references to concepts in std:: will consistently run into this
      area.  This is why mangle-concepts1.C only refers to concepts in the global
      namespace so far.
      
      The library changes are to avoid trying to mangle builtins, which fails.
      
      Demangler support and test coverage is not complete yet.
      
      gcc/cp/ChangeLog:
      
      	* cp-tree.h (TEMPLATE_ARGS_TYPE_CONSTRAINT_P): New.
      	(get_concept_check_template): Declare.
      	* constraint.cc (combine_constraint_expressions)
      	(finish_shorthand_constraint): Use UNKNOWN_LOCATION.
      	* pt.cc (convert_generic_types_to_packs): Likewise.
      	* mangle.cc (write_constraint_expression)
      	(write_tparms_constraints, write_type_constraint)
      	(template_parm_natural_p, write_requirement)
      	(write_requires_expr): New.
      	(write_encoding): Mangle trailing requires-clause.
      	(write_name): Pass parms to write_template_args.
      	(write_template_param_decl): Factor out from...
      	(write_closure_template_head): ...here.
      	(write_template_args): Mangle non-natural parms
      	and requires-clause.
      	(write_expression): Handle REQUIRES_EXPR.
      
      include/ChangeLog:
      
      	* demangle.h (enum demangle_component_type): Add
      	DEMANGLE_COMPONENT_CONSTRAINTS.
      
      libiberty/ChangeLog:
      
      	* cp-demangle.c (d_make_comp): Handle
      	DEMANGLE_COMPONENT_CONSTRAINTS.
      	(d_count_templates_scopes): Likewise.
      	(d_print_comp_inner): Likewise.
      	(d_maybe_constraints): New.
      	(d_encoding, d_template_args_1): Call it.
      	(d_parmlist): Handle 'Q'.
      	* testsuite/demangle-expected: Add some constraint tests.
      
      libstdc++-v3/ChangeLog:
      
      	* include/std/bit: Avoid builtins in requires-clauses.
      	* include/std/variant: Likewise.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/abi/mangle10.C: Disable compat aliases.
      	* g++.dg/abi/mangle52.C: Specify ABI 18.
      	* g++.dg/cpp2a/class-deduction-alias3.C
      	* g++.dg/cpp2a/class-deduction-alias8.C:
      	Avoid builtins in requires-clauses.
      	* g++.dg/abi/mangle-concepts1.C: New test.
      	* g++.dg/abi/mangle-ttp1.C: New test.
  25. Nov 29, 2023
  26. Nov 28, 2023
    • libiberty: Use x86 HW optimized sha1 · bf4f40cc
      Jakub Jelinek authored
      Nick has approved this patch (+ small ld change to use it for --build-id=),
      so I'm committing it to GCC master as well.
      
      If anyone from ARM would be willing to implement it similarly with
      vsha1{cq,mq,pq,h,su0q,su1q}_u32 intrinsics, it could be a useful linker
      speedup on those hosts as well.  The intent in sha1.c was that the
      sha1_hw_process_bytes and sha1_hw_process_block functions
      would be defined whenever
      defined (HAVE_X86_SHA1_HW_SUPPORT) || defined (HAVE_WHATEVERELSE_SHA1_HW_SUPPORT)
      holds, while the bodies of sha1_hw_process_block and sha1_choose_process_bytes
      would then have #elif defined (HAVE_WHATEVERELSE_SHA1_HW_SUPPORT) for
      the other arch support, and similarly for any target attributes on
      sha1_hw_process_block if needed.
      
      2023-11-28  Jakub Jelinek  <jakub@redhat.com>
      
      include/
      	* sha1.h (sha1_process_bytes_fn): New typedef.
      	(sha1_choose_process_bytes): Declare.
      libiberty/
      	* configure.ac (HAVE_X86_SHA1_HW_SUPPORT): New check.
      	* sha1.c: If HAVE_X86_SHA1_HW_SUPPORT is defined, include x86intrin.h
      	and cpuid.h.
      	(sha1_hw_process_bytes, sha1_hw_process_block,
      	sha1_choose_process_bytes): New functions.
      	* config.in: Regenerated.
      	* configure: Regenerated.
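
The "choose once, then call through a function pointer" pattern behind `sha1_choose_process_bytes` can be sketched without any SHA-1 code. Everything below is a hypothetical stand-in: a toy byte sum replaces the hash, the `hw_supported` argument replaces the run-time CPU feature check, and none of the `sketch_*` names exist in libiberty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins: a toy byte sum replaces SHA-1 so that the
   dispatch pattern stays visible.  */
typedef void (*sketch_process_bytes_fn) (const void *buf, size_t len,
                                         unsigned long *state);

void
sketch_generic_process_bytes (const void *buf, size_t len,
                              unsigned long *state)
{
  const unsigned char *p = buf;
  assert (state != NULL);
  for (size_t i = 0; i < len; i++)
    *state += p[i];
}

/* Would use CPU-specific instructions; here it only forwards.  */
void
sketch_hw_process_bytes (const void *buf, size_t len, unsigned long *state)
{
  sketch_generic_process_bytes (buf, len, state);
}

/* Pick the implementation once; callers then go through the pointer.
   hw_supported stands in for a run-time CPU feature check.  */
sketch_process_bytes_fn
sketch_choose_process_bytes (int hw_supported)
{
  return hw_supported ? sketch_hw_process_bytes
                      : sketch_generic_process_bytes;
}
```

Support for another architecture would add one more implementation and one more branch in the chooser, which is the extension point the commit message describes.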
  27. Nov 08, 2023
  28. Nov 07, 2023
    • openmp: Add support for the 'indirect' clause in C/C++ · a49c7d31
      Kwok Cheung Yeung authored
      This adds support for the 'indirect' clause in the 'declare target'
      directive.  Functions declared as indirect may be called via function
      pointers passed from the host in offloaded code.
      
      Virtual calls to member functions via the object pointer in C++ are
      currently not supported in target regions.
      
      2023-11-07  Kwok Cheung Yeung  <kcy@codesourcery.com>
      
      gcc/c-family/
      	* c-attribs.cc (c_common_attribute_table): Add attribute for
      	indirect functions.
      	* c-pragma.h (enum pragma_omp_clause): Add entry for indirect clause.
      
      gcc/c/
      	* c-decl.cc (c_decl_attributes): Add attribute for indirect
      	functions.
      	* c-lang.h (c_omp_declare_target_attr): Add indirect field.
      	* c-parser.cc (c_parser_omp_clause_name): Handle indirect clause.
      	(c_parser_omp_clause_indirect): New.
      	(c_parser_omp_all_clauses): Handle indirect clause.
      	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
      	(c_parser_omp_declare_target): Handle indirect clause.  Emit error
      	message if device_type or indirect clauses used alone.  Emit error
      	if indirect clause used with device_type that is not 'any'.
      	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
      	(c_parser_omp_begin): Handle indirect clause.
      	* c-typeck.cc (c_finish_omp_clauses): Handle indirect clause.
      
      gcc/cp/
      	* cp-tree.h (cp_omp_declare_target_attr): Add indirect field.
      	* decl2.cc (cplus_decl_attributes): Add attribute for indirect
      	functions.
      	* parser.cc (cp_parser_omp_clause_name): Handle indirect clause.
      	(cp_parser_omp_clause_indirect): New.
      	(cp_parser_omp_all_clauses): Handle indirect clause.
      	(handle_omp_declare_target_clause): Add extra parameter.  Add
      	indirect attribute for indirect functions.
      	(OMP_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
      	(cp_parser_omp_declare_target): Handle indirect clause.  Emit error
      	message if device_type or indirect clauses used alone.  Emit error
      	if indirect clause used with device_type that is not 'any'.
      	(OMP_BEGIN_DECLARE_TARGET_CLAUSE_MASK): Add indirect clause to mask.
      	(cp_parser_omp_begin): Handle indirect clause.
      	* semantics.cc (finish_omp_clauses): Handle indirect clause.
      
      gcc/
      	* lto-cgraph.cc (enum LTO_symtab_tags): Add tag for indirect
      	functions.
      	(output_offload_tables): Write indirect functions.
      	(input_offload_tables): Read indirect functions.
      	* lto-section-names.h (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
      	* omp-builtins.def (BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR): New.
      	* omp-offload.cc (offload_ind_funcs): New.
      	(omp_discover_implicit_declare_target): Add functions marked with
      	'omp declare target indirect' to indirect functions list.
      	(omp_finish_file): Add indirect functions to section for offload
      	indirect functions.
      	(execute_omp_device_lower): Redirect indirect calls on target by
      	passing function pointer to BUILT_IN_GOMP_TARGET_MAP_INDIRECT_PTR.
      	(pass_omp_device_lower::gate): Run pass_omp_device_lower if
      	indirect functions are present on an accelerator device.
      	* omp-offload.h (offload_ind_funcs): New.
      	* tree-core.h (omp_clause_code): Add OMP_CLAUSE_INDIRECT.
      	* tree.cc (omp_clause_num_ops): Add entry for OMP_CLAUSE_INDIRECT.
      	(omp_clause_code_name): Likewise.
      	* tree.h (OMP_CLAUSE_INDIRECT_EXPR): New.
      	* config/gcn/mkoffload.cc (process_asm): Process offload_ind_funcs
      	section.  Count number of indirect functions.
      	(process_obj): Emit number of indirect functions.
      	* config/nvptx/mkoffload.cc (ind_func_ids, ind_funcs_tail): New.
      	(process): Emit offload_ind_func_table in PTX code.  Emit indirect
      	function names and count in image.
      	* config/nvptx/nvptx.cc (nvptx_record_offload_symbol): Mark
      	indirect functions in PTX code with IND_FUNC_MAP.
      
      gcc/testsuite/
      	* c-c++-common/gomp/declare-target-7.c: Update expected error message.
      	* c-c++-common/gomp/declare-target-indirect-1.c: New.
      	* c-c++-common/gomp/declare-target-indirect-2.c: New.
      	* g++.dg/gomp/attrs-21.C (v12): Update expected error message.
      	* g++.dg/gomp/declare-target-indirect-1.C: New.
      	* gcc.dg/gomp/attrs-21.c (v12): Update expected error message.
      
      include/
      	* gomp-constants.h (GOMP_VERSION): Increment to 3.
      	(GOMP_VERSION_SUPPORTS_INDIRECT_FUNCS): New.
      
      libgcc/
      	* offloadstuff.c (OFFLOAD_IND_FUNC_TABLE_SECTION_NAME): New.
      	(__offload_ind_func_table): New.
      	(__offload_ind_funcs_end): New.
      	(__OFFLOAD_TABLE__): Add entries for indirect functions.
      
      libgomp/
      	* Makefile.am (libgomp_la_SOURCES): Add target-indirect.c.
      	* Makefile.in: Regenerate.
      	* libgomp-plugin.h (GOMP_INDIRECT_ADDR_MAP): New define.
      	(GOMP_OFFLOAD_load_image): Add extra argument.
      	* libgomp.h (struct indirect_splay_tree_key_s): New.
      	(indirect_splay_tree_node, indirect_splay_tree,
      	indirect_splay_tree_key): New.
      	(indirect_splay_compare): New.
      	* libgomp.map (GOMP_5.1.1): Add GOMP_target_map_indirect_ptr.
      	* libgomp.texi (OpenMP 5.1): Update documentation on indirect
      	calls in target region and on indirect clause.
      	(Other new OpenMP 5.2 features): Add entry for virtual function calls.
      	* libgomp_g.h (GOMP_target_map_indirect_ptr): Add prototype.
      	* oacc-host.c (host_load_image): Add extra argument.
      	* target.c (gomp_load_image_to_device): If the GOMP_VERSION is high
      	enough, read host indirect functions table and pass to
      	load_image_func.
      	* config/accel/target-indirect.c: New.
      	* config/linux/target-indirect.c: New.
      	* config/gcn/team.c (build_indirect_map): Add prototype.
      	(gomp_gcn_enter_kernel): Initialize support for indirect
      	function calls on GCN target.
      	* config/nvptx/team.c (build_indirect_map): Add prototype.
      	(gomp_nvptx_main): Initialize support for indirect function
      	calls on NVPTX target.
      	* plugin/plugin-gcn.c (struct gcn_image_desc): Add field for
      	indirect functions count.
      	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
      	is high enough, build address translation table and copy it to target
      	memory.
      	* plugin/plugin-nvptx.c (nvptx_tdata): Add field for indirect
      	functions count.
      	(GOMP_OFFLOAD_load_image): Add extra argument.  If the GOMP_VERSION
      	is high enough, build address translation table and copy it to target
      	memory.
      	* testsuite/libgomp.c-c++-common/declare-target-indirect-1.c: New.
      	* testsuite/libgomp.c-c++-common/declare-target-indirect-2.c: New.
      	* testsuite/libgomp.c++/declare-target-indirect-1.C: New.
  29. Oct 26, 2023
  30. Oct 25, 2023
    • OpenACC 2.7: Implement self clause for compute constructs · 3a359638
      Chung-Lin Tang authored
      This patch implements the 'self' clause for compute constructs: parallel,
      kernels, and serial. This clause conditionally uses the local device
      (the host multi-core CPU) as the executing device of the compute region.
      
      The actual implementation of the "local device" device type inside libgomp
      (presumably using pthreads) is not yet complete, so the libgomp side is
      currently implemented exactly like host-fallback mode. As of now, the clause
      therefore essentially behaves like the 'if' clause with the condition inverted.
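
As an illustration of the clause described above, here is a small sketch of a 'parallel' construct carrying a 'self' clause. The function name and its parameters are hypothetical. A compiler without OpenACC support (or without -fopenacc) ignores the pragmas and runs the loop on the host, so the result is the same either way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum an array.  When run_on_host is nonzero, the
   self clause asks for the region to execute on the local device (the host);
   otherwise the region may be offloaded.  */
long sum_array(const int *data, size_t n, int run_on_host) {
  long total = 0;
  assert(data != NULL || n == 0);
#pragma acc parallel self(run_on_host) copyin(data[0:n]) reduction(+:total)
  {
#pragma acc loop reduction(+:total)
    for (size_t i = 0; i < n; ++i)
      total += data[i];
  }
  return total;
}
```

Calling sum_array(v, 4, 1) and sum_array(v, 4, 0) should return the same sum; only the place of execution may differ.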
      
      gcc/c/ChangeLog:
      
      	* c-parser.cc (c_parser_oacc_compute_clause_self): New function.
      	(c_parser_oacc_all_clauses): Add new 'bool compute_p = false'
      	parameter, add parsing of self clause when compute_p is true.
      	(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
      	(OACC_PARALLEL_CLAUSE_MASK): Likewise.
      	(OACC_SERIAL_CLAUSE_MASK): Likewise.
      	(c_parser_oacc_compute): Adjust call to c_parser_oacc_all_clauses to
      	set compute_p argument to true.
      	* c-typeck.cc (c_finish_omp_clauses): Add OMP_CLAUSE_SELF case.
      
      gcc/cp/ChangeLog:
      
      	* parser.cc (cp_parser_oacc_compute_clause_self): New function.
      	(cp_parser_oacc_all_clauses): Add new 'bool compute_p = false'
      	parameter, add parsing of self clause when compute_p is true.
      	(OACC_KERNELS_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_SELF.
      	(OACC_PARALLEL_CLAUSE_MASK): Likewise.
      	(OACC_SERIAL_CLAUSE_MASK): Likewise.
      	(cp_parser_oacc_compute): Adjust call to cp_parser_oacc_all_clauses to
      	set compute_p argument to true.
      	* pt.cc (tsubst_omp_clauses): Add OMP_CLAUSE_SELF case.
      	* semantics.cc (finish_omp_clauses): Add OMP_CLAUSE_SELF case, merged
      	with OMP_CLAUSE_IF case.
      
      gcc/fortran/ChangeLog:
      
      	* gfortran.h (typedef struct gfc_omp_clauses): Add self_expr field.
      	* openmp.cc (enum omp_mask2): Add OMP_CLAUSE_SELF.
      	(gfc_match_omp_clauses): Add handling for OMP_CLAUSE_SELF.
      	(OACC_PARALLEL_CLAUSES): Add OMP_CLAUSE_SELF.
      	(OACC_KERNELS_CLAUSES): Likewise.
      	(OACC_SERIAL_CLAUSES): Likewise.
      	(resolve_omp_clauses): Add handling for omp_clauses->self_expr.
      	* trans-openmp.cc (gfc_trans_omp_clauses): Add handling of
      	clauses->self_expr and building of OMP_CLAUSE_SELF tree clause.
      	(gfc_split_omp_clauses): Add handling of self_expr field copy.
      
      gcc/ChangeLog:
      
      	* gimplify.cc (gimplify_scan_omp_clauses): Add OMP_CLAUSE_SELF case.
      	(gimplify_adjust_omp_clauses): Likewise.
      	* omp-expand.cc (expand_omp_target): Add OMP_CLAUSE_SELF expansion code.
      	* omp-low.cc (scan_sharing_clauses): Add OMP_CLAUSE_SELF case.
      	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE_SELF enum.
      	* tree-nested.cc (convert_nonlocal_omp_clauses): Add OMP_CLAUSE_SELF
      	case.
      	(convert_local_omp_clauses): Likewise.
      	* tree-pretty-print.cc (dump_omp_clause): Add OMP_CLAUSE_SELF case.
      	* tree.cc (omp_clause_num_ops): Add OMP_CLAUSE_SELF entry.
      	(omp_clause_code_name): Likewise.
      	* tree.h (OMP_CLAUSE_SELF_EXPR): New macro.
      
      gcc/testsuite/ChangeLog:
      
      	* c-c++-common/goacc/self-clause-1.c: New test.
      	* c-c++-common/goacc/self-clause-2.c: New test.
      	* gfortran.dg/goacc/self.f95: New test.
      
      include/ChangeLog:
      
      	* gomp-constants.h (GOACC_FLAG_LOCAL_DEVICE): New flag bit value.
      
      libgomp/ChangeLog:
      
      	* oacc-parallel.c (GOACC_parallel_keyed): Add code to handle
      	GOACC_FLAG_LOCAL_DEVICE case.
      	* testsuite/libgomp.oacc-c-c++-common/self-1.c: New test.
      3a359638
  31. Oct 13, 2023
  32. Oct 12, 2023
    • Zhang, Jun's avatar
      x86: set spincount 1 for x86 hybrid platform · e1e127de
      Zhang, Jun authored
      Testing shows that a spincount of 1 performs better on hybrid platforms.
      
      Using '-march=native -Ofast -funroll-loops -flto', the results are as
      follows:
      
      spec2017 speed   RPL     ADL
      657.xz_s         0.00%   0.50%
      603.bwaves_s     10.90%  26.20%
      607.cactuBSSN_s  5.50%   72.50%
      619.lbm_s        2.40%   2.50%
      621.wrf_s        -7.70%  2.40%
      627.cam4_s       0.50%   0.70%
      628.pop2_s       48.20%  153.00%
      638.imagick_s    -0.10%  0.20%
      644.nab_s        2.30%   1.40%
      649.fotonik3d_s  8.00%   13.80%
      654.roms_s       1.20%   1.10%
      Geomean-int      0.00%   0.50%
      Geomean-fp       6.30%   21.10%
      Geomean-all      5.70%   19.10%
      
      omp2012          RPL     ADL
      350.md           -1.81%  -1.75%
      351.bwaves       7.72%   12.50%
      352.nab          14.63%  19.71%
      357.bt331        -0.20%  1.77%
      358.botsalgn     0.00%   0.00%
      359.botsspar     0.00%   0.65%
      360.ilbdc        0.00%   0.25%
      362.fma3d        2.66%   -0.51%
      363.swim         10.44%  0.00%
      367.imagick      0.00%   0.12%
      370.mgrid331     2.49%   25.56%
      371.applu331     1.06%   4.22%
      372.smithwa      0.74%   3.34%
      376.kdtree       10.67%  16.03%
      GEOMEAN          3.34%   5.53%
      
      include/ChangeLog:
      
      	PR target/109812
      	* spincount.h: New file.
      
      libgomp/ChangeLog:
      
      	* env.c (initialize_env): Use do_adjust_default_spincount.
      	* config/linux/x86/spincount.h: New file.
      e1e127de
  33. Aug 23, 2023
  34. Aug 22, 2023
    • Jason Merrill's avatar
      c++: constrained hidden friends [PR109751] · 810bcc00
      Jason Merrill authored
      r13-4035 avoided a problem with overloading of constrained hidden friends by
      checking satisfaction, but checking satisfaction early is inconsistent with
      the usual late checking and can lead to hard errors, so let's not do that
      after all.
      
      We were wrongly treating the different instantiations of the same friend
      template as the same function because maybe_substitute_reqs_for was failing
      to actually substitute in the case of a non-template friend.  But we don't
      actually need to do the substitution anyway, because [temp.friend] says that
      such a friend can't be the same as any other declaration.
      
      After fixing that, instead of a redefinition error we got an ambiguous
      overload error, fixed by allowing constrained hidden friends to coexist
      until overload resolution, at which point they probably won't be in the same
      ADL overload set anyway.
      
      And we avoid mangling collisions by following the proposed mangling for
      these friends as a member function with an extra 'F' before the name.  I
      demangle this by just adding [friend] to the name of the function because
      it's not feasible to reconstruct the actual scope of the function since the
      mangling ABI doesn't distinguish between class and namespace scopes.
      
      	PR c++/109751
      
      gcc/cp/ChangeLog:
      
      	* cp-tree.h (member_like_constrained_friend_p): Declare.
      	* decl.cc (member_like_constrained_friend_p): New.
      	(function_requirements_equivalent_p): Check it.
      	(duplicate_decls): Check it.
      	(grokfndecl): Check friend template constraints.
      	* mangle.cc (decl_mangling_context): Check it.
      	(write_unqualified_name): Check it.
      	* pt.cc (uses_outer_template_parms_in_constraints): Fix for friends.
      	(tsubst_friend_function): Don't check satisfaction.
      
      include/ChangeLog:
      
      	* demangle.h (enum demangle_component_type): Add
      	DEMANGLE_COMPONENT_FRIEND.
      
      libiberty/ChangeLog:
      
      	* cp-demangle.c (d_make_comp): Handle DEMANGLE_COMPONENT_FRIEND.
      	(d_count_templates_scopes): Likewise.
      	(d_print_comp_inner): Likewise.
      	(d_unqualified_name): Handle member-like friend mangling.
      	* testsuite/demangle-expected: Add test.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/cpp2a/concepts-friend11.C: Now works.  Add template.
      	* g++.dg/cpp2a/concepts-friend15.C: New test.
      810bcc00
  35. Aug 08, 2023
  36. Aug 07, 2023
    • Vladimir Mezentsev's avatar
      gprofng: a new GNU profiler · 24552056
      Vladimir Mezentsev authored
      ChangeLog:
      
      	* Makefile.def: Add gprofng module.
      	* configure.ac: Add --enable-gprofng option.
      	* Makefile.in: Regenerate.
      	* configure: Regenerate.
      
      include/ChangeLog:
      
      	* collectorAPI.h: New file.
      	* libcollector.h: New file.
      	* libfcollector.h: New file.
      24552056
    • Alan Modra's avatar
      gcc-4.5 build fixes · 432c6f05
      Alan Modra authored
      Trying to build binutils with an older gcc currently fails.  Working
      around these gcc bugs is not onerous so let's fix them.
      
      include/ChangeLog:
      
      	* xtensa-dynconfig.h (xtensa_isa_internal): Delete unnecessary
      	forward declaration.
      432c6f05
    • Alan Modra's avatar
      PR29961, plugin-api.h: "Could not detect architecture endianess" · 24f5a73a
      Alan Modra authored
      Found when attempting to build binutils on sparc sunos-5.8 where
      sys/byteorder.h defines _BIG_ENDIAN but not any of the BYTE_ORDER
      variants.  This patch adds the extra tests to cope with the old
      machine, and tidies the header a little.
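
The kind of fallback chain described above can be sketched as follows. This is not the contents of plugin-api.h: the SKETCH_ macro and the function names are hypothetical, and the assumption that a system defines exactly one of _LITTLE_ENDIAN and _BIG_ENDIAN is taken from the description of the old sparc machine, so the sketch checks that only one of them is set.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Prefer the compiler-provided macros, then fall back to the bare
   _LITTLE_ENDIAN / _BIG_ENDIAN macros that some older systems define.  */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) \
    && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
# define SKETCH_HOST_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) \
    && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
# define SKETCH_HOST_LITTLE_ENDIAN 0
#elif defined(_LITTLE_ENDIAN) && !defined(_BIG_ENDIAN)
# define SKETCH_HOST_LITTLE_ENDIAN 1
#elif defined(_BIG_ENDIAN) && !defined(_LITTLE_ENDIAN)
# define SKETCH_HOST_LITTLE_ENDIAN 0
#else
# error "sketch: could not detect byte order at compile time"
#endif

/* Byte order as decided by the preprocessor chain above. */
int compile_time_is_little_endian(void) {
  return SKETCH_HOST_LITTLE_ENDIAN;
}

/* Byte order as observed at run time: look at the first byte of 1. */
int runtime_is_little_endian(void) {
  const uint32_t probe = 1;
  unsigned char first_byte;
  memcpy(&first_byte, &probe, 1);
  return first_byte == 1;
}

/* Cross-check; aborts if the two answers ever disagree. */
void check_endianness(void) {
  assert(compile_time_is_little_endian() == runtime_is_little_endian());
}
```

The run-time probe is only there as a cross-check; a header like this one has to decide at compile time.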
      
      include/ChangeLog:
      
      	* plugin-api.h: When handling non-gcc or gcc < 4.6.0 include
      	necessary header files before testing macros.  Make more use
      	of #elif.  Test _LITTLE_ENDIAN and _BIG_ENDIAN in final tests.
      24f5a73a