  1. Jan 02, 2025
  2. Dec 25, 2024
  3. Dec 24, 2024
    • libcpp: Fix overly large buffer allocation · 27af1a14
      Lewis Hyatt authored
      It seems that tokens_buff_new() has always been allocating the virtual
      location buffer 4 times larger than intended, and now that location_t is
      64-bit, it is 8 times larger. Fixed.
      
      libcpp/ChangeLog:
      
      	* macro.cc (tokens_buff_new): Fix length argument to XNEWVEC.
      27af1a14
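The 4x and 8x factors in this message are what one would see if a byte count were passed to a macro that expects an element count. The following is a minimal sketch under that assumption; it does not reproduce the actual macro.cc code. `NEWVEC_BYTES` is a stand-in that only mirrors the shape of an XNEWVEC-style macro (multiplying the count by the element size), and `loc32_t` and `loc64_t` are hypothetical names for the location type before and after the 64-bit change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an XNEWVEC-style macro: the number of bytes requested for
   N elements of type T.  The macro multiplies by sizeof (T) itself.  */
#define NEWVEC_BYTES(T, N) (sizeof (T) * (size_t) (N))

typedef uint32_t loc32_t; /* hypothetical: 32-bit location type */
typedef uint64_t loc64_t; /* hypothetical: 64-bit location type */

/* Intended usage: pass the element count.  */
static size_t intended_bytes_32 (size_t len)
{
  return NEWVEC_BYTES (loc32_t, len);
}

static size_t intended_bytes_64 (size_t len)
{
  return NEWVEC_BYTES (loc64_t, len);
}

/* Assumed shape of the mistake: a byte count is passed as the element
   count, so the element size is applied twice.  */
static size_t oversized_bytes_32 (size_t len)
{
  return NEWVEC_BYTES (loc32_t, len * sizeof (loc32_t));
}

static size_t oversized_bytes_64 (size_t len)
{
  return NEWVEC_BYTES (loc64_t, len * sizeof (loc64_t));
}
```

With these definitions, the oversized request is `sizeof (T)` times the intended one, which gives the factor of 4 for a 32-bit type and 8 for a 64-bit type.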
  4. Dec 17, 2024
  5. Dec 16, 2024
  6. Dec 09, 2024
  7. Dec 08, 2024
    • Support for 64-bit location_t: Activate 64-bit location_t · d9cdc500
      Lewis Hyatt authored
      Change location_t to be a 64-bit integer instead of a 32-bit integer in
      libcpp.
      
      Also included in this change are the two other patches in the original
      series which depended on this one; I am committing them all at once in case
      it needs to be reverted later:
      
      -Support for 64-bit location_t: gimple parts
      
      The size of struct gimple increased by 8 bytes with the change in size of
      location_t from 32- to 64-bit; adjust the WORD markings in the comments
      accordingly. It seems that most of the WORD markings were off by one already,
      probably not having been updated after a previous reduction in the size of a
      gimple, so they have become retroactively correct again, and only a couple
      needed adjustment actually.
      
      Also add a comment that there is now 32 bits of unused padding available in
      struct gimple for 64-bit hosts.
      
      -Support for 64-bit location_t: Remove -flarge-source-files
      
      The option -flarge-source-files became unnecessary with 64-bit location_t
      and harms performance compared to the new default setting, so silently
      ignore it.
      
      libcpp/ChangeLog:
      
      	* include/cpplib.h (struct cpp_token): Adjust comment about the
      	struct size.
      	* include/line-map.h (location_t): Change typedef from 32-bit to 64-bit
      	integer.
      	(LINE_MAP_MAX_COLUMN_NUMBER): Increase size to be appropriate for
      	64-bit location_t.
      	(LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): Likewise.
      	(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
      	(LINE_MAP_MAX_LOCATION): Likewise.
      	(MAX_LOCATION_T): Likewise.
      	(line_map_suggested_range_bits): Likewise.
      	(struct line_map): Adjust comment about the struct size.
      	(struct line_map_macro): Likewise.
      	(struct line_map_ordinary): Likewise. Rearrange fields to optimize
      	padding.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/diagnostic/pr77949.C: Adapt the test for 64-bit location_t,
      	when the previously expected failure doesn't actually happen.
      	* g++.dg/modules/loc-prune-4.C: Adjust the expected output for the
      	64-bit location_t case.
      	* gcc.dg/plugin/expensive_selftests_plugin.cc: Don't try to test
      	the maximum supported column number in 64-bit location_t mode.
      	* gcc.dg/plugin/location_overflow_plugin.cc: Adjust the base_location
      	so it can effectively test 64-bit location_t.
      
      gcc/ChangeLog:
      
      	* gimple.h (struct gphi): Update word marking comments to reflect
      	the new size of location_t.
      	(struct gimple): Likewise. Add a comment about padding.
      	* common.opt: Mark -flarge-source-files as Ignored.
      	* common.opt.urls: Regenerate.
      	* doc/invoke.texi: Remove -flarge-source-files.
      	* toplev.cc (process_options): Remove support for
      	-flarge-source-files.
      d9cdc500
  8. Dec 07, 2024
  9. Dec 06, 2024
    • libcpp, c++: Optimize initializers using #embed in C++ · 0223119f
      Jakub Jelinek authored
      This patch adds similar optimizations to the C++ FE as have been
      implemented earlier in the C FE.
      The libcpp hunk enables use of CPP_EMBED token even for C++, not just
      C; the preprocessor guarantees there is always a CPP_NUMBER CPP_COMMA
      before CPP_EMBED and CPP_COMMA CPP_NUMBER after it which simplifies
      parsing (unless #embed is more than 2GB, in that case it could be
      CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED
      CPP_COMMA CPP_NUMBER etc. with each CPP_EMBED covering at most INT_MAX
      bytes).
      Similarly to the C patch, this patch parses it into RAW_DATA_CST tree
      in the braced initializers (and from there peels into INTEGER_CSTs unless
      it is an initializer of an std::byte array or integral array with CHAR_BIT
      element precision), parses CPP_EMBED in cp_parser_expression into just
      the last INTEGER_CST in it because I think users don't need millions of
      -Wunused-value warnings because they did useless
        int a = (
        #embed "megabyte.dat"
        );
      and so most of the inner INTEGER_CSTs would be there just for the warning,
      and in the rest of contexts (like template argument list, function argument
      list, attribute argument list, ...) parse it into a sequence of INTEGER_CSTs
      (I wrote range/iterator classes to simplify that).
      
      My dumb
      cat embed-11.c
      constexpr unsigned char a[] = {
        #embed "cc1plus"
      };
      const unsigned char *b = a;
      testcase where cc1plus is 492329008 bytes long when configured
      --enable-checking=yes,rtl,extra against recent binutils with .base64 gas
      support results in:
      time ./xg++ -B ./ -S -O2 embed-11.c
      
      real    0m4.350s
      user    0m2.427s
      sys     0m0.830s
      time ./xg++ -B ./ -c -O2 embed-11.c
      
      real    0m6.932s
      user    0m6.034s
      sys     0m0.888s
      (compared to running out of memory or very long compilation).
      On a shorter inclusion,
      cat embed-12.c
      constexpr unsigned char a[] = {
        #embed "xg++"
      };
      const unsigned char *b = a;
      where xg++ is 15225904 bytes long, this takes using GCC with the #embed
      patchset except for this patch:
      time ~/src/gcc/obj36/gcc/xg++ -B ~/src/gcc/obj36/gcc/ -S -O2 embed-12.c
      
      real    0m33.190s
      user    0m32.327s
      sys     0m0.790s
      and with this patch:
      time ./xg++ -B ./ -S -O2 embed-12.c
      
      real    0m0.118s
      user    0m0.090s
      sys     0m0.028s
      
      The patch doesn't change anything on what the first patch in the series
      introduces even for C++, namely that #embed is expanded (actually or as if)
      into a sequence of literals like
      127,69,76,70,2,1,1,3,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,80,211,64,0,0,0,0,0,64,0,0,0,0,0,0,0,8,253
      and so each element has int type.
      That is how I believe it is in C23, and the different versions of the
      C++ P1967 paper specified there some casts, P1967R12 in particular
      "Otherwise, the integral constant expression is the value of std::fgetc’s return is cast
      to unsigned char."
      but please see
      https://github.com/llvm/llvm-project/pull/97274#issuecomment-2230929277
      comment and whether we really want the preprocessor to preprocess it for
      C++ as (or as-if)
      static_cast<unsigned char>(127),static_cast<unsigned char>(69),static_cast<unsigned char>(76),static_cast<unsigned char>(70),static_cast<unsigned char>(2),...
      i.e. 9 tokens per byte rather than 2, or
      (unsigned char)127,(unsigned char)69,...
      or
      ((unsigned char)127),((unsigned char)69),...
      etc.
      Without a literal suffix for unsigned char constant literals it is horrible,
      plus the incompatibility between C and C++.  Sure, we could use the magic
      form more often for C++ to save the size and do the 9 or how many tokens
      form only for the boundary constants and use #embed "." __gnu__::__base64__("...")
      for what is in between if there are at least 2 tokens inside of it.
      E.g. (unsigned char)127 vs. static_cast<unsigned char>(127) behaves
      differently if there is constexpr long long p[] = { ... };
      ...
        #embed __FILE__
      [p]
      
      2024-12-06  Jakub Jelinek  <jakub@redhat.com>
      
      libcpp/
      	* files.cc (finish_embed): Use CPP_EMBED even for C++.
      gcc/
      	* tree.h (RAW_DATA_UCHAR_ELT, RAW_DATA_SCHAR_ELT): Define.
      gcc/cp/ChangeLog:
      	* cp-tree.h (class raw_data_iterator): New type.
      	(class raw_data_range): New type.
      	* parser.cc (cp_parser_postfix_open_square_expression): Handle
      	parsing of CPP_EMBED.
      	(cp_parser_parenthesized_expression_list): Likewise.  Use
      	cp_lexer_next_token_is.
      	(cp_parser_expression): Handle parsing of CPP_EMBED.
      	(cp_parser_template_argument_list): Likewise.
      	(cp_parser_initializer_list): Likewise.
      	(cp_parser_oacc_clause_tile): Likewise.
      	(cp_parser_omp_tile_sizes): Likewise.
      	* pt.cc (tsubst_expr): Handle RAW_DATA_CST.
      	* constexpr.cc (reduced_constant_expression_p): Likewise.
      	(raw_data_cst_elt): New function.
      	(find_array_ctor_elt): Handle RAW_DATA_CST.
      	(cxx_eval_array_reference): Likewise.
      	* typeck2.cc (digest_init_r): Emit -Wnarrowing and/or -Wconversion
      	diagnostics.
      	(process_init_constructor_array): Handle RAW_DATA_CST.
      	* decl.cc (maybe_deduce_size_from_array_init): Likewise.
      	(is_direct_enum_init): Fail for RAW_DATA_CST.
      	(cp_maybe_split_raw_data): New function.
      	(consume_init): New function.
      	(reshape_init_array_1): Add VECTOR_P argument.  Handle RAW_DATA_CST.
      	(reshape_init_array): Adjust reshape_init_array_1 caller.
      	(reshape_init_vector): Likewise.
      	(reshape_init_class): Handle RAW_DATA_CST.
      	(reshape_init_r): Likewise.
      gcc/testsuite/
      	* c-c++-common/cpp/embed-22.c: New test.
      	* c-c++-common/cpp/embed-23.c: New test.
      	* g++.dg/cpp/embed-4.C: New test.
      	* g++.dg/cpp/embed-5.C: New test.
      	* g++.dg/cpp/embed-6.C: New test.
      	* g++.dg/cpp/embed-7.C: New test.
      	* g++.dg/cpp/embed-8.C: New test.
      	* g++.dg/cpp/embed-9.C: New test.
      	* g++.dg/cpp/embed-10.C: New test.
      	* g++.dg/cpp/embed-11.C: New test.
      	* g++.dg/cpp/embed-12.C: New test.
      	* g++.dg/cpp/embed-13.C: New test.
      	* g++.dg/cpp/embed-14.C: New test.
      0223119f
  10. Dec 04, 2024
  11. Dec 03, 2024
    • preprocessor: Adjust C rules on UCNs for C23 [PR117162] · f3b5de94
      Joseph Myers authored
      As noted in bug 117162, C23 changed some rules on UCNs to match C++
      (this was a late change agreed in the resolution to CD2 comment
      US-032, implementing changes from N3124), which we need to implement.
      
      Allow UCNs below 0xa0 outside identifiers for C, with a
      pedwarn-if-pedantic before C23 (and a warning with -Wc11-c23-compat)
      except for the always-allowed cases of UCNs for $ @ `.  Also as part
      of that change, do not allow \u0024 in identifiers as equivalent to $
      for C23.
      
      Bootstrapped with no regressions for x86_64-pc-linux-gnu.
      
      	PR c/117162
      
      libcpp/
      	* include/cpplib.h (struct cpp_options): Add low_ucns.
      	* init.cc (struct lang_flags, lang_defaults): Add low_ucns.
      	(cpp_set_lang): Set low_ucns.
      	* charset.cc (_cpp_valid_ucn): For C, allow UCNs below 0xa0
      	outside identifiers, with a pedwarn if pedantic before C23 or a
      	warning with -Wc11-c23-compat.  Do not allow \u0024 in identifiers
      	for C23.
      
      gcc/testsuite/
      	* gcc.dg/cpp/c17-ucn-1.c, gcc.dg/cpp/c17-ucn-2.c,
      	gcc.dg/cpp/c17-ucn-3.c, gcc.dg/cpp/c17-ucn-4.c,
      	gcc.dg/cpp/c23-ucn-2.c, gcc.dg/cpp/c23-ucnid-2.c: New tests.
      	* c-c++-common/cpp/delimited-escape-seq-3.c,
      	c-c++-common/cpp/named-universal-char-escape-3.c,
      	gcc.dg/cpp/c23-ucn-1.c, gcc.dg/cpp/c2y-delimited-escape-seq-3.c:
      	Update expected messages.
      	* gcc.dg/cpp/ucs.c: Use -pedantic-errors.  Update expected
      	messages.
      f3b5de94
  12. Nov 29, 2024
  13. Nov 28, 2024
    • diagnostics: replace %<%s%> with %qs [PR104896] · 9f06b910
      David Malcolm authored
      
      No functional change intended.
      
      gcc/analyzer/ChangeLog:
      	PR c/104896
      	* sm-malloc.cc: Replace "%<%s%>" with "%qs" in message wording.
      
      gcc/c-family/ChangeLog:
      	PR c/104896
      	* c-lex.cc (c_common_lex_availability_macro): Replace "%<%s%>"
      	with "%qs" in message wording.
      	* c-opts.cc (c_common_handle_option): Likewise.
      	* c-warn.cc (warn_parm_array_mismatch): Likewise.
      
      gcc/ChangeLog:
      	PR c/104896
      	* common/config/ia64/ia64-common.cc (ia64_handle_option): Replace
      	"%<%s%>" with "%qs" in message wording.
      	* common/config/rs6000/rs6000-common.cc (rs6000_handle_option):
      	Likewise.
      	* config/aarch64/aarch64.cc (aarch64_validate_sls_mitigation):
      	Likewise.
      	(aarch64_override_options): Likewise.
      	(aarch64_process_target_attr): Likewise.
      	* config/arm/aarch-common.cc (aarch_validate_mbranch_protection):
      	Likewise.
      	* config/pru/pru.cc (pru_insert_attributes): Likewise.
      	* config/riscv/riscv-target-attr.cc
      	(riscv_target_attr_parser::parse_arch): Likewise.
      	* omp-general.cc (oacc_verify_routine_clauses): Likewise.
      	* tree-ssa-uninit.cc (maybe_warn_read_write_only): Likewise.
      	(maybe_warn_pass_by_reference): Likewise.
      
      gcc/cp/ChangeLog:
      	PR c/104896
      	* cvt.cc (maybe_warn_nodiscard): Replace "%<%s%>" with "%qs" in
      	message wording.
      
      gcc/fortran/ChangeLog:
      	PR c/104896
      	* resolve.cc (resolve_operator): Replace "%<%s%>" with "%qs" in
      	message wording.
      
      gcc/go/ChangeLog:
      	PR c/104896
      	* gofrontend/embed.cc (Gogo::initializer_for_embeds): Replace
      	"%<%s%>" with "%qs" in message wording.
      	* gofrontend/expressions.cc
      	(Selector_expression::lower_method_expression): Likewise.
      	* gofrontend/gogo.cc (Gogo::set_package_name): Likewise.
      	(Named_object::export_named_object): Likewise.
      	* gofrontend/parse.cc (Parse::struct_type): Likewise.
      	(Parse::parameter_list): Likewise.
      
      gcc/rust/ChangeLog:
      	PR c/104896
      	* backend/rust-compile-expr.cc
      	(CompileExpr::compile_integer_literal): Replace "%<%s%>" with
      	"%qs" in message wording.
      	(CompileExpr::compile_float_literal): Likewise.
      	* backend/rust-compile-intrinsic.cc (Intrinsics::compile):
      	Likewise.
      	* backend/rust-tree.cc (maybe_warn_nodiscard): Likewise.
      	* checks/lints/rust-lint-scan-deadcode.h: Likewise.
      	* lex/rust-lex.cc (Lexer::parse_partial_unicode_escape): Likewise.
      	(Lexer::parse_raw_byte_string): Likewise.
      	* lex/rust-token.cc (Token::get_str): Likewise.
      	* metadata/rust-export-metadata.cc
      	(PublicInterface::write_to_path): Likewise.
      	* parse/rust-parse.cc
      	(peculiar_fragment_match_compatible_fragment): Likewise.
      	(peculiar_fragment_match_compatible): Likewise.
      	* resolve/rust-ast-resolve-path.cc (ResolvePath::resolve_path):
      	Likewise.
      	* resolve/rust-ast-resolve-toplevel.h: Likewise.
      	* resolve/rust-ast-resolve-type.cc (ResolveRelativeTypePath::go):
      	Likewise.
      	* rust-session-manager.cc (validate_crate_name): Likewise.
      	(Session::load_extern_crate): Likewise.
      	* typecheck/rust-hir-type-check-expr.cc (TypeCheckExpr::visit):
      	Likewise.
      	(TypeCheckExpr::resolve_fn_trait_call): Likewise.
      	* typecheck/rust-hir-type-check-implitem.cc
      	(TypeCheckImplItemWithTrait::visit): Likewise.
      	* typecheck/rust-hir-type-check-item.cc
      	(TypeCheckItem::validate_trait_impl_block): Likewise.
      	* typecheck/rust-hir-type-check-struct.cc
      	(TypeCheckStructExpr::visit): Likewise.
      	* typecheck/rust-tyty-call.cc (TypeCheckCallExpr::visit):
      	Likewise.
      	* typecheck/rust-tyty.cc (BaseType::bounds_compatible): Likewise.
      	* typecheck/rust-unify.cc (UnifyRules::emit_abi_mismatch):
      	Likewise.
      	* util/rust-attributes.cc (AttributeChecker::visit): Likewise.
      
      libcpp/ChangeLog:
      	PR c/104896
      	* pch.cc (cpp_valid_state): Replace "%<%s%>" with "%qs" in message
      	wording.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      9f06b910
    • Daily bump. · 7a656d74
      GCC Administrator authored
      7a656d74
  14. Nov 27, 2024
    • libcpp: modules and -include again · 134dc932
      Jason Merrill authored
      I enabled include translation to header units in r15-1104-ga29f481bbcaf2b,
      but it seems that patch wasn't sufficient, as any diagnostics in the main
      source file would show up as coming from the header instead.
      
      Fixed by setting buffer->file for leaving the file transition that my
      previous patch made us enter.  And don't push a buffer of newlines; in this
      case that messes up line numbers instead of aligning them.
      
      libcpp/ChangeLog:
      
      	* files.cc (_cpp_stack_file): Handle -include of header unit more
      	specially.
      
      gcc/testsuite/ChangeLog:
      
      	* g++.dg/modules/dashinclude-1_b.C: Add an #error.
      	* g++.dg/modules/dashinclude-1_a.H: Remove dg-module-do run.
      134dc932
  15. Nov 24, 2024
  16. Nov 23, 2024
    • libcpp: Fix ICE lexing invalid raw string in a deferred pragma [PR117118] · 18cace46
      Lewis Hyatt authored
      The PR shows that we ICE after lexing an invalid unterminated raw string,
      because lex_raw_string() pops the main buffer unexpectedly. Resolve by
      handling this case the same way as for other directives.
      
      libcpp/ChangeLog:
      	PR preprocessor/117118
      	* lex.cc (lex_raw_string): Treat an unterminated raw string the same
      	way for a deferred pragma as is done for other directives.
      
      gcc/testsuite/ChangeLog:
      	PR preprocessor/117118
      	* c-c++-common/raw-string-directive-3.c: New test.
      	* c-c++-common/raw-string-directive-4.c: New test.
      18cace46
    • libcpp: Fix potential unaligned access in cpp_buffer · c93eb81c
      Lewis Hyatt authored
      libcpp makes use of the cpp_buffer pfile->a_buff to store things while it is
      handling macros. It uses it to store pointers (cpp_hashnode*, for macro
      arguments) and cpp_macro objects. This works fine because a cpp_hashnode*
      and a cpp_macro have the same alignment requirement on either 32-bit or
      64-bit systems (namely, the same alignment as a pointer.)
      
      When 64-bit location_t is enabled on a 32-bit system, the alignment
      requirement may cease to be the same, because the alignment requirement of a
      cpp_macro object changes to that of a uint64_t, which may be larger than that of
      a pointer. It's not the case for x86 32-bit, but for example, on sparc, a
      pointer has 4-byte alignment while a uint64_t has 8. In that case,
      intermixing the two within the same cpp_buffer leads to a misaligned
      access. The code path that triggers this is the one in _cpp_commit_buff in
      which a hash table with its own allocator (i.e. ggc) is not being used, so
      it doesn't happen within the compiler itself, but it happens in the other
      libcpp clients, such as genmatch.
      
      Fix that up by ensuring _cpp_commit_buff commits a fully aligned chunk of the
      buffer, so it's ready for anything it may be used for next.
      
      Also modify CPP_ALIGN so that it guarantees to return an alignment at least
      the size of location_t. Currently it returns the max of a pointer and a
      double. I am not aware of any platform where a double may have smaller
      alignment than a uint64_t, but it does not hurt to add location_t here to be
      sure.
      
      libcpp/ChangeLog:
      
      	* lex.cc (_cpp_commit_buff): Make sure that the buffer is properly
      	aligned for the next allocation.
      	* internal.h (struct dummy): Make sure alignment is large enough for
      	a location_t, just in case.
      c93eb81c
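The alignment fix described above comes down to rounding every committed size up to the strictest alignment that any later user of the buffer might need. The following is a minimal sketch of that rounding, not the libcpp code: `union strictest_sketch`, `struct align_probe`, `SKETCH_ALIGNMENT` and `sketch_align` are hypothetical names, and the sketch assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a union of every type that may be stored in the
   shared buffer, so its alignment is the strictest of them.  The uint64_t
   member plays the role of a 64-bit location_t.  */
union strictest_sketch
{
  void *p;
  double d;
  uint64_t loc;
};

/* The offset of U after a single char equals the union's alignment.  */
struct align_probe
{
  char c;
  union strictest_sketch u;
};

#define SKETCH_ALIGNMENT offsetof (struct align_probe, u)

/* Round N up to the next multiple of SKETCH_ALIGNMENT.  Assumes the
   alignment is a power of two.  */
static size_t sketch_align (size_t n)
{
  return (n + (SKETCH_ALIGNMENT - 1)) & ~(size_t) (SKETCH_ALIGNMENT - 1);
}
```

If each chunk handed out from the buffer has a size that went through `sketch_align`, and the buffer itself starts on an aligned address, the next chunk also starts on an aligned address, whether it will hold a pointer or an object containing a 64-bit integer.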
    • Support for 64-bit location_t: libcpp preliminaries · 927625d0
      Lewis Hyatt authored
      Prepare libcpp to support 64-bit location_t, without yet making
      any functional changes, by adding new typedefs that enable code to be
      written such that it works with any size location_t. Update the usage of
      line maps within libcpp accordingly.
      
      Subsequent patches will prepare the rest of the codebase similarly, and then
      afterwards, location_t will be changed to uint64_t.
      
      libcpp/ChangeLog:
      
      	* include/line-map.h (line_map_uint_t): New typedef, the same type
      	as location_t.
      	(location_diff_t): New typedef.
      	(line_map_suggested_range_bits): New constant.
      	(struct maps_info_ordinary): Change member types from "unsigned int"
      	to "line_map_uint_t".
      	(struct maps_info_macro): Likewise.
      	(struct location_adhoc_data_map): Likewise.
      	(LINEMAPS_ALLOCATED): Change return type from "unsigned int" to
      	"line_map_uint_t".
      	(LINEMAPS_ORDINARY_ALLOCATED): Likewise.
      	(LINEMAPS_MACRO_ALLOCATED): Likewise.
      	(LINEMAPS_USED): Likewise.
      	(LINEMAPS_ORDINARY_USED): Likewise.
      	(LINEMAPS_MACRO_USED): Likewise.
      	(linemap_lookup_macro_index): Likewise.
      	(LINEMAPS_MAP_AT): Change argument type from "unsigned int" to
      	"line_map_uint_t".
      	(LINEMAPS_ORDINARY_MAP_AT): Likewise.
      	(LINEMAPS_MACRO_MAP_AT): Likewise.
      	(line_map_new_raw): Likewise.
      	(linemap_module_restore): Likewise.
      	(linemap_dump): Likewise.
      	(line_table_dump): Likewise.
      	(LINEMAPS_LAST_MAP): Add a linemap_assert() for safety.
      	(SOURCE_COLUMN): Use a cast to ensure correctness if location_t
      	becomes a 64-bit type.
      	* line-map.cc (location_adhoc_data_hash): Don't truncate to 32-bit
      	prematurely when hashing.
      	(line_maps::get_or_create_combined_loc): Adapt types to support
      	potentially 64-bit location_t. Use MAX_LOCATION_T rather than a
      	hard-coded constant.
      	(line_maps::get_range_from_loc): Adapt types and constants to
      	support potentially 64-bit location_t.
      	(line_maps::pure_location_p): Likewise.
      	(line_maps::get_pure_location): Likewise.
      	(line_map_new_raw): Likewise.
      	(LAST_SOURCE_LINE_LOCATION): Likewise.
      	(linemap_add): Likewise.
      	(linemap_module_restore): Likewise.
      	(linemap_line_start): Likewise.
      	(linemap_position_for_column): Likewise.
      	(linemap_position_for_line_and_column): Likewise.
      	(linemap_position_for_loc_and_offset): Likewise.
      	(linemap_ordinary_map_lookup): Likewise.
      	(linemap_lookup_macro_index): Likewise.
      	(linemap_dump): Likewise.
      	(linemap_dump_location): Likewise.
      	(linemap_get_file_highest_location): Likewise.
      	(line_table_dump): Likewise.
      	(linemap_compare_locations): Avoid signed int overflow in the result.
      	* macro.cc (num_expanded_macros_counter): Change type of global
      	variable from "unsigned int" to "line_map_uint_t".
      	(num_macro_tokens_counter): Likewise.
      927625d0
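One of the entries above mentions avoiding signed int overflow in the result of a location comparison. The sketch below illustrates the general hazard and one common remedy; it is not the code from line-map.cc. `loc_t`, `compare_locs` and `compare_locs_narrowing` are hypothetical names, and the sign convention (negative when the first argument is smaller) is the sketch's own choice.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t loc_t; /* hypothetical stand-in for a 64-bit location_t */

/* Order two locations without forming their difference: returns -1, 0
   or 1.  Each comparison yields 0 or 1, so the subtraction cannot
   overflow an int.  */
static int compare_locs (loc_t a, loc_t b)
{
  return (a > b) - (a < b);
}

/* The shape to avoid: once locations are 64 bits wide, their difference
   need not fit in an int, and narrowing it can lose or flip the sign.  */
static int compare_locs_narrowing (loc_t a, loc_t b)
{
  return (int) (a - b);
}
```

For example, with `a = 1ull << 32` and `b = 0`, `compare_locs` reports that `a` is larger, while the narrowing variant typically yields 0 (the conversion is implementation-defined), which would wrongly suggest the two locations are equal.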
  17. Aug 28, 2024
  18. Aug 26, 2024
    • libcpp: deduplicate definition of padding size · a8260ebe
      Alexander Monakov authored
      Tie together the two functions that ensure tail padding with
      search_line_ssse3 via CPP_BUFFER_PADDING macro.
      
      libcpp/ChangeLog:
      
      	* internal.h (CPP_BUFFER_PADDING): New macro; use it ...
      	* charset.cc (_cpp_convert_input): ...here, and ...
      	* files.cc (read_file_guts): ...here, and ...
      	* lex.cc (search_line_ssse3): ...here.
      a8260ebe
  19. Aug 24, 2024
  20. Aug 23, 2024
  21. Aug 22, 2024
  22. Aug 21, 2024
  23. Aug 20, 2024
    • libcpp: Adjust lang_defaults · 447c32c5
      Jakub Jelinek authored
      The table over the years turned out to be very wide, 147 columns
      and any addition would add a couple of new ones.
      We need a 28x23 bit matrix right now.
      
      This patch changes the formatting, so that we need just 2 columns
      per new feature and so we have some room for expansion.
      In addition, the patch changes it to bitfields, which reduces
      .rodata by 532 bytes (so 5.75x reduction of the variable) and
      on x86_64-linux grows the cpp_set_lang function by 26 bytes (8.4%
      growth).
      
      2024-08-20  Jakub Jelinek  <jakub@redhat.com>
      
      	* init.cc (struct lang_flags): Change all members from char
      	typed fields to unsigned bit-fields.
      	(lang_defaults): Change formatting of the initializer so that it
      	fits to 68 columns rather than 147.
      447c32c5
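The size figures in this message can be reproduced with a small model of the table: 28 rows of 23 flags each. The sketch below is an illustration only. The struct and field names are hypothetical, and the bit-field size relies on the layout used by GCC on common targets, where 23 one-bit `unsigned` bit-fields pack into a single 4-byte unit (bit-field layout is implementation-defined).

```c
#include <assert.h>
#include <stddef.h>

#define N_LANGS 28 /* rows in the table, from the commit message */

/* Before: one char per flag, 23 flags per row (hypothetical layout).  */
struct flags_as_chars
{
  char f[23];
};

/* After: one bit per flag.  Field names are hypothetical.  */
struct flags_as_bits
{
  unsigned f0 : 1, f1 : 1, f2 : 1, f3 : 1, f4 : 1, f5 : 1, f6 : 1, f7 : 1,
	   f8 : 1, f9 : 1, f10 : 1, f11 : 1, f12 : 1, f13 : 1, f14 : 1,
	   f15 : 1, f16 : 1, f17 : 1, f18 : 1, f19 : 1, f20 : 1, f21 : 1,
	   f22 : 1;
};

/* Total bytes for the whole table in each representation.  */
static size_t table_bytes_chars (void)
{
  return N_LANGS * sizeof (struct flags_as_chars);
}

static size_t table_bytes_bits (void)
{
  return N_LANGS * sizeof (struct flags_as_bits);
}
```

Under those assumptions the char version takes 28 x 23 = 644 bytes and the bit-field version 28 x 4 = 112 bytes, a saving of 532 bytes and a ratio of 5.75, which agrees with the numbers quoted in the message.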
    • libcpp: replace SSE4.2 helper with an SSSE3 one · 20a5b482
      Alexander Monakov authored
      Since the characters we are searching for (CR, LF, '\', '?') all have
      distinct ASCII codes mod 16, PSHUFB can help match them all at once.
      
      Directly use the new helper if __SSSE3__ is defined. It makes the other
      helpers unused, so mark them inline to prevent warnings.
      
      Rewrite and simplify init_vectorized_lexer.
      
      libcpp/ChangeLog:
      
      	* config.in: Regenerate.
      	* configure: Regenerate.
      	* configure.ac: Check for SSSE3 instead of SSE4.2.
      	* files.cc (read_file_guts): Bump padding to 64 if HAVE_SSSE3.
      	* lex.cc (search_line_acc_char): Mark inline, not "unused".
      	(search_line_sse2): Mark inline.
      	(search_line_sse42): Replace with...
      	(search_line_ssse3): ... this new function.  Adjust the use...
      	(init_vectorized_lexer): ... here.  Simplify.
      20a5b482
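The observation about distinct codes mod 16 can be shown without any SIMD: a 16-entry table indexed by the low nibble of a byte is enough to test for all four characters with one lookup and one comparison. The following scalar sketch models that idea only; it is not the vectorized code from lex.cc, and `nibble_table`, `is_special` and `find_special` are hypothetical names. PSHUFB performs the same kind of 16-entry lookup on 16 bytes at a time, and yields zero for index bytes with the high bit set, which likewise can never equal such a byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Slot I holds the one searched-for character whose low nibble is I:
   '\n' is 0x0a, '\\' is 0x5c, '\r' is 0x0d and '?' is 0x3f.  Every other
   slot holds a value whose own low nibble differs from I, so it can never
   compare equal to a byte that selected it.  */
static const unsigned char nibble_table[16] = {
  1, 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, '\\', '\r', 0, '?'
};

/* True if C is one of CR, LF, backslash or question mark.  */
static bool is_special (unsigned char c)
{
  return nibble_table[c & 15] == c;
}

/* Index of the first special character in S[0..N), or N if none.  */
static size_t find_special (const unsigned char *s, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (is_special (s[i]))
      break;
  return i;
}
```

A call such as `find_special ((const unsigned char *) "ab?c", 4)` returns 2, the position of the question mark.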
  24. Aug 07, 2024
  25. Aug 06, 2024
    • Remove MMX code path in lexer · eac63be1
      Andi Kleen authored
      Host systems with only MMX and no SSE2 should be really rare now.
      Let's remove the MMX code path to keep the number of custom
      implementations the same.
      
      The SSE2 code path is also somewhat dubious now (nearly everything
      should have SSE4.2 which is >15 years old now), but the SSE2
      code path is used as fallback for others and also apparently
      Solaris uses it due to tool chain deficiencies.
      
      libcpp/ChangeLog:
      
      	* lex.cc (search_line_mmx): Remove function.
      	(init_vectorized_lexer): Remove search_line_mmx.
      eac63be1
  26. Jul 26, 2024
  27. Jul 25, 2024
    • c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343] · 29341f21
      Jakub Jelinek authored
      The following patch implements the easy parts of the paper.
      When @$` are added to the basic character set, it means that
      R"@$`()@$`" should now be valid (here I've noticed most of the
      raw string tests were tested solely with -std=c++11 or -std=gnu++11
      and I've tried to change that), and on the other side even if
      by extension $ is allowed in identifiers, \u0024 or \U00000024
      or \u{24} should not be, similarly how \u0041 is not allowed.
      
      The paper in 3.1 claims though that
       #include <stdio.h>
      
       #define STR(x) #x
      
      int main()
      {
        printf("%s", STR(\u0060)); // U+0060 is ` GRAVE ACCENT
      }
      should have been accepted before this paper (and rejected after it),
      but g++ rejects it.
      
      I've tried to understand it, but am confused on what is the right
      behavior and why.
      
      Consider
       #define STR(x) #x
      const char *a = "\u00b7";
      const char *b = STR(\u00b7);
      const char *c = "\u0041";
      const char *d = STR(\u0041);
      const char *e = STR(a\u00b7);
      const char *f = STR(a\u0041);
      const char *g = STR(a \u00b7);
      const char *h = STR(a \u0041);
      const char *i = "\u066d";
      const char *j = STR(\u066d);
      const char *k = "\u0040";
      const char *l = STR(\u0040);
      const char *m = STR(a\u066d);
      const char *n = STR(a\u0040);
      const char *o = STR(a \u066d);
      const char *p = STR(a \u0040);
      
      Neither clang nor gcc emit any diagnostics on the a, c, i and k
      initializers, those are certainly valid (c is invalid in C23 though).  g++
      emits with -pedantic-errors errors on all the others, while clang++ on the
      ones with STR involving \u0041, \u0040 and a\u066d.  The chosen values are
      \u0040 '@' as something being changed by this paper, \u0041 'A' as basic
      character set char valid in identifiers before/after, \u00b7 as an example
      of character which is pedantically valid in identifiers if not at the start
      and \u066d as something pedantically not valid in identifiers.
      
      Now, https://eel.is/c++draft/lex.charset#6 says that UCN used outside of a
      string/character literal which corresponds to basic character set character
      (or control character) is ill-formed, that would make d, f, h cases invalid
      for C++ and l, n, p cases invalid for C++26.
      
      https://eel.is/c++draft/lex.name states which characters can appear at the
      start of the identifier and which can appear after the start.  And
      https://eel.is/c++draft/lex.pptoken states that preprocessing-token is
      either identifier, or tons of other things, or "each non-whitespace
      character that cannot be one of the above"
      
      Then https://eel.is/c++draft/lex.pptoken#1 says that this last category is
      invalid if the preprocessing token is being converted into token.
      
      And https://eel.is/c++draft/lex.pptoken#2 includes "If any character not in
      the basic character set matches the last category, the program is
      ill-formed."
      
      Now, e.g.  for the C++23 STR(\u0040) case, \u0040 is there not in the basic
      character set, so valid outside of the literals (not the case anymore in
      C++26), but it isn't nondigit and doesn't have XID_Start property, so it
      isn't IMHO an identifier and so must be the "each non-whitespace character
      that cannot be one of the above" case.  Why doesn't the above mentioned
      https://eel.is/c++draft/lex.pptoken#2 sentence make that invalid?  Ignoring
      that, I'd say it would be then stringized and that feels like it is what
      clang++ is doing.  Now, e.g.  for the STR(a\u066d) case, I wonder why that
      isn't lexed as a identifier followed by \u066d "each non-whitespace
      character that cannot be one of the above" token and stringified similarly,
      clang++ rejects that.
      
      What GCC libcpp seems to be doing is that forms_identifier_p calls
      _cpp_valid_utf8 or _cpp_valid_ucn with an argument which tells it is first
      or second+ in identifier, and e.g.  _cpp_valid_ucn then for UCNs valid in
      string literals calls
        else if (identifier_pos)
          {
            int validity = ucn_valid_in_identifier (pfile, result, nst);
      
            if (validity == 0)
              cpp_error (pfile, CPP_DL_ERROR,
                         "universal character %.*s is not valid in an identifier",
                         (int) (str - base), base);
            else if (validity == 2 && identifier_pos == 1)
              cpp_error (pfile, CPP_DL_ERROR,
         "universal character %.*s is not valid at the start of an identifier",
                         (int) (str - base), base);
          }
      so basically all those invalid in identifiers cases emit an error and
      pretend to be valid in identifiers, rather than what e.g.  _cpp_valid_utf8
      does for C but not for C++ and only for the chars completely invalid in
      identifiers rather than just valid in identifiers but not at the start:
                /* In C++, this is an error for invalid character in an identifier
                   because logically, the UTF-8 was converted to a UCN during
                   translation phase 1 (even though we don't physically do it that
                   way).  In C, this byte rather becomes grammatically a separate
                   token.  */
      
                if (CPP_OPTION (pfile, cplusplus))
                  cpp_error (pfile, CPP_DL_ERROR,
                             "extended character %.*s is not valid in an identifier",
                             (int) (*pstr - base), base);
                else
                  {
                    *pstr = base;
                    return false;
                  }
      The comment doesn't really match what is done in recent C++ versions because
      there UCNs are translated to characters and not the other way around.
      
      2024-07-25  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/110343
      libcpp/
      	* lex.cc: C++26 P2558R2 - Add @, $, and ` to the basic character set.
      	(lex_raw_string): For C++26 allow $@` characters in prefix.
      	* charset.cc (_cpp_valid_ucn): For C++26 reject \u0024 in identifiers.
      gcc/testsuite/
      	* c-c++-common/raw-string-1.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-2.c: Likewise.
      	* c-c++-common/raw-string-4.c: Likewise.
      	* c-c++-common/raw-string-5.c: Likewise.  Expect some diagnostics
      	only for non-c++26, for c++26 expect different.
      	* c-c++-common/raw-string-6.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-11.c: Likewise.
      	* c-c++-common/raw-string-13.c: Likewise.
      	* c-c++-common/raw-string-14.c: Likewise.
      	* c-c++-common/raw-string-15.c: Use { c || c++11 } effective target,
      	change c++ specific dg-options to just -Wtrigraphs.
      	* c-c++-common/raw-string-16.c: Likewise.
      	* c-c++-common/raw-string-17.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-18.c: Use { c || c++11 } effective target,
      	remove -std=c++11 from c++ specific dg-options.
      	* c-c++-common/raw-string-19.c: Likewise.
      	* g++.dg/cpp26/raw-string1.C: New test.
      	* g++.dg/cpp26/raw-string2.C: New test.
      29341f21
    • Daily bump. · 25256af1
      GCC Administrator authored
      25256af1
  28. Jul 24, 2024
    • diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4) · 148066bd
      David Malcolm authored
      
      This patch adds support to our SARIF output for cases where
      rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.
      
      In such cases, the pertinent SARIF "location" object gains a property
      bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
      within the location's physical location's "snippet" gains a "rendered"
      property (§3.3.4) that escapes non-ASCII text in the snippet, such as:
      
      "rendered": {"text":
      
      where "text" has a string value such as (for a "trojan source" attack):
      
        "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
        "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
        "  |       |                                       |                           |\n"
        "  |       |                                       |                           end of bidirectional context\n"
        "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"
      
      where the escaping is affected by -fdiagnostics-escape-format=; with
      -fdiagnostics-escape-format=bytes, the rendered text of the above is:
      
        "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
        "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
        "  |       |                                                   |                               |\n"
        "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"
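
      As a rough illustration of the two escape styles shown above, here is a
      minimal C++ sketch.  It is not the code from this patch: the names
      escape_format and escape_non_ascii are hypothetical, and it assumes the
      input is well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical names for illustration; not GCC's actual implementation.
enum class escape_format { unicode, bytes };

// Replace every non-ASCII UTF-8 sequence in TEXT with either <U+XXXX>
// (one escape per code point) or <xx> (one escape per byte).
// Assumes TEXT is well-formed UTF-8.
std::string escape_non_ascii(const std::string &text, escape_format fmt)
{
  std::string out;
  std::size_t i = 0;
  while (i < text.size())
    {
      const unsigned char lead = static_cast<unsigned char>(text[i]);
      if (lead < 0x80)
        {
          out += text[i++];  // ASCII is copied through unchanged.
          continue;
        }
      // Sequence length from the lead byte (2, 3 or 4 bytes).
      const std::size_t len = lead >= 0xF0 ? 4 : lead >= 0xE0 ? 3 : 2;
      assert(i + len <= text.size());  // Well-formed input is assumed.
      char buf[16];
      if (fmt == escape_format::bytes)
        {
          for (std::size_t j = 0; j < len; ++j)
            {
              std::snprintf(buf, sizeof buf, "<%02x>",
                            static_cast<unsigned>(
                              static_cast<unsigned char>(text[i + j])));
              out += buf;
            }
        }
      else
        {
          // Decode the code point: payload bits of the lead byte, then
          // six bits from each continuation byte.
          std::uint32_t cp = lead & (0xFF >> (len + 1));
          for (std::size_t j = 1; j < len; ++j)
            cp = (cp << 6) | (static_cast<unsigned char>(text[i + j]) & 0x3F);
          std::snprintf(buf, sizeof buf, "<U+%04X>", static_cast<unsigned>(cp));
          out += buf;
        }
      i += len;
    }
  return out;
}
```

      With the code-point style, the three bytes e2 80 ae become <U+202E>; with
      the bytes style they become <e2><80><ae>, matching the snippets above.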
      
      The patch also refactors/adds enough selftest machinery to be able to
      test the snippet generation from within the selftest framework, rather
      than just within DejaGnu (where the regex-based testing isn't
      sophisticated enough to verify such properties as the above).
      
      gcc/ChangeLog:
      	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
      	* diagnostic-format-sarif.cc: Include "selftest.h",
      	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
      	"selftest-json.h", and "text-range-label.h".
      	(class content_renderer): New.
      	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
      	(sarif_builder::make_location_object): Add class
      	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
      	pass a nonnull escape_nonascii_renderer to
      	maybe_make_physical_location_object as its snippet_renderer, and
      	add a property bag property "gcc/escapeNonAscii" to the SARIF
      	location object.  For other overloads of make_location_object,
      	pass nullptr for the snippet_renderer.
      	(sarif_builder::maybe_make_region_object_for_context): Add
      	"snippet_renderer" param and pass it to
      	maybe_make_artifact_content_object.
      	(sarif_builder::make_tool_object): Drop "const".
      	(sarif_builder::make_driver_tool_component_object): Likewise.
      	Use typesafe unique_ptr variant of object::set for setting "rules"
      	property on driver_obj.
      	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
      	and use it to potentially set the "rendered" property (§3.3.4).
      	(selftest::test_make_location_object): New.
      	(selftest::diagnostic_format_sarif_cc_tests): New.
      	* diagnostic-show-locus.cc: Include "text-range-label.h" and
      	"selftest-diagnostic-show-locus.h".
      	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
      	New.
      	(selftests::test_layout_x_offset_display_utf8): Use
      	diagnostic_show_locus_fixture to simplify and consolidate setup
      	code.
      	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
      	(selftests::test_one_liner_colorized_utf8): Likewise.
      	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
      	* gcc-rich-location.h (class text_range_label): Move to new file
      	text-range-label.h.
      	* selftest-diagnostic-show-locus.h: New file, based on material in
      	diagnostic-show-locus.cc.
      	* selftest-json.cc: New file.
      	* selftest-json.h: New file.
      	* selftest-run-tests.cc (selftest::run_tests): Call
      	selftest::diagnostic_format_sarif_cc_tests.
      	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.
      
      gcc/testsuite/ChangeLog:
      	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
      	that we have a property bag with property "gcc/escapeNonAscii": true.
      	Verify that we have a "rendered" property for a snippet.
      	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
      	"text-range-label.h".
      
      gcc/ChangeLog:
      	* text-range-label.h: New file, taking class text_range_label from
      	gcc-rich-location.h.
      
      libcpp/ChangeLog:
      	* include/rich-location.h
      	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
      	(rich_location::rich_location): Remove "= delete" from decl of
      	copy ctor.  Add deleted decl of move ctor.
      	(rich_location::operator=): Remove "= delete" from decl of
      	copy assignment.  Add deleted decl of move assignment.
      	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
      	move.
      	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
      	decl of move assignment.
      	* line-map.cc (rich_location::rich_location): New copy ctor.
      	(fixit_hint::fixit_hint): New copy ctor.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      148066bd
  29. Jul 14, 2024
  30. Jul 13, 2024
    • diagnostics: add highlight-a vs highlight-b in colorization and pp_markup · 7d73c01c
      David Malcolm authored
      
      Since r6-4582-g8a64515099e645 (which added class rich_location), ranges
      of quoted source code have been colorized using the following rules:
      - the primary range used the same color of the kind of the diagnostic
      i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta
      respectively)
      - secondary ranges alternate between "range1" and "range2" (defaulting
      to green and blue respectively)
      
      This works for cases with large numbers of highlighted ranges, but is
      suboptimal for common cases.
      
      The following patch adds a pair of color names: "highlight-a" and
      "highlight-b", and uses them whenever it makes sense to highlight and
      contrast two different things in the source code (e.g. a type mismatch).
      These are used by diagnostic-show-locus.cc for highlighting quoted
      source.  In addition the patch adds colorization to fragments within the
      corresponding diagnostic messages themselves, using consistent
      colorization between the message and the quoted source code for the two
      different things being contrasted.
      
      For example, consider:
      
      demo.c: In function ‘test_bad_format_string_args’:
      ../../src/demo.c:25:18: warning: format ‘%i’ expects argument of
        type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=]
         25 |   printf("hello %i", msg);
            |                 ~^   ~~~
            |                  |   |
            |                  int const char *
            |                 %s
      
      Previously, the types within the message in quotes would be in bold but
      not colorized, and the labelled ranges of quoted source code would use
      bold magenta for the "int" and non-bold green for the "const char *".
      
      With this patch:
      - the "%i" and "int" in the message and the "int" in the quoted source
        are all colored bold green
      - the "const char *" in the message and in the quoted source are both
        colored bold blue
      so that the consistent use of contrasting color draws the reader's eyes
      to the relationships between the diagnostic message and the source.
      
      I've tried this with gnome-terminal with many themes, including a
      variety of light versus dark backgrounds, solarized versus non-solarized
      themes, etc, and it was readable in all.
      
      My initial version of the patch used the existing %r and %R facilities
      within pretty-print.cc for the messages, but this turned out to be very
      uncomfortable, leading to error-prone format strings such as:
      
        error_at (richloc,
                  "invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)",
                  opname,
                  "highlight-a", type0,
                  "highlight-b", type1);
      
      To avoid requiring monstrosities such as the above, the patch adds a new
      "%e" format code to pretty-print.cc, which expects a pp_element *, where
      pp_element is a new abstract base class (actually a pp_markup::element),
      along with various useful subclasses.  This lets the above be written
      as:
      
        pp_markup::element_quoted_type element_0 (type0, highlight_colors::lhs);
        pp_markup::element_quoted_type element_1 (type1, highlight_colors::rhs);
        error_at (richloc,
                  "invalid operands to binary %s (have %e and %e)",
                  opname, &element_0, &element_1);
      
      which I feel is maintainable and clear to translators; the use of %e and
      pp_element * captures the type-unsafe part of the variadic call, and the
      subclasses allow for type-safety (so e.g. an element_quoted_type expects
      a type and a highlighting color).  This approach allows for some nice
      simplifications within c-format.cc.
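
      The shape of this design can be sketched in standalone C++.  This is only
      an illustration under stated assumptions, not the code in
      pretty-print.cc: the sketch namespace, the quoted_type class, the toy
      format function, and the bracketed color tags are all hypothetical
      stand-ins for pp_markup::element, element_quoted_type, the real
      formatter, and terminal color codes.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>
#include <utility>

namespace sketch {

// Stand-in for an abstract markup element: it knows how to print itself.
class element
{
public:
  virtual ~element() = default;
  virtual void add_to(std::string &out) const = 0;
};

// Stand-in for a "quoted type" element: the constructor is type-safe,
// taking a type name and a highlight color name.
class quoted_type : public element
{
public:
  quoted_type(std::string type_name, const char *color)
    : m_type_name(std::move(type_name)), m_color(color) {}

  void add_to(std::string &out) const override
  {
    // Bracketed tags stand in for real terminal color escapes.
    out += "[";
    out += m_color;
    out += "]'";
    out += m_type_name;
    out += "'[/]";
  }

private:
  std::string m_type_name;
  const char *m_color;
};

// Toy formatter that understands only %s and %e.  The variadic call is
// the type-unsafe part: each %e must be passed a "const element *".
std::string format(const char *fmt, ...)
{
  va_list ap;
  va_start(ap, fmt);
  std::string out;
  for (const char *p = fmt; *p; ++p)
    {
      if (*p != '%' || p[1] == '\0')
        {
          out += *p;
          continue;
        }
      ++p;
      switch (*p)
        {
        case 's':
          out += va_arg(ap, const char *);
          break;
        case 'e':
          va_arg(ap, const element *)->add_to(out);
          break;
        default:
          out += '%';
          out += *p;
          break;
        }
    }
  va_end(ap);
  return out;
}

} // namespace sketch
```

      The format string stays free of color names, and each element carries
      its own color, so a caller only passes one pointer per %e.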
      
      The patch also extends -Wformat to "teach" it about the new %e and
      pp_element *.  Doing so requires c-format.cc to be able to determine
      if a T * is a pp_element * (i.e. if T is a subclass).  To do so I added
      a new comp_types callback for comparing types, where the C++ frontend
      supplies a suitable implementation (and %e will always be wrong for C).
      
      I've manually tested this on many diagnostics with both C and C++ and it
      seems a subtle but significant improvement in readability.
      
      I've added a new option -fno-diagnostics-show-highlight-colors in case
      people prefer the old behavior.
      
      gcc/c-family/ChangeLog:
      	* c-common.cc: Include "tree-pretty-print-markup.h".
      	(binary_op_error): Use pp_markup::element_quoted_type and %e.
      	(check_function_arguments): Add "comp_types" param and pass it to
      	check_function_format.
      	* c-common.h (check_function_arguments): Add "comp_types" param.
      	(check_function_format): Likewise.
      	* c-format.cc: Include "tree-pretty-print-markup.h".
      	(local_pp_element_ptr_node): New.
      	(PP_FORMAT_CHAR_TABLE): Add entry for %e.
      	(struct format_check_context): Add "m_comp_types" field.
      	(check_function_format): Add "comp_types" param and pass it to
      	check_format_info.
      	(check_format_info): Likewise, passing it to format_ctx's ctor.
      	(check_format_arg): Extract m_comp_types from format_ctx and
      	pass it to check_format_info_main.
      	(check_format_info_main): Add "comp_types" param and pass it to
      	arg_parser's ctor.
      	(class argument_parser): Add "m_comp_types" field.
      	(argument_parser::check_argument_type): Pass m_comp_types to
      	check_format_types.
      	(handle_subclass_of_pp_element_p): New.
      	(check_format_types): Add "comp_types" param, and use it to
      	call handle_subclass_of_pp_element_p.
      	(class element_format_substring): New.
      	(class element_expected_type_with_indirection): New.
      	(format_type_warning): Use element_expected_type_with_indirection
      	to unify the if (wanted_type_name) branches, reducing from four
      	emit_warning calls to two.  Simplify these further using %e.
      	Doing so also gives suitable colorization of the text within the
      	diagnostics.
      	(init_dynamic_diag_info): Initialize local_pp_element_ptr_node.
      	(selftest::test_type_mismatch_range_labels): Add nullptr for new
      	param of gcc_rich_location label overload.
      	* c-format.h (T_PP_ELEMENT_PTR): New.
      	* c-type-mismatch.cc: Include "diagnostic-highlight-colors.h".
      	(binary_op_rich_location::binary_op_rich_location): Use
      	highlight_colors::lhs and highlight_colors::rhs for the ranges.
      	* c-type-mismatch.h (class binary_op_rich_location): Add comment
      	about highlight_colors.
      
      gcc/c/ChangeLog:
      	* c-objc-common.cc: Include "tree-pretty-print-markup.h".
      	(print_type): Add optional "highlight_color" param and use it
      	to show highlight colors in "aka" text.
      	(pp_markup::element_quoted_type::print_type): New.
      	* c-typeck.cc: Include "tree-pretty-print-markup.h".
      	(comp_parm_types): New.
      	(build_function_call_vec): Pass it to check_function_arguments.
      	(inform_for_arg): Use %e and highlight colors to contrast actual
      	versus expected.
      	(convert_for_assignment): Use highlight_colors::actual for the
      	rhs_label.
      	(build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs
      	for the ranges.
      
      gcc/ChangeLog:
      	* common.opt (fdiagnostics-show-highlight-colors): New option.
      	* common.opt.urls: Regenerate.
      	* coretypes.h (pp_markup::element): New forward decl.
      	(pp_element): New typedef.
      	* diagnostic-color.cc (gcc_color_defaults): Add "highlight-a"
      	and "highlight-b".
      	* diagnostic-format-json.cc (diagnostic_output_format_init_json):
      	Disable highlight colors.
      	* diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif):
      	Likewise.
      	* diagnostic-highlight-colors.h: New file.
      	* diagnostic-path.cc (struct event_range): Pass nullptr for
      	highlight color of m_rich_loc.
      	* diagnostic-show-locus.cc (colorizer::set_range): Handle ranges
      	with m_highlight_color.
      	(colorizer::STATE_NAMED_COLOR): New.
      	(colorizer::m_richloc): New field.
      	(colorizer::colorizer): Add richloc param for initializing
      	m_richloc.
      	(colorizer::set_named_color): New.
      	(colorizer::begin_state): Add case STATE_NAMED_COLOR.
      	(layout::layout): Pass richloc to m_colorizer's ctor.
      	(selftest::test_one_liner_labels): Pass nullptr for new param of
      	gcc_rich_location ctor for labels.
      	(selftest::test_one_liner_labels_utf8): Likewise.
      	* diagnostic.h (diagnostic_context::set_show_highlight_colors):
      	New.
      	* doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors
      	and highlight-a and highlight-b color caps.
      	* doc/ux.texi
      	(Use color consistently when highlighting mismatches): New
      	subsection.
      	* gcc-rich-location.cc (gcc_rich_location::add_expr): Add
      	"highlight_color" param.
      	(gcc_rich_location::maybe_add_expr): Likewise.
      	* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
      	Split out into a pair of ctors, where if a range_label is supplied
      	the caller must also supply a highlight color.
      	(gcc_rich_location::add_expr): Add "highlight_color" param.
      	(gcc_rich_location::maybe_add_expr): Likewise.
      	* gcc.cc (driver_handle_option): Handle
      	OPT_fdiagnostics_show_highlight_colors.
      	* lto-wrapper.cc (merge_and_complain): Likewise.
      	(append_compiler_options): Likewise.
      	(append_diag_options): Likewise.
      	(run_gcc): Likewise.
      	* opts-common.cc (decode_cmdline_options_to_array): Add comment
      	about -fno-diagnostics-show-highlight-colors.
      	* opts-global.cc (init_options_once): Preserve
      	pp_show_highlight_colors in case the global_dc's printer is
      	recreated.
      	* opts.cc (common_handle_option): Handle
      	OPT_fdiagnostics_show_highlight_colors.
      	(gen_command_line_string): Likewise.
      	* pretty-print-markup.h: New file.
      	* pretty-print.cc: Include "pretty-print-markup.h" and
      	"diagnostic-highlight-colors.h".
      	(pretty_printer::format): Handle %e.
      	(pretty_printer::pretty_printer): Handle new field
      	m_show_highlight_colors.
      	(pp_string_n): New.
      	(pp_markup::context::begin_quote): New.
      	(pp_markup::context::end_quote): New.
      	(pp_markup::context::begin_color): New.
      	(pp_markup::context::end_color): New.
      	(highlight_colors::expected): New.
      	(highlight_colors::actual): New.
      	(highlight_colors::lhs): New.
      	(highlight_colors::rhs): New.
      	(class selftest::test_element): New.
      	(selftest::test_pp_format): Add tests of %e.
      	(selftest::test_urlification): Likewise.
      	* pretty-print.h (pp_markup::context): New forward decl.
      	(class chunk_info): Add friend class pp_markup::context.
      	(class pretty_printer): Add friend pp_show_highlight_colors.
      	(pretty_printer::m_show_highlight_colors): New field.
      	(pp_show_highlight_colors): New inline function.
      	(pp_string_n): New decl.
      	* substring-locations.cc: Include "diagnostic-highlight-colors.h".
      	(format_string_diagnostic_t::highlight_color_format_string): New.
      	(format_string_diagnostic_t::highlight_color_param): New.
      	(format_string_diagnostic_t::emit_warning_n_va): Use highlight
      	colors.
      	* substring-locations.h
      	(format_string_diagnostic_t::highlight_color_format_string): New.
      	(format_string_diagnostic_t::highlight_color_param): New.
      	* toplev.cc (general_init): Initialize global_dc's
      	show_highlight_colors.
      	* tree-pretty-print-markup.h: New file.
      
      gcc/cp/ChangeLog:
      	* call.cc: Include "tree-pretty-print-markup.h".
      	(implicit_conversion_error): Use highlight_colors::percent_h for
      	the labelled range.
      	(op_error_string): Split out into...
      	(concat_op_error_string): ...this.
      	(binop_error_string): New.
      	(op_error): Use %e, binop_error_string, highlight_colors::lhs,
      	and highlight_colors::rhs.
      	(maybe_inform_about_fndecl_for_bogus_argument_init): Add
      	"highlight_color" param; use it for the richloc.
      	(convert_like_internal): Use highlight_colors::percent_h for the
      	labelled_range, and highlight_colors::percent_i for the call to
      	maybe_inform_about_fndecl_for_bogus_argument_init.
      	(build_over_call): Pass cp_comp_parm_types for new "comp_types"
      	param of check_function_arguments.
      	(complain_about_bad_argument): Use highlight_colors::percent_h for
      	the labelled_range, and highlight_colors::percent_i for the call
      	to maybe_inform_about_fndecl_for_bogus_argument_init.
      	* cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init):
      	Add optional highlight_color param.
      	(cp_comp_parm_types): New decl.
      	(highlight_colors::const percent_h): New decl.
      	(highlight_colors::const percent_i): New decl.
      	* error.cc: Include "tree-pretty-print-markup.h".
      	(highlight_colors::const percent_h): New defn.
      	(highlight_colors::const percent_i): New defn.
      	(type_to_string): Add param "highlight_color" and use it.
      	(print_nonequal_arg): Likewise.
      	(print_template_differences): Add params "highlight_color_a" and
      	"highlight_color_b".
      	(type_to_string_with_compare): Add params "this_highlight_color"
      	and "peer_highlight_color".
      	(print_template_tree_comparison): Add params "highlight_color_a"
      	and "highlight_color_b".
      	(cxx_format_postprocessor::handle):
      	Use highlight_colors::percent_h and highlight_colors::percent_i.
      	(pp_markup::element_quoted_type::print_type): New.
      	(range_label_for_type_mismatch::get_text): Pass nullptr for new
      	params of type_to_string_with_compare.
      	* typeck.cc (cp_comp_parm_types): New.
      	(cp_build_function_call_vec): Pass it to check_function_arguments.
      	(convert_for_assignment): Use highlight_colors::percent_h for the
      	labelled_range.
      
      gcc/testsuite/ChangeLog:
      	* g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test.
      	* g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test.
      	* g++.dg/plugin/plugin.exp (plugin_test_list): Add
      	show-template-tree-color-no-highlight-colors.C to
      	show_template_tree_color_plugin.c.
      	* g++.dg/plugin/show-template-tree-color-labels.C: Update expected
      	output to reflect use of highlight-a and highlight-b to contrast
      	mismatches.
      	* g++.dg/plugin/show-template-tree-color-no-elide-type.C:
      	Likewise.
      	* g++.dg/plugin/show-template-tree-color-no-highlight-colors.C:
      	New test.
      	* g++.dg/plugin/show-template-tree-color.C: Update expected output
      	to reflect use of highlight-a and highlight-b to contrast
      	mismatches.
      	* g++.dg/warn/Wformat-gcc_diag-1.C: New test.
      	* g++.dg/warn/Wformat-gcc_diag-2.C: New test.
      	* g++.dg/warn/Wformat-gcc_diag-3.C: New test.
      	* gcc.dg/bad-binary-ops-highlight-colors.c: New test.
      	* gcc.dg/format/colors.c: New test.
      	* gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass
      	nullptr for new param of gcc_rich_location::add_expr.
      
      libcpp/ChangeLog:
      	* include/rich-location.h (location_range::m_highlight_color): New
      	field.
      	(rich_location::rich_location): Add optional label_highlight_color
      	param.
      	(rich_location::set_highlight_color): New decl.
      	(rich_location::add_range): Add optional label_highlight_color
      	param.
      	(rich_location::set_range): Likewise.
      	* line-map.cc (rich_location::rich_location): Add
      	"label_highlight_color" param and pass it to add_range.
      	(rich_location::set_highlight_color): New.
      	(rich_location::add_range): Add "label_highlight_color" param.
      	(rich_location::set_range): Add "highlight_color" param.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      7d73c01c
  31. Jun 22, 2024
  32. Jun 21, 2024
    • diagnostics: fixes to SARIF output [PR109360] · 9f4fdc3a
      David Malcolm authored
      
      When adding validation of .sarif files against the schema
      (PR testsuite/109360) I discovered various issues where we were
      generating invalid .sarif files.
      
      Specifically, in
        c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
      the relatedLocations for the "note" diagnostics were missing column
      numbers, leading to validation failure due to non-unique elements,
      such as multiple:
      	"message": {"text": "invalid UTF-8 character <bf>"}},
      on line 25 with no column information.
      
      Root cause is that for some diagnostics in libcpp we have a location_t
      representing the line as a whole, setting a column_override on the
      rich_location (since the line hasn't been fully read yet).  We were
      handling this column override for plain text output, but not for .sarif
      output.
      
      Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
      emitted on "line 0" of the file, whereas SARIF requires line numbers to
      be positive.
      
      We also use column == 0 internally to mean "the line as a whole",
      whereas SARIF requires column numbers to be positive.
      
      This patch fixes these various issues.
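
      The rule described above can be sketched as follows.  This is a
      simplified illustration, not the code in diagnostic-format-sarif.cc:
      sarif_region and make_region are hypothetical names, a std::map stands
      in for a JSON object, and only the start line and start column are
      modeled.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

// Hypothetical stand-in for a SARIF "region" JSON object.
using sarif_region = std::map<std::string, int>;

// Build a region, using COLUMN_OVERRIDE when the location's own column
// is 0 ("the line as a whole"), and omitting any property whose value
// is not positive.
sarif_region make_region(int line, int column, int column_override)
{
  sarif_region region;
  if (column == 0)
    column = column_override;
  if (line > 0)
    region["startLine"] = line;
  if (column > 0)
    region["startColumn"] = column;
  return region;
}

} // namespace sketch
```

      For instance, a note on line 25 with column 0 and an override of 7 would
      get both "startLine" and "startColumn", while a location on "line 0"
      with no column would produce an empty region.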
      
      gcc/ChangeLog:
      	PR testsuite/109360
      	* diagnostic-format-sarif.cc
      	(sarif_builder::make_location_object): Pass any column override
      	from rich_loc to maybe_make_physical_location_object.
      	(sarif_builder::maybe_make_physical_location_object): Add
      	"column_override" param and pass it to maybe_make_region_object.
      	(sarif_builder::maybe_make_region_object): Add "column_override"
      	param and use it when the location has 0 for a column.  Don't
      	add "startLine", "startColumn", "endLine", or "endColumn" if
      	the values aren't positive.
      	(sarif_builder::maybe_make_region_object_for_context): Don't
      	add "startLine" or "endLine" if the values aren't positive.
      
      libcpp/ChangeLog:
      	PR testsuite/109360
      	* include/rich-location.h (rich_location::get_column_override):
      	New accessor.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      9f4fdc3a