Commits · 97c308704f7dd03b87c32b004f989c22cffefcd7 · COBOLworx / gcc-cobol

Jan 02, 2025
- Update copyright years. · 29bc14c7
  Jakub Jelinek authored 2 months ago
  
  29bc14c7
- Update copyright years. · 6441eb6d
  Jakub Jelinek authored 2 months ago
  
  6441eb6d
- Update Copyright year in ChangeLog files · 9cf2fb5d
  Jakub Jelinek authored 2 months ago
  
  2024 -> 2025
  9cf2fb5d
Dec 25, 2024
- Daily bump. · c6b7d034
  GCC Administrator authored 2 months ago
  
  c6b7d034
Dec 24, 2024

libcpp: Fix overly large buffer allocation · 27af1a14

Lewis Hyatt authored 5 months ago

It seems that tokens_buff_new() has always been allocating the virtual
location buffer 4 times larger than intended, and now that location_t is
64-bit, it is 8 times larger. Fixed.

libcpp/ChangeLog:

	* macro.cc (tokens_buff_new): Fix length argument to XNEWVEC.

27af1a14
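
A minimal sketch of the class of bug this commit describes, assuming an XNEWVEC-style macro already multiplies its length argument by the element size. The names vec_bytes, buggy_request and fixed_request are hypothetical and are not taken from macro.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the number of bytes an XNEWVEC-style macro
// requests: the macro itself scales the element count by sizeof (T).
template <typename T>
inline std::size_t vec_bytes (std::size_t count)
{
  assert (count <= SIZE_MAX / sizeof (T));   // no overflow in the product
  return sizeof (T) * count;
}

// Buggy pattern: the caller pre-multiplies by sizeof (T), so the element
// size is applied twice.
template <typename T>
inline std::size_t buggy_request (std::size_t len)
{
  return vec_bytes<T> (len * sizeof (T));
}

// Fixed pattern: pass the element count only.
template <typename T>
inline std::size_t fixed_request (std::size_t len)
{
  return vec_bytes<T> (len);
}
```

Under that assumption the over-allocation factor equals the element size, which would be consistent with the 4x figure for a 32-bit location_t and the 8x figure for a 64-bit one.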

Dec 17, 2024
- Daily bump. · 733edbfd
  GCC Administrator authored 3 months ago
  
  733edbfd
Dec 16, 2024
- Update cpplib sr.po · 62597d19
  Joseph Myers authored 3 months ago
  
  * sr.po: Update.
  62597d19
Dec 09, 2024
- Daily bump. · a41b1a00
  GCC Administrator authored 3 months ago
  
  a41b1a00
Dec 08, 2024

Support for 64-bit location_t: Activate 64-bit location_t · d9cdc500

Lewis Hyatt authored 4 months ago

Change location_t to be a 64-bit integer instead of a 32-bit integer in
libcpp.

Also included in this change are the two other patches in the original
series which depended on this one; I am committing them all at once in case
it needs to be reverted later:

-Support for 64-bit location_t: gimple parts

The size of struct gimple increased by 8 bytes with the change in size of
location_t from 32- to 64-bit; adjust the WORD markings in the comments
accordingly. It seems that most of the WORD markings were off by one already,
probably not having been updated after a previous reduction in the size of a
gimple, so they have become retroactively correct again, and only a couple
needed adjustment actually.

Also add a comment that there are now 32 bits of unused padding available in
struct gimple for 64-bit hosts.

-Support for 64-bit location_t: Remove -flarge-source-files

The option -flarge-source-files became unnecessary with 64-bit location_t
and harms performance compared to the new default setting, so silently
ignore it.

libcpp/ChangeLog:

	* include/cpplib.h (struct cpp_token): Adjust comment about the
	struct size.
	* include/line-map.h (location_t): Change typedef from 32-bit to 64-bit
	integer.
	(LINE_MAP_MAX_COLUMN_NUMBER): Increase size to be appropriate for
	64-bit location_t.
	(LINE_MAP_MAX_LOCATION_WITH_PACKED_RANGES): Likewise.
	(LINE_MAP_MAX_LOCATION_WITH_COLS): Likewise.
	(LINE_MAP_MAX_LOCATION): Likewise.
	(MAX_LOCATION_T): Likewise.
	(line_map_suggested_range_bits): Likewise.
	(struct line_map): Adjust comment about the struct size.
	(struct line_map_macro): Likewise.
	(struct line_map_ordinary): Likewise. Rearrange fields to optimize
	padding.

gcc/testsuite/ChangeLog:

	* g++.dg/diagnostic/pr77949.C: Adapt the test for 64-bit location_t,
	when the previously expected failure doesn't actually happen.
	* g++.dg/modules/loc-prune-4.C: Adjust the expected output for the
	64-bit location_t case.
	* gcc.dg/plugin/expensive_selftests_plugin.cc: Don't try to test
	the maximum supported column number in 64-bit location_t mode.
	* gcc.dg/plugin/location_overflow_plugin.cc: Adjust the base_location
	so it can effectively test 64-bit location_t.

gcc/ChangeLog:

	* gimple.h (struct gphi): Update word marking comments to reflect
	the new size of location_t.
	(struct gimple): Likewise. Add a comment about padding.
	* common.opt: Mark -flarge-source-files as Ignored.
	* common.opt.urls: Regenerate.
	* doc/invoke.texi: Remove -flarge-source-files.
	* toplev.cc (process_options): Remove support for
	-flarge-source-files.

d9cdc500
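
The ChangeLog above mentions rearranging fields "to optimize padding" once location_t becomes 64-bit. The following is only an illustrative sketch of that general effect; the structs interleaved and rearranged are hypothetical and do not reproduce the real line_map_ordinary layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using loc64 = std::uint64_t;   // stands in for a 64-bit location_t

// A 64-bit member sandwiched between two 32-bit members.
struct interleaved
{
  std::uint32_t flags;   // usually followed by padding up to the next 8 bytes
  loc64 start;           // 8-byte aligned on most 64-bit hosts
  std::uint32_t depth;   // usually followed by tail padding
};

// The same members with the widest one first.
struct rearranged
{
  loc64 start;
  std::uint32_t flags;
  std::uint32_t depth;   // the two 32-bit members can share one 8-byte slot
};

// Bytes saved per object by the reordering on the current host.
inline std::size_t padding_saved ()
{
  assert (sizeof (rearranged) <= sizeof (interleaved));
  return sizeof (interleaved) - sizeof (rearranged);
}
```

On a typical LP64 host the two sizes would be expected to be 24 and 16 bytes; on hosts where a 64-bit integer only needs 4-byte alignment the saving may be zero.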

Dec 07, 2024
- Daily bump. · 2e02cdbc
  GCC Administrator authored 3 months ago
  
  2e02cdbc
Dec 06, 2024

libcpp, c++: Optimize initializers using #embed in C++ · 0223119f

Jakub Jelinek authored 3 months ago

This patch adds similar optimizations to the C++ FE as have been
implemented earlier in the C FE.
The libcpp hunk enables use of CPP_EMBED token even for C++, not just
C; the preprocessor guarantees there is always a CPP_NUMBER CPP_COMMA
before CPP_EMBED and CPP_COMMA CPP_NUMBER after it which simplifies
parsing (unless #embed is more than 2GB, in that case it could be
CPP_NUMBER CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED CPP_COMMA CPP_EMBED
CPP_COMMA CPP_NUMBER etc. with each CPP_EMBED covering at most INT_MAX
bytes).
Similarly to the C patch, this patch parses it into RAW_DATA_CST tree
in the braced initializers (and from there peels into INTEGER_CSTs unless
it is an initializer of an std::byte array or integral array with CHAR_BIT
element precision), parses CPP_EMBED in cp_parser_expression into just
the last INTEGER_CST in it because I think users don't need millions of
-Wunused-value warnings because they did useless
  int a = (
  #embed "megabyte.dat"
  );
and so most of the inner INTEGER_CSTs would be there just for the warning,
and in the rest of contexts (like template argument list, function argument
list, attribute argument list, ...) parses it into a sequence of INTEGER_CSTs
(I wrote range/iterator classes to simplify that).

My dumb
cat embed-11.c
constexpr unsigned char a[] = {
  #embed "cc1plus"
};
const unsigned char *b = a;
testcase where cc1plus is 492329008 bytes long when configured
--enable-checking=yes,rtl,extra against recent binutils with .base64 gas
support results in:
time ./xg++ -B ./ -S -O2 embed-11.c

real    0m4.350s
user    0m2.427s
sys     0m0.830s
time ./xg++ -B ./ -c -O2 embed-11.c

real    0m6.932s
user    0m6.034s
sys     0m0.888s
(compared to running out of memory or very long compilation).
On a shorter inclusion,
cat embed-12.c
constexpr unsigned char a[] = {
  #embed "xg++"
};
const unsigned char *b = a;
where xg++ is 15225904 bytes long, this takes using GCC with the #embed
patchset except for this patch:
time ~/src/gcc/obj36/gcc/xg++ -B ~/src/gcc/obj36/gcc/ -S -O2 embed-12.c

real    0m33.190s
user    0m32.327s
sys     0m0.790s
and with this patch:
time ./xg++ -B ./ -S -O2 embed-12.c

real    0m0.118s
user    0m0.090s
sys     0m0.028s

The patch doesn't change anything on what the first patch in the series
introduces even for C++, namely that #embed is expanded (actually or as if)
into a sequence of literals like
127,69,76,70,2,1,1,3,0,0,0,0,0,0,0,0,2,0,62,0,1,0,0,0,80,211,64,0,0,0,0,0,64,0,0,0,0,0,0,0,8,253
and so each element has int type.
That is how I believe it is in C23, and the different versions of the
C++ P1967 paper specified there some casts, P1967R12 in particular
"Otherwise, the integral constant expression is the value of std::fgetc’s return is cast
to unsigned char."
but please see
https://github.com/llvm/llvm-project/pull/97274#issuecomment-2230929277
comment and whether we really want the preprocessor to preprocess it for
C++ as (or as-if)
static_cast<unsigned char>(127),static_cast<unsigned char>(69),static_cast<unsigned char>(76),static_cast<unsigned char>(70),static_cast<unsigned char>(2),...
i.e. 9 tokens per byte rather than 2, or
(unsigned char)127,(unsigned char)69,...
or
((unsigned char)127),((unsigned char)69),...
etc.
Without a literal suffix for unsigned char constant literals it is horrible,
plus the incompatibility between C and C++.  Sure, we could use the magic
form more often for C++ to save the size and do the 9 or how many tokens
form only for the boundary constants and use #embed "." __gnu__::__base64__("...")
for what is in between if there are at least 2 tokens inside of it.
E.g. (unsigned char)127 vs. static_cast<unsigned char>(127) behaves
differently if there is constexpr long long p[] = { ... };
...
  #embed __FILE__
[p]

2024-12-06  Jakub Jelinek  <jakub@redhat.com>

libcpp/
	* files.cc (finish_embed): Use CPP_EMBED even for C++.
gcc/
	* tree.h (RAW_DATA_UCHAR_ELT, RAW_DATA_SCHAR_ELT): Define.
gcc/cp/ChangeLog:
	* cp-tree.h (class raw_data_iterator): New type.
	(class raw_data_range): New type.
	* parser.cc (cp_parser_postfix_open_square_expression): Handle
	parsing of CPP_EMBED.
	(cp_parser_parenthesized_expression_list): Likewise.  Use
	cp_lexer_next_token_is.
	(cp_parser_expression): Handle parsing of CPP_EMBED.
	(cp_parser_template_argument_list): Likewise.
	(cp_parser_initializer_list): Likewise.
	(cp_parser_oacc_clause_tile): Likewise.
	(cp_parser_omp_tile_sizes): Likewise.
	* pt.cc (tsubst_expr): Handle RAW_DATA_CST.
	* constexpr.cc (reduced_constant_expression_p): Likewise.
	(raw_data_cst_elt): New function.
	(find_array_ctor_elt): Handle RAW_DATA_CST.
	(cxx_eval_array_reference): Likewise.
	* typeck2.cc (digest_init_r): Emit -Wnarrowing and/or -Wconversion
	diagnostics.
	(process_init_constructor_array): Handle RAW_DATA_CST.
	* decl.cc (maybe_deduce_size_from_array_init): Likewise.
	(is_direct_enum_init): Fail for RAW_DATA_CST.
	(cp_maybe_split_raw_data): New function.
	(consume_init): New function.
	(reshape_init_array_1): Add VECTOR_P argument.  Handle RAW_DATA_CST.
	(reshape_init_array): Adjust reshape_init_array_1 caller.
	(reshape_init_vector): Likewise.
	(reshape_init_class): Handle RAW_DATA_CST.
	(reshape_init_r): Likewise.
gcc/testsuite/
	* c-c++-common/cpp/embed-22.c: New test.
	* c-c++-common/cpp/embed-23.c: New test.
	* g++.dg/cpp/embed-4.C: New test.
	* g++.dg/cpp/embed-5.C: New test.
	* g++.dg/cpp/embed-6.C: New test.
	* g++.dg/cpp/embed-7.C: New test.
	* g++.dg/cpp/embed-8.C: New test.
	* g++.dg/cpp/embed-9.C: New test.
	* g++.dg/cpp/embed-10.C: New test.
	* g++.dg/cpp/embed-11.C: New test.
	* g++.dg/cpp/embed-12.C: New test.
	* g++.dg/cpp/embed-13.C: New test.
	* g++.dg/cpp/embed-14.C: New test.

0223119f
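
The token layout described in this commit message (a CPP_NUMBER on each side, with CPP_EMBED tokens of at most INT_MAX bytes in between) can be modelled roughly as below. This is a sketch under the assumption that exactly one leading and one trailing byte are emitted as CPP_NUMBER; the function embed_token_count is hypothetical, and short inclusions, which may be emitted differently, are not modelled.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Number of CPP_EMBED tokens needed for an inclusion of TOTAL_BYTES bytes,
// assuming the first and last byte become CPP_NUMBER tokens and each
// CPP_EMBED token covers at most INT_MAX of the remaining bytes.
inline std::uint64_t embed_token_count (std::uint64_t total_bytes)
{
  assert (total_bytes >= 3);   // need at least one byte between the numbers
  const std::uint64_t inner = total_bytes - 2;
  const std::uint64_t chunk = INT_MAX;
  return (inner + chunk - 1) / chunk;   // ceiling division
}
```

With this model the 492329008-byte cc1plus example fits in a single CPP_EMBED token, and a second token only appears once the inner part exceeds INT_MAX bytes.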

Dec 04, 2024
- Daily bump. · f36cb8c7
  GCC Administrator authored 3 months ago
  
  f36cb8c7
Dec 03, 2024

preprocessor: Adjust C rules on UCNs for C23 [PR117162] · f3b5de94

Joseph Myers authored 3 months ago

As noted in bug 117162, C23 changed some rules on UCNs to match C++
(this was a late change agreed in the resolution to CD2 comment
US-032, implementing changes from N3124), which we need to implement.

Allow UCNs below 0xa0 outside identifiers for C, with a
pedwarn-if-pedantic before C23 (and a warning with -Wc11-c23-compat)
except for the always-allowed cases of UCNs for $ @ `.  Also as part
of that change, do not allow \u0024 in identifiers as equivalent to $
for C23.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

	PR c/117162

libcpp/
	* include/cpplib.h (struct cpp_options): Add low_ucns.
	* init.cc (struct lang_flags, lang_defaults): Add low_ucns.
	(cpp_set_lang): Set low_ucns.
	* charset.cc (_cpp_valid_ucn): For C, allow UCNs below 0xa0
	outside identifiers, with a pedwarn if pedantic before C23 or a
	warning with -Wc11-c23-compat.  Do not allow \u0024 in identifiers
	for C23.

gcc/testsuite/
	* gcc.dg/cpp/c17-ucn-1.c, gcc.dg/cpp/c17-ucn-2.c,
	gcc.dg/cpp/c17-ucn-3.c, gcc.dg/cpp/c17-ucn-4.c,
	gcc.dg/cpp/c23-ucn-2.c, gcc.dg/cpp/c23-ucnid-2.c: New tests.
	* c-c++-common/cpp/delimited-escape-seq-3.c,
	c-c++-common/cpp/named-universal-char-escape-3.c,
	gcc.dg/cpp/c23-ucn-1.c, gcc.dg/cpp/c2y-delimited-escape-seq-3.c:
	Update expected messages.
	* gcc.dg/cpp/ucs.c: Use -pedantic-errors.  Update expected
	messages.

f3b5de94

Nov 29, 2024
- Daily bump. · 52e56eef
  GCC Administrator authored 3 months ago
  
  52e56eef
Nov 28, 2024

diagnostics: replace %<%s%> with %qs [PR104896] · 9f06b910

David Malcolm authored 3 months ago


No functional change intended.

gcc/analyzer/ChangeLog:
	PR c/104896
	* sm-malloc.cc: Replace "%<%s%>" with "%qs" in message wording.

gcc/c-family/ChangeLog:
	PR c/104896
	* c-lex.cc (c_common_lex_availability_macro): Replace "%<%s%>"
	with "%qs" in message wording.
	* c-opts.cc (c_common_handle_option): Likewise.
	* c-warn.cc (warn_parm_array_mismatch): Likewise.

gcc/ChangeLog:
	PR c/104896
	* common/config/ia64/ia64-common.cc (ia64_handle_option): Replace
	"%<%s%>" with "%qs" in message wording.
	* common/config/rs6000/rs6000-common.cc (rs6000_handle_option):
	Likewise.
	* config/aarch64/aarch64.cc (aarch64_validate_sls_mitigation):
	Likewise.
	(aarch64_override_options): Likewise.
	(aarch64_process_target_attr): Likewise.
	* config/arm/aarch-common.cc (aarch_validate_mbranch_protection):
	Likewise.
	* config/pru/pru.cc (pru_insert_attributes): Likewise.
	* config/riscv/riscv-target-attr.cc
	(riscv_target_attr_parser::parse_arch): Likewise.
	* omp-general.cc (oacc_verify_routine_clauses): Likewise.
	* tree-ssa-uninit.cc (maybe_warn_read_write_only): Likewise.
	(maybe_warn_pass_by_reference): Likewise.

gcc/cp/ChangeLog:
	PR c/104896
	* cvt.cc (maybe_warn_nodiscard): Replace "%<%s%>" with "%qs" in
	message wording.

gcc/fortran/ChangeLog:
	PR c/104896
	* resolve.cc (resolve_operator): Replace "%<%s%>" with "%qs" in
	message wording.

gcc/go/ChangeLog:
	PR c/104896
	* gofrontend/embed.cc (Gogo::initializer_for_embeds): Replace
	"%<%s%>" with "%qs" in message wording.
	* gofrontend/expressions.cc
	(Selector_expression::lower_method_expression): Likewise.
	* gofrontend/gogo.cc (Gogo::set_package_name): Likewise.
	(Named_object::export_named_object): Likewise.
	* gofrontend/parse.cc (Parse::struct_type): Likewise.
	(Parse::parameter_list): Likewise.

gcc/rust/ChangeLog:
	PR c/104896
	* backend/rust-compile-expr.cc
	(CompileExpr::compile_integer_literal): Replace "%<%s%>" with
	"%qs" in message wording.
	(CompileExpr::compile_float_literal): Likewise.
	* backend/rust-compile-intrinsic.cc (Intrinsics::compile):
	Likewise.
	* backend/rust-tree.cc (maybe_warn_nodiscard): Likewise.
	* checks/lints/rust-lint-scan-deadcode.h: Likewise.
	* lex/rust-lex.cc (Lexer::parse_partial_unicode_escape): Likewise.
	(Lexer::parse_raw_byte_string): Likewise.
	* lex/rust-token.cc (Token::get_str): Likewise.
	* metadata/rust-export-metadata.cc
	(PublicInterface::write_to_path): Likewise.
	* parse/rust-parse.cc
	(peculiar_fragment_match_compatible_fragment): Likewise.
	(peculiar_fragment_match_compatible): Likewise.
	* resolve/rust-ast-resolve-path.cc (ResolvePath::resolve_path):
	Likewise.
	* resolve/rust-ast-resolve-toplevel.h: Likewise.
	* resolve/rust-ast-resolve-type.cc (ResolveRelativeTypePath::go):
	Likewise.
	* rust-session-manager.cc (validate_crate_name): Likewise.
	(Session::load_extern_crate): Likewise.
	* typecheck/rust-hir-type-check-expr.cc (TypeCheckExpr::visit):
	Likewise.
	(TypeCheckExpr::resolve_fn_trait_call): Likewise.
	* typecheck/rust-hir-type-check-implitem.cc
	(TypeCheckImplItemWithTrait::visit): Likewise.
	* typecheck/rust-hir-type-check-item.cc
	(TypeCheckItem::validate_trait_impl_block): Likewise.
	* typecheck/rust-hir-type-check-struct.cc
	(TypeCheckStructExpr::visit): Likewise.
	* typecheck/rust-tyty-call.cc (TypeCheckCallExpr::visit):
	Likewise.
	* typecheck/rust-tyty.cc (BaseType::bounds_compatible): Likewise.
	* typecheck/rust-unify.cc (UnifyRules::emit_abi_mismatch):
	Likewise.
	* util/rust-attributes.cc (AttributeChecker::visit): Likewise.

libcpp/ChangeLog:
	PR c/104896
	* pch.cc (cpp_valid_state): Replace "%<%s%>" with "%qs" in message
	wording.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

9f06b910

Daily bump. · 7a656d74
GCC Administrator authored 3 months ago

7a656d74

Nov 27, 2024

libcpp: modules and -include again · 134dc932

Jason Merrill authored 3 months ago

I enabled include translation to header units in r15-1104-ga29f481bbcaf2b,
but it seems that patch wasn't sufficient, as any diagnostics in the main
source file would show up as coming from the header instead.

Fixed by setting buffer->file for leaving the file transition that my
previous patch made us enter.  And don't push a buffer of newlines, in this
case that messes up line numbers instead of aligning them.

libcpp/ChangeLog:

	* files.cc (_cpp_stack_file): Handle -include of header unit more
	specially.

gcc/testsuite/ChangeLog:

	* g++.dg/modules/dashinclude-1_b.C: Add an #error.
	* g++.dg/modules/dashinclude-1_a.H: Remove dg-module-do run.

134dc932

Nov 24, 2024
- Daily bump. · a095d720
  GCC Administrator authored 3 months ago
  
  a095d720
Nov 23, 2024

libcpp: Fix ICE lexing invalid raw string in a deferred pragma [PR117118] · 18cace46

Lewis Hyatt authored 5 months ago

The PR shows that we ICE after lexing an invalid unterminated raw string,
because lex_raw_string() pops the main buffer unexpectedly. Resolve by
handling this case the same way as for other directives.

libcpp/ChangeLog:
	PR preprocessor/117118
	* lex.cc (lex_raw_string): Treat an unterminated raw string the same
	way for a deferred pragma as is done for other directives.

gcc/testsuite/ChangeLog:
	PR preprocessor/117118
	* c-c++-common/raw-string-directive-3.c: New test.
	* c-c++-common/raw-string-directive-4.c: New test.

18cace46

libcpp: Fix potential unaligned access in cpp_buffer · c93eb81c

Lewis Hyatt authored 4 months ago

libcpp makes use of the cpp_buffer pfile->a_buff to store things while it is
handling macros. It uses it to store pointers (cpp_hashnode*, for macro
arguments) and cpp_macro objects. This works fine because a cpp_hashnode*
and a cpp_macro have the same alignment requirement on either 32-bit or
64-bit systems (namely, the same alignment as a pointer.)

When 64-bit location_t is enabled on a 32-bit system, the alignment
requirement may cease to be the same, because the alignment requirement of a
cpp_macro object changes to that of a uint64_t, which may be larger than that
of a pointer. It's not the case for x86 32-bit, but for example, on sparc, a
pointer has 4-byte alignment while a uint64_t has 8. In that case,
intermixing the two within the same cpp_buffer leads to a misaligned
access. The code path that triggers this is the one in _cpp_commit_buff in
which a hash table with its own allocator (i.e. ggc) is not being used, so
it doesn't happen within the compiler itself, but it happens in the other
libcpp clients, such as genmatch.

Fix that up by ensuring _cpp_commit_buff commits a fully aligned chunk of the
buffer, so it's ready for anything it may be used for next.

Also modify CPP_ALIGN so that it guarantees to return an alignment at least
the size of location_t. Currently it returns the max of a pointer and a
double. I am not aware of any platform where a double may have smaller
alignment than a uint64_t, but it does not hurt to add location_t here to be
sure.

libcpp/ChangeLog:

	* lex.cc (_cpp_commit_buff): Make sure that the buffer is properly
	aligned for the next allocation.
	* internal.h (struct dummy): Make sure alignment is large enough for
	a location_t, just in case.

c93eb81c
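
A minimal sketch of the two ideas in this commit: rounding a size up to an alignment boundary, and taking the default alignment as the strictest of a pointer, a double and a 64-bit location. The names align_up, max_align_probe and default_alignment are hypothetical; the real CPP_ALIGN machinery in internal.h may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round SIZE up to the next multiple of ALIGN, which must be a power of two.
inline std::size_t align_up (std::size_t size, std::size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (size + (align - 1)) & ~(align - 1);
}

// A union is as strictly aligned as its strictest member, so adding a
// 64-bit member guarantees room for a 64-bit location_t.
union max_align_probe
{
  double d;
  void *p;
  std::uint64_t loc;   // stands in for a 64-bit location_t
};

inline std::size_t default_alignment ()
{
  return alignof (max_align_probe);
}
```

Committing only chunks whose size has been passed through something like align_up (size, default_alignment ()) keeps the next allocation from the same buffer suitably aligned, whatever type it holds.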

Support for 64-bit location_t: libcpp preliminaries · 927625d0

Lewis Hyatt authored 4 months ago

Prepare libcpp to support 64-bit location_t, without yet making
any functional changes, by adding new typedefs that enable code to be
written such that it works with any size location_t. Update the usage of
line maps within libcpp accordingly.

Subsequent patches will prepare the rest of the codebase similarly, and then
afterwards, location_t will be changed to uint64_t.

libcpp/ChangeLog:

	* include/line-map.h (line_map_uint_t): New typedef, the same type
	as location_t.
	(location_diff_t): New typedef.
	(line_map_suggested_range_bits): New constant.
	(struct maps_info_ordinary): Change member types from "unsigned int"
	to "line_map_uint_t".
	(struct maps_info_macro): Likewise.
	(struct location_adhoc_data_map): Likewise.
	(LINEMAPS_ALLOCATED): Change return type from "unsigned int" to
	"line_map_uint_t".
	(LINEMAPS_ORDINARY_ALLOCATED): Likewise.
	(LINEMAPS_MACRO_ALLOCATED): Likewise.
	(LINEMAPS_USED): Likewise.
	(LINEMAPS_ORDINARY_USED): Likewise.
	(LINEMAPS_MACRO_USED): Likewise.
	(linemap_lookup_macro_index): Likewise.
	(LINEMAPS_MAP_AT): Change argument type from "unsigned int" to
	"line_map_uint_t".
	(LINEMAPS_ORDINARY_MAP_AT): Likewise.
	(LINEMAPS_MACRO_MAP_AT): Likewise.
	(line_map_new_raw): Likewise.
	(linemap_module_restore): Likewise.
	(linemap_dump): Likewise.
	(line_table_dump): Likewise.
	(LINEMAPS_LAST_MAP): Add a linemap_assert() for safety.
	(SOURCE_COLUMN): Use a cast to ensure correctness if location_t
	becomes a 64-bit type.
	* line-map.cc (location_adhoc_data_hash): Don't truncate to 32-bit
	prematurely when hashing.
	(line_maps::get_or_create_combined_loc): Adapt types to support
	potentially 64-bit location_t. Use MAX_LOCATION_T rather than a
	hard-coded constant.
	(line_maps::get_range_from_loc): Adapt types and constants to
	support potentially 64-bit location_t.
	(line_maps::pure_location_p): Likewise.
	(line_maps::get_pure_location): Likewise.
	(line_map_new_raw): Likewise.
	(LAST_SOURCE_LINE_LOCATION): Likewise.
	(linemap_add): Likewise.
	(linemap_module_restore): Likewise.
	(linemap_line_start): Likewise.
	(linemap_position_for_column): Likewise.
	(linemap_position_for_line_and_column): Likewise.
	(linemap_position_for_loc_and_offset): Likewise.
	(linemap_ordinary_map_lookup): Likewise.
	(linemap_lookup_macro_index): Likewise.
	(linemap_dump): Likewise.
	(linemap_dump_location): Likewise.
	(linemap_get_file_highest_location): Likewise.
	(line_table_dump): Likewise.
	(linemap_compare_locations): Avoid signed int overflow in the result.
	* macro.cc (num_expanded_macros_counter): Change type of global
	variable from "unsigned int" to "line_map_uint_t".
	(num_macro_tokens_counter): Likewise.

927625d0

Aug 28, 2024
- Daily bump. · ef84d2fe
  GCC Administrator authored 6 months ago
  
  ef84d2fe
Aug 26, 2024

libcpp: deduplicate definition of padding size · a8260ebe

Alexander Monakov authored 6 months ago

Tie together the two functions that ensure tail padding with
search_line_ssse3 via CPP_BUFFER_PADDING macro.

libcpp/ChangeLog:

	* internal.h (CPP_BUFFER_PADDING): New macro; use it ...
	* charset.cc (_cpp_convert_input): ...here, and ...
	* files.cc (read_file_guts): ...here, and ...
	* lex.cc (search_line_ssse3): ...here.

a8260ebe

Aug 24, 2024
- Daily bump. · 3ff1b91e
  GCC Administrator authored 6 months ago
  
  3ff1b91e
Aug 23, 2024

libcpp: bump padding size in _cpp_convert_input [PR116458] · b2c1d7c4

Alexander Monakov authored 7 months ago

The recently introduced search_line_fast_ssse3 raised padding
requirement from 16 to 64, which was adjusted in read_file_guts,
but the corresponding ' + 16' in _cpp_convert_input was overlooked.

libcpp/ChangeLog:

	PR preprocessor/116458
	* charset.cc (_cpp_convert_input): Bump padding to 64 if
	HAVE_SSSE3.

b2c1d7c4

Daily bump. · 2cd783be
GCC Administrator authored 7 months ago

2cd783be

Aug 22, 2024

fix single argument static_assert · 4e905bd3

Marc Poulhiès authored 7 months ago

Single argument static_assert is C++17 only.

libcpp/ChangeLog:

	* lex.cc (search_line_ssse3): Fix static_assert to use 2 arguments.

4e905bd3
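
For illustration, the two-argument form that is accepted by pre-C++17 compilers looks like the sketch below. The constant buffer_padding and the function padded_size are hypothetical stand-ins; the condition checked by the real static_assert in search_line_ssse3 is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the padding constant used by the lexer.
constexpr std::size_t buffer_padding = 64;

// Two-argument static_assert: valid since C++11.  The message-less,
// single-argument form is only available from C++17 onwards.
static_assert (buffer_padding % 16 == 0,
               "padding should be a whole number of 16-byte vectors");

// Size of a buffer that holds LEN bytes of input plus the tail padding.
inline std::size_t padded_size (std::size_t len)
{
  assert (len <= SIZE_MAX - buffer_padding);   // no overflow
  return len + buffer_padding;
}
```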

Aug 21, 2024
- Daily bump. · 964c9c24
  GCC Administrator authored 7 months ago
  
  964c9c24
Aug 20, 2024

libcpp: Adjust lang_defaults · 447c32c5

Jakub Jelinek authored 7 months ago

The table over the years turned out to be very wide, 147 columns
and any addition would add a couple of new ones.
We need a 28x23 bit matrix right now.

This patch changes the formatting, so that we need just 2 columns
per new feature and so we have some room for expansion.
In addition, the patch changes it to bitfields, which reduces
.rodata by 532 bytes (so 5.75x reduction of the variable) and
on x86_64-linux grows the cpp_set_lang function by 26 bytes (8.4%
growth).

2024-08-20  Jakub Jelinek  <jakub@redhat.com>

	* init.cc (struct lang_flags): Change all members from char
	typed fields to unsigned bit-fields.
	(lang_defaults): Change formatting of the initializer so that it
	fits to 68 columns rather than 147.

447c32c5
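
The size figures in this commit message can be reproduced with a small sketch, assuming 23 flags per row and 28 rows as stated above. The struct and field names are placeholders, and the 4-byte size of the bit-field struct is what common ABIs produce rather than something the language guarantees.

```cpp
#include <cassert>
#include <cstddef>

// 23 one-byte flags per language row.
struct flags_as_chars
{
  char f[23];
};

// The same 23 flags as one-bit unsigned bit-fields.
struct flags_as_bits
{
  unsigned f0 : 1, f1 : 1, f2 : 1, f3 : 1, f4 : 1, f5 : 1, f6 : 1, f7 : 1;
  unsigned f8 : 1, f9 : 1, f10 : 1, f11 : 1, f12 : 1, f13 : 1, f14 : 1;
  unsigned f15 : 1, f16 : 1, f17 : 1, f18 : 1, f19 : 1, f20 : 1, f21 : 1;
  unsigned f22 : 1;
};

// Read-only data saved when a table of ROWS rows switches representation.
inline std::size_t rodata_saved (std::size_t rows)
{
  assert (sizeof (flags_as_bits) <= sizeof (flags_as_chars));
  return rows * (sizeof (flags_as_chars) - sizeof (flags_as_bits));
}
```

With 23-byte and 4-byte rows, 28 rows shrink from 644 to 112 bytes, which matches the quoted 532-byte saving and the 5.75x ratio.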

libcpp: replace SSE4.2 helper with an SSSE3 one · 20a5b482

Alexander Monakov authored 7 months ago

Since the characters we are searching for (CR, LF, '\', '?') all have
distinct ASCII codes mod 16, PSHUFB can help match them all at once.

Directly use the new helper if __SSSE3__ is defined. It makes the other
helpers unused, so mark them inline to prevent warnings.

Rewrite and simplify init_vectorized_lexer.

libcpp/ChangeLog:

	* config.in: Regenerate.
	* configure: Regenerate.
	* configure.ac: Check for SSSE3 instead of SSE4.2.
	* files.cc (read_file_guts): Bump padding to 64 if HAVE_SSSE3.
	* lex.cc (search_line_acc_char): Mark inline, not "unused".
	(search_line_sse2): Mark inline.
	(search_line_sse42): Replace with...
	(search_line_ssse3): ... this new function.  Adjust the use...
	(init_vectorized_lexer): ... here.  Simplify.

20a5b482
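
The "distinct ASCII codes mod 16" observation can be shown with a scalar sketch: a 16-entry table indexed by the low nibble plays the role of the PSHUFB shuffle, and one equality test then matches all four characters. This is an illustration of the idea only, not the vectorized code in lex.cc; the names nibble_table, is_line_special and find_line_special are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Slots 0x0a, 0x0c, 0x0d and 0x0f hold LF, '\\', CR and '?'.  Every other
// slot holds a byte whose own low nibble differs from the slot index, so
// it can never compare equal to a character that selects that slot.
static const unsigned char nibble_table[16] = {
  0x01, 0, 0, 0, 0, 0, 0, 0,
  0, 0, '\n', 0, '\\', '\r', 0, '?'
};

// True if C is one of the four characters the line scanner stops at.
inline bool is_line_special (unsigned char c)
{
  return nibble_table[c & 15] == c;
}

// Index of the first special character in S[0..LEN), or LEN if none.
inline std::size_t find_line_special (const char *s, std::size_t len)
{
  assert (s != nullptr || len == 0);
  for (std::size_t i = 0; i < len; ++i)
    if (is_line_special (static_cast<unsigned char> (s[i])))
      return i;
  return len;
}
```

A vector version would perform the same lookup and comparison on 16 bytes at a time, which is why reading slightly past the end of the input requires the tail padding mentioned in the ChangeLog.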

Aug 07, 2024
- Daily bump. · b120ca0c
  GCC Administrator authored 7 months ago
  
  b120ca0c
Aug 06, 2024

Remove MMX code path in lexer · eac63be1

Andi Kleen authored 7 months ago

Host systems with only MMX and no SSE2 should be really rare now.
Let's remove the MMX code path to keep the number of custom
implementations the same.

The SSE2 code path is also somewhat dubious now (nearly everything
should have SSE4.2 which is >15 years old now), but the SSE2
code path is used as fallback for others and also apparently
Solaris uses it due to tool chain deficiencies.

libcpp/ChangeLog:

	* lex.cc (search_line_mmx): Remove function.
	(init_vectorized_lexer): Remove search_line_mmx.

eac63be1

Jul 26, 2024
- Daily bump. · 18eb6ca1
  GCC Administrator authored 7 months ago
  
  18eb6ca1
Jul 25, 2024

c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343] · 29341f21

Jakub Jelinek authored 7 months ago

The following patch implements the easy parts of the paper.
When @$` are added to the basic character set, it means that
R"@$`()@$`" should now be valid (here I've noticed most of the
raw string tests were tested solely with -std=c++11 or -std=gnu++11
and I've tried to change that), and on the other side even if
by extension $ is allowed in identifiers, \u0024 or \U00000024
or \u{24} should not be, similarly how \u0041 is not allowed.

The paper in 3.1 claims though that
 #include <stdio.h>

 #define STR(x) #x

int main()
{
  printf("%s", STR(\u0060)); // U+0060 is ` GRAVE ACCENT
}
should have been accepted before this paper (and rejected after it),
but g++ rejects it.

I've tried to understand it, but am confused on what is the right
behavior and why.

Consider
 #define STR(x) #x
const char *a = "\u00b7";
const char *b = STR(\u00b7);
const char *c = "\u0041";
const char *d = STR(\u0041);
const char *e = STR(a\u00b7);
const char *f = STR(a\u0041);
const char *g = STR(a \u00b7);
const char *h = STR(a \u0041);
const char *i = "\u066d";
const char *j = STR(\u066d);
const char *k = "\u0040";
const char *l = STR(\u0040);
const char *m = STR(a\u066d);
const char *n = STR(a\u0040);
const char *o = STR(a \u066d);
const char *p = STR(a \u0040);

Neither clang nor gcc emit any diagnostics on the a, c, i and k
initializers, those are certainly valid (c is invalid in C23 though).  g++
emits with -pedantic-errors errors on all the others, while clang++ on the
ones with STR involving \u0041, \u0040 and a\u066d.  The chosen values are
\u0040 '@' as something being changed by this paper, \u0041 'A' as basic
character set char valid in identifiers before/after, \u00b7 as an example
of character which is pedantically valid in identifiers if not at the start
and \u066d as something pedantically not valid in identifiers.

Now, https://eel.is/c++draft/lex.charset#6 says that UCN used outside of a
string/character literal which corresponds to basic character set character
(or control character) is ill-formed, that would make d, f, h cases invalid
for C++ and l, n, p cases invalid for C++26.

https://eel.is/c++draft/lex.name states which characters can appear at the
start of the identifier and which can appear after the start.  And
https://eel.is/c++draft/lex.pptoken states that preprocessing-token is
either identifier, or tons of other things, or "each non-whitespace
character that cannot be one of the above"

Then https://eel.is/c++draft/lex.pptoken#1 says that this last category is
invalid if the preprocessing token is being converted into token.

And https://eel.is/c++draft/lex.pptoken#2 includes "If any character not in
the basic character set matches the last category, the program is
ill-formed."

Now, e.g.  for the C++23 STR(\u0040) case, \u0040 is there not in the basic
character set, so valid outside of the literals (not the case anymore in
C++26), but it isn't nondigit and doesn't have XID_Start property, so it
isn't IMHO an identifier and so must be the "each non-whitespace character
that cannot be one of the above" case.  Why doesn't the above mentioned
https://eel.is/c++draft/lex.pptoken#2 sentence make that invalid?  Ignoring
that, I'd say it would be then stringized and that feels like it is what
clang++ is doing.  Now, e.g.  for the STR(a\u066d) case, I wonder why that
isn't lexed as a identifier followed by \u066d "each non-whitespace
character that cannot be one of the above" token and stringified similarly,
clang++ rejects that.

What GCC libcpp seems to be doing is that forms_identifier_p calls
_cpp_valid_utf8 or _cpp_valid_ucn with an argument which tells it is first
or second+ in identifier, and e.g.  _cpp_valid_ucn then for UCNs valid in
string literals calls
  else if (identifier_pos)
    {
      int validity = ucn_valid_in_identifier (pfile, result, nst);

      if (validity == 0)
        cpp_error (pfile, CPP_DL_ERROR,
                   "universal character %.*s is not valid in an identifier",
                   (int) (str - base), base);
      else if (validity == 2 && identifier_pos == 1)
        cpp_error (pfile, CPP_DL_ERROR,
   "universal character %.*s is not valid at the start of an identifier",
                   (int) (str - base), base);
    }
so basically all those invalid in identifiers cases emit an error and
pretend to be valid in identifiers, rather than what e.g.  _cpp_valid_utf8
does for C but not for C++ and only for the chars completely invalid in
identifiers rather than just valid in identifiers but not at the start:
          /* In C++, this is an error for invalid character in an identifier
             because logically, the UTF-8 was converted to a UCN during
             translation phase 1 (even though we don't physically do it that
             way).  In C, this byte rather becomes grammatically a separate
             token.  */

          if (CPP_OPTION (pfile, cplusplus))
            cpp_error (pfile, CPP_DL_ERROR,
                       "extended character %.*s is not valid in an identifier",
                       (int) (*pstr - base), base);
          else
            {
              *pstr = base;
              return false;
            }
The comment doesn't really match what is done in recent C++ versions because
there UCNs are translated to characters and not the other way around.

2024-07-25  Jakub Jelinek  <jakub@redhat.com>

	PR c++/110343
libcpp/
	* lex.cc: C++26 P2558R2 - Add @, $, and ` to the basic character set.
	(lex_raw_string): For C++26 allow $@` characters in prefix.
	* charset.cc (_cpp_valid_ucn): For C++26 reject \u0024 in identifiers.
gcc/testsuite/
	* c-c++-common/raw-string-1.c: Use { c || c++11 } effective target,
	remove c++ specific dg-options.
	* c-c++-common/raw-string-2.c: Likewise.
	* c-c++-common/raw-string-4.c: Likewise.
	* c-c++-common/raw-string-5.c: Likewise.  Expect some diagnostics
	only for non-c++26, for c++26 expect different.
	* c-c++-common/raw-string-6.c: Use { c || c++11 } effective target,
	remove c++ specific dg-options.
	* c-c++-common/raw-string-11.c: Likewise.
	* c-c++-common/raw-string-13.c: Likewise.
	* c-c++-common/raw-string-14.c: Likewise.
	* c-c++-common/raw-string-15.c: Use { c || c++11 } effective target,
	change c++ specific dg-options to just -Wtrigraphs.
	* c-c++-common/raw-string-16.c: Likewise.
	* c-c++-common/raw-string-17.c: Use { c || c++11 } effective target,
	remove c++ specific dg-options.
	* c-c++-common/raw-string-18.c: Use { c || c++11 } effective target,
	remove -std=c++11 from c++ specific dg-options.
	* c-c++-common/raw-string-19.c: Likewise.
	* g++.dg/cpp26/raw-string1.C: New test.
	* g++.dg/cpp26/raw-string2.C: New test.
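
The lex.cc change above can be illustrated with a small sketch of which characters may appear in a raw string delimiter (the "prefix" between R" and the opening parenthesis). This is not libcpp's lex_raw_string; the function names and the allow_cxx26_chars flag are hypothetical, and the character rules follow my reading of the standard's d-char definition, with $, @ and ` accepted only when the C++26 rule is enabled.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper, not libcpp code: may C appear in a raw string
// delimiter?  ALLOW_CXX26_CHARS models P2558R2, which adds $, @ and `
// to the basic character set.
bool valid_delimiter_char(char c, bool allow_cxx26_chars)
{
  if (c == '$' || c == '@' || c == '`')
    return allow_cxx26_chars;
  switch (c) {
    // Excluded regardless of standard: space, parentheses, backslash and
    // the whitespace control characters.
    case ' ': case '(': case ')': case '\\':
    case '\t': case '\v': case '\f': case '\n':
      return false;
    default:
      break;
  }
  // The remaining printable ASCII characters are accepted.
  return c > ' ' && c < 0x7f;
}

// A delimiter is at most 16 characters, all of them valid.
bool valid_delimiter(std::string_view delim, bool allow_cxx26_chars)
{
  if (delim.size() > 16)
    return false;
  for (char c : delim)
    if (!valid_delimiter_char(c, allow_cxx26_chars))
      return false;
  return true;
}

void demo()
{
  assert(valid_delimiter("a$b", true));    // accepted with the C++26 rule
  assert(!valid_delimiter("a$b", false));  // rejected without it
  assert(!valid_delimiter("x(y", true));   // '(' is never allowed
  assert(!valid_delimiter("12345678901234567", true));  // 17 characters
}
```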

29341f21

Daily bump. · 25256af1
GCC Administrator authored 7 months ago

25256af1

Jul 24, 2024

diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4) · 148066bd

David Malcolm authored 7 months ago


This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's "snippet" gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

"rendered": {"text":

where "text" has a string value such as (for a "trojan source" attack):

  "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
  "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
  "  |       |                                       |                           |\n"
  "  |       |                                       |                           end of bidirectional context\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:

  "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
  "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
  "  |       |                                                   |                               |\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"

The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).
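
To make the two escape formats described above concrete, here is a minimal sketch of escaping non-ASCII UTF-8 text either as code points (<U+202E>) or as raw bytes (<e2><80><ae>). It is not the code used by GCC; escape_non_ascii and escape_format are hypothetical names, and the decoder is simplified (it does not reject overlong encodings or surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

enum class escape_format { unicode, bytes };

// Hypothetical helper, not GCC's implementation: replace every non-ASCII
// character of UTF-8 input with a printable escape.
std::string escape_non_ascii(const std::string &in, escape_format fmt)
{
  std::string out;
  char buf[16];
  std::size_t i = 0;
  while (i < in.size()) {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    if (lead < 0x80) {  // plain ASCII passes through unchanged
      out += in[i++];
      continue;
    }
    // Expected sequence length from the lead byte (0 means invalid lead).
    std::size_t len = 0;
    if ((lead & 0xe0) == 0xc0) len = 2;
    else if ((lead & 0xf0) == 0xe0) len = 3;
    else if ((lead & 0xf8) == 0xf0) len = 4;

    unsigned long cp = 0;
    bool valid = fmt == escape_format::unicode && len != 0
                 && i + len <= in.size();
    if (valid) {
      cp = lead & (0xffu >> (len + 1));
      for (std::size_t k = 1; k < len; ++k) {
        const unsigned char cont = static_cast<unsigned char>(in[i + k]);
        if ((cont & 0xc0) != 0x80) { valid = false; break; }
        cp = (cp << 6) | (cont & 0x3f);
      }
    }
    if (valid) {
      std::snprintf(buf, sizeof buf, "<U+%04lX>", cp);
      i += len;
    } else {
      // Bytes mode, or a malformed sequence: escape this single byte.
      std::snprintf(buf, sizeof buf, "<%02x>", static_cast<unsigned>(lead));
      ++i;
    }
    out += buf;
  }
  return out;
}

void demo()
{
  const std::string src = "/*\xe2\x80\xae }";  // U+202E inside a comment
  assert(escape_non_ascii(src, escape_format::unicode) == "/*<U+202E> }");
  assert(escape_non_ascii(src, escape_format::bytes) == "/*<e2><80><ae> }");
}
```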

gcc/ChangeLog:
	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
	* diagnostic-format-sarif.cc: Include "selftest.h",
	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
	"selftest-json.h", and "text-range-label.h".
	(class content_renderer): New.
	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
	(sarif_builder::make_location_object): Add class
	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
	pass a nonnull escape_nonascii_renderer to
	maybe_make_physical_location_object as its snippet_renderer, and
	add a property bag property "gcc/escapeNonAscii" to the SARIF
	location object.  For other overloads of make_location_object,
	pass nullptr for the snippet_renderer.
	(sarif_builder::maybe_make_region_object_for_context): Add
	"snippet_renderer" param and pass it to
	maybe_make_artifact_content_object.
	(sarif_builder::make_tool_object): Drop "const".
	(sarif_builder::make_driver_tool_component_object): Likewise.
	Use typesafe unique_ptr variant of object::set for setting "rules"
	property on driver_obj.
	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
	and use it to potentially set the "rendered" property (§3.3.4).
	(selftest::test_make_location_object): New.
	(selftest::diagnostic_format_sarif_cc_tests): New.
	* diagnostic-show-locus.cc: Include "text-range-label.h" and
	"selftest-diagnostic-show-locus.h".
	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
	New.
	(selftests::test_layout_x_offset_display_utf8): Use
	diagnostic_show_locus_fixture to simplify and consolidate setup
	code.
	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
	(selftests::test_one_liner_colorized_utf8): Likewise.
	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
	* gcc-rich-location.h (class text_range_label): Move to new file
	text-range-label.h.
	* selftest-diagnostic-show-locus.h: New file, based on material in
	diagnostic-show-locus.cc.
	* selftest-json.cc: New file.
	* selftest-json.h: New file.
	* selftest-run-tests.cc (selftest::run_tests): Call
	selftest::diagnostic_format_sarif_cc_tests.
	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

gcc/testsuite/ChangeLog:
	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
	that we have a property bag with property "gcc/escapeNonAscii": true.
	Verify that we have a "rendered" property for a snippet.
	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
	"text-range-label.h".

gcc/ChangeLog:
	* text-range-label.h: New file, taking class text_range_label from
	gcc-rich-location.h.

libcpp/ChangeLog:
	* include/rich-location.h
	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
	(rich_location::rich_location): Remove "= delete" from decl of
	copy ctor.  Add deleted decl of move ctor.
	(rich_location::operator=): Remove "= delete" from decl of
	copy assignment.  Add deleted decl of move assignment.
	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
	move.
	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
	decl of move assignment.
	* line-map.cc (rich_location::rich_location): New copy ctor.
	(fixit_hint::fixit_hint): New copy ctor.
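
The libcpp entry above makes rich_location and fixit_hint copyable while explicitly deleting their move operations. A minimal sketch of that pattern follows; range_list is a hypothetical class used only for illustration and does not mirror the real members of either class.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical class illustrating the pattern, not rich_location itself.
class range_list
{
public:
  explicit range_list(int first_line) : m_lines{first_line} {}

  // Copying is allowed and duplicates the owned data.
  range_list(const range_list &other) : m_lines(other.m_lines) {}
  range_list &operator=(const range_list &other)
  {
    m_lines = other.m_lines;
    return *this;
  }

  // Moving is explicitly deleted, so an attempt to move is a compile-time
  // error rather than a silent fallback to copying.
  range_list(range_list &&) = delete;
  range_list &operator=(range_list &&) = delete;

  int first_line() const { return m_lines.front(); }

private:
  std::vector<int> m_lines;
};

static_assert(std::is_copy_constructible<range_list>::value, "copyable");
static_assert(std::is_copy_assignable<range_list>::value, "copy-assignable");
static_assert(!std::is_move_constructible<range_list>::value, "no move ctor");
static_assert(!std::is_move_assignable<range_list>::value, "no move assign");

void demo()
{
  range_list a(25);
  range_list b(a);  // uses the copy ctor
  assert(b.first_line() == 25);
}
```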

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

148066bd

Jul 14, 2024
- Daily bump. · 944e4251
  GCC Administrator authored 8 months ago
  
  944e4251
Jul 13, 2024

diagnostics: add highlight-a vs highlight-b in colorization and pp_markup · 7d73c01c

David Malcolm authored 8 months ago


Since r6-4582-g8a64515099e645 (which added class rich_location), ranges
of quoted source code have been colorized using the following rules:
- the primary range used the color of the diagnostic's kind,
  i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta
  respectively)
- secondary ranges alternate between "range1" and "range2" (defaulting
  to green and blue respectively)
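
The two rules above can be sketched as a function from range index to color name. This is an illustration only; legacy_range_color is a hypothetical name, and I am assuming that range 0 is the primary range and that the first secondary range gets "range1".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not code from diagnostic-show-locus.cc.
// RANGE_IDX 0 is assumed to be the primary range; DIAGNOSTIC_KIND is the
// color name of the diagnostic itself, e.g. "error" or "warning".
std::string legacy_range_color(unsigned range_idx,
                               const std::string &diagnostic_kind)
{
  if (range_idx == 0)
    return diagnostic_kind;
  // Secondary ranges alternate, starting (by assumption) with "range1".
  return (range_idx % 2 == 1) ? "range1" : "range2";
}

void demo()
{
  assert(legacy_range_color(0, "warning") == "warning");
  assert(legacy_range_color(1, "warning") == "range1");
  assert(legacy_range_color(2, "warning") == "range2");
  assert(legacy_range_color(3, "warning") == "range1");
}
```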

This works for cases with large numbers of highlighted ranges, but is
suboptimal for common cases.

The following patch adds a pair of color names: "highlight-a" and
"highlight-b", and uses them whenever it makes sense to highlight and
contrast two different things in the source code (e.g. a type mismatch).
These are used by diagnostic-show-locus.cc for highlighting quoted
source.  In addition the patch adds colorization to fragments within the
corresponding diagnostic messages themselves, using consistent
colorization between the message and the quoted source code for the two
different things being contrasted.

For example, consider:

demo.c: In function ‘test_bad_format_string_args’:
../../src/demo.c:25:18: warning: format ‘%i’ expects argument of
  type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=]
   25 |   printf("hello %i", msg);
      |                 ~^   ~~~
      |                  |   |
      |                  int const char *
      |                 %s

Previously, the types within the message in quotes would be in bold but
not colorized, and the labelled ranges of quoted source code would use
bold magenta for the "int" and non-bold green for the "const char *".

With this patch:
- the "%i" and "int" in the message and the "int" in the quoted source
  are all colored bold green
- the "const char *" in the message and in the quoted source are both
  colored bold blue
so that the consistent use of contrasting color draws the reader's eyes
to the relationships between the diagnostic message and the source.

I've tried this with gnome-terminal with many themes, including a
variety of light versus dark backgrounds, solarized versus non-solarized
themes, etc, and it was readable in all.

My initial version of the patch used the existing %r and %R facilities
within pretty-print.cc for the messages, but this turned out to be very
uncomfortable, leading to error-prone format strings such as:

  error_at (richloc,
            "invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)",
            opname,
            "highlight-a", type0,
            "highlight-b", type1);

To avoid requiring monstrosities such as the above, the patch adds a new
"%e" format code to pretty-print.cc, which expects a pp_element *, where
pp_element is a new abstract base class (actually a pp_markup::element),
along with various useful subclasses.  This lets the above be written
as:

  pp_markup::element_quoted_type element_0 (type0, highlight_colors::lhs);
  pp_markup::element_quoted_type element_1 (type1, highlight_colors::rhs);
  error_at (richloc,
            "invalid operands to binary %s (have %e and %e)",
            opname, &element_0, &element_1);

which I feel is maintainable and clear to translators; the use of %e and
pp_element * captures the type-unsafe part of the variadic call, and the
subclasses allow for type-safety (so e.g. an element_quoted_type expects
a type and a highlighting color).  This approach allows for some nice
simplifications within c-format.cc.
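
The idea behind %e can be shown with a toy formatter, independent of GCC: a single format code takes a pointer to an abstract element, and subclasses supply type-safe constructors. Everything in this sketch (the sketch namespace, element, quoted_type, format, and the bracketed color markers) is hypothetical and much simpler than pretty-print.cc.

```cpp
#include <cassert>
#include <cstdarg>
#include <string>
#include <utility>

namespace sketch {

// Hypothetical stand-in for the abstract element base class.
class element
{
public:
  virtual ~element() = default;
  virtual void append_to(std::string &out) const = 0;
};

// Quotes a type name and wraps it in named color markers.
class quoted_type : public element
{
public:
  quoted_type(std::string type_name, std::string color)
    : m_type_name(std::move(type_name)), m_color(std::move(color)) {}

  void append_to(std::string &out) const override
  {
    out += "'[" + m_color + "]" + m_type_name + "[/]'";
  }

private:
  std::string m_type_name;
  std::string m_color;
};

// Toy formatter understanding only %s, %e and %%.  The variadic call is
// the type-unsafe part: each %e argument must be a 'const element *'.
std::string format(const char *fmt, ...)
{
  std::string out;
  va_list ap;
  va_start(ap, fmt);
  for (const char *p = fmt; *p; ++p) {
    if (*p != '%' || p[1] == '\0') {
      out += *p;
      continue;
    }
    switch (*++p) {
      case 's': out += va_arg(ap, const char *); break;
      case 'e': va_arg(ap, const element *)->append_to(out); break;
      default:  out += *p; break;  // "%%" and unknown codes
    }
  }
  va_end(ap);
  return out;
}

void demo()
{
  const quoted_type lhs("int", "highlight-a");
  const quoted_type rhs("const char *", "highlight-b");
  // Convert to the base pointer type before crossing the variadic boundary.
  const element *e0 = &lhs;
  const element *e1 = &rhs;
  const std::string msg
    = format("invalid operands to binary %s (have %e and %e)", "+", e0, e1);
  assert(msg == "invalid operands to binary + (have '[highlight-a]int[/]'"
                " and '[highlight-b]const char *[/]')");
}

} // namespace sketch
```

The format string stays short and translatable, while the choice of type and color lives in the element objects.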

The patch also extends -Wformat to "teach" it about the new %e and
pp_element *.  Doing so requires c-format.cc to be able to determine
if a T * is a pp_element * (i.e. if T is a subclass).  To do so I added
a new comp_types callback for comparing types, where the C++ frontend
supplies a suitable implementation (and %e will always be wrong for C).

I've manually tested this on many diagnostics with both C and C++ and it
seems a subtle but significant improvement in readability.

I've added a new option -fno-diagnostics-show-highlight-colors in case
people prefer the old behavior.

gcc/c-family/ChangeLog:
	* c-common.cc: Include "tree-pretty-print-markup.h".
	(binary_op_error): Use pp_markup::element_quoted_type and %e.
	(check_function_arguments): Add "comp_types" param and pass it to
	check_function_format.
	* c-common.h (check_function_arguments): Add "comp_types" param.
	(check_function_format): Likewise.
	* c-format.cc: Include "tree-pretty-print-markup.h".
	(local_pp_element_ptr_node): New.
	(PP_FORMAT_CHAR_TABLE): Add entry for %e.
	(struct format_check_context): Add "m_comp_types" field.
	(check_function_format): Add "comp_types" param and pass it to
	check_format_info.
	(check_format_info): Likewise, passing it to format_ctx's ctor.
	(check_format_arg): Extract m_comp_types from format_ctx and
	pass it to check_format_info_main.
	(check_format_info_main): Add "comp_types" param and pass it to
	arg_parser's ctor.
	(class argument_parser): Add "m_comp_types" field.
	(argument_parser::check_argument_type): Pass m_comp_types to
	check_format_types.
	(handle_subclass_of_pp_element_p): New.
	(check_format_types): Add "comp_types" param, and use it to
	call handle_subclass_of_pp_element_p.
	(class element_format_substring): New.
	(class element_expected_type_with_indirection): New.
	(format_type_warning): Use element_expected_type_with_indirection
	to unify the if (wanted_type_name) branches, reducing from four
	emit_warning calls to two.  Simplify these further using %e.
	Doing so also gives suitable colorization of the text within the
	diagnostics.
	(init_dynamic_diag_info): Initialize local_pp_element_ptr_node.
	(selftest::test_type_mismatch_range_labels): Add nullptr for new
	param of gcc_rich_location label overload.
	* c-format.h (T_PP_ELEMENT_PTR): New.
	* c-type-mismatch.cc: Include "diagnostic-highlight-colors.h".
	(binary_op_rich_location::binary_op_rich_location): Use
	highlight_colors::lhs and highlight_colors::rhs for the ranges.
	* c-type-mismatch.h (class binary_op_rich_location): Add comment
	about highlight_colors.

gcc/c/ChangeLog:
	* c-objc-common.cc: Include "tree-pretty-print-markup.h".
	(print_type): Add optional "highlight_color" param and use it
	to show highlight colors in "aka" text.
	(pp_markup::element_quoted_type::print_type): New.
	* c-typeck.cc: Include "tree-pretty-print-markup.h".
	(comp_parm_types): New.
	(build_function_call_vec): Pass it to check_function_arguments.
	(inform_for_arg): Use %e and highlight colors to contrast actual
	versus expected.
	(convert_for_assignment): Use highlight_colors::actual for the
	rhs_label.
	(build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs
	for the ranges.

gcc/ChangeLog:
	* common.opt (fdiagnostics-show-highlight-colors): New option.
	* common.opt.urls: Regenerate.
	* coretypes.h (pp_markup::element): New forward decl.
	(pp_element): New typedef.
	* diagnostic-color.cc (gcc_color_defaults): Add "highlight-a"
	and "highlight-b".
	* diagnostic-format-json.cc (diagnostic_output_format_init_json):
	Disable highlight colors.
	* diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif):
	Likewise.
	* diagnostic-highlight-colors.h: New file.
	* diagnostic-path.cc (struct event_range): Pass nullptr for
	highlight color of m_rich_loc.
	* diagnostic-show-locus.cc (colorizer::set_range): Handle ranges
	with m_highlight_color.
	(colorizer::STATE_NAMED_COLOR): New.
	(colorizer::m_richloc): New field.
	(colorizer::colorizer): Add richloc param for initializing
	m_richloc.
	(colorizer::set_named_color): New.
	(colorizer::begin_state): Add case STATE_NAMED_COLOR.
	(layout::layout): Pass richloc to m_colorizer's ctor.
	(selftest::test_one_liner_labels): Pass nullptr for new param of
	gcc_rich_location ctor for labels.
	(selftest::test_one_liner_labels_utf8): Likewise.
	* diagnostic.h (diagnostic_context::set_show_highlight_colors):
	New.
	* doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors
	and highlight-a and highlight-b color caps.
	* doc/ux.texi
	(Use color consistently when highlighting mismatches): New
	subsection.
	* gcc-rich-location.cc (gcc_rich_location::add_expr): Add
	"highlight_color" param.
	(gcc_rich_location::maybe_add_expr): Likewise.
	* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
	Split out into a pair of ctors, where if a range_label is supplied
	the caller must also supply a highlight color.
	(gcc_rich_location::add_expr): Add "highlight_color" param.
	(gcc_rich_location::maybe_add_expr): Likewise.
	* gcc.cc (driver_handle_option): Handle
	OPT_fdiagnostics_show_highlight_colors.
	* lto-wrapper.cc (merge_and_complain): Likewise.
	(append_compiler_options): Likewise.
	(append_diag_options): Likewise.
	(run_gcc): Likewise.
	* opts-common.cc (decode_cmdline_options_to_array): Add comment
	about -fno-diagnostics-show-highlight-colors.
	* opts-global.cc (init_options_once): Preserve
	pp_show_highlight_colors in case the global_dc's printer is
	recreated.
	* opts.cc (common_handle_option): Handle
	OPT_fdiagnostics_show_highlight_colors.
	(gen_command_line_string): Likewise.
	* pretty-print-markup.h: New file.
	* pretty-print.cc: Include "pretty-print-markup.h" and
	"diagnostic-highlight-colors.h".
	(pretty_printer::format): Handle %e.
	(pretty_printer::pretty_printer): Handle new field
	m_show_highlight_colors.
	(pp_string_n): New.
	(pp_markup::context::begin_quote): New.
	(pp_markup::context::end_quote): New.
	(pp_markup::context::begin_color): New.
	(pp_markup::context::end_color): New.
	(highlight_colors::expected): New.
	(highlight_colors::actual): New.
	(highlight_colors::lhs): New.
	(highlight_colors::rhs): New.
	(class selftest::test_element): New.
	(selftest::test_pp_format): Add tests of %e.
	(selftest::test_urlification): Likewise.
	* pretty-print.h (pp_markup::context): New forward decl.
	(class chunk_info): Add friend class pp_markup::context.
	(class pretty_printer): Add friend pp_show_highlight_colors.
	(pretty_printer::m_show_highlight_colors): New field.
	(pp_show_highlight_colors): New inline function.
	(pp_string_n): New decl.
	* substring-locations.cc: Include "diagnostic-highlight-colors.h".
	(format_string_diagnostic_t::highlight_color_format_string): New.
	(format_string_diagnostic_t::highlight_color_param): New.
	(format_string_diagnostic_t::emit_warning_n_va): Use highlight
	colors.
	* substring-locations.h
	(format_string_diagnostic_t::highlight_color_format_string): New.
	(format_string_diagnostic_t::highlight_color_param): New.
	* toplev.cc (general_init): Initialize global_dc's
	show_highlight_colors.
	* tree-pretty-print-markup.h: New file.

gcc/cp/ChangeLog:
	* call.cc: Include "tree-pretty-print-markup.h".
	(implicit_conversion_error): Use highlight_colors::percent_h for
	the labelled range.
	(op_error_string): Split out into...
	(concat_op_error_string): ...this.
	(binop_error_string): New.
	(op_error): Use %e, binop_error_string, highlight_colors::lhs,
	and highlight_colors::rhs.
	(maybe_inform_about_fndecl_for_bogus_argument_init): Add
	"highlight_color" param; use it for the richloc.
	(convert_like_internal): Use highlight_colors::percent_h for the
	labelled_range, and highlight_colors::percent_i for the call to
	maybe_inform_about_fndecl_for_bogus_argument_init.
	(build_over_call): Pass cp_comp_parm_types for new "comp_types"
	param of check_function_arguments.
	(complain_about_bad_argument): Use highlight_colors::percent_h for
	the labelled_range, and highlight_colors::percent_i for the call
	to maybe_inform_about_fndecl_for_bogus_argument_init.
	* cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init):
	Add optional highlight_color param.
	(cp_comp_parm_types): New decl.
	(highlight_colors::const percent_h): New decl.
	(highlight_colors::const percent_i): New decl.
	* error.cc: Include "tree-pretty-print-markup.h".
	(highlight_colors::const percent_h): New defn.
	(highlight_colors::const percent_i): New defn.
	(type_to_string): Add param "highlight_color" and use it.
	(print_nonequal_arg): Likewise.
	(print_template_differences): Add params "highlight_color_a" and
	"highlight_color_b".
	(type_to_string_with_compare): Add params "this_highlight_color"
	and "peer_highlight_color".
	(print_template_tree_comparison): Add params "highlight_color_a"
	and "highlight_color_b".
	(cxx_format_postprocessor::handle):
	Use highlight_colors::percent_h and highlight_colors::percent_i.
	(pp_markup::element_quoted_type::print_type): New.
	(range_label_for_type_mismatch::get_text): Pass nullptr for new
	params of type_to_string_with_compare.
	* typeck.cc (cp_comp_parm_types): New.
	(cp_build_function_call_vec): Pass it to check_function_arguments.
	(convert_for_assignment): Use highlight_colors::percent_h for the
	labelled_range.

gcc/testsuite/ChangeLog:
	* g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test.
	* g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test.
	* g++.dg/plugin/plugin.exp (plugin_test_list): Add
	show-template-tree-color-no-highlight-colors.C to
	show_template_tree_color_plugin.c.
	* g++.dg/plugin/show-template-tree-color-labels.C: Update expected
	output to reflect use of highlight-a and highlight-b to contrast
	mismatches.
	* g++.dg/plugin/show-template-tree-color-no-elide-type.C:
	Likewise.
	* g++.dg/plugin/show-template-tree-color-no-highlight-colors.C:
	New test.
	* g++.dg/plugin/show-template-tree-color.C: Update expected output
	to reflect use of highlight-a and highlight-b to contrast
	mismatches.
	* g++.dg/warn/Wformat-gcc_diag-1.C: New test.
	* g++.dg/warn/Wformat-gcc_diag-2.C: New test.
	* g++.dg/warn/Wformat-gcc_diag-3.C: New test.
	* gcc.dg/bad-binary-ops-highlight-colors.c: New test.
	* gcc.dg/format/colors.c: New test.
	* gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass
	nullptr for new param of gcc_rich_location::add_expr.

libcpp/ChangeLog:
	* include/rich-location.h (location_range::m_highlight_color): New
	field.
	(rich_location::rich_location): Add optional label_highlight_color
	param.
	(rich_location::set_highlight_color): New decl.
	(rich_location::add_range): Add optional label_highlight_color
	param.
	(rich_location::set_range): Likewise.
	* line-map.cc (rich_location::rich_location): Add
	"label_highlight_color" param and pass it to add_range.
	(rich_location::set_highlight_color): New.
	(rich_location::add_range): Add "label_highlight_color" param.
	(rich_location::set_range): Add "highlight_color" param.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

7d73c01c

Jun 22, 2024
- Daily bump. · 69fdcd0c
  GCC Administrator authored 9 months ago
  
  69fdcd0c
Jun 21, 2024

diagnostics: fixes to SARIF output [PR109360] · 9f4fdc3a

David Malcolm authored 9 months ago


When adding validation of .sarif files against the schema
(PR testsuite/109360) I discovered various issues where we were
generating invalid .sarif files.

Specifically, in
  c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
the relatedLocations for the "note" diagnostics were missing column
numbers, leading to validation failure due to non-unique elements,
such as multiple:
	"message": {"text": "invalid UTF-8 character <bf>"}},
on line 25 with no column information.

The root cause is that for some diagnostics in libcpp we have a location_t
representing the line as a whole, setting a column_override on the
rich_location (since the line hasn't been fully read yet).  We were
handling this column override for plain text output, but not for .sarif
output.

Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
emitted on "line 0" of the file, whereas SARIF requires line numbers to
be positive.

We also use column == 0 internally to mean "the line as a whole",
whereas SARIF requires column numbers to be positive.

This patch fixes these various issues.

gcc/ChangeLog:
	PR testsuite/109360
	* diagnostic-format-sarif.cc
	(sarif_builder::make_location_object): Pass any column override
	from rich_loc to maybe_make_physical_location_object.
	(sarif_builder::maybe_make_physical_location_object): Add
	"column_override" param and pass it to maybe_make_region_object.
	(sarif_builder::maybe_make_region_object): Add "column_override"
	param and use it when the location has 0 for a column.  Don't
	add "startLine", "startColumn", "endLine", or "endColumn" if
	the values aren't positive.
	(sarif_builder::maybe_make_region_object_for_context): Don't
	add "startLine" or "endLine" if the values aren't positive.
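
The rules in this entry can be sketched as follows: substitute the column override when a column is 0, and omit any property whose value is not positive. This is a simplified illustration, not sarif_builder::maybe_make_region_object; make_region is a hypothetical name, a std::map stands in for the JSON object, and the real code's handling of end columns is not modelled.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: build the properties of a SARIF-like region.
std::map<std::string, int> make_region(int start_line, int start_column,
                                       int end_line, int end_column,
                                       int column_override)
{
  // Column 0 means "the line as a whole"; fall back to the override.
  if (start_column == 0)
    start_column = column_override;
  if (end_column == 0)
    end_column = column_override;

  std::map<std::string, int> region;
  auto add_if_positive = [&region](const char *name, int value) {
    if (value > 0)  // non-positive values are omitted entirely
      region[name] = value;
  };
  add_if_positive("startLine", start_line);
  add_if_positive("startColumn", start_column);
  add_if_positive("endLine", end_line);
  add_if_positive("endColumn", end_column);
  return region;
}

void demo()
{
  // A whole-line location on line 25 with a column override of 7.
  const auto with_override = make_region(25, 0, 25, 0, 7);
  assert(with_override.size() == 4);
  assert(with_override.at("startColumn") == 7);

  // A "line 0" location with no override yields no properties at all.
  assert(make_region(0, 0, 0, 0, 0).empty());
}
```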

libcpp/ChangeLog:
	PR testsuite/109360
	* include/rich-location.h (rich_location::get_column_override):
	New accessor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

9f4fdc3a