  1. Oct 10, 2024
  2. Oct 09, 2024
  3. Oct 08, 2024
    • contrib, libcpp, libstdc++: Update to Unicode 16.0 · d0e8f58b
      Jakub Jelinek authored
      It is autumn again and there is a new Unicode version 16.0.
      
      The following patch updates our Unicode stuff in contrib, libcpp and
      libstdc++ from that Unicode version.
      
      2024-10-08  Jakub Jelinek  <jakub@redhat.com>
      
      contrib/
      	* unicode/README: Update glibc git commit hash, replace
      	Unicode 15 or 15.1 versions with 16.
      	* unicode/gen_libstdcxx_unicode_data.py: Use 160000 instead of
      	150100 in _GLIBCXX_GET_UNICODE_DATA test.
      	* unicode/from_glibc/utf8_gen.py: Updated from glibc
      	064c708c78cc2a6b5802dce73108fc0c1c6bfc80 commit.
      	* unicode/DerivedCoreProperties.txt: Updated from Unicode 16.0.
      	* unicode/emoji-data.txt: Likewise.
      	* unicode/PropList.txt: Likewise.
      	* unicode/GraphemeBreakProperty.txt: Likewise.
      	* unicode/DerivedNormalizationProps.txt: Likewise.
      	* unicode/NameAliases.txt: Likewise.
      	* unicode/UnicodeData.txt: Likewise.
      	* unicode/EastAsianWidth.txt: Likewise.
      gcc/testsuite/
      	* c-c++-common/cpp/named-universal-char-escape-1.c: Add tests
      	for some Unicode 16.0 characters, both normal and generated.
      libcpp/
      	* makeucnid.cc (write_copyright): Update Unicode Copyright years.
      	* makeuname2c.cc (generated_ranges): Adjust Unicode version from 15.1
      	to 16.0.  Add EGYPTIAN HIEROGLYPH- generated range, adjust indexes in
      	following entries.
      	(write_copyright): Update Unicode Copyright years.
      	* generated_cpp_wcwidth.h: Regenerated.
      	* ucnid.h: Regenerated.
      	* uname2c.h: Regenerated.
      libstdc++-v3/
      	* include/bits/unicode.h (std::__unicode::__v15_1_0): Rename inline
      	namespace to ...
      	(std::__unicode::__v16_0_0): ... this.
      	(_GLIBCXX_GET_UNICODE_DATA): Change from 150100 to 160000.
      	* include/bits/unicode-data.h: Regenerated.
      	* testsuite/ext/unicode/properties.cc: Check for _Gcb_SpacingMark
      	on U+11F03 rather than U+1D16D as the latter lost SpacingMark property
      	in Unicode 16.0.
    • Daily bump. · 14870c1f
      GCC Administrator authored
  4. Oct 07, 2024
    • libcpp: Use constexpr for _cpp_trigraph_map initialization for C++14 · e4c0595e
      Jakub Jelinek authored
      The _cpp_trigraph_map initialization used to be done for C99+ using
      designated initializers, but that can't be done for C++, because
      array designators are only an extension there and don't allow
      skipping elements or going backwards.
      
      But, we can get the same effect using C++14 constexpr constructor.
      With the following patch we get rid of the runtime initialization
      and the array can be in .rodata.
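
A minimal sketch of the technique, not the actual libcpp code: the type and
variable names below (trigraph_map_s, trigraph_map_d, set, lookup) are
hypothetical stand-ins for the _cpp_trigraph_map_s/_cpp_trigraph_map_d names
mentioned in the ChangeLog. It relies on C++14 relaxed constexpr, which allows
assignments inside a constexpr constructor body.

```cpp
#include <cassert>

// Hypothetical lookup table: maps the third character of a trigraph
// (e.g. '=' in "??=") to its replacement; every other entry stays 0.
struct trigraph_map_s
{
  unsigned char map[256];

  // C++14 allows a constexpr non-const member function returning void.
  constexpr void set (char from, char to)
  {
    map[(unsigned char) from] = (unsigned char) to;
  }

  constexpr unsigned char lookup (char c) const
  {
    return map[(unsigned char) c];
  }

  // Zero the array first, then fill entries in arbitrary order,
  // which is what designated initializers would have allowed in C99.
  constexpr trigraph_map_s () : map {}
  {
    set ('=', '#');  set (')', ']');  set ('!', '|');
    set ('(', '[');  set ('\'', '^'); set ('>', '}');
    set ('/', '\\'); set ('<', '{');  set ('-', '~');
  }
};

// Constant-initialized, so no runtime initialization is needed.
constexpr trigraph_map_s trigraph_map_d {};

static_assert (trigraph_map_d.lookup ('=') == '#', "??= maps to #");
static_assert (trigraph_map_d.lookup ('a') == 0, "non-trigraph char");
```

Because the object is constant-initialized, a compiler is free to place it in
read-only data; whether it does so depends on the target and options.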
      
      2024-10-07  Jakub Jelinek  <jakub@redhat.com>
      
      	* internal.h (_cpp_trigraph_map_s): New type for C++14 or later.
      	(_cpp_trigraph_map_d): New variable for C++14 or later.
      	(_cpp_trigraph_map): Define to _cpp_trigraph_map_d.map for C++14 or
      	later.
      	* init.cc (init_trigraph_map): Define to nothing for C++14 or later.
      	(TRIGRAPH_MAP, END, s): Define differently for C++14 or later.
  5. Oct 03, 2024
  6. Oct 02, 2024
    • libcpp: Implement clang -Wheader-guard warning [PR96842] · 5943a2fa
      Jakub Jelinek authored
      The following patch implements the clang -Wheader-guard warning, which warns
      if a valid multiple inclusion header guard's #ifndef/#if !defined directive
      is immediately (no other non-line directives nor other (non-comment)
      tokens in between) followed by #define directive for some different macro,
      which in get_suggestion rules is close enough to the actual header guard
      macro (i.e. likely misspelling), the #define is object-like with empty
      definition (I've followed what clang implements) and the macro isn't defined
      later on (at least not on the final #endif at the end of a header).
      
      In this case it emits a warning, so that
        #ifndef STDIO_H
        #define STDOI_H
        ...
        #endif
      or similar misspellings can be caught.
      
      clang enables this warning by default, but I've put it into -Wall instead
      as it still seems to be a style warning, nothing more severe; if a header
      doesn't survive multiple inclusion because of the misspelling, users will
      get different diagnostics.
      
      2024-10-02  Jakub Jelinek  <jakub@redhat.com>
      
      	PR preprocessor/96842
      libcpp/
      	* include/cpplib.h (struct cpp_options): Add warn_header_guard member.
      	(enum cpp_warning_reason): Add CPP_W_HEADER_GUARD enumerator.
      	* internal.h (struct cpp_reader): Add mi_def_cmacro, mi_loc and
      	mi_def_loc members.
      	(_cpp_defined_macro_p): Constify type pointed by argument type.
      	Formatting fix.
      	* init.cc (cpp_create_reader): Clear
      	CPP_OPTION (pfile, warn_header_guard).
      	* directives.cc (struct if_stack): Add def_loc and mi_def_cmacro
      	members.
      	(DIRECTIVE_TABLE): Add IF_COND flag to define.
      	(do_define): Set ifs->mi_def_cmacro on a define immediately following
      	#ifndef directive for the guard.  Clear pfile->mi_valid.  Formatting
      	fix.
      	(do_endif): Copy over pfile->mi_def_cmacro and pfile->mi_def_loc
      	if ifs->mi_def_cmacro is set and pfile->mi_cmacro isn't a defined
      	macro.
      	(push_conditional): Clear mi_def_cmacro and mi_def_loc members.
      	* files.cc (_cpp_pop_file_buffer): Emit -Wheader-guard diagnostics.
      gcc/
      	* doc/invoke.texi (Wheader-guard): Document.
      gcc/c-family/
      	* c.opt (Wheader-guard): New option.
      	* c.opt.urls: Regenerated.
      	* c-ppoutput.cc (init_pp_output): Initialize also cb->get_suggestion.
      gcc/testsuite/
      	* c-c++-common/cpp/Wheader-guard-1.c: New test.
      	* c-c++-common/cpp/Wheader-guard-1-1.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-2.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-3.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-4.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-5.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-6.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-7.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-8.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-9.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-10.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-11.h: New test.
      	* c-c++-common/cpp/Wheader-guard-1-12.h: New test.
      	* c-c++-common/cpp/Wheader-guard-2.c: New test.
      	* c-c++-common/cpp/Wheader-guard-2.h: New test.
      	* c-c++-common/cpp/Wheader-guard-3.c: New test.
      	* c-c++-common/cpp/Wheader-guard-3.h: New test.
  7. Sep 20, 2024
  8. Sep 19, 2024
  9. Sep 14, 2024
  10. Sep 13, 2024
    • libcpp: Fix up UB in finish_embed · 4963eb76
      Jakub Jelinek authored
      Jonathan reported on IRC that a certain unnamed proprietary static
      analyzer is unhappy about the new finish_embed function, and it is
      actually right.
      On a testcase like:
       #embed __FILE__ limit (0) if_empty (0)
      params->if_empty.count is 1, limit is 0, so count is 0 (we need just
      a single token and one fits into pfile->directive_result).  Because
      count is 0, we don't allocate toks, so it stays NULL, and then in
        if (prefix->count)
          {
            *tok = *prefix->base_run.base;
            tok = toks;
            tokenrun *cur_run = &prefix->base_run;
            while (cur_run)
              {
                size_t cnt = (cur_run->next ? cur_run->limit
                              : prefix->cur_token) - cur_run->base;
                cpp_token *t = cur_run->base;
                if (cur_run == &prefix->base_run)
                  {
                    t++;
                    cnt--;
                  }
                memcpy (tok, t, cnt * sizeof (cpp_token));
                tok += cnt;
                cur_run = cur_run->next;
              }
          }
      the *tok = *prefix->base_run.base; assignment will copy the only
      token.  cur_run is still non-NULL, cnt will be initially 1 and
      then decremented to 0, but we invoke UB because we do
      memcpy (NULL, cur_run->base + 1, 0 * sizeof (cpp_token));
      and then the loop stops because cur_run->next must be NULL.
      
      As we don't really copy anything, toks can be anything non-NULL,
      so the following patch fixes that by initializing toks also to
      &pfile->directive_result (just something known to be non-NULL).
      This should be harmless even for the
       #embed __FILE__ limit (1)
      case (no non-empty prefix/suffix) where toks isn't allocated
      either, but in that case prefix->count will be 0 and in the
        for (size_t i = 0; i < limit; ++i)
          {
            tok->src_loc = params->loc;
            tok->type = CPP_NUMBER;
            tok->flags = NO_EXPAND;
            if (i == 0)
              tok->flags |= PREV_WHITE;
            tok->val.str.text = s;
            tok->val.str.len = sprintf ((char *) s, "%d", buffer[i]);
            s += tok->val.str.len + 1;
            if (tok == &pfile->directive_result)
              tok = toks;
            else
              tok++;
            if (i < limit - 1)
              {
                tok->src_loc = params->loc;
                tok->type = CPP_COMMA;
                tok->flags = NO_EXPAND;
                tok++;
              }
          }
      loop limit will be 1, so tok is initially &pfile->directive_result,
      that is filled in, then tok = toks; (previously setting tok to NULL,
      now to &pfile->directive_result again) and because 0 < 1 - 1 is
      false, nothing further will happen and the loop will finish (and as
      params->suffix.count will be 0, nothing further will use tok).
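
For illustration only, here is a sketch of the other common way to avoid this
class of UB: skip the memcpy call entirely when the count is zero, so a NULL
destination is never passed. This is not what the patch does (it initializes
toks to a known non-NULL pointer instead), and copy_ints is a hypothetical
helper, not a libcpp function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy CNT ints from SRC to DST and return the
// number of elements copied.  Passing a null pointer to memcpy is
// undefined behavior even when the size is 0, so the call is skipped
// in that case and DST may then legitimately be null.
std::size_t
copy_ints (int *dst, const int *src, std::size_t cnt)
{
  if (cnt == 0)
    return 0;
  std::memcpy (dst, src, cnt * sizeof (int));
  return cnt;
}
```

The patch's approach has the advantage of leaving the copy loop untouched;
the guard above is the more defensive choice when the pointer's provenance is
not under the caller's control.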
      
      2024-09-13  Jakub Jelinek  <jakub@redhat.com>
      
      	* files.cc (finish_embed): Initialize toks to tok rather
      	than NULL.
    • Daily bump. · 3d021a02
      GCC Administrator authored
  11. Sep 12, 2024
    • libcpp, v2: Add support for gnu::base64 #embed parameter · ce0aecc7
      Jakub Jelinek authored
      This patch adds another #embed extension, gnu::base64.
      
      As mentioned in the documentation, this extension is primarily
      intended for use by the preprocessor, so that for larger embeds (say
      32+ or 64+ bytes long) it doesn't have to emit tens of thousands or
      millions of comma-separated integer literals, which would be very
      expensive to parse again, but can emit
       #embed "." __gnu__::__base64__( \
       "Tm9uIGVyYW0gbsOpc2NpdXMsIEJydXRlLCBjdW0sIHF1w6Ygc3VtbWlzIGluZ8OpbmlpcyBleHF1" \
       "aXNpdMOhcXVlIGRvY3Ryw61uYSBwaGlsw7Nzb3BoaSBHcsOmY28gc2VybcOzbmUgdHJhY3RhdsOt" \
       "c3NlbnQsIGVhIExhdMOtbmlzIGzDrXR0ZXJpcyBtYW5kYXLDqW11cywgZm9yZSB1dCBoaWMgbm9z" \
       "dGVyIGxhYm9yIGluIHbDoXJpYXMgcmVwcmVoZW5zacOzbmVzIGluY8O6cnJlcmV0LiBuYW0gcXVp" \
       "YsO6c2RhbSwgZXQgaWlzIHF1aWRlbSBub24gw6FkbW9kdW0gaW5kw7NjdGlzLCB0b3R1bSBob2Mg" \
       "ZMOtc3BsaWNldCBwaGlsb3NvcGjDoXJpLiBxdWlkYW0gYXV0ZW0gbm9uIHRhbSBpZCByZXByZWjD" \
       "qW5kdW50LCBzaSByZW3DrXNzaXVzIGFnw6F0dXIsIHNlZCB0YW50dW0gc3TDumRpdW0gdGFtcXVl" \
       "IG11bHRhbSDDs3BlcmFtIHBvbsOpbmRhbSBpbiBlbyBub24gYXJiaXRyw6FudHVyLiBlcnVudCDD" \
       "qXRpYW0sIGV0IGlpIHF1aWRlbSBlcnVkw610aSBHcsOmY2lzIGzDrXR0ZXJpcywgY29udGVtbsOp" \
       "bnRlcyBMYXTDrW5hcywgcXVpIHNlIGRpY2FudCBpbiBHcsOmY2lzIGxlZ8OpbmRpcyDDs3BlcmFt" \
       "IG1hbGxlIGNvbnPDum1lcmUuIHBvc3Ryw6ltbyDDoWxpcXVvcyBmdXTDunJvcyBzw7pzcGljb3Is" \
       "IHF1aSBtZSBhZCDDoWxpYXMgbMOtdHRlcmFzIHZvY2VudCwgZ2VudXMgaG9jIHNjcmliw6luZGks" \
       "IGV0c2kgc2l0IGVsw6lnYW5zLCBwZXJzw7Nuw6YgdGFtZW4gZXQgZGlnbml0w6F0aXMgZXNzZSBu" \
       "ZWdlbnQu")
      with the meaning don't actually load some file, instead base64 decode
      (RFC4648 with A-Za-z0-9+/ chars and = padding, no newlines in between)
      the string and use that as data.  This is chosen because it should be
      -pedantic-errors clean, fairly cheap to decode and then in optimizing
      compiler could be handled as similar binary blob to normal #embed,
      while the data isn't left somewhere on the disk, so distcc/ccache etc.
      can move the preprocessed source without issues.
      It makes no sense to support limit and gnu::offset parameters together
      with it IMHO; why would somebody bother providing the full data and then
      throw some of it away?  prefix/suffix/if_empty are supported as usual,
      but are not intended to be used by the preprocessor.
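
To make the decoding rules above concrete (RFC 4648 alphabet, = padding only
at the end, no embedded newlines), here is a small standalone decoder sketch.
It is not the libcpp implementation (which, per the ChangeLog, uses a
base64_dec table); b64_value and b64_decode are hypothetical names, and the
sketch assumes an ASCII-compatible execution character set. It does not
reject non-zero leftover bits before padding.

```cpp
#include <cassert>
#include <cstddef>

// Return the 6-bit value of base64 character C, or -1 if invalid.
int
b64_value (unsigned char c)
{
  if (c >= 'A' && c <= 'Z') return c - 'A';
  if (c >= 'a' && c <= 'z') return c - 'a' + 26;
  if (c >= '0' && c <= '9') return c - '0' + 52;
  if (c == '+') return 62;
  if (c == '/') return 63;
  return -1;
}

// Decode LEN characters of IN into OUT, which must have room for
// LEN / 4 * 3 bytes.  Returns the number of bytes written, or -1 on
// malformed input (bad length, bad character, misplaced padding).
long
b64_decode (const char *in, std::size_t len, unsigned char *out)
{
  std::size_t n = 0;
  if (len % 4 != 0)
    return -1;
  for (std::size_t i = 0; i < len; i += 4)
    {
      int v[4];
      int pad = 0;
      for (int j = 0; j < 4; j++)
        {
          unsigned char c = (unsigned char) in[i + j];
          // '=' is only valid in the last two slots of the final group.
          if (c == '=' && i + 4 == len && j >= 2)
            {
              v[j] = 0;
              pad++;
            }
          else if (pad != 0 || (v[j] = b64_value (c)) < 0)
            return -1;
        }
      unsigned w = ((unsigned) v[0] << 18) | ((unsigned) v[1] << 12)
                   | ((unsigned) v[2] << 6) | (unsigned) v[3];
      out[n++] = (unsigned char) ((w >> 16) & 0xff);
      if (pad < 2)
        out[n++] = (unsigned char) ((w >> 8) & 0xff);
      if (pad < 1)
        out[n++] = (unsigned char) (w & 0xff);
    }
  return (long) n;
}
```

For example, the first four characters of the data shown above, "Tm9u",
decode to the three bytes "Non".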
      
      This patch adds just the extension side, not the actual emitting of this
      during -E or -E -fdirectives-only for now, that will be included in the
      upcoming patch.
      
      Compared to the earlier posted version of this extension, this patch
      allows the string concatenation in the parameter argument (but still
      doesn't allow escapes in the string, why would anyone use them when
      only A-Za-z0-9+/= are valid).  The patch also adds support for parsing
      this even in -fpreprocessed compilation.
      
      2024-09-12  Jakub Jelinek  <jakub@redhat.com>
      
      libcpp/
      	* internal.h (struct cpp_embed_params): Add base64 member.
      	(_cpp_free_embed_params_tokens): Declare.
      	* directives.cc (DIRECTIVE_TABLE): Add IN_I flag to T_EMBED.
      	(save_token_for_embed, _cpp_free_embed_params_tokens): New functions.
      	(EMBED_PARAMS): Add gnu::base64 entry.
      	(_cpp_parse_embed_params): Parse gnu::base64 parameter.  If
      	-fpreprocessed without -fdirectives-only, require #embed to have
      	gnu::base64 parameter.  Diagnose conflict between gnu::base64 and
      	limit or gnu::offset parameters.
      	(do_embed): Use _cpp_free_embed_params_tokens.
      	* files.cc (finish_embed, base64_dec_fn): New functions.
      	(base64_dec): New array.
      	(B64D0, B64D1, B64D2, B64D3): Define.
      	(finish_base64_embed): New function.
      	(_cpp_stack_embed): Use finish_embed.  Handle params->base64
      	using finish_base64_embed.
      	* macro.cc (builtin_has_embed): Call _cpp_free_embed_params_tokens.
      gcc/
      	* doc/cpp.texi (Binary Resource Inclusion): Document gnu::base64
      	parameter.
      gcc/testsuite/
      	* c-c++-common/cpp/embed-17.c: New test.
      	* c-c++-common/cpp/embed-18.c: New test.
      	* c-c++-common/cpp/embed-19.c: New test.
      	* c-c++-common/cpp/embed-27.c: New test.
      	* gcc.dg/cpp/embed-6.c: New test.
      	* gcc.dg/cpp/embed-7.c: New test.
    • libcpp: adjust pedwarn handling · c5009eb8
      Jason Merrill authored
      Using cpp_pedwarning (CPP_W_PEDANTIC instead of if (CPP_PEDANTIC cpp_error
      lets users suppress these diagnostics with
       #pragma GCC diagnostic ignored "-Wpedantic".
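
A small usage sketch, assuming the pedwarn for the non-ISO "\e" escape is
among those converted (the ChangeLog below lists convert_escape in
charset.cc). The variable name reset_seq is made up for the example; "\e" is
a GNU extension that GCC treats as the ESC character (value 27).

```cpp
#include <cassert>

// With -Wpedantic (or -pedantic-errors) the "\e" escape is diagnosed
// as non-ISO.  The push/ignored/pop sequence is intended to silence
// that diagnostic for just this declaration.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
const char reset_seq[] = "\e[0m";
#pragma GCC diagnostic pop
```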
      
      This patch changes all instances of the cpp_error (CPP_DL_PEDWARN to
      cpp_pedwarning.  In cases where the extension appears in a later C++
      revision, we now condition the warning on the relevant -Wc++??-extensions
      flag instead of -Wpedantic; in such cases often the if (CPP_PEDANTIC) check
      is retained to preserve the default non-warning behavior.
      
      I didn't attempt to adjust the warning flags for the C compiler, since it
      seems to follow a different system than C++.
      
      The CPP_PEDANTIC check is also kept in _cpp_lex_direct to avoid an ICE in
      the self-tests from cb.diagnostics not being initialized.
      
      While working on testcases for these changes I noticed that the c-c++-common
      tests are not run with -pedantic-errors by default like the gcc.dg and
      g++.dg directories are.  And if I specify -pedantic-errors with dg-options,
      the default -std= changes from c++?? to gnu++??, which interferes with some
      other pedwarns.  So two of the tests are C++-only.
      
      libcpp/ChangeLog:
      
      	* include/cpplib.h (enum cpp_warning_reason): Add
      	CPP_W_CXX{14,17,20,23}_EXTENSIONS.
      	* charset.cc (_cpp_valid_ucn, convert_hex, convert_oct)
      	(convert_escape, narrow_str_to_charconst): Use cpp_pedwarning
      	instead of cpp_error for pedwarns.
      	* directives.cc (directive_diagnostics, _cpp_handle_directive)
      	(do_line, do_elif): Likewise.
      	* expr.cc (cpp_classify_number, eval_token): Likewise.
      	* lex.cc (skip_whitespace, maybe_va_opt_error)
      	(_cpp_lex_direct): Likewise.
      	* macro.cc (_cpp_arguments_ok): Likewise.
      	(replace_args): Use -Wvariadic-macros for pedwarn about
      	empty macro arguments.
      
      gcc/c-family/ChangeLog:
      
      	* c.opt: Add CppReason for Wc++{14,17,20,23}-extensions.
      	* c-pragma.cc (handle_pragma_diagnostic_impl): Don't check
      	OPT_Wc__23_extensions.
      
      gcc/testsuite/ChangeLog:
      
      	* c-c++-common/pragma-diag-17.c: New test.
      	* g++.dg/cpp0x/va-opt1.C: New test.
      	* g++.dg/cpp23/named-universal-char-escape3.C: New test.
    • libcpp: Add support for gnu::offset #embed/__has_embed parameter · 44058b84
      Jakub Jelinek authored
      The following patch adds, on top of the just posted #embed patch,
      a first extension, gnu::offset, which allows seeking in the data
      file (for seekable files; otherwise the bytes are read and thrown away).
      I think this is useful e.g. when some binary data start with
      some well known header which shouldn't be included in the data etc.
      
      2024-09-12  Jakub Jelinek  <jakub@redhat.com>
      
      libcpp/
      	* internal.h (struct cpp_embed_params): Add offset member.
      	* directives.cc (EMBED_PARAMS): Add gnu::offset entry.
      	(enum embed_param_kind): Add NUM_EMBED_STD_PARAMS.
      	(_cpp_parse_embed_params): Use NUM_EMBED_STD_PARAMS rather than
      	NUM_EMBED_PARAMS when parsing standard parameters.  Parse gnu::offset
      	parameter.
      	* files.cc (struct _cpp_file): Add offset member.
      	(_cpp_stack_embed): Handle params->offset.
      gcc/
      	* doc/cpp.texi (Binary Resource Inclusion): Document gnu::offset
      	#embed parameter.
      gcc/testsuite/
      	* c-c++-common/cpp/embed-15.c: New test.
      	* c-c++-common/cpp/embed-16.c: New test.
      	* gcc.dg/cpp/embed-5.c: New test.
    • libcpp, c-family: Add (dumb) C23 N3017 #embed support [PR105863] · eba6d2aa
      Jakub Jelinek authored
      The following patch implements the C23 N3017 "#embed - a scannable,
      tooling-friendly binary resource inclusion mechanism" paper.
      
      The implementation is intentionally dumb, in that it doesn't significantly
      speed up compilation of larger initializers and doesn't make it possible
      to use huge #embeds (like several gigabytes large, that is compile time
      and memory still infeasible).
      There are 2 reasons for this.  One is that I think the way it is
      implemented now in the patch is how we should handle the smaller #embed
      sizes (I don't know where the boundary should be, whether 32 bytes or 64
      or something like that).  Certainly handling the single byte cases, which
      can appear anywhere in the source where a constant integer literal can
      appear, is desirable, and I think for a few bytes it isn't worth coming
      up with something smarter; users would also like to e.g. see it in -E
      output readably (perhaps the slow vs. fast boundary should be determined
      by a command line option).  The other one is to be able to more easily
      find regressions in behavior caused by the optimizations, so we have
      something to go back to in git to compare against.
      I'm definitely willing to work on the optimizations (likely introduce a new
      CPP_* token type to refer to a range of libcpp owned memory (start + size)
      and similarly some tree which can do the same, and can be at any time e.g.
      split into 2 subparts + say INTEGER_CST in between if needed say for
      const unsigned char d[] = {
       #embed "2GB.dat" prefix (0, 0, ) suffix (, [0x40000000] = 42)
      }; still without having to copy around huge amounts of data; STRING_CST
      owns the memory it points to and can be only 2GB in size), but would
      like to do that incrementally.
      And would like to first include some extensions also not included in
      this patch, like gnu::offset (off) parameter to allow to skip certain
      constant amount of bytes at the start of the files, plus
      gnu::base64 ("base64_encoded_data") parameter to add something which can
      store more efficiently large amounts of the #embed data in preprocessed
      source.
      
      I've been cross-checking all the tests also against the LLVM implementation
      https://github.com/llvm/llvm-project/pull/68620
      which has been for a few hours even committed to LLVM trunk but reverted
      afterwards.  LLVM now has the support committed and I admit I haven't
      rechecked whether the behavior on the below mentioned spots have been fixed
      in it already or not yet.
      
      The patch uses --embed-dir= option that clang plans to add above and doesn't
      use other variants on the search directories yet, plus there are no
      default directories at least for the time being where to search for embed
      files.  So, #embed "..." works if it is found in the same directory (or
      relative to the current file's directory) and #embed "/..." or #embed </...>
      work always, but relative #embed <...> doesn't unless at least one
      --embed-dir= is specified.  There is no reason to differentiate between
      system and non-system directories, so we don't need -isystem like
      counterpart, perhaps -iquote like counterpart could be useful in the future,
      dunno what else.  It has --embed-directory=dir and --embed-directory dir
      as aliases.
      
      There are some differences beyond clang ICEs, so I'd like to point them out
      to make sure there is agreement on the choices in the patch.  They are also
      mentioned in the comments of the llvm pull request.
      
      The most important is that the GCC patch (as well as the original thephd.dev
      LLVM branch on godbolt) expands #embed (or acts as if it is expanded) into
      a mere sequence of numbers like 123,2,35,26 rather than what clang
      effectively treats as (unsigned char)123,(unsigned char)2,(unsigned
      char)35,(unsigned char)26; clang only does that when using the integrated
      preprocessor, not when using -save-temps, where it acts as GCC does.
      JeanHeyd as the original author agrees that is how it is currently worded in
      C23.
      
      Another difference (not tested in the testsuite, not sure how to check for
      effective target /dev/urandom nor am sure it is desirable to check that
      during testsuite) is how to treat character devices, named pipes etc.
      (block devices are errored on).  The original paper uses /dev/urandom
      in various examples and seems to assume that unlike regular files the
      devices aren't really cached, so
       #embed </dev/urandom> limit(1) prefix(int a = ) suffix(;)
       #embed </dev/urandom> limit(1) prefix(int b = ) suffix(;)
      usually results in a != b.  That is what the godbolt thephd.dev branch
      implements too and what this patch does as well, but clang actually seems
      to just go from st.st_size == 0, ergo it must be zero-sized resource and
      so just copies over if_empty if present.  It is really questionable
      what to do about the character devices/named pipes with __has_embed, for
      regular files the patch doesn't read anything from them, relies on
      st.st_size + limit for whether it is empty or non-empty.  But I don't know
      of a way to check if read on say a character device would read anything
      or not (the </dev/null> limit (1) vs. </dev/zero> limit (1) cases), and
      if we read something, that would be better cached for later because
       #embed later if it reads again could read no further data even when it
      first read something.  So, the patch currently for __has_embed just
      always returns 2 on the non-regular files, like the thephd.dev
      branch does as well and like the clang pull request as well.
      A question is also what to do for gnu::offset on the non-regular files
      even for #embed, those aren't seekable and do we want to just read and throw
      away the offset bytes each time we see it used?
      
      clang also chokes on the
       #if __has_embed (__FILE__ __limit__ (1) __prefix__ () suffix (1 / 0) \
       __if_empty__ ((({{[0[0{0{0(0(0)1)1}1}]]}})))) != __STDC_EMBED_FOUND__
       #error "__has_embed fail"
       #endif
      in embed-1.c, but thephd.dev branch accepts it and I don't see why
      it shouldn't, (({{[0[0{0{0(0(0)1)1}1}]]}}))) is a balanced token
      sequence and the file isn't empty, so it should just be parsed and
      discarded.
      
      clang also IMHO mishandles
       const unsigned char w[] = {
       #embed __FILE__ prefix([0] = 42, [15] =) limit(32)
       };
      but again only without -save-temps, seems like it
      treats it as
      [0] = 42, [15] = (99,111,110,115,116,32,117,110,115,105,103,110,101,100,
      32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98)
      rather than
      [0] = 42, [15] = 99,111,110,115,116,32,117,110,115,105,103,110,101,100,
      32,99,104,97,114,32,119,91,93,32,61,32,123,10,35,101,109,98
      and warns on it for -Wunused-value and just compiles it as
      [0] = 42, [15] = 98
      
      And also
       void foo (int, int, int, int);
       void bar (void) { foo (
       #embed __FILE__ limit (4) prefix (172 + ) suffix (+ 2)
       ); }
      is treated as
      172 + (118, 111, 105, 100) + 2
      rather than
      172 + 118, 111, 105, 100 + 2
      which clang -save-temps or GCC treats it like, so results
      in just one argument passed rather than 4.
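
The difference between the two readings can be reproduced without #embed by
writing the expansions out by hand. The sketch below uses array initializers
instead of a function call so the element count is directly observable;
flat_count and wrapped_value are hypothetical names. Compilers may warn
about the unused left operands of the comma operator in the second form.

```cpp
#include <cassert>
#include <cstddef>

// Plain token sequence: four separate initializers,
// 172 + 118, 111, 105 and 100 + 2.
std::size_t
flat_count ()
{
  int flat[] = { 172 + 118, 111, 105, 100 + 2 };
  return sizeof flat / sizeof flat[0];
}

// Parenthesized sequence: the commas become comma operators, so the
// group evaluates to its last operand and there is one initializer,
// 172 + 100 + 2.
int
wrapped_value (std::size_t *count)
{
  int wrapped[] = { 172 + (118, 111, 105, 100) + 2 };
  *count = sizeof wrapped / sizeof wrapped[0];
  return wrapped[0];
}
```

Here flat_count returns 4, while wrapped_value yields a single element with
value 274, matching the one-argument behavior described above.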
      
      if (!strstr ((const char *) magna_carta, "imprisonétur")) abort ();
      in the testcase fails as well, but in that case calling it in gdb succeeds:
      p ((char *(*)(char *, char *))__strstr_sse2) (magna_carta, "imprisonétur")
      $2 = 0x555555558d3c <magna_carta+11564> "imprisonétur aut disseisiátur"...
      so I guess they are just trying to constant evaluate strstr and do it
      incorrectly.
      
      They started with making the optimizations together in the initial patch
      set, so they don't have the luxury to compare if it is just because of
      the optimization they are trying to do or because that is how the
      feature works for them.  At least unless they use -save-temps for now.
      
      There is also different behavior between clang and gcc on -M or other
      dependency generating options.  Seems clang includes the __has_embed
      searched files in dependencies, while my patch doesn't.  But so does
      clang for __has_include and GCC doesn't.  Emitting a hard dependency
      on some header just because there was __has_include/__has_embed for it
      seems wrong to me, because (at least when properly written) the source
      likely doesn't mind if the file is missing, it will do something else,
      so a hard error from make because of it doesn't seem right.  Does
      make have some weaker dependencies, such that if some file can be remade
      it is but if it doesn't exist, it isn't fatal?
      
      I wonder whether #embed <non-existent-file> really needs to be fatal
      or whether we could simply after diagnosing it pretend the file exists
      and is empty.  For #include I think fatal errors make tons of sense,
      but perhaps for #embed which is more localized we'd get better error
      reporting if we didn't bail out immediately.  Note, both GCC and clang
      currently treat those as fatal errors.
      
      clang also added -dE option which with -E instead of preprocessing
      the #embed directives keeps them as is, but the preprocessed source
      then isn't self-contained.  That option looks more harmful than useful to
      me.
      
      Also, it isn't clear to me from C23 whether it is possible to have
      __has_include/__has_c_attribute/__has_embed expressions inside of
      the limit #embed/__has_embed argument.
      6.10.3.2/2 says that defined should not appear there (and the patch
      diagnoses it and testsuite tests), but for __has_include/__has_embed
      etc. 6.10.1/11 says:
      "The identifiers __has_include, __has_embed, and __has_c_attribute
      shall not appear in any context not mentioned in this subclause."
      If that subclause in that case means 6.10.1, then it presumably shouldn't
      appear in #embed in 6.10.3, but __has_embed is in 6.10.1...
      But 6.10.3.2/3 says that it should be parsed according to the 6.10.1
      rules.  Haven't included tests like
       #if __has_embed (__FILE__ limit (__has_embed (__FILE__ limit (1))))
      or
       #embed __FILE__ limit (__has_include (__FILE__))
      into the testsuite because of the doubts but I think the patch should
      handle those right now.
      
      The reason I've used Magna Carta text in some of the testcases is that
      I hope it shouldn't be copyrighted after the centuries and I'd strongly
      prefer not to have binary blobs in git after the xz backdoor lesson
      and wanted something larger which doesn't change all the time.
      
      Oh, BTW, I see in C23 draft 6.10.3.2 in Example 4
      if (f_source == NULL);
        return 1;
      (note the spurious semicolon after closing paren), has that been fixed
      already?
      
      Like the thephd.dev and clang implementations, the patch always macro
      expands the whole #embed and __has_embed directives except for the
      embed keyword.  That is most likely not what C23 says, my limited
      understanding right now is that in #embed one needs to parse the whole
      directive line with macro expansion disabled and check if it satisfies the
      grammar, if not, the whole directive is macro expanded, if yes, only
      the limit parameter argument is macro expanded and the prefix/suffix/if_empty
      arguments are maybe macro expanded when actually used (and not at all if
      unused).  And I think __has_embed macro expansion has conflicting rules.
      
      2024-09-12  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c/105863
      libcpp/
      	* include/cpplib.h: Implement C23 N3017 #embed - a scannable,
      	tooling-friendly binary resource inclusion mechanism paper.
      	(struct cpp_options): Add embed member.
      	(enum cpp_builtin_type): Add BT_HAS_EMBED.
      	(cpp_set_include_chains): Add another cpp_dir * argument to
      	the declaration.
      	* internal.h (enum include_type): Add IT_EMBED.
      	(struct cpp_reader): Add embed_include member.
      	(struct cpp_embed_params_tokens): New type.
      	(struct cpp_embed_params): New type.
      	(_cpp_get_token_no_padding): Declare.
      	(enum _cpp_find_file_kind): Add _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
      	(_cpp_stack_embed): Declare.
      	(_cpp_parse_expr): Change return type to cpp_num_part instead of
      	bool, change second argument from bool to const char * and add third
      	argument.
      	(_cpp_parse_embed_params): Declare.
      	* directives.cc (DIRECTIVE_TABLE): Add embed entry.
      	(end_directive): Don't call skip_rest_of_line for T_EMBED directive.
      	(_cpp_handle_directive): Return 2 rather than 1 for T_EMBED in
      	directives-only mode.
      	(parse_include): Don't call check_eol for T_EMBED directive.
      	(skip_balanced_token_seq): New function.
      	(EMBED_PARAMS): Define.
      	(enum embed_param_kind): New type.
      	(embed_params): New variable.
      	(_cpp_parse_embed_params): New function.
      	(do_embed): New function.
      	(do_if): Adjust _cpp_parse_expr caller.
      	(do_elif): Likewise.
      	* expr.cc (parse_defined): Diagnose defined in #embed or __has_embed
      	parameters.
      	(_cpp_parse_expr): Change return type to cpp_num_part instead of
      	bool, change second argument from bool to const char * and add third
      	argument.  Adjust function comment.  For #embed/__has_embed parameters
      	add an artificial CPP_OPEN_PAREN.  Use the second argument DIR
      	directly instead of string literals conditional on IS_IF.
      	For #embed/__has_embed parameter, stop on reaching CPP_CLOSE_PAREN
      	matching the artificial one.  Diagnose negative or too large embed
      	parameter operands.
      	(num_binary_op): Use #embed instead of #if for diagnostics if inside
      	#embed/__has_embed parameter.
      	(num_div_op): Likewise.
      	* files.cc (struct _cpp_file): Add limit member and embed bitfield.
      	(search_cache): Add IS_EMBED argument, formatting fix.  Skip over
      	files with different file->embed from the argument.
      	(find_file_in_dir): Don't call pch_open_file if file->embed.
      	(_cpp_find_file): Handle _cpp_FFK_EMBED and _cpp_FFK_HAS_EMBED.
      	(read_file_guts): Formatting fix.
      	(has_unique_contents): Ignore file->embed files.
      	(search_path_head): Handle IT_EMBED type.
      	(_cpp_stack_embed): New function.
      	(_cpp_get_file_stat): Formatting fix.
      	(cpp_set_include_chains): Add embed argument, save it to
      	pfile->embed_include and compute lens for the chain.
      	* init.cc (struct lang_flags): Add embed member.
      	(lang_defaults): Add embed initializers.
      	(cpp_set_lang): Initialize CPP_OPTION (pfile, embed).
      	(builtin_array): Add __has_embed entry.
      	(cpp_init_builtins): Predefine __STDC_EMBED_NOT_FOUND__,
      	__STDC_EMBED_FOUND__ and __STDC_EMBED_EMPTY__.
      	* lex.cc (cpp_directive_only_process): Handle #embed.
      	* macro.cc (cpp_get_token_no_padding): Rename to ...
      	(_cpp_get_token_no_padding): ... this.  No longer static.
      	(builtin_has_include_1): New function.
      	(builtin_has_include): Use it.  Use _cpp_get_token_no_padding
      	instead of cpp_get_token_no_padding.
      	(builtin_has_embed): New function.
      	(_cpp_builtin_macro_text): Handle BT_HAS_EMBED.
      gcc/
      	* doc/cppdiropts.texi (--embed-dir=): Document.
      	* doc/cpp.texi (Binary Resource Inclusion): New chapter.
      	(__has_embed): Document.
      	* doc/invoke.texi (Directory Options): Mention --embed-dir=.
      	* gcc.cc (cpp_unique_options): Add %{-embed*}.
      	* genmatch.cc (main): Adjust cpp_set_include_chains caller.
      	* incpath.h (enum incpath_kind): Add INC_EMBED.
      	* incpath.cc (merge_include_chains): Handle INC_EMBED.
      	(register_include_chains): Adjust cpp_set_include_chains caller.
      gcc/c-family/
      	* c.opt (-embed-dir=): New option.
      	(-embed-directory): New alias.
      	(-embed-directory=): New alias.
      	* c-opts.cc (c_common_handle_option): Handle OPT__embed_dir_.
      gcc/testsuite/
      	* c-c++-common/cpp/embed-1.c: New test.
      	* c-c++-common/cpp/embed-2.c: New test.
      	* c-c++-common/cpp/embed-3.c: New test.
      	* c-c++-common/cpp/embed-4.c: New test.
      	* c-c++-common/cpp/embed-5.c: New test.
      	* c-c++-common/cpp/embed-6.c: New test.
      	* c-c++-common/cpp/embed-7.c: New test.
      	* c-c++-common/cpp/embed-8.c: New test.
      	* c-c++-common/cpp/embed-9.c: New test.
      	* c-c++-common/cpp/embed-10.c: New test.
      	* c-c++-common/cpp/embed-11.c: New test.
      	* c-c++-common/cpp/embed-12.c: New test.
      	* c-c++-common/cpp/embed-13.c: New test.
      	* c-c++-common/cpp/embed-14.c: New test.
      	* c-c++-common/cpp/embed-25.c: New test.
      	* c-c++-common/cpp/embed-26.c: New test.
      	* c-c++-common/cpp/embed-dir/embed-1.inc: New test.
      	* c-c++-common/cpp/embed-dir/embed-3.c: New test.
      	* c-c++-common/cpp/embed-dir/embed-4.c: New test.
      	* c-c++-common/cpp/embed-dir/magna-carta.txt: New test.
      	* gcc.dg/cpp/embed-1.c: New test.
      	* gcc.dg/cpp/embed-2.c: New test.
      	* gcc.dg/cpp/embed-3.c: New test.
      	* gcc.dg/cpp/embed-4.c: New test.
      	* g++.dg/cpp/embed-1.C: New test.
      	* g++.dg/cpp/embed-2.C: New test.
      	* g++.dg/cpp/embed-3.C: New test.
      eba6d2aa
  12. Aug 28, 2024
  13. Aug 26, 2024
    • libcpp: deduplicate definition of padding size · a8260ebe
      Alexander Monakov authored
      Tie together the two functions that ensure tail padding with
      search_line_ssse3 via CPP_BUFFER_PADDING macro.
      
      libcpp/ChangeLog:
      
      	* internal.h (CPP_BUFFER_PADDING): New macro; use it ...
      	* charset.cc (_cpp_convert_input): ...here, and ...
      	* files.cc (read_file_guts): ...here, and ...
      	* lex.cc (search_line_ssse3): here.
      a8260ebe
  14. Aug 24, 2024
  15. Aug 23, 2024
  16. Aug 22, 2024
  17. Aug 21, 2024
  18. Aug 20, 2024
    • libcpp: Adjust lang_defaults · 447c32c5
      Jakub Jelinek authored
      The table over the years turned out to be very wide, 147 columns
      and any addition would add a couple of new ones.
      We need a 28x23 bit matrix right now.
      
      This patch changes the formatting, so that we need just 2 columns
      per new feature and so we have some room for expansion.
      In addition, the patch changes it to bitfields, which reduces
      .rodata by 532 bytes (so 5.75x reduction of the variable) and
      on x86_64-linux grows the cpp_set_lang function by 26 bytes (8.4%
      growth).
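
      The size figures above can be reproduced with a small sketch.  It
      assumes that the 28x23 matrix means 28 language rows with 23 flags
      each, and that a struct of 23 one-bit unsigned bit-fields occupies
      4 bytes (true on common ABIs such as x86_64-linux, but
      implementation-defined).  The struct and member names are
      placeholders, not the real struct lang_flags members.

```cpp
#include <cstddef>

// Placeholder for the old layout: one char per feature flag.
struct flags_as_chars {
  char f0, f1, f2, f3, f4, f5, f6, f7, f8, f9, f10, f11,
       f12, f13, f14, f15, f16, f17, f18, f19, f20, f21, f22;
};

// Placeholder for the new layout: one bit per feature flag.
struct flags_as_bits {
  unsigned f0 : 1, f1 : 1, f2 : 1, f3 : 1, f4 : 1, f5 : 1, f6 : 1, f7 : 1;
  unsigned f8 : 1, f9 : 1, f10 : 1, f11 : 1, f12 : 1, f13 : 1, f14 : 1;
  unsigned f15 : 1, f16 : 1, f17 : 1, f18 : 1, f19 : 1, f20 : 1, f21 : 1;
  unsigned f22 : 1;
};

// Assumed number of rows (language dialects) in the table.
constexpr std::size_t num_langs = 28;

// 28 * 23 = 644 bytes with char members.
constexpr std::size_t table_bytes_chars = num_langs * sizeof(flags_as_chars);
// 28 * 4 = 112 bytes where the bit-field struct is 4 bytes.
constexpr std::size_t table_bytes_bits = num_langs * sizeof(flags_as_bits);
```

      Under those assumptions the difference is 644 - 112 = 532 bytes and
      the ratio is 644 / 112 = 5.75, matching the numbers quoted above.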
      
      2024-08-20  Jakub Jelinek  <jakub@redhat.com>
      
      	* init.cc (struct lang_flags): Change all members from char
      	typed fields to unsigned bit-fields.
      	(lang_defaults): Change formatting of the initializer so that it
      	fits to 68 columns rather than 147.
      447c32c5
    • libcpp: replace SSE4.2 helper with an SSSE3 one · 20a5b482
      Alexander Monakov authored
      Since the characters we are searching for (CR, LF, '\', '?') all have
      distinct ASCII codes mod 16, PSHUFB can help match them all at once.
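
      A scalar sketch of the idea, one byte at a time instead of 16 per
      PSHUFB: the low nibble of each input byte selects a slot in a
      16-entry table, and the byte is one of the four characters exactly
      when the selected slot equals the byte itself.  The table contents
      and the name is_line_special are illustrative and not taken from
      lex.cc.

```cpp
#include <cstdint>

// Low nibbles: '\n' = 0x0A, '\\' = 0x5C, '\r' = 0x0D, '?' = 0x3F, so the
// four characters land in slots 10, 12, 13 and 15 without colliding.
// Slot 0 holds 1 so that a NUL byte (nibble 0) never equals its slot.
static const std::uint8_t nibble_table[16] = {
  1, 0, 0, 0, 0, 0, 0, 0, 0, 0, '\n', 0, '\\', '\r', 0, '?'
};

// Returns true for CR, LF, backslash and question mark only.
bool is_line_special(std::uint8_t c) {
  // PSHUFB yields zero for index bytes with the top bit set; mimic that.
  std::uint8_t looked_up = (c & 0x80) ? 0 : nibble_table[c & 0x0F];
  return looked_up == c;
}
```

      The vector version would do the table lookup for 16 bytes with one
      shuffle and the comparison with one byte-wise equality test.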
      
      Directly use the new helper if __SSSE3__ is defined. It makes the other
      helpers unused, so mark them inline to prevent warnings.
      
      Rewrite and simplify init_vectorized_lexer.
      
      libcpp/ChangeLog:
      
      	* config.in: Regenerate.
      	* configure: Regenerate.
      	* configure.ac: Check for SSSE3 instead of SSE4.2.
      	* files.cc (read_file_guts): Bump padding to 64 if HAVE_SSSE3.
      	* lex.cc (search_line_acc_char): Mark inline, not "unused".
      	(search_line_sse2): Mark inline.
      	(search_line_sse42): Replace with...
      	(search_line_ssse3): ... this new function.  Adjust the use...
      	(init_vectorized_lexer): ... here.  Simplify.
      20a5b482
  19. Aug 07, 2024
  20. Aug 06, 2024
    • Remove MMX code path in lexer · eac63be1
      Andi Kleen authored
      Host systems with only MMX and no SSE2 should be really rare now.
      Let's remove the MMX code path to keep the number of custom
      implementations the same.
      
      The SSE2 code path is also somewhat dubious now (nearly everything
      should have SSE4.2 which is >15 years old now), but the SSE2
      code path is used as fallback for others and also apparently
      Solaris uses it due to tool chain deficiencies.
      
      libcpp/ChangeLog:
      
      	* lex.cc (search_line_mmx): Remove function.
      	(init_vectorized_lexer): Remove search_line_mmx.
      eac63be1
  21. Jul 26, 2024
  22. Jul 25, 2024
    • c++: Implement C++26 P2558R2 - Add @, $, and ` to the basic character set [PR110343] · 29341f21
      Jakub Jelinek authored
      The following patch implements the easy parts of the paper.
      When @$` are added to the basic character set, it means that
      R"@$`()@$`" should now be valid (here I've noticed most of the
      raw string tests were tested solely with -std=c++11 or -std=gnu++11
      and I've tried to change that), and on the other side even if
      by extension $ is allowed in identifiers, \u0024 or \U00000024
      or \u{24} should not be, similarly how \u0041 is not allowed.
      
      The paper in 3.1 claims though that
       #include <stdio.h>
      
       #define STR(x) #x
      
      int main()
      {
        printf("%s", STR(\u0060)); // U+0060 is ` GRAVE ACCENT
      }
      should have been accepted before this paper (and rejected after it),
      but g++ rejects it.
      
      I've tried to understand it, but am confused on what is the right
      behavior and why.
      
      Consider
       #define STR(x) #x
      const char *a = "\u00b7";
      const char *b = STR(\u00b7);
      const char *c = "\u0041";
      const char *d = STR(\u0041);
      const char *e = STR(a\u00b7);
      const char *f = STR(a\u0041);
      const char *g = STR(a \u00b7);
      const char *h = STR(a \u0041);
      const char *i = "\u066d";
      const char *j = STR(\u066d);
      const char *k = "\u0040";
      const char *l = STR(\u0040);
      const char *m = STR(a\u066d);
      const char *n = STR(a\u0040);
      const char *o = STR(a \u066d);
      const char *p = STR(a \u0040);
      
      Neither clang nor gcc emit any diagnostics on the a, c, i and k
      initializers, those are certainly valid (c is invalid in C23 though).  g++
      emits with -pedantic-errors errors on all the others, while clang++ on the
      ones with STR involving \u0041, \u0040 and a\u066d.  The chosen values are
      \u0040 '@' as something being changed by this paper, \u0041 'A' as basic
      character set char valid in identifiers before/after, \u00b7 as an example
      of character which is pedantically valid in identifiers if not at the start
      and \u066d as something pedantically not valid in identifiers.
      
      Now, https://eel.is/c++draft/lex.charset#6 says that UCN used outside of a
      string/character literal which corresponds to basic character set character
      (or control character) is ill-formed, that would make d, f, h cases invalid
      for C++ and l, n, p cases invalid for C++26.
      
      https://eel.is/c++draft/lex.name states which characters can appear at the
      start of the identifier and which can appear after the start.  And
      https://eel.is/c++draft/lex.pptoken states that preprocessing-token is
      either identifier, or tons of other things, or "each non-whitespace
      character that cannot be one of the above"
      
      Then https://eel.is/c++draft/lex.pptoken#1 says that this last category is
      invalid if the preprocessing token is being converted into token.
      
      And https://eel.is/c++draft/lex.pptoken#2 includes "If any character not in
      the basic character set matches the last category, the program is
      ill-formed."
      
      Now, e.g.  for the C++23 STR(\u0040) case, \u0040 is there not in the basic
      character set, so valid outside of the literals (not the case anymore in
      C++26), but it isn't nondigit and doesn't have XID_Start property, so it
      isn't IMHO an identifier and so must be the "each non-whitespace character
      that cannot be one of the above" case.  Why doesn't the above mentioned
      https://eel.is/c++draft/lex.pptoken#2 sentence make that invalid?  Ignoring
      that, I'd say it would be then stringized and that feels like it is what
      clang++ is doing.  Now, e.g.  for the STR(a\u066d) case, I wonder why that
      isn't lexed as an identifier followed by \u066d "each non-whitespace
      character that cannot be one of the above" token and stringified similarly,
      clang++ rejects that.
      
      What GCC libcpp seems to be doing is that forms_identifier_p calls
      _cpp_valid_utf8 or _cpp_valid_ucn with an argument which tells it is first
      or second+ in identifier, and e.g.  _cpp_valid_ucn then for UCNs valid in
      string literals calls
        else if (identifier_pos)
          {
            int validity = ucn_valid_in_identifier (pfile, result, nst);
      
            if (validity == 0)
              cpp_error (pfile, CPP_DL_ERROR,
                         "universal character %.*s is not valid in an identifier",
                         (int) (str - base), base);
            else if (validity == 2 && identifier_pos == 1)
              cpp_error (pfile, CPP_DL_ERROR,
         "universal character %.*s is not valid at the start of an identifier",
                         (int) (str - base), base);
          }
      so basically all those invalid in identifiers cases emit an error and
      pretend to be valid in identifiers, rather than what e.g.  _cpp_valid_utf8
      does for C but not for C++ and only for the chars completely invalid in
      identifiers rather than just valid in identifiers but not at the start:
                /* In C++, this is an error for invalid character in an identifier
                   because logically, the UTF-8 was converted to a UCN during
                   translation phase 1 (even though we don't physically do it that
                   way).  In C, this byte rather becomes grammatically a separate
                   token.  */
      
                if (CPP_OPTION (pfile, cplusplus))
                  cpp_error (pfile, CPP_DL_ERROR,
                             "extended character %.*s is not valid in an identifier",
                             (int) (*pstr - base), base);
                else
                  {
                    *pstr = base;
                    return false;
                  }
      The comment doesn't really match what is done in recent C++ versions because
      there UCNs are translated to characters and not the other way around.
      
      2024-07-25  Jakub Jelinek  <jakub@redhat.com>
      
      	PR c++/110343
      libcpp/
      	* lex.cc: C++26 P2558R2 - Add @, $, and ` to the basic character set.
      	(lex_raw_string): For C++26 allow $@` characters in prefix.
      	* charset.cc (_cpp_valid_ucn): For C++26 reject \u0024 in identifiers.
      gcc/testsuite/
      	* c-c++-common/raw-string-1.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-2.c: Likewise.
      	* c-c++-common/raw-string-4.c: Likewise.
      	* c-c++-common/raw-string-5.c: Likewise.  Expect some diagnostics
      	only for non-c++26, for c++26 expect different.
      	* c-c++-common/raw-string-6.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-11.c: Likewise.
      	* c-c++-common/raw-string-13.c: Likewise.
      	* c-c++-common/raw-string-14.c: Likewise.
      	* c-c++-common/raw-string-15.c: Use { c || c++11 } effective target,
      	change c++ specific dg-options to just -Wtrigraphs.
      	* c-c++-common/raw-string-16.c: Likewise.
      	* c-c++-common/raw-string-17.c: Use { c || c++11 } effective target,
      	remove c++ specific dg-options.
      	* c-c++-common/raw-string-18.c: Use { c || c++11 } effective target,
      	remove -std=c++11 from c++ specific dg-options.
      	* c-c++-common/raw-string-19.c: Likewise.
      	* g++.dg/cpp26/raw-string1.C: New test.
      	* g++.dg/cpp26/raw-string2.C: New test.
      29341f21
    • Daily bump. · 25256af1
      GCC Administrator authored
      25256af1
  23. Jul 24, 2024
    • diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4) · 148066bd
      David Malcolm authored
      
      This patch adds support to our SARIF output for cases where
      rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.
      
      In such cases, the pertinent SARIF "location" object gains a property
      bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
      within the location's physical location's "snippet" gains a "rendered"
      property (§3.3.4) that escapes non-ASCII text in the snippet, such as:
      
      "rendered": {"text":
      
      where "text" has a string value such as (for a "trojan source" attack):
      
        "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
        "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
        "  |       |                                       |                           |\n"
        "  |       |                                       |                           end of bidirectional context\n"
        "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"
      
      where the escaping is affected by -fdiagnostics-escape-format=; with
      -fdiagnostics-escape-format=bytes, the rendered text of the above is:
      
        "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
        "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
        "  |       |                                                   |                               |\n"
        "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"
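
      For illustration only, the "bytes" style of escaping shown above can
      be approximated by the following sketch, which rewrites every
      non-ASCII byte of a UTF-8 string as <xx> in lowercase hex.  The
      function name escape_non_ascii_bytes is hypothetical; this is not
      the code used by the SARIF builder or by diagnostic-show-locus.cc.

```cpp
#include <cstdio>
#include <string>

// Copies ASCII bytes through unchanged and renders every byte >= 0x80
// as "<xx>", two lowercase hex digits per byte.
std::string escape_non_ascii_bytes(const std::string &text) {
  std::string out;
  for (unsigned char byte : text) {
    if (byte < 0x80) {
      out += static_cast<char>(byte);
    } else {
      char buf[8];
      std::snprintf(buf, sizeof buf, "<%02x>", static_cast<unsigned>(byte));
      out += buf;
    }
  }
  return out;
}
```

      U+202E is encoded in UTF-8 as the bytes e2 80 ae, so a comment
      opener followed by that character comes out as "/*<e2><80><ae>",
      as in the first line of the rendered text above.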
      
      The patch also refactors/adds enough selftest machinery to be able to
      test the snippet generation from within the selftest framework, rather
      than just within DejaGnu (where the regex-based testing isn't
      sophisticated enough to verify such properties as the above).
      
      gcc/ChangeLog:
      	* Makefile.in (OBJS-libcommon): Add selftest-json.o.
      	* diagnostic-format-sarif.cc: Include "selftest.h",
      	"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
      	"selftest-json.h", and "text-range-label.h".
      	(class content_renderer): New.
      	(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
      	(sarif_builder::make_location_object): Add class
      	escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
      	pass a nonnull escape_nonascii_renderer to
      	maybe_make_physical_location_object as its snippet_renderer, and
      	add a property bag property "gcc/escapeNonAscii" to the SARIF
      	location object.  For other overloads of make_location_object,
      	pass nullptr for the snippet_renderer.
      	(sarif_builder::maybe_make_region_object_for_context): Add
      	"snippet_renderer" param and pass it to
      	maybe_make_artifact_content_object.
      	(sarif_builder::make_tool_object): Drop "const".
      	(sarif_builder::make_driver_tool_component_object): Likewise.
      	Use typesafe unique_ptr variant of object::set for setting "rules"
      	property on driver_obj.
      	(sarif_builder::maybe_make_artifact_content_object): Add param "r"
      	and use it to potentially set the "rendered" property (§3.3.4).
      	(selftest::test_make_location_object): New.
      	(selftest::diagnostic_format_sarif_cc_tests): New.
      	* diagnostic-show-locus.cc: Include "text-range-label.h" and
      	"selftest-diagnostic-show-locus.h".
      	(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
      	New.
      	(selftests::test_layout_x_offset_display_utf8): Use
      	diagnostic_show_locus_fixture to simplify and consolidate setup
      	code.
      	(selftests::test_diagnostic_show_locus_one_liner): Likewise.
      	(selftests::test_one_liner_colorized_utf8): Likewise.
      	(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
      	* gcc-rich-location.h (class text_range_label): Move to new file
      	text-range-label.h.
      	* selftest-diagnostic-show-locus.h: New file, based on material in
      	diagnostic-show-locus.cc.
      	* selftest-json.cc: New file.
      	* selftest-json.h: New file.
      	* selftest-run-tests.cc (selftest::run_tests): Call
      	selftest::diagnostic_format_sarif_cc_tests.
      	* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.
      
      gcc/testsuite/ChangeLog:
      	* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
      	that we have a property bag with property "gcc/escapeNonAscii": true.
      	Verify that we have a "rendered" property for a snippet.
      	* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
      	"text-range-label.h".
      
      gcc/ChangeLog:
      	* text-range-label.h: New file, taking class text_range_label from
      	gcc-rich-location.h.
      
      libcpp/ChangeLog:
      	* include/rich-location.h
      	(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
      	(rich_location::rich_location): Remove "= delete" from decl of
      	copy ctor.  Add deleted decl of move ctor.
      	(rich_location::operator=): Remove "= delete" from decl of
      	copy assignment.  Add deleted decl of move assignment.
      	(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
      	move.
      	(fixit_hint::operator=): Add copy assignment decl.  Add deleted
      	decl of move assignment.
      	* line-map.cc (rich_location::rich_location): New copy ctor.
      	(fixit_hint::fixit_hint): New copy ctor.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      148066bd
  24. Jul 14, 2024
  25. Jul 13, 2024
    • diagnostics: add highlight-a vs highlight-b in colorization and pp_markup · 7d73c01c
      David Malcolm authored
      
      Since r6-4582-g8a64515099e645 (which added class rich_location), ranges
      of quoted source code have been colorized using the following rules:
      - the primary range used the same color as the kind of the diagnostic
      i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta
      respectively)
      - secondary ranges alternate between "range1" and "range2" (defaulting
      to green and blue respectively)
      
      This works for cases with large numbers of highlighted ranges, but is
      suboptimal for common cases.
      
      The following patch adds a pair of color names: "highlight-a" and
      "highlight-b", and uses them whenever it makes sense to highlight and
      contrast two different things in the source code (e.g. a type mismatch).
      These are used by diagnostic-show-locus.cc for highlighting quoted
      source.  In addition the patch adds colorization to fragments within the
      corresponding diagnostic messages themselves, using consistent
      colorization between the message and the quoted source code for the two
      different things being contrasted.
      
      For example, consider:
      
      demo.c: In function ‘test_bad_format_string_args’:
      ../../src/demo.c:25:18: warning: format ‘%i’ expects argument of
        type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=]
         25 |   printf("hello %i", msg);
            |                 ~^   ~~~
            |                  |   |
            |                  int const char *
            |                 %s
      
      Previously, the types within the message in quotes would be in bold but
      not colorized, and the labelled ranges of quoted source code would use
      bold magenta for the "int" and non-bold green for the "const char *".
      
      With this patch:
      - the "%i" and "int" in the message and the "int" in the quoted source
        are all colored bold green
      - the "const char *" in the message and in the quoted source are both
        colored bold blue
      so that the consistent use of contrasting color draws the reader's eyes
      to the relationships between the diagnostic message and the source.
      
      I've tried this with gnome-terminal with many themes, including a
      variety of light versus dark backgrounds, solarized versus non-solarized
      themes, etc, and it was readable in all.
      
      My initial version of the patch used the existing %r and %R facilities
      within pretty-print.cc for the messages, but this turned out to be very
      uncomfortable, leading to error-prone format strings such as:
      
        error_at (richloc,
                  "invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)",
                  opname,
                  "highlight-a", type0,
                  "highlight-b", type1);
      
      To avoid requiring monstrosities such as the above, the patch adds a new
      "%e" format code to pretty-print.cc, which expects a pp_element *, where
      pp_element is a new abstract base class (actually a pp_markup::element),
      along with various useful subclasses.  This lets the above be written
      as:
      
        pp_markup::element_quoted_type element_0 (type0, highlight_colors::lhs);
        pp_markup::element_quoted_type element_1 (type1, highlight_colors::rhs);
        error_at (richloc,
                  "invalid operands to binary %s (have %e and %e)",
                  opname, &element_0, &element_1);
      
      which I feel is maintainable and clear to translators; the use of %e and
      pp_element * captures the type-unsafe part of the variadic call, and the
      subclasses allow for type-safety (so e.g. an element_quoted_type expects
      a type and a highlighting color).  This approach allows for some nice
      simplifications within c-format.cc.
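
      To make the shape of that design concrete, here is a much reduced,
      self-contained analogue.  Everything in namespace sketch is
      hypothetical and only mimics the idea: an abstract element class, a
      subclass that knows how to print a quoted type in a named color, and
      a formatter whose %e consumes a pointer to the base class.  The
      bracketed color markers stand in for real terminal escapes, and none
      of this is the actual pretty-print.cc or pp_markup code.

```cpp
#include <cstdarg>
#include <string>
#include <utility>

namespace sketch {

// Abstract base: anything that can append itself to formatted output.
struct element {
  virtual ~element() = default;
  virtual void add_to(std::string &out) const = 0;
};

// A quoted type name wrapped in a named highlight color.
class quoted_type : public element {
public:
  quoted_type(std::string type_name, std::string color)
    : m_type_name(std::move(type_name)), m_color(std::move(color)) {}
  void add_to(std::string &out) const override {
    out += "'[" + m_color + "]" + m_type_name + "[/" + m_color + "]'";
  }
private:
  std::string m_type_name;
  std::string m_color;
};

// Tiny formatter: %s takes a const char *, %e takes a const element *.
// The variadic call is the only type-unsafe part.
std::string format(const char *fmt, ...) {
  std::string out;
  va_list ap;
  va_start(ap, fmt);
  for (const char *p = fmt; *p; ++p) {
    if (p[0] == '%' && p[1] == 's') {
      out += va_arg(ap, const char *);
      ++p;
    } else if (p[0] == '%' && p[1] == 'e') {
      va_arg(ap, const element *)->add_to(out);
      ++p;
    } else {
      out += *p;
    }
  }
  va_end(ap);
  return out;
}

} // namespace sketch
```

      A caller would construct two quoted_type objects, convert their
      addresses to const sketch::element *, and pass them for the two %e
      directives of "invalid operands to binary %s (have %e and %e)"; the
      subclass constructors keep the type-plus-color pairing type-safe.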
      
      The patch also extends -Wformat to "teach" it about the new %e and
      pp_element *.  Doing so requires c-format.cc to be able to determine
      if a T * is a pp_element * (i.e. if T is a subclass).  To do so I added
      a new comp_types callback for comparing types, where the C++ frontend
      supplies a suitable implementation (and %e will always be wrong for C).
      
      I've manually tested this on many diagnostics with both C and C++ and it
      seems a subtle but significant improvement in readability.
      
      I've added a new option -fno-diagnostics-show-highlight-colors in case
      people prefer the old behavior.
      
      gcc/c-family/ChangeLog:
      	* c-common.cc: Include "tree-pretty-print-markup.h".
      	(binary_op_error): Use pp_markup::element_quoted_type and %e.
      	(check_function_arguments): Add "comp_types" param and pass it to
      	check_function_format.
      	* c-common.h (check_function_arguments): Add "comp_types" param.
      	(check_function_format): Likewise.
      	* c-format.cc: Include "tree-pretty-print-markup.h".
      	(local_pp_element_ptr_node): New.
      	(PP_FORMAT_CHAR_TABLE): Add entry for %e.
      	(struct format_check_context): Add "m_comp_types" field.
      	(check_function_format): Add "comp_types" param and pass it to
      	check_format_info.
      	(check_format_info): Likewise, passing it to format_ctx's ctor.
      	(check_format_arg): Extract m_comp_types from format_ctx and
      	pass it to check_format_info_main.
      	(check_format_info_main): Add "comp_types" param and pass it to
      	arg_parser's ctor.
      	(class argument_parser): Add "m_comp_types" field.
      	(argument_parser::check_argument_type): Pass m_comp_types to
      	check_format_types.
      	(handle_subclass_of_pp_element_p): New.
      	(check_format_types): Add "comp_types" param, and use it to
      	call handle_subclass_of_pp_element_p.
      	(class element_format_substring): New.
      	(class element_expected_type_with_indirection): New.
      	(format_type_warning): Use element_expected_type_with_indirection
      	to unify the if (wanted_type_name) branches, reducing from four
      	emit_warning calls to two.  Simplify these further using %e.
      	Doing so also gives suitable colorization of the text within the
      	diagnostics.
      	(init_dynamic_diag_info): Initialize local_pp_element_ptr_node.
      	(selftest::test_type_mismatch_range_labels): Add nullptr for new
      	param of gcc_rich_location label overload.
      	* c-format.h (T_PP_ELEMENT_PTR): New.
      	* c-type-mismatch.cc: Include "diagnostic-highlight-colors.h".
      	(binary_op_rich_location::binary_op_rich_location): Use
      	highlight_colors::lhs and highlight_colors::rhs for the ranges.
      	* c-type-mismatch.h (class binary_op_rich_location): Add comment
      	about highlight_colors.
      
      gcc/c/ChangeLog:
      	* c-objc-common.cc: Include "tree-pretty-print-markup.h".
      	(print_type): Add optional "highlight_color" param and use it
      	to show highlight colors in "aka" text.
      	(pp_markup::element_quoted_type::print_type): New.
      	* c-typeck.cc: Include "tree-pretty-print-markup.h".
      	(comp_parm_types): New.
      	(build_function_call_vec): Pass it to check_function_arguments.
      	(inform_for_arg): Use %e and highlight colors to contrast actual
      	versus expected.
      	(convert_for_assignment): Use highlight_colors::actual for the
      	rhs_label.
      	(build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs
      	for the ranges.
      
      gcc/ChangeLog:
      	* common.opt (fdiagnostics-show-highlight-colors): New option.
      	* common.opt.urls: Regenerate.
      	* coretypes.h (pp_markup::element): New forward decl.
      	(pp_element): New typedef.
      	* diagnostic-color.cc (gcc_color_defaults): Add "highlight-a"
      	and "highlight-b".
      	* diagnostic-format-json.cc (diagnostic_output_format_init_json):
      	Disable highlight colors.
      	* diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif):
      	Likewise.
      	* diagnostic-highlight-colors.h: New file.
      	* diagnostic-path.cc (struct event_range): Pass nullptr for
      	highlight color of m_rich_loc.
      	* diagnostic-show-locus.cc (colorizer::set_range): Handle ranges
      	with m_highlight_color.
      	(colorizer::STATE_NAMED_COLOR): New.
      	(colorizer::m_richloc): New field.
      	(colorizer::colorizer): Add richloc param for initializing
      	m_richloc.
      	(colorizer::set_named_color): New.
      	(colorizer::begin_state): Add case STATE_NAMED_COLOR.
      	(layout::layout): Pass richloc to m_colorizer's ctor.
      	(selftest::test_one_liner_labels): Pass nullptr for new param of
      	gcc_rich_location ctor for labels.
      	(selftest::test_one_liner_labels_utf8): Likewise.
      	* diagnostic.h (diagnostic_context::set_show_highlight_colors):
      	New.
      	* doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors
      	and highlight-a and highlight-b color caps.
      	* doc/ux.texi
      	(Use color consistently when highlighting mismatches): New
      	subsection.
      	* gcc-rich-location.cc (gcc_rich_location::add_expr): Add
      	"highlight_color" param.
      	(gcc_rich_location::maybe_add_expr): Likewise.
      	* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
      	Split out into a pair of ctors, where if a range_label is supplied
      	the caller must also supply a highlight color.
      	(gcc_rich_location::add_expr): Add "highlight_color" param.
      	(gcc_rich_location::maybe_add_expr): Likewise.
      	* gcc.cc (driver_handle_option): Handle
      	OPT_fdiagnostics_show_highlight_colors.
      	* lto-wrapper.cc (merge_and_complain): Likewise.
      	(append_compiler_options): Likewise.
      	(append_diag_options): Likewise.
      	(run_gcc): Likewise.
      	* opts-common.cc (decode_cmdline_options_to_array): Add comment
      	about -fno-diagnostics-show-highlight-colors.
      	* opts-global.cc (init_options_once): Preserve
      	pp_show_highlight_colors in case the global_dc's printer is
      	recreated.
      	* opts.cc (common_handle_option): Handle
      	OPT_fdiagnostics_show_highlight_colors.
      	(gen_command_line_string): Likewise.
      	* pretty-print-markup.h: New file.
      	* pretty-print.cc: Include "pretty-print-markup.h" and
      	"diagnostic-highlight-colors.h".
      	(pretty_printer::format): Handle %e.
      	(pretty_printer::pretty_printer): Handle new field
      	m_show_highlight_colors.
      	(pp_string_n): New.
      	(pp_markup::context::begin_quote): New.
      	(pp_markup::context::end_quote): New.
      	(pp_markup::context::begin_color): New.
      	(pp_markup::context::end_color): New.
      	(highlight_colors::expected): New.
      	(highlight_colors::actual): New.
      	(highlight_colors::lhs): New.
      	(highlight_colors::rhs): New.
      	(class selftest::test_element): New.
      	(selftest::test_pp_format): Add tests of %e.
      	(selftest::test_urlification): Likewise.
      	* pretty-print.h (pp_markup::context): New forward decl.
      	(class chunk_info): Add friend class pp_markup::context.
      	(class pretty_printer): Add friend pp_show_highlight_colors.
      	(pretty_printer::m_show_highlight_colors): New field.
      	(pp_show_highlight_colors): New inline function.
      	(pp_string_n): New decl.
      	* substring-locations.cc: Include "diagnostic-highlight-colors.h".
      	(format_string_diagnostic_t::highlight_color_format_string): New.
      	(format_string_diagnostic_t::highlight_color_param): New.
      	(format_string_diagnostic_t::emit_warning_n_va): Use highlight
      	colors.
      	* substring-locations.h
      	(format_string_diagnostic_t::highlight_color_format_string): New.
      	(format_string_diagnostic_t::highlight_color_param): New.
      	* toplev.cc (general_init): Initialize global_dc's
      	show_highlight_colors.
      	* tree-pretty-print-markup.h: New file.
      
      gcc/cp/ChangeLog:
      	* call.cc: Include "tree-pretty-print-markup.h".
      	(implicit_conversion_error): Use highlight_colors::percent_h for
      	the labelled range.
      	(op_error_string): Split out into...
      	(concat_op_error_string): ...this.
      	(binop_error_string): New.
      	(op_error): Use %e, binop_error_string, highlight_colors::lhs,
      	and highlight_colors::rhs.
      	(maybe_inform_about_fndecl_for_bogus_argument_init): Add
      	"highlight_color" param; use it for the richloc.
      	(convert_like_internal): Use highlight_colors::percent_h for the
      	labelled_range, and highlight_colors::percent_i for the call to
      	maybe_inform_about_fndecl_for_bogus_argument_init.
      	(build_over_call): Pass cp_comp_parm_types for new "comp_types"
      	param of check_function_arguments.
      	(complain_about_bad_argument): Use highlight_colors::percent_h for
      	the labelled_range, and highlight_colors::percent_i for the call
      	to maybe_inform_about_fndecl_for_bogus_argument_init.
      	* cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init):
      	Add optional highlight_color param.
      	(cp_comp_parm_types): New decl.
      	(highlight_colors::const percent_h): New decl.
      	(highlight_colors::const percent_i): New decl.
      	* error.cc: Include "tree-pretty-print-markup.h".
      	(highlight_colors::const percent_h): New defn.
      	(highlight_colors::const percent_i): New defn.
      	(type_to_string): Add param "highlight_color" and use it.
      	(print_nonequal_arg): Likewise.
      	(print_template_differences): Add params "highlight_color_a" and
      	"highlight_color_b".
      	(type_to_string_with_compare): Add params "this_highlight_color"
      	and "peer_highlight_color".
      	(print_template_tree_comparison): Add params "highlight_color_a"
      	and "highlight_color_b".
      	(cxx_format_postprocessor::handle):
      	Use highlight_colors::percent_h and highlight_colors::percent_i.
      	(pp_markup::element_quoted_type::print_type): New.
      	(range_label_for_type_mismatch::get_text): Pass nullptr for new
      	params of type_to_string_with_compare.
      	* typeck.cc (cp_comp_parm_types): New.
      	(cp_build_function_call_vec): Pass it to check_function_arguments.
      	(convert_for_assignment): Use highlight_colors::percent_h for the
      	labelled_range.
      
      gcc/testsuite/ChangeLog:
      	* g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test.
      	* g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test.
      	* g++.dg/plugin/plugin.exp (plugin_test_list): Add
      	show-template-tree-color-no-highlight-colors.C to
      	show_template_tree_color_plugin.c.
      	* g++.dg/plugin/show-template-tree-color-labels.C: Update expected
      	output to reflect use of highlight-a and highlight-b to contrast
      	mismatches.
      	* g++.dg/plugin/show-template-tree-color-no-elide-type.C:
      	Likewise.
      	* g++.dg/plugin/show-template-tree-color-no-highlight-colors.C:
      	New test.
      	* g++.dg/plugin/show-template-tree-color.C: Update expected output
      	to reflect use of highlight-a and highlight-b to contrast
      	mismatches.
      	* g++.dg/warn/Wformat-gcc_diag-1.C: New test.
      	* g++.dg/warn/Wformat-gcc_diag-2.C: New test.
      	* g++.dg/warn/Wformat-gcc_diag-3.C: New test.
      	* gcc.dg/bad-binary-ops-highlight-colors.c: New test.
      	* gcc.dg/format/colors.c: New test.
      	* gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass
      	nullptr for new param of gcc_rich_location::add_expr.
      
      libcpp/ChangeLog:
      	* include/rich-location.h (location_range::m_highlight_color): New
      	field.
      	(rich_location::rich_location): Add optional label_highlight_color
      	param.
      	(rich_location::set_highlight_color): New decl.
      	(rich_location::add_range): Add optional label_highlight_color
      	param.
      	(rich_location::set_range): Likewise.
      	* line-map.cc (rich_location::rich_location): Add
      	"label_highlight_color" param and pass it to add_range.
      	(rich_location::set_highlight_color): New.
      	(rich_location::add_range): Add "label_highlight_color" param.
      	(rich_location::set_range): Add "highlight_color" param.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      7d73c01c
  26. Jun 22, 2024
  27. Jun 21, 2024
    • diagnostics: fixes to SARIF output [PR109360] · 9f4fdc3a
      David Malcolm authored
      
      When adding validation of .sarif files against the schema
      (PR testsuite/109360) I discovered various issues where we were
      generating invalid .sarif files.
      
      Specifically, in
        c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c
      the relatedLocations for the "note" diagnostics were missing column
      numbers, leading to validation failure due to non-unique elements,
      such as multiple:
      	"message": {"text": "invalid UTF-8 character <bf>"}},
      on line 25 with no column information.
      
      Root cause is that for some diagnostics in libcpp we have a location_t
      representing the line as a whole, setting a column_override on the
      rich_location (since the line hasn't been fully read yet).  We were
      handling this column override for plain text output, but not for .sarif
      output.
      
      Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning
      emitted on "line 0" of the file, whereas SARIF requires line numbers to
      be positive.
      
      We also use column == 0 internally to mean "the line as a whole",
      whereas SARIF requires column numbers to be positive.
      
      This patch fixes these various issues.
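
      The rule described above (use the column override when the column
      is 0, and emit a line or column property only when its value is
      positive) can be sketched in isolation.  This is an illustration
      only, not the code in diagnostic-format-sarif.cc: struct region,
      make_region, print_region and the -1 "omit" sentinel are
      hypothetical names chosen for this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for a SARIF region object.
   A field value of -1 means "omit this property from the output".  */
struct region
{
  int start_line;
  int start_column;
};

/* Build a region from LINE and COLUMN, both 0 or positive.
   A COLUMN of 0 means "the line as a whole"; in that case fall back
   to COLUMN_OVERRIDE.  Only positive values are kept, because SARIF
   requires line and column numbers to be positive.  */
static struct region
make_region (int line, int column, int column_override)
{
  struct region r = { -1, -1 };

  assert (line >= 0 && column >= 0 && column_override >= 0);

  if (column == 0)
    column = column_override;
  if (line > 0)
    r.start_line = line;
  if (column > 0)
    r.start_column = column;
  return r;
}

/* Print only the properties that are present.  */
static void
print_region (const struct region *r)
{
  const char *sep = "";

  printf ("{");
  if (r->start_line > 0)
    {
      printf ("\"startLine\": %d", r->start_line);
      sep = ", ";
    }
  if (r->start_column > 0)
    printf ("%s\"startColumn\": %d", sep, r->start_column);
  printf ("}\n");
}
```

      With these hypothetical helpers, make_region (25, 0, 7) keeps
      line 25 and takes column 7 from the override, while
      make_region (0, 0, 0) yields a region with neither property.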
      
      gcc/ChangeLog:
      	PR testsuite/109360
      	* diagnostic-format-sarif.cc
      	(sarif_builder::make_location_object): Pass any column override
      	from rich_loc to maybe_make_physical_location_object.
      	(sarif_builder::maybe_make_physical_location_object): Add
      	"column_override" param and pass it to maybe_make_region_object.
      	(sarif_builder::maybe_make_region_object): Add "column_override"
      	param and use it when the location has 0 for a column.  Don't
      	add "startLine", "startColumn", "endLine", or "endColumn" if
      	the values aren't positive.
      	(sarif_builder::maybe_make_region_object_for_context): Don't
      	add "startLine" or "endLine" if the values aren't positive.
      
      libcpp/ChangeLog:
      	PR testsuite/109360
      	* include/rich-location.h (rich_location::get_column_override):
      	New accessor.
      
      Signed-off-by: David Malcolm <dmalcolm@redhat.com>
      9f4fdc3a
  28. Jun 12, 2024
  29. Jun 11, 2024
    • c: Add -std=c2y, -std=gnu2y, -Wc23-c2y-compat, C2Y _Generic with type operand · 0cf68222
      Joseph Myers authored
      The first new C2Y feature, _Generic where the controlling operand is a
      type name rather than an expression (as defined in N3260), was voted
      into C2Y today.  (In particular, this form of _Generic allows
      distinguishing qualified and unqualified versions of a type.)  This
      feature also includes allowing the generic associations to specify
      incomplete and function types.
      
      Add this feature to GCC, along with the -std=c2y, -std=gnu2y and
      -Wc23-c2y-compat options to control when and how it is diagnosed.  As
      usual, the feature is allowed by default in older standards modes,
      subject to diagnosis with -pedantic, -pedantic-errors or
      -Wc23-c2y-compat.
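
      A minimal sketch of the difference between the two forms of
      _Generic follows.  The version guard is an assumption of this
      sketch, not part of the patch: the type-operand form is used only
      when compiling with GCC 15 or later, and other compilers get a
      placeholder result.  The enum and macro names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

enum kind { KIND_OTHER, KIND_INT, KIND_CONST_INT, KIND_UNSUPPORTED };

/* Assumption: rely on the type-operand form only for GCC 15 or later.  */
#if defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 15
# define HAVE_GENERIC_TYPE_OPERAND 1
#else
# define HAVE_GENERIC_TYPE_OPERAND 0
#endif

/* Classic form: the controlling operand is an expression.  It undergoes
   lvalue conversion, so a const int object is matched as plain int.  */
#define KIND_OF_EXPR(e) \
  _Generic ((e), int: KIND_INT, const int: KIND_CONST_INT, \
            default: KIND_OTHER)

#if HAVE_GENERIC_TYPE_OPERAND
/* C2Y form: the controlling operand is a type name, matched as written,
   so qualifiers are preserved.  */
# define KIND_OF_TYPE(T) \
  _Generic (T, int: KIND_INT, const int: KIND_CONST_INT, \
            default: KIND_OTHER)
#else
# define KIND_OF_TYPE(T) KIND_UNSUPPORTED
#endif

static enum kind
kind_of_const_int_object (void)
{
  const int ci = 0;
  return KIND_OF_EXPR (ci);
}

static enum kind
kind_of_const_int_type (void)
{
  return KIND_OF_TYPE (const int);
}

/* Print the numeric enum kind value selected by each form.  */
static void
show_kinds (void)
{
  assert (kind_of_const_int_object () == KIND_INT);
  printf ("expression operand: %d, type operand: %d\n",
          (int) kind_of_const_int_object (),
          (int) kind_of_const_int_type ());
}
```

      With an expression operand the const qualifier is dropped before
      matching, so the int association is selected; with a type operand
      the const int association is selected instead.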
      
      Bootstrapped with no regressions on x86_64-pc-linux-gnu.
      
      gcc/
      	* doc/cpp.texi (__STDC_VERSION__): Document C2Y handling.
      	* doc/invoke.texi (-Wc23-c2y-compat, -std=c2y, -std=gnu2y):
      	Document options.
      	(-std=gnu23): Update documentation.
      	* doc/standards.texi (C Language): Document C2Y.  Update C23
      	description.
      	* config/rl78/rl78.cc (rl78_option_override): Handle "GNU C2Y"
      	language name.
      	* dwarf2out.cc (highest_c_language, gen_compile_unit_die):
      	Likewise.
      
      gcc/c-family/
      	* c-common.cc (flag_isoc2y): New.
      	(flag_isoc99, flag_isoc11, flag_isoc23): Update comments.
      	* c-common.h (flag_isoc2y): New.
      	(clk_c, flag_isoc23): Update comments.
      	* c-opts.cc (set_std_c2y): New.
      	(c_common_handle_option): Handle OPT_std_c2y and OPT_std_gnu2y.
      	(set_std_c89, set_std_c99, set_std_c11, set_std_c17, set_std_c23):
      	Set flag_isoc2y.
      	(set_std_c23): Update comment.
      	* c.opt (Wc23-c2y-compat, std=c2y, std=gnu2y): New.
      	* c.opt.urls: Regenerate.
      
      gcc/c/
      	* c-errors.cc (pedwarn_c23): New.
      	* c-parser.cc (disable_extension_diagnostics)
      	(restore_extension_diagnostics): Save and restore
      	warn_c23_c2y_compat.
      	(c_parser_generic_selection): Handle type name as controlling
      	operand.  Allow incomplete and function types subject to
      	pedwarn_c23 calls.
      	* c-tree.h (pedwarn_c23): New.
      
      gcc/testsuite/
      	* gcc.dg/c23-generic-1.c, gcc.dg/c23-generic-2.c,
      	gcc.dg/c23-generic-3.c, gcc.dg/c23-generic-4.c,
      	gcc.dg/c2y-generic-1.c, gcc.dg/c2y-generic-2.c,
      	gcc.dg/c2y-generic-3.c, gcc.dg/gnu2y-generic-1.c: New tests.
      	* gcc.dg/c23-tag-6.c: Use -pedantic-errors.
      
      libcpp/
      	* include/cpplib.h (CLK_GNUC2Y, CLK_STDC2Y): New.
      	* init.cc (lang_defaults): Add GNUC2Y and STDC2Y entries.
      	(cpp_init_builtins): Define __STDC_VERSION__ to 202500L for GNUC2Y
      	and STDC2Y.
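
      Code that needs to detect a post-C23 mode could compare
      __STDC_VERSION__ against the C23 value, as in the sketch below.
      The helper names are hypothetical, and the 202500L value is the
      one this patch defines; it should be treated as provisional until
      the standard is published.

```c
#include <assert.h>
#include <stdio.h>

/* C23 defines __STDC_VERSION__ as 202311L; any larger value is treated
   here as "newer than C23".  */
#define C23_STDC_VERSION 202311L

static int
is_newer_than_c23 (long stdc_version)
{
  return stdc_version > C23_STDC_VERSION;
}

/* The value this translation unit was compiled with, or 0 in modes
   (such as C90) where the macro is not defined.  */
static long
current_stdc_version (void)
{
#ifdef __STDC_VERSION__
  return __STDC_VERSION__;
#else
  return 0L;
#endif
}

static void
report_mode (void)
{
  long v = current_stdc_version ();

  assert (v >= 0);
  printf ("__STDC_VERSION__ = %ld (%s)\n", v,
          is_newer_than_c23 (v) ? "newer than C23" : "C23 or older");
}
```

      Comparing with "greater than 202311L" rather than testing for
      202500L exactly keeps the check valid if the final value changes.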
      0cf68222