    libcpp, v2: Add support for gnu::base64 #embed parameter · ce0aecc7
    Jakub Jelinek authored
    This patch adds another #embed extension, gnu::base64.
    
    As mentioned in the documentation, this extension is primarily
    intended for use by the preprocessor, so that for larger embeds (say
    32+ or 64+ bytes long) it doesn't have to emit tens of thousands or
    millions of comma-separated integer literals, which would be very
    expensive to parse again, but can instead emit
     #embed "." __gnu__::__base64__( \
     "Tm9uIGVyYW0gbsOpc2NpdXMsIEJydXRlLCBjdW0sIHF1w6Ygc3VtbWlzIGluZ8OpbmlpcyBleHF1" \
     "aXNpdMOhcXVlIGRvY3Ryw61uYSBwaGlsw7Nzb3BoaSBHcsOmY28gc2VybcOzbmUgdHJhY3RhdsOt" \
     "c3NlbnQsIGVhIExhdMOtbmlzIGzDrXR0ZXJpcyBtYW5kYXLDqW11cywgZm9yZSB1dCBoaWMgbm9z" \
     "dGVyIGxhYm9yIGluIHbDoXJpYXMgcmVwcmVoZW5zacOzbmVzIGluY8O6cnJlcmV0LiBuYW0gcXVp" \
     "YsO6c2RhbSwgZXQgaWlzIHF1aWRlbSBub24gw6FkbW9kdW0gaW5kw7NjdGlzLCB0b3R1bSBob2Mg" \
     "ZMOtc3BsaWNldCBwaGlsb3NvcGjDoXJpLiBxdWlkYW0gYXV0ZW0gbm9uIHRhbSBpZCByZXByZWjD" \
     "qW5kdW50LCBzaSByZW3DrXNzaXVzIGFnw6F0dXIsIHNlZCB0YW50dW0gc3TDumRpdW0gdGFtcXVl" \
     "IG11bHRhbSDDs3BlcmFtIHBvbsOpbmRhbSBpbiBlbyBub24gYXJiaXRyw6FudHVyLiBlcnVudCDD" \
     "qXRpYW0sIGV0IGlpIHF1aWRlbSBlcnVkw610aSBHcsOmY2lzIGzDrXR0ZXJpcywgY29udGVtbsOp" \
     "bnRlcyBMYXTDrW5hcywgcXVpIHNlIGRpY2FudCBpbiBHcsOmY2lzIGxlZ8OpbmRpcyDDs3BlcmFt" \
     "IG1hbGxlIGNvbnPDum1lcmUuIHBvc3Ryw6ltbyDDoWxpcXVvcyBmdXTDunJvcyBzw7pzcGljb3Is" \
     "IHF1aSBtZSBhZCDDoWxpYXMgbMOtdHRlcmFzIHZvY2VudCwgZ2VudXMgaG9jIHNjcmliw6luZGks" \
     "IGV0c2kgc2l0IGVsw6lnYW5zLCBwZXJzw7Nuw6YgdGFtZW4gZXQgZGlnbml0w6F0aXMgZXNzZSBu" \
     "ZWdlbnQu")
    with the meaning: don't actually load any file; instead base64-decode
    the string (RFC 4648, with A-Za-z0-9+/ chars and = padding, no newlines
    in between) and use the result as the data.  This form was chosen because
    it should be -pedantic-errors clean and fairly cheap to decode, and an
    optimizing compiler can then handle it as a binary blob similar to a
    normal #embed, while the data isn't left somewhere on the disk, so
    distcc/ccache etc. can move the preprocessed source without issues.
    It makes no sense to support the limit and gnu::offset parameters
    together with it IMHO; why would somebody bother providing the full data
    and then throw some of it away?  prefix/suffix/if_empty are supported as
    usual, though they are not intended to be used by the preprocessor.
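
    To make the decoding rules above concrete, here is a minimal sketch of a
    strict RFC 4648 decoder in C.  It is not the code from files.cc; the
    names b64_value and b64_decode are hypothetical and chosen only for
    illustration.  It assumes an ASCII-compatible execution character set,
    accepts only A-Za-z0-9+/ with = padding in the final group, and rejects
    everything else (including newlines and backslashes).  Unlike a
    production decoder, it does not check that the unused trailing bits
    before padding are zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map one base64 character to its 6-bit value,
   or return -1 if the character is not in the RFC 4648 alphabet.
   Assumes ASCII, where letters and digits are contiguous.  */
static int b64_value(unsigned char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

/* Hypothetical decoder: decode LEN characters of base64 text from IN
   into OUT, which must have room for at least LEN / 4 * 3 bytes.
   Returns the number of decoded bytes, or -1 on malformed input.  */
static long b64_decode(const char *in, size_t len, unsigned char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);
    if (len % 4 != 0)
        return -1;                      /* input must be whole 4-char groups */

    for (size_t i = 0; i < len; i += 4) {
        int last = (i + 4 == len);      /* padding is only valid here */
        int pad = 0;
        int v[4];

        for (int j = 0; j < 4; j++) {
            unsigned char c = (unsigned char) in[i + j];
            if (c == '=' && last && j >= 2) {
                pad++;
                v[j] = 0;
            } else {
                if (pad)
                    return -1;          /* data character after '=' */
                v[j] = b64_value(c);
                if (v[j] < 0)
                    return -1;          /* character outside the alphabet */
            }
        }

        /* Pack four 6-bit values into 24 bits, then split into bytes.  */
        unsigned long w = ((unsigned long) v[0] << 18)
                          | ((unsigned long) v[1] << 12)
                          | ((unsigned long) v[2] << 6)
                          | (unsigned long) v[3];
        out[n++] = (unsigned char) ((w >> 16) & 0xff);
        if (pad < 2)
            out[n++] = (unsigned char) ((w >> 8) & 0xff);
        if (pad < 1)
            out[n++] = (unsigned char) (w & 0xff);
    }
    return (long) n;
}
```

    With this sketch, the first four characters of the example above,
    "Tm9u", decode to the three bytes "Non".  A group with a character
    outside the alphabet, or with = anywhere but the end, is rejected.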
    
    This patch adds just the extension side, not the actual emitting of this
    during -E or -E -fdirectives-only for now; that will be included in an
    upcoming patch.
    
    Compared to the earlier posted version of this extension, this patch
    allows string concatenation in the parameter argument (but still
    doesn't allow escapes in the string; why would anyone use them when
    only A-Za-z0-9+/= are valid?).  The patch also adds support for parsing
    this even in -fpreprocessed compilation.
    
    2024-09-12  Jakub Jelinek  <jakub@redhat.com>
    
    libcpp/
    	* internal.h (struct cpp_embed_params): Add base64 member.
    	(_cpp_free_embed_params_tokens): Declare.
    	* directives.cc (DIRECTIVE_TABLE): Add IN_I flag to T_EMBED.
    	(save_token_for_embed, _cpp_free_embed_params_tokens): New functions.
    	(EMBED_PARAMS): Add gnu::base64 entry.
    	(_cpp_parse_embed_params): Parse gnu::base64 parameter.  If
    	-fpreprocessed without -fdirectives-only, require #embed to have
    	gnu::base64 parameter.  Diagnose conflict between gnu::base64 and
    	limit or gnu::offset parameters.
    	(do_embed): Use _cpp_free_embed_params_tokens.
    	* files.cc (finish_embed, base64_dec_fn): New functions.
    	(base64_dec): New array.
    	(B64D0, B64D1, B64D2, B64D3): Define.
    	(finish_base64_embed): New function.
    	(_cpp_stack_embed): Use finish_embed.  Handle params->base64
    	using finish_base64_embed.
    	* macro.cc (builtin_has_embed): Call _cpp_free_embed_params_tokens.
    gcc/
    	* doc/cpp.texi (Binary Resource Inclusion): Document gnu::base64
    	parameter.
    gcc/testsuite/
    	* c-c++-common/cpp/embed-17.c: New test.
    	* c-c++-common/cpp/embed-18.c: New test.
    	* c-c++-common/cpp/embed-19.c: New test.
    	* c-c++-common/cpp/embed-27.c: New test.
    	* gcc.dg/cpp/embed-6.c: New test.
    	* gcc.dg/cpp/embed-7.c: New test.
internal.h 33.40 KiB