    9fc61d45
    libstdc++: Implement C++20 time zone support in <chrono> · 9fc61d45
    Jonathan Wakely authored
    This is the largest missing piece of C++20 support. Only the cxx11 ABI
    is supported, due to the use of std::string in the API for time zones.
    For the old gcc4 ABI, utc_clock and leap seconds are supported, but only
    using a hardcoded list of leap seconds. No up-to-date tzdb::leap_seconds
    information is available, and there are no time zones or zoned_time
    conversions.
    
    The implementation currently depends on a tzdata.zi file being provided
    by the OS or the user. The expected location is /usr/share/zoneinfo but
    that can be changed using --with-libstdcxx-zoneinfo-dir=PATH. On targets
    that support it there is also a weak symbol that users can override in
    their own program (which also helps with testing):
    
    extern "C++" const char* __gnu_cxx::zoneinfo_dir_override();
    
    If no file is found, a fallback tzdb object will be created which only
    contains the "Etc/UTC" and "Etc/GMT" time zones.
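
As an illustration of the override hook, a program could provide its own
definition of the function declared above. This is only a minimal sketch: it
assumes a target where the weak symbol can be overridden, and both the
directory path and the my_zoneinfo_dir name are hypothetical.

```cpp
#include <cassert>

// Hypothetical directory that holds a user-provided tzdata.zi
// (and optionally a leapseconds file). Not a path used by the library.
static const char* const my_zoneinfo_dir = "/opt/myapp/zoneinfo";

namespace __gnu_cxx
{
  // A strong definition in the program takes precedence over the weak
  // symbol in the library, on targets where that is supported.
  const char* zoneinfo_dir_override() { return my_zoneinfo_dir; }
}
```

With this definition linked into the program, the library should look for
tzdata.zi in that directory, and fall back to the minimal tzdb described
above if no file is found there.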
    
    A leapseconds file is also expected in the same directory, but if that
    isn't present then a hardcoded list of leap seconds is used, which is
    correct at least as far as 2023-06-28 (and it currently looks like no
    leap second will be inserted for a few years).
    
    The tzdata.zi and leapseconds files from https://www.iana.org/time-zones
    are in the public domain, so shipping copies of them with GCC would be
    an option. However, the tzdata.zi file will rapidly become outdated, so
    users should really provide it themselves (or convince their OS vendor
    to do so). It would also be possible to implement an alternative parser
    for the compiled tzdata files (one per time zone) under
    /usr/share/zoneinfo. Those files are present on more operating systems,
    but do not contain all the information present in tzdata.zi.
    Specifically, the "links" are not present, so that e.g. "UTC" and
    "Universal" are distinct time zones, rather than both being links to the
    canonical "Etc/UTC" zone. For some platforms those files are hard links
    to the same file, but there's no indication which zone is the canonical
    name and which is a link. Other platforms just store them in different
    inodes anyway. I do not plan to add such an alternative parser for the
    compiled files. That would need to be contributed by maintainers or
    users of targets that require it, if making tzdata.zi available is not
    an option. The library ABI would not need to change for a new tzdb
    implementation, because everything in tzdb_list, tzdb and time_zone is
    implemented as a pimpl (except for the shared_ptr links between nodes,
    described below). That means the new exported symbols added by this
    commit should be stable even if the implementation is completely
    rewritten.
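
To make the point about links concrete, here is a rough sketch of how link
lines could be turned into a lookup from link name to canonical zone name. It
assumes link lines of the form "L TARGET LINK-NAME" (the abbreviated zic
"Link" keyword). The parse_links and canonical_name helpers are hypothetical
and are not the functions used in tzdb.cc.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Collect "L TARGET LINK-NAME" lines into a map from link name to target.
// All other lines (zones, rules, comments) are ignored in this sketch.
std::map<std::string, std::string> parse_links(const std::string& text)
{
  std::map<std::string, std::string> links;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line))
  {
    std::istringstream fields(line);
    std::string kind, target, name;
    if (fields >> kind >> target >> name && kind == "L")
      links[name] = target;
  }
  return links;
}

// Return the canonical zone for a name, or the name itself if it is not a link.
std::string canonical_name(const std::map<std::string, std::string>& links,
                           const std::string& name)
{
  auto it = links.find(name);
  return it == links.end() ? name : it->second;
}
```

Given the lines "L Etc/UTC UTC" and "L Etc/UTC Universal", both "UTC" and
"Universal" resolve to "Etc/UTC". That mapping is exactly what cannot be
recovered from the compiled per-zone files.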
    
    The information from tzdata.zi is parsed and stored in data structures
    that closely model the info in the file. This is a space-efficient
    representation that uses less memory than storing every transition for
    every time zone.  It also avoids spending time expanding that
    information into time zone transitions that might never be needed by the
    program.  When a conversion between local time and UTC is requested the
    information will be processed to determine the time zone transitions
    close to the time being converted.
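
The idea of keeping compact rules and expanding them on demand can be sketched
as follows. This is a deliberately simplified model, not the data structures
in tzdb.cc: the Rule and Transition types are hypothetical, real rules use
forms such as "last Sunday" rather than a fixed day, and the values used with
it below are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for one rule: it applies on a fixed month/day in
// every year of the closed range [from_year, to_year].
struct Rule
{
  int from_year;
  int to_year;
  int month;
  int day;
  int save_minutes;     // amount added to standard time
  std::string letters;  // text substituted into the abbreviation format
};

// One expanded transition for a specific year.
struct Transition
{
  int year;
  int month;
  int day;
  int save_minutes;
  std::string letters;
};

// Expand only the rules that are active in the requested year,
// ordered by date, instead of precomputing every year up front.
std::vector<Transition>
transitions_for_year(const std::vector<Rule>& rules, int year)
{
  std::vector<Transition> result;
  for (const Rule& r : rules)
  {
    assert(r.from_year <= r.to_year);
    if (r.from_year <= year && year <= r.to_year)
      result.push_back({year, r.month, r.day, r.save_minutes, r.letters});
  }
  std::sort(result.begin(), result.end(),
            [](const Transition& a, const Transition& b) {
              return a.month != b.month ? a.month < b.month : a.day < b.day;
            });
  return result;
}
```

Two stored rules covering six years occupy two entries, where an eager
expansion would hold twelve transitions. A query for a year outside the range
produces nothing.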
    
    There is a bug in some time zone transitions. When generating a sys_info
    object immediately after one that was previously generated, we need to
    find the previous rule that was in effect and note its offset and
    letters. This is so that the start time and abbreviation of the new
    sys_info will be correct. This only affects time zones that use a format
    like "C%sT" where the LETTERS replacing %s are non-empty for standard
    time, e.g. "Asia/Shanghai" which uses "CST" for standard time and "CDT"
    for daylight time.
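
For reference, the substitution of LETTERS into the format can be sketched
like this. The format_abbrev helper is hypothetical and only shows why the
letters of the rule in effect matter for the resulting abbreviation.

```cpp
#include <cassert>
#include <string>

// Replace the first "%s" in a zone's abbreviation format with the
// LETTERS of the rule in effect. Formats without "%s" are returned as-is.
std::string format_abbrev(std::string format, const std::string& letters)
{
  const auto pos = format.find("%s");
  if (pos != std::string::npos)
    format.replace(pos, 2, letters);
  return format;
}
```

With the format "C%sT", the letters "S" give "CST" and "D" give "CDT". If the
letters of the previous rule are not looked up, the wrong abbreviation can be
reported.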
    
    The tzdb_list structure maintains a linked list of tzdb nodes using
    shared_ptr links. This allows the iterators into the list to share
    ownership with the list itself. This offers a non-portable solution to a
    lifetime issue in the API. Because tzdb objects can be erased from the
    list using tzdb_list::erase_after, separate modules/libraries in a large
    program cannot guarantee that any const tzdb& or const time_zone*
    remains valid indefinitely. Holding onto a tzdb_list::const_iterator
    will extend the tzdb object's lifetime, even if it's erased from the
    list. An alternative design would be for the list iterator to hold a
    weak_ptr. This would allow users to test whether the tzdb still exists
    when the iterator is dereferenced, which is better than just having a
    dangling raw pointer. That doesn't actually extend the tzdb's lifetime
    though, and every use of it would need to be preceded by checking the
    weak_ptr. Using shared_ptr adds a little bit of overhead but allows
    users to solve the lifetime issue if they rely on the libstdc++-specific
    iterator property.
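
The lifetime property can be demonstrated with a small stand-alone model. This
is a sketch of the general technique, not the real tzdb_list: the Node and
VersionList types are hypothetical, and a version string stands in for a
whole tzdb object.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// One list node. The shared_ptr link means a node stays alive for as
// long as either the list or an iterator refers to it.
struct Node
{
  std::string version;
  std::shared_ptr<Node> next;
};

class VersionList
{
  std::shared_ptr<Node> head_;

public:
  // The iterator shares ownership of the node it points to.
  struct const_iterator
  {
    std::shared_ptr<Node> node;
    const std::string& operator*() const { return node->version; }
    const_iterator next() const { return const_iterator{node->next}; }
  };

  void push_front(std::string version)
  {
    auto n = std::make_shared<Node>();
    n->version = std::move(version);
    n->next = head_;
    head_ = std::move(n);
  }

  const_iterator begin() const { return const_iterator{head_}; }

  // Unlink the node after pos. It is destroyed only once no iterator
  // refers to it any more.
  void erase_after(const const_iterator& pos)
  {
    assert(pos.node && pos.node->next);
    auto after = pos.node->next->next;
    pos.node->next = std::move(after);
  }
};
```

An iterator obtained before erase_after can still be dereferenced afterwards,
and the node is freed when that iterator is destroyed or reassigned. With a
weak_ptr in the iterator instead, the node would be freed at erase_after and
every dereference would first have to check whether it had expired.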
    
    libstdc++-v3/ChangeLog:
    
    	* acinclude.m4 (GLIBCXX_ZONEINFO_DIR): New macro.
    	* config.h.in: Regenerate.
    	* config/abi/pre/gnu.ver: Export new symbols.
    	* configure: Regenerate.
    	* configure.ac (GLIBCXX_ZONEINFO_DIR): Use new macro.
    	* include/std/chrono (utc_clock::from_sys): Correct handling
    	of leap seconds.
    	(nonexistent_local_time::_M_make_what_str): Define.
    	(ambiguous_local_time::_M_make_what_str): Define.
    	(__throw_bad_local_time): Define new function.
    	(time_zone, tzdb_list, tzdb): Implement all members.
    	(remote_version, zoned_time, get_leap_second_info): Define.
    	* include/std/version: Add comment for __cpp_lib_chrono.
    	* src/c++20/Makefile.am: Add new file.
    	* src/c++20/Makefile.in: Regenerate.
    	* src/c++20/tzdb.cc: New file.
    	* testsuite/lib/libstdc++.exp: Define effective target tzdb.
    	* testsuite/std/time/clock/file/members.cc: Check file_time
    	alias and file_clock::now() member.
    	* testsuite/std/time/clock/gps/1.cc: Likewise for gps_clock.
    	* testsuite/std/time/clock/tai/1.cc: Likewise for tai_clock.
    	* testsuite/std/time/syn_c++20.cc: Uncomment everything except
    	parse.
    	* testsuite/std/time/clock/utc/leap_second_info.cc: New test.
    	* testsuite/std/time/exceptions.cc: New test.
    	* testsuite/std/time/time_zone/get_info_local.cc: New test.
    	* testsuite/std/time/time_zone/get_info_sys.cc: New test.
    	* testsuite/std/time/time_zone/requirements.cc: New test.
    	* testsuite/std/time/tzdb/1.cc: New test.
    	* testsuite/std/time/tzdb/leap_seconds.cc: New test.
    	* testsuite/std/time/tzdb_list/1.cc: New test.
    	* testsuite/std/time/tzdb_list/requirements.cc: New test.
    	* testsuite/std/time/zoned_time/1.cc: New test.
    	* testsuite/std/time/zoned_time/custom.cc: New test.
    	* testsuite/std/time/zoned_time/deduction.cc: New test.
    	* testsuite/std/time/zoned_time/req_neg.cc: New test.
    	* testsuite/std/time/zoned_time/requirements.cc: New test.
    	* testsuite/std/time/zoned_traits.cc: New test.