Rebalance file_cache input line cache dynamically
The input context file_cache maintains an array of anchors to speed up
accessing lines before the previous line.  The array has a fixed upper
size and the algorithm relies on the linemap reporting the maximum
number of lines in the file in advance to compute the position of each
anchor in the cache.

This doesn't work for C, which doesn't know the maximum number of lines
before the file has finished parsing.  The code has a fallback for this,
but it is quite inefficient and effectively defeats the cache, so many
accesses have to go through most of the input buffer to compute line
boundaries.  For large files this can be very costly, as demonstrated
in PR118168.

Use a different algorithm to maintain the cache without needing the
maximum number of lines in advance.  When the cache runs out of entries
and the gap to the last line anchor gets too large, prune every second
entry in the cache.  This maintains even spacing of the line anchors
without requiring the maximum index.

For the original PR this reduces the overhead of enabling
-Wmisleading-indentation to 32% with the default cache size.  With a
10k entry cache it becomes noise:

  cc1 -O0 -fsyntax-only mypy.c -quiet ran
    1.03 ± 0.05 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=10000
    1.09 ± 0.08 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation --param=file-cache-lines=1000
    1.32 ± 0.07 times faster than cc1 -O0 -fsyntax-only mypy.c -quiet -Wmisleading-indentation

The code could be further optimized, e.g. by using the vectorized line
search functions the preprocessor uses.  Also, it seems the input cache
always reads the whole file into memory, so perhaps it should just use
mmap where possible.

gcc/ChangeLog:

	PR preprocessor/118168
	* input.cc (file_cache_slot::get_next_line): Use new algorithm
	to maintain the cache.
	(file_cache_slot::read_line_num): Use binary search for lookup.
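As an illustration of the rebalancing scheme described above, here is a
minimal C++ sketch (not the actual input.cc implementation): anchors are
recorded at a growing stride, and when the fixed-size cache fills up every
second entry is pruned, doubling the spacing while keeping it even; lookups
then binary-search for the closest anchor at or before the requested line.
The names line_anchor, anchor_cache, note_line and lookup are hypothetical.

// Minimal sketch of dynamic rebalancing of a fixed-size line-anchor cache.
// Not the GCC implementation; names and layout are illustrative only.
#include <algorithm>
#include <cstddef>
#include <vector>

struct line_anchor
{
  size_t line;    // Line number the anchor refers to.
  size_t offset;  // Byte offset of that line's start in the file buffer.
};

class anchor_cache
{
  std::vector<line_anchor> anchors;
  size_t capacity;    // Fixed upper size of the cache.
  size_t stride = 1;  // Record an anchor roughly every 'stride' lines.

public:
  explicit anchor_cache (size_t cap) : capacity (cap) {}

  // Called as the reader discovers each new line while scanning forward.
  void note_line (size_t line, size_t offset)
  {
    // Only record an anchor once we have advanced 'stride' lines past the
    // previous one, so the anchors stay roughly evenly spaced.
    if (!anchors.empty () && line < anchors.back ().line + stride)
      return;
    if (anchors.size () == capacity)
      {
        // Out of entries: prune every second anchor, which doubles the
        // effective spacing while keeping it even, then carry on.
        size_t kept = 0;
        for (size_t i = 0; i < anchors.size (); i += 2)
          anchors[kept++] = anchors[i];
        anchors.resize (kept);
        stride *= 2;
      }
    anchors.push_back ({line, offset});
  }

  // Binary search for the closest anchor at or before 'line'; the caller
  // resumes scanning from the returned offset instead of from the start
  // of the buffer.
  line_anchor lookup (size_t line) const
  {
    line_anchor key = {line, 0};
    auto it = std::upper_bound (anchors.begin (), anchors.end (), key,
                                [] (const line_anchor &a,
                                    const line_anchor &b)
                                { return a.line < b.line; });
    if (it == anchors.begin ())
      return {1, 0};  // No earlier anchor; start from the beginning.
    return *(it - 1);
  }
};

Pruning in place keeps a rebalance linear in the cache size, and because the
surviving anchors remain sorted by line number, the binary search in lookup
stays valid after every prune.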