Julian Brown
authored
This is an optimisation for middle-end worker-partitioning support (used to support multiple workers on AMD GCN). At present, barriers may be emitted in cases where they aren't needed and cannot be optimised away. This patch stops the extraneous barriers from being emitted in the first place. One exception to the above (where the barrier is still needed) is for predicated blocks of code that perform a write to gang-private shared memory from one worker. We must execute a barrier before other workers read that shared memory location. gcc/ * config/gcn/gcn.c (gimple.h): Include. (gcn_fork_join): Emit barrier for worker-level joins. * omp-oacc-neuter-broadcast.cc (find_local_vars_to_propagate): Add writes_gang_private bitmap parameter. Set bit for blocks containing gang-private variable writes. (worker_single_simple): Don't emit barrier after predicated block. (worker_single_copy): Don't emit barrier if we're not broadcasting anything and the block contains no gang-private writes. (neuter_worker_single): Don't predicate blocks that only contain NOPs or internal marker functions. Pass has_gang_private_write argument to worker_single_copy. (oacc_do_neutering): Add writes_gang_private bitmap handling.
Name | Last commit | Last update |
---|