Skip to content
Snippets Groups Projects
Commit fdc7469c authored by Chung-Lin Tang's avatar Chung-Lin Tang
Browse files

nvptx: reimplement libgomp barriers [PR99555]

Instead of trying to have the GPU do CPU-with-OS-like things, this new barriers
implementation for NVPTX uses simplistic bar.* synchronization instructions.
Tasks are processed after threads have joined, and only if team->task_count != 0

It is noted that: there might be a little bit of performance forfeited for
cases where earlier arriving threads could've been used to process tasks ahead
of other threads, but that has the requirement of implementing complex
futex-wait/wake like behavior, which is what we're try to avoid with this patch.
It is deemed that task processing is not what GPU target offloading is usually
used for.

Implementation highlight notes:
1. gomp_team_barrier_wake() is now an empty function (threads never "wake" in
   the usual manner)
2. gomp_team_barrier_cancel() now uses the "exit" PTX instruction.
3. gomp_barrier_wait_last() now is implemented using "bar.arrive"

4. gomp_team_barrier_wait_end()/gomp_team_barrier_wait_cancel_end():
   The main synchronization is done using a 'bar.red' instruction. This reduces
   across all threads the condition (team->task_count != 0), to enable the task
   processing down below if any thread created a task.
   (this bar.red usage means that this patch is dependent on the prior NVPTX
   bar.red GCC patch)

	PR target/99555

libgomp/ChangeLog:

	* config/nvptx/bar.c (generation_to_barrier): Remove.
	(futex_wait,futex_wake,do_spin,do_wait): Remove.
	(GOMP_WAIT_H): Remove.
	(#include "../linux/bar.c"): Remove.
	(gomp_barrier_wait_end): New function.
	(gomp_barrier_wait): Likewise.
	(gomp_barrier_wait_last): Likewise.
	(gomp_team_barrier_wait_end): Likewise.
	(gomp_team_barrier_wait): Likewise.
	(gomp_team_barrier_wait_final): Likewise.
	(gomp_team_barrier_wait_cancel_end): Likewise.
	(gomp_team_barrier_wait_cancel): Likewise.
	(gomp_team_barrier_cancel): Likewise.
	* config/nvptx/bar.h (gomp_barrier_t): Remove waiters, lock fields.
	(gomp_barrier_init): Remove init of waiters, lock fields.
	(gomp_team_barrier_wake): Remove prototype, add new static inline
	function.
parent 623daaf8
No related branches found
No related tags found
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment