Commit 07df04d

nhaehnle authored and alexdeucher committed
drm/amdgpu: fix race condition in amd_sched_entity_push_job
As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available.

I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit.

Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079
Reviewed-by: Christian König <christian.koenig@amd.com>
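The pattern described above can be illustrated outside the kernel. The following is a minimal user-space sketch, not the driver code: `struct scheduler`, `struct job`, `struct job_queue`, `push_job`, `consume_one`, the `after_unlock` hook and `run_demo` are all hypothetical names made up for this example, and a C11 `atomic_flag` stands in for the kernel spinlock. The hook forces the worst-case interleaving deterministically: the consumer frees the job immediately after the producer drops the lock, so the producer may only use the pointer it cached beforehand.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the scheduler and job; not the kernel structs. */
struct scheduler { int wakeups; };
struct job { struct scheduler *sched; struct job *next; };

struct job_queue {
	atomic_flag lock;                               /* stand-in for the spinlock */
	struct job *head, *tail;
	void (*after_unlock)(struct job_queue *q);      /* test hook, may be NULL */
};

static void queue_lock(struct job_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		; /* spin until the flag is released */
}

static void queue_unlock(struct job_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Consumer side: pop one job and free it, as a scheduler thread might. */
static void consume_one(struct job_queue *q)
{
	struct job *job;

	queue_lock(q);
	job = q->head;
	if (job) {
		q->head = job->next;
		if (!q->head)
			q->tail = NULL;
	}
	queue_unlock(q);

	if (job) {
		job->sched = NULL;      /* poison, then release the memory */
		free(job);
	}
}

/* Producer side: returns 1 if this was the first job in the queue. */
static int push_job(struct job_queue *q, struct job *job)
{
	struct scheduler *sched = job->sched;   /* cache BEFORE publishing the job */
	int first;

	assert(sched != NULL);
	job->next = NULL;

	queue_lock(q);
	first = (q->head == NULL);
	if (q->tail)
		q->tail->next = job;
	else
		q->head = job;
	q->tail = job;
	queue_unlock(q);

	/* From here on the consumer owns the job; it may already be freed. */
	if (q->after_unlock)
		q->after_unlock(q);

	if (first)
		sched->wakeups++;       /* uses the cached pointer, never job->sched */
	return first;
}

/* Pushes one job while the consumer frees it right after the unlock. */
int run_demo(void)
{
	struct scheduler sched = { 0 };
	struct job_queue q = { .lock = ATOMIC_FLAG_INIT, .after_unlock = consume_one };
	struct job *job = malloc(sizeof(*job));

	if (!job)
		return -1;
	job->sched = &sched;
	push_job(&q, job);
	return sched.wakeups;
}
```

In this sketch, reading `job->sched` after the unlock instead of using the cached `sched` would touch freed memory, which is the class of bug the commit message describes.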
1 parent e2f784f commit 07df04d

1 file changed

Lines changed: 3 additions & 2 deletions

drivers/gpu/drm/amd/scheduler/gpu_scheduler.c

@@ -288,6 +288,7 @@ amd_sched_entity_pop_job(struct amd_sched_entity *entity)
  */
 static bool amd_sched_entity_in(struct amd_sched_job *sched_job)
 {
+	struct amd_gpu_scheduler *sched = sched_job->sched;
 	struct amd_sched_entity *entity = sched_job->s_entity;
 	bool added, first = false;
 
@@ -302,7 +303,7 @@ static bool amd_sched_entity_in(struct amd_sched_job *sched_job)
 
 	/* first job wakes up scheduler */
 	if (first)
-		amd_sched_wakeup(sched_job->sched);
+		amd_sched_wakeup(sched);
 
 	return added;
 }
@@ -318,9 +319,9 @@ void amd_sched_entity_push_job(struct amd_sched_job *sched_job)
 {
 	struct amd_sched_entity *entity = sched_job->s_entity;
 
+	trace_amd_sched_job(sched_job);
 	wait_event(entity->sched->job_scheduled,
 		   amd_sched_entity_in(sched_job));
-	trace_amd_sched_job(sched_job);
 }
 
 /**
