sycl: add GGML_SYCL_FATTN_VEC_NTHREADS build option #25205

Open
Titaniumtown wants to merge 1 commit into ggml-org:master from Titaniumtown:pr/sycl_fattn_vec_nthreads

@Titaniumtown

Overview

Adds a GGML_SYCL_FATTN_VEC_NTHREADS build option that sets the VEC_NTHREADS value at compile time. This is useful when the hardware supports a value other than 128.

Additional information

On first-generation Intel Arc graphics, 128 is the native value. On second-generation Arc graphics, the hardware supports 256.

Requirements

The default is 128, but for newer hardware such as the Intel Arc B70, 256 is optimal.
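
A minimal sketch of how a build option like this could surface in the source, assuming the option reaches the compiler as a preprocessor definition of the same name. The constant name `fattn_vec_nthreads` and the power-of-two check are illustrative assumptions, not taken from the PR:

```cpp
#include <cstddef>

// Assumption: the build system forwards the option as a preprocessor
// definition, e.g. -DGGML_SYCL_FATTN_VEC_NTHREADS=256.
#ifndef GGML_SYCL_FATTN_VEC_NTHREADS
#define GGML_SYCL_FATTN_VEC_NTHREADS 128  // default stated in the PR description
#endif

// Hypothetical constant name; the identifier used in the PR may differ.
constexpr std::size_t fattn_vec_nthreads = GGML_SYCL_FATTN_VEC_NTHREADS;

// Reject obviously invalid values at compile time. The power-of-two
// restriction is an assumption made for this sketch only.
static_assert(fattn_vec_nthreads > 0 &&
              (fattn_vec_nthreads & (fattn_vec_nthreads - 1)) == 0,
              "GGML_SYCL_FATTN_VEC_NTHREADS must be a positive power of two");
```

With no definition supplied, the constant falls back to 128. A value fixed this way applies to every device the binary runs on, which is the limitation the review below raises.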
@Titaniumtown Titaniumtown requested a review from a team as a code owner July 1, 2026 15:51
@github-actions github-actions bot added the ggml and SYCL labels Jul 1, 2026

@arthw arthw left a comment

Contributor

The idea of setting a different value for different hardware is good.
But hard-coding it as a build option is not a good method.
The base code already has a framework that checks the hardware type and sets parameter values during initialization.
Is it possible to refactor the code along those lines?

Code that checks the hardware type:

In ggml-sycl.cpp

if (!(ggml_sycl_info().devices[ctx.device].hw_info.arch ==
          gpu_arch::intel_gpu_acm_g10 &&
      src0->type == GGML_TYPE_Q4_0)) {
    use_dequantize_mul_mat_vec =
        use_dequantize_mul_mat_vec && !use_mul_mat_vec_q;
}

Code that checks the hardware and sets a parameter during initialization:

In ggml-sycl.cpp

info.devices[i].max_wg_per_cu = info.max_work_group_sizes[i] / prop.get_max_compute_units();
info.devices[i].hw_info = get_device_hw_info(&device);

In this case, you could add a field such as info.devices[i].xxx and set it to 128 for Xe and 256 for Xe2.
The rest of the code can then read info.devices[i].xxx to get the right value.

Thank you!
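
The reviewer's suggestion can be sketched as follows. This is a self-contained illustration under stated assumptions, not the backend's actual code: the enum here is a stand-in for the real `gpu_arch`, `example_xe2` is a placeholder for whichever Xe2 architecture value applies, and `fattn_vec_nthreads` is a hypothetical name for the field written as `xxx` in the review.

```cpp
#include <cstddef>

// Stand-in for the backend's gpu_arch enum. Only intel_gpu_acm_g10 appears
// in the review comment; example_xe2 is a hypothetical placeholder.
enum class gpu_arch { unknown, intel_gpu_acm_g10, example_xe2 };

// Hypothetical, reduced versions of the per-device records.
struct hw_info_t {
    gpu_arch arch;
};

struct device_info_t {
    hw_info_t   hw_info;
    std::size_t fattn_vec_nthreads;  // the "xxx" field from the review (name assumed)
};

// Map the detected architecture to a thread count: 256 for Xe2,
// 128 for everything else, matching the values given in the PR.
constexpr std::size_t pick_fattn_vec_nthreads(gpu_arch arch) {
    return arch == gpu_arch::example_xe2 ? 256 : 128;
}

// Called once per device during initialization; kernels read the field later.
inline void init_device(device_info_t & dev, gpu_arch detected_arch) {
    dev.hw_info.arch       = detected_arch;
    dev.fattn_vec_nthreads = pick_fattn_vec_nthreads(detected_arch);
}
```

Compared with a build option, this selects the value per device at run time, so a single binary can serve both hardware generations. If the kernel needs the value as a compile-time constant, the runtime field would have to dispatch between separately instantiated kernels, which this sketch does not cover.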
