
perf(cuBLAS): store device pointers in ggml_tensor #1194


Closed
jon-chuang opened this issue Apr 26, 2023 · 1 comment

jon-chuang commented Apr 26, 2023

If the tensor keeps its device pointer, we do not need to copy tensor data device-to-host (DTH) or host-to-device (HTD). This is similar to how PyTorch does it.

In self-attention, the kv cache could still be on host, but the host launches kernels on the device data based on the cache.

The ggml_tensor would provide methods that sync its data to the operator's device type, hiding this complexity from callers.
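A minimal sketch of what such a sync method could look like, in plain C. Everything here is hypothetical (`lazy_tensor`, `lazy_tensor_sync`, `lazy_tensor_mark_written` are not part of ggml), and "device" memory is simulated with `calloc`/`memcpy` so the sketch runs without CUDA; a real version would presumably use `cudaMalloc` and `cudaMemcpy` at the marked spots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; this is not ggml's actual API.
   Device memory is simulated on the host so the sketch runs without CUDA. */
typedef enum { LOC_HOST, LOC_DEVICE } location_t;

typedef struct {
    size_t nbytes;
    void  *host_data;     /* host buffer */
    void  *device_data;   /* stored device pointer (simulated) */
    bool   host_valid;    /* host copy is up to date */
    bool   device_valid;  /* device copy is up to date */
    int    copies;        /* transfers performed so far, for illustration */
} lazy_tensor;

static lazy_tensor lazy_tensor_new(size_t nbytes) {
    lazy_tensor t;
    assert(nbytes > 0);
    t.nbytes       = nbytes;
    t.host_data    = calloc(1, nbytes);
    t.device_data  = calloc(1, nbytes);  /* stand-in for cudaMalloc */
    t.host_valid   = true;               /* data starts out on the host */
    t.device_valid = false;
    t.copies       = 0;
    assert(t.host_data && t.device_data);
    return t;
}

/* Return a pointer usable where the operator runs; copy only if that side is stale. */
static void *lazy_tensor_sync(lazy_tensor *t, location_t where) {
    if (where == LOC_DEVICE) {
        if (!t->device_valid) {
            memcpy(t->device_data, t->host_data, t->nbytes);  /* stand-in for an HTD copy */
            t->device_valid = true;
            t->copies++;
        }
        return t->device_data;
    }
    if (!t->host_valid) {
        memcpy(t->host_data, t->device_data, t->nbytes);      /* stand-in for a DTH copy */
        t->host_valid = true;
        t->copies++;
    }
    return t->host_data;
}

/* Record that an operator wrote the tensor at `where`; the other side becomes stale. */
static void lazy_tensor_mark_written(lazy_tensor *t, location_t where) {
    t->host_valid   = (where == LOC_HOST);
    t->device_valid = (where == LOC_DEVICE);
}

static void lazy_tensor_free(lazy_tensor *t) {
    free(t->host_data);
    free(t->device_data);   /* stand-in for cudaFree */
    t->host_data = t->device_data = NULL;
}
```

With this shape, two consecutive device operators reading the same tensor would trigger one HTD copy rather than two, and a DTH copy would happen only when a host operator reads a tensor last written on the device.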

Unfortunately, lazy sync is not the smartest approach: knowing the full compute graph makes it much easier to identify sync points, and copy and compute can then be overlapped wherever a sync is required.

example:

if graph.sync_required(&tensor) {
  cudaMemcpyAsync(...); // e.g. DTH
}
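As a rough illustration of how sync points could be identified from the full graph, here is a host-only C sketch. The node layout, `graph_node`, and `find_sync_points` are assumptions made for this example and do not reflect ggml's actual graph structures. It assumes nodes are stored in topological order and marks every node whose output is consumed on a different backend than the one that produced it; those are the places where an async copy would be issued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical graph representation, not ggml's. */
typedef enum { BACKEND_CPU, BACKEND_CUDA } backend_t;

#define MAX_SRC 2

typedef struct {
    backend_t backend;       /* where this op runs (leaf nodes: where the data lives) */
    int       src[MAX_SRC];  /* indices of producer nodes, -1 if unused */
} graph_node;

/* Sets needs_copy[p] when some consumer of node p runs on a different backend.
   Returns the number of outputs that need a transfer (each counted once). */
static int find_sync_points(const graph_node *nodes, int n, bool *needs_copy) {
    int count = 0;
    for (int i = 0; i < n; i++) {
        needs_copy[i] = false;
    }
    for (int i = 0; i < n; i++) {
        for (int s = 0; s < MAX_SRC; s++) {
            int p = nodes[i].src[s];
            if (p < 0) {
                continue;              /* unused input slot */
            }
            assert(p < i);             /* producers come first in topological order */
            if (nodes[p].backend != nodes[i].backend && !needs_copy[p]) {
                needs_copy[p] = true;  /* one transfer serves all cross-backend consumers */
                count++;
            }
        }
    }
    return count;
}
```

Because the whole graph is scanned up front, the transfers are known before execution starts, so each one could in principle be queued on a separate stream as soon as its producer finishes instead of blocking at the first use.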
@jon-chuang jon-chuang changed the title perf(cuBLAS): store device pointers in ggml_tensor perf(cuBLAS): store device pointers in ggml_tensor; lazily copy Apr 26, 2023
@jon-chuang jon-chuang changed the title perf(cuBLAS): store device pointers in ggml_tensor; lazily copy perf(cuBLAS): store device pointers in ggml_tensor; lazily copy based on operator CUDA support Apr 26, 2023
@jon-chuang jon-chuang changed the title perf(cuBLAS): store device pointers in ggml_tensor; lazily copy based on operator CUDA support perf(cuBLAS): store device pointers in ggml_tensor Apr 26, 2023
@github-actions github-actions bot added the stale label Mar 25, 2024

github-actions bot commented Apr 9, 2024

This issue was closed because it has been inactive for 14 days since being marked as stale.
