Refactor Harmonize for Host-Side Python Bindings and Modularized, Header-Only Library Structure #4

Merged
merged 22 commits into from
Apr 15, 2024
Commits
a0b9ab2
Update .gitignore
braxtoncuneo Mar 7, 2024
afe3d8e
Checkout for comparison
braxtoncuneo Mar 26, 2024
8579ea1
Checkpoint
braxtoncuneo Mar 26, 2024
f0f3c4b
Initial no-objmode bindings working
braxtoncuneo Mar 29, 2024
f36b11d
Single-gpu-program compilation working
braxtoncuneo Mar 30, 2024
200a9cc
Fixed up lookup test problem
braxtoncuneo Apr 2, 2024
16d3986
Have renaming working (probably)
braxtoncuneo Apr 12, 2024
9024d08
Moved common specification to a per-specialization specification, als…
braxtoncuneo Apr 12, 2024
5d352a3
Refactored util and bindings for proper header/source files
braxtoncuneo Apr 12, 2024
46a8d88
Preliminary implementation of delayed compilation/linking
braxtoncuneo Apr 14, 2024
3cc2a22
IT'S ALIVEgit status! (multi-program compilation/linking working)
braxtoncuneo Apr 14, 2024
405696f
Added cached compilation
braxtoncuneo Apr 14, 2024
1ecb020
Fixed pathing in compilation
braxtoncuneo Apr 14, 2024
ffa316d
Restructured repo back into a header-only library, but with more conv…
braxtoncuneo Apr 14, 2024
d2be3e0
Moved tests to examples
braxtoncuneo Apr 14, 2024
e7b3c11
Decomposed headers
braxtoncuneo Apr 14, 2024
d473f44
Merged main into no-objmode
braxtoncuneo Apr 14, 2024
db6340d
Removed dependence on nvidia-smi for compute level detection
braxtoncuneo Apr 14, 2024
e71a19b
Revised after testing on lassen
braxtoncuneo Apr 14, 2024
b0aa847
Potentially fixed checkout bug in async program
braxtoncuneo Apr 15, 2024
8043c92
Confirmed checkout fix on lassen
braxtoncuneo Apr 15, 2024
de2c231
Fixed nvcc path for non-lassen usage
braxtoncuneo Apr 15, 2024
5 changes: 5 additions & 0 deletions .gitignore
@@ -10,7 +10,12 @@ test_problems/simple_neutron/neut_evt
*.exe
*.so
*.ptx
*.pyc

__pycache__/
__ptxcache__/

build/

CMakeLists.txt

62 changes: 62 additions & 0 deletions README.md
@@ -0,0 +1,62 @@
# Harmonize

Harmonize is a GPU execution framework that aims to increase performance and reduce the development burden for applications requiring complex processing patterns.
Harmonize is currently available both as a header-only CUDA C++ library and as a Python package. In both forms, functionality is exposed as an asynchronous processing framework.

## Fundamentals

The system of asynchronous functions executed within a given runtime is defined in advance through a **program specification**.
A program specification represents this system only abstractly, but it may be transformed into an implementation by declaring a **program** ***specialization***.

Program specifications are defined by application developers and represent the business logic of what needs to be accomplished on the GPU, whereas the templates (**program types**) used to transform them into specializations are defined by framework developers.
This structure is used because it keeps most of the underlying implementation details away from application developers, delegating those concerns to the framework developers.
Ideally, when a new program type is added, transitioning a codebase to it could require as little as a one-line refactor:

```
< using MyProgram = EventProgram<MyProgramSpec>;
> using MyProgram = AsyncProgram<MyProgramSpec>;
```

Likewise, as long as the interface between the specification and the program type does not change, program types may be updated without requiring refactors from the application developers.
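
The separation between specification and program type can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Harmonize's actual API: `MyProgramSpec`, `EventProgram`, `AsyncProgram`, `async_call`, `run`, and `count_steps` are all hypothetical stand-ins, and both "program types" here run on the host with no GPU involved. The specification holds only the business logic, while each program type decides how pending calls are scheduled.

```python
class MyProgramSpec:
    """Hypothetical specification: business logic only, no scheduling."""

    @staticmethod
    def collatz(runtime, value, steps):
        # Each call handles one step and asynchronously requests the next.
        if value == 1:
            runtime.results.append(steps)
        elif value % 2 == 0:
            runtime.async_call("collatz", value // 2, steps + 1)
        else:
            runtime.async_call("collatz", 3 * value + 1, steps + 1)


class EventProgram:
    """Hypothetical program type: processes pending calls in whole batches."""

    def __init__(self, spec):
        self.spec = spec
        self.pending = []
        self.results = []

    def async_call(self, name, *args):
        # Record the call; evaluation happens later, inside run().
        self.pending.append((name, args))

    def run(self):
        while self.pending:
            batch, self.pending = self.pending, []
            for name, args in batch:
                getattr(self.spec, name)(self, *args)


class AsyncProgram(EventProgram):
    """Hypothetical program type: evaluates the most recent call first."""

    def run(self):
        while self.pending:
            name, args = self.pending.pop()
            getattr(self.spec, name)(self, *args)


def count_steps(program_type, seeds):
    # Swapping the program type is the only change needed at the call site.
    program = program_type(MyProgramSpec)
    for seed in seeds:
        program.async_call("collatz", seed, 0)
    program.run()
    return sorted(program.results)
```

Calling `count_steps(EventProgram, seeds)` and `count_steps(AsyncProgram, seeds)` yields the same sorted results even though the two classes evaluate calls in different orders, which is the property that makes the one-line swap possible.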

## Why Async?

Asynchronous programming represents a looser contract between program and execution environment.
Once a function is called, there is no guarantee of where or when that call is actually evaluated.
If the call is lazy, then the evaluation may not happen at all.

This looser contract is useful because it allows an interface to represent a wide variety of execution strategies without breaking any promises.
By representing all program types as different asynchronous runtimes, they can be used interchangeably as long as they fulfill the few promises made by the runtime's interface.
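
The looser contract described above can be illustrated with a small, self-contained Python sketch. `LazyCall` and `square` are hypothetical names introduced only for this example and are not part of Harmonize. Creating a call does not evaluate it, and a lazy call whose result is never requested is never evaluated at all.

```python
class LazyCall:
    """Hypothetical handle for a call whose evaluation is deferred."""

    def __init__(self, fn, *args):
        self.fn = fn
        self.args = args
        self.evaluated = False
        self._value = None

    def result(self):
        # Evaluate on first request only; later requests reuse the value.
        if not self.evaluated:
            self._value = self.fn(*self.args)
            self.evaluated = True
        return self._value


log = []  # records which inputs were actually evaluated


def square(x):
    log.append(x)
    return x * x


used = LazyCall(square, 4)
unused = LazyCall(square, 5)
calls_before_request = list(log)  # still empty: nothing has run yet

value = used.result()  # forces evaluation of the first call only
```

Because callers rely only on "the result is available when requested", a runtime is free to evaluate eagerly, in batches, or not at all, without changing the calling code.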


## Harmonize on AMD?

An AMD-compatible version of Harmonize is in development, but is not yet available as either a full or an experimental release.


## Dependencies

### CUDA C++ Dependencies

The CUDA C++ framework currently requires:

- a CUDA compiler
- the CUDA runtime (host and device)



### Python Dependencies

Python bindings require:

- Non-package Dependencies
  - `nvcc`
  - the CUDA runtime (host and device)

- Python Package Dependencies
  - `numpy`
  - `numba`
  - `llvmlite`
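
A quick way to check the Python-side dependencies listed above is sketched below. The helper name `check_python_binding_dependencies` is hypothetical and not part of Harmonize. It uses only the standard library, looks for `nvcc` on `PATH`, and tests whether each package can be found without importing it. It does not verify the CUDA runtime itself, and an `nvcc` installed outside `PATH` will be reported as missing.

```python
import importlib.util
import shutil


def check_python_binding_dependencies():
    """Return a dict mapping each dependency name to True if it was found."""
    # shutil.which returns None when the executable is not on PATH.
    report = {"nvcc": shutil.which("nvcc") is not None}
    for package in ("numpy", "numba", "llvmlite"):
        # find_spec returns None when a top-level package is not installed.
        report[package] = importlib.util.find_spec(package) is not None
    return report


if __name__ == "__main__":
    for name, found in check_python_binding_dependencies().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```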



Binary file removed __pycache__/harmonize.cpython-36.pyc
Binary file not shown.