Initialize max number of global memory definition for simulator #104

sherry-yuan · 2022-04-14T18:57:18Z

Simulator does not have any global memory interface information until the actuall aocx is loaded.
(Note this is only a problem for simulator not hardware run, in hardware run, we can communicate with BSP to query memory interface information)
Prior to loading aocx it uses predefined autodiscovery [1] to initialize its global memory interface, which has only 1 global memory
In the sycl runtime flow today, the USM device allocation call happens before aocx is loaded.
The aocx is loaded when clCreateProgram is called, which typically happen on first kernel launch in sycl runtime.
The USM device allocation on mutli global memory system will fail because there are in total 1 global memory as defined in [1] but the user is requesting more than 1 device global memory.
User could go around this issue by launching a sacrificial kernel that uses shared allocation as kernel argument. This will setup the correct global memory interface in runtime.
This change eliminate the need to run a sacrificial kernel.
However there are a few downside:

The address range/size may not be exactly the same as the one that is in aocx, but this is not too large of a problem because runtime first fit allocation algorithm will fill the lowest address range first. Unless user requested more than what is availble.
it potentially occupied more space than required
will not error out when user requested a non-existing device global memory because we are using ACL_MAX_GLOBAL_MEM for num_global_mem_systems

[1]

fpga-runtime-for-opencl/include/acl_shipped_board_cfgs.h

Line 7 in 950f21d

"23 16 sample40byterandomhash000000000000000000 a10gx 0 1 9 DDR 2 1 2 0 "

pcolberg · 2022-04-14T20:23:35Z

Looks like git checkout is broken. Maybe GitHub changed their container setup?
https://github.com/intel/fpga-runtime-for-opencl/runs/6029451177?check_suite_focus=true

2022-04-14T19:08:35.1638886Z ##[group]Run sudo chown -R build:build .
2022-04-14T19:08:35.1639231Z �[36;1msudo chown -R build:build .�[0m
2022-04-14T19:08:35.1643269Z shell: sh -e {0}
2022-04-14T19:08:35.1643503Z ##[endgroup]
2022-04-14T19:08:35.3237562Z ##[group]Run actions/checkout@v2
2022-04-14T19:08:35.3237813Z with:
2022-04-14T19:08:35.3238075Z   repository: intel/fpga-runtime-for-opencl
2022-04-14T19:08:35.3238576Z   token: ***
2022-04-14T19:08:35.3238765Z   ssh-strict: true
2022-04-14T19:08:35.3238996Z   persist-credentials: true
2022-04-14T19:08:35.3239223Z   clean: true
2022-04-14T19:08:35.3239407Z   fetch-depth: 1
2022-04-14T19:08:35.3239606Z   lfs: false
2022-04-14T19:08:35.3239800Z   submodules: false
2022-04-14T19:08:35.3239993Z ##[endgroup]
2022-04-14T19:08:35.3334732Z ##[command]/usr/bin/docker exec  5b270e3f98b5ba860472430574ecd30e726cd60a8663a6981828793fee29b786 sh -c "cat /etc/*release | grep ^ID"
2022-04-14T19:08:35.8300296Z Syncing repository: intel/fpga-runtime-for-opencl
2022-04-14T19:08:35.8301911Z ##[group]Getting Git version info
2022-04-14T19:08:35.8302459Z Working directory is '/__w/fpga-runtime-for-opencl/fpga-runtime-for-opencl'
2022-04-14T19:08:35.8303000Z [command]/usr/bin/git version
2022-04-14T19:08:35.8303819Z git version 2.27.0
2022-04-14T19:08:35.8304529Z ##[endgroup]
2022-04-14T19:08:35.8357485Z ##[error]EACCES: permission denied, mkdir '/__w/_temp/dfdaa6c0-6817-4c98-9c22-84b0ea9c776a'

… runners This resolves a recent regression with GitHub-hosted runners where actions/checkout@v2 fails to create a directory under /__w/_temp/ intel#104 (comment) actions/checkout#47

… runners This resolves a recent regression with GitHub-hosted runners where actions/checkout@v2 fails to create a directory under /__w/_temp/ #104 (comment) actions/checkout#47

sherry-yuan · 2022-04-18T16:48:50Z

No regression was found in external&internal tests. This is ready to merge.

pcolberg

Thanks @sherry-yuan, looks good!

The commit message contains very useful background. What do you think about moving the commit message to a comment in the source code?
@bsyrowik committed a similar change in d9df7a9, although it sets 128 memory systems instead of ACL_MAX_GLOBAL_MEM == 32. With your change that commit is no longer needed, correct? Could you add a git revert d9df7a9e to your pull request? Is there any downside to setting the default 32 instead of the non-default 128 memory systems?

Simulator does not have any global memory interface information until the actuall aocx is loaded. (Note this is only a problem for simulator not hardware run, in hardware run, we can communicate with BSP to query memory interface information) Prior to loading aocx it uses predefined autodiscovery [1] to initialize its global memory interface, which has only 1 global memory In the sycl runtime flow today, the USM device allocation call happens before aocx is loaded. The aocx is loaded when clCreateProgram is called, which typically happen on first kernel launch in sycl runtime. The USM device allocation on mutli global memory system will fail because there are in total 1 global memory as defined in [1] but the user is requesting more than 1 device global memory. User could go around this issue by launching a sacrificial kernel that uses shared allocation as kernel argument. This will setup the correct global memory interface in runtime. This change eliminate the need to run a sacrificial kernel. However there are a few downside: 1. The address range/size may not be exactly the same as the one that is in aocx, but this is not too large of a problem because runtime first fit allocation algorithm will fill the lowest address range first. Unless user requested more than what is availble. 2. it potentially occupied more space than required 3. will not error out when user requested a non-existing device global memory because we are using ACL_MAX_GLOBAL_MEM for num_global_mem_systems [1] https://github.com/intel/fpga-runtime-for-opencl/blob/950f21dd079dfd55a473ba4122a4a9dca450e36f/include/acl_shipped_board_cfgs.h#L7

This reverts commit d9df7a9.

sherry-yuan · 2022-04-18T19:57:18Z

What do you think about moving the commit message to a comment in the source code?

Great idea, I have moved the commit message to line comment with slight modification.

With your change that commit is no longer needed, correct?

Right its no longer needed

Is there any downside to setting the default 32 instead of the non-default 128 memory systems?

I do not think so, as far as I know, all the device so far support up to 32 global memory, therefore there is no point for simulation to support more than 32 global memory.

sherry-yuan requested a review from pcolberg April 14, 2022 18:57

sherry-yuan added the bug Something isn't working label Apr 14, 2022

sherry-yuan force-pushed the sim_mem_config branch from 1c9d0e1 to 08c6e94 Compare April 14, 2022 19:06

pcolberg added this to the 2022.3 milestone Apr 14, 2022

pcolberg assigned sherry-yuan Apr 14, 2022

pcolberg mentioned this pull request Apr 15, 2022

Override uid of default user in container to align with GitHub-hosted runners #105

Merged

sherry-yuan force-pushed the sim_mem_config branch from 08c6e94 to 4ad9ada Compare April 18, 2022 13:59

pcolberg requested changes Apr 18, 2022

View reviewed changes

sherry-yuan force-pushed the sim_mem_config branch from 4ad9ada to 692eb87 Compare April 18, 2022 19:42

sherry-yuan added 2 commits April 18, 2022 15:52

Revert "Fix simulation of multi-memory systems"

93696eb

This reverts commit d9df7a9.

sherry-yuan force-pushed the sim_mem_config branch from 692eb87 to 93696eb Compare April 18, 2022 19:53

pcolberg approved these changes Apr 18, 2022

View reviewed changes

pcolberg merged commit 079d575 into intel:main Apr 18, 2022

sophimao mentioned this pull request Nov 24, 2022

Perform a memory copy for simulation buffer with buffer location #214

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initialize max number of global memory definition for simulator #104

Initialize max number of global memory definition for simulator #104

sherry-yuan commented Apr 14, 2022

pcolberg commented Apr 14, 2022

sherry-yuan commented Apr 18, 2022

pcolberg left a comment

sherry-yuan commented Apr 18, 2022

Initialize max number of global memory definition for simulator #104

Initialize max number of global memory definition for simulator #104

Conversation

sherry-yuan commented Apr 14, 2022

pcolberg commented Apr 14, 2022

sherry-yuan commented Apr 18, 2022

pcolberg left a comment

Choose a reason for hiding this comment

sherry-yuan commented Apr 18, 2022