Loading file list of large directory is too slow #4706
I think this issue is related to #4575. Did you see how these issues fare in Lab? |
@kevin-bates Thanks for the response. Actually, I'm using Lab. For more detail: I opened a directory that contains 10,000 images. In that case everything hangs, and even terminal input stops working. The request eventually times out. I'll add my environment details soon. Sorry for the lack of information. |
I think you might get better traction opening this issue in https://github.com/jupyterlab/jupyterlab since that's where the front-end focus is these days. I suspect "files" are treated differently than directories (which is where I saw the difference in Lab) relative to rendering. |
@kevin-bates Well, I decided to post here because I could confirm this happens in the classic notebook (/tree) as well as in JupyterLab (/lab). The following shows it takes 7.40 seconds to get a response from the notebook server for a request listing 25,089 dentries under
EDIT) notebook/notebook/services/contents/filemanager.py Lines 310 to 343 in 6d15e9c
The sample data used for description is from https://github.com/junyanz/CycleGAN |
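For anyone who wants to reproduce the server-side cost without running a notebook server, here is a minimal sketch. It assumes the per-entry work in the directory listing is dominated by one `os.lstat` call per file, which is only an approximation of what `filemanager.py` does. The names `list_dir_with_stat` and `time_listing` are hypothetical helpers, not part of the notebook code base.

```python
import os
import tempfile
import time


def list_dir_with_stat(path):
    """Return (name, size, mtime) for every entry directly under path."""
    entries = []
    for name in os.listdir(path):
        # One lstat per entry: this is the part that scales with file count.
        info = os.lstat(os.path.join(path, name))
        entries.append((name, info.st_size, info.st_mtime))
    return entries


def time_listing(n_files):
    """Create n_files empty files in a temp dir and time a full listing."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(n_files):
            open(os.path.join(tmp, f"img_{i:06d}.jpg"), "w").close()
        start = time.perf_counter()
        entries = list_dir_with_stat(tmp)
        elapsed = time.perf_counter() - start
    return len(entries), elapsed


if __name__ == "__main__":
    for n in (1_000, 10_000):
        count, seconds = time_listing(n)
        print(f"{count} entries listed in {seconds:.3f}s")
```

Timings will vary a lot with the filesystem (local disk vs. network mount), so treat the numbers as relative rather than absolute.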
Thanks for the information. So it sounds like you see roughly the same behavior between classic and Lab with files (unlike what I saw with directories). I figured the delay was in the client-side rendering, but you're showing essentially server-side code, which implies thousands of directories should have resulted in the same behavior - i.e., a delay seen in both Notebook and Lab (contrary to what I found). ((That repo you link is interesting. The sample for the failure case is particularly entertaining.)) |
Hmm, I still see the same behaviors with files. I touched 10,000 files in my notebook directory ( With Notebook "classic", I see the contents api complete in just over 1 second, but the rendering (not sure if that's the appropriate use of the term here, not a front-end dev) takes on the order of 48 seconds as I attempt to scroll. This scrolling is also accompanied by "Page Unresponsive" dialogs (using Chrome).
Switching the url to Lab, I see the same contents api taking just over 1 second, but the scrolling appears to be fine, with gaps between contents calls taking on the order of 8 seconds. However, I see no delay in the UI, so I suspect this "retrieval & rendering work" is happening in the background.
Not sure why the contents api is being called during scrolling, given the contents service doesn't appear to have paging. This might just be how the front-end is written in order to deal with updates. I suspect there's a general assumption that notebook directories are sparsely populated - which is reasonable IMO. |
I suspect this is due to JupyterLab periodically refreshing the directory to detect changes, combined with the contents API not returning the paths under a directory incrementally (see the code I attached above). As for the test result, my environment magnifies the problem because the server is remote and has few resources allocated (2 CPUs, 4Gi memory). So I tested locally, and the result is similar to what @kevin-bates reported.
So I increased the number of files to 10x more than above by
I can confirm the contents api takes almost 10x longer to respond. I think this is where the problem lies.
|
@esevan Great analysis. I am experiencing the same issue (in particular, JupyterLab hangs intermittently). Here are a few more insights from my POV.
In my case the key problem is that JupyterLab seems to issue a new call to the
The server-side logic seems ok (except that it is blocking; I'm not sure the ContentsManager api supports async). However, it is not quite clear to me why JupyterLab requests the file contents when all it really does is refresh the directory listing. In particular, JupyterLab - like the previous Jupyter server file listing, i.e. /tree - issues a specific get request to fetch the actual contents once the file/notebook is opened. I see several possible approaches to improve the situation:
Not sure if option 1 interferes with the JupyterLab UI logic (it may use the actual contents to display icons or other information). From my perspective option 2 would be the best. We should avoid option 3 as it is bound to introduce consistency issues. Currently I don't have the capacity to dig further or open an issue in the JupyterLab tracker; any support would be appreciated. |
This is a duplicate of #3114, but there is no answer in that thread. Is there any progress on it, such as pagination?
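To make the pagination idea concrete, here is a rough sketch of what a paged directory listing could look like. This is not an existing contents API feature: `list_page` and its `offset`/`limit` parameters are hypothetical, and the sketch only illustrates that the per-entry `os.lstat` cost can be limited to one page even though the names still have to be read and sorted in full.

```python
import os
import stat


def list_page(path, offset=0, limit=100):
    """Return one page of a directory listing (hypothetical helper)."""
    # Reading and sorting names is cheap compared to stat-ing every entry.
    names = sorted(os.listdir(path))
    items = []
    for name in names[offset:offset + limit]:
        # Only entries on the requested page are stat-ed.
        info = os.lstat(os.path.join(path, name))
        items.append({
            "name": name,
            "size": info.st_size,
            "is_dir": stat.S_ISDIR(info.st_mode),
        })
    return {"total": len(names), "offset": offset, "items": items}
```

A client could request the first page with `list_page(path, 0, 100)` and fetch further pages as the user scrolls. Whether the front-end could work with partial listings is a separate question.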