Performance of creating new cells #521

Closed
trykyn opened this issue Feb 19, 2021 · 4 comments

Comments


trykyn commented Feb 19, 2021

When creating an Excel sheet with millions of cells and thousands of cells per row, I noticed in the profiler that the majority of the time is spent in a small function inside NPOI: XSSFRow.LastCellNum

image

Given that a SortedDictionary is used, it should be possible to find the last key in O(log(n)) time instead of O(n).
Unfortunately, the available implementation of SortedDictionary doesn't expose Min or Max, even though the underlying SortedSet does. It would be possible to either roll your own implementation that exposes Min/Max
or, more hacky but easy, access the underlying set with reflection.
In the constructor of XSSFRow:
this._set = typeof(SortedDictionary<int, ICell>).GetField("_set", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance).GetValue(this._cells) as SortedSet<KeyValuePair<int,ICell>>;

Then in GetFirstKey/GetLastKey, use _set.Min.Key and _set.Max.Key respectively.
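
A minimal, self-contained sketch of the reflection idea above, not NPOI's actual code. It uses string values in place of ICell so it compiles without NPOI, and the helper names (ReflectedKeys, GetBackingSet, GetFirstKey, GetLastKey) are hypothetical. It assumes the private field is called "_set" and is assignable to SortedSet, which is an implementation detail of the runtime and may differ between versions, so the lookup is null-checked. Min and Max return a default pair on an empty set, hence the Count guard.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical helper; names are illustrative and not part of NPOI.
public static class ReflectedKeys
{
    // Assumption: SortedDictionary keeps its entries in a private field
    // named "_set" whose type derives from SortedSet. Returns null if the
    // field is missing or has an unexpected type on this runtime.
    public static SortedSet<KeyValuePair<int, string>> GetBackingSet(
        SortedDictionary<int, string> cells)
    {
        FieldInfo field = typeof(SortedDictionary<int, string>)
            .GetField("_set", BindingFlags.NonPublic | BindingFlags.Instance);
        return field?.GetValue(cells) as SortedSet<KeyValuePair<int, string>>;
    }

    // Smallest key, or -1 when the set is empty.
    public static int GetFirstKey(SortedSet<KeyValuePair<int, string>> set)
        => set.Count > 0 ? set.Min.Key : -1;

    // Largest key, or -1 when the set is empty.
    public static int GetLastKey(SortedSet<KeyValuePair<int, string>> set)
        => set.Count > 0 ? set.Max.Key : -1;
}
```

Because the returned set is the dictionary's own storage rather than a copy, it only needs to be fetched once (for example in a constructor) and stays in sync with later insertions and removals.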

Author

trykyn commented Feb 19, 2021

For performance, I did some benchmarks: with 1000 cells the reflection solution is 10 times faster, and with 10000 cells it is more than 100 times faster.
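
A comparison along these lines could be sketched as follows. This is not the benchmark that produced the numbers above. It assumes the slow path finds the last key by enumerating every key, and it reuses the same "_set" reflection assumption. LastKeyBenchmark and its members are hypothetical names, and a Stopwatch loop like this gives only a rough indication.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Reflection;

// Hypothetical micro-benchmark; names are illustrative only.
public static class LastKeyBenchmark
{
    // O(n): walk every key in sorted order and keep the last one seen.
    public static int LastKeyByEnumeration(SortedDictionary<int, string> cells)
    {
        int last = -1;
        foreach (int key in cells.Keys) last = key;
        return last;
    }

    // Asks the backing tree for its maximum instead of enumerating.
    public static int LastKeyBySet(SortedSet<KeyValuePair<int, string>> set)
        => set.Count > 0 ? set.Max.Key : -1;

    // Assumption: the private field is named "_set"; returns null otherwise.
    public static SortedSet<KeyValuePair<int, string>> BackingSet(
        SortedDictionary<int, string> cells)
    {
        FieldInfo field = typeof(SortedDictionary<int, string>)
            .GetField("_set", BindingFlags.NonPublic | BindingFlags.Instance);
        return field?.GetValue(cells) as SortedSet<KeyValuePair<int, string>>;
    }

    public static void Run(int cellCount, int iterations)
    {
        var cells = new SortedDictionary<int, string>();
        for (int i = 0; i < cellCount; i++) cells[i] = "x";

        var set = BackingSet(cells);
        if (set == null)
        {
            Console.WriteLine("Backing field not found on this runtime; skipping.");
            return;
        }

        int sink = 0; // accumulate results so the loops are not optimized away
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) sink += LastKeyByEnumeration(cells);
        TimeSpan enumerated = sw.Elapsed;

        sw.Restart();
        for (int i = 0; i < iterations; i++) sink += LastKeyBySet(set);
        TimeSpan viaSet = sw.Elapsed;

        Console.WriteLine($"{cellCount} cells: enumeration {enumerated.TotalMilliseconds} ms, " +
                          $"set.Max {viaSet.TotalMilliseconds} ms (sink {sink})");
    }
}
```

Calling `LastKeyBenchmark.Run(1000, 10000)` and `LastKeyBenchmark.Run(10000, 10000)` would cover the two sizes mentioned above.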

Member

tonyqus commented Feb 22, 2021

Do you have a workable version on your box? We do accept PRs.

@delreluca

If the SortedDictionary<int,ICell> in XSSFRow is only used internally (no serialization and no exposure outside the class), rolling your own is quite easy: we can copy and rename the official one (it is MIT-licensed) and expose the missing properties of the set:

https://github.com/dotnet/runtime/blob/5ca4992150e66a7ac6d48c567f905c24a73e7919/src/libraries/System.Collections/src/System/Collections/Generic/SortedDictionary.cs#L21

Is this approach OK, also from a license perspective? If so, I might find some time to create a PR.
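
As a rough illustration of what "exposing the missing properties" could look like, here is a much smaller sketch than a full copy of the official class. It wraps a SortedSet of key/value pairs ordered by key only, which is the same general idea SortedDictionary is built on. MinMaxSortedDictionary and its members are hypothetical names, and it assumes a runtime where SortedSet has TryGetValue.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type; not part of NPOI or the .NET base class library.
public sealed class MinMaxSortedDictionary<TValue>
{
    // Orders entries by key only, so two pairs with the same key are "equal".
    private sealed class KeyComparer : IComparer<KeyValuePair<int, TValue>>
    {
        public int Compare(KeyValuePair<int, TValue> x, KeyValuePair<int, TValue> y)
            => x.Key.CompareTo(y.Key);
    }

    private readonly SortedSet<KeyValuePair<int, TValue>> _set =
        new SortedSet<KeyValuePair<int, TValue>>(new KeyComparer());

    public int Count => _set.Count;

    // Adds the entry, replacing any existing entry with the same key.
    public void Set(int key, TValue value)
    {
        var entry = new KeyValuePair<int, TValue>(key, value);
        _set.Remove(entry); // matches on key only because of KeyComparer
        _set.Add(entry);
    }

    public bool TryGetValue(int key, out TValue value)
    {
        var probe = new KeyValuePair<int, TValue>(key, default(TValue));
        if (_set.TryGetValue(probe, out var found))
        {
            value = found.Value;
            return true;
        }
        value = default(TValue);
        return false;
    }

    public bool Remove(int key)
        => _set.Remove(new KeyValuePair<int, TValue>(key, default(TValue)));

    // The properties SortedDictionary does not expose.
    public int MinKey => _set.Count > 0
        ? _set.Min.Key
        : throw new InvalidOperationException("The dictionary is empty.");

    public int MaxKey => _set.Count > 0
        ? _set.Max.Key
        : throw new InvalidOperationException("The dictionary is empty.");
}
```

A wrapper like this avoids reflection entirely, at the cost of maintaining a custom collection type.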

tonyqus closed this as completed Jun 26, 2021
tonyqus added this to the NPOI 2.5.5 milestone Jun 26, 2021
tonyqus reopened this Jun 26, 2021
Member

tonyqus commented Jun 26, 2021

Thank you for the tip. I will test if it has any side effect on existing logic.
