setkey makes joins in data.table much faster. The code in the join benchmark does not set keys properly, which can affect the main results.
Keys also matter for other kinds of data manipulation, such as deduplication. For instance:
setkey(DT, key)
unique(DT, by = 'key')
is much faster than
unique(DT, by = 'key')
on a table whose key has not been set.
This can reduce the run time from 15 minutes to seconds on 100GB+ datasets.
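A minimal sketch of how the keyed and unkeyed deduplication could be compared, assuming a toy table. The column names `id` and `value`, the table size, and the variable names are hypothetical and not taken from the benchmark. The timings depend on the data and machine, so none are quoted here.

```r
library(data.table)

set.seed(1)
n <- 1e5  # hypothetical size, far smaller than the datasets discussed above

# Toy table; `id` plays the role of the key column
DT <- data.table(id = sample(1e4, n, replace = TRUE), value = runif(n))

# Deduplicate without a key
t_unkeyed <- system.time(u_unkeyed <- unique(DT, by = "id"))

# Deduplicate after setting the key on a copy (setkey sorts by reference)
DTK <- copy(DT)
setkey(DTK, id)
t_keyed <- system.time(u_keyed <- unique(DTK, by = "id"))

# Both approaches should keep the same set of ids
stopifnot(identical(sort(u_unkeyed$id), u_keyed$id))

print(t_unkeyed)
print(t_keyed)
```

Note that the cost of `setkey` itself is outside the timed region here, so a fair benchmark would need to decide whether to count it.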
Joins work the same way:
setkey(DTA, key)
setkey(DTB, key)
DTA[DTB, on = .(key)]
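A rough sketch of the same comparison for a join, again under assumed toy data. The column names `id`, `a`, and `b` and the table sizes are hypothetical. It only checks that the keyed and unkeyed joins return the same rows; it does not claim any particular speedup.

```r
library(data.table)

set.seed(1)

# Toy tables; `id` is unique in both, so each row of DTB matches one row of DTA
DTA <- data.table(id = 1:1000, a = rnorm(1000))
DTB <- data.table(id = sample(1000, 500), b = rnorm(500))

# Join without keys: result rows follow the order of DTB
t_unkeyed <- system.time(j_unkeyed <- DTA[DTB, on = .(id)])

# Set keys on copies, then run the same join
DTAk <- copy(DTA)
DTBk <- copy(DTB)
setkey(DTAk, id)
setkey(DTBk, id)
t_keyed <- system.time(j_keyed <- DTAk[DTBk, on = .(id)])

# Same rows either way, once the unkeyed result is sorted by id
j_sorted <- j_unkeyed[order(id)]
stopifnot(identical(j_sorted$id, j_keyed$id))
stopifnot(identical(j_sorted$a, j_keyed$a))
stopifnot(identical(j_sorted$b, j_keyed$b))

print(t_unkeyed)
print(t_keyed)
```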
I hope this helps make the benchmark better!