Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Write support #13

Closed
jkrumbiegel opened this issue Dec 5, 2022 · 5 comments
Closed

Write support #13

jkrumbiegel opened this issue Dec 5, 2022 · 5 comments

Comments

@jkrumbiegel
Copy link
Contributor

jkrumbiegel commented Dec 5, 2022

I had opened a PR against ReadStat.jl a while ago to implement writing: queryverse/ReadStat.jl#87

However, it does not seem it's going to be merged as that package is not supposed to pull in Tables.jl. I could port the implementation to ReadStatTables.jl, which we are using anyway to read these files, so it makes sense from our perspective.

Would you be interested in such a PR? If yes, do you have any suggestions before I get started transferring the code, any particular interface you already had in mind?

@junyuan-chen
Copy link
Owner

Thanks for making the offer!

The write support is definitely something that will be added to this package at some point, tentatively in the v0.3.0 release. I haven't done so immediately as I plan to finish some remaining enhancement on the reading part before getting started on the writing part. But, to get things move faster on the writing part, it seems to make sense to have an experimental write support in the v0.2.x phase. I would probably get back to you in several days on this.

@jkrumbiegel
Copy link
Contributor Author

That sounds good, thank you. I'd be willing to work on this to get a first version working as soon as possible.

@junyuan-chen
Copy link
Owner

junyuan-chen commented Dec 17, 2022

@jkrumbiegel If you are still interested in taking the lead on the writing part, I will be happy to incorporate these changes into the package.

Regarding the design, I imagine that there could be two options: 1) directly iterating over any Tables.jl-compatible table specified by the user in a fashion similar to the PR to ReadStat.jl; 2) converting the table to ReadStatTable before writing the data. The first option avoids copying data but the row iteration can be slow depending on the type of that table. The second option may involve some data copying but row iteration with ReadStatTable is fast and it also allows collecting all metadata including value labels in one place.

Regarding the file organization, please put functions that merely wrap the ccall in src/wrappers.jl in an order that matches how they show up in readstat.h. Please add a new file named src/writer.jl for code that handles the writer. There could be an additional file named src/writestat for the interface exposed to end users, including docstrings. The function for end users is tentatively named writestat.

@jkrumbiegel
Copy link
Contributor Author

I will take a look next week, sounds good!

@junyuan-chen
Copy link
Owner

I am closing this issue as the first attempt on write support is arriving with the v0.2.3 release. Future development will target toward the v0.3 series due to planned breaking changes.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants