Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

[r] Use libtiledbsoma for array handles #3061

Open
johnkerl opened this issue Sep 24, 2024 · 3 comments
Open

[r] Use libtiledbsoma for array handles #3061

johnkerl opened this issue Sep 24, 2024 · 3 comments
Labels

Comments

@johnkerl
Copy link
Member

johnkerl commented Sep 24, 2024

Currently we're retaining a tiledb-r handle at open:

private$.tiledb_array <- tiledb::tiledb_array_open(self$object, type = mode)

And we use a temporary open-use-close at every single call to libtiledbsoma -- here is just one of many examples:

void set_metadata(std::string& uri, std::string& key, SEXP valuesxp, std::string& type,
bool is_array, Rcpp::XPtr<somactx_wrap_t> ctxxp,
Rcpp::Nullable<Rcpp::DatetimeVector> tsvec = R_NilValue) {
// shared pointer to SOMAContext from external pointer wrapper
std::shared_ptr<tdbs::SOMAContext> sctx = ctxxp->ctxptr;
// SOMA Object unique pointer (aka soup)
auto soup = getObjectUniquePointer(is_array, OpenMode::write, uri, sctx, tsvec);
if (type == "character") {
const tiledb_datatype_t value_type = TILEDB_STRING_UTF8;
std::string value = Rcpp::as<std::string>(valuesxp);
spdl::debug("[set_metadata] key {} value {} is_array {} type {}", key, value, is_array, type);
soup->set_metadata(key, value_type, value.length(), (void*) value.c_str(), true);
} else if (type == "integer64") {
const tiledb_datatype_t value_type = TILEDB_INT64;
double dv = Rcpp::as<double>(valuesxp);
int64_t value = Rcpp::fromInteger64(dv);
spdl::debug("[set_metadata] key {} value {} is_array {} type {}", key, value, is_array, type);
soup->set_metadata(key, value_type, 1, (void*) &value, true);
} else {
Rcpp::stop("Unsupported type '%s'", type);
}
soup->close();
}

On the one hand this might seem lower-pri: the redundant opens are a perf hit but they work.

But as discussed on #3060 we must do this in order to remove the tiledb-r dependency.

See also #3053 which @nguyenv is working on -- this is a case where we do currently require a second open for array reads.

From #3059:

Already done for groups

#2406; [sc-55685].

Steps:

  • Status quo is
    • There is a private$.tiledb_array with connection to self$object
      • This is set on construction/open, and closed on close
    • The libtiledbsoma handles are currently ephemeral
      • Every read / write / get_metadata etc which is libtiledbsoma-capable currently takes a uri and a ctx and does open/op/close
  • Needing to be done:

To check:

  • Currently the array set_metadata does not error out when the array is opened for read -- this is a bug.
    • This should be auto-fixed by this issue, but, it needs to be tested.
@johnkerl johnkerl changed the title [r] Use libtiledbsoma for array handles [r] Use libtiledbsoma for array handles Oct 10, 2024
@johnkerl
Copy link
Member Author

Blocked by #3051 since here
https://github.com/single-cell-data/TileDB-SOMA/blob/1.15.0rc2/apis/r/R/SOMADataFrame.R#L182
the self$object is used. Removing that private$.tiledb_array / self$object from the base class TileDBArray.R causes this query-condition logic to return empty results, in a way that makes unit-test cases fail in non-obvious ways.

We need to move the query-condition logic over from TileDB-R before we can continue here.

@viviannguyen
Copy link

I don't believe I'm the correct Vivian to be tagged here. You're probably looking for @nguyenv.

@johnkerl
Copy link
Member Author

Oh no @viviannguyen sorry about that!! Have a nice day though!!

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants