tmpFiles() does not recognize orphaned temporary files if the terra option tempdir is not the default tempdir() #1630

Closed
smckenzie1986 opened this issue Oct 25, 2024 · 1 comment

Comments

@smckenzie1986

Hi Robert,
I ran into a snag while doing parallel processing with large raster tiles and many geoprocessing steps. To avoid running out of memory, I write all intermediate steps as temporary files to a folder on an external hard drive. I have been trying to delete the temporary files between iterations of a foreach() loop, but I couldn't get tmpFiles() to return only the orphaned files after I had removed their corresponding SpatRasters. I dug into the source code and found that the internal function terra:::.orphanTmpFiles() only searches the default temporary directory returned by tempdir(). Below is a reproducible example showing this behavior, followed by my proposed fix. Could this be integrated into a future version of terra? Thanks as always for such a powerful and user-friendly package!
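
The mismatch described above can be illustrated without terra. The following is a minimal base R sketch, not terra's implementation: `find_orphans()`, the `custom_tmp` folder, and the two file names are hypothetical, and the file-name pattern is the one quoted in the proposed fix further down. It shows that searching only `tempdir()` misses files written to a different folder, while searching the configured folder finds the unreferenced one.

```r
# Hypothetical helper: list files in `dir` that match the temp-file pattern
# and are not among the files still referenced (`in_use`).
find_orphans <- function(dir, in_use = character(), pattern = "^spat_.*tif$") {
  candidates <- list.files(dir, pattern = pattern, full.names = TRUE)
  candidates[!(basename(candidates) %in% basename(in_use))]
}

# Simulate a custom temp folder that holds two "spat_" files.
custom_dir <- file.path(tempdir(), "custom_tmp")
dir.create(custom_dir, showWarnings = FALSE)
kept   <- file.path(custom_dir, "spat_kept.tif")    # still referenced
orphan <- file.path(custom_dir, "spat_orphan.tif")  # no longer referenced
invisible(file.create(kept, orphan))

# Searching only tempdir() (non-recursively) does not see either file.
missed <- find_orphans(tempdir(), in_use = kept)

# Searching the configured folder returns the one unreferenced file.
found <- find_orphans(custom_dir, in_use = kept)
basename(found)
```

In this sketch the only difference between the two calls is the directory that is searched, which mirrors the one-line change proposed below.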

##Example of behavior##
library(terra)

othertmpdir<-"C:/Example"

if(!dir.exists(othertmpdir)){
  dir.create(othertmpdir)
}

terraOptions(tempdir=othertmpdir, todisk=TRUE)

r<-rast(xmin= -122, xmax=-121, ymin=45, ymax=45.5, nrows=1000, ncols=1000, crs="epsg:4326")
values(r)<-rnorm(ncell(r), 500, 50) #one value per cell (1000 x 1000 cells)
r_6557<-project(r, "epsg:6557") *100

tmpFiles()

rm(r)

tmpFiles(current=FALSE, orphan=TRUE)


####Proposed fix####
orphan_files<-function () 
{
  objects <- ls(envir = globalenv())
  ftmp <- list()
  for (i in seq_along(objects)) {
    x <- get(objects[i], envir = globalenv())
    if (inherits(x, "SpatRaster")) {
      ftmp[[i]] <- sources(x)
    }
  }
  ftmp <- unique(unlist(ftmp))
  ftmp <- ftmp[ftmp != ""]
  pattrn <- "^spat_.*tif$"
  i <- grep(pattrn, basename(ftmp))
  ftmp <- ftmp[i]
  ff <- list.files(terraOptions()$tempdir, pattern = pattrn, full.names = TRUE) #This is the only line that I changed from your source code
  i <- !(basename(ff) %in% basename(ftmp))
  ff[i]
}

orphan_files()


new_tmpFiles<-function (current = TRUE, orphan = FALSE, old = FALSE, remove = FALSE) 
{
  if (!(old | current | orphan)) {
    #terra's internal error() and warn() are not available outside the package, so use stop() and warning()
    stop("tmpFiles: at least one of 'orphan', 'current' and 'old' must be set to TRUE")
  }
  opt <- terra:::spatOptions()
  d <- opt$tempdir
  f <- NULL
  if (old) {
    if (normalizePath(tempdir()) != normalizePath(d)) {
      warning("tmpFiles: old files can only be found if terra uses the R tempdir")
    }
    else {
      f <- list.files(dirname(d), recursive = TRUE, pattern = "^spat_", 
                      full.names = TRUE)
      f <- grep("Rtmp", f, value = TRUE)
      if ((length(f) > 0) && (!current)) {
        i <- grep(d, f)
        if (length(i) > 0) {
          f <- f[-i]
        }
      }
    }
  }
  if (current) {
    ff <- list.files(d, pattern = "^spat", full.names = TRUE)
    f <- c(f, ff)
  }
  else if (orphan) {
    fo <- orphan_files()
    f <- c(f, fo)
  }
  if (remove) {
    file.remove(f)
    return(invisible(f))
  }
  else {
    return(f)
  }
}

new_tmpFiles(current=FALSE, orphan=TRUE)
@rhijmans
Member

Thanks, that was indeed an oversight. I think this commit fixes it.

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Jan 5, 2025
# version 1.8-5

## bug fixes

- `spatSample(method='stratified', ext=e)` returned the wrong sampling
  coordinates [#1628](rspatial/terra#1628)
  by Barnabas Harris

- `spatSample(method='stratified')` could fail with small sample sizes
  [#1503](rspatial/terra#1503) by karluf

- transparency (alpha) did not work with RGB
  plotting. [#1642](rspatial/terra#1642) by
  Timothée Giraud

- rasterization failed on very large rasters
  [#1636](rspatial/terra#1636) by Mary
  Fisher, [#1463](rspatial/terra#1463) by
  Nic Spono and [#1281](rspatial/terra#1281)
  by Sebastian Dunnett

- `tmpFiles` only looked in the default temp files folder
  [#1630](rspatial/terra#1630) by
  smckenzie1986

- `where.min` did not work well if there were negative values
  [#1634](rspatial/terra#1634) by Michael
  Sumner

- `plet<SpatRaster>` now works for RGB rasters and rasters with a
  color table [#1596](rspatial/terra#1596)
  by Agustin Lobo

- `vect<MULTIPOINT WKT>` did not work properly
  [#1376](rspatial/terra#1376) by
  silasprincipe

- `compareGeom<SpatVector>` did not work
  [#1654](rspatial/terra#1654) by Jason
  Flower

- `buffer<SpatVector>` is now more accurate for lonlat
  polygons [#1616](rspatial/terra#1616) by
  Roberto Amaral-Santos

- `interpNear` used square windows, not circles, beyond 100
  points [#1509](rspatial/terra#1509) by
  Jean-Luc Dupouey

- `vect` read INT64 fields as integers, sometimes leading to
  overflows. [#1666](rspatial/terra#1666) by
  bengannon-fc

- `plot` showed a legend title even if none was requested when title
  parameters were specified.
  [#1664](rspatial/terra#1664) by Márcia
  Barbosa



## enhancements

- improved documentation of `writeVector` overwrite when using
  layers. [#1573](rspatial/terra#1573) by
  Todd West

- improved treatment of (supposedly) flipped rasters by Timothée
  Giraud [#1627](rspatial/terra#1627) and
  fchianucci [#1646](rspatial/terra#1646)

- added `map.pal("random")`
  [#1631](rspatial/terra#1631) by Agustin
  Lobo

- expressions can now be used in legend titles
  [#1626](rspatial/terra#1626) by Noah
  Goodkind

- `app` and `tapp` now emit a warning when factors are coerced to
  numeric [#1566](rspatial/terra#1566) by
  shuysman

- `plet<SpatRaster>` now has argument "stretch" for RGB rasters
  [#1596](rspatial/terra#1596) by Agustin Lobo

- `%%` and `%/%` now behave the same for SpatRaster as for (base R)
  numbers [#1661](rspatial/terra#1661) by
  Klaus Huebert
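
For reference, the base R behavior that the last entry refers to can be checked with plain numbers. This is a small sketch using numeric vectors only (no SpatRaster); the changelog states that SpatRaster now matches it. In base R, `%/%` rounds the quotient toward negative infinity, and `%%` returns a remainder with the sign of the divisor.

```r
# Four sign combinations of dividend (x) and divisor (y).
x <- c(-7, 7, -7, 7)
y <- c( 3, -3, -3, 3)

remainder <- x %% y    # modulo: result takes the sign of the divisor
quotient  <- x %/% y   # integer division: floor(x / y)

# The two operators are consistent with each other:
# x equals quotient * y + remainder for every pair.
consistent <- all(x == quotient * y + remainder)

print(data.frame(x, y, quotient, remainder))
```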

## new

- `patches` with option `values=TRUE` can now distinguish regions based
  on their cell values (instead of only NA vs not-NA)
  [#495](rspatial/terra#495) by Jakub
  Nowosad and [#1632](rspatial/terra#1632)
  by Agustin Lobo

- `rowSums`, `rowMeans`, `colSums` and `colMeans` for SpatRaster

- `metags` for SpatRasterDataset
  [#1624](rspatial/terra#1624) by Andrea
  Manica

- `metags` for layers (bands) of SpatRaster are now saved to and read
  from GTiff files
  [#1071](rspatial/terra#1071) by Mike
  Koontz

- `global` has new efficient functions "anyNA" and "anynotNA"
  [#1540](rspatial/terra#1540) by Kevin J
  Wolz

- `wrap`, `saveRDS` and `serialize` for
  SpatExtent. [#1430](rspatial/terra#1430)
  by BastienFR

- `vect<SpatGraticule>` method suggested in relation to [tidyterra
  #155](dieghernan/tidyterra#155) by Diego
  Hernangómez

- `toMemory<SpatRaster>` and `<SpatRasterDataset>` methods
  [#1660](rspatial/terra#1660) by Derek Friend
2 participants