Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Whishlist (or bugs?) for plot.cox.zph(*, plot = FALSE) #297

Open
ThomasSoeiro opened this issue Jan 27, 2025 · 3 comments
Open

Whishlist (or bugs?) for plot.cox.zph(*, plot = FALSE) #297

ThomasSoeiro opened this issue Jan 27, 2025 · 3 comments

Comments

@ThomasSoeiro
Copy link

ThomasSoeiro commented Jan 27, 2025

Regarding plot.cox.zph(*, plot = FALSE), I don't know if this qualifies as bugs or whishlist:

  1. Only data for the first term in the model/subscript is returned. Maybe it should return a list with an element for each term in the model/subscript?
library(survival)
fit <- coxph(Surv(time, status) ~ trt + factor(celltype) + karno + age, veteran)
zph <- cox.zph(fit)

x1 <- plot(zph, plot = FALSE)
x2 <- plot(zph[1], plot = FALSE)
identical(x1, x2)
# [1] TRUE

x1 <- plot(zph[3:4], plot = FALSE)
x2 <- plot(zph[3], plot = FALSE)
identical(x1, x2)
# [1] TRUE
  1. hr = TRUE is ignored. Maybe it should return exp(y) in this case?
x1 <- plot(zph, plot = FALSE)
x2 <- plot(zph, hr = TRUE, plot = FALSE)
identical(x1, x2)
# [1] TRUE
  1. For cox.zph(*, transform = "km") or "rank", x does not match what is plotted (it does for transform = "identity"). Maybe return x transformed so the plot is easely reproducible?
zph <- cox.zph(fit, transform = "km") # the default
plot(zph[1], resid = FALSE) # x-axis tick marks from 8.2 to 390

x <- plot(zph, plot = FALSE)
range(x$x)
# [1] 0.0000000 0.9909955 # <- not the same range
matplot(
  x$x,
  x$y,
  type = "l",
  lty = c(1, 2, 2),
  col = 1,
  xlab = "Time",
  ylab = "Beta(t) for trt"
)

Image

  1. Finally, maybe set names to the y matrix, e.g. "y", "upper", "lower"?

I can propose patch(es) if you wish.
Thanks!

@therneau
Copy link
Owner

These are all good suggestions. Please feel free to suggest patches: I have several other things ahead of this in my queue.

@ThomasSoeiro
Copy link
Author

Here is a minimal patch for 1, 2, and 4. To keep it minimal:

  • I didn't reindent the new else block.
  • I didn't change the logic for drawing the curve but we could exp() the data only once.

Do you prefer a proper PR with indented code?

diff --git a/R/plot.cox.zph.R b/R/plot.cox.zph.R
index 18b4cf9..9676dd3 100644
--- a/R/plot.cox.zph.R
+++ b/R/plot.cox.zph.R
@@ -53,6 +53,8 @@ plot.cox.zph <- function(x, resid=TRUE, se=TRUE, df=4, nsmo=40,
     lwd <- rep(lwd, length=2)
     lty <- rep(lty, length=2)
 
+    if (!plot) ans <- vector(mode = "list", length = nvar)
+
     # Now, finally do the work
     for (i in var) {
         #   Since release 3.1-6, yy can have missing values.  If a covariate is
@@ -86,9 +88,12 @@ plot.cox.zph <- function(x, resid=TRUE, se=TRUE, df=4, nsmo=40,
 	    }
 
 
-        if (!plot)  # return the curve
-            return(list(x=pred.x, y=cbind(yhat, yup, ylow)))
+        if (!plot) {  # return the curve
+            ans[[i]] <- list(x=pred.x, y=cbind(yhat, yup, ylow))
+            colnames(ans[[i]]$y) <- c("y", "upper", "lower")
+            if (hr) ans[[i]]$y <- exp(ans[[i]]$y)
 
+        } else {
         if (!hr) {
             if (x$transform=='identity')
                 plot(range(xx), yr, type='n', xlab=xlab, ylab=ylab[i], ...)
@@ -136,4 +141,6 @@ plot.cox.zph <- function(x, resid=TRUE, se=TRUE, df=4, nsmo=40,
 	    }
         }
     }
+    }
+    if (!plot) ans
 }

@ThomasSoeiro
Copy link
Author

ThomasSoeiro commented Feb 2, 2025

Here is a new proposal allowing to reproduce the plots with the data returned by plot.cox.zph(*, plot = FALSE). The object returned is list of two components:

  • data: a named list of data frames.
  • xaxis: a data frame with xaxisval and xaxislab (fixing issue 3 from the first post), or pred.x (not sure if this is useful; maybe renaming xaxis to xaxis.trans and return NA if x-axis is not transformed?).

This is a breaking change. I hope it is acceptable because I think it is more natural for post processing.

As in the previous version, to keep the patch small:

  • I didn't reindent the new else block.
  • I didn't change the logic for drawing the curve but we could exp() the data only once.

Also, I'm not sure I followed your coding style correctly (single quote vs double quotes, spacing, etc.).

--- plot.cox.zph.R.orig 2025-02-02 17:08:46.490301100 +0100
+++ plot.cox.zph.R      2025-02-02 17:29:33.707893600 +0100
@@ -53,6 +53,14 @@
     lwd <- rep(lwd, length=2)
     lty <- rep(lty, length=2)

+    if (!plot) {
+        ans <- list(data=vector(mode="list", length=nvar))
+        names(ans$data) <- dimnames(yy)[[2]]
+        if (x$transform != 'identity')
+            ans$xaxis <- data.frame(val=xaxisval, lab=xaxislab)
+        else ans$xaxis <- pred.x
+    }
+
     # Now, finally do the work
     for (i in var) {
         #   Since release 3.1-6, yy can have missing values.  If a covariate is
@@ -86,9 +94,15 @@
            }


-        if (!plot)  # return the curve
-            return(list(x=pred.x, y=cbind(yhat, yup, ylow)))
+        if (!plot) {  # return the curve
+            if (!hr)
+                ans$data[[i]] <- data.frame(x=pred.x, y=yhat,
+                                            upper=yup, lower=ylow)
+            else
+                ans$data[[i]] <- data.frame(x=pred.x, y=exp(yhat),
+                                            upper=exp(yup), lower=exp(ylow))

+        } else {
         if (!hr) {
             if (x$transform=='identity')
                 plot(range(xx), yr, type='n', xlab=xlab, ylab=ylab[i], ...)
@@ -137,3 +151,5 @@
         }
     }
 }
+    if (!plot) ans
+}

Reproducible example:

library(survival)

fit <- coxph(Surv(time, status) ~ trt + factor(celltype) + karno + age, veteran)
zph <- cox.zph(fit)

# before patch
plot(zph[1], resid = FALSE, hr = TRUE, main = "Before patch")

# after patch
p <- plot(zph, hr = TRUE, plot = FALSE)
matplot(
  p$data$trt$x,
  p$data$trt[-1],
  type = "l",
  lty = c(1, 2, 2),
  col = 1,
  xlab = "Time",
  ylab = "HR(t) for trt",
  log = "y",
  xaxt = "n",
  main = "After patch"
)
axis(1, p$xaxis$val, p$xaxis$lab)

Image

Interestingly, it also allows to directly use ggplot2 to produce the plots, without survminer. I think that it's important because survminer is popular but has important issues and some functions are basically copy-paste of survival (which is probably not a very good idea for maintenance) combined with ggplot2 plots. (I don't mean to be rude!)

library(ggplot2)

p$data <- Map(cbind, p$data, term = names(p$data))
p$data <- do.call(rbind, p$data)

ggplot(p$data, aes(x, y, ymin = lower, ymax = upper)) +
  geom_ribbon(alpha = 0.2) +
  geom_line() +
  scale_x_continuous(breaks = p$xaxis$val, labels = p$xaxis$lab) +
  scale_y_log10() +
  labs(x = "Time", y = "HR(t)") +
  facet_wrap(~ term, scales = "free_y")

Image

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants