Title: | Simplify Survival Data Analysis and Model Fitting |
---|---|
Description: | Inspect survival data, plot Kaplan-Meier curves, assess the proportional hazards assumption, fit parametric survival models, predict and plot survival and hazards, and export the outputs to 'Excel'. A simple interface for fitting survival models using flexsurv::flexsurvreg(), flexsurv::flexsurvspline(), flexsurvcure::flexsurvcure(), and survival::survreg(). |
Authors: | Niall Davison [aut, cre] |
Maintainer: | Niall Davison <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.0.1.9000 |
Built: | 2025-02-10 05:17:10 UTC |
Source: | https://github.com/maple-health-group/easysurv |
Background
The example simulated data set is based on large phase III clinical trials in breast cancer, such as the ALTTO trial (doi:10.1200/JCO.2015.62.1797). The example trial aims to determine whether a combination of two therapies, tablemab (T) plus vismab (V), improves outcomes for metastatic human epidermal growth factor 2-positive breast cancer and increases the pathologic complete response in the neoadjuvant setting (i.e. treatment given as a first step to shrink a tumor before the main treatment or surgery).
The trial has four treatment arms. Patients with centrally confirmed human epidermal growth factor 2-positive early breast cancer were randomly assigned to 1 year (52 weeks) of adjuvant therapy with V, T, their sequence (T to V), or their combination (T+V).
The primary end point was progression-free survival (PFS), defined by Cancer.gov as "the length of time during and after the treatment of a disease, such as cancer, that a patient lives with the disease but it does not get worse. In a clinical trial, measuring the progression-free survival is one way to see how well a new treatment works".
A number of baseline measurements (taken at randomization) are also included such as age, hormone receptor status and prior radiotherapy treatment.
Additional details on reasons for study discontinuation and censoring event description are also included.
The data set adopts an abridged version of the CDISC ADaM ADTTE time-to-event data model. For more information on CDISC ADaM data standards, see https://www.cdisc.org/standards/foundational/adam; for the ADTTE time-to-event data model specifically, see https://www.cdisc.org/standards/foundational/adam/adam-basic-data-structure-bds-time-event-tte-analyses-v1-0.
easy_adtte
The data set contains the following variables:
The study identifier. A code unique to the clinical trial
Subject identifier. Numeric ID unique to each patient
Unique subject identifier. Text ID combining study and patient IDs
Age at randomisation (years)
Hormone receptor status at randomisation
Hormone receptor positive (Numeric)
Hormone receptor positive (Long format)
Prior Radiotherapy at randomisation
Prior Radiotherapy at randomisation (Numeric)
Prior Radiotherapy at randomisation (Long format)
Planned treatment assigned at randomisation
Planned treatment assigned at randomisation (Numeric)
Analysis parameter: Progression free survival
Analysis parameter code
Analysis value: time to event (years)
Censoring (0 = Event, 1 = Censored)
Event description
Censoring description
Discontinuation from study reason
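Because the censoring variable is coded as 0 = event and 1 = censored, it usually has to be inverted before it is passed to functions that expect an event indicator, such as survival::Surv(). The following is a minimal sketch, assuming the ADaM-convention column names AVAL and CNSR (the variable names are not shown in the list above) and using made-up values rather than rows from easy_adtte:

```r
library(survival)

# Hypothetical values for illustration only; not taken from easy_adtte.
# AVAL and CNSR are assumed column names following the ADaM ADTTE convention.
toy <- data.frame(
  AVAL = c(0.5, 1.2, 2.0, 3.1),  # time to event (years)
  CNSR = c(0, 1, 0, 1)           # 0 = event, 1 = censored
)

# Invert the censoring flag to obtain an event indicator (1 = event).
toy$EVENT <- 1 - toy$CNSR

# Surv() expects the event indicator, not the censoring flag.
surv_obj <- Surv(time = toy$AVAL, event = toy$EVENT)
print(surv_obj)
```

Censored observations are printed by Surv() with a trailing "+".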
This is a copy of the bc data set exported by the flexsurv package. This data set, however, has column labels assigned.
easy_bc
The data set contains the following variables:
0 = Censored, 1 = Dead
Time of censoring or death in days
Prognostic group: Good, Medium, or Poor
Time of censoring or death in years
This is a copy of the lung data set exported by the survival package. This data set, however, has column labels assigned and time in months.
easy_lung
The data set contains the following variables:
Institution code
Survival time, months
Censoring status, 1 = censored, 2 = dead
Age
Sex, 1 = Male, 2 = Female
ECOG Performance Status (Physician)
Karnofsky performance score (Physician)
Karnofsky performance score (Patient)
Calories consumed
Weight loss, lbs
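The censoring status here is coded 1 = censored, 2 = dead, which survival::Surv() accepts directly, but an explicit 0/1 event indicator can be easier to read. The sketch below uses survival::lung, which has the same status coding, so that it runs without easysurv installed; the column name event is illustrative only:

```r
library(survival)

# survival::lung codes status as 1 = censored, 2 = dead.
lung_df <- lung

# Recode to a 0/1 event indicator (1 = dead).
lung_df$event <- as.integer(lung_df$status == 2)

# Cross-tabulate the original and recoded values as a sanity check.
table(status = lung_df$status, event = lung_df$event)
```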
Fits survival models to the provided data using the specified engine and returns various outputs including model parameters, goodness of fit, and estimates of median survival.
fit_models( data, time, event, predict_by = NULL, covariates = NULL, dists = c("exp", "gamma", "gengamma", "gompertz", "llogis", "lnorm", "weibull"), engine = "flexsurv", k = c(1, 2, 3), scale = "hazard", add_time_0 = TRUE, ... )
data |
A data frame containing the survival data. |
time |
The name of the column in |
event |
The name of the column in |
predict_by |
(Optional) The name of the column in |
covariates |
(Optional) A character vector specifying the names of covariates to be included in the model. |
dists |
(Optional) A character vector specifying the distribution(s) to be fitted. When the engine parameter is set to "flexsurv", options are "exp", "exponential", "gamma", "genf", "genf.orig", "gengamma", "gengamma.orig", "gompertz", "llogis", "lnorm", "lognormal", "weibull", "weibullPH". When the engine parameter is set to "flexsurvcure", options are "exp", "gamma", "gengamma", "gompertz", "llogis", "lnorm", "weibull". When the engine parameter is set to "flexsurvspline", dists are ignored in favor of the k and scale parameters. When the engine parameter is set to "survival", options are "exponential", "extreme", "gaussian", "loggaussian" (same as lognormal), "logistic", "lognormal", "rayleigh", "weibull". Default is c("exp", "gamma", "gengamma", "gompertz", "llogis", "lnorm", "weibull"). |
engine |
(Optional) The survival analysis engine to be used. Options are "flexsurv", "flexsurvcure", "flexsurvspline", and "survival". Default is "flexsurv". |
k |
(Optional) A numeric vector specifying the number of knots for spline-based models. Default is c(1, 2, 3). |
scale |
(Optional) A character vector specifying the scale parameter(s) for spline-based models. Options are "hazard", "odds", and "normal". Default is "hazard". |
add_time_0 |
Optional. Uses |
... |
Additional arguments, accepted so that unexpected arguments do not cause errors. |
A list containing information about the fit_models() call, the distributions attempted, goodness of fit, fit averages, and cure fractions (if applicable).
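The idea of fitting several distributions and comparing goodness of fit can be illustrated outside easysurv. This sketch calls survival::survreg() directly (one of the engines listed above) and is not a description of what fit_models() does internally; the three distributions, the use of AIC as the criterion, and the object names are assumptions made for illustration:

```r
library(survival)

# Candidate distributions, named as survival::survreg() expects them.
dists <- c("exponential", "weibull", "lognormal")

# Fit one intercept-only parametric model per distribution to survival::lung.
fits <- lapply(dists, function(d) {
  survreg(Surv(time, status) ~ 1, data = lung, dist = d)
})
names(fits) <- dists

# Collect AIC values; lower values indicate a better trade-off
# between fit and model complexity.
gof <- data.frame(
  dist = dists,
  aic = vapply(fits, AIC, numeric(1)),
  row.names = NULL
)
gof[order(gof$aic), ]
```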
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group",
  covariates = "group"
)
models
Calculates Kaplan-Meier estimates for survival data and returns summary statistics, plots, and additional outputs.
get_km( data, time, event, group = NULL, group_labels = NULL, just_km = FALSE, ... )
data |
A data frame containing the survival data. |
time |
The name of the column in |
event |
The name of the column in |
group |
(Optional) The name of the column in |
group_labels |
Optional character vector containing the names of the strata (default is NULL). Provide in a consistent order with |
just_km |
Logical. If |
... |
(Optional) Parameters to pass to ggsurvfit. |
A list containing Kaplan-Meier estimates, summary statistics, and plots.
km_results <- get_km(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  group = "group",
  risktable_symbols = FALSE
)
km_results
This function extracts Schoenfeld residuals from a fitted cox.zph object and formats them into a tidy data frame.
get_schoenfeld(fit_zph)
fit_zph |
An object of class |
A tibble with the Schoenfeld residuals in long format, containing the columns:
time |
The time variable from the Cox model. |
transform |
The transformation applied to the time variable. |
variable |
The variable names from the Cox model for which residuals are calculated. |
residual |
The Schoenfeld residuals for each variable at each time point. |
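To make the long format concrete, the sketch below reshapes the residual matrix stored in a cox.zph object using base R only. It is an illustration of the output shape described above, not get_schoenfeld()'s actual implementation, and the object names are hypothetical:

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
zph <- cox.zph(fit)

# zph$y is a matrix of residuals: one row per event time,
# one column per model term.
resid_mat <- zph$y

# Stack the columns into long format: one row per (time, variable) pair.
long <- data.frame(
  time = rep(zph$time, times = ncol(resid_mat)),
  transform = zph$transform,
  variable = rep(colnames(resid_mat), each = nrow(resid_mat)),
  residual = as.vector(resid_mat)
)
head(long)
```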
library(survival)
test_fit <- survival::coxph(survival::Surv(time, status) ~ sex, data = lung)
test_fit_zph <- survival::cox.zph(test_fit)
get_schoenfeld(test_fit_zph)
Quickly inspect the survival data to ensure it is in the correct format.
inspect_surv_data(data, time, event, group = NULL)
data |
A data frame containing the survival data. |
time |
The column name in |
event |
The column name in |
group |
Optional. The column name in |
A list containing tibbles that summarise the first few rows of the survival data, the sample sizes, the events, and median survival.
inspect_surv_data( data = easysurv::easy_bc, time = "recyrs", event = "censrec", group = "group" )
Generates a Cumulative Log Log survival curve plot using ggsurvfit::ggsurvfit() with customizable options.
plot_cloglog( fit, median_line = FALSE, legend_position = "top", plot_theme = theme_easysurv() )
fit |
A survival::survfit object representing the survival data. |
median_line |
Logical value indicating whether to include a line representing the median survival time. Default is FALSE. |
legend_position |
Position of the legend in the plot. Default is "top". |
plot_theme |
ggplot2 theme for the plot. Default is theme_easysurv(). |
A ggplot object representing the cumulative log log plot.
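A cumulative log-log plot conventionally shows log(-log(S(t))) against log(t) for each stratum. As a rough sketch of that transformation using only the survival package (this is not plot_cloglog()'s implementation, and the data frame and column names are illustrative):

```r
library(survival)

# Kaplan-Meier fit by sex using survival::lung.
km <- survfit(Surv(time, status) ~ sex, data = lung)
s <- summary(km)

# Keep points where 0 < S(t) < 1 so that log(-log(S)) is finite.
keep <- s$surv > 0 & s$surv < 1
cloglog <- data.frame(
  strata = s$strata[keep],
  log_time = log(s$time[keep]),
  cloglog = log(-log(s$surv[keep]))
)
head(cloglog)
```

Plotting cloglog against log_time by strata gives one curve per group; roughly parallel curves are consistent with proportional hazards.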
library(ggsurvfit)
fit <- survfit2(Surv(time, status) ~ surg, data = df_colon)
plot_cloglog(fit)
Generates a Kaplan-Meier survival curve plot using ggsurvfit::ggsurvfit() with customizable options.
This function provides sensible defaults while allowing for customization.
plot_km( fit, risktable = TRUE, risktable_symbols = TRUE, median_line = TRUE, legend_position = "top", plot_theme = theme_easysurv(), risktable_theme = theme_risktable_easysurv() )
fit |
A survival::survfit object representing the survival data. |
risktable |
Logical value indicating whether to include a risk table below the plot. Default is TRUE. |
risktable_symbols |
Logical value indicating whether to include symbols instead of text to label risk table strata. Default is TRUE. |
median_line |
Logical value indicating whether to include a line representing the median survival time. Default is TRUE. |
legend_position |
Position of the legend in the plot. Default is "top". |
plot_theme |
ggplot2 theme for the plot. Default is theme_easysurv(). |
risktable_theme |
ggplot2 theme for the risk table. Default is theme_risktable_easysurv(). |
A ggplot object representing the Kaplan-Meier survival curve plot.
library(ggsurvfit)
fit <- survfit2(Surv(time, status) ~ surg, data = df_colon)
plot_km(fit, risktable_symbols = FALSE)
Plot the residuals generated by the get_schoenfeld function.
This function creates a visual representation of Schoenfeld residuals from a Cox proportional hazards model. It allows for customization of the plot, including the addition of horizontal and smoothed lines, and styling of points and plot elements.
plot_schoenfeld( residuals, hline = TRUE, sline = TRUE, sline_se = TRUE, hline_col = "#F8766D", hline_size = 1, hline_alpha = 1, hline_yintercept = 0, hline_lty = "dashed", sline_col = "#00BFC4", sline_size = 1, sline_alpha = 0.2, sline_lty = "dashed", point_col = "black", point_size = 1, point_shape = 19, point_alpha = 1, plot_theme = ggplot2::theme_bw() )
residuals |
A data frame containing the Schoenfeld residuals, typically with columns |
hline |
Logical. If |
sline |
Logical. If |
sline_se |
Logical. If |
hline_col |
Color of the horizontal line. Default is "#F8766D". |
hline_size |
Line width of the horizontal line. Default is 1. |
hline_alpha |
Transparency of the horizontal line. Default is 1. |
hline_yintercept |
Y-intercept for the horizontal line. Default is 0. |
hline_lty |
Line type for the horizontal line. Default is "dashed". |
sline_col |
Color of the smooth line. Default is "#00BFC4". |
sline_size |
Line width of the smooth line. Default is 1. |
sline_alpha |
Transparency of the smooth line. Default is 0.2. |
sline_lty |
Line type for the smooth line. Default is "dashed". |
point_col |
Color of the points representing residuals. Default is "black". |
point_size |
Size of the points representing residuals. Default is 1. |
point_shape |
Shape of the points representing residuals. Default is 19. |
point_alpha |
Transparency of the points representing residuals. Default is 1. |
plot_theme |
A ggplot2 theme for the plot. Default is ggplot2::theme_bw(). |
A ggplot object representing the plot of Schoenfeld residuals.
library(survival)
test_fit <- survival::coxph(survival::Surv(time, status) ~ sex, data = lung)
test_fit_zph <- survival::cox.zph(test_fit)
plot_schoenfeld(get_schoenfeld(test_fit_zph))
Plot method for fit_models
## S3 method for class 'fit_models' plot( x, eval_time = NULL, km_include = TRUE, subtitle_include = TRUE, add_plotly = FALSE, ... )
x |
An object of class |
eval_time |
Time points at which to evaluate the survival function. Default is NULL. |
km_include |
Logical value indicating whether to include Kaplan-Meier survival data. Default is TRUE. |
subtitle_include |
Logical value indicating whether to include a subtitle in the plot. Default is TRUE. |
add_plotly |
Logical value indicating whether to add plotly interactivity. Default is FALSE. |
... |
Additional arguments |
A list containing predictions and plots for the survival and hazards of models in a fit_models object.
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group",
  covariates = "group"
)
plot(models)
This function generates survival and hazard predictions and plots for each model in a fit_models object. Optionally, interactive plotly outputs can be added for each plot.
predict_and_plot( fit_models, eval_time = NULL, km_include = TRUE, subtitle_include = TRUE, add_plotly = FALSE )
fit_models |
An object returned from fit_models. |
eval_time |
(Optional) A vector of evaluation time points for generating predictions. Default is NULL. |
km_include |
A logical indicating whether to include Kaplan-Meier estimates in the plot outputs. Default is TRUE. |
subtitle_include |
A logical indicating whether to include the subtitle. Default is TRUE. |
add_plotly |
A logical indicating whether to add interactive plotly outputs for each plot. Default is FALSE. |
A list of predictions and plots for each model in the fit_models object.
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group"
)
predict_and_plot(models)
Predict method for fit_models
## S3 method for class 'fit_models' predict(object, eval_time = NULL, type = c("survival", "hazard"), ...)
object |
An object of class |
eval_time |
(Optional) A vector of evaluation time points for generating predictions. Default is NULL. |
type |
A character vector indicating the type of predictions to generate. Default is c("survival", "hazard"). |
... |
Additional arguments |
A list of predictions for each model in the fit_models object.
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group",
  covariates = "group"
)
predict(models)
Print methods for fit_models()
## S3 method for class 'fit_models' print(x, ...)
x |
An object of class |
... |
Additional arguments |
A print summary of the fit_models object.
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group",
  covariates = "group"
)
models
Print methods for get_km()
## S3 method for class 'get_km' print(x, ...)
x |
An object of class |
... |
Additional arguments |
The summary of the Kaplan-Meier estimates, printed via the console.
km_results <- get_km(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  group = "group",
  risktable_symbols = FALSE
)
print(km_results)
Print methods for inspect_surv_data()
## S3 method for class 'inspect_surv_data' print(x, ...)
x |
An object of class |
... |
Additional arguments |
A print summary of the inspect_surv_data object.
Print methods for predict_and_plot()
## S3 method for class 'predict_and_plot' print(x, ...)
x |
An object of class |
... |
Additional arguments |
A print summary of the predict_and_plot object.
models <- fit_models(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  predict_by = "group"
)
predict_and_plot(models)
Print methods for test_ph()
## S3 method for class 'test_ph' print(x, ...)
x |
An object of class |
... |
Additional arguments |
A print summary of the test_ph object.
ph_results <- test_ph(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  group = "group"
)
ph_results
This function launches an example script for starting survival analysis using the easysurv package. The script uses a modified version of the lung data set exported from the survival package. The code is inspired by usethis::use_template() but modified to work outside the context of an .RProj or package.
quick_start(output_file_name = NULL)
output_file_name |
Optional. A file name to use for the script. Defaults to "easysurv_start.R" within a helper function. |
A new R script file with example code.
quick_start()
This function launches an example script for starting survival analysis using the easysurv package. The script uses a modified version of the bc data set exported from the flexsurv package. The code is inspired by usethis::use_template() but modified to work outside the context of an .RProj or package.
quick_start2(output_file_name = NULL)
output_file_name |
Optional. A file name to use for the script. Defaults to "easysurv_start.R" within a helper function. |
A new R script file with example code.
quick_start2()
This function launches an example script for starting survival analysis using the easysurv package. The script uses simulated phase III breast cancer trial data available from the ggsurvfit package. The code is inspired by usethis::use_template() but modified to work outside the context of an .RProj or package.
quick_start3(output_file_name = NULL)
output_file_name |
Optional. A file name to use for the script. Defaults to "easysurv_start.R" within a helper function. |
A new R script file with example code.
quick_start3()
Assesses the proportional hazards assumption for survival data using a Cox proportional hazards model and related tests.
test_ph(data, time, event, group, plot_theme = theme_easysurv())
data |
A data frame containing the survival data. |
time |
The name of the column in |
event |
The name of the column in |
group |
The name of the column in |
plot_theme |
The theme to be used for the plots. |
A list containing plots and test results related to the assessment of the proportional hazards assumption.
cloglog_plot |
A plot of the log cumulative hazard function. If the lines are roughly parallel, this suggests that the proportional hazards assumption holds. |
coxph_model |
The coefficients from the Cox proportional hazards model. The exp(coef) column shows the hazard ratio. |
survdiff |
The results of the log-rank test for differences in survival curves between groups. A p-value less than 0.05 suggests that survival differences between groups are statistically significant. |
coxph_test |
The results of the proportional hazards assumption test. A p-value less than 0.05 suggests that the proportional hazards assumption may be violated. |
schoenfeld_plot |
A plot of the Schoenfeld residuals. A flat smoothed line close to zero supports the proportional hazards assumption. A non-flat smoothed line with a trend suggests the proportional hazards assumption is violated. |
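For readers who want to see where such outputs could come from, the sketch below runs the corresponding survival package calls directly on survival::lung. That test_ph() wraps these or similar calls is an assumption; the source only says it uses a Cox proportional hazards model and related tests, and the object names here are illustrative:

```r
library(survival)

# Cox proportional hazards model; the exp(coef) column is the hazard ratio.
cox_fit <- coxph(Surv(time, status) ~ sex, data = lung)
summary(cox_fit)$coefficients

# Log-rank test for a difference in survival between groups.
survdiff(Surv(time, status) ~ sex, data = lung)

# Test of the proportional hazards assumption based on Schoenfeld residuals.
# A small p-value suggests the assumption may be violated.
ph_test <- cox.zph(cox_fit)
ph_test$table
```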
ph_results <- test_ph(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  group = "group"
)
ph_results
Plot Theme for easysurv Survival and Hazard Plots
theme_easysurv()
A ggplot2 theme object.
library(ggsurvfit)
fit <- survfit2(Surv(time, status) ~ surg, data = df_colon)
fit |> ggsurvfit() + theme_easysurv()
To be used with ggsurvfit::add_risktable().
theme_risktable_easysurv()
A list containing a ggplot2 theme object.
library(ggsurvfit)
fit <- survfit2(Surv(time, status) ~ surg, data = df_colon)
fit <- fit |>
  ggsurvfit() +
  theme_easysurv() +
  add_risktable(theme = theme_risktable_easysurv())
fit
Export easysurv output to Excel via openxlsx
write_to_xl(wb, object)
wb |
A Workbook object containing a worksheet |
object |
The output of an easysurv command |
An Excel workbook with the easysurv output.
km_results <- get_km(
  data = easysurv::easy_bc,
  time = "recyrs",
  event = "censrec",
  group = "group",
  risktable_symbols = FALSE
)
wb <- openxlsx::createWorkbook()
## Not run:
write_to_xl(wb, km_results)
openxlsx::saveWorkbook(wb, "km_results.xlsx", overwrite = TRUE)
openxlsx::openXL("km_results.xlsx")
## End(Not run)