---
title: "Drop Labels from a Table"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Drop Labels from a Table}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(dplyr)
```

```{r setup}
library(tidyREDCap)
```

# The Problem

The `tidyREDCap` package creates data sets with labelled columns.

```{r eval=FALSE}
tidyREDCap::import_instruments(
  url = "https://bbmc.ouhsc.edu/redcap/api/",
  token = keyring::key_get("REDCapR_test")
)
```

```{r hidden-data-load, echo=FALSE}
demographics <- structure(
  list(
    record_id = c(1, 2, 3, 4, 5),
    name_first = structure(
      c("Nutmeg", "Tumtum", "Marcus", "Trudy", "John Lee"),
      label = "First Name",
      class = c("labelled", "character")
    ),
    name_last = structure(
      c("Nutmouse", "Nutmouse", "Wood", "DAG", "Walker"),
      label = "Last Name",
      class = c("labelled", "character")
    ),
    address = structure(
      c(
        "14 Rose Cottage St.\nKenning UK, 323232",
        "14 Rose Cottage Blvd.\nKenning UK 34243",
        "243 Hill St.\nGuthrie OK 73402",
        "342 Elm\nDuncanville TX, 75116",
        "Hotel Suite\nNew Orleans LA, 70115"
      ),
      label = "Street, City, State, ZIP",
      class = c("labelled", "character")
    ),
    telephone = structure(
      c(
        "(405) 321-1111",
        "(405) 321-2222",
        "(405) 321-3333",
        "(405) 321-4444",
        "(405) 321-5555"
      ),
      label = "Phone number",
      class = c("labelled", "character")
    ),
    email = structure(
      c(
        "nutty@mouse.com",
        "tummy@mouse.comm",
        "mw@mwood.net",
        "peroxide@blonde.com",
        "left@hippocket.com"
      ),
      label = "E-mail",
      class = c("labelled", "character")
    ),
    dob = structure(
      c(12294, 12121, -13051, -6269, -5375),
      class = c("labelled", "Date"),
      label = "Date of birth"
    ),
    age = structure(
      c(11, 11, 80, 61, 59),
      label = "Age (years)",
      class = c("labelled", "numeric")
    ),
    days = structure(
      c(1, 2, 3, 4, 5),
      label = "Days",
      class = c("labelled", "numeric")
    ),
    sex = structure(
      c("Female", "Male", "Male", "Female", "Male"),
      label = "Gender",
      class = c("labelled", "character")
    ),
    demographics_complete = structure(
      c("Complete", "Complete", "Complete", "Complete", "Complete"),
      label = "Complete?...10",
      class = c("labelled", "character")
    )
  ),
  row.names = c(NA, -5L),
  class = "data.frame"
)
```

To see the labels on the `demographics` data set, you can use the `View()` function in RStudio, as shown below.

```{r eval=FALSE}
View(demographics)
```

![](./view_demog_w_labels_20230217.png){width=90% alt="Demographics preview with labels"}

However, some functions do not work well with labelled variables. For example:

![](./show_numbers.png){width=40% alt="Show two numeric variables with labels"}

```{r show-error, error=TRUE}
demographics |>
  rowwise() |>
  mutate(x = sum(c_across(c(age, days))))
```

So you need a way to drop the label from a single variable or to drop the labels from every variable in a data set.

# The Solution

## Drop a single label

You can drop the label from a single variable with the `drop_label()` function. For example:

```r
new_demographics_table <- drop_label(demographics, "name_first")

# Or
new_demographics_table <- drop_label(demographics, name_first)

# Or
new_demographics_table <- demographics |>
  drop_label("name_first")
```

## Drop multiple labels

If you need to drop labels from multiple variables, you can drop them one at a time or use a helper such as `across()`.

```r
demographics |>
  mutate(age = drop_label(age)) |>
  mutate(days = drop_label(days)) |>
  rowwise() |>
  mutate(x = sum(c_across(c(age, days))))

# Or
demographics |>
  mutate(across(c(age, days), drop_label)) |>
  rowwise() |>
  mutate(x = sum(c_across(c(age, days))))
```

![](./show_numbers_2.png){width=40% alt="Show three numeric variables without labels"}

You can select several variables at once with `tidyselect` helpers (e.g., `contains()` or `starts_with()`), or you can list the variables by name.
The following two calls produce the same result:

```r
demographics_changed_2 <- drop_label(demographics, contains("name"))

# Same as:
demographics_changed_3 <- drop_label(demographics, name_first, name_last)

# Verifying:
identical(demographics_changed_2, demographics_changed_3)
```

> NOTE: Outside of `tidyselect` helpers (e.g., `contains()`), you do not normally need to put variable names in quotation marks, though the function still works if you do.

## Use inside a `mutate`

You can now use `drop_label()` inside a `mutate()` call, like this:

```r
demographics_from_mutate <- demographics |>
  mutate(name_first = drop_label(name_first))

# Or
demographics_from_mutate <- demographics |>
  mutate(across(starts_with('name'), drop_label))
```

## Drop all dataset variable labels

To drop the labels from every variable, call `drop_label()` with only the data set. For example:

```{r drop-label-dataset}
demographics_without_labels <- drop_label(demographics)
```

> NOTE: tidyREDCap versions prior to 1.2.0 dropped all variable labels from a data set with `drop_labels()`. That function still works, but it now shows a polite message asking you to use `drop_label()` instead.

```{r, warning=TRUE}
demographics_without_labels <- drop_labels(demographics)
```
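
To confirm which columns still carry a label after any of the approaches above, one option is to inspect the `label` attribute directly. The sketch below uses only base R and assumes labels are stored the way they are in this vignette's example data (a `label` attribute plus a `labelled` class). The helper `has_label()` and the `toy` data frame are hypothetical names used for illustration; they are not part of `tidyREDCap`.

```r
# Hypothetical helper (not part of tidyREDCap): TRUE if a column has a
# "label" attribute. exact = TRUE avoids partial attribute-name matching.
has_label <- function(x) !is.null(attr(x, "label", exact = TRUE))

# Small stand-in data set built the same way as the example data above:
# one plain column and one labelled column.
toy <- structure(
  list(
    record_id = c(1, 2, 3),
    age = structure(
      c(11, 80, 61),
      label = "Age (years)",
      class = c("labelled", "numeric")
    )
  ),
  row.names = c(NA, -3L),
  class = "data.frame"
)

# Names of the columns that still have a label
labelled_cols <- names(toy)[vapply(toy, has_label, logical(1))]
labelled_cols
#> [1] "age"
```

Running the same `vapply()` call on `demographics_without_labels` should report no labelled columns if the labels were dropped as expected.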