Title: | A Systematic Data Wrangling Idiom |
---|---|
Description: | Supports systematic scrutiny, modification, and integration of data. The function status() counts rows that have missing values in grouping columns (returned by na()), have non-unique combinations of grouping columns (returned by dup()), and that are not locally sorted (returned by unsorted()). Functions enumerate() and itemize() give sorted unique combinations of columns, with or without occurrence counts, respectively. Function ignore() drops columns in x that are present in y, and informative() drops columns in x that are entirely NA; constant() returns values that are constant, given a key. Data that have defined unique combinations of grouping values behave more predictably during merge operations. |
Authors: | Tim Bergsma |
Maintainer: | Tim Bergsma <[email protected]> |
License: | GPL-3 |
Version: | 0.6.4 |
Built: | 2025-02-22 05:09:59 UTC |
Source: | https://github.com/bergsmat/wrangle |
Returns columns of a data.frame whose values do not vary within subsets defined by columns named in .... Defaults to groups(x) if none supplied, or all columns otherwise.
## S3 method for class 'data.frame'
constant(x, ...)
x | object |
... | optional grouping columns (named arguments are ignored) |
data.frame (should be same class as x)
Other constant:
constant()
library(dplyr)
constant(Theoph) # data frame with 0 columns and 1 row
constant(Theoph, Subject) # Subject Wt Dose Study
Theoph$Study <- 1
constant(Theoph) # Study
constant(Theoph, Study) # Study
constant(Theoph, Study, Subject) # Subject Wt Dose Study
Theoph <- group_by(Theoph, Subject)
constant(Theoph) # Subject Wt Dose Study
constant(Theoph, Study) # Study
foo <- data.frame(x = 1)
foo <- group_by(foo, x)
class(foo) <- c('foo', class(foo))
stopifnot(identical(class(foo), class(constant(foo))))
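The notion of a column being constant within a key can be sketched in base R, independently of the package. The helper name constant_cols and the small data frame below are hypothetical and are not part of wrangle; this illustrates the test only, not the package's implementation.

```r
# Hypothetical helper (not part of wrangle): names of columns whose values
# do not vary within the groups defined by the columns named in `keys`.
constant_cols <- function(x, keys) {
  # Build a single grouping factor from all key columns.
  g <- interaction(x[keys], drop = TRUE)
  is_const <- vapply(
    x,
    function(col) all(tapply(col, g, function(v) length(unique(v)) == 1)),
    logical(1)
  )
  names(x)[is_const]
}

# Illustrative data: wt is fixed per id, conc varies within id.
d <- data.frame(
  id   = c(1, 1, 2, 2),
  wt   = c(70, 70, 80, 80),
  conc = c(0.5, 1.2, 0.4, 0.9)
)
const_names <- constant_cols(d, "id")
print(const_names)
```

The key column itself is trivially constant within its own groups, so it appears in the result alongside any other invariant columns.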
Shows records with duplicate or duplicated values of grouping variables.
## S3 method for class 'data.frame'
dup(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
data.frame
Other dup:
dup()
library(dplyr)
dupGroups(mtcars)
dupGroups(group_by(mtcars, mpg))
dup(group_by(mtcars, mpg))
Indexes records with duplicate or duplicated values of grouping variables. If b follows a and is the same, then b is a duplicate, a is duplicated, and both are shown.
## S3 method for class 'data.frame'
dupGroups(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
grouped_df
logical
Other dupGroups:
dupGroups()
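The "duplicate or duplicated" distinction can be reproduced in base R by scanning for repeats in both directions. The helper name dup_index below is hypothetical (not part of wrangle), and the sketch does not claim to mirror the package's internals.

```r
# Hypothetical helper (not part of wrangle): TRUE for every row whose key
# combination occurs more than once, including the first occurrence.
dup_index <- function(x, keys) {
  k <- x[keys]
  # duplicated() flags later repeats; fromLast = TRUE flags earlier ones.
  duplicated(k) | duplicated(k, fromLast = TRUE)
}

d <- data.frame(id = c("a", "b", "a", "c"), value = 1:4)
idx <- dup_index(d, "id")  # logical index, one element per row
shown <- d[idx, ]          # the corresponding records
print(shown)
```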
Counts unique combinations of items in specified columns (unquoted).
enumerate(x, ...)
x | data.frame |
... | columns to show |
grouped_df
Other util:
detect(), itemize(), static()
enumerate(mtcars, cyl, gear, carb)
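For comparison, listing and counting unique combinations can be done in base R roughly as follows. This is an analogue under stated assumptions, not the package code: the data frame is illustrative, and the count column name n is chosen here, not taken from wrangle.

```r
# Illustrative data (not from the package).
d <- data.frame(
  cyl  = c(4, 4, 6, 6, 6),
  gear = c(4, 4, 3, 4, 4)
)

# Listing: each combination once, sorted by the columns of interest.
combos <- unique(d)
combos <- combos[order(combos$cyl, combos$gear), ]

# Counting: aggregate() with length() gives occurrences per combination.
# The helper column n is an assumption made for this sketch.
counts <- aggregate(n ~ cyl + gear, data = cbind(d, n = 1), FUN = length)
counts <- counts[order(counts$cyl, counts$gear), ]
print(counts)
```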
Drops columns in x that are present in y.
ignore(x, y, ...)
x | data.frame |
y | data.frame |
... | ignored |
data.frame
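A base R sketch of the same idea, assuming columns are matched by name (the source does not spell out the matching rule, so treat that as an assumption):

```r
x <- data.frame(id = 1:3, a = c(1, 2, 3), b = c("u", "v", "w"))
y <- data.frame(id = 1:3, b = c("p", "q", "r"))

# Keep only the columns of x whose names do not occur in y.
x_only <- x[, setdiff(names(x), names(y)), drop = FALSE]
print(names(x_only))
```

Using drop = FALSE keeps the result a data frame even when a single column remains.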
Drops columns in x that are entirely NA.
informative(x, ...)
x | object of dispatch |
... | passed |
Other informative:
informative.data.frame()
head(Theoph)
Theoph$Dose <- NA
head(informative(Theoph))
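The column test behind this behavior can be sketched in base R. The data frame is illustrative and the code is not taken from the package.

```r
# dose is entirely NA; wt is only partly NA and therefore still informative.
d <- data.frame(id = 1:3, dose = NA, wt = c(70, NA, 80))

# A column is dropped only if every one of its values is NA.
all_na <- vapply(d, function(col) all(is.na(col)), logical(1))
d_inf <- d[, !all_na, drop = FALSE]
print(names(d_inf))
```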
Drops columns in x that are entirely NA.
## S3 method for class 'data.frame'
informative(x, ...)
x | data.frame |
... | ignored |
data.frame
Other informative:
informative()
Shows unique combinations of items in specified columns (unquoted).
itemize(x, ...)
x | data.frame |
... | columns to show |
grouped_df
Other util:
detect(), enumerate(), static()
itemize(mtcars, cyl, gear, carb)
Indexes records whose relative positions would change if sorted, i.e. records that would not have the same nearest neighbors (before and after). unsorted() returns the records corresponding to this index.
## S3 method for class 'data.frame'
misplaced(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
logical with length nrow(x)
Other unsorted:
misplaced(), unsorted.data.frame(), unsorted()
Shows records with NA values of grouping variables.
## S3 method for class 'data.frame'
na(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
data.frame
Other na:
na()
Indexes records with NA values of grouping variables.
## S3 method for class 'data.frame'
naGroups(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
logical
Other naGroups:
naGroups()
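A logical index of this kind can be built in base R with complete.cases(). The sketch below assumes a record is flagged when at least one grouping column is NA; the data frame and the key names are illustrative.

```r
d <- data.frame(id = c(1, NA, 2), time = c(0, 1, NA), conc = c(NA, 3, 4))
keys <- c("id", "time")

# TRUE where at least one key column is NA; non-key columns (conc) are
# not consulted, so the NA in row 1 does not flag that record.
na_index <- !complete.cases(d[keys])
flagged <- d[na_index, ]
print(flagged)
```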
Joins data frames safely: performs a left join that cannot alter row order or number. This supports the case where you intend only to augment existing rows with additional columns and expect a single match for each row. Gives an error if row order or number would have been altered by a left join.
## S3 method for class 'data.frame'
safe_join(x, y, ...)
x | data.frame |
y | data.frame |
... | passed to dplyr::left_join |
Other safe_join:
safe_join()
library(magrittr)
x <- data.frame(code = c('a','b','c'), value = c(1:3))
y <- data.frame(code = c('a','b','c'), roman = c('I','II','III'))
x %>% safe_join(y)
try( x %>% safe_join(rbind(y,y)) )
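The guarantee can be illustrated in base R with a lookup that refuses non-unique keys. This is a different technique from the package's (it checks the key before joining instead of inspecting a left join's result), and the helper name augment is hypothetical.

```r
# Hypothetical helper (not part of wrangle): add the columns of y to x by a
# single key column, refusing to proceed if the key is not unique in y.
augment <- function(x, y, key) {
  if (anyDuplicated(y[[key]]) > 0) {
    stop("key '", key, "' is not unique in y; a left join would add rows")
  }
  # match() yields at most one row of y per row of x, in x's order.
  idx <- match(x[[key]], y[[key]])
  extra <- y[idx, setdiff(names(y), key), drop = FALSE]
  rownames(extra) <- NULL
  cbind(x, extra)
}

x <- data.frame(code = c("a", "b", "c"), value = 1:3)
y <- data.frame(code = c("a", "b", "c"), roman = c("I", "II", "III"))
joined <- augment(x, y, "code")

# Duplicated keys in y trigger the error instead of silently adding rows.
failed <- inherits(try(augment(x, rbind(y, y), "code"), silent = TRUE), "try-error")
print(joined)
```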
As of dplyr 0.5, dplyr::arrange ignores groups. This function restores the old behavior as a method for the generic base::sort. Borrowed from Ax3man at https://github.com/hadley/dplyr/issues/1206.
## S3 method for class 'grouped_df'
sort(x, decreasing = FALSE, ...)
x | grouped_df |
decreasing | logical (ignored) |
... | further sort criteria |
grouped_df
library(dplyr)
head(sort(group_by(Theoph, Subject, Time)))
Finds unique records for subset of columns with one unique value.
static(x, ...)
x | data.frame |
... | ignored |
data.frame
Other util:
detect(), enumerate(), itemize()
Reports status with respect to grouping variables.
## S3 method for class 'data.frame'
status(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
returns x invisibly (as originally grouped)
na
dup
unsorted
informative
ignore
itemize
enumerate
sort.grouped_df
Other status:
status()
library(dplyr)
status(Theoph)
status(Theoph, Subject)
status(group_by(Theoph, Subject, Time))
Extracts records whose relative positions would change if sorted, i.e. records that would not have the same nearest neighbors (before and after). misplaced() returns the index that extracts these records.
## S3 method for class 'data.frame'
unsorted(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
data.frame, possibly grouped_df
Other unsorted:
misplaced.data.frame(), misplaced(), unsorted()
Shows records with NA, duplicate or duplicated values of grouping variables.
## S3 method for class 'data.frame'
weak(x, ...)
x | data.frame |
... | optional grouping columns (named arguments are ignored) |
data.frame
Other weak:
weak()