Identifies datasets in the Item Response Warehouse (IRW) based on user-defined criteria. This function filters datasets using precomputed metadata, which contains summary statistics for each dataset (e.g., number of responses, number of participants, and density scores).

Usage

irw_filter(
  n_responses = NULL,
  n_categories = NULL,
  n_participants = NULL,
  n_items = NULL,
  responses_per_participant = NULL,
  responses_per_item = NULL,
  density = c(0.5, 1),
  var = NULL
)

Arguments

n_responses

A numeric vector of length 2 specifying the range for total responses.

n_categories

A numeric vector of length 2 specifying the range for unique response categories.

n_participants

A numeric vector of length 2 specifying the range for the number of participants.

n_items

A numeric vector of length 2 specifying the range for the number of items.

responses_per_participant

A numeric vector of length 2 specifying the range for average responses per participant.

responses_per_item

A numeric vector of length 2 specifying the range for average responses per item.

density

A numeric vector of length 2 specifying the range for data density. Defaults to c(0.5, 1). To disable this filter, set density = NULL.

var

A character vector specifying one or more variables.

- If exact variable names are provided, only datasets containing all specified variables will be returned.
- If a variable name contains an underscore (e.g., "cov_", "Qmatrix_"), the function will match all datasets that contain at least one variable that starts with that prefix.
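
The matching rule for var can be illustrated with a small base-R sketch. This is not the package's implementation: the dataset_vars list, the has_all_vars() helper, and the match_datasets() helper are hypothetical names invented for illustration, and the data are made up. The sketch follows the rule as documented above, treating any entry that contains an underscore as a prefix and any other entry as an exact name.

```r
# Hypothetical lookup of dataset name -> variable names (made-up data,
# not real IRW contents).
dataset_vars <- list(
  ds_a = c("id", "item", "resp", "rt", "cov_age"),
  ds_b = c("id", "item", "resp", "treat", "rt", "cov_sex"),
  ds_c = c("id", "item", "resp", "Qmatrix_1", "Qmatrix_2")
)

# TRUE if `vars` satisfies every entry in `wanted`.
# Assumption (taken from the documented rule): an entry containing "_"
# is a prefix; any other entry must match a variable name exactly.
has_all_vars <- function(vars, wanted) {
  all(vapply(wanted, function(w) {
    if (grepl("_", w, fixed = TRUE)) {
      any(startsWith(vars, w))
    } else {
      w %in% vars
    }
  }, logical(1)))
}

# Return the sorted names of the hypothetical datasets that satisfy `wanted`.
match_datasets <- function(wanted) {
  keep <- vapply(dataset_vars, has_all_vars, logical(1), wanted = wanted)
  sort(names(dataset_vars)[keep])
}

match_datasets(c("treat", "rt", "cov_"))
match_datasets("Qmatrix_")
```

In this toy data, only ds_b carries "treat", "rt", and a variable starting with "cov_", so the first call keeps that dataset alone; the second call keeps ds_c because it has variables starting with "Qmatrix_".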

Value

A sorted character vector of dataset names matching all specified criteria or an empty result if no matches are found.
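
How range criteria combine into the returned value can be sketched in base R. The meta data frame, in_range(), and filter_sketch() below are hypothetical and exist only for illustration. The sketch assumes that range bounds are inclusive at both ends, which this page does not state, and it is not the function's actual implementation.

```r
# Hypothetical metadata table (made-up values, not real IRW metadata).
meta <- data.frame(
  dataset = c("ds_c", "ds_a", "ds_b", "ds_d"),
  n_participants = c(1200, 300, 800, 5000),
  density = c(0.9, 0.7, 0.4, 0.2),
  stringsAsFactors = FALSE
)

# Assumption: ranges are inclusive at both ends; NULL disables the filter.
in_range <- function(x, range) {
  if (is.null(range)) return(rep(TRUE, length(x)))
  stopifnot(is.numeric(range), length(range) == 2)
  x >= range[1] & x <= range[2]
}

# Keep rows that satisfy every active criterion, then sort the names.
filter_sketch <- function(n_participants = NULL, density = c(0.5, 1)) {
  keep <- in_range(meta$n_participants, n_participants) &
    in_range(meta$density, density)
  sort(meta$dataset[keep])
}

filter_sketch(n_participants = c(500, Inf))                 # default density range applies
filter_sketch(n_participants = c(500, Inf), density = NULL) # density filter disabled
filter_sketch(n_participants = c(1e6, Inf))                 # nothing matches
```

The first call shows why the default density = c(0.5, 1) matters: datasets that meet the participant criterion are still dropped when their density falls below 0.5, unless density = NULL is passed.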

Details

Exploring Metadata

To understand available dataset properties before filtering, run summary(irw_metadata()). This provides an overview of key characteristics such as response counts, participant numbers, and density. Users can then apply irw_filter() to select datasets matching their criteria.

Examples

if (FALSE) { # \dontrun{
  # Example 1: Filter datasets that have at least 1,000 responses and contain the variable "rt"
  irw_filter(n_responses = c(1000, Inf), var = "rt")

  # Example 2: Disable density filtering and return datasets with "wave"
  irw_filter(var = "wave", density = NULL)

  # Example 3: Find datasets with at least 500 participants and response density 0.3-0.8
  irw_filter(n_participants = c(500, Inf), density = c(0.3, 0.8))

  # Example 4: Retrieve datasets that contain all of "treat", "rt", and at least one "cov_*" variable
  irw_filter(var = c("treat", "rt", "cov_"))

  # Example 5: Retrieve datasets that contain any variable starting with "Qmatrix_"
  irw_filter(var = c("Qmatrix_"))
} # }