
Make a toy data set for testing and demos. This function is for internal use and is not intended to be called by users.

Usage

make_test_dat(
  vals_kept = c("304", "305", 3040:3049, 3050:3059),
  noise_val = "999",
  IDs = 1:50,
  date_range = seq(as.Date("2015-01-01"), as.Date("2020-12-31"), by = 1),
  nrows = 100,
  n_any = 50,
  n_all = 10,
  seed = NULL,
  answer_id = NULL,
  type = c("data.frame", "database")
)

Arguments

vals_kept

A vector of values that are supposed to be identified.

noise_val

A vector of values that are not meant to be identified.

IDs

A vector of client IDs.

date_range

A vector of all possible dates in the data.

nrows

Number of rows in the output.

n_any

Number of rows to be identified when the criterion is that any target column contains a value in vals_kept.

n_all

Number of rows to be identified when the criterion is that all target columns contain a value in vals_kept.
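
The difference between the "any" and "all" criteria can be illustrated with a minimal base R sketch. The data frame below is hypothetical: its column names are borrowed from the example output, its values are made up, and treating NA as a non-match is an assumption rather than documented behaviour of this function.

```r
# Hypothetical toy data with three target columns (not produced by make_test_dat)
dat <- data.frame(
  diagx   = c("304", "999", "3051", "999"),
  diagx_1 = c("3057", NA, "999", "999"),
  diagx_2 = c("305", "999", NA, "3040"),
  stringsAsFactors = FALSE
)

# Same default as in Usage; c() coerces the integer ranges to character
vals_kept <- c("304", "305", 3040:3049, 3050:3059)
cols <- c("diagx", "diagx_1", "diagx_2")

# Logical matrix: TRUE where a cell holds a target value.
# %in% returns FALSE for NA, so missing values count as non-matches here (assumption).
hit <- sapply(dat[cols], function(x) x %in% vals_kept)

any_match <- rowSums(hit) > 0             # "any": at least one column matches
all_match <- rowSums(hit) == length(cols) # "all": every column matches

data.frame(dat, any_match, all_match)
```

Every row that satisfies "all" also satisfies "any", so in this sketch the "all" rows form a subset of the "any" rows.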

seed

Seed for random number generation.
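
As a rough illustration of what a seed argument usually buys, the sketch below draws dates from the default date_range with and without a fixed seed. The helper draw_dates() is hypothetical and only mimics the common "set the seed if one is supplied" pattern; it is not taken from the internals of make_test_dat.

```r
# Default date_range from Usage
date_range <- seq(as.Date("2015-01-01"), as.Date("2020-12-31"), by = 1)

# Hypothetical helper: sets the seed only when one is supplied
draw_dates <- function(seed = NULL, n = 5) {
  if (!is.null(seed)) set.seed(seed)
  sample(date_range, n)
}

# The same seed gives the same draw, which makes test data reproducible
same <- identical(draw_dates(seed = 1), draw_dates(seed = 1))
same
```

With seed = NULL the draw depends on the current state of the random number generator, so repeated calls will generally differ.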

answer_id

Name of the column that indicates how each row should be identified: any, all, or noise.

type

Output type, "data.frame" or "database".

Value

A data.frame or remote table from 'dbplyr'.

Examples

make_test_dat() %>% head()
#>   uid clnt_id      dates diagx diagx_1 diagx_2
#> 1  56       1 2016-06-21   999    <NA>     999
#> 2  77       1 2016-08-18   999    <NA>     999
#> 3  93       1 2018-11-17   999    <NA>    <NA>
#> 4  91       1 2019-01-23   999     999     999
#> 5   5       3 2015-09-16   304    3057     999
#> 6  79       3 2019-02-14   999    <NA>    <NA>