Main functions
This package has three main functions. With them we can generate simulated data for a pool of donors, a set of kidney transplant candidates, and the HLA antibodies of those candidates who are HLA sensitized.
Donors
A data frame with information for a pool of simulated donors can be generated with the function donors_df():
donors_df(n = 10,
          replace = TRUE,
          origin = 'PT',
          probs = c(0.4658, 0.0343, 0.077, 0.4229),
          lower = 18, upper = 75,
          mean = 60, sd = 12,
          uk = FALSE,
          seed.number = 3)
#> # A tibble: 10 × 9
#> ID bg A1 A2 B1 B2 DR1 DR2 age
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 D1 A 1 29 8 44 11 7 53
#> 2 D2 O 2 3 7 57 4 8 57
#> 3 D3 A 11 33 14 35 1 13 61
#> 4 D4 O 24 30 49 58 11 11 62
#> 5 D5 B 2 3 7 51 1 13 66
#> 6 D6 A 30 68 15 18 3 4 45
#> 7 D7 O 3 26 18 40 11 13 52
#> 8 D8 B 1 1 7 8 3 13 55
#> 9 D9 O 3 3 7 44 15 8 75
#> 10 D10 A 11 29 44 57 7 7 64
For a given number of rows n, a data frame is generated with columns:
- ID: unique identifier with the prefix ‘D’;
- bg: blood group, generated from the parameter probs, a vector with the probabilities for groups A, AB, B and O, respectively;
- A1, A2, B1, B2, DR1, DR2: HLA typing obtained according to the origin option (with replace = TRUE we can generate a data frame without limitations on the number of rows);
- age: generated from a Normal distribution with mean and sd given by the user, with values truncated by the lower and upper boundaries;
- DRI: when option uk = TRUE, the Donor Risk Index is computed as described by transplantr.
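The bg and age columns described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: sim_age() is a hypothetical helper, and it assumes that "truncated" means out-of-range draws are redrawn rather than clamped to the boundaries.

```r
# Hypothetical helper (not part of the package): draw ages from N(mean, sd)
# and redraw any value that falls outside [lower, upper].
sim_age <- function(n, mean = 60, sd = 12, lower = 18, upper = 75) {
  age <- rnorm(n, mean, sd)
  out <- age < lower | age > upper
  while (any(out)) {
    age[out] <- rnorm(sum(out), mean, sd)
    out <- age < lower | age > upper
  }
  round(age)  # rounding an in-range value keeps it within [lower, upper]
}

set.seed(3)
n <- 10
probs <- c(0.4658, 0.0343, 0.077, 0.4229)  # A, AB, B, O

donors <- data.frame(
  ID  = paste0("D", seq_len(n)),
  bg  = sample(c("A", "AB", "B", "O"), size = n, replace = TRUE, prob = probs),
  age = sim_age(n)
)
```

This sketch is not expected to reproduce the values printed by donors_df(), since the package may consume random numbers in a different order.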
The HLA population origin currently has as valid options ‘PT’ for Portuguese, and the populations available from the US National Marrow Donor Program (NMDP):
- ‘API’ - Asian / Pacific Islander
- ‘AFA’ - African American / Black
- ‘CAU’ - White / Caucasian
- ‘HIS’ - Hispanic
Defining seed.number allows for reproducibility.
Note: to compute DRI as described by transplantr, we generated the following variables: height (\(N(165,20)\)); hypertension (with probability \(0.43\)); sex (with probability \(0.55\) for male); CMV+ (with probability \(0.9\)); hospital stay (Poisson, \(P(\lambda = 4)\)); and GFR by age (<30: \(N(116,10)\); 30-39: \(N(107,10)\); 40-49: \(N(99,10)\); 50-59: \(N(93,10)\); 60-69: \(N(85,10)\); >=70: \(N(75,10)\)).
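As an illustration of the auxiliary variables listed in this note, they could be drawn in base R as follows. This is a sketch under assumptions: the second parameter of each \(N(\cdot,\cdot)\) is read as a standard deviation, and the column names and example ages are hypothetical rather than taken from the package.

```r
set.seed(3)
age <- c(25, 34, 41, 47, 52, 58, 61, 66, 70, 75)  # example donor ages
n <- length(age)

# Mean GFR per age band: <30, 30-39, 40-49, 50-59, 60-69, >=70
gfr_mean <- c(116, 107, 99, 93, 85, 75)
band <- findInterval(age, c(30, 40, 50, 60, 70)) + 1

aux <- data.frame(
  age           = age,
  height        = rnorm(n, mean = 165, sd = 20),
  hypertension  = rbinom(n, size = 1, prob = 0.43),
  sex           = ifelse(rbinom(n, size = 1, prob = 0.55) == 1, "M", "F"),
  cmv           = rbinom(n, size = 1, prob = 0.9),
  hospital_stay = rpois(n, lambda = 4),
  gfr           = rnorm(n, mean = gfr_mean[band], sd = 10)
)
```

findInterval() maps each age to its band, so the GFR mean is looked up per donor without a chain of conditionals.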
Candidates
A simulated waiting list of kidney transplant candidates can be generated with candidates_df():
candidates_df(n = 10,
              replace = TRUE,
              origin = 'PT',
              probs.abo = c(0.43, 0.03, 0.08, 0.46),
              probs.cpra = c(0.7, 0.1, 0.1, 0.1),
              lower = 18, upper = 75,
              mean = 45, sd = 15,
              prob.dm = 0.12,
              prob.urgent = 0.05,
              uk = FALSE,
              seed.number = 3)
#> # A tibble: 10 × 13
#> ID bg A1 A2 B1 B2 DR1 DR2 age cPRA hiper dialysis
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <lgl> <dbl>
#> 1 K1 O 1 29 8 44 11 7 37 0 FALSE 80
#> 2 K2 A 2 3 7 57 4 8 42 0 FALSE 49
#> 3 K3 O 11 33 14 35 1 13 68 0 FALSE 49
#> 4 K4 O 24 30 49 58 11 11 46 0 FALSE 36
#> 5 K5 A 2 3 7 51 1 13 47 76 FALSE 38
#> 6 K6 A 30 68 15 18 3 4 71 25 FALSE 44
#> 7 K7 O 3 26 18 40 11 13 52 91 TRUE 99
#> 8 K8 O 1 1 7 8 3 13 26 0 FALSE 69
#> 9 K9 A 3 3 7 44 15 8 35 0 FALSE 27
#> 10 K10 A 11 29 44 57 7 7 38 0 FALSE 10
#> # ℹ 1 more variable: urgent <dbl>
For a given number of rows n, a data frame is generated with columns:
- ID: unique identifier with the prefix ‘K’;
- bg: blood group, generated from the parameter probs.abo, a vector with the probabilities for groups A, AB, B and O, respectively (by default, we assume group O patients are more frequent);
- A1, A2, B1, B2, DR1, DR2: HLA typing obtained according to the origin option (with replace = TRUE we can generate a data frame without limitations on the number of rows);
- age: generated from a Normal distribution with mean and sd given by the user, with values truncated by the lower and upper boundaries;
- dialysis: time on dialysis in months, with values computed according to the patient’s blood group and hypersensitization status (cPRA > 85%): for hypersensitized patients with blood group O, time on dialysis is obtained from \(N(85,20)\); for patients who are either blood group O or hypersensitized, from \(N(70,20)\); for the remaining patients, from \(N(35,20)\);
- cPRA: patients are classified in groups with probabilities given by probs.cpra for 0%, 1%-50%, 51%-85% and 86%-100%, respectively. Within the groups > 0%, cPRA values are drawn at random from the distributions \(P(\lambda = 30)\), \(P(\lambda = 70)\) and \(P(\lambda = 90)\), respectively;
- Tier: patients are classified in two Tiers as described in POL186/11 – Kidney Transplantation: Deceased Donor Organ Allocation, from UK transplant. Tier A includes patients with MS = 10, cPRA = 100% or time on dialysis > 7 years; all remaining patients are classified as Tier B;
- MS: the matchability score, given by the deciles of the number of donors in dataset D10K that are a match to each transplant candidate. This score takes into account a patient’s blood type, HLA type and cPRA value. A patient with MS = 1 is defined as easy to match and one with MS = 10 as difficult to match;
- RRI: when option uk = TRUE, the Recipient Risk Index is computed as described by transplantr. To compute RRI, the variables age, time on dialysis (in days) and the probability of being diabetic (obtained from prob.dm) are used. We also assume all patients were on dialysis at time of listing;
- urgent: a dichotomous variable that takes the value 1 for clinically urgent patients. It is generated from prob.urgent.
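The cPRA and dialysis rules in the list above can be sketched in base R. This is a minimal illustration, not the package's code. It assumes that Poisson draws are clamped to the bounds of their cPRA group, that the second parameter of \(N(\cdot,20)\) is a standard deviation, and that negative dialysis times are set to 0. None of these choices is confirmed by the text.

```r
set.seed(3)
n <- 1000
probs.abo  <- c(0.43, 0.03, 0.08, 0.46)  # A, AB, B, O
probs.cpra <- c(0.7, 0.1, 0.1, 0.1)      # 0%, 1-50%, 51-85%, 86-100%

bg  <- sample(c("A", "AB", "B", "O"), n, replace = TRUE, prob = probs.abo)
grp <- sample(1:4, n, replace = TRUE, prob = probs.cpra)

# Poisson mean and (assumed) clamping bounds for each cPRA group
lambda <- c(0, 30, 70, 90)[grp]
lo     <- c(0, 1, 51, 86)[grp]
hi     <- c(0, 50, 85, 100)[grp]
cPRA   <- pmin(pmax(rpois(n, lambda), lo), hi)

# Mean time on dialysis depends on blood group O and hypersensitization
hyper <- cPRA > 85
mu <- ifelse(bg == "O" & hyper, 85,
             ifelse(bg == "O" | hyper, 70, 35))
dialysis <- pmax(round(rnorm(n, mean = mu, sd = 20)), 0)  # months, floored at 0
```

With the clamping assumption, the share of patients in each cPRA group stays equal to probs.cpra, and hyper is TRUE exactly for the 86%-100% group.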
The HLA population origin can be chosen from the options ‘PT’, ‘API’, ‘AFA’, ‘CAU’ and ‘HIS’, as described for the donors_df() data frame.
Defining seed.number allows for reproducibility.
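The Tier rule given in the column list for candidates (MS = 10, cPRA = 100% or more than 7 years on dialysis) reduces to a single vectorized condition. The function tier() below is a hypothetical helper written for illustration. It takes the matchability score as an input and assumes time on dialysis is supplied in months.

```r
# Hypothetical helper (not part of the package): Tier A if any criterion holds
tier <- function(ms, cPRA, dialysis_months) {
  ifelse(ms == 10 | cPRA == 100 | dialysis_months > 7 * 12, "A", "B")
}

tiers <- tier(ms              = c(10, 3, 5, 2),
              cPRA            = c(0, 100, 20, 0),
              dialysis_months = c(12, 30, 90, 40))
tiers
#> [1] "A" "A" "A" "B"
```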
HLA antibodies
The function Abs_df() generates a data frame with HLA antibodies from a candidates’ waiting list:
Abs_df(candidates = candidates_df(n = 10),
       origin = 'PT',
       seed.number = 3)
#> # A tibble: 35 × 2
#> ID abs
#> <chr> <chr>
#> 1 K5 A25
#> 2 K5 DR4
#> 3 K5 A34
#> 4 K5 B53
#> 5 K5 B49
#> 6 K5 DR4
#> 7 K5 B54
#> 8 K5 B57
#> 9 K5 B44
#> 10 K5 A30
#> # ℹ 25 more rows
As input, this function requires a data set with an ID and the patients’ HLA information (HLA typing and cPRA value), in the same format as provided by candidates_df(). Defining seed.number allows for reproducibility.
The HLA population origin must be defined in accordance with the one used in candidates_df().
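The output of Abs_df() is in long format, with one row per candidate and antibody, and the printed example shows that an antibody can appear more than once for the same candidate. A possible way to collapse it into one set of distinct antibodies per candidate is sketched below. The data frame abs_long is toy data in the same two-column shape, and its values are illustrative only.

```r
# Toy data mimicking the ID/abs layout (values are illustrative, not simulated)
abs_long <- data.frame(
  ID  = c("K5", "K5", "K5", "K5", "K6", "K6"),
  abs = c("A25", "DR4", "A34", "DR4", "B53", "A30")
)

# One character vector of distinct antibodies per candidate
abs_by_id <- lapply(split(abs_long$abs, abs_long$ID), unique)

# Number of distinct antibodies per candidate
n_abs <- lengths(abs_by_id)
```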
For PT origin, all these functions rely on HLA typing at intermediate resolution, as described in Lima et al, 2013.
For NMDP populations, HLA typing was described by Gragert et al, 2013.