causal_XTY_binary.Rd
Creates a causal data set \(S = (X,T,Y_0, Y_1, Y_{obs})\) for causal inference. Each of the \(p\) columns of \(X\) is sampled independently from a Gaussian distribution with mean \(\mu_i\) and standard deviation \(\sigma_i\), i.e. \(N(\mu_i, \sigma_i^2)\). The potential outcomes \(Y_0\) and \(Y_1\) are the outcomes under treatment \(T = 0\) and \(T = 1\), respectively. The binary treatment \(T\) takes the value 1 with probability \(p_{treatment}\) and 0 otherwise, and \(Y_{obs}\) is obtained by choosing the potential outcome (either \(Y_0\) or \(Y_1\)) corresponding to the sampled treatment \(T\). The base outcome \(Y_0 = X^T \beta\) is assumed to depend on \(X\) in a linear fashion, and the average treatment effect corresponds to the additive effect of receiving treatment \(T = 1\). See Causality (Pearl 2009) for further details and a general introduction to causal inference.
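The description above can be summarised as a data-generating model for unit \(i\) and covariate \(j\). This is a sketch rather than a statement of the implementation: the symbol \(\tau\) is introduced here for the `treatment_effect` argument, and treating the effect as a constant shift is an assumption that matches the printed examples, where \(Y_1 - Y_0\) equals the treatment effect in every row.

```latex
\begin{aligned}
X_{ij} &\sim N(\mu_j, \sigma_j^2) \quad \text{independently for } j = 1, \dots, p, \\
T_i &\sim \mathrm{Bernoulli}(p_{treatment}), \\
Y_{0,i} &= X_i^T \beta, \\
Y_{1,i} &= Y_{0,i} + \tau, \\
Y_{obs,i} &= T_i \, Y_{1,i} + (1 - T_i) \, Y_{0,i}.
\end{aligned}
```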
causal_XTY_binary(
  n = 100,
  mu = rep(0, 4),
  sigma = rep(1, 4),
  beta_coefficients = 1:4,
  treatment_prob = 0.5,
  treatment_effect = 10
)
Argument | Description |
---|---|
n | The desired number of data points in the data set. |
mu | A \(p\)-dimensional vector of means for \(\mu\). |
sigma | A \(p\)-dimensional vector of non-negative standard deviations for \(\sigma\). |
beta_coefficients | A \(p\)-dimensional vector of coefficients for \(\beta\). |
treatment_prob | The probability of treatment assignment \(p_{treatment}\), a value between 0 and 1. |
treatment_effect | The average treatment effect, i.e. the difference between the two potential outcomes \(Y_1\) and \(Y_0\). |
A causal data set \(S = (X,T,Y_0, Y_1, Y_{obs})\), returned as a tibble with columns `X1`, ..., `Xp`, `treatment`, `Y0`, `Y1` and `Y_observed`. In the default case, \(n = 100\) and \(p = 4\): the columns \(X_i\) are sampled from \(N(0,1)\) and the beta-coefficients are 1 to 4. The default treatment probability is 0.5 (i.e. a coin flip), and the default average treatment effect is 10.
causal_XTY_binary()
#> # A tibble: 100 × 8
#>          X1     X2      X3     X4 treatment     Y0    Y1 Y_observed
#>       <dbl>  <dbl>   <dbl>  <dbl>     <dbl>  <dbl> <dbl>      <dbl>
#>  1    -1.40 -0.387  -0.429 -0.356         0  -4.89  5.11      -4.89
#>  2    0.255 -0.785    1.36  -1.06         1  -1.49  8.51       8.51
#>  3    -2.44  -1.06 -0.0709   1.08         1 -0.455  9.55       9.55
#>  4 -0.00557 -0.796  -0.272   1.18         0   2.31  12.3       2.31
#>  5    0.622  -1.76   -2.45  0.198         1  -9.44 0.563      0.563
#>  6     1.15 -0.691  0.0655 -0.400         1  -1.64  8.36       8.36
#>  7    -1.82 -0.559   -1.10  0.616         0  -3.77  6.23      -3.77
#>  8   -0.247 -0.537  -0.633   1.97         1   4.68  14.7       14.7
#>  9   -0.244  0.227   -2.06   1.88         1   1.56  11.6       11.6
#> 10   -0.283  0.978    2.65  -1.59         1   3.27  13.3       13.3
#> # … with 90 more rows

causal_XTY_binary(n = 40, mu = 1:7, sigma = rep(1, 7),
                  beta_coefficients = 1:7, treatment_prob = 0.75,
                  treatment_effect = 25)
#> # A tibble: 40 × 11
#>        X1    X2    X3    X4    X5    X6    X7 treatment    Y0    Y1 Y_observed
#>     <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl> <dbl>      <dbl>
#>  1   2.78  2.06  2.34  3.16  3.78  5.25  6.21         1  120.  145.       145.
#>  2   1.39  1.73  3.63  2.65  2.55  5.84  6.27         1  118.  143.       143.
#>  3 0.0813  1.55  2.49  3.18  3.51  6.35  7.68         1  133.  158.       158.
#>  4 -0.584 0.589  3.27  3.37  4.57  5.71  6.77         1  128.  153.       153.
#>  5  0.916  1.49  3.47  4.82  4.06  6.10  5.49         1  129.  154.       154.
#>  6  -1.09  1.73  3.72  4.30  4.88  6.72  6.42         1  140.  165.       165.
#>  7   1.00 0.915  3.61  5.81  6.34  5.39  4.98         1  136.  161.       161.
#>  8  0.644  2.36  2.38  3.11  4.14  4.89  7.40         1  127.  152.       152.
#>  9   2.15  1.66  3.22  3.95  5.67  6.53  7.55         0  151.  176.       151.
#> 10  0.779  3.36  4.13  3.53  3.58  6.74  7.03         1  142.  167.       167.
#> # … with 30 more rows
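To make the sampling steps concrete, the following base R sketch mimics the data-generating process described on this page. It is an illustration only and not the package's source code: the function name `simulate_causal_binary` is hypothetical, it returns a plain data frame rather than a tibble, and it assumes that \(Y_0 = X^T \beta\) carries no additional noise and that the treatment effect is a constant shift, which is consistent with the printed output above.

```r
# Hypothetical re-implementation for illustration; not the package's code.
simulate_causal_binary <- function(n = 100, mu = rep(0, 4), sigma = rep(1, 4),
                                   beta = 1:4, treatment_prob = 0.5,
                                   treatment_effect = 10) {
  p <- length(mu)
  stopifnot(length(sigma) == p, length(beta) == p)

  # Column j of X is drawn independently from N(mu[j], sigma[j]^2).
  X <- sapply(seq_len(p), function(j) rnorm(n, mean = mu[j], sd = sigma[j]))
  X <- matrix(X, nrow = n, ncol = p)  # keep an n x p matrix even if n or p is 1
  colnames(X) <- paste0("X", seq_len(p))

  # Binary treatment: 1 with probability treatment_prob, 0 otherwise.
  treatment <- rbinom(n, size = 1, prob = treatment_prob)

  # Potential outcomes (assumed noise-free linear base outcome, constant effect).
  Y0 <- as.vector(X %*% beta)
  Y1 <- Y0 + treatment_effect

  # The observed outcome is the potential outcome matching the treatment.
  Y_observed <- ifelse(treatment == 1, Y1, Y0)

  data.frame(X, treatment = treatment, Y0 = Y0, Y1 = Y1,
             Y_observed = Y_observed)
}

set.seed(1)
d <- simulate_causal_binary(n = 1000)

# Difference in mean observed outcome between treated and control units.
ate_hat <- mean(d$Y_observed[d$treatment == 1]) -
  mean(d$Y_observed[d$treatment == 0])
```

Because treatment is assigned independently of \(X\) in this sketch, the difference in means `ate_hat` estimates the average treatment effect; it will differ from `treatment_effect` by sampling noise, since the covariates of the treated and control groups are not identical.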