Multi-Outcome Regression Table — regtab • SimtablR

Fits generalized linear models (GLMs) for multiple outcome variables and generates a formatted wide-format table with point estimates and confidence intervals. Supports robust standard errors, automatic exponentiation for count/binary outcomes, and custom labeling for publication-ready tables.

Usage

regtab(
  data,
  outcomes,
  predictors,
  family = poisson(link = "log"),
  robust = TRUE,
  exponentiate = NULL,
  labels = NULL,
  d = 2,
  conf.level = 0.95,
  include_intercept = FALSE,
  p_values = FALSE
)

Arguments

data

Data.frame containing all variables for analysis.

outcomes

Character vector of dependent variable names. Each outcome is modeled separately with the same set of predictors.

predictors

Formula or character string specifying predictors. Can be:

Formula: ~ x1 + x2 + x3
Character: "~ x1 + x2 + x3" or "x1 + x2 + x3"
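As a rough illustration of how the three accepted forms can be treated as equivalent, the sketch below normalizes each one to a one-sided formula. The helper name normalize_predictors is hypothetical and is not part of SimtablR; the package may handle this differently internally.

```r
# Hypothetical helper (not part of SimtablR): turn any accepted
# `predictors` form into a one-sided formula.
normalize_predictors <- function(predictors) {
  if (inherits(predictors, "formula")) {
    return(predictors)
  }
  rhs <- trimws(predictors)
  # Add the leading tilde when the string omits it
  if (!startsWith(rhs, "~")) {
    rhs <- paste("~", rhs)
  }
  stats::as.formula(rhs)
}

f1 <- normalize_predictors(~ x1 + x2 + x3)
f2 <- normalize_predictors("~ x1 + x2 + x3")
f3 <- normalize_predictors("x1 + x2 + x3")

# All three forms name the same predictors
all.vars(f1)
all.vars(f2)
all.vars(f3)
```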

family

GLM family specification. Options:

poisson(link = "log") - For count outcomes (default)
binomial(link = "logit") - For binary outcomes
gaussian(link = "identity") - For continuous outcomes
quasipoisson(), quasibinomial() - For overdispersed data
Or character: "poisson", "binomial", "gaussian"
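The sketch below shows one way a family given as a character string can be resolved to a family object, following the same pattern glm() itself uses. The helper resolve_family is hypothetical and only illustrates the idea; it is not taken from the package source.

```r
# Hypothetical helper: accept a family object, a family function, or its name
resolve_family <- function(family) {
  if (is.character(family)) {
    # Look the family constructor up by name in the stats namespace
    family <- get(family, mode = "function", envir = asNamespace("stats"))
  }
  if (is.function(family)) {
    family <- family()
  }
  stopifnot(inherits(family, "family"))
  family
}

fam_chr <- resolve_family("poisson")
fam_obj <- resolve_family(binomial(link = "logit"))

fam_chr$family  # family name
fam_chr$link    # default link for that family
fam_obj$link
```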

robust

Logical. If TRUE (default), calculates heteroskedasticity-consistent (HC0) robust standard errors via the sandwich package. CIs are based on robust SEs.

exponentiate

Logical. If TRUE, exponentiates coefficients and CIs:

Poisson: IRR (Incidence Rate Ratios)
Binomial: OR (Odds Ratios)
Gaussian: Not typically used (stays on linear scale)

If NULL (default), automatically detects: TRUE for Poisson/Binomial, FALSE for Gaussian.
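The auto-detection rule can be sketched as follows. The helper should_exponentiate is hypothetical and covers only the three documented cases; how quasi-families are treated is not stated here, so the sketch makes no claim about them.

```r
# Hypothetical helper: resolve `exponentiate` when it is left as NULL
should_exponentiate <- function(family, exponentiate = NULL) {
  # An explicit TRUE/FALSE from the user always wins
  if (!is.null(exponentiate)) {
    return(isTRUE(exponentiate))
  }
  # Documented rule: TRUE for Poisson/Binomial, FALSE for Gaussian
  family$family %in% c("poisson", "binomial")
}

should_exponentiate(poisson(link = "log"))
should_exponentiate(binomial(link = "logit"))
should_exponentiate(gaussian(link = "identity"))
should_exponentiate(poisson(), exponentiate = FALSE)
```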

labels

Named character vector for renaming outcome columns in output. Format: c("raw_name" = "Pretty Label"). Useful for publication tables.

d

Integer. Number of decimal places for rounding estimates and CIs. Default: 2.

conf.level

Numeric. Confidence level for the intervals, given as a value between 0 and 1. Default: 0.95.

include_intercept

Logical. If TRUE, includes intercept in output table. Default: FALSE (typically excluded from publication tables).

p_values

Logical. If TRUE, adds p-values as a separate column. Default: FALSE.

Value

A data.frame in wide format with:

Variable: Predictor names (first column)
Outcome columns: One column per outcome with formatted estimates and CIs

The result can be exported directly to Excel, Word, or LaTeX for publication.

Details

Model Fitting

For each outcome, the function fits: glm(outcome ~ predictors, family = family, data = data)
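A minimal sketch of this per-outcome loop, using base R only. The toy data and the object names (toy, fits) are illustrative assumptions, not package internals.

```r
set.seed(1)
toy <- data.frame(
  x  = rnorm(200),
  y1 = rpois(200, lambda = 4),
  y2 = rpois(200, lambda = 7)
)
outcomes   <- c("y1", "y2")
predictors <- ~ x

# Build `outcome ~ predictors` for each outcome and fit a separate GLM
rhs_terms <- attr(stats::terms(predictors), "term.labels")
fits <- lapply(outcomes, function(y) {
  f <- stats::reformulate(rhs_terms, response = y)
  stats::glm(f, family = poisson(link = "log"), data = toy)
})
names(fits) <- outcomes

# Each element is an ordinary glm fit with its own formula
lapply(fits, formula)
```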

Robust Standard Errors

When robust = TRUE, the function:

Fits the model with a standard glm() call
Computes sandwich covariance matrix (HC0 estimator)
Calculates Wald-type CIs based on robust SEs

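This keeps standard errors and CIs valid under heteroskedasticity and mild misspecification of the variance function (for example, overdispersed counts in a Poisson model), as long as the mean model is correctly specified.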
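The three steps above can be sketched in base R as follows. The HC0 matrix is computed by hand here so the example runs without extra packages; with the sandwich package installed, sandwich::vcovHC(fit, type = "HC0") is the usual route. Using a normal quantile for the Wald interval is an assumption of this sketch.

```r
set.seed(2)
toy <- data.frame(x = rnorm(300))
toy$y <- rpois(300, lambda = exp(0.5 + 0.3 * toy$x))

# Step 1: standard GLM fit
fit <- glm(y ~ x, family = poisson(link = "log"), data = toy)

# Step 2: HC0 sandwich by hand, valid for families with dispersion 1
# (Poisson, binomial). bread = unscaled covariance, meat = sum of
# outer products of the per-observation scores.
X      <- model.matrix(fit)
scores <- X * (residuals(fit, type = "working") * weights(fit, type = "working"))
bread  <- summary(fit)$cov.unscaled
vc_hc0 <- bread %*% crossprod(scores) %*% bread

# Step 3: Wald-type interval on the link (log) scale
conf.level <- 0.95
z   <- qnorm(1 - (1 - conf.level) / 2)
est <- coef(fit)
se  <- sqrt(diag(vc_hc0))
ci  <- cbind(lower = est - z * se, upper = est + z * se)
ci

# Optional cross-check against the sandwich package, if it is installed
if (requireNamespace("sandwich", quietly = TRUE)) {
  print(all.equal(unname(vc_hc0), unname(sandwich::vcovHC(fit, type = "HC0"))))
}
```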

Exponentiation

Poisson regression: exp(β) = Incidence Rate Ratio (IRR)
  IRR = 1: No association
  IRR > 1: Increased rate
  IRR < 1: Decreased rate
Logistic regression: exp(β) = Odds Ratio (OR)
  OR = 1: No association
  OR > 1: Increased odds
  OR < 1: Decreased odds
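A small worked example of the transformation, using made-up numbers for the log-scale coefficient and its standard error. The interval bounds are computed on the log scale first and then exponentiated, so the resulting interval is not symmetric around the ratio.

```r
# Illustrative values only (not from any fitted model)
beta <- 0.049   # coefficient on the log scale
se   <- 0.015   # its standard error
z    <- qnorm(0.975)

irr   <- exp(beta)
lower <- exp(beta - z * se)
upper <- exp(beta + z * se)

res <- round(c(IRR = irr, lower = lower, upper = upper), 2)
res
# IRR = 1.05, lower = 1.02, upper = 1.08
```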

Output Format

Returns a wide-format data.frame:


Variable    | Outcome1           | Outcome2           | ...
------------|--------------------|--------------------|----
(Intercept) | 2.34 (1.89 - 2.91) | 1.98 (1.65 - 2.38) | ...
age         | 1.05 (1.02 - 1.08) | 1.03 (1.01 - 1.06) | ...
sex         | 0.87 (0.75 - 1.01) | 0.92 (0.81 - 1.05) | ...

Each cell contains: "Estimate (Lower CI - Upper CI)"
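The cell format can be reproduced with sprintf(), which also keeps trailing zeros (for example "1.00"). The helper format_cell below is a hypothetical illustration of how the d argument could control the number of decimals.

```r
# Hypothetical helper: build "Estimate (Lower - Upper)" with d decimals
format_cell <- function(est, lower, upper, d = 2) {
  fmt <- paste0("%.", d, "f (%.", d, "f - %.", d, "f)")
  sprintf(fmt, est, lower, upper)
}

cell <- format_cell(1.0502, 1.0198, 1.0816)
cell
# "1.05 (1.02 - 1.08)"
```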

Missing Data

glm() drops incomplete cases by default (na.action = na.omit). Observations with missing values in any variable used by a model are excluded from that model only, so the effective sample size can differ between outcomes.
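The following sketch shows the effect on sample size with toy data. It assumes R's default na.action setting (na.omit) and uses nobs() to count the rows each model actually used.

```r
set.seed(3)
toy <- data.frame(
  x  = rnorm(100),
  y1 = rpois(100, lambda = 4),
  y2 = rpois(100, lambda = 6)
)
toy$y1[1:10] <- NA   # missing only in the first outcome
toy$x[11:15] <- NA   # a missing predictor affects both models

fit1 <- glm(y1 ~ x, family = poisson(), data = toy)
fit2 <- glm(y2 ~ x, family = poisson(), data = toy)

nobs(fit1)  # 85: rows where y1 and x are both observed
nobs(fit2)  # 95: rows where y2 and x are both observed
```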

Convergence Issues

If a model fails to converge or encounters errors:

A warning is issued with the outcome name and error message
That outcome column is skipped in the output
Other outcomes continue processing
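This behavior can be approximated with tryCatch(), as in the sketch below. The helper fit_one and the toy data are assumptions made for illustration; the failing outcome is forced by using negative values, which poisson() rejects.

```r
set.seed(4)
toy <- data.frame(
  x   = rnorm(50),
  ok  = rpois(50, lambda = 3),
  bad = -rpois(50, lambda = 3) - 1   # negative values make poisson() fail
)

# Hypothetical helper: fit one outcome, turning an error into a warning
fit_one <- function(y) {
  tryCatch(
    glm(reformulate("x", response = y), family = poisson(), data = toy),
    error = function(e) {
      warning("Model for '", y, "' failed: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

fits <- lapply(c(ok = "ok", bad = "bad"), fit_one)

# Drop outcomes whose model failed; the rest are kept
fits <- Filter(Negate(is.null), fits)
names(fits)
```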

Examples

# Create example data
set.seed(456)
n <- 500
df <- data.frame(
  age = rnorm(n, 50, 10),
  sex = factor(sample(c("M", "F"), n, replace = TRUE)),
  treatment = factor(sample(c("A", "B"), n, replace = TRUE)),
  outcome1 = rpois(n, lambda = 5),
  outcome2 = rpois(n, lambda = 8),
  outcome3 = rpois(n, lambda = 3)
)

# Basic usage: Poisson regression for multiple outcomes
regtab(df,
       outcomes = c("outcome1", "outcome2", "outcome3"),
       predictors = ~ age + sex + treatment,
       family = poisson(link = "log"))
#> Auto-detected family 'poisson': Coefficients will be exponentiated.
#>      Variable             Result  Outcome
#> 1 (Intercept) 4.98 (4.00 - 6.21) outcome1
#> 2         age 1.00 (1.00 - 1.00) outcome1
#> 3        sexM 1.05 (0.98 - 1.13) outcome1
#> 4  treatmentB 1.00 (0.93 - 1.08) outcome1

# With custom labels and no robust SEs
regtab(df,
       outcomes = c("outcome1", "outcome2"),
       predictors = "age + sex",
       labels = c(outcome1 = "Primary Endpoint", outcome2 = "Secondary Endpoint"),
       robust = FALSE)
#> Auto-detected family 'poisson': Coefficients will be exponentiated.
#>      Variable             Result  Outcome
#> 1 (Intercept) 4.99 (4.00 - 6.21) outcome1
#> 2         age 1.00 (1.00 - 1.00) outcome1
#> 3        sexM 1.05 (0.97 - 1.14) outcome1

# Logistic regression with p-values
df$binary_outcome <- rbinom(n, 1, 0.4)
regtab(df,
       outcomes = "binary_outcome",
       predictors = ~ age + sex,
       family = binomial(),
       p_values = TRUE)
#> Auto-detected family 'binomial': Coefficients will be exponentiated.
#>      Variable             Result        Outcome P_Value
#> 1 (Intercept) 0.49 (0.18 - 1.33) binary_outcome   0.163
#> 2         age 1.01 (0.99 - 1.03) binary_outcome   0.346
#> 3        sexM 0.98 (0.68 - 1.40) binary_outcome   0.900