Basic idea

My previous workflow was pretty tedious, as after preparing tables in R (data frames), I would export them to Excel, then copy from Excel into Word, and finally format the table in Word. Whenever I would make minor changes, it would take quite a bit of time to repeat these steps.

Fortunately, I found a package that suits my needs nicely, flextable. However, I only really need my tables in APA style 7th edition (Times New Roman size 12, only some horizontal lines, double-spaced, right number of decimals, etc.), so I made a function just with the default settings I like to simplify my life.

There are many existing options for APA tables. Most of them, however, only prebuild tables for specific analyses or contexts and offer little flexibility (or won’t export to Word). If you need to build your own tables and require more flexibility, read on!

Getting started

Load the rempsyc package:

library(rempsyc)

Note: If you haven’t installed this package yet, you will need to install it via the following command: install.packages("rempsyc").

The function can be used on almost any dataframe (though it does not allow duplicate column names). Here’s a simple example using the mtcars dataset, which comes with base R (meaning you can try this example too without downloading anything).

nice_table(
  mtcars[1:3, ], 
  title = c("Table 1", "Motor Trend Car Road Tests"),
  footnote = c("The data was extracted from the 1974 Motor Trend US magazine.",
               "* p < .05, ** p < .01, *** p < .001"))

Publication-ready tables

Let’s set up a more ‘credible’ table with actual statistics for demonstration. We would normally need a bit of complicated code to extract the relevant statistical information and create a dataframe that suits our needs.

Custom table

# Standardize variables to get standardized coefficients
mtcars.std <- lapply(mtcars, scale)
# Create a simple linear model
model <- lm(mpg ~ cyl + wt * hp, mtcars.std)
# Gather summary statistics
stats.table <- as.data.frame(summary(model)$coefficients)
# Get the confidence interval (CI) of the regression coefficient
CI <- confint(model)
# Add a column with the term names and join the CI to the stats
stats.table <- cbind(row.names(stats.table), stats.table, CI)
# Rename the columns appropriately
names(stats.table) <- c("Term", "B", "SE", "t", "p", "CI_lower", "CI_upper")

The dataframe looks like this (notice the large number of decimals):

stats.table
##                    Term          B         SE          t            p
## (Intercept) (Intercept) -0.1835269 0.08532112 -2.1510135 4.058431e-02
## cyl                 cyl -0.1082286 0.15071576 -0.7180977 4.788652e-01
## wt                   wt -0.6230206 0.10927573 -5.7013627 4.663587e-06
## hp                   hp -0.2874898 0.11955935 -2.4045781 2.331865e-02
## wt:hp             wt:hp  0.2875867 0.08895462  3.2329593 3.221753e-03
##               CI_lower     CI_upper
## (Intercept) -0.3585914 -0.008462403
## cyl         -0.4174718  0.201014550
## wt          -0.8472358 -0.398805290
## hp          -0.5328053 -0.042174267
## wt:hp        0.1050669  0.470106453

Now we can apply our function!

nice_table(stats.table)

Next, we can save the flextable as an object, which we can later edit further, view, or export to other software (e.g., Microsoft Word).

my_table <- nice_table(stats.table)

Save table to Word

One can easily save the table to Word by specifying the object name and desired path.

flextable::save_as_docx(my_table, path = "nice_tablehere.docx")

Simply change the path to where you would like to save it. If you copy-paste your path name, remember to use forward slashes (‘/’), as R expects, rather than backslashes (‘\’). Also remember to specify the file name and its .docx extension.
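As a small aside on paths, here is a minimal base R sketch of the two points above. The Windows path is made up for illustration, and `tempdir()` is used only so the example does not write into your own folders.

```r
# Build a path with file.path(), which joins its parts with forward slashes.
out_file <- file.path(tempdir(), "nice_tablehere.docx")

# A hypothetical copy-pasted Windows path; inside an R string, each
# backslash has to be doubled
win_path <- "C:\\Users\\me\\Documents\\nice_tablehere.docx"

# Swap every backslash for a forward slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
r_path

# Confirm that the file name carries the .docx extension
tools::file_ext(r_path)
```

Either `out_file` or `r_path` could then be passed as the `path` argument when saving the table.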

That’s it! Simple eh?


Statistical formatting

Notice that if you provide the CI_lower and CI_upper column names, it will automatically and properly format your confidence interval column and remove the lower and upper bound columns.

You can also see that it automatically formats df, b, t, and p in italics. It also rounds each row correctly, formats p values below .001 as < .001, and strips leading zeros (it does the same for correlations r, R2, and sr2).

Note: in order for this to work automatically, your columns must be named correctly. Currently the function will make the following conversions: p, t, SE, SD, M, W, N, n, z, F, b, r, and df to italic; R2 and sr2 to italic squared, dR to italic R subscript, np2 to italic η subscript-p squared, ges to italic η subscript-G squared, and B to β. Not seeing a symbol that should be there? Contact me and I’ll add it!

Let’s test this by simply changing our dataframe names for the exercise.

test <- head(mtcars, 3)
names(test) <- c("dR", "N", "M", "SD", "W", "np2", "ges", "z", "r", "R2", "sr2")
test[, 10:11] <- test[, 10:11]/10
nice_table(test)
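To make the p-value convention described above concrete (three decimals, no leading zero, and < .001 for very small values), here is a rough base R sketch. The helper `format_p_apa` is hypothetical and written only for this illustration; it is not the code nice_table uses internally.

```r
# Hypothetical helper illustrating APA-style p-value formatting
format_p_apa <- function(p, digits = 3) {
  # Round to a fixed number of decimals, e.g. 0.0406 becomes "0.041"
  out <- formatC(p, format = "f", digits = digits)
  # Strip the leading zero, e.g. "0.041" becomes ".041"
  out <- sub("^0\\.", ".", out)
  # Report very small values as a bound rather than as ".000"
  out[p < .001] <- "< .001"
  out
}

# p-values close to those in the regression table above
format_p_apa(c(0.0406, 0.4789, 4.66e-06))
```

The point is only to show what the output convention looks like; with nice_table, naming the column p is enough to get this formatting.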

Highlighting

You can also add an argument to highlight significant results for better visual discrimination, if you wish.

nice_table(stats.table, highlight = TRUE)

Pro tip: You can instead provide the highlight argument with a numeric value to set whatever critical p-value check you want, like highlight = .10, for “marginally significant” results, or highlight = .01 if you want to be more conservative.

nice_table(stats.table, highlight = .01)
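To preview which rows a given cutoff would single out, one can compare the p-values against the threshold directly in base R. This sketch refits the standardized model from the custom table example (using `as.data.frame(scale(mtcars))`, which yields the same standardized values) and assumes a strict "less than" comparison, which may differ from what the function does at the boundary.

```r
# Refit the standardized model from the custom table example
model <- lm(mpg ~ cyl + wt * hp, as.data.frame(scale(mtcars)))
p <- summary(model)$coefficients[, "Pr(>|t|)"]

# Terms falling under each cutoff (strict "<" is an assumption here)
names(p)[p < .05]
names(p)[p < .01]
names(p)[p < .10]
```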

Integrations

Making your own table manually may be intimidating at first. Fortunately, this function integrates nicely with the broom and report packages, so we can skip the complicated code if we are OK with using the default broom/report output. This requires specifying the type of model in the function’s broom/report argument (supported options are lm, t.test, cor.test and wilcox.test). We go through an example of each below.

broom table

library(broom)
model <- lm(mpg ~ cyl + wt * hp, mtcars)
(stats.table <- tidy(model, conf.int = TRUE))
## # A tibble: 5 × 7
##   term        estimate std.error statistic  p.value  conf.low conf.high
##   <chr>          <dbl>     <dbl>     <dbl>    <dbl>     <dbl>     <dbl>
## 1 (Intercept)  49.5      3.66       13.5   1.58e-13  42.0       57.0   
## 2 cyl          -0.365    0.509      -0.718 4.79e- 1  -1.41       0.678 
## 3 wt           -7.63     1.52       -5.01  2.93e- 5 -10.7       -4.51  
## 4 hp           -0.108    0.0298     -3.64  1.14e- 3  -0.169     -0.0473
## 5 wt:hp         0.0258   0.00799     3.23  3.22e- 3   0.00944    0.0422
nice_table(stats.table, broom = "lm")

report table

library(report)
model <- lm(mpg ~ cyl + wt * hp, mtcars)
(stats.table <- as.data.frame(report(model)))
## Parameter   | Coefficient |          95% CI | t(27) |      p | Std. Coef. | Std. Coef. 95% CI |    Fit
## ------------------------------------------------------------------------------------------------------
## (Intercept) |       49.49 | [ 41.97, 57.01] | 13.51 | < .001 |      -0.18 |    [-0.36, -0.01] |       
## cyl         |       -0.37 | [ -1.41,  0.68] | -0.72 | 0.479  |      -0.11 |    [-0.42,  0.20] |       
## wt          |       -7.63 | [-10.75, -4.51] | -5.01 | < .001 |      -0.62 |    [-0.85, -0.40] |       
## hp          |       -0.11 | [ -0.17, -0.05] | -3.64 | 0.001  |      -0.29 |    [-0.53, -0.04] |       
## wt × hp     |        0.03 | [  0.01,  0.04] |  3.23 | 0.003  |       0.29 |    [ 0.11,  0.47] |       
##             |             |                 |       |        |            |                   |       
## AIC         |             |                 |       |        |            |                   | 147.01
## AICc        |             |                 |       |        |            |                   | 150.37
## BIC         |             |                 |       |        |            |                   | 155.80
## R2          |             |                 |       |        |            |                   |   0.89
## R2 (adj.)   |             |                 |       |        |            |                   |   0.87
## Sigma       |             |                 |       |        |            |                   |   2.17
nice_table(stats.table)

The report package provides quite comprehensive tables, so one may request an abbreviated table with the short argument.

nice_table(stats.table, short = TRUE)

rempsyc table

nice_table also integrates nicely with other functions from the rempsyc package: nice_t_test, nice_mod, nice_slopes, nice_lm, and nice_lm_slopes, because they provide good default formats that include effect sizes. Let’s make a quick demo for some of them. The t-test function supports making several t-tests at once by specifying the desired dependent variables.

t-tests: nice_t_test

nice_t_test(data = mtcars,
            response = c("mpg", "disp", "drat"),
            group = "am",
            warning = FALSE) -> stats.table
stats.table
##   Dependent Variable         t       df            p         d   CI_lower
## 1                mpg -3.767123 18.33225 1.373638e-03 -1.477947 -2.2659731
## 2               disp  4.197727 29.25845 2.300413e-04  1.445221  0.6417834
## 3               drat -5.646088 27.19780 5.266742e-06 -2.003084 -2.8592770
##     CI_upper
## 1 -0.6705686
## 2  2.2295592
## 3 -1.1245498
nice_table(stats.table)

Moderations: nice_mod

nice_mod(data = mtcars,
         response = "mpg",
         predictor = "gear",
         moderator = "wt") -> stats.table
stats.table
##   Dependent Variable Predictor df         b          t          p         sr2
## 1                mpg      gear 28  5.615951  1.9437108 0.06204275 0.028488305
## 2                mpg        wt 28  1.403861  0.4301493 0.67037970 0.001395217
## 3                mpg   gear:wt 28 -1.966931 -2.1551077 0.03989970 0.035022025
nice_table(stats.table)

Custom cell formatting

In some cases, one may want to define specific formatting for specific columns. For example, one may be building a table full of p-values and may want them formatted as such (or just the appropriate columns).

p-values

nice_table(test[8:11], col.format.p = 1:4)

r-values

The same goes for r-values. As you see below, you can also override the automatic default formatting.

nice_table(test[8:11], col.format.r = 1:4)

Custom functions

And one can even provide a custom function:

# Add a constant to each value
fun <- function(x) {x+11.1}
nice_table(test[8:11], col.format.custom = 2:4, format.custom = "fun")

# Prefix each value with a multiplication sign
fun <- function(x) {paste("×", x)}
nice_table(test[8:11], col.format.custom = 2:4, format.custom = "fun")

# Show no decimals
fun <- function(x) {formatC(x, format = "f", digits = 0)}
nice_table(test[3:6], col.format.custom = 1:4, format.custom = "fun")

# Show five decimals
fun <- function(x) {formatC(x, format = "f", digits = 5)}
nice_table(test[3:6], col.format.custom = 1:4, format.custom = "fun")
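Before handing a custom function to the table, it can help to try it on a plain vector to see exactly what it returns. The sketch below uses a small hypothetical helper, `check_formatter`, which is not part of rempsyc; it simply checks that the formatter returns one value per input, which seems a sensible property for a cell formatter.

```r
# Hypothetical helper: run a formatter on sample values and check that it
# returns exactly one output per input
check_formatter <- function(f, x) {
  out <- f(x)
  stopifnot(length(out) == length(x))
  out
}

# The zero-decimal formatter used above
fun <- function(x) {formatC(x, format = "f", digits = 0)}

# Sample values chosen to avoid rounding ties
rounded <- check_formatter(fun, c(110, 3.9, 2.62))
rounded
```

Note that `formatC()` returns character strings, so the formatted column is text rather than numbers.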

Further editing

Often, one will need to tweak a table for a particular situation. Have no fear. This function outputs a flextable object, which can be ‘easily’ edited via the regular flextable functions. For an intro to flextable functions, see: https://davidgohel.github.io/flextable/.

Here is the basic formatting example provided by the flextable package:

library(dplyr)
library(flextable)
my_table %>%
  italic(j = 1, part = "body") %>%
  bg(bg = "gray", part = "header") %>%
  color(color = "blue", part = "header") %>%
  color(~ t > -3.5, ~ t + SE, color = "red") %>%
  bold(~ t > -3.5, ~ t + p, bold = TRUE) %>%
  set_header_labels(Term = "Model Term",
                    B   = "Standardized Beta",
                    p = "p-value")

Special situation: multilevel headers

Some people have asked how to make multilevel descriptive tables with nice_table. There are other ways to do this than with this package, and it is not straightforward, but here’s my attempt for multiple time measurements, multiple groups, and multiple dependent variables.

So assuming we have a study with several time measurements, we will make a copy of the iris data set and pretend this is the “Time 2”. Species will be our grouping variable. Before we can apply our simple function, however, we have to (painstakingly) reshape the data to the proper format.

# Setup example dataset
data <- cbind(iris[c(5, 1:3)], iris[1:3]+1)
names(data)[-1] <- c(paste0("T1.", names(data[2:4])),
                     paste0("T2.", names(data[2:4])))

# Get descriptive statistics
library(dplyr)
data %>%
  group_by(Species) %>% 
  summarize(across(T1.Sepal.Length:T2.Petal.Length, 
                   list(m = mean, sd = sd),
                   .names = "{.col}.{.fn}")) -> descriptive.data

# Rename the columns so we can merge them later
names(descriptive.data) <- c("Species", rep(c("T1.M", "T1.SD"), 3),
                             rep(c("T2.M", "T2.SD"), 3))

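# Extract the data by variable and measurement time
# (the disp/hp/drat suffixes are only labels; the rows hold Sepal.Length,
# Sepal.Width, and Petal.Length, in that order)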
T1.disp <- cbind(descriptive.data[1, 2:3], 
                 descriptive.data[2, 2:3], 
                 descriptive.data[3, 2:3])
T1.hp <- cbind(descriptive.data[1, 4:5], 
               descriptive.data[2, 4:5], 
               descriptive.data[3, 4:5])
T1.drat <- cbind(descriptive.data[1, 6:7], 
                 descriptive.data[2, 6:7], 
                 descriptive.data[3, 6:7])
T2.disp <- cbind(descriptive.data[1, 8:9],
                 descriptive.data[2, 8:9], 
                 descriptive.data[3, 8:9])
T2.hp <- cbind(descriptive.data[1, 10:11], 
               descriptive.data[2, 10:11], 
               descriptive.data[3, 10:11])
T2.drat <- cbind(descriptive.data[1, 12:13], 
                 descriptive.data[2, 12:13], 
                 descriptive.data[3, 12:13])

# Combine Time 1 with Time 2
T1 <- rbind(T1.disp, T1.hp, T1.drat)
T2 <- rbind(T2.disp, T2.hp, T2.drat)
wide.data <- cbind(Variable = names(iris[1:3]), T1, T2)

# Rename the columns, since duplicate names are not allowed
names(wide.data)[-1] <- paste0(
  rep(c("T1.", "T2."), each = 6), 
  rep(descriptive.data$Species, times = 2, each = 2),
  paste0(c(".M", ".SD")))

# Make preliminary nice_table
nice_table(wide.data)

So far so good; we’ve managed to transform the data into a suitable format for the next step. Once the data is in the right shape (header components separated by dots), we can apply our magic:

nice_table(wide.data, separate.header = TRUE, italics = seq(wide.data))

If you find a more efficient way to do this (the data wrangling part), please let me know. Nice result though!
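In that spirit, here is one possible, more compact take on the wrangling step, using base R only. It is a sketch rather than a tested replacement: it builds the same column names (T1.setosa.M, T1.setosa.SD, and so on) by looping over a grid of time, species, and statistic. The names `vars`, `grid`, `stat_funs`, `cols`, and `wide.data2` are introduced here for illustration.

```r
# Same example dataset as above: iris as "Time 1", iris + 1 as "Time 2"
vars <- names(iris)[1:3]
data <- cbind(iris[c(5, 1:3)], iris[1:3] + 1)
names(data)[-1] <- c(paste0("T1.", vars), paste0("T2.", vars))

# One row per output column; expand.grid() varies its first argument
# fastest, giving the order T1.setosa.M, T1.setosa.SD, T1.versicolor.M, ...
grid <- expand.grid(stat = c("M", "SD"),
                    species = levels(data$Species),
                    time = c("T1", "T2"),
                    stringsAsFactors = FALSE)
stat_funs <- list(M = mean, SD = sd)

# For each output column, compute the statistic for every variable
cols <- lapply(seq_len(nrow(grid)), function(i) {
  rows <- data$Species == grid$species[i]
  vapply(vars, function(v) {
    stat_funs[[grid$stat[i]]](data[rows, paste0(grid$time[i], ".", v)])
  }, numeric(1), USE.NAMES = FALSE)
})
names(cols) <- paste(grid$time, grid$species, grid$stat, sep = ".")

# Assemble the wide table: one row per variable, one column per cell
wide.data2 <- cbind(Variable = vars, as.data.frame(cols))
names(wide.data2)
```

The result should be usable in the same way as wide.data, for instance with nice_table(wide.data2, separate.header = TRUE, italics = seq(wide.data2)).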

Multilevel heading, with formatting

Another colleague asked whether it was possible to use the multilevel headings while still benefiting from the regular automatic formatting of the p-values, confidence intervals, etc. That was a challenging task to implement, but I think I’ve got something that should mostly work. Demo:

T1.mpg <- nice_t_test(data = mtcars, response = "mpg", group = "am")
T2.mpg <- nice_t_test(data = mtcars, response = "mpg", group = "vs")
T1.disp <- nice_t_test(data = mtcars, response = "disp", group = "am")
T2.disp <- nice_t_test(data = mtcars, response = "disp", group = "vs")
names(T1.mpg)[-1] <- paste0("T1.", names(T1.mpg)[-1])
names(T2.mpg) <- paste0("T2.", names(T2.mpg))
names(T1.disp)[-1] <- paste0("T1.", names(T1.disp)[-1])
names(T2.disp) <- paste0("T2.", names(T2.disp))
T1 <- rbind(T1.mpg, T1.disp)
T2 <- rbind(T2.mpg, T2.disp)
wide.data <- cbind(T1, T2[-(1)])
nice_table(wide.data)
nice_table(wide.data, separate.header = TRUE)

Let’s try adding another level of heading.

names(wide.data)[-1] <- paste0(rep(c("Early.", "Late."), each = 6), names(wide.data)[-1])
nice_table(wide.data)
nice_table(wide.data, separate.header = TRUE)
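Since the header levels are encoded in the column names as dot-separated components, it can be useful to inspect how a set of names breaks down before building the table. The base R sketch below only illustrates the naming convention on a few example names; how nice_table splits the names internally is not shown here.

```r
# A few column names following the Level1.Level2.Statistic pattern
nms <- c("Early.T1.t", "Early.T1.CI_lower", "Late.T2.p")

# Split on literal dots (fixed = TRUE avoids regex interpretation of ".")
parts <- strsplit(nms, ".", fixed = TRUE)

# One row per column name, one column per header level
do.call(rbind, parts)

# Number of header levels encoded in each name
lengths(parts)
```

A name with a different number of components than the others is a good place to look first if the header rows come out misaligned.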

Thanks for checking in

Make sure to check out this page again if you use the code after some time or if you encounter errors, as I periodically update or improve the code. Feel free to contact me for comments, questions, or requests to improve this function at https://github.com/rempsyc/rempsyc/issues. See all tutorials here: https://remi-theriault.com/tutorials.