Nicely reports NA values according to existing guidelines. This function reports both absolute and percentage values of specified column lists. Some authors recommend reporting item-level missing item per scale, as well as participant’s maximum number of missing items by scale. For example, Parent (2013) writes:

*I recommend that authors (a) state their tolerance level for
missing data by scale or subscale (e.g., “We calculated means
for all subscales on which participants gave at least 75%
complete data”) and then (b) report the individual missingness
rates by scale per data point (i.e., the number of missing
values out of all data points on that scale for all participants)
and the maximum by participant (e.g., “For Attachment Anxiety,
a total of 4 missing data points out of 100 were observed,
with no participant missing more than a single data point”).*

## Arguments

- data
The data frame.

- vars
Variable (or lists of variables) to check for NAs.

- scales
The scale names to check for NAs (single character string).

## Value

A dataframe, with:

`var`

: variables selected`items`

: number of items for selected variables`na`

: number of missing cell values for those variables (e.g., 2 missing values for first participant + 2 missing values for second participant = total of 4 missing values)`cells`

: total number of cells (i.e., number of participants multiplied by number of variables,`items`

)`na_percent`

: the percentage of missing values (number of missing cells,`na`

, divided by total number of cells,`cells`

)`na_max`

: The amount of missing values for the participant with the most missing values for the selected variables`na_max_percent`

: The amount of missing values for the participant with the most missing values for the selected variables, in percentage (i.e.,`na_max`

divided by the number of selected variables,`items`

)`all_na`

: the number of participants missing 100% of items for that scale (the selected variables)

## References

Parent, M. C. (2013). Handling item-level missing
data: Simpler is just as good. *The Counseling Psychologist*,
*41*(4), 568-600. https://doi.org/10.1177%2F0011000012445176

## Examples

```
# Use whole data frame
nice_na(airquality)
#> var items na cells na_percent na_max na_max_percent all_na
#> 1 Ozone:Day 6 44 918 4.79 2 33.33 0
# Use selected columns explicitly
nice_na(airquality,
vars = list(
c("Ozone", "Solar.R", "Wind"),
c("Temp", "Month", "Day")
)
)
#> var items na cells na_percent na_max na_max_percent all_na
#> 1 Ozone:Wind 3 44 459 9.59 2 66.67 0
#> 2 Temp:Day 3 0 459 0.00 0 0.00 0
#> 3 Total 6 44 918 4.79 2 33.33 0
# If the questionnaire items start with the same name, e.g.,
set.seed(15)
fun <- function() {
c(sample(c(NA, 1:10), replace = TRUE), NA, NA, NA)
}
df <- data.frame(
ID = c("idz", NA),
scale1_Q1 = fun(), scale1_Q2 = fun(), scale1_Q3 = fun(),
scale2_Q1 = fun(), scale2_Q2 = fun(), scale2_Q3 = fun(),
scale3_Q1 = fun(), scale3_Q2 = fun(), scale3_Q3 = fun()
)
# One can list the scale names directly:
nice_na(df, scales = c("ID", "scale1", "scale2", "scale3"))
#> var items na cells na_percent na_max na_max_percent all_na
#> 1 ID:ID 1 7 14 50.00 1 100 7
#> 2 scale1_Q1:scale1_Q3 3 11 42 26.19 3 100 3
#> 3 scale2_Q1:scale2_Q3 3 17 42 40.48 3 100 3
#> 4 scale3_Q1:scale3_Q3 3 10 42 23.81 3 100 3
#> 5 Total 10 45 140 32.14 10 100 2
```