R dplyr summarize percent

9/13/2023

One default I frequently correct is the treatment of discrete numeric values. For example, consider a rating scale with possible values of 1, 2, 3, … 7, but in which respondents only select values of 3, 4, and 5. Seeing only three unique values, gtsummary makes an educated guess, treats the variable as categorical, and returns frequencies of those values; however, you may still want a mean.
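As a minimal sketch of overriding that default, the type argument of tbl_summary can request a continuous summary. The survey data frame and its rating column below are made up for illustration and are not from the original post.

```r
library(gtsummary)

# Hypothetical survey data: a 1-7 rating scale where only 3, 4, and 5 were chosen
survey <- data.frame(rating = c(3, 4, 5, 4, 4, 3, 5, 4, 3, 5))

# Default behaviour: with so few unique values, rating is summarized as categorical
tbl_default <- tbl_summary(survey)

# Override: ask for a continuous summary and report the mean (SD)
tbl_mean <- tbl_summary(
  survey,
  type = list(rating ~ "continuous"),
  statistic = list(rating ~ "{mean} ({sd})")
)

tbl_mean
```

Printing tbl_default next to tbl_mean is a quick way to confirm which summary type gtsummary picked.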

In addition, gtsummary makes an educated guess on how to summarize your data and which statistical test to use. Pay attention to the footnote on the statistical tests performed and adjust if needed with the test argument in the add_p function.
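A short sketch of changing the test is below. It assumes the trial example data that ships with gtsummary (not the data used in this post) and swaps the default test for continuous variables to a t-test; the choice of test here is only an illustration.

```r
library(gtsummary)
library(dplyr)

# trial is the example dataset bundled with gtsummary (an assumption for this sketch)
tbl <- trial %>%
  select(trt, age, grade) %>%
  tbl_summary(by = trt) %>%
  # Replace the default test for continuous variables with a t-test
  add_p(test = list(all_continuous() ~ "t.test"))

tbl
```

The footnote under the printed table should then list the test that was actually used, which is worth checking after any override.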

Statistical tests performed: Wilcoxon rank-sum test; chi-square test of independence

The gtsummary documentation is excellent, so I won’t cover all of its awesome functionality, but I will add a bit of my specific experience. If you are still searching for your favorite table package, here are two round-up resources: How to make beautiful tables in R, by R for the Rest of Us (2019), and My favourite R package for: summarising data, by Dabbling with data (2018).

I’ll demonstrate with the Youth Risk Behavior Surveillance System (YRBSS) data; my previous post, Leveraging labelled data in R, has more background details.

Statistics presented: Median (IQR); n (%)

And wait - did you see that?! The raw data had variable names of q12, stheight, and q69, but the table printed the variable label! (I previously tweeted about the awesome package pairing of haven and gtsummary.) If your data does not come with handy labels, you can create them with the label option in tbl_summary or with the var_label function in the labelled package.

Here are a few modifications you might be interested in trying to customize your table, including adding an overall column, custom statistic formatting, and table styling.

Statistics presented: Mean (SD); % (n / N)

Note that there is an overall N that corresponds to the number of observations, and each variable can have its own N that corresponds to the number of non-missing observations for that variable.
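The label option and the customizations mentioned here (an overall column, custom statistic formatting, and styling) could be combined roughly as follows. This is a sketch on the trial example data bundled with gtsummary rather than the YRBSS data, and the label text "Age (years)" is invented for the example.

```r
library(gtsummary)
library(dplyr)

# trial is gtsummary's bundled example data, used as a stand-in for YRBSS
tbl <- trial %>%
  select(trt, age, grade) %>%
  tbl_summary(
    by = trt,
    # Custom label for one variable (illustrative text)
    label = list(age ~ "Age (years)"),
    # Custom statistic formatting: Mean (SD) and % (n / N)
    statistic = list(
      all_continuous() ~ "{mean} ({sd})",
      all_categorical() ~ "{p}% ({n} / {N})"
    )
  ) %>%
  add_overall() %>%  # add a column summarizing all observations
  bold_labels()      # simple table styling

tbl
```

Labels can also be attached to the data itself before the table is built, which keeps the tbl_summary call shorter when many variables need them.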

The gtsummary package in R creates amazing publication / presentation / whatever-you-need-it-for ready tables of summary statistics. Try it out!

Background

A colleague learning R just told me that he spent 45 minutes searching for a summary table function and couldn’t quite find anything that met his needs. When I showed him gtsummary in 5 minutes, his reaction was all [reaction image]. This blog post is to promote gtsummary and make it more searchable for those still seeking the one table to rule them all.

    library(dplyr)

    # Keep only the droids (note ==, not =, for comparison)
    starwars %>%
      filter(species == "Droid")
    #> # A tibble: 6 × 14
    #>   name   height  mass hair_color skin_color  eye_color birth_year sex   gender
    #>   <chr>   <int> <dbl> <chr>      <chr>       <chr>          <dbl> <chr> <chr>
    #> 1 C-3PO     167    75 <NA>       gold        yellow           112 none  masculi…
    #> 2 R2-D2      96    32 <NA>       white, blue red               33 none  masculi…
    #> 3 R5-D4      97    32 <NA>       white, red  red               NA none  masculi…
    #> 4 IG-88     200   140 none       metal       red               15 none  masculi…
    #> 5 R4-P17     96    NA none       silver, red red, blue         NA none  feminine
    #> # ℹ 1 more row
    #> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
    #> #   vehicles <list>, starships <list>

    # Keep the name plus every column whose name ends in "color"
    starwars %>%
      select(name, ends_with("color"))
    #> # A tibble: 87 × 4
    #>   name           hair_color skin_color  eye_color
    #>   <chr>          <chr>      <chr>       <chr>
    #> 1 Luke Skywalker blond      fair        blue
    #> 2 C-3PO          <NA>       gold        yellow
    #> 3 R2-D2          <NA>       white, blue red
    #> 4 Darth Vader    none       white       yellow
    #> 5 Leia Organa    brown      light       brown
    #> # ℹ 82 more rows

    # Add a body mass index column (height is in cm, mass in kg)
    starwars %>%
      mutate(name, bmi = mass / ((height / 100)^2)) %>%
      select(name:mass, bmi)
    #> # A tibble: 87 × 4
    #>   name           height  mass   bmi
    #>   <chr>           <int> <dbl> <dbl>
    #> 1 Luke Skywalker    172    77  26.0
    #> 2 C-3PO             167    75  26.9
    #> 3 R2-D2              96    32  34.7
    #> 4 Darth Vader       202   136  33.3
    #> 5 Leia Organa       150    49  21.8
    #> # ℹ 82 more rows

    # Sort from heaviest to lightest
    starwars %>%
      arrange(desc(mass))
    #> # A tibble: 87 × 14
    #>   name      height  mass hair_color skin_color eye_color birth_year sex   gender
    #>   <chr>      <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr>
    #> 1 Jabba De…    175  1358 <NA>       green-tan… orange         600   herm… mascu…
    #> 2 Grievous     216   159 none       brown, wh… green, y…       NA   male  mascu…
    #> 3 IG-88        200   140 none       metal      red             15   none  mascu…
    #> 4 Darth Va…    202   136 none       white      yellow          41.9 male  mascu…
    #> 5 Tarfful      234   136 brown      brown      blue            NA   male  mascu…
    #> # ℹ 82 more rows
    #> # ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
    #> #   vehicles <list>, starships <list>

    # Count and mean mass per species, keeping species with n > 1 and mean mass > 50
    starwars %>%
      group_by(species) %>%
      summarise(
        n = n(),
        mass = mean(mass, na.rm = TRUE)
      ) %>%
      filter(n > 1, mass > 50)
    #> # A tibble: 8 × 3
    #>   species      n  mass
    #>   <chr>    <int> <dbl>
    #> 1 Droid        6  69.8
    #> 2 Gungan       3  74
    #> 3 Human       35  82.8
    #> 4 Kaminoan     2  88
    #> 5 Mirialan     2  53.

Figure 1: Happy R adapted from artwork by the beach and cocktail images are from ,
