Join values ​​spread across different lines on the same line

Question:

Hi,

I have a database in which I used tidyr::spread which resulted in information spread over several lines. I would like to merge the different lines into one. Example:

df <- data.frame(obs = c(1,1,1,2),
                 msgA = c("A", NA, NA, "A" ),
                 msgB = c(NA,"B",NA, NA),
                 msgC = c(NA,NA,"C", NA))

#O que eu tentei:

library(dplyr)
df %>%
  group_by(obs) %>%
  mutate(msg = paste(msgA, msgB, msgC, sep=", ")) %>%
  select(-c(2:4))

# Resultado que eu queria:
# 
# > df
# obs     msg
# 1     A, B, C
# 2       A

Thanks for the help!

Answer:

has a unite() function that can be used for this.

df %>% 
  tidyr::unite(msg, msgA:msgC, sep = ", ")
  obs       msg
1   1 A, NA, NA
2   1 NA, B, NA
3   1 NA, NA, C
4   2 A, NA, NA

It so happens that unite() is designed for tidy , so the result it generates is not quite what you wanted.

To generate the expected result it is better to use data.frame before spread() and use group_by() combined with summarise() .

library(tidyverse)
tidy <- df %>% 
  gather(tipo, msg, -obs) %>% 
  filter(!is.na(msg))

tidy %>% 
  group_by(obs) %>% 
  summarise(msg = paste(msg, collapse = ", "))
## # A tibble: 2 x 2
##     obs msg    
##   <dbl> <chr>  
## 1     1 A, B, C
## 2     2 A 
Scroll to Top