r Averaging the elements of a list
Suppose I have a list of three data frames, as shown below:
[[1]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.8757711 -0.8407402 2.592282
2 1991-1995 F 0.0000000 0.0000000 0.000000
3 1996-2000 M 2.2119671 -0.8536629 5.277597
4 1996-2000 F 2.8254349 -0.3718457 6.022715
5 2001-2005 M 7.7695653 2.6460791 12.893051
6 2001-2005 F 2.2710074 -0.3108077 4.852822
7 2006-2010 M 12.1639403 6.1435827 18.184298
8 2006-2010 F 6.3637686 2.5667028 10.160834
[[2]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.000000 0.0000000 0.000000
2 1991-1995 F 0.000000 0.0000000 0.000000
3 1996-2000 M 2.211967 -0.8536629 5.277597
4 1996-2000 F 2.825435 -0.3718457 6.022715
5 2001-2005 M 8.599076 3.2238115 13.974341
6 2001-2005 F 1.517900 -0.6003366 3.636137
7 2006-2010 M 13.485237 7.1911854 19.779289
8 2006-2010 F 5.991342 2.2651006 9.717582
[[3]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.000000 0.0000000 0.000000
2 1991-1995 F 0.000000 0.0000000 0.000000
3 1996-2000 M 3.317951 -0.4366640 7.072565
4 1996-2000 F 1.883623 -0.7269454 4.494192
5 2001-2005 M 7.643263 2.6144621 12.672065
6 2001-2005 F 2.366219 -0.3266446 5.059082
7 2006-2010 M 13.637280 7.2795528 19.995008
8 2006-2010 F 5.991342 2.2651006 9.717582
What is an efficient way to average columns 3 to 5 (Estimate, ci.lower, ci.upper) element-wise across the three data frames, keeping each year group and gender separate?
This is what I am expecting to achieve:
year Gender Estimate L.C.L U.C.L
1991-1995 M 0.2919237 -0.280246733 0.864094
1991-1995 F 0 0 0
1996-2000 M 2.580628367 -0.714663267 5.875919667
1996-2000 F 2.511497633 -0.490212267 5.513207333
2001-2005 M 8.0039681 2.828117567 13.179819
2001-2005 F 2.0517088 -0.4125963 4.516013667
2006-2010 M 13.09548577 6.8714403 19.31953167
2006-2010 F 6.1154842 2.365634667 9.865332667
Any advice is much appreciated, thanks. Below is the output of dput() on my list.
templist <- list(structure(list(yeargp = structure(c(1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L), .Label = c("1991-1995", "1996-2000", "2001-2005",
"2006-2010"), class = "factor"), gender = structure(c(1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0.875771052955988, 0, 2.2119670520759, 2.82543488793347,
7.76956525829443, 2.27100738124732, 12.1639402903974, 6.36376856610303
), ci.lower = c(-0.840740210837749, 0, -0.853662876400907,
-0.371845674593782, 2.64607905876294, -0.310807679155956,
6.14358267928312, 2.56670275678554), ci.upper = c(2.59228231674973,
0, 5.2775969805527, 6.02271545046073, 12.8930514578259, 4.85282244165059,
18.1842979015118, 10.1608343754205)), .Names = c("yeargp",
"gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA,
-8L), class = "data.frame"), structure(list(yeargp = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000",
"2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0, 0, 2.2119670520759, 2.82543488793347, 8.59907630432197,
1.51790034439859, 13.4852371898016, 5.9913415231189), ci.lower = c(0,
0, -0.853662876400907, -0.371845674593782, 3.2238114821611,
-0.600336642205772, 7.19118540022504, 2.26510058455415),
ci.upper = c(0, 0, 5.2775969805527, 6.02271545046073, 13.9743411264828,
3.63613733100296, 19.7792889793781, 9.71758246168364)), .Names = c("yeargp",
"gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA,
-8L), class = "data.frame"), structure(list(yeargp = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000",
"2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0, 0, 3.31795057811384, 1.88362325862232,
7.6432632822894, 2.36621893284824, 13.6372803202135, 5.9913415231189
), ci.lower = c(0, 0, -0.436663954372684, -0.726945388947865,
2.6144620600312, -0.32664459212626, 7.27955279689059, 2.26510058455415
), ci.upper = c(0, 0, 7.07256511060037, 4.4941919061925,
12.6720645045476, 5.05908245782275, 19.9950078435365, 9.71758246168364
)), .Names = c("yeargp", "gender", "Estimate", "ci.lower",
"ci.upper"), row.names = c(NA, -8L), class = "data.frame"))
4 Answers
Short and sweet:
df <- do.call(rbind, templist)
aggregate(df[3:5], df[1:2], mean)
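One thing to watch for: aggregate() typically returns rows with the first grouping variable varying fastest, so the row order can differ from the table in the question. Below is a minimal sketch of restoring the year-then-gender order. The list `toy` and its values are made up for illustration; they are a stand-in for templist, not the data from the question.

```r
# Hypothetical toy stand-in for templist: two data frames with the same keys
toy <- list(
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(1, 2, 3, 4),
             ci.lower = c(0, 1, 2, 3)),
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(3, 4, 5, 6),
             ci.lower = c(2, 3, 4, 5))
)

# Stack the data frames, then average the numeric columns per key pair
df  <- do.call(rbind, toy)
res <- aggregate(df[3:4], df[1:2], mean)

# Sort by year group first, then gender, and reset the row names
res <- res[order(res$yeargp, res$gender), ]
rownames(res) <- NULL
res
```

Because gender is a factor with levels in the order M, F, order() places M before F within each year group, as in the expected output.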
Here is one possible dplyr solution yielding exactly the outcome you expect:
library(dplyr)
# binding the data together
bind_rows(templist[[1]], templist[[2]], templist[[3]]) %>%
# grouping by year and gender
group_by(yeargp, gender) %>%
# computing necessary averages with names wanted
summarise(
Estimate = mean(Estimate),
L.C.L = mean(ci.lower),
U.C.L = mean(ci.upper)
) %>%
# renaming year and gender as your expected output
rename(
year = yeargp,
Gender = gender
)
# # A tibble: 8 x 5
# # Groups: year [4]
# year Gender Estimate L.C.L U.C.L
# <fct> <fct> <dbl> <dbl> <dbl>
# 1 1991-1995 M 0.292 -0.280 0.864
# 2 1991-1995 F 0 0 0
# 3 1996-2000 M 2.58 -0.715 5.88
# 4 1996-2000 F 2.51 -0.490 5.51
# 5 2001-2005 M 8.00 2.83 13.2
# 6 2001-2005 F 2.05 -0.413 4.52
# 7 2006-2010 M 13.1 6.87 19.3
# 8 2006-2010 F 6.12 2.37 9.87
For the mean of each element:
lapply(templist, function(x) apply(x[,3:5], 2, mean))
[[1]]
Estimate ci.lower ci.upper
4.310 1.122 7.498
[[2]]
Estimate ci.lower ci.upper
4.329 1.357 7.301
[[3]]
Estimate ci.lower ci.upper
4.355 1.334 7.376
global means:
apply(data.frame(lapply(templist, function(x) apply(x[,3:5], 2, mean))),1,mean)
Estimate ci.lower ci.upper
4.331 1.271 7.392
This collapses the estimates across years and gender; I wanted the estimates kept separate by year and gender.
– Sundown Brownbear
Aug 10 at 21:35
We can do this with Reduce to get the sum of the corresponding elements of the numeric columns, divide by the length of the list, and cbind the result with the first 2 columns of one of the list elements:
cbind(templist[[1]][1:2], Reduce(`+`, lapply(templist, `[`, 3:5)) / length(templist))
# yeargp gender Estimate ci.lower ci.upper
#1 1991-1995 M 0.2919237 -0.2802467 0.8640941
#2 1991-1995 F 0.0000000 0.0000000 0.0000000
#3 1996-2000 M 2.5806282 -0.7146632 5.8759197
#4 1996-2000 F 2.5114977 -0.4902122 5.5132076
#5 2001-2005 M 8.0039683 2.8281175 13.1798190
#6 2001-2005 F 2.0517089 -0.4125963 4.5160141
#7 2006-2010 M 13.0954859 6.8714403 19.3195316
#8 2006-2010 F 6.1154839 2.3656346 9.8653331
This assumes that 'yeargp' and 'gender' are the same, and in the same row order, in all the list elements.
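The Reduce approach adds rows by position, so it gives wrong averages if the key columns are not aligned. A minimal sketch of guarding against that, with made-up data: `toy` is a hypothetical stand-in for templist, and `keys_aligned` is a helper name invented here for illustration.

```r
# Hypothetical toy stand-in for templist: two data frames with the same keys
toy <- list(
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(1, 2, 3, 4),
             ci.lower = c(0, 1, 2, 3)),
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(3, 4, 5, 6),
             ci.lower = c(2, 3, 4, 5))
)

# TRUE only if every element has the same first two columns, row for row,
# as the first element (illustrative helper, not part of any package)
keys_aligned <- function(lst) {
  all(vapply(lst, function(d) identical(d[1:2], lst[[1]][1:2]), logical(1)))
}

# Only average by position once the keys are known to line up
stopifnot(keys_aligned(toy))
avg <- cbind(toy[[1]][1:2],
             Reduce(`+`, lapply(toy, `[`, 3:4)) / length(toy))
avg

# A list whose second element has its first two rows swapped fails the check
shuffled <- toy
shuffled[[2]] <- shuffled[[2]][c(2, 1, 3, 4), ]
keys_aligned(shuffled)
```

If the check fails, a grouped approach (aggregate or dplyr) is the safer choice, since it matches rows by key rather than by position.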
Or using tidyverse with group_by_at and summarise_all:
library(tidyverse)
templist %>%
bind_rows %>%
group_by_at(1:2) %>%
summarise_all(mean)
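In newer dplyr releases (1.0.0 and later, as far as I know), the scoped verbs group_by_at and summarise_all are superseded by across(). A minimal sketch of the same computation in that style, assuming dplyr >= 1.0.0 is installed and using a made-up list `toy` as a hypothetical stand-in for templist:

```r
library(dplyr)

# Hypothetical toy stand-in for templist: two data frames with the same keys
toy <- list(
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(1, 2, 3, 4),
             ci.lower = c(0, 1, 2, 3)),
  data.frame(yeargp   = factor(c("A", "A", "B", "B")),
             gender   = factor(c("M", "F", "M", "F"), levels = c("M", "F")),
             Estimate = c(3, 4, 5, 6),
             ci.lower = c(2, 3, 4, 5))
)

res <- toy %>%
  bind_rows() %>%
  # group by name rather than by column position
  group_by(yeargp, gender) %>%
  # everything() here covers all non-grouping columns
  summarise(across(everything(), mean), .groups = "drop")
res
```

Grouping by name instead of position keeps the code working if the column order changes, and `.groups = "drop"` returns an ungrouped tibble.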
that is correct
– Sundown Brownbear
Aug 10 at 21:35
Thanks, that did the magic.
– Sundown Brownbear
Aug 10 at 21:34