r Averaging the elements of a list




Suppose I have a list of three data frames, as shown below:


[[1]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.8757711 -0.8407402 2.592282
2 1991-1995 F 0.0000000 0.0000000 0.000000
3 1996-2000 M 2.2119671 -0.8536629 5.277597
4 1996-2000 F 2.8254349 -0.3718457 6.022715
5 2001-2005 M 7.7695653 2.6460791 12.893051
6 2001-2005 F 2.2710074 -0.3108077 4.852822
7 2006-2010 M 12.1639403 6.1435827 18.184298
8 2006-2010 F 6.3637686 2.5667028 10.160834

[[2]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.000000 0.0000000 0.000000
2 1991-1995 F 0.000000 0.0000000 0.000000
3 1996-2000 M 2.211967 -0.8536629 5.277597
4 1996-2000 F 2.825435 -0.3718457 6.022715
5 2001-2005 M 8.599076 3.2238115 13.974341
6 2001-2005 F 1.517900 -0.6003366 3.636137
7 2006-2010 M 13.485237 7.1911854 19.779289
8 2006-2010 F 5.991342 2.2651006 9.717582

[[3]]
yeargp gender Estimate ci.lower ci.upper
1 1991-1995 M 0.000000 0.0000000 0.000000
2 1991-1995 F 0.000000 0.0000000 0.000000
3 1996-2000 M 3.317951 -0.4366640 7.072565
4 1996-2000 F 1.883623 -0.7269454 4.494192
5 2001-2005 M 7.643263 2.6144621 12.672065
6 2001-2005 F 2.366219 -0.3266446 5.059082
7 2006-2010 M 13.637280 7.2795528 19.995008
8 2006-2010 F 5.991342 2.2651006 9.717582



What is an efficient way to average each element of columns 3 to 5 (Estimate, ci.lower, ci.upper) across the three list elements?



This is what I am expecting to achieve:


year Gender Estimate L.C.L U.C.L
1991-1995 M 0.2919237 -0.280246733 0.864094
1991-1995 F 0 0 0
1996-2000 M 2.580628367 -0.714663267 5.875919667
1996-2000 F 2.511497633 -0.490212267 5.513207333
2001-2005 M 8.0039681 2.828117567 13.179819
2001-2005 F 2.0517088 -0.4125963 4.516013667
2006-2010 M 13.09548577 6.8714403 19.31953167
2006-2010 F 6.1154842 2.365634667 9.865332667



Any advice is much appreciated, thanks. Below is the output of dput() for my list.


templist <- list(structure(list(yeargp = structure(c(1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L), .Label = c("1991-1995", "1996-2000", "2001-2005",
"2006-2010"), class = "factor"), gender = structure(c(1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0.875771052955988, 0, 2.2119670520759, 2.82543488793347,
7.76956525829443, 2.27100738124732, 12.1639402903974, 6.36376856610303
), ci.lower = c(-0.840740210837749, 0, -0.853662876400907,
-0.371845674593782, 2.64607905876294, -0.310807679155956,
6.14358267928312, 2.56670275678554), ci.upper = c(2.59228231674973,
0, 5.2775969805527, 6.02271545046073, 12.8930514578259, 4.85282244165059,
18.1842979015118, 10.1608343754205)), .Names = c("yeargp",
"gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA,
-8L), class = "data.frame"), structure(list(yeargp = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000",
"2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0, 0, 2.2119670520759, 2.82543488793347, 8.59907630432197,
1.51790034439859, 13.4852371898016, 5.9913415231189), ci.lower = c(0,
0, -0.853662876400907, -0.371845674593782, 3.2238114821611,
-0.600336642205772, 7.19118540022504, 2.26510058455415),
ci.upper = c(0, 0, 5.2775969805527, 6.02271545046073, 13.9743411264828,
3.63613733100296, 19.7792889793781, 9.71758246168364)), .Names = c("yeargp",
"gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA,
-8L), class = "data.frame"), structure(list(yeargp = structure(c(1L,
1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000",
"2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L,
2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"),
Estimate = c(0, 0, 3.31795057811384, 1.88362325862232,
7.6432632822894, 2.36621893284824, 13.6372803202135, 5.9913415231189
), ci.lower = c(0, 0, -0.436663954372684, -0.726945388947865,
2.6144620600312, -0.32664459212626, 7.27955279689059, 2.26510058455415
), ci.upper = c(0, 0, 7.07256511060037, 4.4941919061925,
12.6720645045476, 5.05908245782275, 19.9950078435365, 9.71758246168364
)), .Names = c("yeargp", "gender", "Estimate", "ci.lower",
"ci.upper"), row.names = c(NA, -8L), class = "data.frame"))




4 Answers



Short and sweet:


df <- do.call(rbind, templist)
aggregate(df[3:5], df[1:2], mean)
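As a rough illustration of what these two lines do, here is a minimal sketch on made-up data. The `toy` list and its `grp`, `sex`, and `x` columns are hypothetical stand-ins, not the asker's data. It stacks the data frames, averages per group, and then sorts the result, because `aggregate` returns its rows with the first grouping variable varying fastest rather than in the input order.

```r
# Hypothetical toy data: two data frames with identical grouping columns.
toy <- list(
  data.frame(grp = c("a", "a", "b", "b"),
             sex = c("M", "F", "M", "F"),
             x   = c(1, 2, 3, 4)),
  data.frame(grp = c("a", "a", "b", "b"),
             sex = c("M", "F", "M", "F"),
             x   = c(3, 4, 5, 6))
)

# Stack all list elements into one long data frame.
stacked <- do.call(rbind, toy)

# Average column 3 within each combination of columns 1 and 2.
res <- aggregate(stacked[3], stacked[1:2], mean)

# Sort by the first grouping column, then the second.
sorted <- res[order(res$grp, res$sex), ]
print(sorted)
```

With the asker's data the same sort would be `order(res$yeargp, res$gender)`. Because `gender` is a factor with levels M then F, M should come first within each year group.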





thanks that did the magic.
– Sundown Brownbear
Aug 10 at 21:34



Here is one possible dplyr solution yielding the exact same outcome you expect:




library(dplyr)

# binding the data together (bind_rows accepts the list directly)
bind_rows(templist) %>%
  # grouping by year and gender
  group_by(yeargp, gender) %>%
  # computing necessary averages with names wanted
  summarise(
    Estimate = mean(Estimate),
    L.C.L = mean(ci.lower),
    U.C.L = mean(ci.upper)
  ) %>%
  # renaming year and gender as in your expected output
  rename(
    year = yeargp,
    Gender = gender
  )

# # A tibble: 8 x 5
# # Groups: year [4]
# year Gender Estimate L.C.L U.C.L
# <fct> <fct> <dbl> <dbl> <dbl>
# 1 1991-1995 M 0.292 -0.280 0.864
# 2 1991-1995 F 0 0 0
# 3 1996-2000 M 2.58 -0.715 5.88
# 4 1996-2000 F 2.51 -0.490 5.51
# 5 2001-2005 M 8.00 2.83 13.2
# 6 2001-2005 F 2.05 -0.413 4.52
# 7 2006-2010 M 13.1 6.87 19.3
# 8 2006-2010 F 6.12 2.37 9.87



For the mean of each element:


lapply(templist, function(x) apply(x[,3:5], 2, mean))
[[1]]
Estimate ci.lower ci.upper
4.310 1.122 7.498

[[2]]
Estimate ci.lower ci.upper
4.329 1.357 7.301

[[3]]
Estimate ci.lower ci.upper
4.355 1.334 7.376



Global means:


apply(data.frame(lapply(templist, function(x) apply(x[,3:5], 2, mean))),1,mean)
Estimate ci.lower ci.upper
4.331 1.271 7.392





This collapses the estimates across years and gender; I wanted the estimates kept separate by year and gender.
– Sundown Brownbear
Aug 10 at 21:35



We can do this with Reduce: sum the corresponding elements of the numeric columns, divide by the length of the list, and cbind the result with the first two columns of one of the list elements.




cbind(templist[[1]][1:2], Reduce(`+`, lapply(templist, `[`, 3:5)) / length(templist))
# yeargp gender Estimate ci.lower ci.upper
#1 1991-1995 M 0.2919237 -0.2802467 0.8640941
#2 1991-1995 F 0.0000000 0.0000000 0.0000000
#3 1996-2000 M 2.5806282 -0.7146632 5.8759197
#4 1996-2000 F 2.5114977 -0.4902122 5.5132076
#5 2001-2005 M 8.0039683 2.8281175 13.1798190
#6 2001-2005 F 2.0517089 -0.4125963 4.5160141
#7 2006-2010 M 13.0954859 6.8714403 19.3195316
#8 2006-2010 F 6.1154839 2.3656346 9.8653331



This assumes that 'yeargp' and 'gender' are identical, and in the same row order, in all the list elements.
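Since the Reduce approach depends on that assumption, it may be worth checking it before averaging. The following is a minimal sketch on made-up data. The `toy` list and its `grp`, `x`, and `y` columns are hypothetical, and the check shown is just one possible way to verify the key columns.

```r
# Hypothetical toy data: three data frames sharing the same key column.
toy <- list(
  data.frame(grp = c("a", "b"), x = c(1, 3), y = c(10, 30)),
  data.frame(grp = c("a", "b"), x = c(3, 5), y = c(20, 50)),
  data.frame(grp = c("a", "b"), x = c(5, 7), y = c(30, 70))
)

# Key columns of the first element, used as the reference.
keys <- toy[[1]]["grp"]

# Verify that every element has exactly the same keys in the same order.
same_keys <- all(vapply(toy, function(d) identical(d["grp"], keys), logical(1)))
stopifnot(same_keys)

# Element-wise sum of the numeric columns, divided by the number of elements.
num_sum <- Reduce(`+`, lapply(toy, `[`, c("x", "y")))
avg <- cbind(keys, num_sum / length(toy))
print(avg)
```

If the check fails, a grouping-based approach (rbind plus aggregate, or a dplyr group_by) is the safer choice, because it matches rows by key rather than by position.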





Or, using tidyverse, with group_by_at and summarise_all:




library(tidyverse)
templist %>%
  bind_rows %>%
  group_by_at(1:2) %>%
  summarise_all(mean)
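In dplyr 1.0.0 and later, the scoped verbs group_by_at and summarise_all are superseded by across(). A minimal sketch of the same pipeline in that style is below, assuming a recent dplyr is installed. The `toy` list and its column names are made up for illustration and are not the asker's data.

```r
library(dplyr)

# Hypothetical toy data: two data frames with two key columns each.
toy <- list(
  data.frame(grp = c("a", "b"), sex = c("M", "M"), x = c(1, 3), y = c(10, 30)),
  data.frame(grp = c("a", "b"), sex = c("M", "M"), x = c(3, 5), y = c(20, 50))
)

res <- toy %>%
  bind_rows() %>%
  # group by the first two columns, selected by position
  group_by(across(1:2)) %>%
  # average every remaining (non-grouping) column
  summarise(across(everything(), mean), .groups = "drop")

print(res)
```

With the asker's list, replacing `toy` with `templist` should give the per-year, per-gender averages, as the original pipeline does.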





that is correct
– Sundown Brownbear
Aug 10 at 21:35





