r Averaging the elements of a list

Suppose I have a list with three objects (data frames), as shown below:

[[1]]
     yeargp gender   Estimate   ci.lower  ci.upper
1 1991-1995      M  0.8757711 -0.8407402  2.592282
2 1991-1995      F  0.0000000  0.0000000  0.000000
3 1996-2000      M  2.2119671 -0.8536629  5.277597
4 1996-2000      F  2.8254349 -0.3718457  6.022715
5 2001-2005      M  7.7695653  2.6460791 12.893051
6 2001-2005      F  2.2710074 -0.3108077  4.852822
7 2006-2010      M 12.1639403  6.1435827 18.184298
8 2006-2010      F  6.3637686  2.5667028 10.160834

[[2]]
     yeargp gender  Estimate   ci.lower  ci.upper
1 1991-1995      M  0.000000  0.0000000  0.000000
2 1991-1995      F  0.000000  0.0000000  0.000000
3 1996-2000      M  2.211967 -0.8536629  5.277597
4 1996-2000      F  2.825435 -0.3718457  6.022715
5 2001-2005      M  8.599076  3.2238115 13.974341
6 2001-2005      F  1.517900 -0.6003366  3.636137
7 2006-2010      M 13.485237  7.1911854 19.779289
8 2006-2010      F  5.991342  2.2651006  9.717582

[[3]]
     yeargp gender  Estimate   ci.lower  ci.upper
1 1991-1995      M  0.000000  0.0000000  0.000000
2 1991-1995      F  0.000000  0.0000000  0.000000
3 1996-2000      M  3.317951 -0.4366640  7.072565
4 1996-2000      F  1.883623 -0.7269454  4.494192
5 2001-2005      M  7.643263  2.6144621 12.672065
6 2001-2005      F  2.366219 -0.3266446  5.059082
7 2006-2010      M 13.637280  7.2795528 19.995008
8 2006-2010      F  5.991342  2.2651006  9.717582

What is an efficient way to compute, for each row (each combination of year group and gender), the average across the three list elements of columns 3 to 5 (Estimate, ci.lower, ci.upper)?

This is what I am expecting to achieve:

year       Gender  Estimate     L.C.L         U.C.L
1991-1995  M       0.2919237    -0.280246733  0.864094
1991-1995  F       0            0             0
1996-2000  M       2.580628367  -0.714663267  5.875919667
1996-2000  F       2.511497633  -0.490212267  5.513207333
2001-2005  M       8.0039681    2.828117567   13.179819
2001-2005  F       2.0517088    -0.4125963    4.516013667
2006-2010  M       13.09548577  6.8714403     19.31953167
2006-2010  F       6.1154842    2.365634667   9.865332667

Any advice is much appreciated, thanks. Below is the output of dput() on my list.

templist <- list(structure(list(yeargp = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000", "2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"), Estimate = c(0.875771052955988, 0, 2.2119670520759, 2.82543488793347, 7.76956525829443, 2.27100738124732, 12.1639402903974, 6.36376856610303 ), ci.lower = c(-0.840740210837749, 0, -0.853662876400907, -0.371845674593782, 2.64607905876294, -0.310807679155956, 6.14358267928312, 2.56670275678554), ci.upper = c(2.59228231674973, 0, 5.2775969805527, 6.02271545046073, 12.8930514578259, 4.85282244165059, 18.1842979015118, 10.1608343754205)), .Names = c("yeargp", "gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA, -8L), class = "data.frame"), structure(list(yeargp = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000", "2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"), Estimate = c(0, 0, 2.2119670520759, 2.82543488793347, 8.59907630432197, 1.51790034439859, 13.4852371898016, 5.9913415231189), ci.lower = c(0, 0, -0.853662876400907, -0.371845674593782, 3.2238114821611, -0.600336642205772, 7.19118540022504, 2.26510058455415), ci.upper = c(0, 0, 5.2775969805527, 6.02271545046073, 13.9743411264828, 3.63613733100296, 19.7792889793781, 9.71758246168364)), .Names = c("yeargp", "gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA, -8L), class = "data.frame"), structure(list(yeargp = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), .Label = c("1991-1995", "1996-2000", "2001-2005", "2006-2010"), class = "factor"), gender = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("M", "F"), class = "factor"), Estimate = c(0, 0, 3.31795057811384, 1.88362325862232, 7.6432632822894, 2.36621893284824, 13.6372803202135, 5.9913415231189 ), ci.lower = c(0, 0, -0.436663954372684, -0.726945388947865, 
2.6144620600312, -0.32664459212626, 7.27955279689059, 2.26510058455415 ), ci.upper = c(0, 0, 7.07256511060037, 4.4941919061925, 12.6720645045476, 5.05908245782275, 19.9950078435365, 9.71758246168364 )), .Names = c("yeargp", "gender", "Estimate", "ci.lower", "ci.upper"), row.names = c(NA, -8L), class = "data.frame"))

4 Answers

Short and sweet:

df <- do.call(rbind, templist)
aggregate(df[3:5], df[1:2], mean)
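As a minimal sketch of how the rbind-then-aggregate idea works, here is the same pattern on a small made-up list. The names `toy`, `grp`, `sex`, `x` and `y` are hypothetical and are not part of the question's data. Stacking the data frames and aggregating by the key columns should give one averaged row per group:

```r
# Toy stand-in for templist: two data frames with identical key columns.
toy <- list(
  data.frame(grp = c("a", "b"), sex = c("M", "M"), x = c(1, 2), y = c(10, 20)),
  data.frame(grp = c("a", "b"), sex = c("M", "M"), x = c(3, 4), y = c(30, 40))
)

# Stack all list elements into one long data frame.
stacked <- do.call(rbind, toy)

# Average the numeric columns within each (grp, sex) combination.
avg <- aggregate(stacked[c("x", "y")], by = stacked[c("grp", "sex")], FUN = mean)
print(avg)
```

Because the grouping is done by value rather than by row position, this pattern should not depend on the rows appearing in the same order in every list element.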

Thanks, that did the magic.
– Sundown Brownbear
Aug 10 at 21:34

Here is one possible dplyr solution yielding the exact same outcome you expect:

library(dplyr)

# binding the data together
bind_rows(templist[[1]], templist[[2]], templist[[3]]) %>%
  # grouping by year and gender
  group_by(yeargp, gender) %>%
  # computing necessary averages with names wanted
  summarise(
    Estimate = mean(Estimate),
    L.C.L = mean(ci.lower),
    U.C.L = mean(ci.upper)
  ) %>%
  # renaming year and gender as your expected output
  rename(
    year = yeargp,
    Gender = gender
  )

# # A tibble: 8 x 5
# # Groups:   year [4]
#   year      Gender Estimate  L.C.L  U.C.L
#   <fct>     <fct>     <dbl>  <dbl>  <dbl>
# 1 1991-1995 M         0.292 -0.280  0.864
# 2 1991-1995 F         0      0      0
# 3 1996-2000 M         2.58  -0.715  5.88
# 4 1996-2000 F         2.51  -0.490  5.51
# 5 2001-2005 M         8.00   2.83  13.2
# 6 2001-2005 F         2.05  -0.413  4.52
# 7 2006-2010 M        13.1    6.87  19.3
# 8 2006-2010 F         6.12   2.37   9.87

For the mean of each element:

lapply(templist, function(x) apply(x[,3:5], 2, mean))

[[1]]
Estimate ci.lower ci.upper
   4.310    1.122    7.498

[[2]]
Estimate ci.lower ci.upper
   4.329    1.357    7.301

[[3]]
Estimate ci.lower ci.upper
   4.355    1.334    7.376

global means:

apply(data.frame(lapply(templist, function(x) apply(x[,3:5], 2, mean))), 1, mean)

Estimate ci.lower ci.upper
   4.331    1.271    7.392

This collapses the estimates across years and gender; I wanted the estimates kept separate by year and gender.
– Sundown Brownbear
Aug 10 at 21:35

We can do this with Reduce: get the sum of the corresponding elements of the numeric columns, divide by the length of the list, and cbind the result with the first 2 columns of one of the list elements.

cbind(templist[[1]][1:2], Reduce(`+`, lapply(templist, `[`, 3:5))/3)
#     yeargp gender   Estimate   ci.lower   ci.upper
#1 1991-1995      M  0.2919237 -0.2802467  0.8640941
#2 1991-1995      F  0.0000000  0.0000000  0.0000000
#3 1996-2000      M  2.5806282 -0.7146632  5.8759197
#4 1996-2000      F  2.5114977 -0.4902122  5.5132076
#5 2001-2005      M  8.0039683  2.8281175 13.1798190
#6 2001-2005      F  2.0517089 -0.4125963  4.5160141
#7 2006-2010      M 13.0954859  6.8714403 19.3195316
#8 2006-2010      F  6.1154839  2.3656346  9.8653331

This assumes that 'yeargp' and 'gender' are the same, and in the same row order, in all the list elements.
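Since the element-wise sum only makes sense when the key columns line up row by row, it may be worth checking that before averaging. The following is a rough sketch on made-up data; `keys_match` is a hypothetical helper written for this illustration, not a base R function, and the data frames `a`, `b` and `shuffled` are invented:

```r
# Hypothetical helper: TRUE if every data frame in `lst` has the same
# key columns (same values, same row order) as the first one.
keys_match <- function(lst, key_cols = 1:2) {
  ref <- lapply(lst[[1]][key_cols], as.character)
  all(vapply(
    lst,
    function(d) identical(lapply(d[key_cols], as.character), ref),
    logical(1)
  ))
}

# Toy data (made up for illustration).
a <- data.frame(grp = c("x", "y"), sex = c("M", "F"), val = c(1, 2))
b <- data.frame(grp = c("x", "y"), sex = c("M", "F"), val = c(3, 4))
shuffled <- data.frame(grp = c("y", "x"), sex = c("F", "M"), val = c(5, 6))

ok <- keys_match(list(a, b))          # keys line up row by row
bad <- keys_match(list(a, shuffled))  # rows are in a different order

# Only average element-wise when the keys line up.
lst <- list(a, b)
stopifnot(keys_match(lst))
avg <- cbind(lst[[1]][1:2], Reduce(`+`, lapply(lst, `[`, 3)) / length(lst))
print(avg)
```

Dividing by `length(lst)` instead of a hard-coded 3 keeps the sketch valid if the list grows. If the check fails, a group-based approach (stacking the data frames and aggregating by the key columns) is probably the safer choice.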

Or using tidyverse with group_by_at and summarise_all

library(tidyverse)

templist %>%
  bind_rows %>%
  group_by_at(1:2) %>%
  summarise_all(mean)

that is correct
– Sundown Brownbear
Aug 10 at 21:35
