Allocate and free memory in the loop (C + MPI)
Clash Royale CLAN TAG#URR8PPP
Allocate and free memory in the loop (C + MPI)
Please see my following code snippet (floatalloc2 is for allocating a 2D contiguous array with datatype float, see Appendix if interested):
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
int main(int argc, char *argv)
float **p=NULL, **buffer=NULL;
int it, nt=3, i, j, k, NP, MYID, nx=1, nz=2, nsrc=3, isrc;
MPI_Init ( &argc, &argv );
MPI_Comm_size ( MPI_COMM_WORLD, &NP );
MPI_Comm_rank ( MPI_COMM_WORLD, &MYID );
p = floatalloc2(nx,nz);
memset(p[0],0,nz*nx*sizeof(float));
for (it=0; it<nt; it++)
for (isrc=MYID; isrc<nsrc; isrc+=NP)
for (j=0; j<nz; j++)
for (i=0; i<nx; i++)
p[j][i] += 1.5 + (float)(isrc) + (float)(j);
for (k=0;k<nsrc-1;k++)
if (MYID==k)
buffer = floatalloc2(nx,nz);
memset(buffer[0],0,nz*nx*sizeof(float));
buffer = p;
else
buffer = floatalloc2(nx,nz);
memset(buffer[0],0,nz*nx*sizeof(float));
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&buffer[0][0],nx*nz,MPI_FLOAT,k,MPI_COMM_WORLD);
MPI_Barrier(MPI_COMM_WORLD);
for (j=0; j<nz; j++)
for (i=0; i<nx; i++)
printf("it=%d,k=%d,Node %d,p[%d][%d]=%fn",it,k,MYID,j,i,p[j][i]);
free(*buffer);free(buffer); /*w/o this line is ok while not ok with this line */
MPI_Finalize();
exit(0);
As a thumb of rule, C frees memory once it was allocated, even in the loop. But here, if I don't add free(*buffer);free(buffer);
, it's quite good. However, if with the free
, the results are wrong. So what's wrong in my code?
free(*buffer);free(buffer);
free
Appendix for floatalloc2:
/*@out@*/ void *sf_alloc (size_t n, size_t size )
/*< output-checking allocation >*/
void *ptr;
size *= n;
if (0>=size) sf_error("%s: illegal allocation (%d bytes)",__FILE__,size);
ptr = malloc (size);
if (NULL == ptr)
sf_error ("%s: cannot allocate %lu bytes:", __FILE__,size);
return ptr;
/*@out@*/ float *sf_floatalloc (size_t n)
/*< float allocation >*/
float *ptr;
ptr = (float*) sf_alloc (n,sizeof(float));
return ptr;
/*@out@*/ float **floatalloc2 (size_t n1 , size_t n2 )
/*< float 2-D allocation, out[0] points to a contiguous array >*/
size_t i2;
float **ptr;
ptr = (float**) sf_alloc (n2,sizeof(float*));
ptr[0] = sf_floatalloc (n1*n2);
for (i2=1; i2 < n2; i2++)
ptr[i2] = ptr[0]+i2*n1;
return ptr;
floatalloc2
floatunalloc2
free
@Bathsheba I've added an appendix for
floatalloc2
– coco
Aug 10 at 14:36
floatalloc2
There must be a library function to
free
this for you. Even if you manage to replicate that, there's no guarantee it will work in subsequent versions of whatever library you're using.– Bathsheba
Aug 10 at 14:38
free
Since you are using MPI, you are probably interested in high-performance computing. In that case, you should know that double indirection (
float **p
, i.e. p[row][column]
referring to an element, with p
and p[row]
being pointers) is much slower than using a linear array with index arithmetic, i.e. float *all;
with all[row*columns + column]
referring to an element, on current architectures. This means that this approach will never be as efficient as a linear array would be.– Nominal Animal
Aug 10 at 14:38
float **p
p[row][column]
p
p[row]
float *all;
all[row*columns + column]
@NominalAnimal Thanks for your comment. At current stage my goal is to get the code run correctly. I'll think about efficiency at next stage.
– coco
Aug 10 at 14:44
2 Answers
2
Let's say we use float **p = floatalloc2(columns, rows)
. Then, p
is allocated for rows
float pointers (float *
), and p[0]
for columns*rows
floats. If the library does not provide a floatfree2()
or free2()
function, you can free the memory allocated for the 2D array using free(*p); free(p);
, in that order. This is what OP is doing, too.
float **p = floatalloc2(columns, rows)
p
rows
float *
p[0]
columns*rows
floatfree2()
free2()
free(*p); free(p);
The inner loop,
for (k=0;k<nsrc-1;k++)
if (MYID==k)
buffer = floatalloc2(nx,nz);
memset(buffer[0],0,nz*nx*sizeof(float));
buffer = p;
else
buffer = floatalloc2(nx,nz);
memset(buffer[0],0,nz*nx*sizeof(float));
...
free(*buffer);free(buffer); /*w/o this line is ok while not ok with this line */
The buffer = p;
line replaces the 2D array allocated and initialized on the preceding two lines; this is essentially a memory leak. I suppose the idea was to copy the contents of the p
to it, but I cannot be sure.
buffer = p;
p
Furthermore, later on, the free(*buffer); free(buffer);
line ends up freeing the 2D array described by p
on the k == MYID
iteration.
free(*buffer); free(buffer);
p
k == MYID
One approach to solve this is to let buffer
alias p
when k == MYID
:
buffer
p
k == MYID
for (k = 0; k < nsrc-1; k++)
if (MYID == k)
buffer = p;
else
buffer = floatalloc2(nx,nz);
memset(buffer[0], 0, nz*nx*sizeof(float));
...
if (buffer != p)
free(*buffer);
free(buffer);
This way, when k == MYID
, buffer
actually points to p
; otherwise it is dynamically allocated and freed for each iteration. Obviously, when buffer
aliases p
, we do not free it, because that would free p
.
k == MYID
buffer
p
buffer
p
p
You could add free(*p); free(p);
just before MPI_Finalize()
, to free the memory allocated for p
too, but it isn't strictly necessary because the process is about to exit anyway. (It is useful, however, if you use e.g. Valgrind to look for memory leaks, or wish to show that you (the programmer) do track dynamic allocations correctly; in that case, adding a comment there is probably useful.)
free(*p); free(p);
MPI_Finalize()
p
A much better approach is to allocate the buffer only once, and then reuse it for each iteration:
float **p, **buffer;
p = floatalloc2(nx, nz);
memset(p[0], 0, nz*nx*sizeof p[0][0]);
buffer = floatalloc2(nx, nz);
memset(buffer[0], 0, nz*nx*sizeof buffer[0][0]);
for (it=0; it < nt; it++)
for (isrc=MYID; isrc < nsrc; isrc+=NP)
for (j=0; j<nz; j++)
for (i=0; i<nx; i++)
p[j][i] += 1.5 + (float)(isrc) + (float)(j);
for (k=0; k<nsrc-1; k++)
float **data;
if (MYID == k)
data = p;
else
data = buffer;
memset(buffer[0], 0, nz*nx*sizeof buffer[0][0]);
MPI_Barrier(MPI_COMM_WORLD);
MPI_Bcast(&(data[0][0]), nx*nz, MPI_FLOAT, k, MPI_COMM_WORLD);
MPI_Barrier(MPI_COMM_WORLD);
for (j=0; j<nz; j++)
for (i=0; i<nx; i++)
printf("it=%d,k=%d,Node %d,data[%d][%d]=%f, p[%d][%d]=%fn",
it, k, MYID, j, i, data[j][i], p[j][i]);
free(*buffer);
free(buffer);
free(*p);
free(p);
Note that the sizeof p[0][0]
and sizeof buffer[0][0]
are statements that evaluate to the size (in chars) of each element of p
/buffer
, and do not actually examine any memory. Those are allowed, and safe, even if p == NULL
or p[0] == NULL
, because the compiler simply examines the type of the expression (to the right of the sizeof
operator). To remind myself of that, I never use parentheses around the variable reference on the right of the sizeof
operator. (When you use a type, like float
, the parentheses are required. Then, I write it as sizeof (float)
.) This helps me remember that sizeof
is an operator, and does not behave like a function, although it may look like a function.
sizeof p[0][0]
sizeof buffer[0][0]
p
buffer
p == NULL
p[0] == NULL
sizeof
sizeof
float
sizeof (float)
sizeof
The float **data;
is just a reference (or alias) to the actual data in p
or buffer
; all we need to use it is to assign it, either data = p;
or data = buffer;
. Because it is a reference or alias, pointing to exactly the same data, we do not free()
it at all. If we did, we'd just end up freeing the original data, p
or buffer
.
float **data;
p
buffer
data = p;
data = buffer;
free()
p
buffer
In a comment to the question, I mentioned that using double indirection (float **data
, where data[row][column]
is an element, and both data[row]
and data
are pointers) is not very efficient. In practice, a structure is used instead. For example:
float **data
data[row][column]
data[row]
data
typedef struct
int rows;
int cols;
float *data;
float2d;
#define FLOAT2D_INIT 0, 0, NULL
static inline void float2d_alloc(float2d *m, const int rows, const int cols)
do
if (!m)
fprintf(stderr, "float2d_alloc(): No matrix specified.n");
break;
if (rows < 1 while (0);
/* MPI_Abort(MPI_COMM_WORLD, 1); */
exit(1);
static inline void float2d_free(float2d *m)
if (m)
free(m->data);
m->rows = 0;
m->cols = 0;
m->data = NULL;
If you have float2d p;
allocated, then to access the element on row r
, column c
, you use p.data[r*p.cols + c]
. All the data is consecutive in memory; there being p.cols*p.rows
elements, with total size being p.cols*p.rows*sizeof p.data[0]
. To refer to the array of floats on row r
, you can use p.data + r*p.rows
(which is equivalent to &(p.data[r*p.rows])
).
float2d p;
r
c
p.data[r*p.cols + c]
p.cols*p.rows
p.cols*p.rows*sizeof p.data[0]
r
p.data + r*p.rows
&(p.data[r*p.rows])
If you are interested in this approach, and work a lot with matrices (or dense 2D arrays of data), you might wish to look here for an example of structures that takes this a step further. It basically allows one to create realtime views (references to the exact same data) to any regular rectangular part (can be e.g. diagonal, a submatrix, some consecutive odd rows, even rows, etc.) of another matrix, with the code keeping tabs (counting references to) the data, so that when you destroy a matrix that was the final user of some data, the data gets freed automatically. Each element access does require two multiplications and one addition instead of just one multiplication and one addition, but it turns out to be neglible overhead in practice, and the versatility of the matrix structures definitely balances that out.
Very Excellent and detailed answer. As you said,
A much better approach is to allocate the buffer only once, and then reuse it for each iteration
. I found that if I allocate the buffer only once, and in the for loop, I use for (j=0; j<nz; j++) for (i=0; i<nx; i++) buffer[j][i] = p[j][i];
(so called deep copy) can also get the expected results without freeing buffer
– coco
Aug 10 at 15:52
A much better approach is to allocate the buffer only once, and then reuse it for each iteration
for (j=0; j<nz; j++) for (i=0; i<nx; i++) buffer[j][i] = p[j][i];
buffer
"I suppose the idea was to copy the contents of the p to it, but I cannot be sure." Yes, exactly, you're right.
– coco
Aug 10 at 16:01
I read your answer word by word. It's really great answer and I love it. Thanks.
– coco
Aug 10 at 16:27
But I have a small question, you defined the alias
float **data
in the code body. Can we do this? Because I was taught variables are defined in the very beginning of the code.– coco
Aug 10 at 23:14
float **data
@coco: Yes. Even in C89, you can declare variables at the start of each block (
...
).– Nominal Animal
Aug 11 at 16:59
...
A different approach is to use VLAs:
#include <stdio.h>
#include <stdlib.h>
void print_matrix(int nx, int ny, int a[nx][ny])
printf("Printing matrix at %pn", &a[0][0]);
for (int i=0; i < nx; i++)
printf("%d", a[i][0]);
for (int j=1; j < ny; j++)
printf(", %d", a[i][j]);
printf("n");
int main(int argc, char *argv)
int nx = 3, nz = 2;
/* two different syntaxes */
int (*A)[nx][nz] = malloc(sizeof(int[nx][nz]));
int (*B)[nz] = malloc(sizeof(int[nx][nz]));
for (int i = 0; i < nx; i++)
for (int k = 0; k < nz; k++)
(*A)[i][k] = (i+1)*(k+1);
B[i][k] = (i+1)*(k+1);
print_matrix(nx, nz, *A);
print_matrix(nx, nz, B);
free(A);
free(B);
I strongly recommend the free book "Modern C" available here: https://gustedt.wordpress.com/2016/11/25/modern-c-is-now-feature-complete/
Tip:
int (*A)[nx][nz] = malloc(sizeof(int[nx][nz]));
may be simplified to int (*A)[nx][nz] = malloc(sizeof *A);
and int (*B)[nz] = malloc(sizeof *B * nx);
– chux
Aug 10 at 16:47
int (*A)[nx][nz] = malloc(sizeof(int[nx][nz]));
int (*A)[nx][nz] = malloc(sizeof *A);
int (*B)[nz] = malloc(sizeof *B * nx);
@chux, thanks for highlighting that. I go back and forth on which syntax I prefer.
– ptb
Aug 10 at 17:33
By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.
Doesn't the library that provides
floatalloc2
not provide something likefloatunalloc2
? Most likely yourfree
calls are not in correspondence with the memory allocations.– Bathsheba
Aug 10 at 14:31