C problem: 2D arrays using malloc

Code junkies hangout here

Moderators: ChrisThornett, LXF moderators

C problem: 2D arrays using malloc

Postby Crispy » Sun Jan 22, 2012 3:00 pm

Hi,

I am teaching myself C and have been working my way through a book. I have finished the book and I am now starting to create my own program. I need to be able to create an matrix of size NxN however I can't get malloc to allocate the memory for the matrix.

I have tried everything I can think of, but nothing seems to work. I had a quick search and found that malloc doesn't have the capability to allocate 2D arrays, saying you have to allocate each dimension one at a time and also that one of the dimensions needs to be known before hand - which kind of defeats the object for a NxN matrix (having said that even when creating x,y coordinates for each of the N points, it still throws a warning and I wouldn't say I was confident then that the memory was being freed).

This is the code that I have:

Code: Select all
#include <stdio.h>

int N;
double **nodeCoord;

int main()

  /* Define external variables */
  extern int N;
  extern double **nodeCoord;
 
  N = 2;
 
  nodeCoord = (double**) malloc(N);

  free(nodeCoord);
 
  return 0;
}


The above code throws a warning saying:

main.c:32: warning: incompatible implicit declaration of built-in function ‘malloc’
main.c:40: warning: incompatible implicit declaration of built-in function ‘free’

I don't know what this means, but I would assume it's because malloc returns an integer pointer and therefore cannot be cast to a double**. So I tried changing the
Code: Select all
nodeCoord = (double**) malloc(N);

line to
Code: Select all
&nodeCoord = (double*) malloc(N);

I also tried changing it to:
Code: Select all
*nodeCoord = (double*) malloc(N);


However, both of those gave the same warning message. Normally I'm not that worried about warning messages, however when I check to see if the memory isn't free'd simply by outputting some values, it still outputs the correct values even after freeing the memory.

I have also tried the following:
Code: Select all
typedef struct
{
  double *x;
  double *y;
} coord;

then changing the nodeCoord line to
Code: Select all
nodeCoord.x = (double*) malloc(N);
nodeCoord.y = (double*) malloc(N);

where nodeCoord is defined by: coord nodeCoord;. This also gave the same warning message and didn't free the memory correctly - this I find quite strange as surely it shud call malloc return the integer pointer which is then cast to a double pointer, which is exactly what the x member of nodeCoord struct is.

I also tried creating an array of pointers of size 2:
Code: Select all
double *nodeCoord[2];

then use malloc on each element. However this also gave the same warning.

I honestly cannot see what I am doing wrong and am not 100% sure that I can even create a two dimensional array using malloc? Alternatively, does anyone know of an alternate way to create 2D arrays in C?

Any help is greatly appreciated.

Finally, would it be possible for a C tutorial to be covered in Linux Format?

Thanks,
Chris
Crispy
 
Posts: 57
Joined: Mon May 31, 2010 6:35 pm

Postby sledgehammer » Mon Jan 23, 2012 8:33 am

Its been a while since I did some C but I think you are missing a header file, probably malloc.h or similar.
User avatar
sledgehammer
 
Posts: 47
Joined: Fri Jul 28, 2006 2:03 pm

Postby Crispy » Fri Feb 10, 2012 1:19 pm

Thank you for your reply sledgehammer and I apologise for not replying sooner.

You were right, I was missing a library it was stdlib.h, I think malloc.h is C++? Thanks again for your reply really appreciate it :)

Chris
Crispy
 
Posts: 57
Joined: Mon May 31, 2010 6:35 pm

Postby Larry » Mon Feb 13, 2012 6:47 am

Well this works:
    #include <stdlib.h>
    #include <stdio.h>

    int main(void)
    {
    // Assume a 2X2 array
    int m = sizeof(double[2][2]);
    // Assume you want to creat 4 2X2 arrays.
    int n = 4;
    printf("The size of matrix m is %d \n", m);
    double (*MyArray)[2][2][4] = malloc(m * n);
    int sa = m * n;
    printf("The size of MyArray is %d \n", sa);
    // Suppose we want to store 5 at *MyArray[1][0][2];
    *(MyArray) [1][0][2] = 5.0;
    // Let's see if it's stored in there.
    printf("The value of MyArray[1][0][2] is %f\n", *(MyArray) [1][0][2]);
    free(MyArray);
    return(1);
    }

    I have to admit that I didn't do extensive bounds checking, though it looks good. BE WARE! The "C" language does not do extensive bounds checking with pointer arithmetic. It's up to you, the programmer to write bug free code. I love "C", but it's been termed one of the most dangerous languages ever invented. Not true, assembly is much more dangerous.

    Anyway, I hope this helps.

    :)
Larry
 
Posts: 18
Joined: Tue Aug 23, 2011 5:16 am

Postby Crispy » Sun Apr 01, 2012 4:57 pm

Thank you for your reply and again I'm sorry for my late reply. Had a little break from C but have now returned.

I am still a bit confused with allocating dynamic arrays. I wish to be able to dynamically allocate an array in a separate function however when I do so the array becomes NULL when leaving the function. My code is as follows:

Code: Select all
#include <stdio.h>
#include <stdlib.h>

void allocate(int N_in, int M_in, double** A_in);

int N, M;
double** A;

int main( int argc, char *argv[] )
{
//   Declare external variables
  extern int N,M;
  extern double** A;
 
  N = 0;
  M = 0;
 
//   Declare internal variables
  int i,j;
 
 
  if (argc != 3)
  {
    printf("Usage:   %s N M\n", argv[0]);
    free(A);
    return 0;
  }
 
  N = atoi(argv[1]);
  M = atoi(argv[2]);
 
  allocate(N, M, A);
 
  if ( A == NULL )
  {
    printf("Array A is null using separate function\n");
  }
  else
  {
    printf("Array A is not null using separate function\n");
  }
 
  A = (double**) malloc(N * sizeof(int));
 
  for (i = 0; i < N; i++)
  {
    A[i] = (double*) malloc(M * sizeof(int));
  }
 
  if ( A == NULL )
  {
    printf("Array A is null using main function\n");
  }
  else
  {
    printf("Array A is not null using main function\n");
  }
 
  free(A);
     
  return 0;
}

void allocate(int N_in, int M_in, double** A_in)
{
  int i;
 
  A_in = (double**) malloc(N_in * sizeof(int));
 
  for (i = 0; i < 2; i++)
  {
    A_in[i] = (double*) malloc(M_in * sizeof(int));
  }
 
  A_in[0][0] = 1;
}


I would have thought that because I am passing a pointer to A to the function allocate, that when the array A_in is allocated inside the function allocate, the array A should point to where A_in is pointing so that the array is present in memory. However when leaving the function the array A becomes NULL. To me this suggests that the array is being deleted from memory when I leave the function allocate? Although I am not sure.

I have noticed that if I change:
Code: Select all
void allocate(int N_in, int M_in, double** A_in);

to:
Code: Select all
double** allocate(int N_in, int M_in, double** A_in);

and set
Code: Select all
A = allocate(N,M,A);

then it works. But I don't understand why or why it wasn't working the first time?

Any help would be greatly appreciated.

Thanks in advance
Chris
Crispy
 
Posts: 57
Joined: Mon May 31, 2010 6:35 pm

Postby SheamusPatt » Mon Apr 09, 2012 10:44 pm

What you are missing is that in C / C++, parameters are generally passed by value, not by reference. That means that changing a parameter inside a function does not change the original value (which might not even have a name, as it could be an expression). It's likely just coincidence that you find A is NULL when allocate returns - it should not have changed value at all, and could be anything. The memory will still be allocated (that's something C leaves up to the programmer to deal with, unlike Java and other languages that provide garbage collection on their memory heaps). You'll just have lost its address, so it will have 'leaked'.

If you want to change the variable that you're passing in, then you would pass it by reference if you're using C++. Since you're using C, though, you don't have that option. Returning the array from 'allocate' as you've shown is likely the simplest approach though there's no need to also pass A into the function. Or you can do something like a C++ reference by adding an additional level of indirection. That is, declare

void allocate(int N_in, int M_in, double*** A_in);

and wherever you reference A_in within allocate, change it to (*A_in) so it will dereference the pointer, and change the value in the passed in variable. You also need to pass in &A in place of A when you call allocate.

Hope that helps.
Sheamus
Ottawa, CANADA
SheamusPatt
 
Posts: 2
Joined: Mon Apr 09, 2012 10:04 pm
Location: Ottawa, Canada

Postby Larry » Tue Apr 10, 2012 7:55 am

Hi Chris:

I'm sorry I didn't get back to you sooner, but time and commitments precluded my doing so.

First off, regarding your angst at still viewing the data after you've freed the, "malloc", allocated block. This is very common and normal. Just because you freed your lock on the malloc allocated memory block, does not automatically cause it to be immediately reallocated to other uses. This block of memory could remain unallocated for the duration of your application. Conversely, it could be allocated to another application, but that application just hasn't needed to initialize that block of memory yet. Malloc just requests, and if available, receives blocks of memory. By the way, the argument you pass to malloc is an integer value representing the number of bytes of memory you are requesting. Malloc isn't particularly smart. You simply request the number of bytes that you want. If malloc is successful, it returns a pointer to this memory block. By checking to see if your data was still there after freeing the block, you have uncovered an important fact that, "C", programmers need to watch for. That is dealing with pointers you defined that no longer point to a valid allocated memory. This can lead to the famous intermittent, "C", program. Most of the time it works, but sometimes it crashes, or returns garbage. This is because you are using a pointer that is pointing to a freed memory location that another application is now using for its own purposes.

Second, the passing of arguments to, "C", functions is always done by value. This is the case with, "C", and that specification was carried forward with C++. This has the advantage of not accidentally trashing your data in a function you wrote. On the other hand, if you desire to modify the arguments you pass, then you pass a pointer to the data as an argument to the function. In a ,"C", function all arguments and internally generated data are stored on the program stack. That's why, when the function ends, all internal data is lost. So if you want to abstract the process of allocating memory on the fly by creating a function, you must save the memory pointers external to the function. The, "C", way of doing this is to have the malloc returned pointer be the return value of the function you wrote.

Third, deals with the matter of how, "C", stores matrix array data. Chris, you are over complicating the way you are thinking about matrix data. From a mathematics point of view, each array index corresponds to an array dimension. However, the C compiler simply stores these values consecutively in memory. For example, referring to my previous posting of 2/13/2012, line seven obtains the size of a double precision 2 by 2 array. This number is then used in a call to malloc to obtain a block of memory. That's all there is to it, nothing more. You'll also notice that I'm using subscript notation,( Yes I know that each dimension is enclosed in brackets. This is, "C", notation to denote a subscript.), instead of pointer arithmetic to access the array contents. The fact is, in, "C", the name you give an array is really a pointer to the base address of the array memory location. That's why you can use either pointer arithmetic, or subscript notation which I used. I advise my method. This causes the compiler to do the address calculation to the data you want. It's just as valid to use pointer arithmetic to access the data, but it's easier for you to make a programming mistake, and access the wrong data. Try the program I supplied in my 2/13/2012 posting. You will find that the program will compile and do exactly what you want.

Fourth, you seem to be overly interested in double indirection pointers. I have written many programs. I stay away from double indirection every chance I get. This is because every time you use indirection, extra programming steps are required. Double indirection merely increases this overhead, and further slows your program. As you can see from the program I wrote, I completely avoided double indirection. This program accomplishes exactly what you want to do.

Fifth, and lastly is the matter of data visibility. I noticed that you are using the extern directive every time you try to access data outside a function. This is not necessary. If you have not declared data of the same name in your function, than any global data is visible to your function. In programming, it's always wise not to overcomplicate things.

Enjoy!
Larry
 
Posts: 18
Joined: Tue Aug 23, 2011 5:16 am

Postby Crispy » Sat Apr 14, 2012 1:26 am

Thank you SheamusPatt and Larry for your replies, they have helped a great deal. In reply to the first two points of Larry's last post and of Sheamus' post, I had assumed that because I was passing in a pointer, that it should point to the allocated memory; but as you both correctly pointed out, the value that was passed in is the double** pointer and so I should be passing in the pointer to the double** pointer. Thank you, that cleared up the problem.

I realised that if I put the allocate function inside a header along with the type definitions, then I do not need to pass in the pointers. I can simply allocate the memory and the pointers automatically point to the right place. What is interesting however, is that if I run:
Code: Select all
int **A;

A = (int**) malloc( N * sizeof(int) );
for (i = 0; i < N; i++)
{
    A[i] = (int*) malloc( M * sizeof(int) );
}

I get an invalid write error when the program is ran in valgrind. However, when I run the above code replacing int with double, it works? Another puzzling factor is that when the above is run on my work PC it works perfectly fine; I get no errors in valgrind. My laptop runs a 64-bit Ubuntu and my work PC runs a 32-bit Ubuntu - perhaps this is the reason?

In response to Larry's 3rd point, I thought - and I might be mistaken - that I was considering the simplest approach to matrix data; i.e. allocate the first dimension, allocate the second dimension. If I have interpreted your code correctly - and I apologise if I have misunderstood it - it appears, that you assume a 2x2 array, then assume 4 2x2 arrays. Essentially, decomposing a matrix A into 4 2x2 arrays, or more generally, K 2x2 arrays? I would have thought that this relies on the matrix A being both square and have an even number of rows and columns. Whilst your method does work (as a side, it seems to give invalid write errors in valgrind on both systems, no idea why), it could get very complicated for some problems. For example, say I would like to solve a linear system Ax = b. The matrix A is a block matrix, in other words it contains several sub-matrices A_x, A_y, B_x and B_y. If the problem is such that the linear system Ax = b is constructed such that A is never actually made, it is found only by considering the block matrices A_x, A_y, B_x and B_y, of which one might not be square. Then the decomposition would either not work or become more complicated. For example, the matrix A_x (and others) could take the form A_x[i][j][k][l][M]. If we consider your approach, the i,j and k,l would need to be decomposed into K_ij and K_kl, 2x2 arrays, thus A_x would become: A_x[2][2][K_ij][2][2][K_kl][M] - which, to me at least, seems more complicated.

So whilst your idea does work, double indirection can make things simpler (although might contain more code) and I would probably bet that it is unavoidable in certain scenarios. Therefore, it is something I wish to try to understand as best I can.

Thanks again for your replies,
Chris
Crispy
 
Posts: 57
Joined: Mon May 31, 2010 6:35 pm

Postby Larry » Thu Apr 19, 2012 6:54 am

Hi Chris:

Well I decided to pass my original code through the KDbg debugger. I stood in awe at the creative places the, "gcc", compiler picked to store my elements. Needless to say, some of the elements landed in the, "malloc", allocated memory, but many did not.

I decided to download the, "C", documentation from gnu.org. The file is called, "gnu-c-manual.pdf". Interestingly enough, the information on array subscripts avoided giving information on multidimensional arrays. Since this is all volunteer, it could be just a lack of man power to do the documentation. Then again, it could be the old adage, “Ask me no questions, I'll tell you no lies”. Without downloading the, “gcc” source code, and wading into just how it handles subscripts, life is just too short for that, I decided to fall back to pointer arithmetic. In any event, since we are dealing with pointers, pointer math is definitely an option. The ,”gcc”, compiler seems to deal effectively with that. One word of caution, pointer arithmetic can get very cryptic very fast, as you're about do discover with the included code.

To abstract this, I created one structure, and two functions. The first function, mtrixdyn, takes up to three matrix dimensions, the enumerated element type, and a pointer to an instance of the afore mentioned structure. This function then returns a void pointer to the requested block of memory. The pointer is of type void simply because the function at creation simply does not know what element type you, the user, will select. Therefore, you must cast the returned function value for the type that you selected. Chris, this function does answer your original question at the beginning of this thread regarding the allocation of 2D arrays using, “malloc”. The second function, “elemptr”, takes element coordinates, a pointer to the afore mentioned structure, and a pointer to the base address returned by the first function.

So, without further adieu, here is the code.

Code: Select all
#include <stdlib.h>
#include <stdio.h>

struct matrixdim
   {
   int row;
   int col;
   int cmpx;
   size_t typesize;
   };

void* mtrixdyn(int row, int col, int cmpx, int type, struct matrixdim* mtxprms)
   {
   // Begin populating the matrix dimension structure.   
   mtxprms->row = row;
   mtxprms->col = col;
   mtxprms->cmpx = cmpx;
   switch( type )
      {
      case 1:
         mtxprms->typesize = sizeof(char);
         break;
      case 2:
         mtxprms->typesize = sizeof(int);
         break;
      case 3:
         mtxprms->typesize = sizeof(long);
         break;
      case 4:
         mtxprms->typesize = sizeof(float);
         break;
      case 5:
         mtxprms->typesize = sizeof(double);
         break;
      default:
         break;
      }

   // Calculate the number of bytes needed for the array.
   size_t mtxsize = row * col * cmpx * mtxprms->typesize;
   
   // Obtain a pointer, if possible, and return the pointer.   
   return(malloc( mtxsize ));
   }

void* elemptr(int row, int col, int cmpx, struct matrixdim* mtxprms, void* baseadr)
   {
   // Calculate the element offset from the base address.
   size_t offset = row + mtxprms->row * col + mtxprms->row * mtxprms->col * cmpx;

   // Calculate the element pointer address.
   void* elptr = (void*)(baseadr + (offset * mtxprms->typesize));
   
   // Return the calculated pointer.
   return( elptr );
   }

int main(void)
   {
   // Begin by creating an instance of matrixdim
   struct matrixdim mtrix;
   
   // Assume that we are going to creat 4 2x2 arrays.
   // The element type long integer was used because it displays well in
   // the memory window of KDbg, which is the debugger I used.
   long *newmtrix = mtrixdyn( 2, 2, 4, 3, &mtrix);
   printf("\nThe new matrix base address is: %x\n\n", newmtrix);

   if (newmtrix == 0)
      {
      printf("Matrix allocation failure\n");
      return(1);
      }

   // Just to make sure it works, we are going to populate every element with an
   // auto incremented value, "l".
   int i = 0;
   int j = 0;
   int k = 0;
   int l = 0;   
   for (k=0; k<4; k++)
      {
      for (j=0; j<2; j++)
         {
         for(i=0; i<2; i++)
            {
            *((long*)(elemptr(i, j, k, &mtrix, newmtrix))) = ++l;
            }
         }
      }

   // Now lets read every element of this array.
   for(i=0; i<(mtrix.row * mtrix.col * mtrix.cmpx); i++)
      {
      printf("This value was found at location %d: %d\n", i, *(newmtrix + i));
      }

   free(newmtrix);   
   return(0);
   }


Well there it is, you are free to pass this through a code validation program of your choice. Let me know what you find out.

Lastly, in this example, it's interesting to note that with the, “elemptr” function, setting row and col dimensions to zero, and the cmpx value to an integer from 0-3, will return a pointer to the base address of any of the 2x2 arrays allocated. This could be useful should you desire to create an array of pointers to each of the 2x2 arrays.

Lastly, in your last post, you alluded to a matrix composed of sub matrices. What you are describing correlates very closely to a method used by engineers to describe a control system known as the, “State variable method”. Should this be the direction your headed in, you will probably want to have a matrix that supports complex floating point variables. This is accomplished very easily by creating a MxNx2 matrix. The first of these MxN matrices could contain the real component while the second could contain the imaginary part. Of course, you would have to write functions to handle complex math, but you said you wanted to learn, “C”.

Have fun!
Larry
 
Posts: 18
Joined: Tue Aug 23, 2011 5:16 am


Return to Programming

Who is online

Users browsing this forum: heiowge and 0 guests