An introduction to pointers
I am Jyotiprakash, a deeply driven computer systems engineer, software developer, teacher, and philosopher. With a decade of professional experience, I have contributed to various cutting-edge software products in network security, mobile apps, and healthcare software at renowned companies like Oracle, Yahoo, and Epic. My academic journey has taken me to prestigious institutions such as the University of Wisconsin-Madison and BITS Pilani in India, where I consistently ranked among the top of my class.
At my core, I am a computer enthusiast with a profound interest in understanding the intricacies of computer programming. My skills are not limited to application programming in Java; I have also delved deeply into computer hardware, learning about various architectures, low-level assembly programming, Linux kernel implementation, and writing device drivers. The contributions of Linus Torvalds, Ken Thompson, and Dennis Ritchie—who revolutionized the computer industry—inspire me. I believe that real contributions to computer science are made by mastering all levels of abstraction and understanding systems inside out.
In addition to my professional pursuits, I am passionate about teaching and sharing knowledge. I have spent two years as a teaching assistant at UW Madison, where I taught complex concepts in operating systems, computer graphics, and data structures to both graduate and undergraduate students. Currently, I am an assistant professor at KIIT, Bhubaneswar, where I continue to teach computer science to undergraduate and graduate students. I am also working on writing a few free books on systems programming, as I believe in freely sharing knowledge to empower others.
Pointers in C are a fundamental and powerful concept used for memory management. A pointer is essentially a variable that stores the memory address of another variable. This allows for direct access and manipulation of memory locations, which can lead to more efficient and flexible programs.
Key Concepts:
Address-of Operator (&): This operator is used to obtain the memory address of a variable.
Dereference Operator (*): When applied to a pointer, it accesses the value stored at the pointer's memory address.
Pointer Declaration: Declaring a pointer involves specifying the data type it points to, e.g., int *ptr; declares a pointer to an integer.
Pointer Initialization: A pointer is usually initialized with the address of a variable, e.g., ptr = &var;.
Null Pointer: A pointer that doesn't point to any valid memory location. It's good practice to initialize unused pointers to NULL.
Example: Swapping Two Numbers Using Pointers
This example demonstrates how pointers can be used to swap the values of two variables. It's a basic yet practical illustration of how pointers allow functions to modify the actual values of variables passed to them.
#include <stdio.h>

void swap(int *a, int *b) {
    int temp = *a;
    *a = *b;
    *b = temp;
}

int main(void) {
    int num1 = 10;
    int num2 = 20;

    printf("Before swap: num1 = %d, num2 = %d\n", num1, num2);

    // Passing addresses of num1 and num2
    swap(&num1, &num2);

    printf("After swap: num1 = %d, num2 = %d\n", num1, num2);
    return 0;
}
In this code:
The swap function takes two integer pointers as arguments.
It swaps the values of the two integers pointed to by these pointers.
In main, the addresses of num1 and num2 are passed to swap.
The swap function then modifies the values at these addresses, effectively swapping them.
To understand the behavior of the swap function and the main function in the given code, it's crucial to delve into the concept of stack frames in C. Each function call in C creates a new stack frame in the stack memory, which is used to store local variables, arguments passed to the function, and other housekeeping information like the return address.
Stack Frame Structure
When a function is called, a stack frame is created with the following typical structure:
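Function Arguments: The values passed in by the caller. C always passes by value, so when an address is passed, the frame holds a copy of that address.
Return Address: The address in the calling function to return execution after the function completes.
Local Variables: Variables declared within the function.
Control Data: Bookkeeping information such as saved registers and the caller's frame pointer.
The exact layout depends on the compiler and platform; on many systems, for example, the first few arguments travel in registers rather than on the stack.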
Main Function Stack Frame
In the main function:
Local Variables: num1 and num2 are local to main. They are stored in the stack frame for main.
Function Call: When swap is called, the addresses of num1 and num2 (&num1, &num2) are passed to it. These addresses are the actual arguments for the swap function.
Swap Function Stack Frame
In the swap function:
Function Arguments: The parameters int *a and int *b are pointers, and they form part of the swap function's stack frame. When swap is called from main, the addresses of num1 and num2 are copied into a and b. This means a and b now point to num1 and num2, respectively.
Local Variables: temp is a local variable of swap.
Execution Flow
Before the Swap Call: num1 and num2 in main have their respective values (10 and 20).
On Swap Call: The stack frame for swap is pushed on top of main's frame. The values &num1 and &num2 (the addresses of num1 and num2) are passed to swap and copied into the pointer variables a and b.
Inside Swap:
Dereferencing a (*a) gives access to num1, and dereferencing b (*b) gives access to num2.
The value at *a (the value of num1) is copied to temp.
The value at *b (the value of num2) is copied to *a.
temp (the original value of num1) is copied to *b.
Thus, the values of num1 and num2 are swapped.
After Swap Call: The stack frame of swap is destroyed, and control returns to main. num1 and num2 in main's frame now have their values swapped.
Visualization
[ Main Stack Frame ]
| num2 (20) | <- Initially
| num1 (10) | <- Initially
| Return Address |
| ... |
[ Swap Stack Frame ]
| temp (10) | <- During execution
| b (&num2) | <- Copy of address of num2
| a (&num1) | <- Copy of address of num1
| Return Address |
| ... |
This detailed flow highlights the critical role of stack frames in managing function calls and local variables, and how pointers enable direct manipulation of memory locations across different stack frames.
Let's write a simple program that involves a structure and a function. This function will attempt to modify the structure's fields, but since the structure is passed by value, the changes won't be reflected in the main function. Here's how the program will be structured:
Define a Structure: Create a simple structure, for example, a Point structure with x and y as its members.
Create a Function: This function takes a structure of type Point as its parameter and attempts to modify its members.
Main Function: Initialize a Point variable, pass it to the function, and then display its values after the function call to demonstrate that they remain unchanged.
Here's the C program illustrating this:
#include <stdio.h>

// Defining the Point structure
typedef struct {
    int x;
    int y;
} Point;

// Function to modify Point structure, passed by value
void modifyPoint(Point p) {
    p.x = 100; // Attempt to modify the Point's x
    p.y = 200; // Attempt to modify the Point's y
    printf("Inside modifyPoint: x = %d, y = %d\n", p.x, p.y);
}

int main(void) {
    Point pt = {10, 20}; // Initializing a Point

    printf("Before modifyPoint: x = %d, y = %d\n", pt.x, pt.y);
    modifyPoint(pt); // Passing Point by value
    printf("After modifyPoint: x = %d, y = %d\n", pt.x, pt.y);
    return 0;
}
In this program:
The modifyPoint function modifies its parameter p, which is a copy of the Point structure passed to it.
Changes made to p in modifyPoint do not affect the original Point instance pt in main.
The output will show that the values of pt in the main function remain unchanged after the call to modifyPoint.
To understand why the values did not get updated in the main function in the provided C program, we need to delve into the details of what happens in the stack, specifically focusing on stack frames, local variables, and the concept of passing by value.
Execution Flow in the Program
Let's break down what happens in the stack when the main and modifyPoint functions are called:
1. Main Function Stack Frame
When the program starts, a stack frame for main is created.
Local Variables in main Stack Frame: Point pt is allocated in this frame with x = 10 and y = 20.
2. Calling modifyPoint
When modifyPoint is called, a new stack frame is created for it.
The Point structure pt is passed by value. This means the entire structure is copied byte-by-byte.
3. modifyPoint Function Stack Frame
Function Arguments in modifyPoint Stack Frame: The parameter Point p is a local copy of pt. Any changes made to p will only affect this local copy and not the original pt in main.
p.x and p.y are modified to 100 and 200, respectively. However, these changes are made to the copy of pt that exists within the modifyPoint stack frame.
4. Returning to Main
After modifyPoint completes execution, its stack frame is destroyed, along with the local copy p.
Control returns to the main function. The main function's stack frame is still intact with the original pt having x = 10 and y = 20.
Therefore, the changes made in modifyPoint are not reflected in main.
Visualization
[ Main Stack Frame ]
| pt (x=10, y=20) | <- Remains unchanged throughout
| Return Address |
| ... |
[ modifyPoint Stack Frame ]
| p (x=100, y=200) | <- Local copy, modified within this frame
| Return Address |
| ... |
The crucial point here is that passing by value creates a separate copy of the data in the new function's stack frame. Modifications to this copy do not affect the original data in the caller's (here, main) stack frame, which explains why the values in main did not get updated.
To understand how passing a pointer to a structure allows changes to be reflected in the main function, we'll again examine the stack frames and the flow of execution in the modified program where we use pointers. The key difference here is the use of a pointer to pass the address of the structure, allowing direct manipulation of the original structure in memory.
Stack and Stack Frames
As before, each function call results in the creation of a stack frame. However, this time the content and interaction of these stack frames are different due to the use of pointers.
Execution Flow in the Program
1. Main Function Stack Frame
When the program starts, a stack frame for main is created.
Local Variables in main Stack Frame: Point pt is allocated in this frame with x = 10 and y = 20.
2. Calling modifyPoint
When modifyPoint is called, a new stack frame is created for it.
Instead of passing the whole structure, the address of pt (&pt) is passed. This is a pointer to Point.
3. modifyPoint Function Stack Frame
Function Arguments in modifyPoint Stack Frame: The parameter Point *p is a pointer, holding the address of pt from the main function.
Modifications in modifyPoint: When p->x and p->y are modified, it directly changes the x and y values of pt in the main stack frame because p points to pt.
The modifyPoint function does not have a separate copy of the Point structure; it has a pointer that refers to the original structure in the main function.
4. Returning to Main
After modifyPoint completes, its stack frame is destroyed, but the changes made to pt persist because they were made directly to the memory location where pt resides.
Control returns to the main function. The main function's stack frame shows the updated values of pt (x = 100 and y = 200).
Visualization
[ Main Stack Frame ]
| pt (x=100, y=200)| <- Updated via pointer in modifyPoint
| Return Address |
| ... |
[ modifyPoint Stack Frame ]
| p (&pt) | <- Pointer to pt in main
| Return Address |
| ... |
In this case, the modifyPoint function directly accesses and modifies the contents of pt in the main function's stack frame through the pointer p. This direct access and modification via a pointer is why the changes are reflected in the main function after modifyPoint returns.
To illustrate passing an array to a function and modifying it, let's write a C program that includes a function to modify the elements of the array. In C, when you pass an array to a function, what you're actually passing is the address of the array's first element. Therefore, modifying the array within the function affects the original array in the calling function.
Let's use both arr[i] and *(arr + i) syntaxes to show that they are equivalent ways of accessing array elements.
Here's the C program:
#include <stdio.h>

#define SIZE 5

// Function to modify the array
void modifyArray(int arr[], int size) {
    for (int i = 0; i < size; i++) {
        arr[i] *= 2; // Using [] syntax
    }
    printf("Array inside function (using [] syntax): ");
    for (int i = 0; i < size; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");

    for (int i = 0; i < size; i++) {
        *(arr + i) += 2; // Using *(arr + i) syntax
    }
    printf("Array inside function (using *(arr + i) syntax): ");
    for (int i = 0; i < size; i++) {
        printf("%d ", arr[i]);
    }
    printf("\n");
}

int main(void) {
    int myArray[SIZE] = {1, 2, 3, 4, 5};

    printf("Original array: ");
    for (int i = 0; i < SIZE; i++) {
        printf("%d ", myArray[i]);
    }
    printf("\n");

    modifyArray(myArray, SIZE); // Passing the array to the function

    printf("Array after modification in main: ");
    for (int i = 0; i < SIZE; i++) {
        printf("%d ", myArray[i]);
    }
    printf("\n");
    return 0;
}
In this program:
The modifyArray function is designed to modify the elements of the passed array.
arr[i] and *(arr + i) are used interchangeably to demonstrate their equivalence.
When the array myArray is passed to modifyArray, its address (specifically, the address of the first element) is passed. There's no need to use &myArray; myArray by itself suffices and refers to the address of the first element of the array. (&myArray would have a different type: a pointer to the whole array rather than a pointer to int.)
Any modifications made to the array within modifyArray directly affect myArray in the main function because they operate on the same memory location.
After returning from modifyArray, the changes to the array are reflected in the main function, demonstrating that the array was indeed modified by the function.
To understand how passing an array to a function works in C and why modifications in the function affect the original array, let's analyze the stack frames and memory interactions.
In C, arrays are stored in contiguous memory locations. When you pass an array to a function, you are actually passing the address of the first element of the array, not a separate copy of the array.
Execution Flow in the Program
1. Main Function Stack Frame
Local Variables in main Stack Frame: int myArray[SIZE] is an array of integers. It's allocated in the stack frame for main, and the array elements are stored in contiguous memory locations.
The array's name, myArray, is converted to a pointer to its first element in most expressions, including when it is passed as a function argument.
2. Calling modifyArray
When modifyArray is called, a new stack frame is created for it.
The argument myArray passed to modifyArray is essentially the address of the first element of myArray.
3. modifyArray Function Stack Frame
Function Arguments in modifyArray Stack Frame: The parameter int arr[] is a pointer to the first element of myArray. It doesn't allocate new memory for the array; it simply points to the original array's memory in the main function's frame.
Modifications in modifyArray:
Any changes made using arr[i] or *(arr + i) directly affect the elements of myArray in the main stack frame.
Both arr[i] and *(arr + i) are different syntaxes for accessing the same memory location.
4. Returning to Main
After modifyArray completes, its stack frame is destroyed, but since the modifications were made to the memory locations of myArray, the changes persist.
Control returns to the main function, where the updated values of myArray can be seen.
Visualization
[ Main Stack Frame ]
| myArray[4] = 12 | <- Directly modified by modifyArray
| myArray[3] = 10 | <- Directly modified by modifyArray
| myArray[2] = 8 | <- Directly modified by modifyArray
| myArray[1] = 6 | <- Directly modified by modifyArray
| myArray[0] = 4 | <- Directly modified by modifyArray
| Return Address |
| ... |
[ modifyArray Stack Frame ]
| arr (points to myArray[0]) | <- Pointer to the first element of myArray
| size = 5 |
| Return Address |
| ... |
In this setup, modifyArray doesn't have a separate copy of the array; it operates on the original array in the main function's memory space. This direct manipulation of the array's memory is why the changes in modifyArray are reflected in the main function.