Static Compilation. Where is my printf?
Static compilation (more precisely, static linking) is the process of building a computer program so that all the library code it depends on is included in the program's executable file. This is done by linking the program with static libraries (.a files on Unix-like systems, .lib files on Windows) rather than dynamic libraries (.so files on Unix-like systems, .dll files on Windows).
When you build a program statically, the linker copies the library routines the program uses directly into the executable. The advantages of static compilation include:
Portability: The resulting executable is self-contained, which means it does not depend on the system's shared libraries and can run on any system with the same architecture and a compatible kernel, without additional dependencies.
Performance: Sometimes, statically compiled programs can run slightly faster because they don't incur the overhead of dynamic linking at runtime.
Reliability: Since all the code the program needs is contained within its own executable, it's not susceptible to issues like "dependency hell" or problems arising from the wrong version of a shared library being present on the system.
However, there are also disadvantages:
Size: Statically compiled executables are typically larger because they include all the code they use, rather than sharing common libraries across the system.
Updates: If a bug in a library is fixed, or the library is otherwise improved, you need to rebuild the program against the updated static library to benefit from the changes. With dynamic libraries, you can simply update the library on the system.
Memory Usage: Each statically compiled program carries its own copy of the library code it uses, so different programs cannot share that code in memory the way they share a single loaded dynamic library. This leads to higher overall memory usage.
Overall, static compilation produces executables that are self-contained and include all the library code they require. This approach has advantages in simplicity and reliability, but it leads to larger files and to the same library code being duplicated across programs.
On a Linux system, tools like objdump and nm can be used to examine statically built C programs to find the printf function and its call locations within the binary. Allow me to show you the way:
Compile the Program Statically: First, compile your C program statically by passing the -static flag to gcc:
    gcc -static -o myprogram myprogram.c
Identify the printf Function in the Binary: Use the nm tool to list the symbols in the binary. Because the program is statically linked, printf is defined in the binary itself:
    nm --defined-only myprogram | grep ' printf$'
This should give you the address of the printf function within your binary. If the program only prints constant strings, GCC may have replaced those printf calls with puts, in which case search for puts instead.
Disassemble the Binary: Use objdump to disassemble the binary and find the printf code:
    objdump -d myprogram > myprogram.asm
Then search myprogram.asm for the address found with nm to see the disassembled code for printf.
Find Calls to printf: To find where printf is called from, search the disassembly for call instructions that target it:
    grep -B 5 'call.*<printf>' myprogram.asm
The -B 5 flag shows the 5 lines before each call instruction, which usually include the instructions that set up the arguments. Each matching line begins with the address of an instruction that calls printf.
Analyze the Call Sites: Each call site has an address in the disassembly. Look around these addresses to understand the context of the call: the nearest preceding label of the form <name>: tells you which function is making the call, and the instructions just before the call show which parameters are being passed.
The disassembly shows the assembly language representation of printf's machine code, that is, the actual bytes stored in the binary. The objdump output can be difficult to follow if you are not familiar with assembly language.
These steps assume that you are using an AMD64 (x86-64) machine and a Unix-like environment with the necessary tools installed. The process and tools may differ on other systems or architectures.