Dynamic Linking in Practice
So now we understand how dynamic linking works in theory, but how do we actually use it?
Well, most of the time, it's just there in the background, making sure everything works. But there are some interesting things you can do with it!
Let's look at some practical aspects of dynamic linking, from creating shared libraries to some clever tricks you can do with them.
Creating a Shared Library
Let's make a simple library that provides a function to print a greeting:
/* hello_lib.c */
#include <stdio.h>
void
say_hello(const char *name)
{
printf("Hello, %s!\n", name);
}
To convert this code into a shared library, we need to compile it in a special way:
$ gcc -fPIC -c hello_lib.c
$ gcc -shared -o libhello.so hello_lib.o
What's PIC?
“Position Independent Code”—remember how we said the shared library could be put anywhere in memory?
-fPIC tells the compiler to generate code that totally doesn't mind where in memory it's been placed.
Now we can write a program that uses our library:
/* greet.c */
void say_hello(const char *name);
int
main()
{
say_hello("Dynamic Linking");
return 0;
}
And compile it:
$ gcc -o greet greet.c -L. -lhello
$ ./greet
./greet: error while loading shared libraries: libhello.so: cannot open shared object file: No such file or directory
Hay! What went wrong?
The dynamic linker can't find our library! By default, it only looks in certain places…
Finding Libraries
On Linux, the dynamic linker looks for libraries in
- Directories listed in LD_LIBRARY_PATH.
- Directories listed in /etc/ld.so.conf.
- Standard directories like /lib and /usr/lib.
(On a Mac it's similar, but the environment variable is DYLD_LIBRARY_PATH. Windows has a different system.)
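To make the "first match wins" idea concrete, here is a minimal sketch of a search over a colon-separated directory list, the format LD_LIBRARY_PATH uses. The function name find_library is made up for illustration, and this is not how the real dynamic linker is implemented (it also consults things like a cache built from /etc/ld.so.conf); it only models the search order.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: search a colon-separated directory list for a
 * file called `name`. On success, write the full path to `out` and
 * return 0. Return -1 (with `out` emptied) if no directory has the file.
 */
int
find_library(const char *search_path, const char *name,
    char *out, size_t outlen)
{
    const char *p = search_path;

    while (p != NULL && *p != '\0') {
        size_t dirlen = strcspn(p, ":");   /* length of this entry */
        int n;

        if (dirlen == 0)                   /* empty entry: use "." */
            n = snprintf(out, outlen, "./%s", name);
        else
            n = snprintf(out, outlen, "%.*s/%s", (int)dirlen, p, name);

        /* Accept the first candidate that fits and is readable */
        if (n > 0 && (size_t)n < outlen && access(out, R_OK) == 0)
            return 0;

        p += dirlen;
        if (*p == ':')
            p++;                           /* skip the separator */
    }
    if (outlen > 0)
        out[0] = '\0';
    return -1;
}
```

Calling find_library(".:/usr/lib", "libhello.so", buf, sizeof(buf)) from the directory where we built libhello.so should find ./libhello.so before it ever looks in /usr/lib, which is the same reason LD_LIBRARY_PATH=. works.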
We can see what libraries a program needs:
$ ldd greet
linux-vdso.so.1 (0x00007fff5cd7c000)
libhello.so => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4b12c00000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4b12e00000)
To run our program, we can either
- Install the library in a standard location; or
- Set LD_LIBRARY_PATH:
$ LD_LIBRARY_PATH=. ./greet
Hello, Dynamic Linking!
Another option would be to compile our program with the -rpath option so that it automatically knows where to find the library:
$ gcc -o greet greet.c -L. -lhello -Wl,-rpath,`pwd`
$ ./greet
Hello, Dynamic Linking!
Dynamically Loading a Plugin
Dynamic linking doesn't have to happen only when a program starts. You can also load libraries after it's started running! On a POSIX system, we have the functions
- dlopen: open a shared library/plug-in, loading it into memory and linking it against the program; returns a handle to it.
- dlsym: get the address of a function in the library (pass the library handle and the function name).
- dlclose: close the library, unloading it from memory.
Note that while the library is linked against our program, our program isn't relinked against the newly loaded code, so the only way to see symbols in that code is via dlsym.
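Before the bigger demo, here is a minimal sketch of the open, look up, call, close sequence on its own. The helper name call_unary is invented for this example, and it assumes the symbol really is a function taking and returning a double; nothing checks that for us.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper: load `libname`, look up `symname` as a
 * double -> double function, call it on `x`, and store the answer
 * in `*result`. Returns 0 on success, -1 on any failure.
 */
int
call_unary(const char *libname, const char *symname, double x,
    double *result)
{
    void *handle;
    double (*fn)(double);

    handle = dlopen(libname, RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return -1;
    }

    dlerror();    /* clear any stale error message */
    fn = (double (*)(double))dlsym(handle, symname);
    if (fn == NULL) {
        const char *msg = dlerror();
        fprintf(stderr, "dlsym: %s\n", msg ? msg : "symbol is NULL");
        dlclose(handle);
        return -1;
    }

    *result = fn(x);    /* finish using fn before closing the library */
    dlclose(handle);
    return 0;
}
```

On a glibc-based Linux system (an assumption), call_unary("libm.so.6", "cos", 0.0, &r) should load the math library and leave 1.0 in r. Older glibc versions may need -ldl on the link line.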
Here's an example program that generates C code, compiles it, then loads it into itself at runtime:
#include <dlfcn.h>
#include <err.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <setjmp.h>
#include <signal.h>
typedef void (*demo_func_t)(void);
#define TEMPLATE_PATH "/tmp/dynlink_XXXXXX"
#define MAX_LINE 1024
#define COMPILE_CMD "gcc -shared -fPIC -o %s %s"
static void
generate_source(const char *filename, const char *user_code)
{
FILE *fp;
if ((fp = fopen(filename, "w")) == NULL)
err(1, "fopen");
fprintf(fp, "#include <stdio.h>\n#include <unistd.h>\n"
"#include <fcntl.h>\n#include <stddef.h>\n"
"#include <stdint.h>\n#include <stdlib.h>\n\n"
"void demo_function(void)\n"
"{\n"
" %s\n"
"}\n", user_code);
fclose(fp);
}
static char *
create_temp_file(const char *suffix)
{
char *filename;
int fd;
/* Allocate space for template + suffix + null terminator */
if (!(filename = malloc(strlen(TEMPLATE_PATH) + strlen(suffix) + 1)))
err(1, "malloc");
strcpy(filename, TEMPLATE_PATH);
strcat(filename, suffix);
/* mkstemps() keeps the suffix, so no stray suffix-less temp file is left behind */
if ((fd = mkstemps(filename, (int)strlen(suffix))) == -1)
err(1, "mkstemps");
close(fd);
return filename;
}
static int
compile_shared_object(const char *src_file, const char *so_file)
{
char cmd[MAX_LINE];
snprintf(cmd, sizeof(cmd), COMPILE_CMD, so_file, src_file);
printf("[DEBUG] Compiling with command: %s\n", cmd);
return system(cmd);
}
static char *src_file, *so_file;
static void
zap_files(void)
{
if (src_file) {
unlink(src_file);
free(src_file);
src_file = NULL;
}
if (so_file) {
unlink(so_file);
free(so_file);
so_file = NULL;
}
}
/* Signal handlers for Alarm, Segmentation fault, Bus error */
jmp_buf give_up;
static void
sig_alarm(int signo)
{
fprintf(stderr, "Timeout: Execution took too long\n");
longjmp(give_up, 1);
}
static void
sig_mem(int signo)
{
fprintf(stderr, "Memory access error\n");
longjmp(give_up, 1);
}
int
main(void)
{
char *line = NULL; /* getline() requires NULL or a malloc'd buffer */
void *handle;
demo_func_t func;
const char *error;
size_t linelen = 0;
/* Set up cleanup handler */
atexit(zap_files);
while (1) {
printf("Enter one line of C code (for function body):\n"
"(e.g., --> printf(\"Hello, world!\\n\"); <-- )\n"
"Type 'exit(0);' to quit\n");
if (getline(&line, &linelen, stdin) == -1)
err(1, "getline");
/* Create temporary source and shared object files */
src_file = create_temp_file(".c");
so_file = strdup("./temp.so");
printf("[DEBUG] Temporary files: Source = %s, "
"Shared object: %s\n", src_file, so_file);
/* Generate and compile the source */
generate_source(src_file, line);
printf("[DEBUG] Generated source file with user code\n");
if (compile_shared_object(src_file, so_file) != 0) {
fprintf(stderr, "Compilation failed\n");
goto cleanup;
}
printf("[DEBUG] Compilation successful\n");
/* Load the shared object */
handle = dlopen(so_file, RTLD_NOW);
if (handle == NULL) {
fprintf(stderr, "dlopen failed: %s\n", dlerror());
goto cleanup;
}
printf("[DEBUG] Loaded shared object at %p\n", handle);
/* Get the function symbol */
func = (demo_func_t)dlsym(handle, "demo_function");
if ((error = dlerror()) != NULL) {
fprintf(stderr, "dlsym failed: %s\n", error);
dlclose(handle);
goto cleanup;
}
printf("[DEBUG] Retrieved function symbol at %p\n",
(void *)func);
/* Execute the function */
printf("\nExecuting user function:\n");
printf("--------------------\n");
/* Protect against infinite loops and some bad memory access */
if (setjmp(give_up) == 0) {
signal(SIGALRM, sig_alarm);
signal(SIGSEGV, sig_mem);
signal(SIGBUS, sig_mem);
alarm(3);
func();
alarm(0);
signal(SIGALRM, SIG_DFL);
signal(SIGSEGV, SIG_DFL);
signal(SIGBUS, SIG_DFL);
}
printf("--------------------\n");
/* Cleanup */
dlclose(handle);
printf("[DEBUG] Unloaded shared object\n");
cleanup:
zap_files();
printf("[DEBUG] Removed temporary files\n");
}
return 0;
}
Do I need to follow this code in detail?
The main thing is the big picture—that we can load libraries at runtime and use them. The details are just to show how it's done and give a fun demo.
I think it's pretty cool. It's like a REPL for C code!
This whole thing of loading code at runtime is cool, but also kinda dangerous. You're giving the user the power to run arbitrary code in your program! But at least it was our choice to do that.
Well, actually, we can load code into programs that didn't expect it, too…
Fun with LD_PRELOAD
The dynamic linker has a special feature: if we set a special environment variable, it will load libraries before all others. On Linux, the variable is LD_PRELOAD (on a Mac it's DYLD_INSERT_LIBRARIES). This feature means you can override functions in other libraries!
Let's make a library that intercepts calls to time() (and equivalent functions) and always returns the time for New Year's Day 2025:
/* faketime.c */
#include <time.h>
#include <sys/time.h>
/* Always return New Year's 2025! */
#define NEW_YEARS_2025 1735718400 /* midnight PST, in seconds since epoch */
/* redefine time() to our fake time */
time_t
time(time_t *tloc)
{
time_t fake_time = NEW_YEARS_2025;
if (tloc) *tloc = fake_time;
return fake_time;
}
/* and, gettimeofday() is used by some programs */
int
gettimeofday(struct timeval *restrict tv, void *restrict tz)
{
tv->tv_sec = NEW_YEARS_2025;
tv->tv_usec = 0;
return 0;
}
/* and, clock_gettime() is used by other programs */
int
clock_gettime(clockid_t clk_id, struct timespec *tp)
{
tp->tv_sec = NEW_YEARS_2025;
tp->tv_nsec = 0;
return 0;
}
Compile it:
$ gcc -fPIC -shared -o libfaketime.so faketime.c
Now we can make ANY program think it's New Year's Day:
$ date
Mon Nov 4 11:54:23 AM PST 2024
$ LD_PRELOAD=./libfaketime.so date
Wed Jan 1 12:00:00 AM PST 2025
That's both cool and a bit scary. You can change how programs behave without modifying them!
True, but it does have legitimate uses. For example, you can use it to debug programs by intercepting functions and printing when they're called.
Often we want to not only have our replacement function run, but also call the original function. We call this usage interposition, because our code lies between the program and the original library function (interposing itself).
Interposition can be used for many things, including
- Debugging (intercepting functions to print when they're called).
- Testing (simulating different conditions).
- Compatibility layers (making old programs work with new libraries).
But how do we call the original function?
It actually varies. On Linux, you look it up with dlsym using the RTLD_NEXT handle. On a Mac, you use a different approach called swizzling, which does a switcheroo with the original function, so when you appear to call yourself you're actually calling the original function.
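As a rough sketch of the Linux approach, here is a time() wrapper that logs each call and then hands off to the real function found via RTLD_NEXT. The time_calls counter is a made-up name, added only so the effect is easy to observe, and the code assumes glibc, where RTLD_NEXT needs _GNU_SOURCE.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    /* needed for RTLD_NEXT in <dlfcn.h> on glibc */
#endif
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical counter, only here so we can see the wrapper working */
unsigned long time_calls = 0;

/* Our replacement: log the call, then forward to the original time() */
time_t
time(time_t *tloc)
{
    static time_t (*real_time)(time_t *) = NULL;

    if (real_time == NULL) {
        /* Find the next definition of "time" after this object */
        real_time = (time_t (*)(time_t *))dlsym(RTLD_NEXT, "time");
        if (real_time == NULL) {
            const char *msg = dlerror();
            fprintf(stderr, "dlsym: %s\n", msg ? msg : "time not found");
            abort();
        }
    }
    time_calls++;
    fprintf(stderr, "[interpose] time() call #%lu\n", time_calls);
    return real_time(tloc);
}
```

Built as a shared library in the same way as libfaketime.so and loaded with LD_PRELOAD, this should leave a program's notion of the time untouched while reporting on stderr whenever it calls time(). Programs that get the time some other way (say, through clock_gettime) won't trigger it.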
It still all seems a bit sneaky to me…
Modern Unix-like systems have some safeguards, including
- LD_PRELOAD is ignored for setuid programs.
- Only trusted users should be able to set LD_LIBRARY_PATH and LD_PRELOAD.
- System libraries are usually installed in protected directories.
- Macs may refuse to honor DYLD_INSERT_LIBRARIES for signed binaries.
Library Versioning
Meh. This is all well and good, but what happens when you update a library and the new version isn't compatible with old programs? You'd have been better off statically linking!
Actually, that's where library versioning comes in…
Libraries can have multiple versions installed at once:
$ ( cd /lib/x86_64-linux-gnu/ ; ls -l libpcre*so* )
lrwxrwxrwx 1 root root 21 Apr 8 2024 libpcre2-16.so -> libpcre2-16.so.0.11.2
lrwxrwxrwx 1 root root 21 Apr 8 2024 libpcre2-16.so.0 -> libpcre2-16.so.0.11.2
-rw-r--r-- 1 root root 572064 Apr 8 2024 libpcre2-16.so.0.11.2
lrwxrwxrwx 1 root root 21 Apr 8 2024 libpcre2-32.so -> libpcre2-32.so.0.11.2
lrwxrwxrwx 1 root root 21 Apr 8 2024 libpcre2-32.so.0 -> libpcre2-32.so.0.11.2
-rw-r--r-- 1 root root 543392 Apr 8 2024 libpcre2-32.so.0.11.2
lrwxrwxrwx 1 root root 20 Apr 8 2024 libpcre2-8.so -> libpcre2-8.so.0.11.2
lrwxrwxrwx 1 root root 20 Apr 8 2024 libpcre2-8.so.0 -> libpcre2-8.so.0.11.2
-rw-r--r-- 1 root root 625344 Apr 8 2024 libpcre2-8.so.0.11.2
lrwxrwxrwx 1 root root 23 Apr 8 2024 libpcre2-posix.so -> libpcre2-posix.so.3.0.4
lrwxrwxrwx 1 root root 23 Apr 8 2024 libpcre2-posix.so.3 -> libpcre2-posix.so.3.0.4
-rw-r--r-- 1 root root 14568 Apr 8 2024 libpcre2-posix.so.3.0.4
And programs record which version they were built against:
$ objdump -p /bin/grep | grep NEEDED
NEEDED libpcre2-8.so.0
NEEDED libc.so.6
This way,
- Old programs keep working with the version they expect.
- New programs can use new features.
- Security fixes can be applied to all versions.
So that's how Linux manages to update libraries without breaking everything!
Exactly! The dynamic linker handles all the complexity of finding the right version of each library for each program.
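The naming pattern in the listing (a full file name such as libpcre2-8.so.0.11.2 and a shorter name such as libpcre2-8.so.0 that programs ask for) can be sketched in a few lines of C. The function name conventional_soname is invented here, and this only models the naming convention; the name a program actually records comes from a field stored inside the library file, not from parsing its file name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given a file name like "libfoo.so.3.0.4",
 * write the conventional short name "libfoo.so.3" (everything up to
 * and including the major version number) to `out`.
 * Returns 0 on success, -1 if the name has no ".so.<digits>" part
 * or the buffer is too small.
 */
int
conventional_soname(const char *filename, char *out, size_t outlen)
{
    const char *major = strstr(filename, ".so.");
    size_t prefix, len;

    if (major == NULL)
        return -1;                   /* unversioned, e.g. libfoo.so */
    major += strlen(".so.");         /* now at the major version */
    prefix = (size_t)(major - filename);
    len = prefix + strspn(major, "0123456789");
    if (len == prefix || len + 1 > outlen)
        return -1;                   /* no digits, or no room */
    memcpy(out, filename, len);
    out[len] = '\0';
    return 0;
}
```

Under this convention, a change that breaks compatibility bumps the major number and so produces a different short name, which is why old and new versions can sit side by side.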
Symbol Versioning
But what if you only change one function in the library? Do you need a whole new version?
Actually, libraries can version individual functions! This is called symbol versioning.
Let's see symbol versioning in action. Here's a simple example:
/* coolmath.c - Our growing math library */
#include <math.h>
/* Original version of our function */
__asm__(".symver add_positive_v1,add_positive@COOLMATH_1.0");
int
add_positive_v1(int a, int b)
{
return a + b; /* Oops, we forgot to check for positive! */
}
/* New version with proper checking (@@ means also set the default version) */
__asm__(".symver add_positive_v2,add_positive@@COOLMATH_2.0");
int
add_positive_v2(int a, int b)
{
if (a < 0 || b < 0) {
return 0;
}
return a + b;
}
We also need a version script (coolmath.map),
COOLMATH_1.0 {
global:
add_positive;
local:
*;
};
COOLMATH_2.0 {
global:
add_positive;
} COOLMATH_1.0;
Hay! What does local: *; do?
It says that any functions not explicitly listed as global are private to the library. Using it is good practice to avoid accidentally exposing internal functions, even if we forgot to declare them as static.
Compile it:
$ gcc -fPIC -shared -Wl,--version-script=coolmath.map -o libcoolmath.so coolmath.c
For the header file, we'll declare the function so that it chooses the right version. By default, programs will just want add_positive, but if we specify a version, it'll use that one:
/* coolmath.h */
#ifndef COOLMATH1_H
#define COOLMATH1_H
#define STR_HELPER(x) #x
#define STR(x) STR_HELPER(x) /* Stringify the version */
#ifdef COOLMATH_VERSION
__asm__(".symver add_positive,add_positive@COOLMATH_" STR(COOLMATH_VERSION));
#endif
int add_positive(int a, int b);
#endif
So now we have two versions of add_positive in one library?
Exactly! Let's see how different programs use them:
Here's our test program, mathtest.c:
#include <stdio.h>
#include "coolmath.h"
int
main() {
printf("3 + (-2) = %d\n", add_positive(3, -2));
return 0;
}
We'll compile two versions of the executable: one uses the current version of the library, and the other uses the old version:
$ gcc -o newmathtest mathtest.c -L. -lcoolmath
$ gcc -DCOOLMATH_VERSION=1.0 -o oldmathtest mathtest.c -L. -lcoolmath
$ LD_LIBRARY_PATH=. ./oldmathtest
3 + (-2) = 1
$ LD_LIBRARY_PATH=. ./newmathtest
3 + (-2) = 0
So the same function name in the same library gives different results depending on when the program was compiled?
Right! The dynamic linker knows which version each program expects and calls the right one.
That seems complex. Why go to all this trouble?
It lets library authors fix bugs and add features while still keeping compatibility with old programs.
We can see the version information with objdump:
$ objdump -T libcoolmath.so | grep add_positive
0000000000001111 g DF .text 000000000000002b (COOLMATH_2.0) add_positive
00000000000010f9 g DF .text 0000000000000018 (COOLMATH_1.0) add_positive
$ objdump -T oldmathtest | grep add_positive
0000000000000000 DF *UND* 0000000000000000 (COOLMATH_1.0) add_positive
$ objdump -T newmathtest | grep add_positive
0000000000000000 DF *UND* 0000000000000000 (COOLMATH_2.0) add_positive
Notice that even though we never said which version to use when we compiled newmathtest, it has recorded that it needs version 2.0 of the library. If we came out with a version 3.0 of the library, newmathtest would still use version 2.0 until we recompiled it.
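Versions can also be requested by hand at runtime. As a small sketch, glibc provides dlvsym, a GNU extension that works like dlsym but takes a version string as well. The wrapper name lookup_versioned is made up for this example, and the code assumes a glibc-based Linux system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE    /* dlvsym is a GNU extension */
#endif
#include <dlfcn.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the address of `symname` at exactly
 * `version` in `libname`, or NULL if the library, the symbol, or
 * that version of the symbol can't be found.
 */
void *
lookup_versioned(const char *libname, const char *symname,
    const char *version)
{
    void *handle;

    handle = dlopen(libname, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    /*
     * The handle is deliberately left open: closing it could unload
     * the library and leave the returned address dangling.
     */
    return dlvsym(handle, symname, version);
}
```

With the library built above, lookup_versioned("./libcoolmath.so", "add_positive", "COOLMATH_1.0") should hand back the old unchecked function, and asking for "COOLMATH_2.0" should hand back the fixed one.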
In fact, glibc uses this mechanism extensively. Functions like memcpy even have multiple implementations optimized for different CPU features, and the dynamic linker can pick between them at runtime (through a related mechanism called indirect functions)!
That's a bit beyond what we need to cover here, but yes, symbol versioning is a powerful tool!
If you want to know more, here's a good article on the subject of shared libraries and all the things you can do with them.
Meh. I still think static linking is simpler. Worse is better! Are we done with dynamic linking now?
I think we're at a good place to wrap up!