Why your Python script should not modify LD_LIBRARY_PATH

It started with a colleague trying to change LD_LIBRARY_PATH in his Python script before importing a module.

His shell startup environment sets LD_LIBRARY_PATH, and he would rather not change that.

But before loading a specific module that is implemented as a shared library, he wanted LD_LIBRARY_PATH changed so that the module picks up its dependent libraries from elsewhere.

Sure enough, that does not work. But why?

It has to do with how LD_LIBRARY_PATH is used in a typical process.

Here I dig a bit deeper to explain why it does not work.

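Let’s first see where LD_LIBRARY_PATH is initially parsed.
It is parsed by the ELF rtld component (the dynamic loader) in glibc, which hands its value to _dl_init_paths. The following excerpt shows that call in _dl_non_dynamic_init, the code path glibc uses for statically linked programs; in a dynamically linked program the equivalent call is made from dl_main, as the backtrace further down shows: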

void
internal_function
_dl_non_dynamic_init (void)
{
  _dl_main_map.l_origin = _dl_get_origin ();
  _dl_main_map.l_phdr = GL(dl_phdr);
  _dl_main_map.l_phnum = GL(dl_phnum);

  if (HP_SMALL_TIMING_AVAIL)
    HP_TIMING_NOW (_dl_cpuclock_offset);

  _dl_verbose = *(getenv ("LD_WARN") ?: "") == '\0' ? 0 : 1;

  /* Set up the data structures for the system-supplied DSO early,
     so they can influence _dl_init_paths.  */
  setup_vdso (NULL, NULL);

  /* Initialize the data structures for the search paths for shared
     objects.  */
  _dl_init_paths (getenv ("LD_LIBRARY_PATH"));
...

What _dl_init_paths does is fill in a static global variable, env_path_list:

/* This is the decomposed LD_LIBRARY_PATH search path.  */
static struct r_search_path_struct env_path_list attribute_relro;

...
void
internal_function
_dl_init_paths (const char *llp)
{
...
      env_path_list.dirs = (struct r_search_path_elem **)
        malloc ((nllp + 1) * sizeof (struct r_search_path_elem *));
      if (env_path_list.dirs == NULL)
        {
          errstring = N_("cannot create cache for search path");
          goto signal_error;
        }

      (void) fillin_rpath (llp_tmp, env_path_list.dirs, ":;",
                           __libc_enable_secure, "LD_LIBRARY_PATH",
                           NULL, l);
...
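
To make the decomposition concrete, here is a rough Python model of what fillin_rpath does with the value. It is based on my reading of the glibc source rather than on anything glibc documents: split on ':' or ';', treat an empty element as the current directory, normalize every entry to end in exactly one slash, and drop duplicates. The function name split_search_path is made up for illustration, and the model ignores the $ORIGIN-style token expansion and the secure-mode filtering that the real code performs.

```python
import re


def split_search_path(llp):
    """Rough model of how the loader decomposes LD_LIBRARY_PATH.

    Hypothetical helper for illustration only; not a glibc API.
    """
    if not llp:
        # An unset or empty variable contributes no directories.
        return []
    dirs = []
    for entry in re.split(r"[:;]", llp):
        if entry == "":
            # An empty element stands for the current directory.
            entry = "./"
        # Drop trailing slashes (but keep a lone "/"), then add exactly one.
        entry = entry.rstrip("/") or "/"
        if not entry.endswith("/"):
            entry += "/"
        # Each directory is recorded only once.
        if entry not in dirs:
            dirs.append(entry)
    return dirs
```

For the value used in the gdb session later in this post, split_search_path("/var/tmp") yields ['/var/tmp/'], the same slash-terminated string that shows up in the loader's data structure.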

After env_path_list is filled in, it is later used in _dl_map_object:

struct link_map *
internal_function
_dl_map_object (struct link_map *loader, const char *name,
                int type, int trace_mode, int mode, Lmid_t nsid)
{
...
      /* Try the LD_LIBRARY_PATH environment variable.  */
      if (fd == -1 && env_path_list.dirs != (void *) -1)
        fd = open_path (name, namelen, mode, &env_path_list,
                        &realname, &fb,
                        loader ?: GL(dl_ns)[LM_ID_BASE]._ns_loaded,
                        LA_SER_LIBPATH, &found_other_class);

      /* Look at the RUNPATH information for this binary.  */
      if (fd == -1 && loader != NULL
          && cache_rpath (loader, &loader->l_runpath_dirs,
                          DT_RUNPATH, "RUNPATH"))
        fd = open_path (name, namelen, mode,
                        &loader->l_runpath_dirs, &realname, &fb, loader,
                        LA_SER_RUNPATH, &found_other_class);
...

So _dl_map_object first tries to find the object by name in the directories of env_path_list, and only then in the RUNPATH of the loading binary.

This is the lookup order we normally expect when using LD_LIBRARY_PATH.
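
The two steps shown in the excerpt can be modeled in a few lines of Python. This is only a sketch of the ordering, with function names borrowed from the C code for readability; the real _dl_map_object has further steps before and after these two (already-loaded objects, DT_RPATH, the ld.so cache, the default directories) that are left out here.

```python
import os


def open_path(name, dirs):
    """Return the first dir/name that exists as a file, or None."""
    for d in dirs:
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate):
            return candidate
    return None


def map_object(name, env_dirs, runpath_dirs):
    """Model of the two lookups in the excerpt above.

    env_dirs plays the role of env_path_list (fixed at startup);
    runpath_dirs plays the role of the loader's RUNPATH entries.
    """
    # Try the LD_LIBRARY_PATH directories first.
    found = open_path(name, env_dirs)
    # Fall back to RUNPATH only if that failed.
    if found is None:
        found = open_path(name, runpath_dirs)
    return found
```

The point to notice is that env_dirs is an argument captured once: nothing in the lookup goes back to the environment to read LD_LIBRARY_PATH again.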

And _dl_map_object is what dlopen, the API normally used for dynamically loading a shared library, ends up calling through dl_open_worker:

static void
dl_open_worker (void *a)
{
...
  /* Load the named object.  */
  struct link_map *new;
  args->map = new = _dl_map_object (call_map, file, lt_loaded, 0,
                                    mode | __RTLD_CALLMAP, args->nsid);
...

We can try to write a simple program and see where _dl_init_paths is called:

$cat a.C
#include <stdio.h>

int main(void)
{
  printf("hello\n");
  return 0;
}

$g++ -o a a.C

We start the program under gdb and set a breakpoint on _dl_init_paths:

$gdb ./a
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
(gdb) b _dl_init_paths
Function "_dl_init_paths" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_dl_init_paths) pending.
(gdb) r
Starting program: ./a

Breakpoint 1, 0x00007ffff7de2d40 in _dl_init_paths () from /lib64/ld-linux-x86-64.so.2
Missing separate debuginfos, use: debuginfo-install glibc-2.17-106.el7_2.1.x86_64
(gdb) bt
#0  0x00007ffff7de2d40 in _dl_init_paths () from /lib64/ld-linux-x86-64.so.2
#1  0x00007ffff7dde0dd in dl_main () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7df17d5 in _dl_sysdep_start () from /lib64/ld-linux-x86-64.so.2
#3  0x00007ffff7ddfcc1 in _dl_start () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7ddc438 in _start () from /lib64/ld-linux-x86-64.so.2
#5  0x0000000000000001 in ?? ()
#6  0x00007fffffffe550 in ?? ()
#7  0x0000000000000000 in ?? ()
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x00007ffff7ddbae0  0x00007ffff7df627a  Yes (*)     /lib64/ld-linux-x86-64.so.2
(*): Shared library is missing debugging information.

As the stack trace shows, it is called from the interpreter's own _start entry point, and at this time only one library is loaded: /lib64/ld-linux-x86-64.so.2.

This is because the first step in starting a dynamically linked ELF program is normally to load the interpreter, which is responsible for loading the real program.

The interpreter is specified in the .interp section, which we can show using readelf:

$readelf -a ./a|grep -i interp
  [ 1] .interp           PROGBITS         0000000000400238  00000238
  INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
   01     .interp
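
What readelf prints here can also be recovered by hand. The sketch below assumes a 64-bit little-endian ELF file and uses a made-up function name, read_interp. It walks the program headers and returns the contents of the PT_INTERP segment; the field offsets are those of the standard Elf64_Ehdr and Elf64_Phdr layouts.

```python
import struct

PT_INTERP = 3  # program header type that holds the interpreter path


def read_interp(data):
    """Return the PT_INTERP path of a 64-bit little-endian ELF image, or None.

    Hypothetical helper for illustration; `data` is the raw file contents.
    """
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2 or data[5] != 1:
        raise ValueError("only 64-bit little-endian ELF is handled here")
    # Elf64_Ehdr: e_phoff at offset 32, e_phentsize at 54, e_phnum at 56.
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type at 0, p_flags at 4, p_offset at 8, p_filesz at 32.
        p_type, _p_flags, p_offset = struct.unpack_from("<IIQ", data, off)
        (p_filesz,) = struct.unpack_from("<Q", data, off + 32)
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            # The stored string is NUL-terminated.
            return raw.rstrip(b"\0").decode()
    # Statically linked programs have no PT_INTERP segment.
    return None
```

Called as read_interp(open("./a", "rb").read()), it should return the same interpreter path that readelf reports for the test program.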

So the interpreter ld-linux-x86-64.so.2 runs first, reads critical variables such as “LD_LIBRARY_PATH” from the process environment, and initializes its own static variables.

After that it jumps to the entry point recorded in the ELF header, which we can also display via readelf:

$readelf -a ./a|grep -i entry
  Entry point address:               0x400500

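We can show that the function at the entry point is the target program’s _start, which later calls the main function we wrote. In the session below gdb reports _start at 0x400440, presumably because it was captured from a different build of the program; the next session shows it at 0x400500, matching the readelf output: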

(gdb) b _start
Breakpoint 2 at 0x400440
(gdb) cont
Continuing.

Breakpoint 2, 0x0000000000400440 in _start ()
(gdb) bt
#0  0x0000000000400440 in _start ()
(gdb) info shared
From                To                  Syms Read   Shared Object Library
0x00007ffff7ddbae0  0x00007ffff7df627a  Yes (*)     /lib64/ld-linux-x86-64.so.2
0x00007ffff7a393e0  0x00007ffff7b7cba0  Yes (*)     /lib64/libc.so.6
(*): Shared library is missing debugging information.

We can also set the environment variable at the start of a gdb session and examine the loader's internal data structure at _start to show that it is already filled in:

(gdb) set env LD_LIBRARY_PATH /var/tmp
(gdb) b _start
Breakpoint 1 at 0x400500
(gdb) r
Starting program: ./a

Breakpoint 1, 0x0000000000400500 in _start ()
(gdb) x/4xg &env_path_list
0x7ffff7ffcdc0 <env_path_list>: 0x00007ffff7ff9640      0x0000000000000000
0x7ffff7ffcdd0 <__stack_prot>:  0x0000000001000000      0x0000000000000000
(gdb) x/4xg 0x00007ffff7ff9640
0x7ffff7ff9640: 0x00007ffff7ff9650      0x0000000000000000
0x7ffff7ff9650: 0x00007ffff7ff9000      0x00007ffff7df770c
(gdb) x/4xg 0x00007ffff7ff9650
0x7ffff7ff9650: 0x00007ffff7ff9000      0x00007ffff7df770c
0x7ffff7ff9660: 0x0000000000000000      0x00007ffff7ff9688
(gdb) x/s 0x00007ffff7df770c
0x7ffff7df770c: "LD_LIBRARY_PATH"
(gdb) x/s 0x00007ffff7ff9688
0x7ffff7ff9688: "/var/tmp/"

The two strings displayed are the what and dirname members of the structure below; the where member is NULL here:

struct r_search_path_elem
  {
    /* This link is only used in the `all_dirs' member of `r_search_path'.  */
    struct r_search_path_elem *next;

    /* Strings saying where the definition came from.  */
    const char *what;
    const char *where;

    /* Basename for this search path element.  The string must end with
       a slash character.  */
    const char *dirname;
    size_t dirnamelen;

    enum r_dir_status status[0];
  };
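
To see how the raw words from the gdb dump line up with these members, the following sketch overlays the first four pointer-sized fields on the four 8-byte values gdb printed at 0x7ffff7ff9650. The class name SearchPathElemHead is invented for this illustration, and the layout assumes x86-64, where each pointer occupies 8 bytes.

```python
import ctypes
import struct


class SearchPathElemHead(ctypes.LittleEndianStructure):
    """First four pointer-sized members of struct r_search_path_elem.

    Illustrative only: pointers are modeled as 64-bit integers (x86-64).
    """
    _fields_ = [
        ("next", ctypes.c_uint64),
        ("what", ctypes.c_uint64),
        ("where", ctypes.c_uint64),
        ("dirname", ctypes.c_uint64),
    ]


# The four 8-byte words gdb printed starting at 0x7ffff7ff9650.
words = (0x00007ffff7ff9000, 0x00007ffff7df770c,
         0x0000000000000000, 0x00007ffff7ff9688)
elem = SearchPathElemHead.from_buffer_copy(struct.pack("<4Q", *words))

# Print each member next to the address it holds.
for name, _ctype in SearchPathElemHead._fields_:
    print(f"{name:8s} = {getattr(elem, name):#x}")
```

The what member holds the address that x/s resolved to "LD_LIBRARY_PATH", and dirname holds the address that resolved to "/var/tmp/".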

As you can see, the environment variable LD_LIBRARY_PATH has already been consumed well before the python binary even starts to parse the Python script.

So any attempt from within the Python script to change LD_LIBRARY_PATH is like a character in a novel trying to talk the author into changing the storyline.

This is not specific to Python scripts: it is the same for Perl scripts, and even for a C program that tries to manipulate LD_LIBRARY_PATH from within its own code.

ld-linux-x86-64.so.2 only runs when a new program is started, and that gives us the natural solution: simply re-execute the same program after fixing the environment variables.

Here is a sample program skeleton:

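#!/usr/bin/env python3

import os
import sys

# The inherited value we do not want the loader to use.
BAD_PATH = "/var/tmp/test"

if os.environ.get("LD_LIBRARY_PATH") == BAD_PATH:
    print("bad one")
    # Flush now: exec replaces the process image and drops buffered output.
    sys.stdout.flush()
    # Fix the environment, then re-execute this script so that the loader
    # reads the environment again. The script must be executable.
    del os.environ["LD_LIBRARY_PATH"]
    os.execve(os.path.abspath(__file__), sys.argv, os.environ)
else:
    print("good one")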

When we run it in the bad environment it restarts itself; otherwise it proceeds normally:

$echo $LD_LIBRARY_PATH
/var/tmp/test

$./a.py
bad one
good one

$unset LD_LIBRARY_PATH

$./a.py
good one
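
A variation on the same idea, in case re-executing the script itself is not convenient: keep the current process as a thin launcher and start the real work in a child process whose environment has already been fixed. The child goes through the loader at startup like any other new program, so it sees the new value. The sketch below uses only the standard subprocess module; run_with_library_path is a name invented here.

```python
import os
import subprocess


def run_with_library_path(argv, library_path):
    """Run argv in a child process with LD_LIBRARY_PATH set or removed.

    Hypothetical helper: pass library_path=None to drop the variable.
    """
    # Copy the environment so the parent's own os.environ is untouched.
    env = dict(os.environ)
    if library_path is None:
        env.pop("LD_LIBRARY_PATH", None)
    else:
        env["LD_LIBRARY_PATH"] = library_path
    # check=True raises CalledProcessError if the child exits non-zero.
    return subprocess.run(argv, env=env, check=True,
                          capture_output=True, text=True)
```

For example, run_with_library_path([sys.executable, "worker.py"], "/opt/foo/lib") would start a worker script (worker.py and the path are placeholders) that can import the module normally.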

We have also seen that dlopen consults the path list built from LD_LIBRARY_PATH at startup, and dlopen is exactly what Python’s ctypes.CDLL and imp.load_dynamic use for loading a shared library.

So we can imagine that if glibc exposed an API to manipulate env_path_list, and Python provided an interface to it, we could indeed load shared libraries after updating the library search path at run time.
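
Short of such an API, one workaround that stays inside the process is to sidestep the search altogether: a name containing a slash is opened by dlopen directly, without consulting any search path. The sketch below loads each dependency by absolute path before the module that needs it is imported. My understanding is that the loader then reuses the already loaded object when it meets a matching dependency name, but treat that as an assumption to verify on your system. The function name preload_dependencies is invented for this example.

```python
import ctypes
import os


def preload_dependencies(names, directory):
    """dlopen each named library from `directory` by absolute path.

    Hypothetical helper; returns the ctypes handles so they stay referenced.
    """
    handles = []
    base = os.path.abspath(directory)
    for name in names:
        path = os.path.join(base, name)
        if not os.path.isfile(path):
            raise FileNotFoundError(path)
        # The path contains a slash, so no search path is involved.
        handles.append(ctypes.CDLL(path, mode=ctypes.RTLD_GLOBAL))
    return handles
```

A call such as preload_dependencies(["libfoo.so.1"], "/opt/foo/lib") (both names are placeholders) would go right before the import of the module that depends on libfoo.so.1.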

About codywu2010

a programmer