Recently I have been reading about ELF, PIE and PIC, and I realized there is some confusion about them, so I have tried to summarize my understanding here.
a) How to tell if an ELF file is PIC or not?
Strictly speaking, PIC only applies to shared libraries, so first we need to make sure the file is a shared library.
That sounds easy to answer, but the addition of PIE has made it more difficult because, as we shall see soon, a PIE executable is nothing but a shared object and thus shares the same e_type (ET_DYN) in the ELF header.
So if we run the “file” command on both a shared library and a PIE executable, both show up as “shared object”.
$file libhello.so
libhello.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
$file client_pie
client_pie: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
Luckily there is one notable difference: a PIE normally needs the dynamic linker to load it into memory, so it comes with an interpreter.
The “file” command displays “(uses shared libs)” whenever it sees a PT_INTERP segment in the target ELF, which gives us an easy way to identify it.
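To make the check concrete, here is a minimal sketch of the same heuristic in C: look at e_type, then scan the program headers for PT_INTERP. It assumes a 32-bit little-endian ELF image already read into memory on a little-endian Linux host with <elf.h>; the names elf_kind and classify_elf32 are made up for this illustration, and this is not how “file” itself is implemented.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
enum elf_kind {
    ELF_INVALID,          /* not a well-formed ELF32 image          */
    ELF_OTHER,            /* valid ELF32, but e_type is not ET_DYN  */
    ELF_DYN_NO_INTERP,    /* ET_DYN without PT_INTERP (plain lib)   */
    ELF_DYN_WITH_INTERP   /* ET_DYN with PT_INTERP (PIE candidate)  */
};

/* Classify an ELF32 image held in buf[0..len). */
enum elf_kind classify_elf32(const unsigned char *buf, size_t len)
{
    Elf32_Ehdr eh;

    assert(buf != NULL);
    if (len < sizeof eh)
        return ELF_INVALID;
    memcpy(&eh, buf, sizeof eh);              /* avoid unaligned reads */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return ELF_INVALID;
    if (eh.e_type != ET_DYN)
        return ELF_OTHER;
    if (eh.e_phnum != 0 && eh.e_phentsize != sizeof(Elf32_Phdr))
        return ELF_INVALID;

    /* Walk the program header table looking for PT_INTERP. */
    for (size_t i = 0; i < eh.e_phnum; i++) {
        Elf32_Phdr ph;
        size_t off = (size_t)eh.e_phoff + i * sizeof ph;

        if (off > len || len - off < sizeof ph)
            return ELF_INVALID;
        memcpy(&ph, buf + off, sizeof ph);
        if (ph.p_type == PT_INTERP)
            return ELF_DYN_WITH_INTERP;
    }
    return ELF_DYN_NO_INTERP;
}
```

Treat the result as a hint only: a shared library may carry a PT_INTERP segment as well, so this mirrors the heuristic described above rather than giving a definitive answer.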
Now that we know for sure we are looking at an ELF shared library, what is the next step?
Many documents mention that we can check for the existence of TEXTREL, and that seems to work well.
Let’s see why and how that works.
We start with a simple shared library that calls the “puts” function to output a constant string.
#include <stdio.h>
void hello()
{
puts("hello");
}
We compile it with “-fPIC” both on and off, and intentionally build a 32-bit lib with the “-m32” switch (the reason will become obvious later).
$gcc -m32 -o libhello.so -shared hello.c
$gcc -m32 -fPIC -o libhello_pic.so -shared hello.c
$file libhello.so
libhello.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
$file libhello_pic.so
libhello_pic.so: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked, not stripped
$readelf -a libhello.so |grep -i textrel
 0x00000016 (TEXTREL)                    0x0
$readelf -a libhello_pic.so |grep -i textrel
So the “file” command shows no difference, but readelf reveals the existence of “TEXTREL” in the non-PIC lib.
Let’s first take a look at what “TEXTREL” means. Here is what the ELF spec says:
DT_TEXTREL This member’s absence signifies that no relocation entry should cause a modification to
a non-writable segment, as specified by the segment permissions in the program header
table. If this member is present, one or more relocation entries might request
modifications to a non-writable segment, and the dynamic linker can prepare accordingly.
So the key point is that a relocation needs to modify a non-writable (i.e. read-only) segment. Since the segment cannot be written as mapped, the dynamic linker has to patch it while loading the object, typically by making the pages temporarily writable, applying the relocations and then restoring the original protection.
What makes it necessary to modify a non-writable segment in the first place? Let’s move on to the relocation part.
First we dump the relocation entries with the “readelf -r” command:
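The check that “readelf | grep textrel” performs can be sketched in a few lines of C: walk the DT_NULL-terminated dynamic array and look for DT_TEXTREL. The function name dyn_has_textrel is made up for this sketch, and it assumes the 32-bit Elf32_Dyn layout from <elf.h>. It also accepts the DF_TEXTREL bit in DT_FLAGS, which is the other standard way for a linker to record the same fact.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if the DT_NULL-terminated dynamic array announces
 * relocations against a non-writable segment. */
bool dyn_has_textrel(const Elf32_Dyn *dyn)
{
    assert(dyn != NULL);
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_TEXTREL)
            return true;                      /* classic marker */
        if (dyn->d_tag == DT_FLAGS && (dyn->d_un.d_val & DF_TEXTREL))
            return true;                      /* flag-style marker */
    }
    return false;
}
```

In a real tool the array would come from the PT_DYNAMIC segment (or the .dynamic section) of the file being inspected.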
$readelf -r libhello.so

Relocation section '.rel.dyn' at offset 0x2f0 contains 10 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000004ef  00000008 R_386_RELATIVE
00001598  00000008 R_386_RELATIVE
0000159c  00000008 R_386_RELATIVE
000016b4  00000008 R_386_RELATIVE
000004f4  00000602 R_386_PC32        00000000   puts
0000168c  00000106 R_386_GLOB_DAT    00000000   _ITM_deregisterTMClone
00001690  00000406 R_386_GLOB_DAT    00000000   __cxa_finalize
00001694  00000706 R_386_GLOB_DAT    00000000   __gmon_start__
00001698  00000a06 R_386_GLOB_DAT    00000000   _Jv_RegisterClasses
0000169c  00000b06 R_386_GLOB_DAT    00000000   _ITM_registerTMCloneTa

Relocation section '.rel.plt' at offset 0x340 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000016ac  00000407 R_386_JUMP_SLOT   00000000   __cxa_finalize
000016b0  00000707 R_386_JUMP_SLOT   00000000   __gmon_start__
There is a special entry for “puts”, which matches what we did earlier, and it is at offset 0x4f4. Could that be the one? Let’s dump the segments and sections.
$readelf -l libhello.so

Elf file type is DYN (Shared object file)
Entry point 0x3b0
There are 5 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00000000 0x00000000 0x00598 0x00598 R E 0x1000
  LOAD           0x000598 0x00001598 0x00001598 0x00120 0x00124 RW  0x1000
  DYNAMIC        0x0005a4 0x000015a4 0x000015a4 0x000e8 0x000e8 RW  0x4
  GNU_EH_FRAME   0x00051c 0x0000051c 0x0000051c 0x0001c 0x0001c R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00     .hash .dynsym .dynstr .gnu.version .gnu.version_r .rel.dyn .rel.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
   01     .init_array .fini_array .jcr .dynamic .got .got.plt .data .bss
   02     .dynamic
   03     .eh_frame_hdr
   04
So 0x4f4 falls in the first LOAD segment, which covers the range 0x0 to 0x598 and has several sections mapped into it.
And yes, the first LOAD segment has the flags “R E”, which is exactly non-writable (with E meaning executable).
Here is the section list:
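The lookup we just did by eye can be written down as a small C sketch: find the PT_LOAD header that covers an address and check its PF_W flag. The helper names find_load_segment and reloc_hits_readonly are invented for this illustration, and the 32-bit Elf32_Phdr layout from <elf.h> is assumed.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Index of the PT_LOAD header whose memory range covers vaddr, or -1. */
int find_load_segment(const Elf32_Phdr *ph, size_t n, Elf32_Addr vaddr)
{
    assert(ph != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type == PT_LOAD &&
            vaddr >= ph[i].p_vaddr &&
            vaddr - ph[i].p_vaddr < ph[i].p_memsz)
            return (int)i;
    }
    return -1;
}

/* Non-zero if a relocation at vaddr lands in a non-writable LOAD segment,
 * which is the situation DT_TEXTREL warns about. */
int reloc_hits_readonly(const Elf32_Phdr *ph, size_t n, Elf32_Addr vaddr)
{
    int i = find_load_segment(ph, n, vaddr);

    return i >= 0 && !(ph[i].p_flags & PF_W);
}
```

Fed with the two LOAD headers of libhello.so, the “puts” relocation at 0x4f4 lands in segment 0 (read-only), while the R_386_GLOB_DAT entry at 0x168c lands in segment 1 (writable).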
$readelf -S libhello.so
There are 27 section headers, starting at offset 0xcec:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .hash             HASH            000000d4 0000d4 000048 04   A  2   0  4
  [ 2] .dynsym           DYNSYM          0000011c 00011c 0000d0 10   A  3   1  4
  [ 3] .dynstr           STRTAB          000001ec 0001ec 0000b8 00   A  0   0  1
  [ 4] .gnu.version      VERSYM          000002a4 0002a4 00001a 02   A  2   0  2
  [ 5] .gnu.version_r    VERNEED         000002c0 0002c0 000030 00   A  3   1  4
  [ 6] .rel.dyn          REL             000002f0 0002f0 000050 08   A  2   0  4
  [ 7] .rel.plt          REL             00000340 000340 000010 08  AI  2   9  4
  [ 8] .init             PROGBITS        00000350 000350 000023 00  AX  0   0  4
  [ 9] .plt              PROGBITS        00000380 000380 000030 04  AX  0   0 16
  [10] .text             PROGBITS        000003b0 0003b0 00014d 00  AX  0   0 16
  [11] .fini             PROGBITS        00000500 000500 000014 00  AX  0   0  4
  [12] .rodata           PROGBITS        00000514 000514 000006 00   A  0   0  1
  [13] .eh_frame_hdr     PROGBITS        0000051c 00051c 00001c 00   A  0   0  4
  [14] .eh_frame         PROGBITS        00000538 000538 000060 00   A  0   0  4
  [15] .init_array       INIT_ARRAY      00001598 000598 000004 00  WA  0   0  4
  [16] .fini_array       FINI_ARRAY      0000159c 00059c 000004 00  WA  0   0  4
  [17] .jcr              PROGBITS        000015a0 0005a0 000004 00  WA  0   0  4
  [18] .dynamic          DYNAMIC         000015a4 0005a4 0000e8 08  WA  3   0  4
  [19] .got              PROGBITS        0000168c 00068c 000014 04  WA  0   0  4
  [20] .got.plt          PROGBITS        000016a0 0006a0 000014 04  WA  0   0  4
  [21] .data             PROGBITS        000016b4 0006b4 000004 00  WA  0   0  4
  [22] .bss              NOBITS          000016b8 0006b8 000004 00  WA  0   0  1
  [23] .comment          PROGBITS        00000000 0006b8 000011 01  MS  0   0  1
  [24] .shstrtab         STRTAB          00000000 0006c9 0000d9 00      0   0  1
  [25] .symtab           SYMTAB          00000000 0007a4 000370 10     26  43  4
  [26] .strtab           STRTAB          00000000 000b14 0001d7 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)
So 0x4f4 actually falls in the .text section (0x3b0 to 0x4fd), and in ELF terms .text simply means the code that we write.
So the dynamic linker wants to fix the value at offset 0x4f4 before our code can run. What is there at 0x4f4? Let’s dump the code in assembly form.
$objdump -D libhello.so |less
...
000004e5 <hello>:
 4e5:   55                      push   %ebp
 4e6:   89 e5                   mov    %esp,%ebp
 4e8:   83 ec 08                sub    $0x8,%esp
 4eb:   83 ec 0c                sub    $0xc,%esp
 4ee:   68 14 05 00 00          push   $0x514
 4f3:   e8 fc ff ff ff          call   4f4 <hello+0xf>
 4f8:   83 c4 10                add    $0x10,%esp
 4fb:   c9                      leave
 4fc:   c3                      ret
...
So the code at address 0x4f3 tries to call the “puts” function, but the disassembly shows the target as 0x4f4, which is obviously not correct since “puts” must come from libc. The four bytes “fc ff ff ff” after the opcode are just a placeholder (-4), and objdump adds it to the address of the next instruction: 0x4f8 - 4 = 0x4f4.
If we run “ldd” on libhello.so we see that it needs libc.so for the “puts” function, and when it is compiled as a shared library it has absolutely no idea where libc.so is going to be loaded or where “puts” will be located.
That’s why we need a fix here at address 0x4f4 to give it the right target.
Let’s write a simple client program to verify our idea.
void hello();
int main(void)
{
hello();
return 0;
}
Building the client program is trivial; we only need to remember the “-m32” switch so that it is compatible with the 32-bit library.
$gcc -m32 -o client client.c libhello.so
$ldd client
linux-gate.so.1 => (0xf77b6000)
libhello.so => not found
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf75ee000)
/lib/ld-linux.so.2 (0xf77b7000)
$export LD_LIBRARY_PATH=.
$ldd client
linux-gate.so.1 => (0xf7704000)
libhello.so => ./libhello.so (0xf76ff000)
libc.so.6 => /lib/i386-linux-gnu/libc.so.6 (0xf753a000)
/lib/ld-linux.so.2 (0xf7705000)
$./client
hello
We will use gdb to run the client program and see what the value at offset 0x4f4 is at run time.
$gdb ./client
(gdb) b hello
Breakpoint 1 at 0x80483e0
(gdb) r
Starting program: client
Breakpoint 1, 0xf7fd74eb in hello () from ./libhello.so
(gdb) disassemble
Dump of assembler code for function hello:
   0xf7fd74e5 <+0>:     push   %ebp
   0xf7fd74e6 <+1>:     mov    %esp,%ebp
   0xf7fd74e8 <+3>:     sub    $0x8,%esp
=> 0xf7fd74eb <+6>:     sub    $0xc,%esp
   0xf7fd74ee <+9>:     push   $0xf7fd7514
   0xf7fd74f3 <+14>:    call   0xf7e78190 <puts>
   0xf7fd74f8 <+19>:    add    $0x10,%esp
   0xf7fd74fb <+22>:    leave
   0xf7fd74fc <+23>:    ret
End of assembler dump.
(gdb) x/5xb 0xf7fd74f3
0xf7fd74f3 <hello+14>:  0xe8    0x98    0x0c    0xea    0xff
(gdb) x/1xw 0xf7fd74f4
0xf7fd74f4 <hello+15>:  0xffea0c98
(gdb) disassemble 0xf7e78190
Dump of assembler code for function puts:
   0xf7e78190 <+0>:     push   %ebp
   0xf7e78191 <+1>:     push   %edi
...
(gdb) info sharedlibrary
From        To          Syms Read   Shared Object Library
0xf7fdc860  0xf7ff47ac  Yes (*)     /lib/ld-linux.so.2
0xf7fd73b0  0xf7fd74fd  Yes (*)     ./libhello.so
0xf7e29420  0xf7f5b68e  Yes (*)     /lib/i386-linux-gnu/libc.so.6
As can be seen from our gdb session, the four bytes at offset 0x4f4, which map to virtual address 0xf7fd74f4, have been patched so that the call now reaches the real “puts” function in libc.so.
Also notice that the “call” instruction uses a relative offset rather than an absolute address in the 4 bytes following the opcode byte “e8”.
So the value “0xffea0c98” is not the address of “puts” but the offset from the next instruction (0xf7fd74f8) to the real address of “puts” (0xf7e78190).
This also explains why the relocation type is “R_386_PC32”, which is the PC-relative kind of relocation.
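The numbers from the gdb session can be checked with a little arithmetic. The i386 ABI defines R_386_PC32 as S + A - P, where S is the symbol address, A the addend stored in the field and P the address of the field being patched. The sketch below redoes that calculation with the values seen above; the helper names are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value, e.g. the addend stored in the code. */
uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* R_386_PC32: S + A - P, computed modulo 2^32. */
uint32_t reloc_386_pc32(uint32_t S, uint32_t A, uint32_t P)
{
    return S + A - P;
}

/* Target of "call rel32": address of the next instruction plus rel32.
 * The instruction is 5 bytes long (opcode e8 + 4 offset bytes). */
uint32_t call_rel32_target(uint32_t insn_addr, uint32_t rel32)
{
    return insn_addr + 5u + rel32;
}
```

With S = 0xf7e78190 (puts), A = 0xfffffffc (the “fc ff ff ff” placeholder, i.e. -4) and P = 0xf7fd74f4, the result is 0xffea0c98, exactly the word gdb shows at 0xf7fd74f4, and feeding it back into call_rel32_target for the instruction at 0xf7fd74f3 gives 0xf7e78190 again.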
What about the PIC version, how does it solve the issue?
There are lots of articles on this; we know it uses the PLT (Procedure Linkage Table), and the code part always goes through the data part of the module to get the address of the target function.
While the code part of the module is read-only, the data part does not have this limitation. And as you can easily guess, there is a second LOAD segment flagged “RW” that maps those data sections.
Without going into too much detail, let’s quickly check the relocations of the PIC version.
$readelf -r libhello_pic.so

Relocation section '.rel.dyn' at offset 0x2f0 contains 8 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000015ac  00000008 R_386_RELATIVE
000015b0  00000008 R_386_RELATIVE
000016c4  00000008 R_386_RELATIVE
00001698  00000106 R_386_GLOB_DAT    00000000   _ITM_deregisterTMClone
0000169c  00000406 R_386_GLOB_DAT    00000000   __cxa_finalize
000016a0  00000706 R_386_GLOB_DAT    00000000   __gmon_start__
000016a4  00000a06 R_386_GLOB_DAT    00000000   _Jv_RegisterClasses
000016a8  00000b06 R_386_GLOB_DAT    00000000   _ITM_registerTMCloneTa

Relocation section '.rel.plt' at offset 0x330 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
000016b8  00000407 R_386_JUMP_SLOT   00000000   __cxa_finalize
000016bc  00000607 R_386_JUMP_SLOT   00000000   puts
000016c0  00000707 R_386_JUMP_SLOT   00000000   __gmon_start__
Just as we said, “puts” is now handled in a special section “.rel.plt” with relocation type “R_386_JUMP_SLOT”, and if you run readelf to list the segments and sections you can easily see that its offset 0x16bc falls in the second LOAD segment, which is perfectly writable.
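The idea behind the jump slot can be imitated in plain C with a function pointer that lives in writable data. This is only an analogy: the real PLT and GOT entries are generated by the linker and filled in by the dynamic linker, and every name below (got_puts, puts_lazy_stub, hello_pic, resolve_count) is invented for the sketch.

```c
#include <assert.h>
#include <stdio.h>

typedef int (*puts_fn)(const char *);

static int resolve_count;                 /* how often "resolution" ran */
static int puts_lazy_stub(const char *s);

/* Writable slot, playing the role of the .got.plt entry for puts.
 * It initially points at a stub, like a not-yet-resolved jump slot. */
static puts_fn got_puts = puts_lazy_stub;

/* Plays the role of the dynamic linker's lazy resolver: it patches
 * the data slot (never the code) and then forwards the call. */
static int puts_lazy_stub(const char *s)
{
    resolve_count++;
    got_puts = puts;
    return got_puts(s);
}

/* Stands in for the PIC hello(): its code never changes, it only
 * reads the slot and calls through it. */
int hello_pic(void)
{
    assert(got_puts != NULL);
    return got_puts("hello");
}
```

Calling hello_pic() twice prints “hello” twice but runs the stub only once; after the first call the slot holds the address of puts. The point is that the only memory that gets written is data, so the code pages can stay read-only and shared.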
b) Why can’t I create a non-PIC shared library on an x64 box?
In the previous section I intentionally added the “-m32” switch to force a 32-bit binary.
If you try without it, you will instantly get an error.
$gcc -o libhello_64.so -shared hello.c
/usr/bin/ld: /tmp/ccE058CB.o: relocation R_X86_64_32 against `.rodata' can not be used when making a shared object; recompile with -fPIC
/tmp/ccE058CB.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
However, if you add “-fPIC” it works.
$gcc -fPIC -o libhello_64_pic.so -shared hello.c
$file libhello_64_pic.so
libhello_64_pic.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
This again has to do with relocations, like the “R_386_PC32” we talked about in the previous section, only this time the type is “R_X86_64_32”: an x86-64 relocation whose trailing “32” means a 32-bit field. Note that the linker complains about `.rodata', so here it is the address of the string “hello”, not of “puts”, that is being relocated.
Since a 64-bit shared object can be loaded anywhere in the 64-bit address space, a 32-bit field may not be big enough to hold the final address.
To fix this we simply request a 64-bit field for the relocation and, as you can guess, its relocation type is “R_X86_64_64”.
We can enable that with the “-mcmodel=large” switch, which selects the large memory model. Here is how the gcc man page explains it:
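Why the linker refuses can be shown with a tiny sketch: a 32-bit field simply cannot hold every address a 64-bit shared object may end up at, while a 64-bit field always can. The helpers store_abs32 and store_abs64 are made up for this illustration, and the load addresses used in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Try to store value in a 4-byte little-endian field, the way a
 * 32-bit absolute relocation would. Fails if the value does not fit. */
bool store_abs32(unsigned char field[4], uint64_t value)
{
    assert(field != NULL);
    if (value > UINT32_MAX)
        return false;                 /* the linker's "can not be used" case */
    for (int i = 0; i < 4; i++)
        field[i] = (unsigned char)(value >> (8 * i));
    return true;
}

/* An 8-byte field, as used by a 64-bit absolute relocation, always fits. */
void store_abs64(unsigned char field[8], uint64_t value)
{
    assert(field != NULL);
    for (int i = 0; i < 8; i++)
        field[i] = (unsigned char)(value >> (8 * i));
}
```

For example, with a hypothetical load base of 0x7f3a12345000 the final address 0x7f3a12345685 is rejected by store_abs32 but accepted by store_abs64, while a low address such as 0x400685 fits in either field.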
-mcmodel=large
Generate code for the large code model. This makes no assumptions about addresses and sizes of sections. Pointers are 64 bits. Programs can be statically linked only.
Compare this with the default model we have
-mcmodel=small
Generate code for the small code model. The program and its statically defined symbols must be within 4GB of each other. Pointers are 64 bits. Programs can be statically or
dynamically linked. This is the default code model.
When we apply this we get:
$gcc -mcmodel=large -o libhello_64.so -shared hello.c
$file libhello_64.so
libhello_64.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
$readelf -r libhello_64.so

Relocation section '.rela.dyn' at offset 0x3e8 contains 10 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000666  000000000008 R_X86_64_RELATIVE                    685
000000200710  000000000008 R_X86_64_RELATIVE                    630
000000200718  000000000008 R_X86_64_RELATIVE                    5f0
000000200948  000000000008 R_X86_64_RELATIVE                    200948
000000000670  000300000001 R_X86_64_64       0000000000000000 puts + 0
0000002008f8  000200000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTMClone + 0
000000200900  000700000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000200908  000a00000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
000000200910  000b00000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCloneTa + 0
000000200918  000c00000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0

Relocation section '.rela.plt' at offset 0x4d8 contains 2 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000200938  000700000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
000000200940  000c00000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_finalize + 0
And again we see the familiar relocation for “puts”, this time in the form of “R_X86_64_64”!
(Note, however, that -mcmodel=large does not seem to be supported by all versions, so some old gcc may not have it; again, check your gcc man page.)
And here is a good reference article expanding on this bit: http://eli.thegreenplace.net/2011/11/11/position-independent-code-pic-in-shared-libraries-on-x64/
c) How to tell if an ELF executable is PIE or not?
First let’s see how to build a PIE executable.
Articles on the internet seem to talk about the “-fPIE” switch a lot, but that only deals with the compiling stage. To create a PIE properly you also need to pass the instruction on to the linker, hence another switch is needed as well: “-pie”.
We will talk about -fPIE and -pie in the next section.
Now let’s build both the pie version and the non-pie version.
$gcc -o client_64 client.c libhello_64.so
$gcc -fPIE -pie -o client_64_pie client.c libhello_64.so
$file client_64
client_64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
$file client_64_pie
client_64_pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
You may instantly notice the difference in the “file” output, and that’s exactly how we tell the two apart: pie is linked as a shared object while non-pie is linked as an executable.
We already know a pic shared library can be freely relocated to any address, so making a pie executable a shared object is a very straightforward choice.
As the name suggests, pie can be loaded at any address, not just one fixed at link time.
We can verify that with the “size” command.
For a non-pie executable, the addresses of the sections are fixed relative to its base address, by default around 0x400000 (4194304 in decimal, which is how “size” prints it).
$size --format=sysv client_64
client_64  :
section            size      addr
.interp              28   4194816
.note.ABI-tag        32   4194844
.hash                68   4194880
.dynsym             288   4194952
.dynstr             187   4195240
.gnu.version         24   4195428
.gnu.version_r       32   4195456
.rela.dyn            24   4195488
.rela.plt            72   4195512
.init                26   4195584
.plt                 64   4195616
.text               386   4195680
.fini                 9   4196068
.rodata               4   4196080
.eh_frame_hdr        52   4196084
.eh_frame           244   4196136
.init_array           8   6293536
.fini_array           8   6293544
.jcr                  8   6293552
.dynamic            480   6293560
.got                  8   6294040
.got.plt             48   6294048
.data                16   6294096
.bss                  8   6294112
.comment             53         0
Total              2177
Compare this with the pie executable, which is always linked against a base address of 0.
$size --format=sysv client_64_pie
client_64_pie  :
section            size      addr
.interp              28       512
.note.ABI-tag        32       540
.hash                76       576
.dynsym             336       656
.dynstr             202       992
.gnu.version         28      1194
.gnu.version_r       32      1224
.rela.dyn           192      1256
.rela.plt            96      1448
.init                26      1544
.plt                 80      1584
.text               450      1664
.fini                 9      2116
.rodata               4      2128
.eh_frame_hdr        52      2132
.eh_frame           244      2184
.init_array           8   2099584
.fini_array           8   2099592
.jcr                  8   2099600
.dynamic            480   2099608
.got                 40   2100088
.got.plt             56   2100128
.data                16   2100184
.bss                  8   2100200
.comment             53         0
Total              2564
So a pie executable can be loaded at any address, and on a system with address space layout randomization (ASLR) enabled this is done intentionally to enhance security.
Let’s change our client program a little bit to show that this is indeed the case.
#include <stdio.h>
void hello();
static int value = 0;
int main(void)
{
hello();
printf("value at 0x%x\n", &value);
return 0;
}
Now we run the two versions multiple times.
The non-pie version always gives the same address:
$./client_64
hello
value at 0x600adc
$./client_64
hello
value at 0x600adc
$./client_64
hello
value at 0x600adc
...
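The pie version, on the other hand, gives a different address every time (only the low 32 bits are shown here, because the program prints the pointer with “%x”):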
$./client_64_pie
hello
value at 0x46714c54
$./client_64_pie
hello
value at 0x9954cc54
$./client_64_pie
hello
value at 0xdd92c54
...
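One detail about the output above: the client prints a pointer with “%x”, which on x86-64 in practice shows only the low 32 bits, so the pie addresses look shorter than they really are. A minimal sketch of printing the full address with “%p” next to the truncated value follows; the helper names are made up for this illustration.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* What a 32-bit conversion such as "%x" effectively keeps of a pointer. */
uint32_t low32_of_pointer(const void *p)
{
    return (uint32_t)(uintptr_t)p;
}

/* Print the full address with %p and the truncated value for comparison. */
void print_address(const char *label, const void *p)
{
    assert(label != NULL);
    printf("%s at %p (low 32 bits: 0x%" PRIx32 ")\n",
           label, (void *)p, low32_of_pointer(p));
}
```

Calling print_address("value", &value) in the client instead of the printf with “%x” would show the whole 64-bit address; with ASLR the upper bits change between runs as well.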
d) Why do I need “-pie” in addition to “-fPIE” when building a pie executable?
We know -fPIC is enough for building a shared library, so why do we need -pie as well for a pie executable?
As mentioned in the previous section, “-fPIE” is for the compiler while “-pie” is for the linker.
We can ask gcc for more verbose output with “-v” to make this point clearer.
$gcc -v -fPIE -pie -o client_aa client.c libhello_64.so
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ./configure
Thread model: posix
gcc version 4.9.2 (GCC)
COLLECT_GCC_OPTIONS='-v' '-fPIE' '-pie' '-o' 'client_aa' '-mtune=generic' '-march=x86-64'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/cc1 -quiet -v -imultiarch x86_64-linux-gnu client.c -quiet -dumpbase client.c -mtune=generic -march=x86-64 -auxbase client -version -fPIE -o /tmp/ccgtrDhr.s
...
COLLECT_GCC_OPTIONS='-v' '-fPIE' '-pie' '-o' 'client_aa' '-mtune=generic' '-march=x86-64'
 as -v --64 -o /tmp/ccRxf5SA.o /tmp/ccgtrDhr.s
GNU assembler version 2.24.90 (x86_64-linux-gnu) using BFD version (GNU Binutils for Ubuntu) 2.24.90.20141104
COMPILER_PATH=/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/:/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/:/usr/local/libexec/gcc/x86_64-unknown-linux-gnu/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/
LIBRARY_PATH=/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64/:/lib/x86_64-linux-gnu/:/lib/../lib64/:/usr/lib/x86_64-linux-gnu/:/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-v' '-fPIE' '-pie' '-o' 'client_aa' '-mtune=generic' '-march=x86-64'
 /usr/local/libexec/gcc/x86_64-unknown-linux-gnu/4.9.2/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -pie -o client_aa /usr/lib/x86_64-linux-gnu/Scrt1.o /usr/lib/x86_64-linux-gnu/crti.o /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/crtbeginS.o -L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../../../lib64 -L/lib/x86_64-linux-gnu -L/lib/../lib64 -L/usr/lib/x86_64-linux-gnu -L/usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/../../.. /tmp/ccRxf5SA.o libhello_64.so -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/local/lib/gcc/x86_64-unknown-linux-gnu/4.9.2/crtendS.o /usr/lib/x86_64-linux-gnu/crtn.o
As you can see, “-fPIE” is given to the compiler proper “cc1” and “-pie” is passed to the linker front end “collect2”.
So if we only provide -fPIE, the result is not built as a pie executable, as shown below.
$gcc -fPIE -o client_no_pie client.c libhello_64.so
$file client_no_pie
client_no_pie: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
What if we pass -pie but not -fPIE?
In this case the link fails, but it still works as long as we add -mcmodel=large explicitly.
$gcc -pie -o client_no_fPIE client.c libhello_64.so
/usr/bin/ld: /tmp/ccTXKNOl.o: relocation R_X86_64_32 against `.bss' can not be used when making a shared object; recompile with -fPIC
/tmp/ccTXKNOl.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
$gcc -mcmodel=large -pie -o client_no_fPIE client.c libhello_64.so
$file client_no_fPIE
client_no_fPIE: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
The gcc man page suggests the same, namely that -fPIE can be replaced with a model suboption:
-pie
Produce a position independent executable on targets that support it. For predictable results, you must also specify the same set of options used for compilation (-fpie,
-fPIE, or model suboptions) when you specify this linker option.
e) Does it matter if a pie executable links with a non-pic shared library?
pie is really about the executable itself, so whether it links with a pic or non-pic shared library does not really matter.
Linking with a non-PIC shared library simply means the library’s code has to be patched when it is loaded, and the patched pages can no longer be shared between processes that load the same lib.
Let’s do a simple experiment with a revised hello.c:
#include <stdio.h>
static int hello_v = 0;
void hello()
{
puts("hello");
printf("hello_v at 0x%x\n", &hello_v);
}
We then compile both the pic and the non-pic version:
$gcc -mcmodel=large -o libhello_64.so hello.c -shared
$gcc -mcmodel=large -o libhello_64_pic.so hello.c -shared -fPIC
$readelf -a libhello_64.so |grep -i textrel
 0x0000000000000016 (TEXTREL)            0x0
$readelf -a libhello_64_pic.so |grep -i textrel
Then we build two pie executables with them:
$gcc -o client_64_pie_pic -fPIE -pie client.c libhello_64_pic.so
$gcc -o client_64_pie -fPIE -pie client.c libhello_64.so
$file client_64_pie
client_64_pie: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
$file client_64_pie_pic
client_64_pie_pic: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.24, not stripped
If we run the two executables multiple times, we can see that both printed addresses change with every invocation.
$./client_64_pie
hello
hello_v at 0x928009ec
value at 0x92c26c54
$./client_64_pie
hello
hello_v at 0xb418b9ec
value at 0xb45b1c54
$./client_64_pie
hello
hello_v at 0x991fc9ec
value at 0x99622c54
$./client_64_pie_pic
hello
hello_v at 0x1af9aa04
value at 0x1b3c0c64
$./client_64_pie_pic
hello
hello_v at 0x24db7a04
value at 0x251ddc64
$./client_64_pie_pic
hello
hello_v at 0xa0bf4a04
value at 0xa101ac64
f) Where is the magic that gets a pie executable loaded at a different address?
The magic happens in the dynamic linker, in the way it calls mmap. Let’s take a look at the code in glibc/glibc-2.16.0/elf/dl-load.c:
struct link_map *
_dl_map_object_from_fd (const char *name, int fd, struct filebuf *fbp,
                        char *realname, struct link_map *loader, int l_type,
                        int mode, void **stack_endp, Lmid_t nsid)
{
  ...
  if (__builtin_expect (type, ET_DYN) == ET_DYN)
    {
      /* This is a position-independent shared object.  We can let the
         kernel map it anywhere it likes, but we must have space for all
         the segments in their specified positions relative to the first.
         So we map the first segment without MAP_FIXED, but with its
         extent increased to cover all the segments.  Then we remove
         access from excess portion, and there is known sufficient space
         there to remap from the later segments.

         As a refinement, sometimes we have an address that we would
         prefer to map such objects at; but this is only a preference,
         the OS can do whatever it likes. */
      ElfW(Addr) mappref;
      mappref = (ELF_PREFERRED_ADDRESS (loader, maplength,
                                        c->mapstart & GLRO(dl_use_load_bias))
                 - MAP_BASE_ADDR (l));

      /* Remember which part of the address space this object uses.  */
      l->l_map_start = (ElfW(Addr)) __mmap ((void *) mappref, maplength,
                                            c->prot,
                                            MAP_COPY|MAP_FILE,
                                            fd, c->mapoff);
      ...
    }

  /* This object is loaded at a fixed address.  This must never
     happen for objects loaded with dlopen().  */
  if (__builtin_expect ((mode & __RTLD_OPENEXEC) == 0, 0))
    {
      errstring = N_("cannot dynamically load executable");
      goto call_lose;
    }

  /* Notify ELF_PREFERRED_ADDRESS that we have to load this one
     fixed.  */
  ELF_FIXED_ADDRESS (loader, c->mapstart);

  /* Remember which part of the address space this object uses.  */
  l->l_map_start = c->mapstart + l->l_addr;
  l->l_map_end = l->l_map_start + maplength;
  l->l_contiguous = !has_holes;

  while (c < &loadcmds[nloadcmds])
    {
      if (c->mapend > c->mapstart
          /* Map the segment contents from the file.  */
          && (__mmap ((void *) (l->l_addr + c->mapstart),
                      c->mapend - c->mapstart, c->prot,
                      MAP_FIXED|MAP_COPY|MAP_FILE,
                      fd, c->mapoff)
              == MAP_FAILED))
        ...
Evidently, for any type that is not ET_DYN (i.e. not a shared object), dl-load calls mmap with MAP_FIXED, which requests loading at a fixed address, while for a shared object it lets the kernel pick an address at will.
The same logic applies to shared libs and pie executables alike, since under the covers they are the same kind of object. (When a pie executable is started directly, it is the kernel’s ELF loader rather than dl-load that maps it, but it likewise chooses the base address for ET_DYN objects.)
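The two-step technique described in the glibc comment (reserve the whole extent wherever the kernel likes, then place the later segments at fixed offsets inside it) can be tried out with anonymous mappings. This is only a sketch for Linux, not glibc’s code: it uses anonymous memory instead of a file, and the helper names page_size, reserve_region and place_segment are made up for this illustration.

```c
#define _DEFAULT_SOURCE            /* for MAP_ANONYMOUS under strict -std= */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Page size of the running system. */
size_t page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);

    assert(ps > 0);
    return (size_t)ps;
}

/* Step 1: reserve `total` bytes without MAP_FIXED, so the kernel
 * chooses the base address (this is where ASLR comes in). */
void *reserve_region(size_t total)
{
    void *base = mmap(NULL, total, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return base == MAP_FAILED ? NULL : base;
}

/* Step 2: map a "segment" at a fixed offset inside the reservation.
 * MAP_FIXED is safe here because the range is already ours. */
void *place_segment(void *base, size_t offset, size_t len, int prot)
{
    void *want = (char *)base + offset;
    void *got = mmap(want, len, prot,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    return got == MAP_FAILED ? NULL : got;
}
```

For instance, reserving four pages and then placing a read-write segment two pages in always yields base + 2 * page_size(), whatever base the kernel picked; only the base itself differs from run to run.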
(more to be added)
