Introduction
ELF (Executable and Linking Format) is the standard binary format on Linux. As its name suggests, executable and linkable files on Linux are ELF files. An ELF file consists of an ELF header, followed by a program header table, a section header table, or both. These two tables describe the rest of the file.
The header file <elf.h> defines the format of ELF files and related C structures.
Top-View
We take Elf32 as an example. Its ELF header looks like this:
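typedef struct
{
  unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
  Elf32_Half    e_type;             /* Object file type */
  Elf32_Half    e_machine;          /* Architecture */
  Elf32_Word    e_version;          /* Object file version */
  Elf32_Addr    e_entry;            /* Entry point virtual address */
  Elf32_Off     e_phoff;            /* Program header table file offset */
  Elf32_Off     e_shoff;            /* Section header table file offset */
  Elf32_Word    e_flags;            /* Processor-specific flags */
  Elf32_Half    e_ehsize;           /* ELF header size in bytes */
  Elf32_Half    e_phentsize;        /* Program header table entry size */
  Elf32_Half    e_phnum;            /* Program header table entry count */
  Elf32_Half    e_shentsize;        /* Section header table entry size */
  Elf32_Half    e_shnum;            /* Section header table entry count */
  Elf32_Half    e_shstrndx;         /* Section header string table index */
} Elf32_Ehdr;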
e_shoff gives the offset of the section header table from the beginning of the file. The table consists of consecutive section headers, one per section.
e_phoff gives the offset of the program header table from the beginning of the file.
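As a rough illustration of how these offsets are obtained, the sketch below copies the ELF header out of a file image that is already in memory and checks the magic bytes and the class before e_shoff or e_phoff are trusted. The function name read_elf32_header is hypothetical (it is not part of <elf.h>), and the sketch assumes the file's byte order matches the host's.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of <elf.h>): copy the ELF32 header out of a
 * file image that is already in memory.
 * Returns 0 on success, -1 if the image is too short, lacks the ELF magic,
 * or is not a 32-bit ELF file.
 * Assumption: the file's byte order matches the host's. */
int read_elf32_header(const unsigned char *image, size_t len, Elf32_Ehdr *out)
{
    assert(image != NULL && out != NULL);

    if (len < sizeof(Elf32_Ehdr))
        return -1;
    if (memcmp(image, ELFMAG, SELFMAG) != 0)    /* magic bytes "\177ELF" */
        return -1;
    if (image[EI_CLASS] != ELFCLASS32)          /* 32-bit objects only */
        return -1;

    memcpy(out, image, sizeof(*out));           /* memcpy avoids unaligned reads */
    return 0;
}
```

After a successful call, out->e_shoff and out->e_phoff hold the two table offsets described above.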
Section Header
A file's section header table lets one locate all the file's sections. e_shoff gives the position of the table, and e_shnum holds the number of entries it contains.
A section header table index is a subscript into this array. Some section header table indices are reserved: the initial entry and the indices between SHN_LORESERVE and SHN_HIRESERVE. The initial entry is used in ELF extensions for e_phnum, e_shnum, and e_shstrndx; in other cases, each field in the initial entry is set to zero. An object file does not have sections for these special indices. For details about them, see man 5 elf.
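The idea of an index being a subscript into the table can be sketched as follows: entry i starts at e_shoff + i * e_shentsize. The helper name get_section_header is hypothetical, Elf32_Shdr is the section header structure from <elf.h>, and the sketch assumes host byte order and that e_shnum is the real entry count (the extension case that uses the initial entry is not handled).

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: fetch entry `index` of the section header table.
 * `image`/`len` describe the whole file in memory, `eh` is its ELF header.
 * Returns 0 on success, -1 if the index or the entry lies out of bounds.
 * Assumptions: host byte order; e_shnum is the real entry count. */
int get_section_header(const unsigned char *image, size_t len,
                       const Elf32_Ehdr *eh, size_t index, Elf32_Shdr *out)
{
    assert(image != NULL && eh != NULL && out != NULL);

    if (eh->e_shoff == 0 || index >= eh->e_shnum)
        return -1;                      /* no table, or bad subscript */
    if (eh->e_shentsize < sizeof(Elf32_Shdr))
        return -1;                      /* entries too small to hold a header */

    /* Entry i starts at e_shoff + i * e_shentsize; 64-bit math avoids overflow. */
    uint64_t off = (uint64_t)eh->e_shoff + (uint64_t)index * eh->e_shentsize;
    if (off + sizeof(Elf32_Shdr) > len)
        return -1;

    memcpy(out, image + (size_t)off, sizeof(*out));
    return 0;
}
```

Copying the entry out with memcpy, instead of casting a pointer into the image, sidesteps alignment problems when the table offset is not suitably aligned.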
The section header has the following structure:
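typedef struct
{
  Elf32_Word sh_name;      /* Section name (string tbl index) */
  Elf32_Word sh_type;      /* Section type */
  Elf32_Word sh_flags;     /* Section flags */
  Elf32_Addr sh_addr;      /* Section virtual addr at execution */
  Elf32_Off  sh_offset;    /* Section file offset */
  Elf32_Word sh_size;      /* Section size in bytes */
  Elf32_Word sh_link;      /* Link to another section */
  Elf32_Word sh_info;      /* Additional section information */
  Elf32_Word sh_addralign; /* Section alignment */
  Elf32_Word sh_entsize;   /* Entry size if section holds table */
} Elf32_Shdr;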
sh_name: the section's name, given as a byte offset into the section header string table (the section whose index is e_shstrndx).
sh_type: the section type. The values I'm interested in here are:
- SHT_NULL: Marks the section header as inactive.
- SHT_SYMTAB: Symbol Table, for link editing and dynamic linking.
- SHT_DYNSYM: Dynamic Symbol Table, holds a minimal set of dynamic linking symbols.
- SHT_STRTAB: String Table. An object file may have multiple string sections.
sh_offset: like the offsets above, this is the offset of the section's contents from the beginning of the file.
sh_link: holds a section header table index link, whose interpretation depends on the section type. For a symbol table, it is the section index of the string table that holds the symbol names.
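To make the roles of sh_type and sh_link concrete, here is a minimal sketch that scans section headers already loaded into an array, finds the symbol table, and follows sh_link to its string table. Both function names are hypothetical, and loading the array is assumed to have happened elsewhere.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the index of the first section whose sh_type
 * matches `type`, or -1 if there is none. */
int find_section_by_type(const Elf32_Shdr *shdrs, size_t count, Elf32_Word type)
{
    assert(shdrs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (shdrs[i].sh_type == type)
            return (int)i;
    }
    return -1;
}

/* Hypothetical helper: for the symbol table section at `symtab_idx`, return
 * the index of the string table holding its symbol names (its sh_link),
 * or -1 if sh_link does not name a valid SHT_STRTAB section. */
int symtab_string_table(const Elf32_Shdr *shdrs, size_t count, size_t symtab_idx)
{
    assert(shdrs != NULL);

    if (symtab_idx >= count)
        return -1;
    Elf32_Word link = shdrs[symtab_idx].sh_link;
    if (link >= count || shdrs[link].sh_type != SHT_STRTAB)
        return -1;
    return (int)link;
}
```

A caller would typically pass SHT_SYMTAB (or SHT_DYNSYM) to the first function and feed the result to the second.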
ELF Symbol Table
The ELF symbol table consists of consecutive entries of equal size.
Each symbol table entry has the following structure:
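typedef struct
{
  Elf32_Word    st_name;  /* Symbol name (string tbl index) */
  Elf32_Addr    st_value; /* Symbol value */
  Elf32_Word    st_size;  /* Symbol size */
  unsigned char st_info;  /* Symbol type and binding */
  unsigned char st_other; /* Symbol visibility */
  Elf32_Section st_shndx; /* Section index */
} Elf32_Sym;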
As the comment shows, st_name is an index into the string table; more precisely, it is a byte offset into that table. The section index of the string table is held in the symbol table section's sh_link. With both, we can easily get the name of a function or variable.
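Putting st_name and the linked string table together, a symbol lookup by name might look like the sketch below. The helper names symbol_name and find_symbol are hypothetical; the symbol array and the string table contents are assumed to be loaded in memory already, with the entry count computed as sh_size / sh_entsize of the symbol table section.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the name of `sym`, or NULL if st_name falls
 * outside the string table or the name is not NUL-terminated within it.
 * `strtab`/`strtab_size` are the contents of the section named by the
 * symbol table's sh_link. */
const char *symbol_name(const Elf32_Sym *sym, const char *strtab,
                        size_t strtab_size)
{
    assert(sym != NULL && strtab != NULL);

    if (sym->st_name >= strtab_size)
        return NULL;
    if (memchr(strtab + sym->st_name, '\0', strtab_size - sym->st_name) == NULL)
        return NULL;
    return strtab + sym->st_name;
}

/* Hypothetical helper: linear search of a symbol table for `name`.
 * Returns the matching entry, or NULL if no symbol has that name. */
const Elf32_Sym *find_symbol(const Elf32_Sym *syms, size_t nsyms,
                             const char *strtab, size_t strtab_size,
                             const char *name)
{
    assert(name != NULL);

    for (size_t i = 0; i < nsyms; i++) {
        const char *n = symbol_name(&syms[i], strtab, strtab_size);
        if (n != NULL && strcmp(n, name) == 0)
            return &syms[i];
    }
    return NULL;
}
```

The bounds checks matter in practice, since a damaged or hostile file can carry an st_name that points past the end of the string table.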
st_info: consists of two fields, binding and type. We focus on the type here, which can be extracted with ELF32_ST_TYPE(info).
The main symbol types are:
- STT_OBJECT: A data object (such as a C variable).
- STT_FUNC: A function or other executable code.
- STT_SECTION: A section, for relocation.
- STT_FILE: The name of the source file.
The binding, which can be extracted with ELF32_ST_BIND(info), mainly includes:
- STB_LOCAL: Local symbols are not visible outside the object file containing their definition. Local symbols of the same name may exist in multiple files without interfering with each other.
- STB_GLOBAL: Global symbols are visible to all object files being combined. One file's definition of a global symbol will satisfy another file's undefined reference to the same symbol.
- STB_WEAK: Weak symbols resemble global symbols, but their definitions have lower precedence.
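The two halves of st_info can be decoded with the ELF32_ST_BIND and ELF32_ST_TYPE macros from <elf.h>. The sketch below maps them to printable names; the function names and the output format are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: printable name for the type half of st_info. */
const char *symbol_type_name(unsigned char st_info)
{
    switch (ELF32_ST_TYPE(st_info)) {
    case STT_NOTYPE:  return "NOTYPE";
    case STT_OBJECT:  return "OBJECT";
    case STT_FUNC:    return "FUNC";
    case STT_SECTION: return "SECTION";
    case STT_FILE:    return "FILE";
    default:          return "OTHER";
    }
}

/* Hypothetical helper: printable name for the binding half of st_info. */
const char *symbol_bind_name(unsigned char st_info)
{
    switch (ELF32_ST_BIND(st_info)) {
    case STB_LOCAL:  return "LOCAL";
    case STB_GLOBAL: return "GLOBAL";
    case STB_WEAK:   return "WEAK";
    default:         return "OTHER";
    }
}

/* Write "<BIND> <TYPE>" into buf and return snprintf's result. */
int describe_st_info(unsigned char st_info, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s %s",
                    symbol_bind_name(st_info), symbol_type_name(st_info));
}
```

For instance, passing ELF32_ST_INFO(STB_GLOBAL, STT_FUNC) to describe_st_info fills the buffer with "GLOBAL FUNC".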
How to Find a String in the String Table?
A string table is a sequence of null-terminated strings stored back to back. An index into the string table is the byte offset at which a string starts, not the ordinal position of the string.
For example, a string table may look like this:
"\0hello\0world\0xxxxxxxxx"
Offset 0 always refers to the empty string. "hello" starts at offset 1 and "world" at offset 7. Because the index is a byte offset, a lookup needs no traversal of the preceding strings: the string begins at that offset and runs up to the next null byte. Loading the whole string table into memory once keeps every lookup cheap for the rest of the ELF analysis.
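In the ELF format, the values stored in sh_name and st_name are byte offsets into the string table (see man 5 elf), so a lookup is just a bounds check plus pointer arithmetic. A minimal sketch, with strtab_get as a hypothetical name and the table assumed to be loaded in memory:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return the string that starts at byte `offset` of a
 * string table, or NULL if the offset is out of range or the string is not
 * NUL-terminated inside the table. Earlier strings are never traversed. */
const char *strtab_get(const char *strtab, size_t size, size_t offset)
{
    assert(strtab != NULL);

    if (offset >= size)
        return NULL;
    if (memchr(strtab + offset, '\0', size - offset) == NULL)
        return NULL;
    return strtab + offset;
}
```

With the table "\0hello\0world", offset 1 yields "hello" and offset 7 yields "world". An offset may also point into the middle of a string, so offset 3 yields "llo".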
Some Tools
readelf can display the contents of an ELF file, with options selecting which part to show.
A short cheatsheet:
readelf -h # show elf header
Reference
- Linux ELF Manual