Hi friends! In the last blog we looked at ELF and a few topics around it. In this blog, we're going to look at ELF in more detail. Let’s get started!
The ELF Header :
The ELF header is present in every type of ELF file. It helps the operating system recognize that it is dealing with an ELF file. It contains details such as the type of the ELF file, its version, its class, and so on. Based on this information, the operating system decides how to handle that particular ELF file. Simply put, it is the metadata of the ELF file. Its structure is defined in /usr/include/elf.h on Linux machines.
ELF Structure :
ELF has its own data types, which are used to represent the data in it. The representation differs between the standard view and the kernel view. Here I’m only going with the standard form. If you are interested in the kernel view, I’ll give the link to it.
The two pictures above show the ELF header in the standard form: one is the 32-bit version and the other is a common view of it, for our reference. So let’s break it down, starting with e_ident.
- e_ident is the first 16 bytes of the ELF header, represented as an array of unsigned characters.
- It tells the operating system how this ELF file is to be parsed, and it also helps the operating system verify that it is an ELF file.
- The picture below shows the e_ident part of the ELF header.
- It contains 7 fields that hold different values.
- The very first one is the magic sequence.
The Magic Sequence:
- In the picture, you can see two e_ident_magic entries. The first one holds hex “7f” and the second one holds hex “45 4c 46”.
- The values “7f[ELFMAG0] 45[ELFMAG1] 4c[ELFMAG2] 46[ELFMAG3]” are common to all types of ELF files.
- Linux systems use a concept called “magic numbers” to tell one file format from another. That is why every ELF file you see starts with “7f 45 4c 46”.
- The last three bytes “45 4c 46” spell “ELF” in ASCII. The first byte, 7f, is DEL in ASCII, but we don’t have a confirmed reason for why it was chosen as the first byte. We may assume it is there to distinguish ELF from other files.
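Here is a minimal sketch of the magic check described above. The function name `is_elf` is my own and not something from elf.h; it only compares the first four bytes of a buffer against “7f 45 4c 46”.

```python
ELF_MAGIC = b"\x7fELF"  # the bytes 7f 45 4c 46

def is_elf(data: bytes) -> bool:
    """Return True if the buffer starts with the ELF magic sequence."""
    return data[:4] == ELF_MAGIC

# Made-up buffers for illustration, not taken from real files
print(is_elf(b"\x7fELF\x02\x01\x01" + bytes(9)))  # True
print(is_elf(b"MZ\x90\x00"))                      # False
```

To try it on a real file, you could pass in the first few bytes read with `open(path, "rb").read(4)`.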
Class : [e_ident[EI_CLASS]]
- It tells the operating system whether the file is 32-bit or 64-bit.
- It has 3 values: “0[ELFCLASSNONE], 1[ELFCLASS32], 2[ELFCLASS64]”. 0 = invalid, 1 = 32-bit, 2 = 64-bit.
- The sizes of the other data types in the ELF file depend on this index, because it is the only way an ELF parser can tell whether the file is 32-bit or 64-bit.
Data : [e_ident[EI_DATA]]
- It tells which data encoding is used in the ELF file: LSB (least significant byte first) or MSB (most significant byte first), AKA little-endian or big-endian.
- It has three values: “0[ELFDATANONE], 1[ELFDATA2LSB], 2[ELFDATA2MSB]”.
- 1 = LSB (little-endian), 2 = MSB (big-endian), 0 = none (invalid)
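The class and data bytes can be read straight out of e_ident. This is just a sketch: the index positions 4 and 5 match EI_CLASS and EI_DATA in elf.h, while the helper name `describe_class_and_data` is made up for this example.

```python
EI_CLASS, EI_DATA = 4, 5  # positions of the two bytes inside e_ident

CLASS_NAMES = {0: "ELFCLASSNONE", 1: "ELFCLASS32", 2: "ELFCLASS64"}
DATA_NAMES = {0: "ELFDATANONE", 1: "ELFDATA2LSB", 2: "ELFDATA2MSB"}

def describe_class_and_data(e_ident: bytes):
    """Return the symbolic names for the class and data-encoding bytes."""
    class_name = CLASS_NAMES.get(e_ident[EI_CLASS], "unknown")
    data_name = DATA_NAMES.get(e_ident[EI_DATA], "unknown")
    return class_name, data_name

# Made-up e_ident of a 64-bit little-endian file
sample_ident = b"\x7fELF\x02\x01\x01" + bytes(9)
print(describe_class_and_data(sample_ident))
```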
Version : [e_ident[EI_VERSION]]
- It gives the version of the ELF format, but since ELF was introduced in the late 80s it has always been set to 1.
- It has two values: 0[EV_NONE], 1[EV_CURRENT].
- 0 = invalid, 1 = current version (always 1 in practice).
Padding : [e_ident[EI_PAD]]
- These bytes are reserved and set to 0. They are used only for padding.
- These are the last 7 bytes of the e_ident part of the ELF Header.
There are also other bytes: OS ABI and ABI Version.
- The OS ABI byte specifies the ABI (Application Binary Interface) of the target operating system.
- Some of its values are “0[None/System V], 1[HP-UX], 2[NetBSD], 3[Linux]”.
- Even on a Linux machine, this byte is often set to 0 instead of 3. I have seen this in dynamically linked binaries, while statically linked binaries tend to have it set to 3.
It is not very useful for us, so we’ll just skip it and move on to the next thing.
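To tie the e_ident fields together, here is a small sketch that splits all 16 bytes in one go with Python's `struct` module. The function name `parse_e_ident` and the dictionary keys are my own choices, and the OS ABI table only covers the four values listed above.

```python
import struct

# Only the OS ABI values mentioned above; real files may use others
OSABI_NAMES = {0: "None/System V", 1: "HP-UX", 2: "NetBSD", 3: "Linux"}

def parse_e_ident(e_ident: bytes) -> dict:
    """Split the 16 e_ident bytes into their named fields."""
    if len(e_ident) != 16:
        raise ValueError("e_ident must be exactly 16 bytes")
    # 4 magic bytes + 5 single-byte fields + 7 padding bytes = 16 bytes
    magic, ei_class, ei_data, ei_version, ei_osabi, ei_abiversion, ei_pad = \
        struct.unpack("<4sBBBBB7s", e_ident)
    return {
        "magic": magic,
        "class": ei_class,
        "data": ei_data,
        "version": ei_version,
        "osabi": OSABI_NAMES.get(ei_osabi, "other"),
        "abiversion": ei_abiversion,
        "pad": ei_pad,
    }

# Made-up e_ident: 64-bit, little-endian, version 1, OS ABI 0
sample = bytes.fromhex("7f454c46 02 01 01 00 00") + bytes(7)
for name, value in parse_e_ident(sample).items():
    print(name, value)
```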
In the above picture, you can see that the types ELF uses are Halfword, Word, Address, and Offset. Let’s start with Halfword.
- Halfword → unsigned 16-bit integer
- Word → unsigned 32-bit integer
- Address → unsigned integer, 32 or 64 bits wide depending on whether it is a 32-bit or 64-bit ELF file
- Offset → like the Address type, its width depends on whether the file is 32-bit or 64-bit
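These four types map nicely onto format codes of Python's `struct` module, which is handy if you want to parse the header yourself. This is only a sketch under that assumption: the helper `elf_type_codes` is hypothetical, and the codes `H`, `I`, and `Q` are `struct`'s unsigned 16-, 32-, and 64-bit integers.

```python
import struct

def elf_type_codes(ei_class: int) -> dict:
    """Map ELF type names to struct format codes for the given class."""
    if ei_class == 1:      # ELFCLASS32
        addr = "I"         # unsigned 32-bit
    elif ei_class == 2:    # ELFCLASS64
        addr = "Q"         # unsigned 64-bit
    else:
        raise ValueError("invalid EI_CLASS value")
    return {"Halfword": "H", "Word": "I", "Address": addr, "Offset": addr}

# Print the size in bytes of each type for both classes
for ei_class in (1, 2):
    sizes = {name: struct.calcsize("<" + code)
             for name, code in elf_type_codes(ei_class).items()}
    print(ei_class, sizes)
```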
Now we can move on to other members of the ELF Header.
e_type[HalfWord type]: This identifies the type of the ELF file.
- It has 7 values, of which 5 are defined file types.
- ET_NONE = 0 → Invalid, ET_REL = 1 → Relocatable file,
- ET_EXEC = 2 → Executable file (loaded at a fixed address, so no PIE and no ASLR for the executable itself), ET_DYN = 3 → Shared object file, and also executables built as PIE (Position Independent Executable)
- ET_CORE = 4 → Core dump
The other values, such as ET_LOPROC and ET_HIPROC, are processor-specific. They are rarely seen in the wild.
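As a quick sketch, the e_type values above can be turned into readable names like this. The function `describe_e_type` is my own helper, and the range bounds 0xff00 and 0xffff are the ET_LOPROC and ET_HIPROC values from elf.h.

```python
ET_NAMES = {
    0: "ET_NONE (invalid)",
    1: "ET_REL (relocatable file)",
    2: "ET_EXEC (executable file)",
    3: "ET_DYN (shared object or PIE executable)",
    4: "ET_CORE (core dump)",
}
ET_LOPROC, ET_HIPROC = 0xFF00, 0xFFFF  # processor-specific range

def describe_e_type(e_type: int) -> str:
    """Return a readable description for an e_type value."""
    if e_type in ET_NAMES:
        return ET_NAMES[e_type]
    if ET_LOPROC <= e_type <= ET_HIPROC:
        return "processor-specific"
    return "unknown"

print(describe_e_type(3))
print(describe_e_type(0xFF01))
```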
e_machine[HalfWord type]: This identifies the architecture the file is built for.
The ELF format is used on many machines for various purposes. Here I’m only going to mention some specific machines. If you want to see more, I’ll give the link to that.
- 0x02 → SPARC
- 0x03 → x86
- 0x28 → ARM
- 0x3E → AMD x86-64
- 0xB7 → ARM 64-bit
- 0xF3 → RISC-V
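Here is a rough sketch of how e_machine could be read from the raw header bytes. It relies on the layout described above: e_ident takes bytes 0 to 15, e_type takes bytes 16 and 17, so e_machine sits at offset 18, and its byte order follows EI_DATA. The names `read_e_machine` and `MACHINE_NAMES` are my own, and the table only covers the machines listed above.

```python
import struct

MACHINE_NAMES = {
    0x02: "SPARC",
    0x03: "x86",
    0x28: "ARM",
    0x3E: "AMD x86-64",
    0xB7: "ARM 64-bit",
    0xF3: "RISC-V",
}

def read_e_machine(header: bytes) -> int:
    """Read e_machine (a Halfword at offset 18) using the file's byte order."""
    byte_order = "<" if header[5] == 1 else ">"  # header[5] is EI_DATA
    (e_machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return e_machine

# Made-up start of a 64-bit little-endian header: e_type = 3, e_machine = 0x3E
sample_header = b"\x7fELF\x02\x01\x01" + bytes(9) + struct.pack("<HH", 3, 0x3E)
print(MACHINE_NAMES.get(read_e_machine(sample_header), "other"))
```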
e_version[Word type]: This specifies the object file version.
It has two values:
- EV_NONE = 0 → Invalid
- EV_CURRENT = 1 → Current Version
e_entry[Address type]: This specifies the entry point of the executable, that is, the virtual address where execution starts. If there is no entry point, it is set to zero.
e_phoff(Program Headers Offset)[Offset type]: This holds the program header table’s file offset in bytes, which is where the program header table starts, counted from the beginning of the file. If the file has no program header table, it is set to zero.
e_shoff(Section Headers Offset)[Offset type]: This holds the section header table’s file offset in bytes, which is where the section header table starts, counted from the beginning of the file. If the file has no section header table, it is set to zero.
e_flags[Word]: This holds the processor-specific flags associated with the file.
e_ehsize[HalfWord]: This holds the ELF header’s size in bytes.
e_phentsize[HalfWord]: This specifies the size of each entry in the program header table in bytes.
e_phnum[HalfWord]: This holds the number of entries in the program header table. If a file has no program header table, it is set to zero.
e_shentsize[HalfWord]: This specifies the size of each entry in the section header table in bytes.
e_shnum[HalfWord]: This specifies the number of entries in the section header table. If a file has no section header table, it is set to zero.
e_shstrndx(Section Header String Table Index)[HalfWord]: This holds the section header table index of the section name string table, which is used to resolve the names of the sections in the file. If the file has no section name string table, it is set to SHN_UNDEF.
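Putting all the members together, here is a minimal sketch of a parser for the whole ELF header. It is one possible way to do it with Python's `struct` module, not how any particular tool does it. The names `ElfHeader` and `parse_elf_header` are my own, and the sample header is built by hand with made-up values, not taken from a real binary.

```python
import struct
from collections import namedtuple

ElfHeader = namedtuple(
    "ElfHeader",
    "e_ident e_type e_machine e_version e_entry e_phoff e_shoff "
    "e_flags e_ehsize e_phentsize e_phnum e_shentsize e_shnum e_shstrndx",
)

def parse_elf_header(data: bytes) -> ElfHeader:
    """Parse the ELF header at the start of data into named fields."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    addr = {1: "I", 2: "Q"}.get(data[4])    # EI_CLASS picks Address/Offset width
    order = {1: "<", 2: ">"}.get(data[5])   # EI_DATA picks the byte order
    if addr is None or order is None:
        raise ValueError("invalid EI_CLASS or EI_DATA")
    # e_ident, 2 Halfwords, 1 Word, 3 Address/Offset fields, 1 Word, 6 Halfwords
    fmt = f"{order}16sHHI{addr}{addr}{addr}IHHHHHH"
    return ElfHeader._make(struct.unpack_from(fmt, data))

# Hand-built 64-bit little-endian header with made-up values
sample = struct.pack(
    "<16sHHIQQQIHHHHHH",
    b"\x7fELF\x02\x01\x01",  # e_ident (padded with zeros to 16 bytes)
    3, 0x3E, 1,               # e_type, e_machine, e_version
    0x1040, 64, 0,            # e_entry, e_phoff, e_shoff
    0, 64, 56, 13, 64, 0, 0,  # e_flags, e_ehsize, ..., e_shstrndx
)
hdr = parse_elf_header(sample)
print(hex(hdr.e_entry), hdr.e_phnum, hdr.e_ehsize)
```

If you want to try it on a real file, you could read the first 64 bytes of any ELF binary on your system and compare the result with what readelf -h prints for the same file.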
I recommend loading an ELF file in Ghidra, IDA Pro, or readelf to take a look at it, because practical knowledge can give you more than you think. readelf commands → readelf -a <file> to see everything, readelf -h <file> to see only the ELF header.
I hope I gave you some useful information.
And if you have any queries, DM me on Twitter.
Bye, see you all in the next blog.