Intro

Hello there!

A long time ago, as a college student project, I created a library to parse mathematical expressions: xFunc. Over time, I added a lot of features. The latest of them are lambdas, curring, rational numbers, etc. A simple library started to look more and more like a small programming language, which defeats the idea of the library: to be a small library to parse mathematical expressions and nothing more.

So recently, I decided to start to work on my programming language: GitHub. It is still in development and will be in this state for several years at least. At some point, I will need to generate a binary executable. So, to distract myself from parsing, I decided to investigate the structure of an ELF file and create one such file from scratch.

The ELF file structure

So, let’s start with the elf file structure. The internet already has a lot of articles about it. There is an official System V documentation, Linux docs, and a lot of 3rd party articles, but you can just open a wiki page and get a pretty good explanation of how an elf file should look. And for the completeness of this article, I will add the structure here.

The elf file consists of these major parts (we are going to discuss files only 64-bit systems):

  • ELF header
  • Program Headers
  • Section Headers
  • Content

The content is just a part of the file that contains actual code or data, or other resources. Whereas all other parts are metadata which describes this content or needed OS to load/execute a program or some tools to compile/link this file.

The order of these sections is not important. The only restriction is that the ELF header should be at the beginning of the file. Any other section could be placed anywhere in the file.

ELF header

Each ELF file should start with the ELF header. It’s just basic information about ELF, on what OS/CPU it is expected to run, whre the starting point is, and where all sections are.

Size (bytes)FieldDescription
4Magin Number0x7F followed by ‘ELF’ (0x45, 0x4C, 0x46).
1Class2 for 64-bit format.
1Data1 - little endian. 2 - big endian.
1Ident VersionAlways 1.
1OSIdentifies the target operating system.
1ABI VersionABI version. Interpretation depends on target ABI. Often ignored.
7PaddingReserved padding bytes. Should be filled with zeros.
2TypeObject file type.
2ISATarget instruction set architecture.
4VersionAlways 1.
8EntryEntry point virtual address.
8Program HeaderProgram header table offset.
8Session HeaderSection header table offset.
4FlagsArchitecture-specific flags.
2ELF SizeELF header size (64 for 64-bit).
2PH SizeSize of one program header entry.
2PH CountNumber of program header entries.
2SH SizeSize of one section header entry.
2SH CountNumber of section header entries.
2Section NamesIndex of the section header with section names.

OS Values:

Here are some values for OS field: System V = 0x00, Linux = 0x03. Usually, it is equal to System V even on a Linux distribution and I’m not sure it really affects anything.

File Type:

ValueTypeMeaning
0x00ET_NONEUnknown
0x02ET_EXECExecutable file
0x03ET_DYNShared object

ISA Values:

ValueISA
0x3EAMD x86-64
0xB7Arm 64-bits (Armv8/AArch64)

Program Headers

The program headers table contains information needed for OS to understand how to load your program into memory and allocate all necessary blocks for execution:

Size (bytes)FieldDescription
4TypeIdentifies the type of the segment.
4FlagsSegment-dependent flags.
8OffsetOffset of the segment in the file image.
8Virtual AddressVirtual address of the segment in memory.
8Physical AddressPhysical address (if relevant).
8File SizeSize of the segment in the file image.
8Memory SizeSize of the segment in memory.
8AlignmentSegment alignment. 0 or 1 = no alignment. Otherwise power of 2.

There are some restrictions on several fields.

The first one is that the segments that are loaded into memory must have Virtual Address and Offset fields congruent modulo Alignment: Virtual Address % Alignment = Offset % Alignment.

The second one is pretty much similar but for the page size of OS: Virtual Address % Page Size = Offset % Page Size.

Otherwise, the OS will show the Segmentation Fault error or other strange errors. For example, the segment might not be executable, even if it has the proper executable flag.

Often Virtual Address and Physical Address fields are equal, the same for File Size and Memory Size with some small exceptions. For example, the part of memory with uninitialized data (zeros). We don’t need and don’t want to put all these zero bytes into the file. It will increase the file size without any good reason but instead, we could use File Size and Memory Size fields. Set File Size to 0 and Memory Size to a needed value. So, the OS will create a segment of memory with zeros while the size of file is not affected.

Program Header Type Values

ValueNameMeaning
0x00000000PT_NULLProgram header table entry unused.
0x00000001PT_LOADLoadable segment. E.g. code or data segments.
0x00000003PT_INTERPInterpreter information. Isn’t needed for static linked binaries.
0x00000006PT_PHDRSegment containing program header table itself.

Flag Values:

ValueNameMeaning
0x1PF_XExecutable segment.
0x2PF_WWritable segment.
0x4PF_RReadable segment.

Section Headers

The section headers table is not required for execution. The OS will be able to run the program even if there are no section headers at all. But they are needed for linking and debugging. Usually, this table is stored at the end of the ELF file.

OffsetSize (bytes)FieldPurpose
0x004NameOffset to a string in the .shstrtab section representing the section name.
0x044TypeIdentifies the type of the section.
0x088FlagsAttributes of the section.
0x108AddressVirtual address of the section in memory (for loaded sections).
0x188OffsetOffset of the section in the file image.
0x208SizeSize in bytes of the section. May be 0.
0x284LinkSection index of an associated section. Purpose depends on section type.
0x2C4InfoExtra information about the section. Purpose depends on section type.
0x308AlignmentRequired alignment of the section (must be a power of 2).
0x388Entry SizeSize of each entry for sections with fixed-size entries, or 0 otherwise.

The section headers table has a similar restriction as the program headers table: Address % Alignment == 0.

Section Header Types:

ValueNameMeaning
0x0SHT_NULLSection header table entry unused.
0x1SHT_PROGBITSProgram data (code or data).
0x3SHT_STRTABString table. List of null-terminated strings.
0x8SHT_NOBITSProgram space with no data/uninitialized data (BSS).

Section Header Flags:

ValueNameMeaning
0x1SHF_WRITEWritable
0x2SHF_ALLOCOccupies memory during execution
0x4SHF_EXECINSTRExecutable

Generate ELF file

The generation of ELF files is straightforward. You just need to follow the format described above but there is one problem. Some sections have a reference (offset) to the sections defined later in the file. For example, the ELF header has offset to the Program Headers table and the Section Headers table. So, you have two options. The first one is to generate the ELF header with offset unset and fill them later. Or skip these sections and generate them later them the other parts of the file are known.

I decided to go with the second approach and create a whole file in memory and then write it to the disk. Each part of the file is represented by an implementation of ISegment:

internal interface ISegment
{
    string SegmentName { get; }
    long FileOffset { get; set; }
    long FileSize { get; }
 
    void WriteTo(BinaryStream stream);
}

This defines an important part of the output file, the offset where this particular part should be placed and its size. So, for the ELF header, I created a separate class with all necessary fields and code to write into the stream. The same for other parts nothing special. Also, I created a “placeholder” implementation:

internal class NullSegment : ISegment
{
    public NullSegment(string name, long fileOffset, long fileSize)
    {
        FileOffset = fileOffset;
        FileSize = fileSize;
        SegmentName = name;
    }
 
    public void WriteTo(BinaryStream stream)
        => stream.Write(new byte[FileSize]);
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; }
}

It does nothing, just writes a bunch of zeros to the file. It is needed for sections with forward references to other segments. So, we can “reserve” the space in the file and at the same time calculate correct offsets. Also, I created a container for all segments:

internal class OutputFile
{
    private readonly List<ISegment> segments = [];
 
    public void Add(ISegment segment)
    {
        var lastSegment = segments.LastOrDefault();
        var lastSegmentOffset = lastSegment?.FileOffset ?? 0;
        var lastSegmentSize = lastSegment?.FileSize ?? 0;
        var segmentOffset = lastSegmentOffset + lastSegmentSize;
        segment.FileOffset = segmentOffset;
 
        segments.Add(segment);
    }
 
    public void Set(string name, ISegment segment)
    {
        var index = segments.FindIndex(x => x.SegmentName == name);
        if (index < 0)
            throw new InvalidOperationException($"Segment {name} does not exist");
 
        var existing = segments[index];
        if (segment.FileSize != existing.FileSize)
            throw new InvalidOperationException("Cannot set segment with different offset or size");
 
        segment.FileOffset = existing.FileOffset;
 
        segments[index] = segment;
    }
 
    public void WriteTo(BinaryStream stream)
    {
        foreach (var segment in Segments)
        {
            stream.Seek(segment.FileOffset, SeekOrigin.Begin);
 
            segment.WriteTo(stream);
        }
    }
}

It is needed to automatically calculate offsets. So, instead of getting the last segment and calculating it manually each time, this class is doing it for us. And at last, the final part, the code which combines everything to create ELF:

internal class ElfWriter
{
    private const string ElfHeaderName = "ELF Header";
    private const string ProgramHeaderName = "Program Header";
    private const string PhCodeName = "PH Code";
    private const string PhDataName = "PH Data";
    private const string ContentCodeName = "Content Code";
    private const string ContentDataName = "Content Data";
    private const string ContentShStrTabName = "Content Shstrtab";
    private const string ShNullName = "SH Null";
    private const string ShTextName = "SH Text";
    private const string ShDataName = "SH Text";
    private const string ShShStrTabName = "SH ShStrTab";
 
    private const ulong BaseAddress = 0x00400000;
 
    public void Write(ElfOptions options)
    {
        var outputFile = new OutputFile();
 
        // add null segments to calculate offsets correctly
        outputFile.Add(new NullSegment(ElfHeaderName, ElfHeader.ElfHeaderSize));
 
        var nullProgramHeader = new NullSegment(ProgramHeaderName, ElfHeader.PhSize);
        outputFile.Add(nullProgramHeader);
        outputFile.Add(new NullSegment(PhCodeName, ElfHeader.PhSize));
        outputFile.Add(new NullSegment(PhDataName, ElfHeader.PhSize));
 
        var contentCode = new DataSegment(ContentCodeName, [
            // TODO: assembly code
        ]);
        outputFile.Add(contentCode);
 
        var contentData = new DataSegment(ContentDataName, [
            // TODO: data
        ]);
        outputFile.Add(contentData);
 
        var shstrtab = new StringSegment(ContentShStrTabName);
        var shstrtabOffset = shstrtab.Add(".shstrtab");
        var textOffset = shstrtab.Add(".text");
        var dataOffset = shstrtab.Add(".data");
        outputFile.Add(shstrtab);
 
        // add section headers
        var sectionHeader = SectionHeader.CreateNullSection(ShNullName);
        outputFile.Add(sectionHeader);
        var codeSection = SectionHeader.CreateProgBits(
            segmentName: ShTextName,
            nameOffset: (uint)textOffset,
            address: BaseAddress + (ulong)contentCode.FileOffset,
            offset: (ulong)contentCode.FileOffset,
            size: (ulong)contentCode.FileSize,
            alignment: 8);
        outputFile.Add(codeSection);
        var dataSection = SectionHeader.CreateData(
            segmentName: ShDataName,
            nameOffset: (uint)dataOffset,
            address: BaseAddress + 0x10000 + (ulong)contentData.FileOffset,
            offset: (ulong)contentData.FileOffset,
            size: (ulong)contentData.FileSize,
            alignment: 1);
        outputFile.Add(dataSection);
        outputFile.Add(SectionHeader.CreateShStrTab(
            segmentName: ShShStrTabName,
            nameOffset: (uint)shstrtabOffset,
            offset: (ulong)shstrtab.FileOffset,
            size: (ulong)shstrtab.FileSize));
 
        // populate null segments
        outputFile.Set(ProgramHeaderName, ProgramHeader.CreateProgramHeader(
            segmentName: ProgramHeaderName,
            offset: (ulong)nullProgramHeader.FileOffset,
            address: BaseAddress + (ulong)nullProgramHeader.FileOffset,
            phFileSize: 3 * ElfHeader.PhSize, // TODO: calculate?
            alignment: 0
        ));
        outputFile.Set(PhCodeName, ProgramHeader.CreateProgramHeaderCode(
            segmentName: PhCodeName,
            offset: 0,
            address: BaseAddress,
            phFileSize: (ulong)(contentCode.FileOffset + contentCode.FileSize),
            alignment: 0x10000 // TODO: calculate?
        ));
        outputFile.Set(PhDataName, ProgramHeader.CreateProgramHeaderData(
            segmentName: PhDataName,
            offset: (ulong)contentData.FileOffset,
            address: dataSection.Address,
            phFileSize: dataSection.Size,
            alignment: 0x10000 // TODO: calculate?
        ));
 
        var elfHeader = new ElfHeader(
            segmentName: ElfHeaderName,
            endianness: ElfEndianness.Little, // TODO: support big-endian
            instructionSet: options.InstructionSet,
            entryPoint: codeSection.Address,
            programHeaderOffset: (ulong)nullProgramHeader.FileOffset,
            sectionHeaderOffset: (ulong)sectionHeader.FileOffset,
            programHeaderCount: 3, // TODO: calculate?
            sectionHeaderCount: 4, // TODO: calculate?
            sectionNameIndex: 3 // TODO: calculate?
        );
        outputFile.Set(ElfHeaderName, elfHeader);
 
        using var file = new BinaryStream(
            File.Open(options.OutputPath, FileMode.Create, FileAccess.Write),
            elfHeader.Endianness == ElfEndianness.Little);
 
        outputFile.WriteTo(file);
    }
}

It creates placeholders for the ELF Header and Program Headers, then creates segments for code and data, and, then creates Section Headers. And only after that when almost the entire file is ready, it goes back and creates the correct ELF Header and Program Headers.

Here is the whole code combined:

Source Code
internal interface ISegment
{
    string SegmentName { get; }
    long FileOffset { get; set; }
    long FileSize { get; }
 
    void WriteTo(BinaryStream stream);
}
 
internal class OutputFile
{
    private readonly List<ISegment> segments;
 
    public OutputFile()
        => segments = [];
 
    public void Add(ISegment segment)
    {
        var lastSegment = segments.LastOrDefault();
        var lastSegmentOffset = lastSegment?.FileOffset ?? 0;
        var lastSegmentSize = lastSegment?.FileSize ?? 0;
        var segmentOffset = lastSegmentOffset + lastSegmentSize;
        segment.FileOffset = segmentOffset;
 
        segments.Add(segment);
    }
 
    public void Set(string name, ISegment segment)
    {
        var index = segments.FindIndex(x => x.SegmentName == name);
        if (index < 0)
            throw new InvalidOperationException($"Segment {name} does not exist");
 
        var existing = segments[index];
        if (segment.FileSize != existing.FileSize)
            throw new InvalidOperationException("Cannot set segment with different offset or size");
 
        segment.FileOffset = existing.FileOffset;
 
        segments[index] = segment;
    }
 
    public void WriteTo(BinaryStream stream)
    {
        foreach (var segment in Segments)
        {
            stream.Seek(segment.FileOffset, SeekOrigin.Begin);
 
            segment.WriteTo(stream);
        }
    }
 
    public IReadOnlyList<ISegment> Segments
        => segments;
}
 
internal class NullSegment : ISegment
{
    public NullSegment(string name, long fileSize) : this(name, 0, fileSize)
    {
    }
 
    public NullSegment(string name, long fileOffset, long fileSize)
    {
        FileOffset = fileOffset;
        FileSize = fileSize;
        SegmentName = name;
    }
 
    public void WriteTo(BinaryStream stream)
        => stream.Write(new byte[FileSize]);
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; }
}
 
internal class DataSegment : ISegment
{
    public DataSegment(string name, byte[] data) : this(name, 0, data)
    {
    }
 
    public DataSegment(string name, long fileOffset, byte[] data)
    {
        FileOffset = fileOffset;
        FileSize = data.Length;
        Data = data;
        SegmentName = name;
    }
 
    public void WriteTo(BinaryStream stream)
        => stream.Write(Data);
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; }
 
    public byte[] Data { get; }
}
 
internal class StringSegment : ISegment
{
    private readonly List<string> strings;
 
    public StringSegment(string segmentName) : this(segmentName, 0)
    {
    }
 
    public StringSegment(string segmentName, long fileOffset)
    {
        strings = [];
 
        SegmentName = segmentName;
        FileOffset = fileOffset;
        FileSize = 1;
    }
 
    public long Add(string s)
    {
        strings.Add(s);
 
        var offset = FileSize;
        FileSize += Encoding.ASCII.GetByteCount(s) + 1;
 
        return offset;
    }
 
    public void WriteTo(BinaryStream stream)
    {
        stream.Write(0x00);
 
        foreach (var s in strings)
        {
            stream.Write(Encoding.UTF8.GetBytes(s));
            stream.Write(0x00);
        }
    }
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; private set; }
}
 
internal class SectionHeader : ISegment
{
    public SectionHeader(
        string segmentName,
        uint nameOffset,
        SectionHeaderType type,
        SectionHeaderFlags flags,
        ulong address,
        ulong offset,
        ulong size,
        uint link,
        uint info,
        ulong addressAlignment,
        ulong entrySize)
    {
        if (addressAlignment != 0 && address % addressAlignment != 0)
            throw new ArgumentException("Address must be aligned to address alignment");
 
        SegmentName = segmentName;
        FileOffset = 0;
        FileSize = ElfHeader.ShSize;
 
        NameOffset = nameOffset;
        Type = type;
        Flags = flags;
        Address = address;
        Offset = offset;
        Size = size;
        Link = link;
        Info = info;
        AddressAlignment = addressAlignment;
        EntrySize = entrySize;
    }
 
    public static SectionHeader CreateNullSection(string segmentName)
        => new SectionHeader(
            segmentName: segmentName,
            nameOffset: 0,
            type: SectionHeaderType.Null,
            flags: SectionHeaderFlags.None,
            address: 0,
            offset: 0,
            size: 0,
            link: 0,
            info: 0,
            addressAlignment: 0,
            entrySize: 0);
 
    public static SectionHeader CreateProgBits(
        string segmentName,
        uint nameOffset,
        ulong address,
        ulong offset,
        ulong size,
        ulong alignment)
        => new SectionHeader(
            segmentName: segmentName,
            nameOffset: nameOffset,
            type: SectionHeaderType.ProgramData,
            flags: SectionHeaderFlags.Alloc | SectionHeaderFlags.Executable,
            address: address,
            offset: offset,
            size: size,
            link: 0,
            info: 0,
            addressAlignment: alignment,
            entrySize: 0);
 
    public static SectionHeader CreateData(
        string segmentName,
        uint nameOffset,
        ulong address,
        ulong offset,
        ulong size,
        ulong alignment)
        => new SectionHeader(
            segmentName: segmentName,
            nameOffset: nameOffset,
            type: SectionHeaderType.ProgramData,
            flags: SectionHeaderFlags.Alloc | SectionHeaderFlags.Writable,
            address: address,
            offset: offset,
            size: size,
            link: 0,
            info: 0,
            addressAlignment: alignment,
            entrySize: 0);
 
    public static SectionHeader CreateShStrTab(
        string segmentName,
        uint nameOffset,
        ulong offset,
        ulong size)
        => new SectionHeader(
            segmentName,
            nameOffset,
            SectionHeaderType.StringTable,
            SectionHeaderFlags.None,
            0,
            offset,
            size,
            0,
            0,
            1,
            0);
 
    public void WriteTo(BinaryStream stream)
    {
        stream.Write(NameOffset);
        stream.Write((uint)Type);
        stream.Write((ulong)Flags);
        stream.Write(Address);
        stream.Write(Offset);
        stream.Write(Size);
        stream.Write(Link);
        stream.Write(Info);
        stream.Write(AddressAlignment);
        stream.Write(EntrySize);
    }
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; }
 
    public uint NameOffset { get; }
 
    public SectionHeaderType Type { get; }
 
    public SectionHeaderFlags Flags { get; }
 
    public ulong Address { get; }
 
    public ulong Offset { get; }
 
    public ulong Size { get; }
 
    public uint Link { get; }
 
    public uint Info { get; }
 
    public ulong AddressAlignment { get; }
 
    public ulong EntrySize { get; }
}
 
internal class ProgramHeader : ISegment
{
    private ProgramHeader(
        string segmentName,
        ProgramHeaderType type,
        ProgramHeaderFlags flags,
        ulong offset,
        ulong virtualAddress,
        ulong physicalAddress,
        ulong phFileSize,
        ulong memorySize,
        ulong alignment)
    {
        const ulong pageSize = 0x1000;
        if (virtualAddress % pageSize != offset % pageSize)
            throw new ArgumentException("Virtual address must be aligned to page size");
 
        if (alignment != 0 && virtualAddress % alignment != offset % alignment)
            throw new ArgumentException("Virtual address must be aligned to alignment");
 
        SegmentName = segmentName;
        FileOffset = 0;
        FileSize = ElfHeader.PhSize;
 
        Type = type;
        Flags = flags;
        Offset = offset;
        VirtualAddress = virtualAddress;
        PhysicalAddress = physicalAddress;
        PhFileSize = phFileSize;
        MemorySize = memorySize;
        Alignment = alignment;
    }
 
    public static ProgramHeader CreateProgramHeader(
        string segmentName,
        ulong offset,
        ulong address,
        ulong phFileSize,
        ulong alignment)
        => new ProgramHeader(
            segmentName: segmentName,
            type: ProgramHeaderType.ProgramHeader,
            flags: ProgramHeaderFlags.Read,
            offset: offset,
            virtualAddress: address,
            physicalAddress: address,
            phFileSize: phFileSize,
            memorySize: phFileSize,
            alignment: alignment);
 
    public static ProgramHeader CreateProgramHeaderCode(
        string segmentName,
        ulong offset,
        ulong address,
        ulong phFileSize,
        ulong alignment)
        => new ProgramHeader(
            segmentName: segmentName,
            type: ProgramHeaderType.Load,
            flags: ProgramHeaderFlags.Read | ProgramHeaderFlags.Execute,
            offset: offset,
            virtualAddress: address,
            physicalAddress: address,
            phFileSize: phFileSize,
            memorySize: phFileSize,
            alignment: alignment);
 
    public static ProgramHeader CreateProgramHeaderData(
        string segmentName,
        ulong offset,
        ulong address,
        ulong phFileSize,
        ulong alignment)
        => new ProgramHeader(
            segmentName: segmentName,
            type: ProgramHeaderType.Load,
            flags: ProgramHeaderFlags.Read | ProgramHeaderFlags.Write,
            offset: offset,
            virtualAddress: address,
            physicalAddress: address,
            phFileSize: phFileSize,
            memorySize: phFileSize,
            alignment: alignment);
 
    public void WriteTo(BinaryStream stream)
    {
        stream.Write((uint)Type);
        stream.Write((uint)Flags);
        stream.Write(Offset);
        stream.Write(VirtualAddress);
        stream.Write(PhysicalAddress);
        stream.Write(PhFileSize);
        stream.Write(MemorySize);
        stream.Write(Alignment);
    }
 
    public string SegmentName { get; }
 
    public long FileOffset { get; set; }
 
    public long FileSize { get; }
 
    public ProgramHeaderType Type { get; }
 
    public ProgramHeaderFlags Flags { get; }
 
    public ulong Offset { get; }
 
    public ulong VirtualAddress { get; }
 
    public ulong PhysicalAddress { get; }
 
    public ulong PhFileSize { get; }
 
    public ulong MemorySize { get; }
 
    public ulong Alignment { get; }
}
 
internal class ElfWriter
{
    private const string ElfHeaderName = "ELF Header";
    private const string ProgramHeaderName = "Program Header";
    private const string PhCodeName = "PH Code";
    private const string PhDataName = "PH Data";
    private const string ContentCodeName = "Content Code";
    private const string ContentDataName = "Content Data";
    private const string ContentShStrTabName = "Content Shstrtab";
    private const string ShNullName = "SH Null";
    private const string ShTextName = "SH Text";
    private const string ShDataName = "SH Text";
    private const string ShShStrTabName = "SH ShStrTab";
 
    private const ulong BaseAddress = 0x00400000;
 
    public void Write(ElfOptions options)
    {
        var outputFile = new OutputFile();
 
        // add null segments to calculate offsets correctly
        outputFile.Add(new NullSegment(ElfHeaderName, ElfHeader.ElfHeaderSize));
 
        var nullProgramHeader = new NullSegment(ProgramHeaderName, ElfHeader.PhSize);
        outputFile.Add(nullProgramHeader);
        outputFile.Add(new NullSegment(PhCodeName, ElfHeader.PhSize));
        outputFile.Add(new NullSegment(PhDataName, ElfHeader.PhSize));
 
        // TODO: replace by code
        var contentCode = new DataSegment(ContentCodeName, [
            0x20, 0x00, 0x80, 0xD2, // mov x0, #1
            0xE1, 0x00, 0x00, 0x58, // mov x1, =msg
            0xC2, 0x01, 0x80, 0xD2, // mov x2, #14
            0x08, 0x08, 0x80, 0xD2, // mov x8, #64
            0x01, 0x00, 0x00, 0xD4, // svc #0
 
            0x20, 0x00, 0x80, 0xD2, // mov x0, #1
            0xA8, 0x0B, 0x80, 0xD2, // mov x8, #93
            0x01, 0x00, 0x00, 0xD4, // svc #0
 
            0x10, 0x01, 0x41, 0x00, // address 0x400110
            0x00, 0x00, 0x00, 0x00,
        ]);
        outputFile.Add(contentCode);
 
        var contentData = new DataSegment(ContentDataName, [
            .."Hello, World!\n"u8, 0x00
        ]);
        outputFile.Add(contentData);
 
        var shstrtab = new StringSegment(ContentShStrTabName);
        var shstrtabOffset = shstrtab.Add(".shstrtab");
        var textOffset = shstrtab.Add(".text");
        var dataOffset = shstrtab.Add(".data");
        outputFile.Add(shstrtab);
 
        // add section headers
        var sectionHeader = SectionHeader.CreateNullSection(ShNullName);
        outputFile.Add(sectionHeader);
        var codeSection = SectionHeader.CreateProgBits(
            segmentName: ShTextName,
            nameOffset: (uint)textOffset,
            address: BaseAddress + (ulong)contentCode.FileOffset,
            offset: (ulong)contentCode.FileOffset,
            size: (ulong)contentCode.FileSize,
            alignment: 8);
        outputFile.Add(codeSection);
        var dataSection = SectionHeader.CreateData(
            segmentName: ShDataName,
            nameOffset: (uint)dataOffset,
            address: BaseAddress + 0x10000 + (ulong)contentData.FileOffset,
            offset: (ulong)contentData.FileOffset,
            size: (ulong)contentData.FileSize,
            alignment: 1);
        outputFile.Add(dataSection);
        outputFile.Add(SectionHeader.CreateShStrTab(
            segmentName: ShShStrTabName,
            nameOffset: (uint)shstrtabOffset,
            offset: (ulong)shstrtab.FileOffset,
            size: (ulong)shstrtab.FileSize));
 
        // populate null segments
        outputFile.Set(ProgramHeaderName, ProgramHeader.CreateProgramHeader(
            segmentName: ProgramHeaderName,
            offset: (ulong)nullProgramHeader.FileOffset,
            address: BaseAddress + (ulong)nullProgramHeader.FileOffset,
            phFileSize: 3 * ElfHeader.PhSize, // TODO: calculate?
            alignment: 0
        ));
        outputFile.Set(PhCodeName, ProgramHeader.CreateProgramHeaderCode(
            segmentName: PhCodeName,
            offset: 0,
            address: BaseAddress,
            phFileSize: (ulong)(contentCode.FileOffset + contentCode.FileSize),
            alignment: 0x10000 // TODO: calculate?
        ));
        outputFile.Set(PhDataName, ProgramHeader.CreateProgramHeaderData(
            segmentName: PhDataName,
            offset: (ulong)contentData.FileOffset,
            address: dataSection.Address,
            phFileSize: dataSection.Size,
            alignment: 0x10000 // TODO: calculate?
        ));
 
        var elfHeader = new ElfHeader(
            segmentName: ElfHeaderName,
            endianness: ElfEndianness.Little, // TODO: support big-endian
            instructionSet: options.InstructionSet,
            entryPoint: codeSection.Address,
            programHeaderOffset: (ulong)nullProgramHeader.FileOffset,
            sectionHeaderOffset: (ulong)sectionHeader.FileOffset,
            programHeaderCount: 3, // TODO: calculate?
            sectionHeaderCount: 4, // TODO: calculate?
            sectionNameIndex: 3 // TODO: calculate?
        );
        outputFile.Set(ElfHeaderName, elfHeader);
 
        using var file = new BinaryStream(
            File.Open(options.OutputPath, FileMode.Create, FileAccess.Write),
            elfHeader.Endianness == ElfEndianness.Little);
 
        outputFile.WriteTo(file);
    }
}

More at GitHub.

This is not the final version to generate any ELF file. It still has a lot of hardcoded stuff and it is strictly designed to create a specific ELF file. But it is a good starting point to understand how it works. For example, the “Hello, World” string could be in a read-only section of memory. Or the amount of sections should be calculated instead of hardcoded.

Pitfalls

First of all, you have to be careful with field sizes and offsets. It is easy to make a mistake and write some field as 32-bit interger instead of 64-bits.

The program header table (PT_PHDR) should be mapped to a loadable segment, otherwise readelf will output the following error: “PHDR segment not covered by LOAD segment”. Initially, I created an elf file where PT_PHDR is not loaded into memory by any Program Header. There was only one loadable segment with code and it pointed directly at the beginning of the code (the first instruction). And the fix is quite simple. Instead of adding file offsets to the program header, just use 0 and load a file from the very beginning including the ELF header. This way the PT_PHDR header will be included. But because your code segment will have more data, you also need to correct all addresses. For example, the address of the program’s entry point.

In the previous section, I mentioned that there are some restrictions for Address, Offset, or Alignment fields. And it is pretty important because otherwise, the OS will just crash your application with the Segmentation Fault error. In some cases, your application will be loaded incorrectly in memory or into a section with wrong permissions (the code section without the Executable permission).

ARM has the fixed size instructions - 32-bits. So, if you want to call a function and pass an address like in the example above (syscall to print the message), you can’t do it directly. Because addresses are 64-bits long obviously you won’t be able to encode 64-bits in 32-bits. But there is a workaround. The ARM supports several addressing modes and one of them allows the calculation of the address as an offset from the current instruction. Usually, compilers/disassemblers display such commands as ldr x0, =msg which means loading the address of the msg variable into x0. But in fact it is ldr x0, [pc, #offset]. The pc register is Program Counter - a pointer (address) to the current instruction. This way we can calculate the address without a message but there are some limitations and compilers are making an additional step. Instead of calculating the address directly, they put the address into some memory segment (usually at the end of your code, to be close to the current instruction) and use the ldr instruction to point to this memory segment. So, ldr x0, =msg actually doesn’t point to msg but instead it point to a memory where the address of msg is stored and the CPU can load 64-bit address easily.

Conclusion

Sources