A bit on custom image formats

haskal finally makes another blog post challenge 2020

this may seem fairly trivial, but one thing that has surprised me in a lot of the RE work i've been doing is that custom firmware image formats aren't actually difficult to figure out with completely opaque-box analysis. by custom image formats, i mean software blobs that vendors put out where, for one reason or another, they decided they were too cool for something standard and rolled their own way of assembling the blob from the component pieces of data that need to go in the firmware file. usually you might just throw binwalk at it and call it a day, but sometimes there's stuff binwalk doesn't have signatures for that could be important, so understanding the image format itself can be worth it.

the basics of firmware images are mostly this: there's data inside that you want to look at, most of the time there are multiple sections of different kinds of data, and usually each section has some metadata associated with it, like a name, data size, load address, checksum, build date, etc. the key is that you actually only care about the name and the size, and if you want to create your own images, the checksum too. these are all surprisingly easy to identify even if you're in a frill-less tool like xxd | less (confession: yeah i use xxd | less when i can't be bothered with things)

i'm going to refer to the stuff you're going to be interested in as "fields". think of each instance of the metadata as a C struct: we're trying to decipher the data type and semantics of each field in the struct. for example, if the metadata were described by this

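#include <stdint.h>

struct meta {
    char name[120];   /* ASCII name, zero-padded to a fixed length */
    uint32_t size;    /* size of the section data in bytes */
    uint32_t crc32;   /* checksum (CRC-32) */
};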

then you might be given the following, and you'd have to decipher how the original struct looked and what the parts mean

00000000: 6b65 726e 656c 2069 6d61 6765 0000 0000  kernel image....
00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000040: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000070: 0000 0000 0000 0000 c461 4a00 012b 9df8  .........aJ..+..
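
to make that concrete, here's a rough sketch of decoding the dump above in python once you've guessed the layout. the little-endian byte order is my assumption here: read that way the size comes out to a plausible ~4.6 MiB, while the big-endian reading would be around 3 GB, which seems unlikely for a kernel image.

```python
import struct

# the 128 bytes from the hexdump above: name padded to 120 bytes, then two words
header = b"kernel image".ljust(120, b"\x00") + bytes.fromhex("c4614a00012b9df8")

# assumption: both words are little-endian, laid out like struct meta
raw_name, size, crc32 = struct.unpack("<120sII", header)
name = raw_name.rstrip(b"\x00").decode("ascii")

# for comparison, the same size word read as big-endian
size_be = int.from_bytes(header[120:124], "big")

print(name)        # kernel image
print(size)        # 4874692 (0x4a61c4)
print(hex(crc32))  # 0xf89d2b01
print(size_be)     # 3294710272, way too big to be believable
```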

first, run binwalk to get an idea of where some of the data in the image is, and note the offsets. then open the image in some sort of hex viewer and find what looks like metadata. there might be an ASCII name that makes it apparent where the metadata is (one thing that seems to show up a lot is a whole build date and build number and so on in ASCII as part of this textual name field), or you might just see a repeating pattern of similar-looking data, maybe always starting with the same magic number. if you're at a complete loss, take the differences between some of the offsets from binwalk and see if any words in the image match those differences, which would indicate a size field (note: they could be little endian too!).

for the overall layout of images, there are two main patterns that i have personally seen: one where all the metadata is grouped together at the beginning of the image, and one where it's interspersed with the data. so be on the lookout for data that looks like either of these (or perhaps something else). more concretely, like

format 1:
---
metadata 1
metadata 2
metadata 3
section 1
section 2
section 3
---

format 2:
---
metadata 1
section 1
metadata 2
section 2
metadata 3
section 3
---
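
as a rough sketch, walking either layout could look something like this in python. everything here is an assumption for illustration: the metadata is taken to look exactly like struct meta from earlier (little-endian), the sections are assumed to be packed with no padding or alignment between them, and for format 1 you're assumed to already know how many metadata records there are. real formats will differ on all of these.

```python
import struct

# hypothetical record layout: 120-byte name, then little-endian size and crc32
META = struct.Struct("<120sII")

def parse_meta(blob, off):
    """decode one metadata record starting at offset off"""
    raw_name, size, crc = META.unpack_from(blob, off)
    return raw_name.rstrip(b"\x00").decode("ascii", "replace"), size, crc

def walk_format1(blob, count):
    """format 1: all the metadata first, then the sections back to back"""
    metas = [parse_meta(blob, i * META.size) for i in range(count)]
    off = count * META.size
    for name, size, _crc in metas:
        yield name, blob[off:off + size]
        off += size

def walk_format2(blob):
    """format 2: each metadata record is directly followed by its section"""
    off = 0
    while off + META.size <= len(blob):
        name, size, _crc = parse_meta(blob, off)
        off += META.size
        yield name, blob[off:off + size]
        off += size
```

from there, dumping each section to its own file is just a loop over the (name, data) pairs.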

now compare the metadata parts with each other and note the similarities and differences between them. if you end up with an image that has only one section in it, try to find another version of the same firmware, or anything else that can be compared. also note the target's word size, because fields will most likely be words. look for words in the metadata that are approximately the same as the differences between the offsets binwalk reported; these will be size fields. also look for parts that are just completely different between sections; these are probably checksums or hashes, and you can guess from the apparent size of the field what kind it might be (4 bytes could be a crc32, 32 bytes a sha256, ...). once you know where the size fields are located, you can walk the whole image and copy out each individual section based on what the overall layout looks like.
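
the size-field hunt is easy enough to script if you don't feel like eyeballing it. here's a minimal sketch, assuming 4-byte words aligned to 4 bytes and a list of offsets you copied out of binwalk by hand. the function name and the slack parameter are made up for this example; slack is how far a word may be from an offset difference and still count as "approximately the same" (the difference can include metadata or padding, so it rarely matches exactly).

```python
def find_size_candidates(blob, offsets, slack=64, word=4):
    """list aligned words that are close to a gap between consecutive offsets"""
    gaps = [b - a for a, b in zip(offsets, offsets[1:])]
    hits = []
    for pos in range(0, len(blob) - word + 1, word):
        chunk = blob[pos:pos + word]
        # try both byte orders, since the field could be either
        for endian in ("little", "big"):
            value = int.from_bytes(chunk, endian)
            for gap in gaps:
                if value > 0 and abs(value - gap) <= slack:
                    hits.append((pos, endian, value, gap))
    return hits
```

each hit is (position in the image, byte order, value found, the gap it matched). if the hits show up at a regular stride or always at the same spot relative to a name, that's a pretty good sign you've found the size field. this is pure python so it'll be slow on big images, but it's fine for poking around.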

so to put it together, basically you can find the following types of fields in arbitrary firmware images that describe the data inside and might help you dissect the image

  • name: look for ASCII, perhaps padded with a lot of zeroes out to a certain length that's always the same
  • size: look for words that are approximately the same as the differences between offsets of stuff that binwalk found in the image
  • checksum: look for parts of the data that are just wildly different when comparing multiple instances of metadata, all the bytes seem basically random
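
for the checksum bullet, a small sketch of both halves: finding the bytes that differ between metadata records, and then testing a guess. the helper names are made up for this example, and it assumes the vendor used the standard CRC-32 (the one zlib implements) over exactly the section data. plenty of formats use a different polynomial or also cover part of the header, so a miss here doesn't rule out a checksum.

```python
import zlib

def varying_offsets(records):
    """byte offsets where the metadata records don't all agree"""
    length = min(len(r) for r in records)
    return [i for i in range(length) if len({r[i] for r in records}) > 1]

def crc32_matches(field, data):
    """byte orders in which a 4-byte field equals zlib's CRC-32 of data"""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return [e for e in ("little", "big") if int.from_bytes(field, e) == crc]
```

runs of varying offsets that look random are the checksum candidates; feed a 4-byte run plus the section data you carved out into crc32_matches and see if either byte order comes back.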

unfortunately there aren't really any good real-world examples that it would be a good idea to post on this blog (if you have any suggestions, send a comment!) but i hope this is mildly helpful at least