Reverse Engineering an OVL File Format
This tutorial assumes that you have basic knowledge of a hex editor, data types and reverse engineering in general.
Workflow
- Download the source code for cobra tools and open it in an IDE. For a new format, you need to do three things:
    - In `/source/formats/`, duplicate an existing OVL file format's folder (e.g. `animalresearch`) and rename it to your new format. Rename the XML inside of it, too.
    - Run `python -m codegen` to generate `.py` files from the XML structure definitions in `/source/formats/`. These will be put in `/generated/formats/`. Whenever you change an XML definition, run the codegen again to update the `.py` files.
    - In `/modules/formats/`, create a new `.py` file with a class that handles your format. This tells the tools which XML-defined struct class to use (e.g. `ResearchRoot`) and which file extension to apply it to (e.g. `.animalresearchunlockssettings`). Minimal example:

      ```python
      from generated.formats.animalresearch.compound.ResearchRoot import ResearchRoot
      from modules.formats.BaseFormat import MemStructLoader


      class AnimalresearchunlockssettingsLoader(MemStructLoader):
          # XML-defined struct class used to read this format
          target_class = ResearchRoot
          # file extension this loader applies to
          extension = ".animalresearchunlockssettings"
      ```
- Open an OVL file with your format in the OVL editor and run `Util > Dump Debug Data`.

  DEV MODE
  Certain dev functions require the existence of a `.git` folder in the `cobra-tools` location. Either check out the repository with `git`, or create this folder to enable Dev Mode.

- Open the `.stack` file that was created in your OVL's folder. Search for your file extension. You will find, for example, the following:

  ```
  FILE [ 0 | 896] ( 64) cc_anubis.fgm
      PTR @ 16 -> SUB [ 0 | 164] ( 120)
      PTR @ 24 -> SUB [ 0 | 288] ( 608)
      PTR @ 32 -> SUB [ 0 | 1008] ( 40)
          DEP @ 0 -> cc_anubis.paosamplertexture.tex
          DEP @ 8 -> cc_anubis.pbasecolourtexture.tex
          DEP @ 16 -> cc_anubis.pflexicolourmaskssamplertexture.tex
          DEP @ 24 -> cc_anubis.pmetalsmoothnesscavitysamplertexture.tex
          DEP @ 32 -> cc_anubis.pnormaltexture.tex
      PTR @ 40 -> SUB [ 0 | 0] ( 164)
  ```
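If you want to collect these entries programmatically, the log lines can be picked apart with regular expressions. The following is a minimal sketch, assuming the line layout matches the sample above; the patterns and the `parse_stack_line` helper are hypothetical and not part of cobra tools, and the real `.stack` file may contain entry kinds or spacing that this does not cover.

```python
import re

# Patterns inferred only from the sample log in this tutorial (assumption).
ENTRY_RE = re.compile(
    r"(?P<kind>FILE|SUB)\s*\[\s*(?P<pool>\d+)\s*\|\s*(?P<offset>\d+)\s*\]"
    r"\s*\(\s*(?P<size>\d+)\s*\)"
)
PTR_RE = re.compile(r"PTR\s*@\s*(?P<rel>\d+)\s*->\s*")
DEP_RE = re.compile(r"DEP\s*@\s*(?P<rel>\d+)\s*->\s*(?P<name>\S+)")


def parse_stack_line(line):
    """Return a dict describing one log line, or None if it is not recognised."""
    line = line.strip()
    dep = DEP_RE.match(line)
    if dep:
        return {"kind": "DEP", "rel": int(dep["rel"]), "name": dep["name"]}
    ptr = PTR_RE.match(line)
    if ptr:
        # Drop the "PTR @ n -> " prefix so the SUB part can be matched next.
        line = line[ptr.end():]
    entry = ENTRY_RE.match(line)
    if not entry:
        return None
    result = {
        "kind": "PTR" if ptr else entry["kind"],
        "pool": int(entry["pool"]),
        "offset": int(entry["offset"]),
        "size": int(entry["size"]),
    }
    if ptr:
        result["rel"] = int(ptr["rel"])  # offset relative to the parent struct
    else:
        result["name"] = line[entry.end():].strip()
    return result


print(parse_stack_line("FILE [ 0 | 896] ( 64) cc_anubis.fgm"))
print(parse_stack_line("PTR @ 16 -> SUB [ 0 | 164] ( 120)"))
```

Each `PTR` entry then carries both its position inside the parent struct (`rel`) and the pool, offset and size of the sub-struct it points to.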
- The `FILE` line of this log tells you that the main struct for `cc_anubis.fgm` is located in pool number 0, starts at offset `896` and occupies `64` bytes.

- Open the pool `.dmp` file in a hex editor. If you set the width to `8` (not always useful) and navigate to offset `896`, you will see the following:

  ```
  Offset(d) 00       04
  00000896  05000000 00000000  ........
  00000904  26000000 00000000  &.......
  00000912  40504F49 4E544552  @POINTER
  00000920  40504F49 4E544552  @POINTER
  00000928  40504F49 4E544552  @POINTER
  00000936  40504F49 4E544552  @POINTER
  00000944  00000000 00000000  ........
  00000952  00000000 00000000  ........
  ```

  The hex and the stack log tell you a number of things:

    - There are 4 pointers in your struct, which occupy 32 bytes in total (from 912-944, or relative to the struct: 16-48). For convenience, pointers are always given with `PTR @ relative_offset` in the log and marked `@POINTER` in the `.dmp`.
    - There are likely integers at relative offsets 0 (`5`) and 8 (`38`). These could represent counts for one of the pointers.
    - This file depends on 5 external files (in this case `.tex` textures, which makes sense, as a `.fgm` material has to refer to them). These are marked `DEP @ relative_offset` in the log and marked `@DEPENDS` in the `.dmp`. But these do not appear directly in the main struct.

  Let's start by writing an XML representation for the main struct, which covers its 64 bytes:

  ```xml
  <compound name="FgmHeader" inherit="MemStruct">
      <add name="count_0" type="uint64" />
      <add name="count_1" type="uint64" />
      <add name="ptr_0" type="Pointer" />
      <add name="ptr_1" type="Pointer" />
      <add name="ptr_2" type="Pointer" />
      <add name="ptr_3" type="Pointer" />
      <add name="unk_0" type="uint64" />
      <add name="unk_1" type="uint64" />
  </compound>
  ```

  Setting `type` to `Pointer` will make the tool read those 8 bytes as a pointer and then read a sub-struct at the address that this pointer points to. But first, you need to figure out the data layout of the pointer's sub-struct for this to work.
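To double-check the reading of the 64-byte main struct, you can decode the dumped bytes outside the hex editor. This is a rough sketch using only the Python standard library; it assumes little-endian 8-byte slots (consistent with `05000000 00000000` being read as `5`) and relies on the `@POINTER` placeholder that the `.dmp` writes in place of pointers. The `describe_struct` helper is hypothetical.

```python
import struct

# The 64 bytes of the main struct, copied from the hex dump (pool offset 896).
HEADER = bytes.fromhex(
    "05000000 00000000"
    "26000000 00000000"
    "40504F49 4E544552"
    "40504F49 4E544552"
    "40504F49 4E544552"
    "40504F49 4E544552"
    "00000000 00000000"
    "00000000 00000000"
)
POINTER_MARK = b"@POINTER"  # placeholder written into the .dmp for pointers


def describe_struct(data):
    """Split data into 8-byte slots and label each as pointer or uint64."""
    if len(data) % 8:
        raise ValueError("expected a multiple of 8 bytes")
    slots = []
    for rel in range(0, len(data), 8):
        chunk = data[rel:rel + 8]
        if chunk == POINTER_MARK:
            slots.append((rel, "pointer", None))
        else:
            (value,) = struct.unpack("<Q", chunk)  # little-endian uint64
            slots.append((rel, "uint64", value))
    return slots


for rel, kind, value in describe_struct(HEADER):
    print(f"{rel:2d}  {kind:8s} {'' if value is None else value}")
```

The pointer slots land at relative offsets 16, 24, 32 and 40, which are the same offsets the stack log reports as `PTR @ ...`.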
- Look at the sub-structs pointed to by the pointers.

  For `PTR @ 16`, you'll find 120 bytes starting at offset 164. You'll notice a repetition in the pattern after 24 bytes.

  ```
  Offset(d) 00       04
  00000160           AC020000      ¬...
  00000168  08000000 00000000  ........
  00000176  00000000 00000000  ........
  00000184  00000000 BE020000  ....¾...
  00000192  08000000 01000000  ........
  00000200  00000000 00000000  ........
  00000208  00000000 D1020000  ....Ñ...
  ........
  ```

  The size `24 (size of sub-sub-struct) * 5 (count) = 120 (size of sub-struct)` indicates that the count is actually used for this pointer, and you're looking at an array. The whole struct for `PTR @ 16`, now set to 24 bytes width:

  ```
  Offset(d) 00       04       08       12       16       20
  00000144                                               AC020000                      ¬...
  00000168  08000000 00000000 00000000 00000000 00000000 BE020000  ....................¾...
  00000192  08000000 01000000 00000000 00000000 00000000 D1020000  ....................Ñ...
  00000216  08000000 02000000 00000000 00000000 00000000 F1020000  ....................ñ...
  00000240  08000000 03000000 00000000 00000000 00000000 16030000  ........................
  00000264  08000000 04000000 00000000 00000000 00000000           ....................
  ```

  Now you can see the sub-sub-struct is likely composed of 6 `uint`s. The first of these could be a string offset, the second is constantly 8, the third increments (an index?) and the rest are zeros.

  For `PTR @ 32`, you'll find 40 bytes, occupied only by 5 dependency links. The stack log tells you which external file dependency points there.

  ```
  Offset(d) 00       04
  00001008  40444550 454E4453  @DEPENDS
  00001016  40444550 454E4453  @DEPENDS
  00001024  40444550 454E4453  @DEPENDS
  00001032  40444550 454E4453  @DEPENDS
  00001040  40444550 454E4453  @DEPENDS
  ```
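The array interpretation of `PTR @ 16` can be checked by unpacking the 120 bytes as five records of six little-endian 32-bit integers. The sketch below uses the bytes from the dump (starting at offset 164); the `read_entries` helper and the column names are only working labels, not anything defined by the tools.

```python
import struct

ENTRY_SIZE = 24  # 6 * 4 bytes, the repetition length seen in the dump

# The 120 bytes behind PTR @ 16, copied from the hex dump (pool offset 164).
ARRAY = bytes.fromhex(
    "AC020000 08000000 00000000 00000000 00000000 00000000"
    "BE020000 08000000 01000000 00000000 00000000 00000000"
    "D1020000 08000000 02000000 00000000 00000000 00000000"
    "F1020000 08000000 03000000 00000000 00000000 00000000"
    "16030000 08000000 04000000 00000000 00000000 00000000"
)


def read_entries(data, count):
    """Unpack count records of six uint32 values each."""
    if len(data) != count * ENTRY_SIZE:
        raise ValueError("size does not match count * ENTRY_SIZE")
    return [
        struct.unpack_from("<6I", data, i * ENTRY_SIZE) for i in range(count)
    ]


for offset, const, index, *rest in read_entries(ARRAY, 5):
    print(f"offset={offset:4d} const={const} index={index} rest={rest}")
```

The second column is 8 in every record and the third runs from 0 to 4, which matches the observations made from the 24-byte-wide view.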
- Now that you have some more knowledge of the format, it is time to document the struct in XML syntax for the codegen. This will result in something like the following:

  ```xml
  <compound name="FgmHeader" inherit="MemStruct">
      <add name="count_0" type="uint64" />
      <add name="count_1" type="uint64" />
      <add name="array_0" type="ArrayPointer" template="Sub1" arg="count_0"/>
      <add name="ptr_1" type="Pointer" />
      <add name="dependencies" type="Pointer" />
      <add name="ptr_3" type="Pointer" />
      <add name="unk_0" type="uint64" />
      <add name="unk_1" type="uint64" />
  </compound>

  <compound name="Sub1" inherit="MemStruct">
      <add name="offset" type="uint" />
      <add name="constant_eight" type="uint" />
      <add name="index" type="uint" />
      <add name="zero_0" type="uint" />
      <add name="zero_1" type="uint" />
      <add name="zero_2" type="uint" />
  </compound>
  ```

  Notice that `ptr_0` has been renamed to `array_0` and its `type` changed to `ArrayPointer`. Its sub-struct is set to `template="Sub1"`, counted by `arg="count_0"`.
Tips & Tricks
Identifying counts for pointers
You'll want to compare the data size of the sub-structs with candidates for counts. If the size divides evenly by a candidate count, you have a likely match. Be aware that these are memory representations, and (array) data can be, and often is, padded to align to 16-byte offsets.
In most but not all formats, the count somewhat counter-intuitively follows the array pointer.
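The divisibility check is easy to automate when there are several candidate counts. The following sketch is one possible way to do it; the `element_size_candidates` helper is hypothetical, and it assumes element sizes are multiples of 4 bytes and that at most 15 bytes of trailing padding may follow the array. Both assumptions may need adjusting for a given format.

```python
def element_size_candidates(sub_size, count, align=16, step=4):
    """Return (element_size, padding) pairs that are consistent with sub_size.

    sub_size: size in bytes of the sub-struct, as reported in the stack log
    count:    candidate count read from the parent struct
    align:    assumed alignment; padding must stay below this value
    step:     assumed granularity of element sizes
    """
    if count <= 0:
        return []
    results = []
    for elem in range(step, sub_size // count + 1, step):
        padding = sub_size - count * elem
        if padding < align:
            results.append((elem, padding))
    return results


# Values from the cc_anubis.fgm example: counts 5 and 38, sub-struct of 120 bytes.
print(element_size_candidates(120, 5))   # candidate count 5
print(element_size_candidates(120, 38))  # candidate count 38
print(element_size_candidates(608, 38))  # sub-struct behind PTR @ 24
```

For the example file, only the count 5 fits the 120-byte sub-struct, with 24-byte elements and no padding. The 608 bytes behind `PTR @ 24` divide evenly by 38, which hints (but does not prove) that the second integer counts that pointer's data.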
Data type of sub-structs
A quick way to determine the data type of sub-structs is looking at the stack log.
- Are there any pointers in the sub-struct? → it must be a struct too
- Is its length not divisible by 8? → it is most likely a ZString
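These two rules of thumb can be written down as a small helper when you are triaging many sub-structs from a stack log. This is only a sketch of the heuristic as stated here; `guess_substruct_kind` is a hypothetical name, and the result is a hint rather than a definitive classification.

```python
def guess_substruct_kind(size, pointer_count):
    """Apply the two stack-log heuristics to one sub-struct.

    size:          size in bytes of the sub-struct
    pointer_count: number of PTR entries listed under it in the stack log
    """
    if pointer_count > 0:
        return "struct"
    if size % 8 != 0:
        return "probably ZString"
    return "undetermined"


# Sub-structs from the cc_anubis.fgm example, none of which contain pointers.
for size in (120, 608, 40, 164):
    print(size, guess_substruct_kind(size, 0))
```

In the example, only the 164-byte sub-struct behind `PTR @ 40` has a length that is not divisible by 8, so it is the one most likely to hold string data.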
Finding rare pointers
You can easily miss out on conditional pointers if you don't look at all files of a format in the stack log, as null pointers don't necessarily appear in the stack log.
Once you have defined and implemented a preliminary struct, open an OVL containing your format with `Debug Mode` turned on. All instances of the struct are then checked for pointers missing from the XML specification and you will receive warnings in the console if any are found.
Naming arrays and counts
If you follow naming conventions for arrays and counts, the count is automatically hidden from the XML on extraction and calculated on injection. Consider the following examples:
- dependencies, dependency_count
- dependencies, dependencies_count
- dependencies, num_dependencies
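To check whether a pair of names follows one of the three patterns listed above, you can generate the expected count names from the array name. The sketch below is an illustration only: the helpers are hypothetical, the singular form is derived with a naive rule, and the matching rules actually used by the tools are not shown in this tutorial and may differ.

```python
def count_name_candidates(array_name):
    """Build count names following the three patterns listed above."""
    # Naive singular form (assumption): "dependencies" -> "dependency".
    if array_name.endswith("ies"):
        singular = array_name[:-3] + "y"
    elif array_name.endswith("s"):
        singular = array_name[:-1]
    else:
        singular = array_name
    names = [
        f"{singular}_count",
        f"{array_name}_count",
        f"num_{array_name}",
    ]
    return list(dict.fromkeys(names))  # drop duplicates, keep order


def find_count_field(array_name, field_names):
    """Return the first field that looks like the count of array_name."""
    for candidate in count_name_candidates(array_name):
        if candidate in field_names:
            return candidate
    return None


print(count_name_candidates("dependencies"))
print(find_count_field("dependencies", ["count_0", "num_dependencies"]))
```

Generic names such as `array_0` and `count_0` do not match any of these patterns, so they would need to be renamed before the convention can apply.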
Homogeneous data with no obvious pattern
Assuming you have already identified the data type: modify the data, load the result in-game, and observe what changes to identify the meaning of the data.
Comparing to original
Once you have a suitable XML description, extract your file and inject it back into the OVL. If it injected successfully, select the file in the GUI and click Utils > Compare, then select the OVL that contains the original file. If there are differences, you will find warnings in the log. If the injected file is identical to the original, the log will tell you so. You then have a working description of your format, at least for that particular file.
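If the log reports differences and you have both versions of the data available as raw bytes (for example from two debug dumps), locating the first differing offset narrows down which field is read or written incorrectly. This is a generic sketch and not how the built-in comparison is implemented; `first_difference` is a hypothetical helper.

```python
def first_difference(original, injected):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (a, b) in enumerate(zip(original, injected)):
        if a != b:
            return offset
    if len(original) != len(injected):
        # One buffer is a prefix of the other; they diverge where it ends.
        return min(len(original), len(injected))
    return None


original = bytes.fromhex("05000000 00000000 26000000 00000000")
injected = bytes.fromhex("05000000 00000000 27000000 00000000")
print(first_difference(original, injected))
```

An offset relative to the start of the struct can then be compared against the field offsets in your XML description.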