Vault7: CIA Hacking Tools Revealed
Owner: User #5341226
Reforge bytecode specification
This document describes the details of Reforge's compiled bytecode. It will be a living document for several weeks.
TODO: fix up formatting once I get all my initial thoughts recorded here.
When a Reforge script file is compiled, the following steps will be taken to produce the compiled bytecode.
All values and variables will be listed with ids in a vtable that contains their type and any initial value information.
All opcodes used in the bytecode will be randomly assigned, and recorded in a ctable that includes which module or core command it represents.
All instructions will be reduced down to bytecode instructions (as detailed in this document) that reference the information in the vtable as operand ids and information in the ctable as opcodes.
All non-core modules that were used will be packed into the compiled binary in an mtable that includes the offsets and sizes needed to locate the packed module.
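The first two compile steps above can be sketched as follows. This is only an illustration under stated assumptions: the names VTableEntry, build_vtable and build_ctable are hypothetical, operand ids are assumed to be handed out sequentially starting at 1, and command ids are assumed to be drawn from the 12-bit range used by the opcode layout in this document. The mtable is left out because its packing format is not specified here.

```python
import random
from dataclasses import dataclass


@dataclass
class VTableEntry:
    name: str
    type_name: str
    initial: object


def build_vtable(declarations):
    """Give each (name, type, initial value) declaration its own operand id."""
    # Assumption: ids are sequential and start at 1.
    return {
        operand_id: VTableEntry(name, type_name, initial)
        for operand_id, (name, type_name, initial) in enumerate(declarations, start=1)
    }


def build_ctable(commands, rng=None):
    """Randomly assign a distinct 12-bit command id to each command used."""
    rng = rng or random.SystemRandom()
    # sample() never repeats a value, so every command gets a unique id.
    command_ids = rng.sample(range(1 << 12), len(commands))
    return dict(zip(command_ids, commands))


vtable = build_vtable([("x", "int", 0), ("y", "list", []), ("t1", "string", "test")])
ctable = build_ctable(["LAD", "LGT", "ADD"])
```

Because the ctable is keyed by the randomized id, the same script compiled twice would normally produce different opcode values for the same commands.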
The bytecode will be organized as 64-bit aligned instructions.
Each opcode will be 16 bits in length, with the first 12 bits as the randomized (at compile time) command id and the last 4 bits reserved as bitflags.
Each operand id will be 16 bits in length and will match an id in the compiled vtable. (NOTE: I don't know whether 16-bit operand ids are sufficient to future-proof this design, but they make the byte packing work out nicely. They do limit us to 65,536 operand ids in any one script. If, during testing, we notice our test scripts getting upwards of 5000 active operand ids, I will change this to 24-bit operand ids to give us room for 16,777,216 in one script.)
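A minimal sketch of this layout, assuming big-endian byte order (the byte order is not specified in this document) and assuming unused operand slots are simply zero-filled. The function names are hypothetical.

```python
import struct


def make_opcode(command_id, flags=0):
    """Combine a 12-bit command id (high bits) with 4 bitflags (low bits)."""
    if not 0 <= command_id < (1 << 12):
        raise ValueError("command id must fit in 12 bits")
    if not 0 <= flags < (1 << 4):
        raise ValueError("flags must fit in 4 bits")
    return (command_id << 4) | flags


def split_opcode(opcode):
    """Return (command_id, flags) from a 16-bit opcode."""
    return opcode >> 4, opcode & 0xF


def pack_instruction(opcode, operand_ids):
    """Pack one opcode and up to three 16-bit operand ids into 8 bytes."""
    if len(operand_ids) > 3:
        raise ValueError("a single 64-bit instruction holds at most 3 operand ids")
    padded = list(operand_ids) + [0] * (3 - len(operand_ids))
    # ">4H" = four unsigned 16-bit fields, big-endian (an assumption).
    return struct.pack(">4H", opcode, *padded)


def unpack_instruction(word):
    """Split 8 bytes back into the opcode and its three operand slots."""
    opcode, a, b, c = struct.unpack(">4H", word)
    return opcode, (a, b, c)
```

One 16-bit opcode plus three 16-bit operand ids is exactly 64 bits, which is why 16-bit operand ids make the packing work out.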
The opcode's bitflags will identify how the operands are laid out. One bitflag will indicate that the first 16-bit operand is the total number of operands needed. If this number is >2 (more than fit in the two operand slots remaining in the first instruction), the next 64-bit instruction will be read and interpreted as 4 additional 16-bit operand ids. This will continue until a sufficient number of operands have been read. Any excess operand id space can be filled in with random data, as it will be ignored.
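The operand-count scheme can be sketched as an encoder and decoder pair. Everything named here is an assumption for illustration: FLAG_COUNT is a hypothetical choice of which bitflag marks "first operand is the count", and the byte order is assumed big-endian.

```python
import os
import struct

FLAG_COUNT = 0x1  # hypothetical bit position for the "operand count" flag


def encode(command_id, operand_ids):
    """Encode an instruction whose first operand slot holds the operand count."""
    opcode = (command_id << 4) | FLAG_COUNT
    slots = [len(operand_ids)] + list(operand_ids)
    # The first word has 3 operand slots; each continuation word has 4.
    # Unused slots are filled with random data, which the decoder ignores.
    while len(slots) < 3 or (len(slots) - 3) % 4:
        slots.append(int.from_bytes(os.urandom(2), "big"))
    data = struct.pack(">4H", opcode, *slots[:3])
    for i in range(3, len(slots), 4):
        data += struct.pack(">4H", *slots[i:i + 4])
    return data


def decode(data):
    """Return (command_id, operand_ids, bytes_consumed)."""
    opcode, count, *operands = struct.unpack(">4H", data[:8])
    if not opcode & FLAG_COUNT:
        raise ValueError("count flag not set")
    offset = 8
    while len(operands) < count:
        # Read the next 64-bit block as four more operand ids.
        operands += struct.unpack(">4H", data[offset:offset + 8])
        offset += 8
    return opcode >> 4, operands[:count], offset
```

With five operands, the count plus two ids fill the first word and the remaining three ids take one continuation word, leaving one slot of ignored filler.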
One opcode will be randomly assigned to the extended opcode value. This will indicate that the 48 bits set aside for operand ids are to be interpreted as an extended opcode, whose operands can be found in subsequent 64-bit instruction blocks. As with the other opcodes, the last 4 bits of the original extended opcode value (the 16-bit one) will be bitflags as discussed above. (NOTE: I don't think Reforge will actually need to use this for a very long time, if ever. But I'm future-proofing the design just in case. A single compiled script would need >4095 opcodes to require this feature.)
Bytecode Instructions: (Format: <OPCODE> <DESTINATION> <SOURCE> [<INDEX>])
List operations:
LAD - Insert the value in the source operand into the list specified in the destination operand, after the index specified in the index operand (if index is not provided, add to end)
LRM - Remove the value stored in the list specified in the source operand at the index specified in the index operand and store it in the destination operand
LST - Set the value stored in the list specified in the destination operand at the index specified in the index operand to the value in the source operand
LGT - Get the value stored in the list specified in the source operand at the index specified in the index operand and store it in the destination operand
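As a rough illustration of the four list operations, here is a sketch that models a Reforge list as a Python list. The function names are hypothetical, and "after the index" for LAD is read here as inserting at position index + 1, which is an assumption.

```python
def lad(target, value, index=None):
    """LAD: insert value after index, or append when no index is given."""
    if index is None:
        target.append(value)
    else:
        target.insert(index + 1, value)


def lrm(source, index):
    """LRM: remove and return the value at index."""
    return source.pop(index)


def lst(target, value, index):
    """LST: overwrite the value at index."""
    target[index] = value


def lgt(source, index):
    """LGT: read the value at index without removing it."""
    return source[index]


y = []
for item in (1, 2, 3):
    lad(y, item)          # y is now [1, 2, 3]
lad(y, 9, 0)              # insert after index 0
removed = lrm(y, 1)       # take the inserted value back out
lst(y, 4, 1)              # y[1] = 4
```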
Integer operations:
ADD - Add source operand to destination operand and store the result in destination operand
SUB - Subtract source operand from destination operand and store the result in destination operand
DIV - Divide destination operand by source operand and store the result in destination operand (NOTE: might store remainder of division in index operand if specified)
MOD - Divide destination operand by source operand and store the remainder in destination operand (NOTE: might store result of division in index operand if specified)
MUL - Multiply source operand by destination operand and store the result in destination operand
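The optional index operand for DIV and MOD might behave as in the sketch below. This assumes operands are looked up by name in a dictionary standing in for the vtable, and that division follows Python's floor-division rules; neither choice is specified in this document.

```python
def op_div(values, dest, source, index=None):
    """DIV: dest = dest / source; optionally store the remainder in index."""
    quotient, remainder = divmod(values[dest], values[source])
    values[dest] = quotient
    if index is not None:
        values[index] = remainder


def op_mod(values, dest, source, index=None):
    """MOD: dest = dest % source; optionally store the quotient in index."""
    quotient, remainder = divmod(values[dest], values[source])
    values[dest] = remainder
    if index is not None:
        values[index] = quotient


values = {"a": 17, "b": 5, "r": 0}
op_div(values, "a", "b", "r")  # a becomes 3, r becomes 2
```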
String operations:
SAP - Append the string stored in the source operand to the string stored in the destination operand and store the resulting string in the destination operand (NOTE: might repurpose the index operand to convert this into an insert command, with append as the assumed behavior if index isn't specified)
Stream operations:
ESO - Open an encrypted stream on the location specified in the source operand and set the destination operand to the created encrypted stream
PSO - Open a plaintext stream on the location specified in the source operand and set the destination operand to the created plaintext stream
PIP/FLS - Pipe/Flush the remaining contents of the stream in the source operand to the stream in the destination operand (NOTE: might repurpose the index operand as a byte count to limit amount read)
Control Flow operations: (NOTE: this should cover all loop types now that we have indexing in lists to handle things like for each loops)
JMP - Jump to the instruction located at the offset specified by the destination operand
CMP - Compare the source operand to the destination operand and set condition flags (NOTE: this might be implicitly called by Integer operations above)
JNE - Jump to the instruction located at the offset specified by the destination operand if the equal condition flag is not set
JEQ - Jump to the instruction located at the offset specified by the destination operand if the equal condition flag is set
JLE - Jump to the instruction located at the offset specified by the destination operand if either the equal or the less than condition flag is set
JGE - Jump to the instruction located at the offset specified by the destination operand if either the equal or the greater than condition flag is set
JLS - Jump to the instruction located at the offset specified by the destination operand if the less than condition flag is set
JGR - Jump to the instruction located at the offset specified by the destination operand if the greater than condition flag is set
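To show how CMP and the conditional jumps could combine into a loop, here is a small sketch. It makes several assumptions that this document does not settle: the less than flag is set when the source operand is less than the destination operand, jump offsets are instruction indexes rather than byte offsets, and operands are named entries in a dictionary standing in for the vtable.

```python
def cmp_flags(source, dest):
    """CMP: compare source to destination and return the condition flags."""
    return {"eq": source == dest, "lt": source < dest, "gt": source > dest}


# Which flags cause each jump opcode to be taken.
TAKEN = {
    "JMP": lambda f: True,
    "JNE": lambda f: not f["eq"],
    "JEQ": lambda f: f["eq"],
    "JLE": lambda f: f["eq"] or f["lt"],
    "JGE": lambda f: f["eq"] or f["gt"],
    "JLS": lambda f: f["lt"],
    "JGR": lambda f: f["gt"],
}


def run(program, values):
    flags = {"eq": False, "lt": False, "gt": False}
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "ADD":
            values[args[0]] += values[args[1]]
        elif op == "CMP":  # operands are (destination, source)
            flags = cmp_flags(values[args[1]], values[args[0]])
        elif TAKEN[op](flags):
            pc = values[args[0]]  # the operand holds the target offset
    return values


# Sum 0..4: loop back to offset 0 while i < limit.
values = {"i": 0, "total": 0, "one": 1, "limit": 5, "top": 0}
program = [
    ("ADD", "total", "i"),
    ("ADD", "i", "one"),
    ("CMP", "limit", "i"),
    ("JLS", "top"),
]
result = run(program, values)
```

The same pattern with LGT and an index counter would cover a for each loop over a list.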
Core operations: (NOTE: might add more like Run, Start, Unpack, etc... but for now I'm planning on those being separate modules)
EHO - Echo the contents of the source operand to the stream specified by the destination operand
PSE - Pause execution for a number of seconds specified by the destination operand
TODO: ensure opcode support for generating a list from a filepath is included, I don't think it is with just the above.
Mapping of Reforge commands to bytecode:
TODO: add mapping of Reforge commands here
vtable:
- x, int, 0
- y, list, [] #all lists are initialized in the vtable as empty, then dynamically created using bytecode instructions
- z, int, 0
- t1, string, 'test'
- t2, string, '.txt'
- t3, string, ' - plaintext'
- t4, string, ''
- filename, string, ''
- output, encryptedstream, ''
- output2, plaintextstream, ''
list y = [1, 2, 3]
- LAD y, 1
- LAD y, 2
- LAD y, 3
int x = y[2]
- LGT x, y, 2 #LGT does NOT perform type checking; however, subsequent use of the x operand with integer operations will attempt to force the value in x to be an integer
y[1] = 4
- LST y, 4, 1
int z = x + y[1] + 8
- LGT z, y, 1
- ADD z, x, 8 #alternatively: ADD z, x then ADD z, 8
z = x + y[1] + 8 - 3
- LGT z, y, 1
- ADD z, x, 5 #alternatively: ADD z, x then ADD z, 8 then SUB z, 3
z = x + y[1] - 3
- LGT z, y, 1
- ADD z, x, -3 #alternatively: ADD z, x then SUB z, 3
add_to_list y 'test'
- LAD y, t1
remove_from_list y, 0
- LRM NUL, y, 0 #NUL is a reserved operand id that is of void type and maintains no reference count. It is used when you have a mandatory operand that you don't need or care about
string filename = y[3] + '.txt'
- LGT filename, y, 3
- SAP filename, t2
encryptedstream output = filename
- ESO output, filename
plaintextstream output2 = filename + ' - plaintext'
- SAP t4, filename
- SAP t4, t3
- PSO output2, t4
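The first few mappings above (through int z = x + y[1] + 8) can be traced with a toy interpreter to check the expected values. This is a sketch under assumptions: literal operands are passed directly instead of through vtable ids, and the three-operand ADD is read as adding every source operand to the destination, which is inferred from the examples rather than stated.

```python
def execute(program, values):
    def val(operand):
        # Names resolve through the (stand-in) vtable; anything else is a literal.
        if isinstance(operand, str) and operand in values:
            return values[operand]
        return operand

    for op, dest, *rest in program:
        if op == "LAD":
            values[dest].append(val(rest[0]))
        elif op == "LGT":
            values[dest] = values[rest[0]][val(rest[1])]
        elif op == "LST":
            values[dest][val(rest[1])] = val(rest[0])
        elif op == "ADD":
            values[dest] += sum(val(r) for r in rest)
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return values


values = {"x": 0, "y": [], "z": 0}
program = [
    ("LAD", "y", 1), ("LAD", "y", 2), ("LAD", "y", 3),  # list y = [1, 2, 3]
    ("LGT", "x", "y", 2),                               # int x = y[2]
    ("LST", "y", 4, 1),                                 # y[1] = 4
    ("LGT", "z", "y", 1),                               # int z = x + y[1] + 8
    ("ADD", "z", "x", 8),
]
execute(program, values)
```

After the run, x holds 3, y holds [1, 4, 3], and z holds 15.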