Add support to Hermes version 83, 89 and improve contribution process #25

Open
wants to merge 9 commits into
base: add-hbc-83-89-and-improve-contribution
48 changes: 48 additions & 0 deletions ADD_NEW_VERSION.md
@@ -0,0 +1,48 @@

# Process

Create a fork of the repository https://github.com/cyfinoid/hbctool.

## Example: adding HBC89

1. Update `hbctool/hbc/__init__.py` to import the new class and register it in the `HBC` dictionary.
![image](https://user-images.githubusercontent.com/236843/197051405-6cc73905-c3ff-4624-b7ce-d566b6d22e4b.png)

2. Create a folder `hbc89` in the `hbctool/hbc/` folder.
3. Inside the new folder, run the following commands:
   1. `mkdir data raw tool`
   2. `cp ../hbc76/translator.py ./`
   3. `cp ../hbc76/parser.py ./`
   4. `cd tool; cp ../../hbc76/tool/opcode_generator.py ./; cd ..`
   5. `cp ../hbc76/data/structure.json ./data/structure.json`
4. Find the exact version of Hermes that produces the same bytecode version.
   For recent releases (after version 0.70), Hermes is compiled directly by the Facebook React Native team, so we need to dig in to identify the source.

   a. We used React Native 0.70.3 as our base.
   b. Either download the source for that version or browse it in the GitHub interface: https://github.com/facebook/react-native/blob/v0.70.3/sdks/.hermesversion
   c. The file `sdks/.hermesversion` contains the tag of the specific Hermes code used in this version of React Native.
   d. In our case we found `hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7`.

Now navigate to the facebook/hermes repository (https://github.com/facebook/hermes).
It has a tag with the same name: https://github.com/facebook/hermes/tree/hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7

**Please note**: an earlier notice from the original author at https://suam.wtf/posts/react-native-application-static-analysis-en/ suggested looking for the version as follows: `The version of Hermes bytecode can be found in /include/hermes/BCGen/HBC/BytecodeFileFormat.h file around line 33.`
However, newer updates have moved the version number into a separate file: `include/hermes/BCGen/HBC/BytecodeVersion.h`.

We need the contents of this project for the next step.
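
Before going further, it can help to confirm that the chosen tag really carries the expected bytecode version. The following is a minimal sketch, assuming the header declares the version in the form `BYTECODE_VERSION = <number>;`. The helper name `find_bytecode_version` and the sample text are illustrative and not part of hbctool.

```python
import re


def find_bytecode_version(header_text):
    """Return the integer assigned to BYTECODE_VERSION, or None if absent."""
    match = re.search(r"\bBYTECODE_VERSION\s*=\s*(\d+)\s*;", header_text)
    return int(match.group(1)) if match else None


# Illustrative fragment only (assumed declaration style). The real file is
# include/hermes/BCGen/HBC/BytecodeVersion.h at the selected tag.
sample = "const static uint32_t BYTECODE_VERSION = 89;\n"
print(find_bytecode_version(sample))
```

If the function returns `None`, the declaration probably has a different shape in that tag and the header should be read by hand.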

5. Download the following files from the `include/hermes/BCGen/HBC/` folder of the project tree at that specific tag into the `raw` folder.
6. Add `BytecodeList.def`: https://github.com/facebook/hermes/blob/hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7/include/hermes/BCGen/HBC/BytecodeList.def
7. Add `BytecodeFileFormat.h`: https://github.com/facebook/hermes/blob/hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7/include/hermes/BCGen/HBC/BytecodeFileFormat.h
8. Add `SerializedLiteralGenerator.h`: https://github.com/facebook/hermes/blob/hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7/include/hermes/BCGen/HBC/SerializedLiteralGenerator.h
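
The three downloads can also be scripted. The sketch below is one possible way to do it, assuming GitHub serves raw file contents under `raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>`. The function names are hypothetical helpers, not part of hbctool, and the download function is defined but not called here.

```python
from pathlib import Path
from urllib.request import urlretrieve

HERMES_TAG = "hermes-2022-09-14-RNv0.70.1-2a6b111ab289b55d7b78b5fdf105f466ba270fd7"
HBC_DIR = "include/hermes/BCGen/HBC"
RAW_FILES = ["BytecodeList.def", "BytecodeFileFormat.h", "SerializedLiteralGenerator.h"]


def raw_url(tag, name):
    """Build the raw-content URL for one header at the given tag (assumed URL scheme)."""
    return f"https://raw.githubusercontent.com/facebook/hermes/{tag}/{HBC_DIR}/{name}"


def download_raw_files(tag, raw_dir):
    """Fetch the three files into raw_dir, creating the folder if needed."""
    raw_dir = Path(raw_dir)
    raw_dir.mkdir(parents=True, exist_ok=True)
    for name in RAW_FILES:
        urlretrieve(raw_url(tag, name), str(raw_dir / name))


# Print the URLs so they can be checked before anything is downloaded.
for name in RAW_FILES:
    print(raw_url(HERMES_TAG, name))
```

Calling `download_raw_files(HERMES_TAG, "hbctool/hbc/hbc89/raw")` from the repository root would then populate the `raw` folder.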

Once these three files are copied, run the opcode generator tool in the `tool` folder.
9. `cd tool; python3 opcode_generator.py`. This generates `opcode.json` in the `data` folder.
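
The folder setup from steps 2 and 3 can be sketched in Python as well, which avoids mistakes when typing the copy commands. This is only an illustration: `scaffold_version` is a hypothetical helper, and it assumes `hbc76` is the version being copied from, as in the example above.

```python
import shutil
from pathlib import Path

# Files copied from the base version, relative to the version folder.
COPIED_FILES = (
    "translator.py",
    "parser.py",
    "tool/opcode_generator.py",
    "data/structure.json",
)


def scaffold_version(hbc_dir, new, base="hbc76"):
    """Create hbc_dir/new with data, raw and tool folders and copy the base files."""
    hbc_dir = Path(hbc_dir)
    src, dst = hbc_dir / base, hbc_dir / new
    for sub in ("data", "raw", "tool"):
        (dst / sub).mkdir(parents=True, exist_ok=True)
    for rel in COPIED_FILES:
        shutil.copy(src / rel, dst / rel)
    return dst
```

From the repository root, `scaffold_version("hbctool/hbc", "hbc89")` would prepare the new folder. The `raw` folder still has to be filled and the opcode generator still has to be run afterwards.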

Once this is done, we need just one more change: copy `__init__.py` over from an older hbcXX version and modify two lines.
Change the class name to match HBCXX and change the value returned by `getVersion(self)`.
![image](https://user-images.githubusercontent.com/236843/197051569-df72e045-8a56-4773-b46f-50f997a17877.png)
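
The two-line change can be expressed as a small text rewrite. The sketch below assumes the older `__init__.py` contains `class HBC<old>` and a `getVersion` method whose body is a single `return <old>`. `retarget_init` is a hypothetical helper, not part of hbctool.

```python
import re


def retarget_init(source, old, new):
    """Rename class HBC<old> to HBC<new> and update getVersion's return value."""
    source = re.sub(rf"\bclass HBC{old}\b", f"class HBC{new}", source)
    # Only touch the return statement directly inside getVersion.
    source = re.sub(
        rf"(def getVersion\(self\):\s*\n\s*return\s+){old}\b",
        rf"\g<1>{new}",
        source,
    )
    return source


sample = "class HBC76:\n    def getVersion(self):\n        return 76\n"
print(retarget_init(sample, 76, 89))
```

Other occurrences of the old number are left alone on purpose, so any further version-specific changes still need a manual review.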


Note:
This is still a work in progress; we are working backwards from the code. I need to double-check whether other changes than just these are needed.
16 changes: 13 additions & 3 deletions README.md
@@ -1,4 +1,4 @@
# hbctool
# hbctool

[![Python 3.x](https://img.shields.io/badge/python-3.x-yellow.svg)](https://python.org) [![PyPI version](https://badge.fury.io/py/hbctool.svg)](https://badge.fury.io/py/hbctool) [![Software License](https://img.shields.io/badge/license-MIT-brightgreen.svg)](/LICENSE)

@@ -70,6 +70,10 @@ hbctool currently supports the following Hermes Bytecode version:
- [Hermes Bytecode version 62](/hbctool/hbc/hbc62/)
- [Hermes Bytecode version 74](/hbctool/hbc/hbc74/)
- [Hermes Bytecode version 76](/hbctool/hbc/hbc76/)
- [Hermes Bytecode version 83](/hbctool/hbc/hbc83/)
- [Hermes Bytecode version 84](/hbctool/hbc/hbc84/)
- [Hermes Bytecode version 85](/hbctool/hbc/hbc85/)
- [Hermes Bytecode version 89](/hbctool/hbc/hbc89/)

## Contribution

@@ -78,10 +82,11 @@ Feel free to create an issue or submit the merge request. Anyway you want to con
However, please run the unit test before submitting the pull request.

```
cd hbctool
python test.py
python3 test.py
```

Note: test.py has been moved up one level in the directory structure so that the example file paths resolve correctly.

I use poetry to build this tool. To build it yourself, simply execute:

```
@@ -94,3 +99,8 @@ poetry install
- Create a class abstraction
- Support overflow patching
- Do all TODO, NOTE, FIXME in source code


## Credits

- [https://github.com/bongtrop : Bongtrop](https://github.com/bongtrop) : For initial commits
6 changes: 6 additions & 0 deletions hbctool/hbc/__init__.py
@@ -1,6 +1,9 @@

from hbctool.util import *
from hbctool.hbc.hbc89 import HBC89
from hbctool.hbc.hbc85 import HBC85
from hbctool.hbc.hbc84 import HBC84
from hbctool.hbc.hbc83 import HBC83
from hbctool.hbc.hbc76 import HBC76
from hbctool.hbc.hbc74 import HBC74
from hbctool.hbc.hbc62 import HBC62
@@ -15,7 +18,10 @@
BYTECODE_ALIGNMENT = 4

HBC = {
    89: HBC89,
    85: HBC85,
    84: HBC84,
    83: HBC83,
    76: HBC76,
    74: HBC74,
    62: HBC62,
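
The `HBC` dictionary maps a bytecode version number to the class that handles it, which is why every new version needs both an import and an entry. The following is a minimal sketch of that lookup with stub classes. `select_hbc` and the stubs are hypothetical and only illustrate the dispatch; they are not the loader hbctool actually uses.

```python
class HBC76:
    """Stub standing in for the real hbc76 class."""

    def getVersion(self):
        return 76


class HBC89:
    """Stub standing in for the real hbc89 class."""

    def getVersion(self):
        return 89


HBC = {
    89: HBC89,
    76: HBC76,
}


def select_hbc(version):
    """Return the handler class for a bytecode version, or raise if unsupported."""
    if version not in HBC:
        supported = ", ".join(str(v) for v in sorted(HBC))
        raise ValueError(f"Unsupported HBC version {version}; supported: {supported}")
    return HBC[version]


print(select_hbc(89)().getVersion())
```

A version that is imported but missing from the dictionary, or the reverse, would fail at this lookup, so both edits belong in the same commit.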
2 changes: 1 addition & 1 deletion hbctool/hbc/hbc59/test.py
@@ -79,7 +79,7 @@ def test_hbc(self):
hbcl.dump(hbc, f)
f.close()

f = open("hbc/hbc59/example/index.android.bundle", "rb")
f = open("hbctool/hbc/hbc59/example/index.android.bundle", "rb")
a = f.read()
f.close()
f = open("/tmp/hbctool_test.android.bundle", "rb")
2 changes: 1 addition & 1 deletion hbctool/hbc/hbc62/test.py
@@ -79,7 +79,7 @@ def test_hbc(self):
hbcl.dump(hbc, f)
f.close()

f = open("hbc/hbc62/example/index.android.bundle", "rb")
f = open("hbctool/hbc/hbc62/example/index.android.bundle", "rb")
a = f.read()
f.close()
f = open("/tmp/hbctool_test.android.bundle", "rb")
2 changes: 1 addition & 1 deletion hbctool/hbc/hbc74/test.py
@@ -77,7 +77,7 @@ def test_hbc(self):
hbcl.dump(hbc, f)
f.close()

f = open("hbc/hbc74/example/index.android.bundle", "rb")
f = open("hbctool/hbc/hbc74/example/index.android.bundle", "rb")
a = f.read()
f.close()
f = open("/tmp/hbctool_test.android.bundle", "rb")
2 changes: 1 addition & 1 deletion hbctool/hbc/hbc76/test.py
@@ -75,7 +75,7 @@ def test_hbc(self):
hbcl.dump(hbc, f)
f.close()

f = open("hbc/hbc76/example/index.android.bundle", "rb")
f = open("hbctool/hbc/hbc76/example/index.android.bundle", "rb")
a = f.read()
f.close()
f = open("/tmp/hbctool_test.android.bundle", "rb")
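
The four path changes above make the tests depend on being run from the repository root. One alternative, shown here only as a sketch and not as what this pull request does, is to build the path from a known root directory with `pathlib`. `example_bundle` is a hypothetical helper.

```python
from pathlib import Path


def example_bundle(version, root=None):
    """Return the path of the example bundle for one HBC version.

    root defaults to the directory containing this file, which is assumed
    to be the repository root (where test.py now lives).
    """
    root = Path(root) if root is not None else Path(__file__).resolve().parent
    return root / "hbctool" / "hbc" / f"hbc{version}" / "example" / "index.android.bundle"


print(example_bundle(59, root=".").as_posix())
```

With such a helper, each test could open `example_bundle(59)` and behave the same regardless of the current working directory.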
233 changes: 233 additions & 0 deletions hbctool/hbc/hbc83/__init__.py
@@ -0,0 +1,233 @@
from hbctool.util import *
from .parser import parse, export, INVALID_LENGTH
from .translator import disassemble, assemble
from struct import pack, unpack

NullTag = 0
TrueTag = 1 << 4
FalseTag = 2 << 4
NumberTag = 3 << 4
LongStringTag = 4 << 4
ShortStringTag = 5 << 4
ByteStringTag = 6 << 4
IntegerTag = 7 << 4
TagMask = 0x70

class HBC83:
    def __init__(self, f=None):
        if f:
            self.obj = parse(f)
        else:
            self.obj = None

    def export(self, f):
        export(self.getObj(), f)

    def getObj(self):
        assert self.obj, "Obj is not set."
        return self.obj

    def setObj(self, obj):
        self.obj = obj

    def getVersion(self):
        return 83

    def getHeader(self):
        return self.getObj()["header"]

    def getFunctionCount(self):
        return self.getObj()["header"]["functionCount"]

    def getFunction(self, fid, disasm=True):
        assert fid >= 0 and fid < self.getFunctionCount(), "Invalid function ID"

        functionHeader = self.getObj()["functionHeaders"][fid]
        offset = functionHeader["offset"]
        paramCount = functionHeader["paramCount"]
        registerCount = functionHeader["frameSize"]
        symbolCount = functionHeader["environmentSize"]
        bytecodeSizeInBytes = functionHeader["bytecodeSizeInBytes"]
        functionName = functionHeader["functionName"]

        instOffset = self.getObj()["instOffset"]
        start = offset - instOffset
        end = start + bytecodeSizeInBytes
        bc = self.getObj()["inst"][start:end]
        insts = bc
        if disasm:
            insts = disassemble(bc)

        functionNameStr, _ = self.getString(functionName)

        return functionNameStr, paramCount, registerCount, symbolCount, insts, functionHeader

    def setFunction(self, fid, func, disasm=True):
        assert fid >= 0 and fid < self.getFunctionCount(), "Invalid function ID"

        functionName, paramCount, registerCount, symbolCount, insts, _ = func

        functionHeader = self.getObj()["functionHeaders"][fid]

        functionHeader["paramCount"] = paramCount
        functionHeader["frameSize"] = registerCount
        functionHeader["environmentSize"] = symbolCount

        # TODO : Make this work
        # functionHeader["functionName"] = functionName

        offset = functionHeader["offset"]
        bytecodeSizeInBytes = functionHeader["bytecodeSizeInBytes"]

        instOffset = self.getObj()["instOffset"]
        start = offset - instOffset

        bc = insts

        if disasm:
            bc = assemble(insts)

        assert len(bc) <= bytecodeSizeInBytes, "Overflowed instruction length is not supported yet."
        functionHeader["bytecodeSizeInBytes"] = len(bc)
        memcpy(self.getObj()["inst"], bc, start, len(bc))

    def getStringCount(self):
        return self.getObj()["header"]["stringCount"]

    def getString(self, sid):
        assert sid >= 0 and sid < self.getStringCount(), "Invalid string ID"

        stringTableEntry = self.getObj()["stringTableEntries"][sid]
        stringStorage = self.getObj()["stringStorage"]
        stringTableOverflowEntries = self.getObj()["stringTableOverflowEntries"]

        isUTF16 = stringTableEntry["isUTF16"]
        offset = stringTableEntry["offset"]
        length = stringTableEntry["length"]

        if length >= INVALID_LENGTH:
            stringTableOverflowEntry = stringTableOverflowEntries[offset]
            offset = stringTableOverflowEntry["offset"]
            length = stringTableOverflowEntry["length"]

        if isUTF16:
            length*=2

        s = bytes(stringStorage[offset:offset + length])
        return s.hex() if isUTF16 else s.decode("utf-8"), (isUTF16, offset, length)

    def setString(self, sid, val):
        assert sid >= 0 and sid < self.getStringCount(), "Invalid string ID"

        stringTableEntry = self.getObj()["stringTableEntries"][sid]
        stringStorage = self.getObj()["stringStorage"]
        stringTableOverflowEntries = self.getObj()["stringTableOverflowEntries"]

        isUTF16 = stringTableEntry["isUTF16"]
        offset = stringTableEntry["offset"]
        length = stringTableEntry["length"]

        if length >= INVALID_LENGTH:
            stringTableOverflowEntry = stringTableOverflowEntries[offset]
            offset = stringTableOverflowEntry["offset"]
            length = stringTableOverflowEntry["length"]

        if isUTF16:
            s = list(bytes.fromhex(val))
            l = len(s)//2
        else:
            l = len(val)
            s = val.encode("utf-8")

        assert l <= length, "Overflowed string length is not supported yet."

        memcpy(stringStorage, s, offset, len(s))

    def _checkBufferTag(self, buf, iid):
        keyTag = buf[iid]
        if keyTag & 0x80:
            return (((keyTag & 0x0f) << 8) | (buf[iid + 1]), keyTag & TagMask)
        else:
            return (keyTag & 0x0f, keyTag & TagMask)

    def _SLPToString(self, tag, buf, iid, ind):
        start = iid + ind
        if tag == ByteStringTag:
            type = "String"
            val = buf[start]
            ind += 1
        elif tag == ShortStringTag:
            type = "String"
            val = unpack("<H", bytes(buf[start:start+2]))[0]
            ind += 2
        elif tag == LongStringTag:
            type = "String"
            val = unpack("<L", bytes(buf[start:start+4]))[0]
            ind += 4
        elif tag == NumberTag:
            type = "Number"
            val = unpack("<d", bytes(buf[start:start+8]))[0]
            ind += 8
        elif tag == IntegerTag:
            type = "Integer"
            val = unpack("<L", bytes(buf[start:start+4]))[0]
            ind += 4
        elif tag == NullTag:
            type = "Null"
            val = None
        elif tag == TrueTag:
            type = "Boolean"
            val = True
        elif tag == FalseTag:
            type = "Boolean"
            val = False
        else:
            type = "Empty"
            val = None

        return type, val, ind

    def getArrayBufferSize(self):
        return self.getObj()["header"]["arrayBufferSize"]

    def getArray(self, aid):
        assert aid >= 0 and aid < self.getArrayBufferSize(), "Invalid Array ID"
        tag = self._checkBufferTag(self.getObj()["arrayBuffer"], aid)
        ind = 2 if tag[0] > 0x0f else 1
        arr = []
        t = None
        for _ in range(tag[0]):
            t, val, ind = self._SLPToString(tag[1], self.getObj()["arrayBuffer"], aid, ind)
            arr.append(val)

        return t, arr

    def getObjKeyBufferSize(self):
        return self.getObj()["header"]["objKeyBufferSize"]

    def getObjKey(self, kid):
        assert kid >= 0 and kid < self.getObjKeyBufferSize(), "Invalid ObjKey ID"
        tag = self._checkBufferTag(self.getObj()["objKeyBuffer"], kid)
        ind = 2 if tag[0] > 0x0f else 1
        keys = []
        t = None
        for _ in range(tag[0]):
            t, val, ind = self._SLPToString(tag[1], self.getObj()["objKeyBuffer"], kid, ind)
            keys.append(val)

        return t, keys

    def getObjValueBufferSize(self):
        return self.getObj()["header"]["objValueBufferSize"]

    def getObjValue(self, vid):
        assert vid >= 0 and vid < self.getObjValueBufferSize(), "Invalid ObjValue ID"
        tag = self._checkBufferTag(self.getObj()["objValueBuffer"], vid)
        ind = 2 if tag[0] > 0x0f else 1
        keys = []
        t = None
        for _ in range(tag[0]):
            t, val, ind = self._SLPToString(tag[1], self.getObj()["objValueBuffer"], vid, ind)
            keys.append(val)

        return t, keys
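
The buffer readers above all start by decoding a tag byte: bit `0x80` selects a two-byte header, bits `0x70` carry the literal type, and the low four bits (plus the next byte in the long form) carry the element count. The standalone sketch below mirrors that bit layout as read from `_checkBufferTag` and the tag constants in this file. `decode_tag` and `TAG_NAMES` are illustrative names, and taking the header length from the `0x80` bit is a choice made in this sketch, whereas the class derives it from the count.

```python
TAG_MASK = 0x70

# Tag values as defined at the top of hbc83/__init__.py (n << 4).
TAG_NAMES = {
    0x00: "Null",
    0x10: "True",
    0x20: "False",
    0x30: "Number",
    0x40: "LongString",
    0x50: "ShortString",
    0x60: "ByteString",
    0x70: "Integer",
}


def decode_tag(buf, pos):
    """Return (element count, tag name, header length in bytes) at buf[pos]."""
    key = buf[pos]
    tag = key & TAG_MASK
    if key & 0x80:
        # Long form: 12-bit count spread over two bytes.
        count = ((key & 0x0F) << 8) | buf[pos + 1]
        header_len = 2
    else:
        # Short form: 4-bit count in the same byte.
        count = key & 0x0F
        header_len = 1
    return count, TAG_NAMES[tag], header_len


print(decode_tag(bytes([0x63, 1, 2, 3]), 0))
print(decode_tag(bytes([0xB1, 0x02]), 0))
```

The first call describes three one-byte string IDs; the second uses the long form to describe a run of 258 numbers.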