diff --git a/eip-0032.md b/eip-0032.md new file mode 100644 index 00000000..75523935 --- /dev/null +++ b/eip-0032.md @@ -0,0 +1,400 @@ +Non-interactive encryption of TX-purpose messages +============================================ + +* Author: nitram147 +* Status: Proposed +* Created: 05-May-2022 +* Last edited: 05-May-2022 +* License: CC0 +* Forking: not needed + +Introduction & Motivation +-------------------------------------------- +Many modern centralized internet banking systems support attaching a message for transaction recipient, which can be used for example for indication of purpose of this transaction. However, to this day there does not exist such a standard for UTXO blockchains. + +Few ergo developers decided to tackle this problem and proposed [EIP-0029](https://github.com/ergoplatform/eips/blob/eip29-tx-purpose/eip-0029.md). However current version of EIP-0029 supports only option for sending plaintext messages. This would mean that anyone with access to the Ergo blockchain (anyone with access to the internet and even people without it) will be able to read exchanged messages. + +Most people would probably prefer to keep their exchanged messages secret and let only them and their recipient be able to read them in their plaintext form. First thing which comes to mind is to somehow encrypt these messages, but any encryption would require to use some form of encryption key. Users could just exchange password in some way and use it for the encryption of messages. However, this would be impractical, as password could be lost etc. In many cases, the sender even does not know the receiver, thus does not have a channel via which he could exchange the password for the messages encryption, while he still wants to send him an encrypted message in transaction together with funds. + +This EIP aims to provide solution for the above mentioned problems and proposes a standard for non-interactive encryption of TX-purpose messages. This solution can be easily extended to support even content types other than ordinary UTF-8 messages. + +The term non-interactive in this context means that the sender doesn't have to interact in any way with the receiver, the only thing he needs to know is receiver's public key which is already known, because it's embedded in the P2PK ergo address. + +Analysis of the problem +-------------------------------------------- +The core of this EIP stands on finding a way how to non-interactively generate some shared secret between transaction sender and receiver. Thankfully, "ordinary" Ergo's users addresses are in P2PK (Pay-to-Public-Key) format. We can take advantage of the commutative behaviour of [eliptic curve point multiplication](https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication) function and use the sender's private key together with receiver's public key (extracted from his P2PK address) to generate shared secret. In the same way, transaction's receiver could use his corresponding private key together with sender's public key to generate the same shared secret. External observer is not able to do so, because he only knows public keys of both parties and not their private keys. This method is known as [ECDH](https://en.wikipedia.org/wiki/Elliptic-curve_Diffie%E2%80%93Hellman). + +Sender and receiver might want to exchange more messages in the future, but with the use of ECDH, their shared secret would stay the same (for the combination of the same sender's address and same receiver's address). The shared secret also stays the same in both directions of message sending: Alice -> Bob shared secret is the same as Bob -> Alice shared secret. This would imply that if we use just this shared secret to encrypt messages, the same messages, e.g. message "groceries", would lead to the same ciphertext. This could be used by an attacker to link together transactions between Alice and Bob which used the same plaintext - he does not know the unencrypted content of messages, however, he knows that the messages are the same. + +In order to prevent this from happening, the encryption key should not depend solely on the shared secret. In this EIP, the use of additional transaction data is proposed. One possible solution would be to use ID of the first UTXO (unspent box which is transation trying to spend) to derive the encryption together with shared secret. + +This however wouldn't be enough to fix all of the privacy issues. The very same sender Alice can create multiple transaction output boxes for Bob containing the same message, so in order to stop the possible attacker from linking these boxes as boxes containing the same message we need to introduce also the output box index into the encryption key calculation algorithm. + +Once the algorithm for derivation of the encryption is designed, the encryption function has to be choosen. As the encryption function is the main part of the solution, the privacy of the whole solution depends heavily on the selection of good encryption function, which should not have any critical vulnerabilities. During the process of choosing an encryption function, the output overhead of the function should be also taken into account because the block space is expensive and we do not want to waste this precious space with unnecessary overhead. + +Since the transaction may contain multiple input boxes from various P2PK addresses, the receiving user should be also notified from which address the message for him comes from so he will be able to calculate correct encryption key (he needs to know the address of the message sender). + +Proposed solution implementation details +-------------------------------------------- +The result of [ECC point multiplication]((https://en.wikipedia.org/wiki/Elliptic_curve_point_multiplication)) of first party private key and second party public key results in a point on the eliptic curve. + +Let ``intermediate_shared_secret`` be the result of concatenation of two 32 bytes arrays where the first array corresponds to big-endian representation of the x-coordinate of the before mentioned point. The second array is big-endian representation of y-coordinate of the before mentioned point. + +Then let ``shared_secret = Blake2b256(intermediate_shared_key)``. + +Let ``first_utxo_id`` be the ID of the first UTXO which is being spent in the transaction (32 bytes long). + +Let ``hashed_output_index = Blake2b256(output_index)``, where ``output_index`` corresponds to 2 bytes long big-endian representation of the position of output box in the transaction outputs list, for which we are trying to encrypt/decrypt the message. + +Then ``encryption_key = Blake2b256(shared_secret XOR first_utxo_id XOR hashed_output_index)`` + +As the message encryption function, the ``ChaCha20`` symetric stream cipher function is choosen. Stream cipher is preffered over the block cipher because it produces only the same amount of bytes on its output as it gets on its input. Block cipher such as AES could be also used in the AES-CTR mode for this purpose. However, ``ChaCha20`` is faster and also uses the same sub-routines as the ``Blake2b`` function which is widely used in the Ergo ecosystem, thus some code could be reused if being implemented for restricted enviroment, such as microcontrolers in hardware wallets. + +The ``encryption_key`` is used as the key for the ``ChaCha20`` function. For the nonce, the 12 bytes long value is calculated as the 12 leftmost bytes of ``Blake2b256(encryption_key)``. The reason for nonce in ``ChaCha20`` cipher is to always provide some additional entropy while using the same encryption key so the cipher would return different ciphertexts for same plaintext while using the same encryption key. It means that nonce should be choosen always different. However, we don't need to care about this because we are using always different encryption key, which is guaranteed to be always different (assuming that the ``Blake2b256`` function is collision-resistant) because of uniquenes of the first input utxo's ID together with the box output index across all of the transactions output possible in the ergo blockchain ecosystem. + +The TX-purpose message is converted to UTF-8 bytes which are then encrypted via the ``ChaCha20`` function with above mentioned parameters. The resulting ciphertext's bytes are then prepended with 1 byte value corresponding to the index of sender's box being spent in the transaction inputs list. This is required for the receiver so he would be able to derive the correct encryption key (as mentioned in the motivation section). + +This data are then encoded by the same way such as "plaintext" option in [EIP-0029](https://github.com/ergoplatform/eips/blob/eip29-tx-purpose/eip-0029.md) while using type code 3. + +This algorithm specification assumes that the transaction containing one or more TX-purpose messages would not have any TX-purpose messages on outputs with index greater or equal to 2^16 (thus the 2 byte long value of ``output_index``), and all of the TX-purpose messages senders addresses are contained in the first 2^8 inputs of the transaction (thus the 1 byte long value for the index of TX-purpose message sender's box). + +Wallets implementation +-------------------------------------------- +In order to be able to decrypt incoming message, wallet needs to know private key corresponding to the receiving address. Most of the wallets store the private keys in some encrypted form, which means the user will have to enter the wallet password for the decryption of such message. In case that the wallet stores those incoming messages in their original encrypted form, the user would need to enter the wallet password each time he wants to see the transaction history containing TX-purpose messages, which could be quite user unfriendly for ordinary users (non-privacy focused ones). + +The second option would be for wallet to calculate ``shared_secret`` for each pair of sender-receiver addresses and store it unencrypted. This would allow the wallet to decrypt any past or future message between those two parties without requirement for a wallet password. However, it should be noted that if this ``shared_secret`` is leaked, the attacker can decrypt all the past and all of the future messages between affected parties (whose ``shared_secret`` was leaked). This EIP highly does not recommend such a solution!!! + +Another way would be to store decrypted messages or the encryption keys in an plaintext form. Because each encryption key is 32 bytes long, it would be better to store decrypted messages (assuming that most of the messages are shorter than 32 characters long). In case of wallet data leak, it would allow attacker to read only those leaked messages and not allow him to read any future messages. + +If user decides that he wants to share the content of some specific message with some third party, he could safely do so via sharing of encryption key corresponding to this specific message. The third party would be able to decrypt only this specific message and would not be able to decrypt any past or future messages between the user and the other party (with whom is user exchanging messages). This property results from the design of the encryption key derivation algorithm. + +Test vectors +-------------------------------------------- +To be done (TODO) + +Reference implementation +-------------------------------------------- +This EIP also provides a reference implementation for the above mentioned algorithm. +Run ``python3 eip_poc.py`` to see the help with example usage. +```python +# /* +----------------------------------+ */ +# /* | EIP-0032 | */ +# /* | eip_poc.py | */ +# /* | (c)copyright nitram147 2022 | */ +# /* +----------------------------------+ */ + +import base58 +import hashlib +from Crypto.Cipher import ChaCha20 +from bip_utils import Bip39SeedGenerator, Bip32Secp256k1 +from fastecdsa.curve import secp256k1 +from fastecdsa.point import Point +import binascii +import sys + +mainnet_prefix = 0x00 +testnet_prefix = 0x10 +p2pk_address_type = 0x01 + +checksum_length = 4 + +eip29_preamble = bytes.fromhex("3c0e400e0350525006") + + +def blake2b256(data): + blake2b256 = hashlib.blake2b(digest_size=32) + blake2b256.update(data) + return bytes(blake2b256.digest()) + +def ergo_p2pk_address_to_pubkey(address): + # for more info about ergo addresses content have a look at: + # https://ergoplatform.org/en/blog/2019_07_24_ergo_address/ + decoded = bytes(base58.b58decode(address)) + valid_prefixes = [] + valid_prefixes.append(testnet_prefix + p2pk_address_type) + valid_prefixes.append(mainnet_prefix + p2pk_address_type) + + if decoded[0] not in valid_prefixes: + raise ValueError("Wrong address!!!") + + prefix_and_content_bytes = decoded[0:-checksum_length] + checksum = blake2b256(prefix_and_content_bytes)[0:checksum_length] + + if decoded[-checksum_length:] != checksum: + raise ValueError("Ergo address has invalid checksum!!!") + + compressed_pubkey = decoded[1:-checksum_length] + + if compressed_pubkey[0] not in [0x02, 0x03]: + raise ValueError("Ergo address contains invalid public key!!!") + + return compressed_pubkey + +def ergo_pubkey_to_p2pk_address(pubkey, mainnet): + + if len(pubkey) != 33: + raise ValueError("Bad public key length!!!") + + if pubkey[0] not in [0x02, 0x03]: + raise ValueError("Bad public key!!!") + + prefix_byte = mainnet_prefix if mainnet else testnet_prefix + prefix_byte += p2pk_address_type + prefix_byte = bytes([prefix_byte]) + + prefix_and_content_bytes = prefix_byte + pubkey + + checksum = bytes(blake2b256(prefix_and_content_bytes)[0:checksum_length]) + + return base58.b58encode(prefix_and_content_bytes + checksum).decode("utf-8") + +def xor_two_32bytes_arrays(first, second): + if len(first) != 32 or len(second) != 32: + raise ValueError("Both arrays should be 32bytes long!!!") + return bytes(a ^ b for (a, b) in zip(first, second)) + +def symetric_encrypt_message(message, key): + if len(key) != 32: + raise ValueError("Bad key length!!!") + nonce = blake2b256(key)[0:12] + cipher = ChaCha20.new(key=key, nonce=nonce) + return cipher.encrypt(message) + +def symetric_decrypt_message(message, key): + if len(key) != 32: + raise ValueError("Bad key length!!!") + nonce = blake2b256(key)[0:12] + cipher = ChaCha20.new(key=key, nonce=nonce) + return cipher.decrypt(message) + +def get_ergo_derivation_path(account, index): + if ( + not isinstance(account, int) or not isinstance(index, int) or + account < 0 or index < 0 + ): + raise ValueError("Account and Index should be integer greater or equal to zero!!!") + return "m/44'/429'/" + str(account) + "'/0/" + str(index) + +def generate_ergo_keys_from_mnemonic(mnemonic, account, index): + seed_bytes = Bip39SeedGenerator(mnemonic).Generate() + bip32_ctx = Bip32Secp256k1.FromSeedAndPath(seed_bytes, get_ergo_derivation_path(account, index)) + return [bytes(bip32_ctx.PrivateKey().Raw()), bytes(bip32_ctx.PublicKey().RawCompressed())] + +def uncompress_secp256k1_public_key(compressed_pubkey): + + if len(compressed_pubkey) != 33 or compressed_pubkey[0] not in [0x02, 0x03]: + raise ValueError("Bad public key!!!") + + p = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F + + x = int.from_bytes(compressed_pubkey[1:33], byteorder='big') + + y_sq = (pow(x, 3, p) + 7) % p + y = pow(y_sq, (p + 1) // 4, p) + if y % 2 != compressed_pubkey[0] % 2: y = p - y + y = y.to_bytes(32, byteorder='big') + + return [compressed_pubkey[1:33], y] + +def get_shared_secret(private_key, public_key): + + if len(private_key) != 32: + raise ValueError("Invalid private key!!!") + + if len(public_key) != 33 or public_key[0] not in [0x02, 0x03]: + raise ValueError("Invalid public key!!!") + + uncompressed_public_key = uncompress_secp256k1_public_key(public_key) + + upbk_x = int.from_bytes(uncompressed_public_key[0], byteorder="big") + upbk_y = int.from_bytes(uncompressed_public_key[1], byteorder="big") + public_key_point = Point(upbk_x, upbk_y, curve=secp256k1) + + #operation bellow represents ECC point multiplication + new_point = int.from_bytes(private_key, byteorder="big") * public_key_point + + shared_secret = new_point.x.to_bytes(32, byteorder="big") + new_point.y.to_bytes(32, byteorder="big") + return blake2b256(shared_secret) + +def load_utxo_id_from_hex_to_bytes(utxo_id): + if len(utxo_id) != 64: + raise ValueError("Bad UTXO id!!!") + return bytes.fromhex(utxo_id) + +def get_message_encryption_key(shared_secret, first_utxo_id, output_index): + + if len(shared_secret) != 32 or not isinstance(shared_secret, bytes): + raise ValueError("Bad shared secret value!!!") + + if len(shared_secret) != 32 or not isinstance(shared_secret, bytes): + raise ValueError("Bad first UTXO id value!!!") + + if not isinstance(output_index, int) or output_index < 0 or output_index > 65535: + raise ValueError("Bad output index value!!!") + + xored = xor_two_32bytes_arrays(shared_secret, first_utxo_id) + xored = xor_two_32bytes_arrays(xored, blake2b256(output_index.to_bytes(2, byteorder="big"))) + return blake2b256(xored) + +def ergo_vlq_encode(value): + if ( + not isinstance(value, int) or + value < 0 or value > 65535 + ): + raise ValueError("Value should be int in range from 0 to 65535!!!") + + result = bytearray() + + while True: + if not (value & (~(0x007F))): + result.append(value & 0xff) + break + else: + result.append((value & 0x7F) | 0x80) + value >>= 7 + + return bytes(result) + +def ergo_vlq_decode(value): + if not isinstance(value, bytes): + raise ValueError("Value should be in bytes format!!!") + + result = 0 + position = 0 + all_ok = False + + for tmp_byte in value: + result |= ((tmp_byte & (0x7F)) << (7*position)) + position += 1 + if tmp_byte < 128: + all_ok = True + break + + if not all_ok: + raise ValueError("Not enough bytes provided!!!") + + return result, position + +def encode_encrypted_message_into_eip29_format(message, input_index): + if not isinstance(input_index, int) or input_index < 0 or input_index > 255: + raise ValueError("Bad input index value!!!") + if not isinstance(message, bytes) or len(message) < 1: + raise ValueError("Bad format of message!") + return ( + eip29_preamble + + ergo_vlq_encode(len(message) + 1) + + input_index.to_bytes(1, byteorder="big") + + message + ) + +def decode_encrypted_message_from_eip29_format(message): + if not isinstance(message, bytes) or len(message) < 1: + raise ValueError("Bad format of message!!!") + if ( + message[0:len(eip29_preamble)] != eip29_preamble or + len(message) < (len(eip29_preamble) + 3) + ): + raise ValueError("Message is not in eip29 format!!!") + + m_len, pos = ergo_vlq_decode(message[len(eip29_preamble):]) + + if len(message[len(eip29_preamble)+pos:]) != m_len: + raise ValueError("Message length in bytes is wrong!!!") + + return [message[len(eip29_preamble)+pos], message[(len(eip29_preamble)+pos+1):]] + +def print_help(): + print("Usage:") + print("\tSend message:") + print( + "\tpython3 " + sys.argv[0] + + " -s sender_mnemonic sender_wallet_idx receiver_address input_idx output_idx first_utxo_id message" + ) + print("\n") + print("\tReceive message:") + print( + "\tpython3 " + sys.argv[0] + + " -r receiver_mnemonic receiver_wallet_idx sender_address output_idx first_utxo_id r9_value" + ) + print("\nExample usecase - sending from Alice wallet address 0 to Bob address 1:") + print("Message to recipient: \"groceries\"") + print("Alice's UTXO will be 3rd input of transaction, Bob's UTXO will be 4th output of transaction") + print("ID of first UTXO input of transaction is \"8040ad2f6284bb27894243c20817fc048fa9e8051fce7b4bed516fa96c3f83aa\"") + print("Alice mnemonic is: \"globe merit misery culture hold same tomorrow water wife uncle glad fall zebra artist steak\"") + print("\n") + print( + "python3 " + + sys.argv[0] + + " -s \"globe merit misery culture hold same tomorrow water wife uncle glad fall zebra artist steak\" " + + "\"0\" \"3Wz13TjNEa6RBedov8VZGGMA6M87EFTDNof3H6jbacQecU7g7kgS\" \"3\" \"4\" " + + "\"8040ad2f6284bb27894243c20817fc048fa9e8051fce7b4bed516fa96c3f83aa\" \"groceries\"" + ) + print("\nExample output is going to be: \"3c0e400e03505250060a03a02d035a40ad4debb3\"") + print("\nExample usecase - receiving message from previous example:") + print("Bob mnemonic is: \"since pave praise monkey bleak reflect oil inquiry month enact dragon calm average weather aspect\"") + print("\n") + print( + "python3 " + + sys.argv[0] + + " -r \"since pave praise monkey bleak reflect oil inquiry month enact dragon calm average weather aspect\" " + + "\"1\" \"3WyPcnBn1Lx8MM73fQr2UJJQ5SJ2AAYJ1r9YGzqcQmUg1mHE1Q5m\" \"4\" " + + "\"8040ad2f6284bb27894243c20817fc048fa9e8051fce7b4bed516fa96c3f83aa\" \"3c0e400e03505250060a03a02d035a40ad4debb3\"" + ) + print("\nExample output is going to be: \"groceries\"") + +if len(sys.argv) == 2 and sys.argv[1] in ["-h", "--help"]: + print_help() + sys.exit(0) + +elif len(sys.argv) not in [8, 9]: + print("Bad arguments count!", file=sys.stderr) + print_help() + sys.exit(1) + +elif sys.argv[1] not in ["-s", "-r"]: + print("Bad first argument values!", file=sys.stderr) + print_help() + sys.exit(2) + +elif ( + sys.argv[1] == "-s" and len(sys.argv) != 9 or + sys.argv[1] == "-r" and len(sys.argv) != 8 +): + print("Bad first argument values!", file=sys.stderr) + print_help() + sys.exit(3) + +mode_sending = True if sys.argv[1] == "-s" else False + +wallet_mnemonic = sys.argv[2] +wallet_idx = int(sys.argv[3]) +other_party_address = sys.argv[4] +input_idx = int(sys.argv[5]) if mode_sending else 0 +output_idx = int(sys.argv[6]) if mode_sending else int(sys.argv[5]) +first_utxo_id_string = sys.argv[7] if mode_sending else sys.argv[6] + +message = sys.argv[8] if mode_sending else "" +r9_value = "" if mode_sending else sys.argv[7] + +wallet_private_key, _ = generate_ergo_keys_from_mnemonic(wallet_mnemonic, 0, wallet_idx) +other_party_public_key = ergo_p2pk_address_to_pubkey(other_party_address) + +shared_key = get_shared_secret(wallet_private_key, other_party_public_key) +first_utxo_id = load_utxo_id_from_hex_to_bytes(first_utxo_id_string) + +encryption_key = get_message_encryption_key(shared_key, first_utxo_id, output_idx) + +if mode_sending: + encrypted_message = symetric_encrypt_message(message.encode("utf-8"), encryption_key) + r9_value = encode_encrypted_message_into_eip29_format(encrypted_message, input_idx) + print(binascii.hexlify(r9_value).decode()) + sys.exit(0) + +else: + _, encrypted_message = decode_encrypted_message_from_eip29_format(bytes.fromhex(r9_value)) + message = symetric_decrypt_message(encrypted_message, encryption_key).decode("utf-8") + print(message) + sys.exit(0) + +sys.exit(0) +```