Skip to content
This repository was archived by the owner on Jan 27, 2022. It is now read-only.

Conversation

manju956
Copy link

This feature initiates refresh of enclave encryption key pair based on timer value set in the config file. A new pair of encryption key is generated in the enclave and the updated enclave signup details are stored in the KvStorage in workers table.
Additioanlly, encryption key signature is also generated during enclave initialization and its made part of PublicEnclaveData. This encryption key signature is re-computed when encryption key gets refreshed.

When a worker key gets refreshed during the work order submission, a specific error code is returned to client to indicate worker key refresh happened at the enclave. On receiving this error code, client retrieves the updated worker details and does work order submission again.

Copy link
Contributor

@danintel danintel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved after you make corrections I suggested.

@manju956 manju956 force-pushed the key_refresh branch 2 times, most recently from 672f9e3 to 292c2c1 Compare March 27, 2020 06:38
PROCESSING = 7
BUSY = 8
UNKNOWN_ERROR = 9
WORKER_KEY_REFRESHED = 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WORKER_KEY_REFRESHED -> WORKER_ENCRYPTION_KEY_REFRESHED

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

enclave_manager.enclave_id).encode("UTF-8")
# Calculate sha256 of worker id to get 32 bytes. The TC spec proxy
# model contracts expect byte32. Then take a hexdigest for hex str.
worker_id = hashlib.sha256(worker_id).hexdigest()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a function generate_worker_id() and call it at multiple places instead of duplicating this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

enclave_manager.process_work_orders(kv_helper)
logger.info("Enclave manager sleeping for %d secs", sleep_interval)
time.sleep(sleep_interval)
key_refresh_time = key_refresh_time + sleep_interval
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decouple key refresh from work order execution. You could use asyncio coroutine to create key refresh as a separate task. That's will also avoid using sleep interval which is used for work order polling.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will explore asyncio and use it to do key refresh

logger.error("\n Unable to get worker \n")
sys.exit(-1)

_start_client(config, options, jrpc_req_id, worker_id, worker_registry)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason why prepare worker is done outside start_client ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved worker prepare code to start_client function.

// to do this; for now, the constant will be good enough.
static const size_t cMaxSealedDataSize = 8192;
static const size_t cMaxPublicDataSize = 4096;
//static const size_t cMaxPublicDataSize = 4096;
Copy link
Contributor

@manojgop manojgop Mar 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove commented line if it's not needed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

//static const size_t cMaxPublicDataSize = 4096;

// size of RSA key signature is 2048
static const size_t cMaxPublicDataSize = 6144;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did we arrive at 6144 ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you need a larger buffer because of base64, the formula is
(bytes * 4 / 3) + 2 +1
The +2 is because of the 0 to 2 "=" padding bytes. This can be computed exactly if needed.
The +1 is if there is a NUL terminator byte
So I come up with 2048*4/3 +2 + 1 = 2733

Copy link
Author

@manju956 manju956 Mar 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added 2048 bytes(to account encryptionKeySignature) to cMaxPublicDataSize(before 4096 bytes) which sum up to 6144.
the base64 encoded version of the signature was less than 2048, hence i chose to add 2048 to the cMaxPublicDataSize buffer size.
@dan yes we can allot max buffer size equals to 2733 bytes as you suggested. thanks for the suggestion.

[out] size_t* outSealedEnclaveDataSize,
[out] sgx_report_t* outEnclaveReport
);
public tcf_err_t ecall_RefreshEnclaveKey(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ecall_RefreshEnclaveKey -> ecall_RefreshWorkerEncryptionKey

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Log(TCF_LOG_ERROR, "error::DecryptMessage - %d - %s", \
e.error_code(), e.what());
tcf::error::ThrowIf<tcf::error::KeyRefreshError>(
true == true, "Failed to decrypt encrypted session key. " \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use throw tcf::error::KeyRefreshError(msg) instead of tcf::error::ThrowIf(true==true)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

} // tcf::enclave_api::base::CreateSignupData

// XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
tcf_err_t tcf::enclave_api::enclave_data::RefreshEnclaveKey(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefreshEnclaveKey -> RefreshWorkerEncryptionKey

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

result = TCF_ERR_UNKNOWN;
} catch (...) {
tcf::enclave_api::base::SetLastError("Unexpected exception in (RefreshEnclaveKey)");
result = TCF_ERR_UNKNOWN;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RefreshEnclaveKey -> RefreshWorkerEncryptionKey

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@manju956 manju956 closed this Apr 6, 2020
@manju956 manju956 reopened this Apr 6, 2020
@manju956 manju956 force-pushed the key_refresh branch 2 times, most recently from f34bf46 to efa7b39 Compare April 6, 2020 10:28
"Setting sleep interval to 10 seconds: %s", str(err))
sleep_interval = 10
while True:
await lock.acquire()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a comment why we need a lock and why we need to synchronize between key refresh and work order processing if its required.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add comment to explain the need to synchronization using lock

res["error"]["code"] == \
WorkOrderStatus.WORKER_ENCRYPT_KEY_REFRESHED:
logger.error("Worker Key refreshed. Retrieving latest Worker details")
_start_client(config, options, jrpc_req_id, worker_id, worker_registry)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So _start_client() is being called recursively here ? If so, please modify the logs. The logs is only mentioning about retrieving latest worker details. _start_client() also resubmits the work order.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not called recursively. Only if work order execution fails due to worker key refresh during the submission of work order, the error scenario is handle by retrieving latest worker details(after key refresh) and submit the work order again.

@manju956 manju956 force-pushed the key_refresh branch 3 times, most recently from e6be4ea to 4e27e62 Compare April 13, 2020 12:46
enclave manager service
"""
try:
sleep_interval = int(config["EnclaveManager"]["sleep_interval"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to integrate the change for synchronous processing via ZMQ instead of polling based on the config

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will rebase my PR.

This feature initiates refresh of worker encryption key pair based on
number of work orders processed in case of Singleton worker or number of
pre-processed work orders in case of KME worker.
A new pair of encryption key is generated in the enclave and the updated
enclave signup details are stored in the KvStorage in workers table.

Worker encryption key signature is re-computed when encryption key gets refreshed.

When a worker key gets refreshed during the work order submission,
a specific error code is returned to client to indicate worker key refresh.
On receiving this error code, client retrieves the updated worker details and
does work order submission again.

Signed-off-by: manju956 <[email protected]>
@lgtm-com
Copy link

lgtm-com bot commented Sep 1, 2020

This pull request introduces 5 alerts when merging 1df89b3 into ed424e7 - view on LGTM.com

new alerts:

  • 4 for Unused import
  • 1 for Syntax error

@manju956
Copy link
Author

manju956 commented Nov 9, 2020

A new updated PR is raised for worker key refresh policy implementation #671. Hence closing this PR

@manju956 manju956 closed this Nov 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants