Merge pull request #25 from ckamps/main
support 0 pHsmsPerSubnet
ckamps authored May 29, 2023
2 parents 9544387 + 92bd55f commit 2f284df
Showing 6 changed files with 153 additions and 34 deletions.
9 changes: 7 additions & 2 deletions CHANGELOG.md
@@ -1,6 +1,11 @@
# CHANGELOG

## May 2023
## May 2023 - v2.1

* Enable a stack update to delete all HSMs yet retain the cluster. Set the `pHsmsPerSubnet` parameter to `0` during a stack update to delete all HSMs. This technique can reduce the cost of operating non-production clusters.
* When deleting a stack, avoid waiting for the status of HSMs when no HSMs exist.

## May 2023 - v2.0

### Overall enhancements

@@ -32,6 +37,6 @@ Error handling and reporting has been improved across all stack actions.
* Updated all Lambda functions to use Python 3.10.
* Removed `vpc.yaml`.

## October 2021
## October 2021 - v1.0

Original version.
26 changes: 24 additions & 2 deletions README.md
@@ -226,8 +226,8 @@ aws acm-pca get-certificate --certificate-authority-arn <CA arn> --certificate-a
|---------|--------|-----------|-------|---------------------------|
|`pSystem`|Optional|Used as a prefix in the names of many of the newly created cloud resources. You normally do not need to override the default value.|`cloudhsm`|No|
|`pEnvPurpose`|Optional|Identifies the purpose for this particular instance of the stack. Used as part of the prefix in the names of many of the newly created resources. Enables you to create and more easily distinguish resources of multiple stacks in the same AWS account. For example, `1`, `2`, `test1`, `test2`, etc.|`1`|No|
|`pSubnets`|Required|List of subnets to associate with the CloudHSM cluster. Subnets must exists within the same VPC. Only one subnet per availability zone (AZ) may be specified.<br><br>If you're using an AWS region that proivides access to more than 3 AZs, ensure that you supply subnet IDs associated with AZs in which CloudHSM is available. See [Determining regions and AZs in which CloudHSM is available](#determining-regions-and-azs-in-which-cloudhsm-is-available) for details.<br><br>Since CloudHSM does not support making changes to the to the list of subnets after a cluster is created, this template does not support making changes to the list of subnets during stack updates.|None|No|
|`pHsmsPerSubnet`|Optional|Number of HSMs to create per subnet.<br><br>If you intend to connect a KMS custom key store to the cluster, a minimum of 2 HSMs is required. In the simplest case, to support a KMS custom key store, you could specify one subnet in `pSubnets` and `2` for `pHsmsPerSubnet`.<br><br>See the [AWS CloudHSM Quotas](https://docs.aws.amazon.com/cloudhsm/latest/userguide/limits.html) for the maximum total HSMs you can create per cluster.<br><br>You can modify this parameter during a stack update to either grow or shrink the number of HSMs.|`1`|Yes|
|`pSubnets`|Required|List of subnets to associate with the CloudHSM cluster. Subnets must exist within the same VPC. Only one subnet per availability zone (AZ) may be specified.<br><br>If you're using an AWS region that provides access to more than 3 AZs, ensure that you supply subnet IDs associated with AZs in which CloudHSM is available. See [Determining regions and AZs in which CloudHSM is available](#determining-regions-and-azs-in-which-cloudhsm-is-available) for details.<br><br>Since CloudHSM does not support making changes to the list of subnets after a cluster is created, this template does not support making changes to the list of subnets during stack updates.|None|No|
|`pHsmsPerSubnet`|Optional|Number of HSMs to create per subnet.<br><br>If you intend to connect a KMS custom key store to the cluster, a minimum of 2 HSMs is required. In the simplest case, to support a KMS custom key store, you could specify one subnet in `pSubnets` and `2` for `pHsmsPerSubnet`.<br><br>See the [AWS CloudHSM Quotas](https://docs.aws.amazon.com/cloudhsm/latest/userguide/limits.html) for the maximum total HSMs you can create per cluster.<br><br>You can modify this parameter during a stack update to either grow or shrink the number of HSMs.<br><br>During a stack update, you can specify `0` to delete all HSMs in the cluster without deleting the cluster. See [Saving costs by deleting HSMs](#saving-costs-by-deleting-hsms).|`1`|Yes|
|`pHsmType`|Optional|The type of HSM to use in the cluster. Currently the only supported value is `hsm1.medium`|`hsm1.medium`|No|
|`pUseExternalPkiProcess`|Optional|Select `true` if you want to use your own PKI process to issue a cluster certificate based on the CSR obtained from the cluster creation process. By default, the stack uses AWS Private CA to create a root CA and issue the cluster certificate. See [Determine whether or not you want to use your own PKI process for issuing the cluster certificate](#5-determine-whether-or-not-you-want-to-use-your-own-pki-process-for-issuing-the-cluster-certificate) to help you understand if this option applies to you.|`false`|Yes|
|`pExternallyProvidedCertsReady`|Optional|Select `true` only if you selected `true` for `pUseExternalPkiProcess` and you've made the necessary certificates available per the process outlined in [Using your own PKI process](#using-your-own-pki-process). This parameter is only processed during stack update operations.<br><br>After you've successfully completed an update based on using your own PKI process, you may leave this parameter set to `true` for subsequent stack updates. The cluster and CA certificates will be processed only once.|`false`|Yes|
@@ -461,6 +461,28 @@ If you're creating a new cluster from a backup with which a KMS custom key store

Since a CloudHSM backup retains the state of a cluster, creating a new cluster using a backup does not require initialization and activation of the newly created cluster. As part of the stack creation process and after the cluster is restored from the specified backup, the number of HSMs per subnet that you specified when creating the new stack will be created.

### Saving costs by deleting HSMs

If you have a non-production cluster that doesn't need to be available at all times, you can perform a stack update to delete all of the HSMs without deleting the cluster. Later, when you need to use the cluster, you can perform another stack update to recreate the number of HSMs you need. By retaining the original cluster, you avoid creating a new cluster, including issuing a new cluster certificate.

#### Delete all HSMs without deleting a cluster

To delete all HSMs without deleting a cluster, set the `pHsmsPerSubnet` parameter to `0` during a stack update.

Since CloudHSM makes a backup of the cluster when an HSM is removed, you'll have an up-to-date cluster backup that will be used when you create new HSMs. See [Managing AWS CloudHSM backups](https://docs.aws.amazon.com/cloudhsm/latest/userguide/manage-backups.html).
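
As a rough illustration of such an update, the following Python sketch builds the `Parameters` list for a CloudFormation `update_stack` call that overrides only `pHsmsPerSubnet` and keeps every other parameter at its previous value. The helper name `build_update_parameters` and the stack name in the commented call are hypothetical; the parameter keys come from the parameters table in this README.

```python
def build_update_parameters(parameter_keys, hsms_per_subnet):
    """Return a CloudFormation Parameters list that overrides only pHsmsPerSubnet.

    parameter_keys: names of the stack's existing parameters (hypothetical input).
    hsms_per_subnet: desired HSM count per subnet; 0 deletes all HSMs.
    """
    if hsms_per_subnet < 0:
        raise ValueError("hsms_per_subnet must be 0 or greater")

    params = []
    found = False
    for key in parameter_keys:
        if key == "pHsmsPerSubnet":
            # CloudFormation expects parameter values as strings.
            params.append({"ParameterKey": key, "ParameterValue": str(hsms_per_subnet)})
            found = True
        else:
            # Leave every other parameter unchanged.
            params.append({"ParameterKey": key, "UsePreviousValue": True})

    if not found:
        params.append({"ParameterKey": "pHsmsPerSubnet", "ParameterValue": str(hsms_per_subnet)})
    return params


if __name__ == "__main__":
    keys = ["pSystem", "pEnvPurpose", "pSubnets", "pHsmsPerSubnet"]
    print(build_update_parameters(keys, 0))

    # Assumed usage with boto3 (not executed here; the stack name is made up):
    #
    #   import boto3
    #   cfn = boto3.client("cloudformation")
    #   cfn.update_stack(
    #       StackName="cloudhsm-1",
    #       UsePreviousTemplate=True,
    #       Parameters=build_update_parameters(keys, 0),
    #       Capabilities=["CAPABILITY_NAMED_IAM"],  # likely needed: the template names its IAM roles
    #   )
```

Calling the same helper later with a value greater than `0` produces the parameters for recreating HSMs.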

#### Create HSMs in an empty cluster

When you need to use the cluster again, perform a stack update and set the `pHsmsPerSubnet` parameter to the desired number of HSMs.

When new HSMs are added to a CloudHSM cluster that doesn't have any HSMs, CloudHSM will use the most recent cluster backup as the basis for the newly created HSMs.

#### Ensure application client configurations are updated

Note that when you create HSMs, the IP addresses of the HSMs may be different from the original set of HSMs. Consequently, you'll need to ensure that application client configurations are updated to reference at least one of the newly created HSMs. See [Connect the client SDK to the AWS CloudHSM cluster](https://docs.aws.amazon.com/cloudhsm/latest/userguide/cluster-connect.html#connect-how-to) for information on configuring application clients to work with your cluster.

The CloudFormation template automatically updates the EC2 client's configuration when HSMs are created and deleted.
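
To see which addresses other application clients should reference after HSMs are recreated, a small helper can pull the ENI IP addresses out of a CloudHSM `describe_clusters` response. This is a minimal sketch: `hsm_ip_addresses` is a hypothetical helper, and the sample response is made-up data that mirrors only the `Clusters[].Hsms[].EniIp` fields.

```python
def hsm_ip_addresses(describe_clusters_response):
    """Return the ENI IP addresses of the HSMs in the first cluster of the response.

    Returns an empty list when the cluster has no HSMs, for example after
    pHsmsPerSubnet was set to 0.
    """
    clusters = describe_clusters_response.get("Clusters", [])
    if not clusters:
        return []
    hsms = clusters[0].get("Hsms", [])
    # HSMs that are still being created may not have an IP address yet.
    return [hsm["EniIp"] for hsm in hsms if hsm.get("EniIp")]


if __name__ == "__main__":
    # Made-up response fragment for illustration only.
    sample_response = {
        "Clusters": [
            {
                "ClusterId": "cluster-example",
                "Hsms": [
                    {"HsmId": "hsm-a", "EniIp": "10.0.1.10"},
                    {"HsmId": "hsm-b", "EniIp": "10.0.2.10"},
                ],
            }
        ]
    }
    print(hsm_ip_addresses(sample_response))

    # Assumed usage against a real cluster (not executed here):
    #
    #   import boto3
    #   cloudhsm = boto3.client("cloudhsmv2")
    #   response = cloudhsm.describe_clusters(Filters={"clusterIds": [cluster_id]})
    #   print(hsm_ip_addresses(response))
```

An empty result indicates that the cluster currently has no HSMs, so there is nothing for clients to connect to until HSMs are created again.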

## Deleting the stack

Deletion of the stack generally reverses the process described earlier. When the CloudFormation custom resource is called with the `delete` action, a CloudHSM cluster delete state machine is executed to delete the HSMs and the cluster.
3 changes: 2 additions & 1 deletion TESTING.md
@@ -33,6 +33,7 @@ Ensure that you test the template in multiple AWS Regions including `us-east-1`
* Change number of HSMs per subnet to 3
* Vary the number of subnets
* Use AMI ID instead of default AWS Systems Manager parameter store parameter
* Validate that the `ClusterId` attribute is set properly.

#### Select CloudHSM client or CLI package

@@ -69,7 +70,6 @@ Attempt to create a stack...
* Specify multiple subnets/AZs, but with only one that is not supported.
* Test both automated rollbacks and preserve resources upon creation failure.


In all failure scenarios, assess the extent to which resources are rolled back and a stack deletion causes deletion of resources.

### Post Stack Creation
@@ -116,6 +116,7 @@ Confirm that all resources have been deleted except for:
* Decrease the number of HSMs per subnet.
* Stop the EC2 client (do not terminate) prior to changing the number of HSMs per subnet.
* Set the number of HSMs per subnet to 0.
* After successfully deleting all HSMs, apply another update with HSMs per subnet > 0.

### Updating the EC2 client subnet

123 changes: 107 additions & 16 deletions cloudhsm.yml
@@ -82,7 +82,7 @@ Parameters:
Description: Number of HSMs per subnet to create in the CloudHSM cluster
Type: Number
Default: 1
MinValue: 1
MinValue: 0

pHsmType:
Description: Type of HSM to use in the cluster. Currently the only supported value is "hsm1.medium"
@@ -861,8 +861,25 @@ Resources:
Choices:
- Variable: $.ClusterId
StringMatches: cluster-*
Next: GetHsmsState
Next: GetHsmNum
Default: SendCfnSuccess
GetHsmNum:
Type: Task
Resource: !Sub 'arn:aws:lambda:${AWS::Region}:${AWS::AccountId}:function:${rLambdaGetHsmNum}'
ResultPath: $.HsmNum
Next: HsmsExist?
Catch:
- ErrorEquals:
- States.ALL
ResultPath: $.Error
Next: SendCfnFailed
HsmsExist?:
Type: Choice
Choices:
- Variable: $.HsmNum
NumericEquals: 0
Next: DeleteSecrets
Default: GetHsmsState
# When stack creation fails, creation of HSMs could still be in progress.
# Wait for HSMs that are still being created to transition to the active
# state prior to deleting the HSMs.
@@ -997,6 +1014,7 @@ Resources:
- lambda:InvokeFunction
Resource:
- !GetAtt rLambdaGetHsmsState.Arn
- !GetAtt rLambdaGetHsmNum.Arn
- !GetAtt rLambdaDeleteHsms.Arn
- !GetAtt rLambdaGetHsmState.Arn
- !GetAtt rLambdaDeleteSecrets.Arn
@@ -1338,6 +1356,10 @@ Resources:
def lambda_handler(event, context):
logger.info(json.dumps(event, indent=2, default=str))
hsms_per_subnet_desired = int(event['ResourceProperties']['HsmsPerSubnet'])
if hsms_per_subnet_desired == 0:
raise Exception('pHsmsPerSubnet must be greater than 0 when creating a cluster')
subnets = event['ResourceProperties']['Subnets']
backup_id = event['ResourceProperties']['BackupId']
backup_retention_days = event['ResourceProperties']['BackupRetentionDays']
@@ -1755,6 +1777,68 @@ Resources:
'aws:PrincipalAccount': !Sub '${AWS::AccountId}'
'aws:RequestedRegion': !Sub '${AWS::Region}'

rLambdaGetHsmNum:
Type: AWS::Lambda::Function
Properties:
FunctionName: !Sub '${pSystem}-${pEnvPurpose}-get-hsm-num'
Handler: index.lambda_handler
Runtime: python3.10
Timeout: 20
Role: !GetAtt rLambdaGetHsmNumRole.Arn
Code:
ZipFile: |
import boto3
import json
import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)
cloudhsm = boto3.client('cloudhsmv2')
def lambda_handler(event, context):
logger.info(json.dumps(event, indent=2, default=str))
cluster_id = event['ClusterId']
hsms = cloudhsm.describe_clusters(Filters={'clusterIds': [cluster_id]})['Clusters'][0]['Hsms']
return (len(hsms))
rLambdaGetHsmNumRole:
Type: AWS::IAM::Role
Properties:
RoleName: !Sub '${pSystem}-${pEnvPurpose}-${AWS::Region}-svc-lambda-get-hsm-num'
AssumeRolePolicyDocument:
Version: 2012-10-17
Statement:
- Effect: Allow
Principal:
Service:
- lambda.amazonaws.com
Action:
- sts:AssumeRole
Path: /
Policies:
- PolicyName: CloudHSMLambdaPolicy
PolicyDocument:
Version: 2012-10-17
Statement:
- Effect: Allow
Action:
- logs:CreateLogGroup
- logs:CreateLogStream
- logs:PutLogEvents
Resource:
- !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${pSystem}-${pEnvPurpose}-*'
- Effect: Allow
Action:
- cloudhsm:DescribeClusters
Resource: '*'
Condition:
'ForAllValues:StringEquals':
'aws:PrincipalAccount': !Sub '${AWS::AccountId}'
'aws:RequestedRegion': !Sub '${AWS::Region}'

rLambdaUpdateClusterSecGroup:
Type: AWS::Lambda::Function
Properties:
@@ -2735,7 +2819,7 @@ Resources:
else:
physical_resource_id = event['PhysicalResourceId']
cfnresponse.send(event, context, cfnresponse.SUCCESS, {'cluster_id': physical_resource_id}, physical_resource_id)
cfnresponse.send(event, context, cfnresponse.SUCCESS, {'ClusterId': physical_resource_id}, physical_resource_id)
return
@@ -3015,6 +3099,19 @@ Resources:
SYSTEM_ID=${pSystem}
REGION=${AWS::Region}
# Determine if at least 1 HSM exists. e.g. All HSMs may have been temporarily deleted from the cluster to reduce costs.
HSM_IP_ADDR=$(/usr/local/bin/aws cloudhsmv2 describe-clusters --filters clusterIds=$CLUSTER_ID --query "Clusters[0].Hsms[0].EniIp" --output text --region $REGION)
if [ $? -eq 0 ] ; then
# If zero HSMs exist, then an IP address won't be returned
if [ ${!HSM_IP_ADDR} = "None" ] ; then
/usr/bin/echo "No HSMs exist. No need to update client configuration."
exit 0
fi
else
/usr/bin/echo "Failed to get HSM IP address. Cannot configure CloudHSM client: ${!HSM_IP_ADDR}"
exit 1
fi
CA_CERT_LOOKUP=$(/usr/local/bin/aws secretsmanager get-secret-value --secret-id "/${!SYSTEM_ID}/${!CLUSTER_ID}/customer-ca-cert" --region $REGION 2>&1)
if [ $? -eq 0 ] ; then
CA_CERT=$(/usr/bin/echo $CA_CERT_LOOKUP | jq -r '.SecretString')
@@ -3039,18 +3136,12 @@ Resources:
/usr/bin/systemctl enable cloudhsm-client.service > /dev/null 2>&1
/usr/bin/systemctl start cloudhsm-client.service > /dev/null 2>&1
/usr/bin/systemctl stop cloudhsm-client.service > /dev/null 2>&1
HSM_IP_ADDR=$(/usr/local/bin/aws cloudhsmv2 describe-clusters --filters clusterIds=$CLUSTER_ID --query "Clusters[0].Hsms[0].EniIp" --output text --region $REGION)
if [ $? -eq 0 ] ; then
/opt/cloudhsm/bin/configure --cmu $HSM_IP_ADDR
if [ $? -ne 0 ] ; then
/usr/bin/echo "Failed to configure CloudHSM client"
exit 1
fi
/usr/bin/systemctl start cloudhsm-client.service > /dev/null 2>&1
else
/usr/bin/echo "Failed to get HSM IP address. Cannot configure CloudHSM client: ${!HSM_IP_ADDR}"
/opt/cloudhsm/bin/configure --cmu $HSM_IP_ADDR
if [ $? -ne 0 ] ; then
/usr/bin/echo "Failed to configure CloudHSM client"
exit 1
fi
/usr/bin/systemctl start cloudhsm-client.service > /dev/null 2>&1
fi
fi
mode: '000700'
@@ -3253,6 +3344,6 @@ Resources:
Status: ACTIVE

Outputs:
oClusterInfo:
Description: The cluster_id value of the Cluster that has been set up
Value: !GetAtt rCloudHsmCluster.cluster_id
oClusterId:
Description: The cluster ID
Value: !Ref rCloudHsmCluster
Binary file modified images/state-machine-delete-cluster.png