diff --git a/.travis.yml b/.travis.yml index 4a40d3e5af3..9971664b845 100644 --- a/.travis.yml +++ b/.travis.yml @@ -2,7 +2,6 @@ sudo: required dist: trusty services: - docker - - mysql language: scala scala: - 2.12.6 @@ -22,27 +21,80 @@ before_cache: env: matrix: # Setting this variable twice will cause the 'script' section to run twice with the respective env var invoked - - BUILD_TYPE=centaurAws - - BUILD_TYPE=centaurBcs - - BUILD_TYPE=centaurEngineUpgradeLocal - - BUILD_TYPE=centaurEngineUpgradePapiV2 - - BUILD_TYPE=centaurHoricromtalPapiV2 - - BUILD_TYPE=centaurHoricromtalEngineUpgradePapiV2 - - BUILD_TYPE=centaurPapiUpgradePapiV1 - - BUILD_TYPE=centaurPapiUpgradeNewWorkflowsPapiV1 - - BUILD_TYPE=centaurLocal - - BUILD_TYPE=centaurPapiV1 - - BUILD_TYPE=centaurPapiV2 - - BUILD_TYPE=centaurSlurm - - BUILD_TYPE=centaurTes - - BUILD_TYPE=centaurWdlUpgradeLocal - - BUILD_TYPE=checkPublish - - BUILD_TYPE=conformanceLocal - - BUILD_TYPE=conformancePapiV2 - - BUILD_TYPE=conformanceTesk - - BUILD_TYPE=dockerDeadlock - - BUILD_TYPE=dockerScripts - - BUILD_TYPE=sbt + - >- + BUILD_TYPE=centaurAws + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurBcs + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurEngineUpgradeLocal + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurEngineUpgradePapiV2 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurHoricromtalPapiV2 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurHoricromtalPapiV2 + BUILD_MARIADB=10.3 + - >- + BUILD_TYPE=centaurHoricromtalEngineUpgradePapiV2 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurHoricromtalEngineUpgradePapiV2 + BUILD_MARIADB=10.3 + - >- + BUILD_TYPE=centaurPapiUpgradePapiV1 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurPapiUpgradeNewWorkflowsPapiV1 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurLocal + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurLocal + BUILD_POSTGRESQL=11.3 + - >- + BUILD_TYPE=centaurPapiV1 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurPapiV2 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurSlurm + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurTes + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=centaurWdlUpgradeLocal + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=checkPublish + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=conformanceLocal + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=conformancePapiV2 + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=conformanceTesk + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=dockerDeadlock + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=dockerScripts + BUILD_MYSQL=5.7 + - >- + BUILD_TYPE=sbt + BUILD_MYSQL=5.7 + BUILD_POSTGRESQL=11.3 + BUILD_MARIADB=10.3 script: - src/ci/bin/test.sh notifications: diff --git a/CHANGELOG.md b/CHANGELOG.md index 8895f64ab67..ddf68b64e27 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,67 @@ # Cromwell Change Log +## 43 Release Notes + +### Virtual Private Cloud with Subnetworks + +Cromwell now allows PAPIV2 jobs to run on a specific subnetwork inside a private network by adding the subnetwork key +`subnetwork-label-key` inside `virtual-private-cloud` in backend configuration. More info [here](https://cromwell.readthedocs.io/en/stable/backends/Google/). + +### Call caching database refactoring + +Cromwell's `CALL_CACHING_HASH_ENTRY` primary key has been refactored to use a `BIGINT` datatype in place of the previous +`INT` datatype. Cromwell will not be usable during the time the Liquibase migration for this refactor is running. +In the Google Cloud SQL with SSD environment this migration runs at a rate of approximately 100,000 `CALL_CACHING_HASH_ENTRY` +rows per second. 
In deployments with millions or billions of `CALL_CACHING_HASH_ENTRY` rows the migration may require +a significant amount of downtime so please plan accordingly. The following SQL could be used to estimate the number of +rows in this table: + +``` +select max(CALL_CACHING_HASH_ENTRY_ID) from CALL_CACHING_HASH_ENTRY +``` + +### Stackdriver Instrumentation + +Cromwell now supports sending metrics to [Google's Stackdriver API](https://cloud.google.com/monitoring/api/v3/). +Learn more on how to configure [here](https://cromwell.readthedocs.io/en/stable/developers/Instrumentation/). + +### BigQuery in PAPI + +Cromwell now allows a user to specify BigQuery jobs when using the PAPIv2 backend + +### Configuration Changes + +#### StatsD Instrumentation + +There is a small change in StatsD's configuration path. Originally, the path to the config was `services.Instrumentation.config.statsd` +which now has been updated to `services.Instrumentation.config`. More info on its configuration can be found +[here](https://cromwell.readthedocs.io/en/stable/developers/Instrumentation/). + +#### cached-copy + +A new experimental feature, the `cached-copy` localization strategy is available for the shared filesystem. +More information can be found in the [documentation on localization](https://cromwell.readthedocs.io/en/stable/backends/HPC). + +#### Yaml node limits + +Yaml parsing now checks for cycles, and limits the maximum number of parsed nodes to a configurable value. It also +limits the nesting depth of sequences and mappings. See [the documentation on configuring +YAML](https://cromwell.readthedocs.io/en/stable/Configuring/#yaml) for more information. + +### API Changes + +#### Workflow Metadata + +* It is now possible to use `includeKey` and `excludeKey` at the same time. If so, the metadata key must match the `includeKey` **and not** match the `excludeKey` to be included. +* It is now possible to use "`calls`" as one of your `excludeKey`s, to request that only workflow metadata gets returned. + +### PostgreSQL support + +Cromwell now supports PostgreSQL (version 9.6 or higher, with the Large Object +extension installed) as a database backend. +See [here](https://cromwell.readthedocs.io/en/stable/Configuring/#database) for +instructions for configuring the database connection. 
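For reference, a minimal sketch of such a PostgreSQL `database` block — it mirrors the commented example this same diff adds to `cromwell.example.backends/cromwell.examples.conf`; the URL, user, and password are placeholders to adapt to your deployment:

```
# Requires a PostgreSQL 9.6+ server with the Large Object extension installed.
database {
  profile = "slick.jdbc.PostgresProfile$"
  db {
    driver = "org.postgresql.Driver"
    url = "jdbc:postgresql://localhost:5432/cromwell"   # placeholder host/database
    user = "cromwell"                                    # placeholder credentials
    password = ""
    connectionTimeout = 5000
  }
}
```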
+ ## 42 Release Notes ### Womtool endpoint diff --git a/backend/src/main/scala/cromwell/backend/ReadLikeFunctions.scala b/backend/src/main/scala/cromwell/backend/ReadLikeFunctions.scala index 0b0c59831e6..7f718978435 100644 --- a/backend/src/main/scala/cromwell/backend/ReadLikeFunctions.scala +++ b/backend/src/main/scala/cromwell/backend/ReadLikeFunctions.scala @@ -8,7 +8,7 @@ import scala.concurrent.Future import scala.util.Try trait ReadLikeFunctions extends PathFactory with IoFunctionSet with AsyncIoFunctions { - + override def readFile(path: String, maxBytes: Option[Int], failOnOverflow: Boolean): Future[String] = Future.fromTry(Try(buildPath(path))) flatMap { p => asyncIo.contentAsStringAsync(p, maxBytes, failOnOverflow) } diff --git a/centaur/src/main/resources/standardTestCases/cached_copy/cached_copy.wdl b/centaur/src/main/resources/standardTestCases/cached_copy/cached_copy.wdl new file mode 100644 index 00000000000..a683b01df0e --- /dev/null +++ b/centaur/src/main/resources/standardTestCases/cached_copy/cached_copy.wdl @@ -0,0 +1,55 @@ +version 1.0 + +workflow cached_inputs { + Array[Int] one_to_ten = [1,2,3,4,5,6,7,8,9,10] + + call ten_lines + + scatter (x in one_to_ten) { + call read_line { + input: + file=ten_lines.text, + line_number=x + } + } + output { + Array[String] lines = read_line.line + } +} + +task ten_lines { + command { + echo "Line 1 + Line 2 + Line 3 + Line 4 + Line 5 + Line 6 + Line 7 + Line 8 + Line 9 + Line 10" > outfile.txt + } + output { + File text = "outfile.txt" + } + runtime { + docker: "ubuntu:latest" + } +} + +task read_line { + input { + File file + Int line_number + } + command { + sed -n ~{line_number}p ~{file} + } + output { + String line = read_string(stdout()) + } + runtime { + docker: "ubuntu:latest" + } +} \ No newline at end of file diff --git a/centaur/src/main/resources/standardTestCases/check_network_in_vpc.test b/centaur/src/main/resources/standardTestCases/check_network_in_vpc.test index 71359de57b7..2658b81d4d6 100644 --- a/centaur/src/main/resources/standardTestCases/check_network_in_vpc.test +++ b/centaur/src/main/resources/standardTestCases/check_network_in_vpc.test @@ -4,6 +4,7 @@ backends: [Papiv2-Virtual-Private-Cloud] files { workflow: virtual_private_cloud/check_network_in_vpc.wdl + options: virtual_private_cloud/wf_zone_options.json } metadata { @@ -11,4 +12,6 @@ metadata { status: Succeeded "outputs.check_network_in_vpc.network_used": "cromwell-ci-vpc-network" + "outputs.check_network_in_vpc.subnetwork_used": "cromwell-ci-vpc-network" + "outputs.check_network_in_vpc.zone_used": "us-east1-c" } diff --git a/centaur/src/main/resources/standardTestCases/dollars_in_strings.test b/centaur/src/main/resources/standardTestCases/dollars_in_strings.test new file mode 100644 index 00000000000..c2838e6dfb5 --- /dev/null +++ b/centaur/src/main/resources/standardTestCases/dollars_in_strings.test @@ -0,0 +1,14 @@ +name: dollars_in_strings +testFormat: workflowsuccess + +files { + workflow: dollars_in_strings/dollars_in_strings.wdl +} + +metadata { + workflowName: read_dollared_strings + status: Succeeded + "outputs.read_dollared_strings.s1": "${BLAH}" + "outputs.read_dollared_strings.s2": "${BLAH}" + "outputs.read_dollared_strings.s3": "oops ${BLAH}" +} diff --git a/centaur/src/main/resources/standardTestCases/dollars_in_strings/dollars_in_strings.wdl b/centaur/src/main/resources/standardTestCases/dollars_in_strings/dollars_in_strings.wdl new file mode 100644 index 00000000000..f98eb1b0e6f --- /dev/null +++ 
b/centaur/src/main/resources/standardTestCases/dollars_in_strings/dollars_in_strings.wdl @@ -0,0 +1,32 @@ +workflow read_dollared_strings { + + call dollars_in_strings + + String dollar = "$" + + output { + String s1 = "${dollar}{BLAH}" + String s2 = s1 + + String s3 = dollars_in_strings.s3 + } +} + + +task dollars_in_strings { + String dollar = "$" + command <<< + cat > foo.txt << 'EOF' + oops ${dollar}{BLAH} + EOF + >>> + + output { + File x = "foo.txt" + String s3 = read_string(x) + } + + runtime { + docker: "ubuntu:latest" + } +} diff --git a/centaur/src/main/resources/standardTestCases/drs_tests/wf_level_file_size.wdl b/centaur/src/main/resources/standardTestCases/drs_tests/wf_level_file_size.wdl new file mode 100644 index 00000000000..56d89094921 --- /dev/null +++ b/centaur/src/main/resources/standardTestCases/drs_tests/wf_level_file_size.wdl @@ -0,0 +1,11 @@ +version 1.0 + +workflow wf_level_file_size { + File input1 = "dos://wb-mock-drs-dev.storage.googleapis.com/4a3908ad-1f0b-4e2a-8a92-611f2123e8b0" + File input2 = "dos://wb-mock-drs-dev.storage.googleapis.com/0c8e7bc6-fd76-459d-947b-808b0605beb3" + + output { + Float fileSize1 = size(input1) + Float fileSize2 = size(input2) + } +} diff --git a/centaur/src/main/resources/standardTestCases/drs_wf_level_file_size.test b/centaur/src/main/resources/standardTestCases/drs_wf_level_file_size.test new file mode 100644 index 00000000000..67210745838 --- /dev/null +++ b/centaur/src/main/resources/standardTestCases/drs_wf_level_file_size.test @@ -0,0 +1,17 @@ +name: drs_wf_level_read_size +testFormat: workflowsuccess +backends: [Papiv2NoDockerHubConfig] + +files { + workflow: drs_tests/wf_level_file_size.wdl +} + +metadata { + workflowName: wf_level_file_size + status: Succeeded + + "outputs.wf_level_file_size.fileSize1": 43.0 + "outputs.wf_level_file_size.fileSize2": 45.0 +} + + diff --git a/centaur/src/main/resources/standardTestCases/empty_inputs_file.test b/centaur/src/main/resources/standardTestCases/empty_inputs_file.test index dbda16cdcab..d9cb0fb972d 100644 --- a/centaur/src/main/resources/standardTestCases/empty_inputs_file.test +++ b/centaur/src/main/resources/standardTestCases/empty_inputs_file.test @@ -10,6 +10,6 @@ submit { statusCode: 400 message: """{ "status": "fail", - "message": "Error(s): Input file is not a valid yaml or json. Inputs data: ''. Error: MatchError: null." + "message": "Error(s): Input file is not a valid yaml or json. Inputs data: ''. Error: ParsingFailure: null." 
}""" } diff --git a/centaur/src/main/resources/standardTestCases/gpu_on_papi/gpu_cuda_image.wdl b/centaur/src/main/resources/standardTestCases/gpu_on_papi/gpu_cuda_image.wdl index 5ce0a5c33a8..f7c784cb574 100644 --- a/centaur/src/main/resources/standardTestCases/gpu_on_papi/gpu_cuda_image.wdl +++ b/centaur/src/main/resources/standardTestCases/gpu_on_papi/gpu_cuda_image.wdl @@ -33,6 +33,7 @@ task get_machine_info { runtime { docker: "nvidia/cuda:9.0-cudnn7-devel-ubuntu16.04" + bootDiskSizeGb: 20 gpuType: "nvidia-tesla-k80" gpuCount: 1 nvidiaDriverVersion: driver_version diff --git a/centaur/src/main/resources/standardTestCases/virtual_private_cloud/check_network_in_vpc.wdl b/centaur/src/main/resources/standardTestCases/virtual_private_cloud/check_network_in_vpc.wdl index 23b52aa9992..81111c3594f 100644 --- a/centaur/src/main/resources/standardTestCases/virtual_private_cloud/check_network_in_vpc.wdl +++ b/centaur/src/main/resources/standardTestCases/virtual_private_cloud/check_network_in_vpc.wdl @@ -2,12 +2,17 @@ version 1.0 task get_network { command { + set -euo pipefail + apt-get install --assume-yes jq > /dev/null INSTANCE=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/name" -H "Metadata-Flavor: Google") ZONE=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google" | sed -E 's!.*/(.*)!\1!') TOKEN=$(gcloud auth application-default print-access-token) INSTANCE_METADATA=$(curl "https://www.googleapis.com/compute/v1/projects/broad-dsde-cromwell-dev/zones/$ZONE/instances/$INSTANCE" -H "Authorization: Bearer $TOKEN" -H 'Accept: application/json') - echo $INSTANCE_METADATA | jq -r '.networkInterfaces[0].network' | sed -E 's!.*/(.*)!\1!' + NETWORK_OBJECT=$(echo $INSTANCE_METADATA | jq --raw-output --exit-status '.networkInterfaces[0]') + echo $NETWORK_OBJECT | jq --exit-status '.network' | sed -E 's!.*/(.*)!\1!' | sed 's/"//g' > network + echo $NETWORK_OBJECT | jq --exit-status '.subnetwork' | sed -E 's!.*/(.*)!\1!' 
| sed 's/"//g' > subnetwork + echo $ZONE > zone } runtime { @@ -16,7 +21,9 @@ task get_network { } output { - String network = read_string(stdout()) + String networkName = read_string("network") + String subnetworkName = read_string("subnetwork") + String zone = read_string("zone") } } @@ -24,6 +31,9 @@ workflow check_network_in_vpc { call get_network output { - String network_used = get_network.network + String network_used = get_network.networkName + String subnetwork_used = get_network.subnetworkName + String zone_used = get_network.zone } } + diff --git a/centaur/src/main/resources/standardTestCases/virtual_private_cloud/wf_zone_options.json b/centaur/src/main/resources/standardTestCases/virtual_private_cloud/wf_zone_options.json new file mode 100644 index 00000000000..8343cd1484d --- /dev/null +++ b/centaur/src/main/resources/standardTestCases/virtual_private_cloud/wf_zone_options.json @@ -0,0 +1,5 @@ +{ + "default_runtime_attributes": { + "zones": "us-east1-c" + } +} diff --git a/centaur/src/main/scala/centaur/CromwellConfiguration.scala b/centaur/src/main/scala/centaur/CromwellConfiguration.scala index 1b79a6573b0..f86268da4c3 100644 --- a/centaur/src/main/scala/centaur/CromwellConfiguration.scala +++ b/centaur/src/main/scala/centaur/CromwellConfiguration.scala @@ -1,5 +1,8 @@ package centaur +import java.lang.ProcessBuilder.Redirect +import better.files.File + trait CromwellProcess { def logFile: String def displayString: String @@ -7,6 +10,22 @@ trait CromwellProcess { def stop(): Unit def isAlive: Boolean def cromwellConfiguration: CromwellConfiguration + + protected def runProcess(command: Array[String]): Process = { + val processBuilder = new java.lang.ProcessBuilder() + .command(command: _*) + .redirectOutput(Redirect.appendTo(File(logFile).toJava)) + .redirectErrorStream(true) + processBuilder.start() + } + + protected def waitProcess(process: Process, destroy: Boolean = false): Unit = { + process.getOutputStream.flush() + if (destroy) + process.destroy() + process.waitFor() + () + } } trait CromwellConfiguration { diff --git a/centaur/src/main/scala/centaur/DockerComposeCromwellConfiguration.scala b/centaur/src/main/scala/centaur/DockerComposeCromwellConfiguration.scala index 12ef2bba2f3..182391491bd 100644 --- a/centaur/src/main/scala/centaur/DockerComposeCromwellConfiguration.scala +++ b/centaur/src/main/scala/centaur/DockerComposeCromwellConfiguration.scala @@ -1,11 +1,8 @@ package centaur -import java.lang.ProcessBuilder.Redirect - -import better.files.File +import centaur.CromwellManager.ManagedCromwellPort import com.typesafe.config.Config - object DockerComposeCromwellConfiguration { def apply(conf: Config): CromwellConfiguration = { val dockerTag = conf.getString("docker-tag") @@ -20,37 +17,41 @@ object DockerComposeCromwellConfiguration { case class DockerComposeCromwellConfiguration(dockerTag: String, dockerComposeFile: String, conf: String, logFile: String) extends CromwellConfiguration { override def createProcess: CromwellProcess = { case class DockerComposeCromwellProcess(override val cromwellConfiguration: DockerComposeCromwellConfiguration) extends CromwellProcess { - private val startCommand = Array( - "/bin/bash", - "-c", - s"CROMWELL_TAG=$dockerTag CROMWELL_CONFIG=$conf docker-compose -f $dockerComposeFile up -d") + + private def composeCommand(command: String): Array[String] = { + Array( + "/bin/bash", + "-c", + s"MANAGED_CROMWELL_PORT=$ManagedCromwellPort " + + s"CROMWELL_TAG=$dockerTag " + + s"CROMWELL_CONFIG=$conf " + + s"docker-compose -f 
$dockerComposeFile $command") + } + + private val startCommand = composeCommand("up") + private val stopCommand = composeCommand("down -v") + private val rmCommand = composeCommand("rm -fsv") override def displayString: String = startCommand.mkString(" ") + private var process: Option[Process] = None + override def start(): Unit = { - val processBuilder = new java.lang.ProcessBuilder() - .command(startCommand: _*) - .redirectOutput(Redirect.appendTo(File(logFile).toJava)) - .redirectErrorStream(true) - processBuilder.start().waitFor() - () + process = Option(runProcess(startCommand)) } override def stop(): Unit = { - val command = Array( - "/bin/bash", - "-c", - s"docker-compose -f $dockerComposeFile down" - ) - val processBuilder = new java.lang.ProcessBuilder() - .command(command: _*) - .redirectOutput(Redirect.appendTo(File(logFile).toJava)) - .redirectErrorStream(true) - processBuilder.start().waitFor() - () + waitProcess(runProcess(stopCommand)) + waitProcess(runProcess(rmCommand)) + process foreach { + waitProcess(_, destroy = true) + } + process = None } - override def isAlive: Boolean = true + override def isAlive: Boolean = process.exists { + _.isAlive + } override def logFile: String = cromwellConfiguration.logFile } diff --git a/centaur/src/main/scala/centaur/JarCromwellConfiguration.scala b/centaur/src/main/scala/centaur/JarCromwellConfiguration.scala index caccb8ef46f..3069e555316 100644 --- a/centaur/src/main/scala/centaur/JarCromwellConfiguration.scala +++ b/centaur/src/main/scala/centaur/JarCromwellConfiguration.scala @@ -1,8 +1,5 @@ package centaur -import java.lang.ProcessBuilder.Redirect - -import better.files.File import centaur.CromwellManager.ManagedCromwellPort import com.typesafe.config.Config @@ -32,20 +29,14 @@ case class JarCromwellConfiguration(jar: String, conf: String, logFile: String) override def displayString: String = command.mkString(" ") override def start(): Unit = { - val processBuilder = new java.lang.ProcessBuilder() - .command(command: _*) - .redirectOutput(Redirect.appendTo(File(logFile).toJava)) - .redirectErrorStream(true) - process = Option(processBuilder.start()) + process = Option(runProcess(command)) } override def stop(): Unit = { - process foreach { p => - p.getOutputStream.flush() - p.destroy() - p.waitFor() + process foreach { + waitProcess(_, destroy = true) } - () + process = None } override def isAlive: Boolean = process.exists { _.isAlive } diff --git a/centaur/src/main/scala/centaur/test/Test.scala b/centaur/src/main/scala/centaur/test/Test.scala index 89a528c1e97..e384539ae46 100644 --- a/centaur/src/main/scala/centaur/test/Test.scala +++ b/centaur/src/main/scala/centaur/test/Test.scala @@ -12,7 +12,8 @@ import centaur.test.metadata.WorkflowFlatMetadata import centaur.test.metadata.WorkflowFlatMetadata._ import centaur.test.submit.SubmitHttpResponse import centaur.test.workflow.Workflow -import com.google.api.services.genomics.Genomics +import com.google.api.services.genomics.{Genomics, GenomicsScopes} +import com.google.api.services.storage.StorageScopes import com.google.auth.Credentials import com.google.auth.http.HttpCredentialsAdapter import com.google.auth.oauth2.ServiceAccountCredentials @@ -86,9 +87,10 @@ object Operations { lazy val googleConf: Config = CentaurConfig.conf.getConfig("google") lazy val authName: String = googleConf.getString("auth") lazy val genomicsEndpointUrl: String = googleConf.getString("genomics.endpoint-url") + lazy val genomicsAndStorageScopes = List(StorageScopes.CLOUD_PLATFORM_READ_ONLY, 
GenomicsScopes.GENOMICS) lazy val credentials: Credentials = configuration.auth(authName) .unsafe - .pipelinesApiCredentials(GoogleAuthMode.NoOptionLookup) + .credentials(genomicsAndStorageScopes) lazy val credentialsProjectOption: Option[String] = { Option(credentials) collect { case serviceAccountCredentials: ServiceAccountCredentials => serviceAccountCredentials.getProjectId diff --git a/centaurCwlRunner/src/main/scala/centaur/cwl/CloudPreprocessor.scala b/centaurCwlRunner/src/main/scala/centaur/cwl/CloudPreprocessor.scala index 4e8b90e1316..68e79b9b701 100644 --- a/centaurCwlRunner/src/main/scala/centaur/cwl/CloudPreprocessor.scala +++ b/centaurCwlRunner/src/main/scala/centaur/cwl/CloudPreprocessor.scala @@ -1,6 +1,7 @@ package centaur.cwl import better.files.File import com.typesafe.config.Config +import common.util.StringUtil._ import common.validation.IOChecked.IOChecked import cwl.preprocessor.CwlPreProcessor import io.circe.optics.JsonPath @@ -8,7 +9,7 @@ import io.circe.optics.JsonPath._ import io.circe.yaml.Printer.StringStyle import io.circe.{Json, yaml} import net.ceedubs.ficus.Ficus._ -import common.util.StringUtil._ +import wom.util.YamlUtils /** * Tools to pre-process the CWL workflows and inputs before feeding them to Cromwell so they can be executed on PAPI. @@ -39,7 +40,7 @@ class CloudPreprocessor(config: Config, prefixConfigPath: String) { // Parse value, apply f to it, and print it back to String using the printer private def process(value: String, f: Json => Json, printer: Json => String) = { - yaml.parser.parse(value) match { + YamlUtils.parse(value) match { case Left(error) => throw new Exception(error.getMessage) case Right(json) => printer(f(json)) } diff --git a/centaurCwlRunner/src/test/scala/CloudPreprocessorSpec.scala b/centaurCwlRunner/src/test/scala/CloudPreprocessorSpec.scala index b8eb3d47818..513efa07407 100644 --- a/centaurCwlRunner/src/test/scala/CloudPreprocessorSpec.scala +++ b/centaurCwlRunner/src/test/scala/CloudPreprocessorSpec.scala @@ -1,6 +1,7 @@ import centaur.cwl.CloudPreprocessor import com.typesafe.config.ConfigFactory import org.scalatest.{FlatSpec, Matchers} +import wom.util.YamlUtils class CloudPreprocessorSpec extends FlatSpec with Matchers { behavior of "PAPIPreProcessor" @@ -8,8 +9,8 @@ class CloudPreprocessorSpec extends FlatSpec with Matchers { val pAPIPreprocessor = new CloudPreprocessor(ConfigFactory.load(), "papi.default-input-gcs-prefix") def validate(result: String, expectation: String) = { - val parsedResult = io.circe.yaml.parser.parse(result).right.get - val parsedExpectation = io.circe.yaml.parser.parse(expectation).right.get + val parsedResult = YamlUtils.parse(result).right.get + val parsedExpectation = YamlUtils.parse(expectation).right.get // This is an actual Json comparison from circe parsedResult shouldBe parsedExpectation diff --git a/cloud-nio/cloud-nio-impl-drs/src/main/scala/cloud/nio/impl/drs/DrsCloudNioFileProvider.scala b/cloud-nio/cloud-nio-impl-drs/src/main/scala/cloud/nio/impl/drs/DrsCloudNioFileProvider.scala index dbe4e7c0b94..38d443e2f87 100644 --- a/cloud-nio/cloud-nio-impl-drs/src/main/scala/cloud/nio/impl/drs/DrsCloudNioFileProvider.scala +++ b/cloud-nio/cloud-nio-impl-drs/src/main/scala/cloud/nio/impl/drs/DrsCloudNioFileProvider.scala @@ -39,10 +39,18 @@ class DrsCloudNioFileProvider(scheme: String, private def checkIfPathExistsThroughMartha(drsPath: String): IO[Boolean] = { - drsPathResolver.rawMarthaResponse(drsPath).use { marthaResponse => - val errorMsg = s"Status line was null for martha 
response $marthaResponse." - toIO(Option(marthaResponse.getStatusLine), errorMsg) - }.map(_.getStatusCode == HttpStatus.SC_OK) + /* + * Unlike other cloud providers where directories are identified with a trailing slash at the end like `gs://bucket/dir/`, + * DRS has a concept of bundles for directories (not supported yet). Hence for method `checkDirectoryExists` which appends a trailing '/' + * to see if the current path is a directory, return false + */ + if (drsPath.endsWith("/")) IO(false) + else { + drsPathResolver.rawMarthaResponse(drsPath).use { marthaResponse => + val errorMsg = s"Status line was null for martha response $marthaResponse." + toIO(Option(marthaResponse.getStatusLine), errorMsg) + }.map(_.getStatusCode == HttpStatus.SC_OK) + } } diff --git a/cloudSupport/src/main/scala/cromwell/cloudsupport/gcp/auth/GoogleAuthMode.scala b/cloudSupport/src/main/scala/cromwell/cloudsupport/gcp/auth/GoogleAuthMode.scala index 018dff2f364..74f56dbfc68 100644 --- a/cloudSupport/src/main/scala/cromwell/cloudsupport/gcp/auth/GoogleAuthMode.scala +++ b/cloudSupport/src/main/scala/cromwell/cloudsupport/gcp/auth/GoogleAuthMode.scala @@ -4,15 +4,9 @@ import java.io.{ByteArrayInputStream, FileNotFoundException, InputStream} import java.net.HttpURLConnection._ import better.files.File -import com.google.api.client.googleapis.auth.oauth2.GoogleCredential import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport -import com.google.api.client.googleapis.testing.auth.oauth2.MockGoogleCredential import com.google.api.client.http.HttpResponseException import com.google.api.client.json.jackson2.JacksonFactory -import com.google.api.services.cloudkms.v1.CloudKMS -import com.google.api.services.compute.ComputeScopes -import com.google.api.services.genomics.v2alpha1.GenomicsScopes -import com.google.api.services.storage.StorageScopes import com.google.auth.Credentials import com.google.auth.http.HttpTransportFactory import com.google.auth.oauth2.{GoogleCredentials, OAuth2Credentials, ServiceAccountCredentials, UserCredentials} @@ -55,13 +49,6 @@ object GoogleAuthMode { val DockerCredentialsEncryptionKeyNameKey = "docker_credentials_key_name" val DockerCredentialsTokenKey = "docker_credentials_token" - private val PipelinesApiScopes = List( - StorageScopes.DEVSTORAGE_FULL_CONTROL, - StorageScopes.DEVSTORAGE_READ_WRITE, - GenomicsScopes.GENOMICS, - ComputeScopes.COMPUTE - ) - def checkReadable(file: File) = { if (!file.isReadable) throw new FileNotFoundException(s"File $file does not exist or is not readable") } @@ -85,24 +72,6 @@ object GoogleAuthMode { } } - def encryptKms(keyName: String, credential: GoogleCredential, plainText: String) = { - import com.google.api.services.cloudkms.v1.CloudKMSScopes - - // Depending on the environment that provides the default credentials (e.g. Compute Engine, App - // Engine), the credentials may require us to specify the scopes we need explicitly. - // Check for this case, and inject the scope if required. 
- val scopedCredential = if (credential.createScopedRequired) credential.createScoped(CloudKMSScopes.all) else credential - - val kms = new CloudKMS.Builder(httpTransport, jsonFactory, scopedCredential) - .setApplicationName("cromwell") - .build() - - import com.google.api.services.cloudkms.v1.model.EncryptRequest - val request = new EncryptRequest().encodePlaintext(plainText.toCharArray.map(_.toByte)) - val response = kms.projects.locations.keyRings.cryptoKeys.encrypt(keyName, request).execute - response.getCiphertext - } - /** Used for both checking that the credential is valid and creating a fresh credential. */ private def refreshCredentials(credentials: Credentials): Unit = { credentials.refresh() @@ -115,49 +84,33 @@ sealed trait GoogleAuthMode { def name: String - // Create a Credential object from the google.api.client.auth library (https://github.com/google/google-api-java-client) - def credentials(options: OptionLookup, scopes: java.util.Collection[String]): OAuth2Credentials - /** - * Create a credential object suitable for use with Pipelines API. - * - * @param options A lookup for external credential information. - * @return Credentials with scopes compatible with the Genomics API compute and storage. + * Creates OAuth credentials with the specified scopes. */ - def pipelinesApiCredentials(options: OptionLookup): OAuth2Credentials = { - credentials(options, PipelinesApiScopes.asJavaCollection) - } + def credentials(options: OptionLookup, scopes: Iterable[String]): OAuth2Credentials /** * Alias for credentials(GoogleAuthMode.NoOptionLookup, scopes). * Only valid for credentials that are NOT externally provided, such as ApplicationDefault. */ def credentials(scopes: Iterable[String]): OAuth2Credentials = { - credentials(GoogleAuthMode.NoOptionLookup, scopes.asJavaCollection) - } - - /** - * Alias for credentials(GoogleAuthMode.NoOptionLookup, scopes). - * Only valid for credentials that are NOT externally provided, such as ApplicationDefault. - */ - def credentials(scopes: java.util.Collection[String]): OAuth2Credentials = { credentials(GoogleAuthMode.NoOptionLookup, scopes) } /** - * Alias for credentials(GoogleAuthMode.NoOptionLookup, Set.empty). + * Alias for credentials(GoogleAuthMode.NoOptionLookup, Nil). * Only valid for credentials that are NOT externally provided and do not need scopes, such as ApplicationDefault. */ private[auth] def credentials(): OAuth2Credentials = { - credentials(GoogleAuthMode.NoOptionLookup, java.util.Collections.emptySet[String]) + credentials(GoogleAuthMode.NoOptionLookup, Nil) } /** - * Alias for credentials(options, Set.empty). + * Alias for credentials(options, Nil). * Only valid for credentials that are NOT externally provided and do not need scopes, such as ApplicationDefault. 
*/ private[auth] def credentials(options: OptionLookup): OAuth2Credentials = { - credentials(options, java.util.Collections.emptySet[String]) + credentials(options, Nil) } def requiresAuthFile: Boolean = false @@ -176,20 +129,14 @@ sealed trait GoogleAuthMode { case Success(_) => credential } } - - def apiClientGoogleCredential(options: OptionLookup): Option[GoogleCredential] = None } case object MockAuthMode extends GoogleAuthMode { override val name = "no_auth" - override def credentials(unusedOptions: OptionLookup, unusedScopes: java.util.Collection[String]): NoCredentials = { + override def credentials(unusedOptions: OptionLookup, unusedScopes: Iterable[String]): NoCredentials = { NoCredentials.getInstance } - - override def apiClientGoogleCredential(options: OptionLookup): Option[MockGoogleCredential] = { - Option(new MockGoogleCredential.Builder().build()) - } } object ServiceAccountMode { @@ -204,20 +151,12 @@ object ServiceAccountMode { } -trait HasApiClientGoogleCredentialStream { self: GoogleAuthMode => - protected def credentialStream(options: OptionLookup): InputStream - - override def apiClientGoogleCredential(options: OptionLookup): Option[GoogleCredential] = Option(GoogleCredential.fromStream(credentialStream(options))) -} - final case class ServiceAccountMode(override val name: String, fileFormat: CredentialFileFormat) - extends GoogleAuthMode with HasApiClientGoogleCredentialStream { + extends GoogleAuthMode { private val credentialsFile = File(fileFormat.file) checkReadable(credentialsFile) - override protected def credentialStream(options: OptionLookup): InputStream = credentialsFile.newInputStream - private lazy val serviceAccountCredentials: ServiceAccountCredentials = { fileFormat match { case PemFileFormat(accountId, _) => @@ -228,25 +167,24 @@ final case class ServiceAccountMode(override val name: String, } override def credentials(unusedOptions: OptionLookup, - scopes: java.util.Collection[String]): GoogleCredentials = { - val scopedCredentials = serviceAccountCredentials.createScoped(scopes) + scopes: Iterable[String]): GoogleCredentials = { + val scopedCredentials = serviceAccountCredentials.createScoped(scopes.asJavaCollection) validateCredentials(scopedCredentials) } } -final case class UserServiceAccountMode(override val name: String) - extends GoogleAuthMode with HasApiClientGoogleCredentialStream { +final case class UserServiceAccountMode(override val name: String) extends GoogleAuthMode { private def extractServiceAccount(options: OptionLookup): String = { extract(options, UserServiceAccountKey) } - override protected def credentialStream(options: OptionLookup): InputStream = { + private def credentialStream(options: OptionLookup): InputStream = { new ByteArrayInputStream(extractServiceAccount(options).getBytes("UTF-8")) } - override def credentials(options: OptionLookup, scopes: java.util.Collection[String]): GoogleCredentials = { + override def credentials(options: OptionLookup, scopes: Iterable[String]): GoogleCredentials = { val newCredentials = ServiceAccountCredentials.fromStream(credentialStream(options)) - val scopedCredentials: GoogleCredentials = newCredentials.createScoped(scopes) + val scopedCredentials: GoogleCredentials = newCredentials.createScoped(scopes.asJavaCollection) validateCredentials(scopedCredentials) } } @@ -267,7 +205,7 @@ final case class UserMode(override val name: String, validateCredentials(UserCredentials.fromStream(secretsStream)) } - override def credentials(unusedOptions: OptionLookup, unusedScopes: 
java.util.Collection[String]): OAuth2Credentials = { + override def credentials(unusedOptions: OptionLookup, unusedScopes: Iterable[String]): OAuth2Credentials = { userCredentials } } @@ -278,11 +216,9 @@ object ApplicationDefaultMode { final case class ApplicationDefaultMode(name: String) extends GoogleAuthMode { override def credentials(unusedOptions: OptionLookup, - unusedScopes: java.util.Collection[String]): GoogleCredentials = { + unusedScopes: Iterable[String]): GoogleCredentials = { ApplicationDefaultMode.applicationDefaultCredentials } - - override def apiClientGoogleCredential(unused: OptionLookup): Option[GoogleCredential] = Option(GoogleCredential.getApplicationDefault(httpTransport, jsonFactory)) } final case class RefreshTokenMode(name: String, @@ -297,7 +233,7 @@ final case class RefreshTokenMode(name: String, extract(options, RefreshTokenOptionKey) } - override def credentials(options: OptionLookup, unusedScopes: java.util.Collection[String]): UserCredentials = { + override def credentials(options: OptionLookup, unusedScopes: Iterable[String]): UserCredentials = { val refreshToken = extractRefreshToken(options) val newCredentials: UserCredentials = UserCredentials .newBuilder() diff --git a/core/src/main/resources/reference.conf b/core/src/main/resources/reference.conf index 0eabd4eb146..c729c46af6e 100644 --- a/core/src/main/resources/reference.conf +++ b/core/src/main/resources/reference.conf @@ -497,6 +497,12 @@ services { Instrumentation { # Default noop service - instrumentation metrics are ignored class = "cromwell.services.instrumentation.impl.noop.NoopInstrumentationServiceActor" + + # StatsD instrumentation service actor + # class = "cromwell.services.instrumentation.impl.statsd.StatsDInstrumentationServiceActor" + + # Stackdriver instrumentation service actor + # class = "cromwell.services.instrumentation.impl.stackdriver.StackdriverInstrumentationServiceActor" } HealthMonitor { class = "cromwell.services.healthmonitor.impl.HealthMonitorServiceActor" diff --git a/core/src/test/resources/application.conf b/core/src/test/resources/application.conf index d6d14d3c9b7..22ce88ed919 100644 --- a/core/src/test/resources/application.conf +++ b/core/src/test/resources/application.conf @@ -21,21 +21,60 @@ database.db.connectionTimeout = 30000 database-test-mysql { # Run the following to (optionally) drop and (re-)create the database: - # mysql -utravis -e "DROP DATABASE IF EXISTS cromwell_test" && mysql -utravis -e "CREATE DATABASE cromwell_test" + # mysql -ucromwell -ptest -e "DROP DATABASE IF EXISTS cromwell_test; CREATE DATABASE cromwell_test;" profile = "slick.jdbc.MySQLProfile$" db { - hostname = localhost - hostname = ${?CROMWELL_BUILD_MYSQL_HOSTNAME} - port = 3306 - port = ${?CROMWELL_BUILD_MYSQL_PORT} - schema = cromwell_test - schema = ${?CROMWELL_BUILD_MYSQL_SCHEMA} - url = "jdbc:mysql://"${database-test-mysql.db.hostname}":"${database-test-mysql.db.port}"/"${database-test-mysql.db.schema}"?useSSL=false&rewriteBatchedStatements=true&serverTimezone=UTC" - user = "travis" + driver = "com.mysql.cj.jdbc.Driver" + url = "jdbc:mysql://localhost:3306/cromwell_test?useSSL=false&rewriteBatchedStatements=true&serverTimezone=UTC" + url = ${?CROMWELL_BUILD_MYSQL_JDBC_URL} + user = "cromwell" user = ${?CROMWELL_BUILD_MYSQL_USERNAME} - password = "" + password = "test" password = ${?CROMWELL_BUILD_MYSQL_PASSWORD} + connectionTimeout = 5000 + } +} + +database-test-mariadb { + # Installing both mysql and mariadb takes skill... 
Instead, try running this docker from the cromwell directory: + # + # docker run \ + # --rm \ + # --env MYSQL_ROOT_PASSWORD=private \ + # --env MYSQL_USER=cromwell \ + # --env MYSQL_PASSWORD=test \ + # --env MYSQL_DATABASE=cromwell_test \ + # --publish 13306:3306 \ + # --volume ${PWD}/src/ci/docker-compose/mariadb-conf.d:/etc/mysql/conf.d \ + # mariadb:10.3 + + # Run the following to (optionally) drop and (re-)create the database: + # mysql --protocol=tcp -P13306 -ucromwell -ptest -e "DROP DATABASE IF EXISTS cromwell_test; CREATE DATABASE cromwell_test;" + profile = "slick.jdbc.MySQLProfile$" + db { driver = "com.mysql.cj.jdbc.Driver" + url = "jdbc:mysql://localhost:13306/cromwell_test?useSSL=false&rewriteBatchedStatements=true&serverTimezone=UTC" + url = ${?CROMWELL_BUILD_MARIADB_JDBC_URL} + user = "cromwell" + user = ${?CROMWELL_BUILD_MARIADB_USERNAME} + password = "test" + password = ${?CROMWELL_BUILD_MARIADB_PASSWORD} + connectionTimeout = 5000 + } +} + +database-test-postgresql { + # Run the following to (optionally) drop and (re-)create the database: + # psql postgres <<< 'drop database if exists cromwell_test; create database cromwell_test;' + profile = "slick.jdbc.PostgresProfile$" + db { + driver = "org.postgresql.Driver" + url = "jdbc:postgresql://localhost:5432/cromwell_test?reWriteBatchedInserts=true" + url = ${?CROMWELL_BUILD_POSTGRES_JDBC_URL} + user = "cromwell" + user = ${?CROMWELL_BUILD_POSTGRES_USERNAME} + password = "test" + password = ${?CROMWELL_BUILD_POSTGRES_PASSWORD} connectionTimeout = 5000 } } diff --git a/cromwell.example.backends/cromwell.examples.conf b/cromwell.example.backends/cromwell.examples.conf index 11bbf533cb3..0c13f9b4577 100644 --- a/cromwell.example.backends/cromwell.examples.conf +++ b/cromwell.example.backends/cromwell.examples.conf @@ -397,7 +397,7 @@ backend { ${"--user " + docker_user} \ --entrypoint ${job_shell} \ -v ${cwd}:${docker_cwd} \ - ${docker} ${script} + ${docker} ${docker_script} """ # Root directory where Cromwell writes job results. This directory must be @@ -421,6 +421,10 @@ backend { localization: [ "hard-link", "soft-link", "copy" ] + # An experimental localization strategy called "cached-copy" is also available for SFS backends. + # This will copy a file to a cache and then hard-link from the cache. It will copy the file to the cache again + # when the maximum number of hardlinks for a file is reached. The maximum number of hardlinks can be set with: + # max-hardlinks: 950 # Call caching strategies caching { @@ -495,12 +499,26 @@ services { Instrumentation { # StatsD - Send metrics to a StatsD server over UDP # class = "cromwell.services.instrumentation.impl.statsd.StatsDInstrumentationServiceActor" - # config.statsd { + # config { # hostname = "localhost" # port = 8125 # prefix = "" # can be used to prefix all metrics with an api key for example # flush-rate = 1 second # rate at which aggregated metrics will be sent to statsd # } + + # Stackdriver - Send metrics to Google's monitoring API + # class = "cromwell.services.instrumentation.impl.stackdriver.StackdriverInstrumentationServiceActor" + # config { + # # auth scheme can be `application_default` or `service_account` + # auth = "service-account" + # google-project = "my-project" + # # rate at which aggregated metrics will be sent to Stackdriver API, must be 1 minute or more. + # flush-rate = 1 minute + # # below 3 keys are attached as labels to each metric. `cromwell-perf-test-case` is specifically meant for perf env. 
+ # cromwell-instance-identifier = "cromwell-101" + # cromwell-instance-role = "role" + # cromwell-perf-test-case = "perf-test-1" + # } } HealthMonitor { config { @@ -616,4 +634,17 @@ database { # connectionTimeout = 3000 # } #} + + # Postgresql example + #database { + # profile = "slick.jdbc.PostgresProfile$" + # db { + # driver = "org.postgresql.Driver" + # url = "jdbc:postgresql://localhost:5432/cromwell" + # user = "" + # password = "" + # port = 5432 + # connectionTimeout = 5000 + # } + #} } diff --git a/cwl/src/main/scala/cwl/CommandLineTool.scala b/cwl/src/main/scala/cwl/CommandLineTool.scala index e720ebe62b8..21382cc1b99 100644 --- a/cwl/src/main/scala/cwl/CommandLineTool.scala +++ b/cwl/src/main/scala/cwl/CommandLineTool.scala @@ -17,6 +17,7 @@ import wom.callable.{Callable, CallableTaskDefinition, ContainerizedInputExpress import wom.expression.{IoFunctionSet, ValueAsAnExpression, WomExpression} import wom.graph.GraphNodePort.OutputPort import wom.types.{WomArrayType, WomIntegerType, WomOptionalType} +import wom.util.YamlUtils import wom.values.{WomArray, WomEvaluatedCallInputs, WomGlobFile, WomInteger, WomString, WomValue} import wom.{CommandPart, RuntimeAttributes, RuntimeAttributesKeys} @@ -167,8 +168,9 @@ case class CommandLineTool private( // Parse content as json and return output values for each output port def parseContent(content: String): EvaluatedOutputs = { + val yaml = YamlUtils.parse(content) for { - parsed <- io.circe.yaml.parser.parse(content).flatMap(_.as[Map[String, Json]]).leftMap(error => NonEmptyList.one(error.getMessage)) + parsed <- yaml.flatMap(_.as[Map[String, Json]]).leftMap(error => NonEmptyList.one(error.getMessage)) jobOutputsMap <- jsonToOutputs(parsed) } yield jobOutputsMap.toMap } diff --git a/cwl/src/main/scala/cwl/CwlExecutableValidation.scala b/cwl/src/main/scala/cwl/CwlExecutableValidation.scala index 8df69890ba7..f6f8237c5ff 100644 --- a/cwl/src/main/scala/cwl/CwlExecutableValidation.scala +++ b/cwl/src/main/scala/cwl/CwlExecutableValidation.scala @@ -3,18 +3,19 @@ package cwl import common.Checked import common.validation.Checked._ import io.circe.Json -import io.circe.yaml import wom.callable.{ExecutableCallable, TaskDefinition} import wom.executable.Executable import wom.executable.Executable.{InputParsingFunction, ParsedInputMap} import wom.expression.IoFunctionSet +import wom.util.YamlUtils object CwlExecutableValidation { // Decodes the input file, and build the ParsedInputMap private val inputCoercionFunction: InputParsingFunction = inputFile => { - yaml.parser.parse(inputFile).flatMap(_.as[Map[String, Json]]) match { + val yaml = YamlUtils.parse(inputFile) + yaml.flatMap(_.as[Map[String, Json]]) match { case Left(error) => error.getMessage.invalidNelCheck[ParsedInputMap] case Right(inputValue) => inputValue.map({ case (key, value) => key -> value.foldWith(CwlJsonToDelayedCoercionFunction) }).validNelCheck } diff --git a/cwl/src/main/scala/cwl/preprocessor/CwlPreProcessor.scala b/cwl/src/main/scala/cwl/preprocessor/CwlPreProcessor.scala index 010296c01a2..3444259f742 100644 --- a/cwl/src/main/scala/cwl/preprocessor/CwlPreProcessor.scala +++ b/cwl/src/main/scala/cwl/preprocessor/CwlPreProcessor.scala @@ -14,6 +14,7 @@ import io.circe.optics.JsonPath._ import io.circe.{Json, JsonNumber, JsonObject} import mouse.all._ import org.slf4j.LoggerFactory +import wom.util.YamlUtils import scala.concurrent.ExecutionContext @@ -218,7 +219,8 @@ object CwlPreProcessor { } private [preprocessor] def parseYaml(in: String): IOChecked[Json] = { - 
io.circe.yaml.parser.parse(in).leftMap(error => NonEmptyList.one(error.message)).toIOChecked + val yaml = YamlUtils.parse(in) + yaml.leftMap(error => NonEmptyList.one(error.message)).toIOChecked } /** diff --git a/database/migration/src/main/resources/changelog.xml b/database/migration/src/main/resources/changelog.xml index 477309766cf..2accb5c6f94 100644 --- a/database/migration/src/main/resources/changelog.xml +++ b/database/migration/src/main/resources/changelog.xml @@ -76,6 +76,8 @@ + + - + - + - + - + Adding some tracking columns for determining eligibility for Call Result Caching. @@ -38,4 +38,4 @@ constraintName="FK_RESULTS_CLONED_FROM" onDelete="SET NULL" /> - \ No newline at end of file + diff --git a/database/migration/src/main/resources/changesets/callcaching.xml b/database/migration/src/main/resources/changesets/callcaching.xml index 69f87ef6554..391c7a6d26e 100644 --- a/database/migration/src/main/resources/changesets/callcaching.xml +++ b/database/migration/src/main/resources/changesets/callcaching.xml @@ -6,7 +6,7 @@ - + One row per cached job result. Stores meta info about which job the result came from. @@ -41,12 +41,12 @@ - + - + One row per hashkey per call cache meta info. Allows us to link hash keys and values to any matching call cache results. @@ -71,13 +71,13 @@ - + - + - + One row per result simpleton in the job result. Simpleton: a single non-complex WDL value. @@ -115,13 +115,13 @@ - + - + - + Change unique constraint for Execution Table to include IDX column. For MySQL this requires first dropping the foreign key constraint, which we then restore after adding back the enhanced diff --git a/database/migration/src/main/resources/changesets/change_max_size_label_entry.xml b/database/migration/src/main/resources/changesets/change_max_size_label_entry.xml index dd82e754bd3..22bf0f6f760 100644 --- a/database/migration/src/main/resources/changesets/change_max_size_label_entry.xml +++ b/database/migration/src/main/resources/changesets/change_max_size_label_entry.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/change_max_size_workflow_url.xml b/database/migration/src/main/resources/changesets/change_max_size_workflow_url.xml index b549371b65d..d9512d4626a 100644 --- a/database/migration/src/main/resources/changesets/change_max_size_workflow_url.xml +++ b/database/migration/src/main/resources/changesets/change_max_size_workflow_url.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/custom_label_entry.xml b/database/migration/src/main/resources/changesets/custom_label_entry.xml index 49f09da8534..a207982f0bc 100644 --- a/database/migration/src/main/resources/changesets/custom_label_entry.xml +++ b/database/migration/src/main/resources/changesets/custom_label_entry.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + @@ -20,7 +20,7 @@ - + - + - + diff --git a/database/migration/src/main/resources/changesets/db_schema.xml b/database/migration/src/main/resources/changesets/db_schema.xml 
index 6fa0c397eee..1aad77dd3d7 100644 --- a/database/migration/src/main/resources/changesets/db_schema.xml +++ b/database/migration/src/main/resources/changesets/db_schema.xml @@ -2,7 +2,7 @@ - + @@ -104,7 +104,7 @@ - + @@ -131,7 +131,7 @@ - + diff --git a/database/migration/src/main/resources/changesets/docker_hash_store.xml b/database/migration/src/main/resources/changesets/docker_hash_store.xml index 7ec9fbbc4ff..37b3971daf1 100644 --- a/database/migration/src/main/resources/changesets/docker_hash_store.xml +++ b/database/migration/src/main/resources/changesets/docker_hash_store.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Temporary storage area for docker hashes from workflows that are still in progress. @@ -26,7 +26,7 @@ - + - + Add a size column corresponding to the sum of all the layers size from the manifest diff --git a/database/migration/src/main/resources/changesets/drop_workflow_uri_and_local_command.xml b/database/migration/src/main/resources/changesets/drop_workflow_uri_and_local_command.xml index 645a496c41b..863c31733ed 100644 --- a/database/migration/src/main/resources/changesets/drop_workflow_uri_and_local_command.xml +++ b/database/migration/src/main/resources/changesets/drop_workflow_uri_and_local_command.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Workflow URI is not needed in the DB. Local jobs don't need to store the command either. diff --git a/database/migration/src/main/resources/changesets/embiggen_detritus_value.xml b/database/migration/src/main/resources/changesets/embiggen_detritus_value.xml index 096f17a4c63..d1c9c397b57 100644 --- a/database/migration/src/main/resources/changesets/embiggen_detritus_value.xml +++ b/database/migration/src/main/resources/changesets/embiggen_detritus_value.xml @@ -6,7 +6,7 @@ - + diff --git a/database/migration/src/main/resources/changesets/embiggen_metadata_value.xml b/database/migration/src/main/resources/changesets/embiggen_metadata_value.xml index e0b1ed38d85..d965a9b9ee1 100644 --- a/database/migration/src/main/resources/changesets/embiggen_metadata_value.xml +++ b/database/migration/src/main/resources/changesets/embiggen_metadata_value.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + @@ -25,7 +25,7 @@ - + @@ -36,7 +36,7 @@ - + diff --git a/database/migration/src/main/resources/changesets/encrypt_and_clear_workflow_options.xml b/database/migration/src/main/resources/changesets/encrypt_and_clear_workflow_options.xml index 2fe2d766100..e44422d1fda 100644 --- a/database/migration/src/main/resources/changesets/encrypt_and_clear_workflow_options.xml +++ b/database/migration/src/main/resources/changesets/encrypt_and_clear_workflow_options.xml @@ -3,11 +3,11 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + - + diff --git a/database/migration/src/main/resources/changesets/enlarge_call_caching_hash_entry_id.xml b/database/migration/src/main/resources/changesets/enlarge_call_caching_hash_entry_id.xml new file mode 
100644 index 00000000000..e875f45dc0d --- /dev/null +++ b/database/migration/src/main/resources/changesets/enlarge_call_caching_hash_entry_id.xml @@ -0,0 +1,24 @@ + + + + + + + + Factored into a separate changeset from the above to allow for handling various RDBMS implementations differently. + + + + + diff --git a/database/migration/src/main/resources/changesets/events_table.xml b/database/migration/src/main/resources/changesets/events_table.xml index 47014b6f8fb..681ba71c452 100644 --- a/database/migration/src/main/resources/changesets/events_table.xml +++ b/database/migration/src/main/resources/changesets/events_table.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/execution_backend_info.xml b/database/migration/src/main/resources/changesets/execution_backend_info.xml index 1368b9d1b4e..551112a6321 100644 --- a/database/migration/src/main/resources/changesets/execution_backend_info.xml +++ b/database/migration/src/main/resources/changesets/execution_backend_info.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + @@ -20,14 +20,14 @@ - + - + @@ -35,7 +35,7 @@ - + insert into EXECUTION_INFO(EXECUTION_ID, INFO_KEY, INFO_VALUE) select EXECUTION_ID, "JES_RUN_ID", JES_RUN_ID from JES_JOB; @@ -51,7 +51,7 @@ - + update EXECUTION e set BACKEND_TYPE = 'JES' where exists (select 1 from JES_JOB jj where jj.EXECUTION_ID = e.EXECUTION_ID); @@ -64,19 +64,19 @@ - + - + - + diff --git a/database/migration/src/main/resources/changesets/failure_table.xml b/database/migration/src/main/resources/changesets/failure_table.xml index 295e8d366ba..c00a73edc5b 100644 --- a/database/migration/src/main/resources/changesets/failure_table.xml +++ b/database/migration/src/main/resources/changesets/failure_table.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/jes_id_update.xml b/database/migration/src/main/resources/changesets/jes_id_update.xml index cbf10fd7f50..b28ec05ac0f 100644 --- a/database/migration/src/main/resources/changesets/jes_id_update.xml +++ b/database/migration/src/main/resources/changesets/jes_id_update.xml @@ -2,7 +2,7 @@ - + - + Temporary storage area for completed jobs which belong to workflows that are still in progress. @@ -49,14 +49,14 @@ - + - + diff --git a/database/migration/src/main/resources/changesets/job_store_simpletons.xml b/database/migration/src/main/resources/changesets/job_store_simpletons.xml index a251021bfb0..d6e60308018 100644 --- a/database/migration/src/main/resources/changesets/job_store_simpletons.xml +++ b/database/migration/src/main/resources/changesets/job_store_simpletons.xml @@ -6,7 +6,7 @@ - + One row per result simpleton in the job result. Simpleton: a single non-complex WDL value. @@ -35,13 +35,13 @@ - + - + - + There is no attempt at migrating the contents of JOB_STORE.JOB_OUTPUTS to simpletons, this just removes the column. 
diff --git a/database/migration/src/main/resources/changesets/job_store_tinyints.xml b/database/migration/src/main/resources/changesets/job_store_tinyints.xml index 1e91a696f09..cfe263d18bc 100644 --- a/database/migration/src/main/resources/changesets/job_store_tinyints.xml +++ b/database/migration/src/main/resources/changesets/job_store_tinyints.xml @@ -3,23 +3,23 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + - + - + - + diff --git a/database/migration/src/main/resources/changesets/lengthen_wdl_value.xml b/database/migration/src/main/resources/changesets/lengthen_wdl_value.xml index a138d58e30c..6595fe7f194 100644 --- a/database/migration/src/main/resources/changesets/lengthen_wdl_value.xml +++ b/database/migration/src/main/resources/changesets/lengthen_wdl_value.xml @@ -4,7 +4,7 @@ xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + WDL_VALUE should accept large strings diff --git a/database/migration/src/main/resources/changesets/local_job_allow_null.xml b/database/migration/src/main/resources/changesets/local_job_allow_null.xml index a4b564f1eeb..81598f88da6 100644 --- a/database/migration/src/main/resources/changesets/local_job_allow_null.xml +++ b/database/migration/src/main/resources/changesets/local_job_allow_null.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + The local jobs don't have process ID and result codes at the start. diff --git a/database/migration/src/main/resources/changesets/metadata_journal.xml b/database/migration/src/main/resources/changesets/metadata_journal.xml index 0eeedc27a5a..99ba1d9382f 100644 --- a/database/migration/src/main/resources/changesets/metadata_journal.xml +++ b/database/migration/src/main/resources/changesets/metadata_journal.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + @@ -36,14 +36,14 @@ - + - + @@ -53,7 +53,7 @@ - + diff --git a/database/migration/src/main/resources/changesets/metadata_journal_subsecond_timestamp.xml b/database/migration/src/main/resources/changesets/metadata_journal_subsecond_timestamp.xml index ae463a19225..d1751af3d9a 100644 --- a/database/migration/src/main/resources/changesets/metadata_journal_subsecond_timestamp.xml +++ b/database/migration/src/main/resources/changesets/metadata_journal_subsecond_timestamp.xml @@ -3,14 +3,14 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + - + - + diff --git a/database/migration/src/main/resources/changesets/move_sql_metadata_changelog.xml b/database/migration/src/main/resources/changesets/move_sql_metadata_changelog.xml index dc6c81b59cc..2edd9137aad 100644 --- a/database/migration/src/main/resources/changesets/move_sql_metadata_changelog.xml +++ b/database/migration/src/main/resources/changesets/move_sql_metadata_changelog.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog 
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + SELECT COUNT(1) FROM METADATA_ENTRY diff --git a/database/migration/src/main/resources/changesets/nullable_lobs.xml b/database/migration/src/main/resources/changesets/nullable_lobs.xml index 13783585c79..9f977a0aed6 100644 --- a/database/migration/src/main/resources/changesets/nullable_lobs.xml +++ b/database/migration/src/main/resources/changesets/nullable_lobs.xml @@ -2,7 +2,7 @@ - + - + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/database/migration/src/main/resources/changesets/rc.xml b/database/migration/src/main/resources/changesets/rc.xml index 5f03d69ea35..b0425f5e42d 100644 --- a/database/migration/src/main/resources/changesets/rc.xml +++ b/database/migration/src/main/resources/changesets/rc.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Refactor the RC column off LOCAL_JOB up into EXECUTION since it should be usable by all backends. @@ -11,4 +11,4 @@ - \ No newline at end of file + diff --git a/database/migration/src/main/resources/changesets/remove_pre_pbe_tables.xml b/database/migration/src/main/resources/changesets/remove_pre_pbe_tables.xml index e2892b91b93..613eb7bb2ab 100644 --- a/database/migration/src/main/resources/changesets/remove_pre_pbe_tables.xml +++ b/database/migration/src/main/resources/changesets/remove_pre_pbe_tables.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Remove the old pre-pluggable backend tables. 
diff --git a/database/migration/src/main/resources/changesets/rename_iteration_to_index.xml b/database/migration/src/main/resources/changesets/rename_iteration_to_index.xml index a85483f7908..afd71b2ae83 100644 --- a/database/migration/src/main/resources/changesets/rename_iteration_to_index.xml +++ b/database/migration/src/main/resources/changesets/rename_iteration_to_index.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + - + - \ No newline at end of file + diff --git a/database/migration/src/main/resources/changesets/rename_workflow_options_in_metadata.xml b/database/migration/src/main/resources/changesets/rename_workflow_options_in_metadata.xml index b189e535a8c..7f1cff48ca6 100644 --- a/database/migration/src/main/resources/changesets/rename_workflow_options_in_metadata.xml +++ b/database/migration/src/main/resources/changesets/rename_workflow_options_in_metadata.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/replace_empty_custom_labels.xml b/database/migration/src/main/resources/changesets/replace_empty_custom_labels.xml index 508497a8912..da7b346d3e0 100644 --- a/database/migration/src/main/resources/changesets/replace_empty_custom_labels.xml +++ b/database/migration/src/main/resources/changesets/replace_empty_custom_labels.xml @@ -7,7 +7,7 @@ - + - + Restart/recover migration from 0.19 to 0.21. @@ -31,7 +31,7 @@ - + Restart/recover migration from 0.19 to 0.21. @@ -76,7 +76,7 @@ - + @@ -93,14 +93,14 @@ columnDataType="LONGTEXT"/> - + Restart/recover migration from 0.19 to 0.21. - + Restart/recover migration from 0.19 to 0.21. @@ -130,7 +130,7 @@ - + Restart/recover migration from 0.19 to 0.21. 
diff --git a/database/migration/src/main/resources/changesets/runtime_attributes_table.xml b/database/migration/src/main/resources/changesets/runtime_attributes_table.xml index 4b4cfd368cc..a5ea9bb4611 100644 --- a/database/migration/src/main/resources/changesets/runtime_attributes_table.xml +++ b/database/migration/src/main/resources/changesets/runtime_attributes_table.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/sge.xml b/database/migration/src/main/resources/changesets/sge.xml index a3bca146662..aab8faa4b92 100644 --- a/database/migration/src/main/resources/changesets/sge.xml +++ b/database/migration/src/main/resources/changesets/sge.xml @@ -2,7 +2,7 @@ - + @@ -21,7 +21,7 @@ - + diff --git a/database/migration/src/main/resources/changesets/sge_job_execution_unique_key.xml b/database/migration/src/main/resources/changesets/sge_job_execution_unique_key.xml index 71aa97435db..7a362016bc2 100644 --- a/database/migration/src/main/resources/changesets/sge_job_execution_unique_key.xml +++ b/database/migration/src/main/resources/changesets/sge_job_execution_unique_key.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Adds unique constraints UK_SGE_JOB_EXECUTION_UUID. diff --git a/database/migration/src/main/resources/changesets/standardize_column_names.xml b/database/migration/src/main/resources/changesets/standardize_column_names.xml index a309c8ea28b..50d26c429b4 100644 --- a/database/migration/src/main/resources/changesets/standardize_column_names.xml +++ b/database/migration/src/main/resources/changesets/standardize_column_names.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + Change all Workflow UUID column names to Workflow Execution UUID. @@ -21,7 +21,7 @@ tableName="JOB_STORE"/> - + Choose and implement common call/job identifiers. 
@@ -39,4 +39,4 @@ tableName="METADATA_JOURNAL"/> - \ No newline at end of file + diff --git a/database/migration/src/main/resources/changesets/standardize_column_names_again.xml b/database/migration/src/main/resources/changesets/standardize_column_names_again.xml index 111bcd74fed..a98e35f3448 100644 --- a/database/migration/src/main/resources/changesets/standardize_column_names_again.xml +++ b/database/migration/src/main/resources/changesets/standardize_column_names_again.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/workflow_store_state_widening.xml b/database/migration/src/main/resources/changesets/workflow_store_state_widening.xml index 157fd43bd3b..5d3b37bce87 100644 --- a/database/migration/src/main/resources/changesets/workflow_store_state_widening.xml +++ b/database/migration/src/main/resources/changesets/workflow_store_state_widening.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/workflow_store_type_and_version.xml b/database/migration/src/main/resources/changesets/workflow_store_type_and_version.xml index bdb38ad09ed..59876e1dffa 100644 --- a/database/migration/src/main/resources/changesets/workflow_store_type_and_version.xml +++ b/database/migration/src/main/resources/changesets/workflow_store_type_and_version.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/changesets/workflow_store_workflow_root_column.xml b/database/migration/src/main/resources/changesets/workflow_store_workflow_root_column.xml index dab1df1b71e..7358ffd4347 100644 --- a/database/migration/src/main/resources/changesets/workflow_store_workflow_root_column.xml +++ b/database/migration/src/main/resources/changesets/workflow_store_workflow_root_column.xml @@ -3,7 +3,7 @@ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.3.xsd"> - + diff --git a/database/migration/src/main/resources/metadata_changesets/postgresql_metadata_schema.xml b/database/migration/src/main/resources/metadata_changesets/postgresql_metadata_schema.xml new file mode 100644 index 00000000000..d1c5835954a --- /dev/null +++ b/database/migration/src/main/resources/metadata_changesets/postgresql_metadata_schema.xml @@ -0,0 +1,130 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/database/migration/src/main/resources/sql_metadata_changelog.xml b/database/migration/src/main/resources/sql_metadata_changelog.xml index 05500b36bc5..809942372cc 100644 --- a/database/migration/src/main/resources/sql_metadata_changelog.xml +++ b/database/migration/src/main/resources/sql_metadata_changelog.xml @@ -11,5 +11,6 @@ + diff --git 
a/database/sql/src/main/scala/cromwell/database/slick/MetadataSlickDatabase.scala b/database/sql/src/main/scala/cromwell/database/slick/MetadataSlickDatabase.scala index 91dbc44c3a0..d28801e7fb5 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/MetadataSlickDatabase.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/MetadataSlickDatabase.scala @@ -2,7 +2,6 @@ package cromwell.database.slick import java.sql.Timestamp -import cats.data.NonEmptyList import com.typesafe.config.{Config, ConfigFactory} import cromwell.database.slick.tables.MetadataDataAccessComponent import cromwell.database.sql.MetadataSqlDatabase @@ -35,7 +34,7 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) override def addMetadataEntries(metadataEntries: Iterable[MetadataEntry]) (implicit ec: ExecutionContext): Future[Unit] = { val action = DBIO.seq(metadataEntries.grouped(insertBatchSize).map(dataAccess.metadataEntries ++= _).toSeq:_*) - runAction(action) + runLobAction(action) } override def metadataEntryExists(workflowExecutionUuid: String)(implicit ec: ExecutionContext): Future[Boolean] = { @@ -83,29 +82,18 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) runTransaction(action) } - override def queryMetadataEntriesLikeMetadataKeys(workflowExecutionUuid: String, - metadataKeys: NonEmptyList[String], + override def queryMetadataEntryWithKeyConstraints(workflowExecutionUuid: String, + metadataKeysToFilterFor: List[String], + metadataKeysToFilterOut: List[String], metadataJobQueryValue: MetadataJobQueryValue) (implicit ec: ExecutionContext): Future[Seq[MetadataEntry]] = { val action = metadataJobQueryValue match { case CallQuery(callFqn, jobIndex, jobAttempt) => - dataAccess.metadataEntriesLikeMetadataKeysWithJob(workflowExecutionUuid, metadataKeys, callFqn, jobIndex, jobAttempt).result - case WorkflowQuery => dataAccess.metadataEntriesLikeMetadataKeys(workflowExecutionUuid, metadataKeys, requireEmptyJobKey = true).result - case CallOrWorkflowQuery => dataAccess.metadataEntriesLikeMetadataKeys(workflowExecutionUuid, metadataKeys, requireEmptyJobKey = false).result - } - - runTransaction(action) - } - - override def queryMetadataEntryNotLikeMetadataKeys(workflowExecutionUuid: String, - metadataKeys: NonEmptyList[String], - metadataJobQueryValue: MetadataJobQueryValue) - (implicit ec: ExecutionContext): Future[Seq[MetadataEntry]] = { - val action = metadataJobQueryValue match { - case CallQuery(callFqn, jobIndex, jobAttempt) => - dataAccess.metadataEntriesNotLikeMetadataKeysWithJob(workflowExecutionUuid, metadataKeys, callFqn, jobIndex, jobAttempt).result - case WorkflowQuery => dataAccess.metadataEntriesNotLikeMetadataKeys(workflowExecutionUuid, metadataKeys, requireEmptyJobKey = true).result - case CallOrWorkflowQuery => dataAccess.metadataEntriesNotLikeMetadataKeys(workflowExecutionUuid, metadataKeys, requireEmptyJobKey = false).result + dataAccess.metadataEntriesForJobWithKeyConstraints(workflowExecutionUuid, metadataKeysToFilterFor, metadataKeysToFilterOut, callFqn, jobIndex, jobAttempt).result + case WorkflowQuery => + dataAccess.metadataEntriesWithKeyConstraints(workflowExecutionUuid, metadataKeysToFilterFor, metadataKeysToFilterOut, requireEmptyJobKey = true).result + case CallOrWorkflowQuery => + dataAccess.metadataEntriesWithKeyConstraints(workflowExecutionUuid, metadataKeysToFilterFor, metadataKeysToFilterOut, requireEmptyJobKey = false).result } runTransaction(action) } @@ -186,12 +174,12 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) 
buildUpdatedSummary: (Option[WorkflowMetadataSummaryEntry], Seq[MetadataEntry]) => WorkflowMetadataSummaryEntry) - (implicit ec: ExecutionContext): Future[Long] = { + (implicit ec: ExecutionContext): Future[(Long, Long)] = { val action = for { previousMetadataEntryIdOption <- getSummaryStatusEntrySummaryPosition(summarizeNameIncreasing) previousMaxMetadataEntryId = previousMetadataEntryIdOption.getOrElse(-1L) nextMaxMetadataEntryId = previousMaxMetadataEntryId + limit - maximumMetadataEntryId <- summarizeMetadata( + maximumMetadataEntryIdConsidered <- summarizeMetadata( minMetadataEntryId = previousMaxMetadataEntryId + 1L, maxMetadataEntryId = nextMaxMetadataEntryId, startMetadataKey = startMetadataKey, @@ -220,7 +208,12 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) }, summaryName = summarizeNameIncreasing ) - } yield maximumMetadataEntryId + maximumMetadataEntryIdInTableOption <- dataAccess.metadataEntries.map(_.metadataEntryId).max.result + maximumMetadataEntryIdInTable = maximumMetadataEntryIdInTableOption.getOrElse { + // TODO: Add a logging framework to this 'database' project and log this weirdness. + maximumMetadataEntryIdConsidered + } + } yield (maximumMetadataEntryIdConsidered - previousMaxMetadataEntryId, maximumMetadataEntryIdInTable - maximumMetadataEntryIdConsidered) runTransaction(action) } @@ -239,7 +232,7 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) buildUpdatedSummary: (Option[WorkflowMetadataSummaryEntry], Seq[MetadataEntry]) => WorkflowMetadataSummaryEntry) - (implicit ec: ExecutionContext): Future[Long] = { + (implicit ec: ExecutionContext): Future[(Long, Long)] = { val action = for { previousExistingMetadataEntryIdOption <- getSummaryStatusEntrySummaryPosition(summaryNameDecreasing) previousInitializedMetadataEntryIdOption <- previousExistingMetadataEntryIdOption match { @@ -266,7 +259,8 @@ class MetadataSlickDatabase(originalDatabaseConfig: Config) summaryName = summaryNameDecreasing ) } - } yield newMinimumMetadataEntryId + rowsProcessed = previousExistingMetadataEntryIdOption.map(_ - newMinimumMetadataEntryId).getOrElse(0L) + } yield (rowsProcessed, newMinimumMetadataEntryId) runTransaction(action) } diff --git a/database/sql/src/main/scala/cromwell/database/slick/SlickDatabase.scala b/database/sql/src/main/scala/cromwell/database/slick/SlickDatabase.scala index 725ee999451..b1d68f5c590 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/SlickDatabase.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/SlickDatabase.scala @@ -8,6 +8,7 @@ import com.typesafe.config.{Config, ConfigFactory} import cromwell.database.slick.tables.DataAccessComponent import cromwell.database.sql.SqlDatabase import net.ceedubs.ficus.Ficus._ +import org.postgresql.util.{PSQLException, ServerErrorMessage} import org.slf4j.LoggerFactory import slick.basic.DatabaseConfig import slick.jdbc.{JdbcCapabilities, JdbcProfile, TransactionIsolation} @@ -58,6 +59,7 @@ abstract class SlickDatabase(override val originalDatabaseConfig: Config) extend override val urlKey = SlickDatabase.urlKey(originalDatabaseConfig) protected val slickConfig = DatabaseConfig.forConfig[JdbcProfile]("", databaseConfig) + lazy val isPostgresql = databaseConfig.getOrElse("db.driver", "unknown") == "org.postgresql.Driver" /* Not a def because we need to have a "stable identifier" for the imports below. 
@@ -167,10 +169,22 @@ abstract class SlickDatabase(override val originalDatabaseConfig: Config) extend runActionInternal(action.transactionally.withTransactionIsolation(isolationLevel)) } + /* Note that this is only appropriate for actions that do not involve Blob + * or Clob fields in Postgres, since large object support requires running + * transactionally. Use runLobAction instead, which will still run in + * auto-commit mode when using other database engines. + */ protected[this] def runAction[R](action: DBIO[R]): Future[R] = { runActionInternal(action.withPinnedSession) } + /* Wrapper for queries where Clob/Blob types are used + * https://stackoverflow.com/questions/3164072/large-objects-may-not-be-used-in-auto-commit-mode#answer-3164352 + */ + protected[this] def runLobAction[R](action: DBIO[R]): Future[R] = { + if (isPostgresql) runTransaction(action) else runAction(action) + } + private def runActionInternal[R](action: DBIO[R]): Future[R] = { //database.run(action) <-- See comment above private val actionThreadPool Future { @@ -186,6 +200,33 @@ abstract class SlickDatabase(override val originalDatabaseConfig: Config) extend case _ => /* keep going */ } throw rollbackException + case pSQLException: PSQLException => + val detailOption = for { + message <- Option(pSQLException.getServerErrorMessage) + detail <- Option(message.getDetail) + } yield detail + + detailOption match { + case None => throw pSQLException + case Some(_) => + /* + The exception may contain possibly sensitive row contents within the DETAIL section. Remove it. + + Tried adjusting this using configuration: + - log_error_verbosity=TERSE + - log_min_messages=PANIC + - client_min_messages=ERROR + + Instead resorting to reflection. + */ + val message = pSQLException.getServerErrorMessage + val field = classOf[ServerErrorMessage].getDeclaredField("m_mesgParts") + field.setAccessible(true) + val parts = field.get(message).asInstanceOf[java.util.Map[Character, String]] + parts.remove('D') + // The original exception has already stored the DETAIL into a string. So we must create a new Exception. 
+ throw new PSQLException(message) + } } }(actionExecutionContext) } diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingDetritusEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingDetritusEntryComponent.scala index c36b8047d11..db7c1f3d826 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingDetritusEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingDetritusEntryComponent.scala @@ -1,6 +1,6 @@ package cromwell.database.slick.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob import cromwell.database.sql.tables.CallCachingDetritusEntry @@ -16,7 +16,7 @@ trait CallCachingDetritusEntryComponent { def detritusKey = column[String]("DETRITUS_KEY", O.Length(255)) - def detritusValue = column[Option[Clob]]("DETRITUS_VALUE") + def detritusValue = column[Option[SerialClob]]("DETRITUS_VALUE") def callCachingEntryId = column[Int]("CALL_CACHING_ENTRY_ID") diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingHashEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingHashEntryComponent.scala index 017a8149f68..1a8e3e772f1 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingHashEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingHashEntryComponent.scala @@ -9,7 +9,7 @@ trait CallCachingHashEntryComponent { import driver.api._ class CallCachingHashEntries(tag: Tag) extends Table[CallCachingHashEntry](tag, "CALL_CACHING_HASH_ENTRY") { - def callCachingHashEntryId = column[Int]("CALL_CACHING_HASH_ENTRY_ID", O.PrimaryKey, O.AutoInc) + def callCachingHashEntryId = column[Long]("CALL_CACHING_HASH_ENTRY_ID", O.PrimaryKey, O.AutoInc) def hashKey = column[String]("HASH_KEY", O.Length(255)) @@ -36,9 +36,9 @@ trait CallCachingHashEntryComponent { * Find all hashes for a CALL_CACHING_ENTRY_ID */ val callCachingHashEntriesForCallCachingEntryId = Compiled( - (callCachingHashEntryId: Rep[Int]) => for { + (callCachingEntryId: Rep[Int]) => for { callCachingHashEntry <- callCachingHashEntries - if callCachingHashEntry.callCachingEntryId === callCachingHashEntryId + if callCachingHashEntry.callCachingEntryId === callCachingEntryId } yield callCachingHashEntry ) } diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingSimpletonEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingSimpletonEntryComponent.scala index 7170ae9025c..38a095a9682 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingSimpletonEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/CallCachingSimpletonEntryComponent.scala @@ -1,6 +1,6 @@ package cromwell.database.slick.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob import cromwell.database.sql.tables.CallCachingSimpletonEntry @@ -16,7 +16,7 @@ trait CallCachingSimpletonEntryComponent { def simpletonKey = column[String]("SIMPLETON_KEY", O.Length(255)) - def simpletonValue = column[Option[Clob]]("SIMPLETON_VALUE") + def simpletonValue = column[Option[SerialClob]]("SIMPLETON_VALUE") def wdlType = column[String]("WDL_TYPE", O.Length(255)) diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/CustomLabelEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/CustomLabelEntryComponent.scala index 
8121b3705ae..153b6815810 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/CustomLabelEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/CustomLabelEntryComponent.scala @@ -1,12 +1,14 @@ package cromwell.database.slick.tables import cromwell.database.sql.tables.CustomLabelEntry +import shapeless.syntax.std.tuple._ import slick.model.ForeignKeyAction.Cascade trait CustomLabelEntryComponent { this: DriverComponent with WorkflowMetadataSummaryEntryComponent => + import driver.api.TupleMethods._ import driver.api._ class CustomLabelEntries(tag: Tag) @@ -19,8 +21,14 @@ trait CustomLabelEntryComponent { def workflowExecutionUuid = column[String]("WORKFLOW_EXECUTION_UUID", O.Length(100)) - override def * = (customLabelKey, customLabelValue, workflowExecutionUuid, - customLabelEntryId.?) <> (CustomLabelEntry.tupled, CustomLabelEntry.unapply) + def baseProjection = (customLabelKey, customLabelValue, workflowExecutionUuid) + + override def * = baseProjection ~ customLabelEntryId.? <> (CustomLabelEntry.tupled, CustomLabelEntry.unapply) + + def forUpdate = baseProjection.shaped <> ( + tuple => CustomLabelEntry.tupled(tuple :+ None), + CustomLabelEntry.unapply(_: CustomLabelEntry).map(_.reverse.tail.reverse) + ) def fkCustomLabelEntryWorkflowExecutionUuid = foreignKey("FK_CUSTOM_LABEL_ENTRY_WORKFLOW_EXECUTION_UUID", workflowExecutionUuid, workflowMetadataSummaryEntries)(_.workflowExecutionUuid, onDelete = Cascade) @@ -41,7 +49,7 @@ trait CustomLabelEntryComponent { customLabelEntry <- customLabelEntries if customLabelEntry.workflowExecutionUuid === workflowExecutionUuid && customLabelEntry.customLabelKey === labelKey - } yield customLabelEntry) + } yield customLabelEntry.forUpdate) def existsWorkflowIdLabelKeyAndValue(workflowId: Rep[String], labelKey: Rep[String], diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/DriverComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/DriverComponent.scala index d5f78601862..f9343883f4d 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/DriverComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/DriverComponent.scala @@ -1,7 +1,55 @@ package cromwell.database.slick.tables -import slick.jdbc.JdbcProfile +import java.sql.{Blob, Clob} + +import javax.sql.rowset.serial.{SerialBlob, SerialClob} +import org.apache.commons.io.IOUtils +import slick.jdbc.{JdbcProfile, PostgresProfile} trait DriverComponent { val driver: JdbcProfile + + import driver.api._ + + /** Ensure clobs are retrieved inside the transaction, not after */ + implicit val serialClobColumnType = MappedColumnType.base[SerialClob, Clob]( + identity, + { + case serialClob: SerialClob => serialClob + case clob => + /* + PostgreSQL's JDBC driver has issues with non-ascii characters. + https://stackoverflow.com/questions/5043992/postgres-utf-8-clobs-with-jdbc + + It returns bad values for length() and getAsciiStream(), and causes an extra null bytes to be added at the end + of the resultant SerialClob. + + Example via copy_workflow_outputs/unscattered.wdl: + + "... Enfin un peu de francais pour contrer ce raz-de-marée anglais ! ..." + + The 'é' in results in an extra null byte at the end of getAsciiStream(). 
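As a standalone illustration of the mapping described above (a sketch assuming only commons-io on the classpath, not Cromwell's actual wiring): the driver-provided Clob is read through its character stream rather than getAsciiStream, then re-wrapped in a SerialClob, so the value both survives the end of the transaction and round-trips non-ASCII characters such as 'é' without a trailing null byte.

```scala
import java.sql.Clob
import javax.sql.rowset.serial.SerialClob
import org.apache.commons.io.IOUtils

// Materialize a Clob eagerly: SerialClobs pass through unchanged; anything else is
// copied via its character stream, since the byte-oriented getAsciiStream mishandles
// multi-byte characters on the PostgreSQL driver.
def materialize(clob: Clob): SerialClob = clob match {
  case serial: SerialClob => serial
  case other => new SerialClob(IOUtils.toString(other.getCharacterStream).toCharArray)
}
```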
+ */ + val string = IOUtils.toString(clob.getCharacterStream) + new SerialClob(string.toCharArray) + } + ) + + /** Ensure clobs are retrieved inside the transaction, not after */ + implicit val serialBlobColumnType = MappedColumnType.base[SerialBlob, Blob]( + identity, + { + case serialBlob: SerialBlob => serialBlob + case blob => new SerialBlob(blob) + } + ) + + private val shouldQuote = this.driver match { + // https://stackoverflow.com/questions/43111996/why-postgresql-does-not-like-uppercase-table-names#answer-43112096 + case PostgresProfile => true + case _ => false + } + + /** Adds quotes around the string if required by the DBMS. */ + def quoted(string: String) = if (shouldQuote) s""""$string"""" else string } diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreEntryComponent.scala index 6422e535ef7..de5bfe57698 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreEntryComponent.scala @@ -1,6 +1,6 @@ package cromwell.database.slick.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob import cromwell.database.sql.tables.JobStoreEntry @@ -26,7 +26,7 @@ trait JobStoreEntryComponent { def returnCode = column[Option[Int]]("RETURN_CODE") // Only set for failure: - def exceptionMessage = column[Option[Clob]]("EXCEPTION_MESSAGE") + def exceptionMessage = column[Option[SerialClob]]("EXCEPTION_MESSAGE") def retryableFailure = column[Option[Boolean]]("RETRYABLE_FAILURE") diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreSimpletonEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreSimpletonEntryComponent.scala index 40d3f094ea3..e2e9c83dac9 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreSimpletonEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/JobStoreSimpletonEntryComponent.scala @@ -1,6 +1,6 @@ package cromwell.database.slick.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob import cromwell.database.sql.tables.JobStoreSimpletonEntry import slick.model.ForeignKeyAction.Cascade @@ -16,7 +16,7 @@ trait JobStoreSimpletonEntryComponent { def simpletonKey = column[String]("SIMPLETON_KEY", O.Length(255)) - def simpletonValue = column[Option[Clob]]("SIMPLETON_VALUE") + def simpletonValue = column[Option[SerialClob]]("SIMPLETON_VALUE") def wdlType = column[String]("WDL_TYPE", O.Length(255)) diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/MetadataEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/MetadataEntryComponent.scala index 711cde84f56..6e440fdeebf 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/MetadataEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/MetadataEntryComponent.scala @@ -1,8 +1,8 @@ package cromwell.database.slick.tables -import java.sql.{Clob, Timestamp} +import java.sql.Timestamp +import javax.sql.rowset.serial.SerialClob -import cats.data.NonEmptyList import cromwell.database.sql.tables.MetadataEntry trait MetadataEntryComponent { @@ -36,7 +36,7 @@ trait MetadataEntryComponent { def metadataKey = column[String]("METADATA_KEY", O.Length(255)) - def metadataValue = column[Option[Clob]]("METADATA_VALUE") + def metadataValue = 
column[Option[SerialClob]]("METADATA_VALUE") def metadataValueType = column[Option[String]]("METADATA_VALUE_TYPE", O.Length(10)) @@ -152,12 +152,14 @@ trait MetadataEntryComponent { * If requireEmptyJobKey is true, only workflow level keys are returned, otherwise both workflow and call level * keys are returned. */ - def metadataEntriesLikeMetadataKeys(workflowExecutionUuid: String, metadataKeys: NonEmptyList[String], - requireEmptyJobKey: Boolean) = { + def metadataEntriesWithKeyConstraints(workflowExecutionUuid: String, + metadataKeysToFilterFor: List[String], + metadataKeysToFilterOut: List[String], + requireEmptyJobKey: Boolean) = { (for { metadataEntry <- metadataEntries if metadataEntry.workflowExecutionUuid === workflowExecutionUuid - if metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeys) + if metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeysToFilterFor, metadataKeysToFilterOut) if metadataEntryHasEmptyJobKey(metadataEntry, requireEmptyJobKey) } yield metadataEntry).sortBy(_.metadataTimestamp) } @@ -166,59 +168,43 @@ trait MetadataEntryComponent { * Returns metadata entries that are "like" metadataKeys for the specified call. * If jobAttempt has no value, all metadata keys for all attempts are returned. */ - def metadataEntriesLikeMetadataKeysWithJob(workflowExecutionUuid: String, metadataKeys: NonEmptyList[String], - callFqn: String, jobIndex: Option[Int], jobAttempt: Option[Int]) = { + def metadataEntriesForJobWithKeyConstraints(workflowExecutionUuid: String, + metadataKeysToFilterFor: List[String], + metadataKeysToFilterOut: List[String], + callFqn: String, + jobIndex: Option[Int], + jobAttempt: Option[Int]) = { (for { metadataEntry <- metadataEntries if metadataEntry.workflowExecutionUuid === workflowExecutionUuid - if metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeys) + if metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeysToFilterFor, metadataKeysToFilterOut) if metadataEntry.callFullyQualifiedName === callFqn if hasSameIndex(metadataEntry, jobIndex) // Assume that every metadata entry for a call should have a non null attempt value - // Because of that, if the jobAttempt paramater is Some(_), make sure it matches, otherwise take all entries + // Because of that, if the jobAttempt parameter is Some(_), make sure it matches, otherwise take all entries // regardless of the attempt if (metadataEntry.jobAttempt === jobAttempt) || jobAttempt.isEmpty } yield metadataEntry).sortBy(_.metadataTimestamp) } - /** - * Returns metadata entries that are NOT "like" metadataKeys for the specified workflow. - * If requireEmptyJobKey is true, only workflow level keys are returned, otherwise both workflow and call level - * keys are returned. - */ - def metadataEntriesNotLikeMetadataKeys(workflowExecutionUuid: String, metadataKeys: NonEmptyList[String], - requireEmptyJobKey: Boolean) = { - (for { - metadataEntry <- metadataEntries - if metadataEntry.workflowExecutionUuid === workflowExecutionUuid - if !metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeys) - if metadataEntryHasEmptyJobKey(metadataEntry, requireEmptyJobKey) - } yield metadataEntry).sortBy(_.metadataTimestamp) - } + private[this] def metadataEntryHasMetadataKeysLike(metadataEntry: MetadataEntries, + metadataKeysToFilterFor: List[String], + metadataKeysToFilterOut: List[String]): Rep[Boolean] = { - /** - * Returns metadata entries that are NOT "like" metadataKeys for the specified call. - * If jobIndex (resp. jobAttempt) has no value, all metadata keys for all indices (resp. 
attempt) - * are returned. - */ - def metadataEntriesNotLikeMetadataKeysWithJob(workflowExecutionUuid: String, metadataKeys: NonEmptyList[String], - callFqn: String, jobIndex: Option[Int], jobAttempt: Option[Int]) = { - (for { - metadataEntry <- metadataEntries - if metadataEntry.workflowExecutionUuid === workflowExecutionUuid - if !metadataEntryHasMetadataKeysLike(metadataEntry, metadataKeys) - if metadataEntry.callFullyQualifiedName === callFqn - if hasSameIndex(metadataEntry, jobIndex) - // Assume that every metadata entry for a call should have a non null attempt value - // Because of that, if the jobAttempt parameter is Some(_), make sure it matches, otherwise take all entries - // regardless of the attempt - if (metadataEntry.jobAttempt === jobAttempt) || jobAttempt.isEmpty - } yield metadataEntry).sortBy(_.metadataTimestamp) - } + def containsKey(key: String): Rep[Boolean] = metadataEntry.metadataKey like key - private[this] def metadataEntryHasMetadataKeysLike(metadataEntry: MetadataEntries, - metadataKeys: NonEmptyList[String]): Rep[Boolean] = { - metadataKeys.toList.map(metadataEntry.metadataKey like _).reduce(_ || _) + val positiveFilter: Option[Rep[Boolean]] = metadataKeysToFilterFor.map(containsKey).reduceOption(_ || _) + val negativeFilter: Option[Rep[Boolean]] = metadataKeysToFilterOut.map(containsKey).reduceOption(_ || _) + + (positiveFilter, negativeFilter) match { + case (Some(pf), Some(nf)) => pf && !nf + case (Some(pf), None) => pf + case (None, Some(nf)) => !nf + + // We should never get here, but there's no reason not to handle it: + // ps: is there a better literal "true" in slick? + case (None, None) => true: Rep[Boolean] + } } private[this] def hasSameIndex(metadataEntry: MetadataEntries, jobIndex: Rep[Option[Int]]) = { diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowMetadataSummaryEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowMetadataSummaryEntryComponent.scala index 14721104971..d5c5273d245 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowMetadataSummaryEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowMetadataSummaryEntryComponent.scala @@ -1,11 +1,11 @@ package cromwell.database.slick.tables import java.sql.Timestamp -import java.util.concurrent.atomic.AtomicInteger import cats.data import cats.data.NonEmptyList import cromwell.database.sql.tables.WorkflowMetadataSummaryEntry +import shapeless.syntax.std.tuple._ import slick.jdbc.{GetResult, PositionedParameters, SQLActionBuilder} //noinspection SqlDialectInspection @@ -13,6 +13,7 @@ trait WorkflowMetadataSummaryEntryComponent { this: DriverComponent with CustomLabelEntryComponent with MetadataEntryComponent => + import driver.api.TupleMethods._ import driver.api._ class WorkflowMetadataSummaryEntries(tag: Tag) @@ -35,9 +36,15 @@ trait WorkflowMetadataSummaryEntryComponent { def rootWorkflowExecutionUuid = column[Option[String]]("ROOT_WORKFLOW_EXECUTION_UUID", O.Length(100)) - override def * = (workflowExecutionUuid, workflowName, workflowStatus, startTimestamp, endTimestamp, - submissionTimestamp, parentWorkflowExecutionUuid, rootWorkflowExecutionUuid, - workflowMetadataSummaryEntryId.?) 
<> (WorkflowMetadataSummaryEntry.tupled, WorkflowMetadataSummaryEntry.unapply) + def baseProjection = (workflowExecutionUuid, workflowName, workflowStatus, startTimestamp, endTimestamp, + submissionTimestamp, parentWorkflowExecutionUuid, rootWorkflowExecutionUuid) + + override def * = baseProjection ~ workflowMetadataSummaryEntryId.? <> (WorkflowMetadataSummaryEntry.tupled, WorkflowMetadataSummaryEntry.unapply) + + def forUpdate = baseProjection.shaped <> ( + tuple => WorkflowMetadataSummaryEntry.tupled(tuple :+ None), + WorkflowMetadataSummaryEntry.unapply(_: WorkflowMetadataSummaryEntry).map(_.reverse.tail.reverse) + ) def ucWorkflowMetadataSummaryEntryWeu = index("UC_WORKFLOW_METADATA_SUMMARY_ENTRY_WEU", workflowExecutionUuid, unique = true) @@ -63,7 +70,7 @@ trait WorkflowMetadataSummaryEntryComponent { (workflowExecutionUuid: Rep[String]) => for { workflowMetadataSummaryEntry <- workflowMetadataSummaryEntries if workflowMetadataSummaryEntry.workflowExecutionUuid === workflowExecutionUuid - } yield workflowMetadataSummaryEntry) + } yield workflowMetadataSummaryEntry.forUpdate) val workflowMetadataSummaryEntryExistsForWorkflowExecutionUuid = Compiled( (workflowExecutionUuid: Rep[String]) => (for { @@ -81,8 +88,8 @@ trait WorkflowMetadataSummaryEntryComponent { def concat(a: SQLActionBuilder, b: SQLActionBuilder): SQLActionBuilder = { SQLActionBuilder(a.queryParts ++ b.queryParts, (p: Unit, pp: PositionedParameters) => { - a.unitPConv.apply(p, pp) - b.unitPConv.apply(p, pp) + a.unitPConv.apply(p, pp) + b.unitPConv.apply(p, pp) }) } @@ -118,79 +125,130 @@ trait WorkflowMetadataSummaryEntryComponent { endTimestampOption: Option[Timestamp], includeSubworkflows: Boolean): SQLActionBuilder = { - val summaryTableAlias = "summaryTable" - val labelsOrTableAlias = "labelsOrMixin" - val labelsAndTableAliases = labelAndKeyLabelValues.zipWithIndex.map { case (labelPair, i) => s"labelAndTable$i" -> labelPair }.toMap + val customLabelEntryTable = quoted("CUSTOM_LABEL_ENTRY") + val workflowMetadataSummaryEntryTable = quoted("WORKFLOW_METADATA_SUMMARY_ENTRY") + + val workflowExecutionUuidColumn = quoted("WORKFLOW_EXECUTION_UUID") + val customLabelKeyColumn = quoted("CUSTOM_LABEL_KEY") + val customLabelValueColumn = quoted("CUSTOM_LABEL_VALUE") + val parentWorkflowExecutionUuidColumn = quoted("PARENT_WORKFLOW_EXECUTION_UUID") + + val summaryTableAlias = quoted("summaryTable") + val labelsOrTableAlias = quoted("labelsOrMixin") + val labelsAndTableAliases = labelAndKeyLabelValues.zipWithIndex.map { + case (labelPair, i) => quoted(s"labelAndTable$i") -> labelPair + }.toMap + + val selectColumns = List( + "WORKFLOW_EXECUTION_UUID", + "WORKFLOW_NAME", + "WORKFLOW_STATUS", + "START_TIMESTAMP", + "END_TIMESTAMP", + "SUBMISSION_TIMESTAMP", + "PARENT_WORKFLOW_EXECUTION_UUID", + "ROOT_WORKFLOW_EXECUTION_UUID", + "WORKFLOW_METADATA_SUMMARY_ENTRY_ID", + ) + .map(quoted) + .mkString(s"$summaryTableAlias.", ", ", "") val select = selectOrCount match { case Select => - sql"""|SELECT #$summaryTableAlias.WORKFLOW_EXECUTION_UUID, - | #$summaryTableAlias.WORKFLOW_NAME, - | #$summaryTableAlias.WORKFLOW_STATUS, - | #$summaryTableAlias.START_TIMESTAMP, - | #$summaryTableAlias.END_TIMESTAMP, - | #$summaryTableAlias.SUBMISSION_TIMESTAMP, - | #$summaryTableAlias.PARENT_WORKFLOW_EXECUTION_UUID, - | #$summaryTableAlias.ROOT_WORKFLOW_EXECUTION_UUID, - | #$summaryTableAlias.WORKFLOW_METADATA_SUMMARY_ENTRY_ID - | """.stripMargin + sql"""|SELECT #$selectColumns + |""".stripMargin case Count => - sql"""SELECT COUNT(1) - | 
""".stripMargin + sql"""|SELECT COUNT(1) + |""".stripMargin } val labelOrJoin = if (labelOrKeyLabelValues.nonEmpty) { Option( - sql""" JOIN CUSTOM_LABEL_ENTRY #$labelsOrTableAlias on #$summaryTableAlias.WORKFLOW_EXECUTION_UUID = #$labelsOrTableAlias.WORKFLOW_EXECUTION_UUID - | """.stripMargin) + sql"""| JOIN #$customLabelEntryTable #$labelsOrTableAlias + | ON #$summaryTableAlias.#$workflowExecutionUuidColumn + | = #$labelsOrTableAlias.#$workflowExecutionUuidColumn + |""".stripMargin) } else None val labelAndJoins = labelsAndTableAliases.toList.map { case (labelAndTableAlias, _) => - sql""" JOIN CUSTOM_LABEL_ENTRY #$labelAndTableAlias on #$summaryTableAlias.WORKFLOW_EXECUTION_UUID = #$labelAndTableAlias.WORKFLOW_EXECUTION_UUID - | """.stripMargin + sql"""| JOIN #$customLabelEntryTable #$labelAndTableAlias + | ON #$summaryTableAlias.#$workflowExecutionUuidColumn + | = #$labelAndTableAlias.#$workflowExecutionUuidColumn + |""".stripMargin } val from = concatNel(NonEmptyList.of( - sql"""FROM WORKFLOW_METADATA_SUMMARY_ENTRY #$summaryTableAlias - | """.stripMargin) ++ labelOrJoin.toList ++ labelAndJoins ) + sql"""|FROM #$workflowMetadataSummaryEntryTable #$summaryTableAlias + |""".stripMargin) ++ labelOrJoin.toList ++ labelAndJoins) + + def makeSetConstraint(column: String, elements: Set[String]) = { + val list = elements.toList.map(element => sql"""#$summaryTableAlias.#${quoted(column)} = $element""") + NonEmptyList.fromList(list).map(or).toList + } - val statusConstraint = NonEmptyList.fromList(workflowStatuses.toList.map(status => sql"""#$summaryTableAlias.WORKFLOW_STATUS=$status""")).map(or).toList - val nameConstraint = NonEmptyList.fromList(workflowNames.toList.map(name => sql"""#$summaryTableAlias.WORKFLOW_NAME=$name""")).map(or).toList - val idConstraint = NonEmptyList.fromList(workflowExecutionUuids.toList.map(uuid => sql"""#$summaryTableAlias.WORKFLOW_EXECUTION_UUID=$uuid""")).map(or).toList - val submissionTimeConstraint = submissionTimestampOption.map(ts => sql"""#$summaryTableAlias.SUBMISSION_TIMESTAMP>=$ts""").toList - val startTimeConstraint = startTimestampOption.map(ts => sql"""#$summaryTableAlias.START_TIMESTAMP>=$ts""").toList - val endTimeConstraint = endTimestampOption.map(ts => sql"""#$summaryTableAlias.END_TIMESTAMP<=$ts""").toList + def makeTimeConstraint(column: String, comparison: String, elementOption: Option[Timestamp]) = { + elementOption.map(element => sql"""#$summaryTableAlias.#${quoted(column)} #$comparison $element""").toList + } + + val statusConstraint = makeSetConstraint("WORKFLOW_STATUS", workflowStatuses) + val nameConstraint = makeSetConstraint("WORKFLOW_NAME", workflowNames) + val idConstraint = makeSetConstraint("WORKFLOW_EXECUTION_UUID", workflowExecutionUuids) + + val submissionTimeConstraint = makeTimeConstraint("SUBMISSION_TIMESTAMP", ">=", submissionTimestampOption) + val startTimeConstraint = makeTimeConstraint("START_TIMESTAMP", ">=", startTimestampOption) + val endTimeConstraint = makeTimeConstraint("END_TIMESTAMP", "<=", endTimestampOption) // *ALL* of the labelAnd list of KV pairs must exist: - val labelsAndConstraint = NonEmptyList.fromList(labelsAndTableAliases.toList.map { case (labelsAndTableAlias, (labelKey, labelValue)) => - and(NonEmptyList.of(sql"#$labelsAndTableAlias.custom_label_key=$labelKey") :+ sql"#$labelsAndTableAlias.custom_label_value=$labelValue") + val labelsAndConstraint = NonEmptyList.fromList(labelsAndTableAliases.toList.map { + case (labelsAndTableAlias, (labelKey, labelValue)) => + and(NonEmptyList.of( + 
sql"""#$labelsAndTableAlias.#$customLabelKeyColumn = $labelKey""", + sql"""#$labelsAndTableAlias.#$customLabelValueColumn = $labelValue""", + )) }).map(and).toList // At least one of the labelOr list of KV pairs must exist: - val labelOrConstraint = NonEmptyList.fromList(labelOrKeyLabelValues.toList.map { case (k, v) => - and(NonEmptyList.of(sql"#$labelsOrTableAlias.custom_label_key=$k") :+ sql"#$labelsOrTableAlias.custom_label_value=$v") + val labelOrConstraint = NonEmptyList.fromList(labelOrKeyLabelValues.toList.map { + case (labelKey, labelValue) => + and(NonEmptyList.of( + sql"""#$labelsOrTableAlias.#$customLabelKeyColumn = $labelKey""", + sql"""#$labelsOrTableAlias.#$customLabelValueColumn = $labelValue""", + )) }).map(or).toList - val mixinTableCounter = new AtomicInteger(0) + var mixinTableCounter = 0 def labelExists(labelKey: String, labelValue: String) = { - val tableName = s"labelsMixin" + mixinTableCounter.getAndIncrement() - sql"""EXISTS(SELECT 1 from CUSTOM_LABEL_ENTRY #$tableName WHERE ((#$tableName.WORKFLOW_EXECUTION_UUID = #$summaryTableAlias.WORKFLOW_EXECUTION_UUID) AND (#$tableName.CUSTOM_LABEL_KEY = $labelKey) AND (#$tableName.CUSTOM_LABEL_VALUE = $labelValue)))""" + val tableName = quoted(s"labelsMixin" + mixinTableCounter) + mixinTableCounter += 1 + sql"""|EXISTS ( + | SELECT 1 FROM #$customLabelEntryTable #$tableName + | WHERE ( + | (#$tableName.#$workflowExecutionUuidColumn = #$summaryTableAlias.#$workflowExecutionUuidColumn) + | AND (#$tableName.#$customLabelKeyColumn = $labelKey) + | AND (#$tableName.#$customLabelValueColumn = $labelValue) + | ) + |) + |""".stripMargin } // *ALL* of the excludeLabelOr list of KV pairs must *NOT* exist: - val excludeLabelsOrConstraint = NonEmptyList.fromList(excludeLabelOrValues.toList.map { case (labelKey, labelValue) => not(labelExists(labelKey, labelValue)) } ).map(and).toList + val excludeLabelsOrConstraint = NonEmptyList.fromList(excludeLabelOrValues.toList map { + case (labelKey, labelValue) => not(labelExists(labelKey, labelValue)) + }).map(and).toList // At least one of the excludeLabelAnd list of KV pairs must *NOT* exist: - val excludeLabelsAndConstraint = NonEmptyList.fromList(excludeLabelAndValues.toList.map { case (labelKey, labelValue) => not(labelExists(labelKey, labelValue)) } ).map(or).toList + val excludeLabelsAndConstraint = NonEmptyList.fromList(excludeLabelAndValues.toList.map { + case (labelKey, labelValue) => not(labelExists(labelKey, labelValue)) + }).map(or).toList val includeSubworkflowsConstraint = if (includeSubworkflows) List.empty else { - List(sql"""#$summaryTableAlias.PARENT_WORKFLOW_EXECUTION_UUID IS NULL""".stripMargin) + List(sql"""#$summaryTableAlias.#$parentWorkflowExecutionUuidColumn IS NULL""".stripMargin) } val constraintList = - statusConstraint ++ + statusConstraint ++ nameConstraint ++ idConstraint ++ submissionTimeConstraint ++ @@ -274,18 +332,20 @@ trait WorkflowMetadataSummaryEntryComponent { ) val paginationAddendum: List[SQLActionBuilder] = (page, pageSize) match { - case (Some(p), Some(ps)) => List(sql""" LIMIT #${Integer.max(p-1, 0) * ps},#$ps """) - case (None, Some(ps)) => List(sql""" LIMIT 0,#$ps """) + case (Some(p), Some(ps)) => List(sql""" LIMIT #$ps OFFSET #${ps * ((p - 1) max 0)}""") + case (None, Some(ps)) => List(sql""" LIMIT #$ps OFFSET 0""") case _ => List.empty } - val orderByAddendum = sql""" ORDER BY WORKFLOW_METADATA_SUMMARY_ENTRY_ID DESC - | """.stripMargin + val orderByAddendum = + sql"""| ORDER BY #${quoted("WORKFLOW_METADATA_SUMMARY_ENTRY_ID")} DESC + 
|""".stripMargin // NB you can preview the prepared statement created here by using, for example: println(result.statements.head) - concatNel((NonEmptyList.of(mainQuery) :+ orderByAddendum) ++ paginationAddendum) - .as[WorkflowMetadataSummaryEntry](rconv = GetResult { r => + val fullQuery = concatNel(NonEmptyList(mainQuery, orderByAddendum :: paginationAddendum)) + + fullQuery.as[WorkflowMetadataSummaryEntry](rconv = GetResult { r => WorkflowMetadataSummaryEntry(r.<<, r.<<, r.<<, r.<<, r.<<, r.<<, r.<<, r.<<, r.<<) }) } diff --git a/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowStoreEntryComponent.scala b/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowStoreEntryComponent.scala index 3d99dafcd0c..d3888c8aa49 100644 --- a/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowStoreEntryComponent.scala +++ b/database/sql/src/main/scala/cromwell/database/slick/tables/WorkflowStoreEntryComponent.scala @@ -1,7 +1,7 @@ package cromwell.database.slick.tables -import java.sql.{Blob, Clob, Timestamp} - +import java.sql.Timestamp +import javax.sql.rowset.serial.{SerialBlob, SerialClob} import cromwell.database.sql.tables.WorkflowStoreEntry trait WorkflowStoreEntryComponent { @@ -21,21 +21,21 @@ trait WorkflowStoreEntryComponent { def workflowTypeVersion = column[Option[String]]("WORKFLOW_TYPE_VERSION", O.Length(255)) - def workflowDefinition = column[Option[Clob]]("WORKFLOW_DEFINITION") + def workflowDefinition = column[Option[SerialClob]]("WORKFLOW_DEFINITION") def workflowUrl = column[Option[String]]("WORKFLOW_URL", O.Length(2000)) - def workflowInputs = column[Option[Clob]]("WORKFLOW_INPUTS") + def workflowInputs = column[Option[SerialClob]]("WORKFLOW_INPUTS") - def workflowOptions = column[Option[Clob]]("WORKFLOW_OPTIONS") + def workflowOptions = column[Option[SerialClob]]("WORKFLOW_OPTIONS") - def customLabels = column[Clob]("CUSTOM_LABELS") + def customLabels = column[SerialClob]("CUSTOM_LABELS") def workflowState = column[String]("WORKFLOW_STATE", O.Length(20)) def submissionTime = column[Timestamp]("SUBMISSION_TIME") - def importsZip = column[Option[Blob]]("IMPORTS_ZIP") + def importsZip = column[Option[SerialBlob]]("IMPORTS_ZIP") def cromwellId = column[Option[String]]("CROMWELL_ID", O.Length(100)) diff --git a/database/sql/src/main/scala/cromwell/database/sql/MetadataSqlDatabase.scala b/database/sql/src/main/scala/cromwell/database/sql/MetadataSqlDatabase.scala index 30e07f94d37..c82f78e9265 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/MetadataSqlDatabase.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/MetadataSqlDatabase.scala @@ -2,7 +2,6 @@ package cromwell.database.sql import java.sql.Timestamp -import cats.data.NonEmptyList import cromwell.database.sql.joins.MetadataJobQueryValue import cromwell.database.sql.tables.{MetadataEntry, WorkflowMetadataSummaryEntry} @@ -51,21 +50,17 @@ trait MetadataSqlDatabase extends SqlDatabase { jobAttempt: Option[Int]) (implicit ec: ExecutionContext): Future[Seq[MetadataEntry]] - def queryMetadataEntriesLikeMetadataKeys(workflowExecutionUuid: String, - metadataKeys: NonEmptyList[String], + def queryMetadataEntryWithKeyConstraints(workflowExecutionUuid: String, + metadataKeysToFilterFor: List[String], + metadataKeysToFilterAgainst: List[String], metadataJobQueryValue: MetadataJobQueryValue) (implicit ec: ExecutionContext): Future[Seq[MetadataEntry]] - def queryMetadataEntryNotLikeMetadataKeys(workflowExecutionUuid: String, - metadataKeys: NonEmptyList[String], - 
metadataJobQueryValue: MetadataJobQueryValue) - (implicit ec: ExecutionContext): Future[Seq[MetadataEntry]] - /** * Retrieves next summarizable block of metadata satisfying the specified criteria. * * @param buildUpdatedSummary Takes in the optional existing summary and the metadata, returns the new summary. - * @return A `Future` with the maximum metadataEntryId summarized by the invocation of this method. + * @return A `Future` with the number of rows summarized by the invocation, and the number of rows still to summarize. */ def summarizeIncreasing(summaryNameIncreasing: String, startMetadataKey: String, @@ -80,13 +75,13 @@ trait MetadataSqlDatabase extends SqlDatabase { buildUpdatedSummary: (Option[WorkflowMetadataSummaryEntry], Seq[MetadataEntry]) => WorkflowMetadataSummaryEntry) - (implicit ec: ExecutionContext): Future[Long] + (implicit ec: ExecutionContext): Future[(Long, Long)] /** * Retrieves a window of summarizable metadata satisfying the specified criteria. * * @param buildUpdatedSummary Takes in the optional existing summary and the metadata, returns the new summary. - * @return A `Future` with the maximum metadataEntryId summarized by the invocation of this method. + * @return A `Future` with the number of rows summarized by this invocation, and the number of rows still to summarize. */ def summarizeDecreasing(summaryNameDecreasing: String, summaryNameIncreasing: String, @@ -102,7 +97,7 @@ trait MetadataSqlDatabase extends SqlDatabase { buildUpdatedSummary: (Option[WorkflowMetadataSummaryEntry], Seq[MetadataEntry]) => WorkflowMetadataSummaryEntry) - (implicit ec: ExecutionContext): Future[Long] + (implicit ec: ExecutionContext): Future[(Long, Long)] def getWorkflowStatus(workflowExecutionUuid: String)(implicit ec: ExecutionContext): Future[Option[String]] diff --git a/database/sql/src/main/scala/cromwell/database/sql/SqlConverters.scala b/database/sql/src/main/scala/cromwell/database/sql/SqlConverters.scala index 44a7a5e8853..fe1bf5f7dba 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/SqlConverters.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/SqlConverters.scala @@ -32,12 +32,16 @@ object SqlConverters { } implicit class ClobToRawString(val clob: Clob) extends AnyVal { - // yes, it starts at 1 - def toRawString: String = clob.getSubString(1, clob.length.toInt) + def toRawString: String = { + // See notes on empty clob issues in StringToClobOption + val length = clob.length.toInt + // yes, it starts at 1 + if (length == 0) "" else clob.getSubString(1, length) + } } implicit class StringOptionToClobOption(val strOption: Option[String]) extends AnyVal { - def toClobOption: Option[Clob] = strOption.flatMap(_.toClobOption) + def toClobOption: Option[SerialClob] = strOption.flatMap(_.toClobOption) } implicit class StringToClobOption(val str: String) extends AnyVal { @@ -52,17 +56,21 @@ object SqlConverters { import eu.timepit.refined.api.Refined import eu.timepit.refined.collection.NonEmpty - def toClobOption: Option[Clob] = if (str.isEmpty) None else Option(new SerialClob(str.toCharArray)) + def toClobOption: Option[SerialClob] = if (str.isEmpty) None else Option(new SerialClob(str.toCharArray)) - def toClob(default: String Refined NonEmpty): Clob = { + def toClob(default: String Refined NonEmpty): SerialClob = { val nonEmpty = if (str.isEmpty) default.value else str new SerialClob(nonEmpty.toCharArray) } } implicit class BlobToBytes(val blob: Blob) extends AnyVal { - // yes, it starts at 1 - def toBytes: Array[Byte] = blob.getBytes(1, 
blob.length.toInt) + def toBytes: Array[Byte] = { + // See notes on empty blob issues in BytesOptionToBlob + val length = blob.length.toInt + // yes, it starts at 1 + if (length == 0) Array.empty else blob.getBytes(1, length) + } } implicit class BlobOptionToBytes(val blobOption: Option[Blob]) extends AnyVal { @@ -79,11 +87,11 @@ object SqlConverters { https://github.com/apache/derby/blob/10.13/java/engine/org/apache/derby/iapi/types/HarmonySerialBlob.java#L111 OK! -> https://github.com/arteam/hsqldb/blob/2.3.4/src/org/hsqldb/jdbc/JDBCBlob.java#L184 */ - def toBlobOption: Option[Blob] = bytesOption.flatMap(_.toBlobOption) + def toBlobOption: Option[SerialBlob] = bytesOption.flatMap(_.toBlobOption) } implicit class BytesToBlobOption(val bytes: Array[Byte]) extends AnyVal { - def toBlobOption: Option[Blob] = if (bytes.isEmpty) None else Option(new SerialBlob(bytes)) + def toBlobOption: Option[SerialBlob] = if (bytes.isEmpty) None else Option(new SerialBlob(bytes)) } implicit class EnhancedFiniteDuration(val duration: FiniteDuration) extends AnyVal { diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingDetritusEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingDetritusEntry.scala index 0fd6d87cb15..31e6af183ae 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingDetritusEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingDetritusEntry.scala @@ -1,11 +1,11 @@ package cromwell.database.sql.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob case class CallCachingDetritusEntry ( detritusKey: String, - detritusValue: Option[Clob], + detritusValue: Option[SerialClob], callCachingEntryId: Option[Int] = None, callCachingDetritusEntryId: Option[Int] = None ) diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingHashEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingHashEntry.scala index 91bb72da747..41926ff2057 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingHashEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingHashEntry.scala @@ -5,5 +5,5 @@ case class CallCachingHashEntry hashKey: String, hashValue: String, callCachingEntryId: Option[Int] = None, - callCachingHashEntryId: Option[Int] = None + callCachingHashEntryId: Option[Long] = None ) diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingSimpletonEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingSimpletonEntry.scala index fb2627ba3be..626246c7bf2 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingSimpletonEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/CallCachingSimpletonEntry.scala @@ -1,11 +1,11 @@ package cromwell.database.sql.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob case class CallCachingSimpletonEntry ( simpletonKey: String, - simpletonValue: Option[Clob], + simpletonValue: Option[SerialClob], wdlType: String, callCachingEntryId: Option[Int] = None, callCachingSimpletonEntryId: Option[Int] = None diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreEntry.scala index bf819e63283..c1a904af3be 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreEntry.scala +++ 
b/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreEntry.scala @@ -1,6 +1,6 @@ package cromwell.database.sql.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob case class JobStoreEntry ( @@ -10,7 +10,7 @@ case class JobStoreEntry jobAttempt: Int, jobSuccessful: Boolean, returnCode: Option[Int], - exceptionMessage: Option[Clob], + exceptionMessage: Option[SerialClob], retryableFailure: Option[Boolean], jobStoreEntryId: Option[Int] = None ) diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreSimpletonEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreSimpletonEntry.scala index 909dd17fea2..e0c66921973 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreSimpletonEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/JobStoreSimpletonEntry.scala @@ -1,11 +1,11 @@ package cromwell.database.sql.tables -import java.sql.Clob +import javax.sql.rowset.serial.SerialClob case class JobStoreSimpletonEntry ( simpletonKey: String, - simpletonValue: Option[Clob], + simpletonValue: Option[SerialClob], wdlType: String, jobStoreEntryId: Option[Int] = None, jobStoreSimpletonEntryId: Option[Int] = None diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/MetadataEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/MetadataEntry.scala index fcc4a40006e..c273c3e47e3 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/MetadataEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/MetadataEntry.scala @@ -1,6 +1,8 @@ package cromwell.database.sql.tables -import java.sql.{Clob, Timestamp} +import java.sql.Timestamp + +import javax.sql.rowset.serial.SerialClob case class MetadataEntry ( @@ -9,7 +11,7 @@ case class MetadataEntry jobIndex: Option[Int], jobAttempt: Option[Int], metadataKey: String, - metadataValue: Option[Clob], + metadataValue: Option[SerialClob], metadataValueType: Option[String], metadataTimestamp: Timestamp, metadataEntryId: Option[Long] = None diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/WorkflowStoreEntry.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/WorkflowStoreEntry.scala index 66f22ae12b7..b969939d16c 100644 --- a/database/sql/src/main/scala/cromwell/database/sql/tables/WorkflowStoreEntry.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/WorkflowStoreEntry.scala @@ -1,21 +1,23 @@ package cromwell.database.sql.tables -import java.sql.{Blob, Clob, Timestamp} +import java.sql.Timestamp + +import javax.sql.rowset.serial.{SerialBlob, SerialClob} case class WorkflowStoreEntry ( workflowExecutionUuid: String, - workflowDefinition: Option[Clob], + workflowDefinition: Option[SerialClob], workflowUrl: Option[String], workflowRoot: Option[String], workflowType: Option[String], workflowTypeVersion: Option[String], - workflowInputs: Option[Clob], - workflowOptions: Option[Clob], + workflowInputs: Option[SerialClob], + workflowOptions: Option[SerialClob], workflowState: String, submissionTime: Timestamp, - importsZip: Option[Blob], - customLabels: Clob, + importsZip: Option[SerialBlob], + customLabels: SerialClob, cromwellId: Option[String], heartbeatTimestamp: Option[Timestamp], workflowStoreEntryId: Option[Int] = None diff --git a/database/sql/src/main/scala/cromwell/database/sql/tables/package.scala b/database/sql/src/main/scala/cromwell/database/sql/tables/package.scala index b9421d7b615..9ed1e801abf 100644 --- 
a/database/sql/src/main/scala/cromwell/database/sql/tables/package.scala +++ b/database/sql/src/main/scala/cromwell/database/sql/tables/package.scala @@ -20,10 +20,11 @@ package cromwell.database.sql * - `Double` * - `Int` * - `Long` - * - `java.sql.Clob` + * - `javax.sql.rowset.serial.SerialClob` * - `java.sql.Timestamp` * - * Nullable columns should be wrapped in an `Option`. + * Nullable columns should be wrapped in an `Option`. Note that SerialClob is + * required instead of java.sql.Clob for PostgreSQL support. * * Primary and foreign key columns are the only columns that should be defaulted, as they are to be filled in by the * database, and cannot and should not be set within the business logic. On the other hand, columns to be filled in by diff --git a/docs/Configuring.md b/docs/Configuring.md index 49eb9adaf3d..60248d0bbfd 100644 --- a/docs/Configuring.md +++ b/docs/Configuring.md @@ -297,7 +297,34 @@ url = "jdbc:mysql://host/cromwell?rewriteBatchedStatements=true&serverTimezone=U Using this option does not alter your database's underlying timezone; rather, it causes Cromwell to "speak UTC" when communicating with the DB, and the DB server performs the conversion for you. -## Abort +**Using Cromwell with PostgreSQL** + +To use PostgreSQL as the database, you will need to install and enable the +Large Object extension. If the extension is present, setting up the database +requires just these commands: + +``` +$ createdb cromwell +$ psql -d cromwell -c "create extension lo;" +``` + +PostgreSQL configuration in Cromwell is very similar to MySQL's. For example: + +```hocon +database { + profile = "slick.jdbc.PostgresProfile$" + db { + driver = "org.postgresql.Driver" + url = "jdbc:postgresql://localhost:5432/cromwell" + user = "user" + password = "pass" + port = 5432 + connectionTimeout = 5000 + } +} +``` + +### Abort **Control-C (SIGINT) abort handler** @@ -514,7 +541,7 @@ Cromwell writes one batch of workflow heartbeats at a time. While the internal q a configurable threshold then [instrumentation](developers/Instrumentation.md) may send a metric signal that the heartbeat load is above normal. -This threshold may be configured the configuration value: +This threshold may be configured via the configuration value: ```hocon system.workflow-heartbeats { @@ -523,3 +550,37 @@ system.workflow-heartbeats { ``` The default threshold value is 100, just like the default for the heartbeat batch size. + +### YAML + +**Maximum number of nodes** + +Cromwell will throw an error when it detects cyclic loops in YAML inputs. However, one can craft small acyclic YAML +documents that consume significant amounts of memory or CPU. To limit the amount of processing during parsing, there is +a limit on the number of nodes parsed per YAML document. + +This limit may be configured via the configuration value: + +```hocon +yaml { + max-nodes = 1000000 +} +``` + +The default limit is 1,000,000 nodes. + +**Maximum nesting depth** + +There is a limit on the maximum depth of nested YAML. If you decide to increase this value, you will likely need to +increase the Java Virtual Machine's thread stack size as well, using +[either `-Xss` or `-XX:ThreadStackSize`](https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html). + +This limit may be configured via the configuration value: + +```hocon +yaml { + max-depth = 1000 +} +``` + +The default limit is a maximum nesting depth of 1,000.
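For readers trying out the PostgreSQL configuration above, a quick connectivity check can confirm the database is reachable before pointing Cromwell at it. The sketch below is illustrative only: it assumes the `cromwell` database, the placeholder `user`/`pass` credentials, and the JDBC URL from the documentation example, and that the PostgreSQL JDBC driver (`org.postgresql.Driver`) is on the classpath.

```scala
import java.sql.DriverManager

// Illustrative only: the database name, credentials, and URL are the
// placeholder values from the documentation example above.
object PgConnectionCheck extends App {
  val url = "jdbc:postgresql://localhost:5432/cromwell"
  val connection = DriverManager.getConnection(url, "user", "pass")
  try {
    // "select 1" is enough to confirm the server is reachable and the credentials work.
    val resultSet = connection.createStatement().executeQuery("select 1")
    resultSet.next()
    println(s"Connected: select 1 returned ${resultSet.getInt(1)}")
  } finally connection.close()
}
```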
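Relatedly, the `SqlConverters` changes earlier in this diff replace `java.sql.Clob`/`Blob` with `javax.sql.rowset.serial.SerialClob`/`SerialBlob` and guard against zero-length values. A minimal sketch of the intended round-trip behaviour, assuming only the implicit classes shown in `SqlConverters.scala` above:

```scala
import javax.sql.rowset.serial.SerialClob
import cromwell.database.sql.SqlConverters._

object ClobRoundTripSketch extends App {
  // Non-empty strings round-trip through a SerialClob via the implicits above.
  val clobOption: Option[SerialClob] = """{"input": "value"}""".toClobOption
  val restored: String = clobOption.map(_.toRawString).getOrElse("")
  assert(restored == """{"input": "value"}""")

  // Empty strings map to None rather than a zero-length CLOB, which is what
  // the new length checks in toRawString / toBytes guard against on read.
  assert("".toClobOption.isEmpty)
}
```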
diff --git a/docs/api/RESTAPI.md b/docs/api/RESTAPI.md index 380bdc15a63..62d9cda44fc 100644 --- a/docs/api/RESTAPI.md +++ b/docs/api/RESTAPI.md @@ -1,5 +1,5 @@