This repository was archived by the owner on Feb 27, 2023. It is now read-only.

Getting Started

Scott A. Stafford edited this page Aug 2, 2017 · 17 revisions

The goal of this tutorial is to demonstrate how to build a batch processing program for MarkLogic using Spring Batch. It is assumed that the reader is familiar with the terminology of Spring Batch and Gradle and has an Internet connection to download required dependencies.

  1. Install Software
  2. Set up Spring Batch Project
  3. Add Dependencies to Project
  4. Create Job Configuration Class
  5. Create job.properties
  6. Test
  7. Install
  8. Execute

Install Software

Make sure that both the java and gradle commands are on your PATH. Gradle is required for this getting-started exercise. It is not required for building a MarkLogic Spring Batch application in general, but it is highly recommended.

The mlJobRepo program installs a MarkLogicJobRepository in your MarkLogic installation. Follow the mlJobRepo instructions to set it up.

Set up Spring Batch Project

Create an example project folder and name it 'sb-sample'. Open a command prompt and change to this newly created directory. All the following commands will be executed in this folder.

mkdir sb-sample
cd sb-sample

Let's start by initializing our project via Gradle.

gradle init --type java-application    

The gradle init command creates a few artifacts that serve as a starting point.

  • build.gradle
  • settings.gradle
  • gradlew - The Gradle Wrapper
  • Source Code Folders
    • src
      • main
        • java
          • App.java
      • test
        • java
          • AppTest.java

Go ahead and remove App.java and AppTest.java. They will not be used in this tutorial.

Add Dependencies to Project

Next, add the MarkLogic Spring Batch dependencies to your Gradle build file. Open build.gradle, remove any existing lines in the dependencies section, and add the following lines there.

compile 'com.marklogic:marklogic-spring-batch-core:1.+'
testCompile 'com.marklogic:marklogic-spring-batch-test:1.+'

When you build your batch processing program, Gradle downloads these jars, which your program needs in order to run. These libraries include the Spring Batch jar files.

The jar files for MarkLogic Spring Batch are hosted in a Maven repository called JCenter.

Create Job Configuration Class

The main code file for a Spring Batch program is the Job Configuration (JobConfig) class, which wires together the components of your job. If you are not familiar with Spring Batch, please read the domain language of Spring Batch in the Spring Batch reference documentation to learn the components of a batch processing job. In our Getting Started batch processing job, we will have a single-step job with an ItemReader, an ItemProcessor, and an ItemWriter.
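To make the reader, processor, and writer roles concrete, here is a minimal plain-Java sketch of a chunk-oriented step. It does not use the Spring Batch interfaces: `Supplier`, `Function`, and `Consumer` stand in for ItemReader, ItemProcessor, and ItemWriter, and the names `ChunkStepSketch` and `runStep` are made up for this illustration. It also simplifies the real framework, which adds transactions, restart handling, and error handling around each chunk.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Plain-Java illustration only; these are NOT the Spring Batch types.
final class ChunkStepSketch {

    // Reads until the reader returns null, processes each item, and hands
    // the writer one list per chunk. Returns the number of items written.
    static <I, O> int runStep(Supplier<I> reader,
                              Function<I, O> processor,
                              Consumer<List<O>> writer,
                              int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        int written = 0;
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) {
            O processed = processor.apply(item);
            if (processed != null) {          // a null result filters the item out
                chunk.add(processed);
            }
            if (chunk.size() == chunkSize) {  // chunk is full: write it and start over
                writer.accept(new ArrayList<>(chunk));
                written += chunk.size();
                chunk.clear();
            }
        }
        if (!chunk.isEmpty()) {               // write the final, partial chunk
            writer.accept(new ArrayList<>(chunk));
            written += chunk.size();
        }
        return written;
    }

    public static void main(String[] args) {
        Iterator<String> source = List.of("a", "b", "c").iterator();
        List<List<String>> chunks = new ArrayList<>();

        Supplier<String> reader = () -> source.hasNext() ? source.next() : null;
        Function<String, String> processor = s -> "<doc>" + s + "</doc>";
        Consumer<List<String>> writer = chunks::add;

        int count = runStep(reader, processor, writer, 2);
        System.out.println(count + " items in " + chunks.size() + " chunks");
        // prints "3 items in 2 chunks"
    }
}
```

In the real job, the JobConfig class supplies Spring Batch implementations of these three roles and the framework drives the loop for you.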

In this guide, we copy an existing JobConfig class and use it as a starting point. Copy the YourJobConfig file into the src/main/java directory. Remove line 1, the package declaration, from this file, since the copy lives in the root package.

Before we go any further, let's highlight some lines of code.

@EnableBatchProcessing
@Import(value = {com.marklogic.spring.batch.config.MarkLogicBatchConfiguration.class })

The EnableBatchProcessing annotation enables Spring Batch features and provides a base configuration for setting up batch jobs in an @Configuration class. Every job needs it.

The Import annotation brings in the MarkLogicBatchConfiguration class, which serves two purposes. First, it injects the MarkLogicBatchConfigurer, which supplies several core Spring Batch components such as the JobExplorer, JobLauncher, and PlatformTransactionManager, as well as the MarkLogicJobRepository, a MarkLogic implementation of the JobRepository. Second, it defines two MarkLogic DatabaseClient connections: one for the batch database (the target or source of your data) and one for the MarkLogicJobRepository. These databases can be the same or separate, and the connection information is defined in a properties file called job.properties.

Create job.properties

Copy this job.properties file and save it in your project root folder. Modify the property values for your environment. It is assumed that you have a target database to write your data to.
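Before running the job, it can help to confirm that the edited file still parses and that no value was left blank. The sketch below does this with `java.util.Properties`. The key names (`example.host` and so on) and the class name `JobPropertiesCheck` are hypothetical placeholders, not the keys used by MarkLogic Spring Batch; substitute the names that appear in the job.properties file you copied.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Sketch: load a properties file and report which required keys are missing.
final class JobPropertiesCheck {

    // Hypothetical key names, for illustration only.
    static final List<String> REQUIRED_KEYS = List.of(
            "example.host", "example.port", "example.username", "example.password");

    static Properties load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        return props;
    }

    // Returns the required keys that are absent or have a blank value.
    static List<String> missingKeys(Properties props, List<String> required) {
        List<String> missing = new ArrayList<>();
        for (String key : required) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // For a real check, pass new java.io.FileReader("job.properties") instead.
        String sample = "example.host=localhost\n"
                + "example.port=8000\n"
                + "example.username=admin\n";
        Properties props = load(new StringReader(sample));
        System.out.println("Missing: " + missingKeys(props, REQUIRED_KEYS));
        // prints "Missing: [example.password]"
    }
}
```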

Test

  1. Copy YourJobTest to src/test/java.
  2. Delete the first line (the package declaration) from the test file.
  3. Run gradle test.

This assumes that the target database and the MarkLogicJobRepository are hosted on the same application server and database.

Install

To create the actual program, we use the Gradle Application plugin, which is already applied in your build.gradle file. Before generating an executable program, we must specify the main class. Open build.gradle, find the line that sets the mainClassName property, and set it to the following.

mainClassName = 'com.marklogic.spring.batch.core.launch.support.CommandLineJobRunner'  

Now that the main class is set, the following command installs the batch processing program under the build folder.

gradle installDist

Execute

This command will execute the YourJob program. It will write a single XML file to your database.

build\install\sb-sample\bin\sb-sample.bat --job_path YourJobConfig --job_id job --output_collections test

Refer to CommandLineJobRunner for help with command line options.
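The command above passes its settings as `--name value` pairs. As a rough illustration of that shape only, the sketch below collects such pairs into a map. It is not the CommandLineJobRunner implementation, and `OptionPairsSketch` is a made-up name; the real runner's parsing rules and accepted options may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration of "--name value" option pairs; not the library's parser.
final class OptionPairsSketch {

    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (!arg.startsWith("--")) {
                throw new IllegalArgumentException("Expected an option, got: " + arg);
            }
            String name = arg.substring(2);
            // A following token that is not itself an option is this option's value.
            if (i + 1 < args.length && !args[i + 1].startsWith("--")) {
                options.put(name, args[++i]);
            } else {
                options.put(name, "");   // flag without a value, such as --help
            }
        }
        return options;
    }

    public static void main(String[] args) {
        String[] sample = {"--job_path", "YourJobConfig", "--job_id", "job",
                           "--output_collections", "test"};
        System.out.println(parse(sample));
        // prints "{job_path=YourJobConfig, job_id=job, output_collections=test}"
    }
}
```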

Help

build\install\sb-sample\bin\sb-sample.bat --help