Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

spring - Restart step (or job) after timeout occurs

Is it possible to restart a job or step automatically when a timeout occurs? I tried retry and skip on a step (skip, because the job re-runs every 30 minutes, provided no error has occurred), like this:

<step id="jobTest.step1">
  <tasklet>
    <transaction-attributes timeout="120"/>
    <chunk reader="testReader" processor="testProcessor" writer="testWriter" commit-interval="10" retry-limit="3">
      <retryable-exception-classes>
        <include class="org.springframework.transaction.TransactionTimedOutException"/>
      </retryable-exception-classes>
    </chunk>
    <listeners>
      <listener ref="stepListener" />
    </listeners>
  </tasklet>
</step>

I tried a skip-policy too, but did not get satisfactory results. I just need to restart this step (or the entire job) when a timeout occurs.

UPDATE

I've tried this too, but without success: Spring batch: Retry job if does not complete in particular time

1 Answer

Retry/Skip features are applicable to items within a chunk in a fault-tolerant chunk-oriented step, not at the step level or the job level. There are actually two distinct things in your requirement:

1. How to stop a job after a given timeout?

Apart from calling JobOperator#stop externally once the timeout has elapsed, you can stop a job from within the job itself by setting the terminate-only flag on the step execution (set with StepExecution#setTerminateOnly, read back with StepExecution#isTerminateOnly). The idea is to get access to the step execution so that the flag can be set once the timeout is exceeded. How to do this depends on the tasklet type of the step:

Simple Tasklet

For a simple tasklet, you can access the step execution through the ChunkContext. Here is an example:

import java.time.Duration;
import java.util.Date;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class MyTasklet implements Tasklet {

    private static final int TIMEOUT = 120; // in minutes (can be turned into a configurable field through a constructor)

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        if (timeout(chunkContext)) {
            chunkContext.getStepContext().getStepExecution().setTerminateOnly();
        }
        // do some work
        if (moreWork()) {
            return RepeatStatus.CONTINUABLE;
        } else {
            return RepeatStatus.FINISHED;
        }
    }

    private boolean timeout(ChunkContext chunkContext) {
        Date startTime = chunkContext.getStepContext().getStepExecution().getJobExecution().getStartTime();
        Date now = new Date();
        return Duration.between(startTime.toInstant(), now.toInstant()).toMinutes() > TIMEOUT;
    }

    private boolean moreWork() {
        return false; // TODO implement logic
    }
}

This tasklet will regularly check if the timeout is exceeded and stop the step (and hence the surrounding job) accordingly.
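As the comment on the TIMEOUT constant suggests, the limit can be made configurable. Here is a minimal sketch, assuming a hypothetical helper class named TimeoutChecker (it is not part of Spring Batch) that receives the limit and a java.time.Clock through its constructor, so the check can be unit-tested with a fixed clock. Note that it compares full durations instead of truncated minutes, so it fires as soon as the limit is passed:

```java
import java.time.Clock;
import java.time.Duration;
import java.util.Date;

// Hypothetical helper, not part of Spring Batch: wraps the timeout check so
// that the limit and the clock are injected instead of being hard-coded.
final class TimeoutChecker {

    private final Duration limit;
    private final Clock clock;

    TimeoutChecker(Duration limit, Clock clock) {
        this.limit = limit;
        this.clock = clock;
    }

    // Returns true when strictly more than 'limit' has elapsed since startTime.
    boolean isExceeded(Date startTime) {
        Duration elapsed = Duration.between(startTime.toInstant(), clock.instant());
        return elapsed.compareTo(limit) > 0;
    }
}
```

In the tasklet, such a helper could be created as new TimeoutChecker(Duration.ofMinutes(120), Clock.systemUTC()) and called with the job execution's start time, assuming a Spring Batch version in which getStartTime() returns a java.util.Date, as in the example above.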

Chunk-oriented tasklet

In this case, you can use a step listener and set the terminateOnly flag in one of the lifecycle methods (afterRead, afterWrite, etc). Here is an example:

import java.time.Duration;
import java.util.Date;

import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.listener.StepListenerSupport;
import org.springframework.batch.core.scope.context.ChunkContext;

public class StopListener extends StepListenerSupport {

    private static final int TIMEOUT = 120; // in minutes (can be made configurable through constructor)

    private StepExecution stepExecution;

    @Override
    public void beforeStep(StepExecution stepExecution) {
        this.stepExecution = stepExecution;
    }

    @Override
    public void afterChunk(ChunkContext context) { // or afterRead, or afterWrite, etc.
        if (timeout(context)) {
            this.stepExecution.setTerminateOnly();
        }
    }

    private boolean timeout(ChunkContext chunkContext) {
        Date startTime = chunkContext.getStepContext().getStepExecution().getJobExecution().getStartTime();
        Date now = new Date();
        return Duration.between(startTime.toInstant(), now.toInstant()).toMinutes() > TIMEOUT;
    }
}

The idea is the same: check the time regularly and set the flag when appropriate.

Both approaches leave your job in a STOPPED status, which is a restartable status. Batch jobs have traditionally been executed in a batch window, and a common requirement is to stop them gracefully when the window closes. The technique above is the way to do that.
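To illustrate what a graceful stop means here, the following is a simplified plain-Java model of a chunk loop. It is only a sketch and not Spring Batch code: the class ChunkLoopSketch, its Status enum and the timedOut supplier are all hypothetical names. It shows the flag being looked at only after a chunk has been fully committed, so no chunk is cut off in the middle:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Simplified, hypothetical model of a chunk-oriented step. It is NOT Spring Batch
// code; it only mimics how a terminate-only flag is honoured between chunks.
final class ChunkLoopSketch {

    enum Status { COMPLETED, STOPPED }

    private boolean terminateOnly = false;
    private final List<List<Integer>> committedChunks = new ArrayList<>();

    List<List<Integer>> committedChunks() {
        return committedChunks;
    }

    // Processes items chunk by chunk; timedOut is consulted after each committed
    // chunk, which is the point where a listener's afterChunk method would run.
    Status run(List<Integer> items, int chunkSize, BooleanSupplier timedOut) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        for (int from = 0; from < items.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, items.size());
            // The whole chunk is "committed" before the flag is looked at.
            committedChunks.add(new ArrayList<>(items.subList(from, to)));
            if (timedOut.getAsBoolean()) {
                terminateOnly = true; // stands in for stepExecution.setTerminateOnly()
            }
            if (terminateOnly) {
                return Status.STOPPED; // remaining items are left for the restart
            }
        }
        return Status.COMPLETED;
    }
}
```

With ten items, a chunk size of 3 and a timeout detected after the second chunk, this model ends as STOPPED with two complete chunks committed, and the remaining four items are left for the restart.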

The answer in Spring batch: Retry job if does not complete in particular time is not a good option IMO, because it abruptly terminates the transaction for the current chunk and leaves the job in a FAILED status (which is a restartable status as well). However, when you see a job in a FAILED status, you cannot distinguish a real failure from a deliberate stop. Given the requirement of deliberately stopping the job at the end of the batch window, I believe the job should be stopped gracefully and restarted in the next window.

2. How to restart the job automatically after the timeout?

Now that you know how to stop a job after a timeout, you can use a RetryTemplate around the job launcher and re-launch the job when appropriate. Here is an example:

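import java.util.Date;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

public class JobRunner { // illustrative class name
    public static void main(String[] args) throws Throwable {
        RetryTemplate retryTemplate = new RetryTemplate();
        retryTemplate.setRetryPolicy(new SimpleRetryPolicy(3)); // 3 attempts in total

        ApplicationContext applicationContext = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = applicationContext.getBean(JobLauncher.class);
        Job job = applicationContext.getBean(Job.class);
        JobParameters jobParameters = new JobParametersBuilder()
                .addDate("runtime", new Date())
                .toJobParameters();

        // The same parameters are reused, so each re-launch restarts the same job instance
        retryTemplate.execute((RetryCallback<JobExecution, Throwable>) retryContext -> {
            JobExecution jobExecution = jobLauncher.run(job, jobParameters);
            if (jobExecution.getExitStatus().getExitCode().equals(ExitStatus.STOPPED.getExitCode())) {
                throw new Exception("Job timeout"); // triggers the next attempt
            }
            return jobExecution;
        });
    }
}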

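SimpleRetryPolicy(3) allows at most 3 attempts in total, so the job is launched at most 3 times (the initial run plus up to two re-launches) as long as it finishes with the status STOPPED (for example due to a timeout as shown previously). If the last attempt also ends as STOPPED, the exception propagates out of retryTemplate.execute.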
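To make the attempt counting concrete, here is a plain-Java stand-in for that retry loop. It deliberately does not use RetryTemplate; the class RelaunchSketch, the method launchWithRetry and the Supplier that returns an exit code are hypothetical names used only for illustration. The limit counts every launch, including the first one:

```java
import java.util.function.Supplier;

// Hypothetical stand-in for the RetryTemplate logic; not Spring Retry code.
final class RelaunchSketch {

    // Calls 'launch' until it returns something other than "STOPPED", or until
    // maxAttempts launches (including the first one) have been made.
    static String launchWithRetry(Supplier<String> launch, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String exitCode = launch.get();
            if (!"STOPPED".equals(exitCode)) {
                return exitCode; // finished for a reason other than a stop
            }
        }
        // Mirrors the exception escaping once the attempts are exhausted.
        throw new IllegalStateException("Job still STOPPED after " + maxAttempts + " attempts");
    }
}
```

With a limit of 3, a job that stops twice and then completes is launched three times. A job that stops on every launch is also launched three times, after which the exception reaches the caller.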

Hope this helps.

