JSR-352 is final and included in JEE7, and the first implementation is available in GlassFish 4. JSR-352 adopts the programming model of Spring Batch almost one-to-one; just take a look at the domain and component vocabulary:
Spring Batch | JSR-352 | Comment |
---|---|---|
Job | Job | |
Step | Step | |
Chunk | Chunk | |
Item | Item | |
ItemReader / ItemStream | ItemReader | JSR-352’s ItemReader includes Spring Batch’s ItemStream capabilities |
ItemProcessor | ItemProcessor | |
ItemWriter / ItemStream | ItemWriter | JSR-352’s ItemWriter includes Spring Batch’s ItemStream capabilities |
JobInstance | JobInstance | |
JobExecution | JobExecution | |
StepExecution | StepExecution | |
JobExecutionListener | JobListener | |
StepExecutionListener | StepListener | |
Listeners | Listeners | The same set of listeners exists in Spring Batch and JSR-352 |
Those are the most important components and names, but you can continue this list and you'll only find minor differences. The XML configuration for a simple job also looks very much the same in both worlds.
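As an illustrative sketch (bean names and ids are invented here, not taken from the original listings), a simple chunk-oriented job might be configured like this in both styles:

```xml
<!-- Spring Batch (batch namespace; names are illustrative) -->
<job id="importJob" xmlns="http://www.springframework.org/schema/batch">
    <step id="importStep">
        <tasklet>
            <chunk reader="itemReader" processor="itemProcessor"
                   writer="itemWriter" commit-interval="10"/>
        </tasklet>
    </step>
</job>

<!-- JSR-352 Job XML (names are illustrative) -->
<job id="importJob" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
    <step id="importStep">
        <chunk item-count="10">
            <reader ref="itemReader"/>
            <processor ref="itemProcessor"/>
            <writer ref="itemWriter"/>
        </chunk>
    </step>
</job>
```

The structure is nearly identical; the most visible difference is that JSR-352 says `item-count` where Spring Batch says `commit-interval`.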
All in all it's a very good thing from both points of view. The Java community gets a standard derived from the most popular open source batch framework, which in turn will implement the standard in its next release. People using Spring Batch can rest assured that, should Spring Batch ever be abandoned, there are other implementations with the exact same programming model, and switching is (quite) easy. People using the implementations of other JEE7 server vendors can rest assured that the programming model has been validated for years.
Though the programming model is pretty much the same, there are still some differences between the JSR-352 specification and the current Spring Batch implementation. Today I want to talk about three of them, and I'm very curious how Michael Minella and co. will resolve those differences.
Scoping
The following paragraph is taken from the JSR-352 specification.
11.1 Batch Artifact Lifecycle
All batch artifacts are instantiated prior to their use in the scope in which they are declared in the Job XML and are valid for the life of their containing scope. There are three scopes that pertain to artifact lifecycle: job, step, and step-partition.
One artifact per Job XML reference is instantiated. In the case of a partitioned step, one artifact per Job XML reference per partition is instantiated. This means job level artifacts are valid for the life of the job. Step level artifacts are valid for the life of the step. Step level artifacts in a partition are valid for the life of the partition.
No artifact instance may be shared across concurrent scopes. The same instance must be used in the applicable scope for a specific Job XML reference.
So, we're going to have three scopes in implementations of the JSR-352: job, step and step-partition. In Spring Batch we currently have the two scopes singleton and step. Since partitioning is handled quite differently in Spring Batch and the JSR-352, I will exclude it here and just talk about the scopes job and step vs. the scopes singleton and step. In Spring Batch everything is singleton by default, and if we want step scope, we need to set it explicitly on the batch artifact. A job scope does not exist. A very practical consequence is that you can't inject job parameters into components that are not in step scope. In JSR-352, all components inside or referenced by a job definition get job scope, and all components inside or referenced by a step definition get step scope. You cannot change that behaviour, which, for example, means that you cannot have components in singleton scope.
All in all, I prefer the JSR-352 way of dealing with scopes. Since many batch components have state, and job parameters need to be injected here and there, you almost always end up giving step scope to almost every component inside a step. So step scope is a sensible default, and not being able to use singleton scope is no real limitation. A job scope would make sense in general, but it has been discussed in the Spring Batch community several times (for example here) and has always been declined for not adding much value. This is still true, since the only component that cannot use step scope to access job parameters is the JobExecutionListener, and its methods always receive arguments that include the job parameters. So while the JSR-352 way is a little more straightforward and cleaner, it's not a game changer; it's more or less about a nicer default scope for steps and a job scope that's not really necessary.
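To make the Spring Batch side concrete, this is roughly what the explicit step scope declaration looks like (bean id, class and parameter name are illustrative); it's exactly this scope that enables late binding of job parameters:

```xml
<!-- Spring Batch: step scope must be declared explicitly on the bean;
     it makes late binding of job parameters possible. -->
<bean id="reader"
      class="org.springframework.batch.item.file.FlatFileItemReader"
      scope="step">
    <property name="resource" value="#{jobParameters['input.file']}"/>
</bean>
```

Without `scope="step"`, the `#{jobParameters[...]}` expression cannot be resolved, because job parameters only exist once a job execution is running.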
Anyway, if Spring Batch wants to implement the JSR-352, there will be some changes. The JSR-352's JobListener (the equivalent of Spring Batch's JobExecutionListener) definitely needs a job scope, because otherwise it would have no chance to access job parameters (its beforeJob and afterJob methods don't take arguments, so job parameters need to be injected, and step scope is not available at that point of processing the job). EDIT: Sometimes reality is faster than writing blog posts: Spring Batch 2.2.1 has been released, and it introduces a job scope.
Chunk processing
The final release of the specification contains an illustration of chunk processing: one item is read, then processed, then the next item is read and processed, and finally all processed items are written in one action.
Ironically, this picture is copied from the Spring Batch reference documentation, but it has never been implemented like that. Chunk-based processing in Spring Batch actually works like this: first all items for the chunk are read, then all are processed, then all are written. If processing in Spring Batch stays like this, it doesn't conform to the JSR-352 spec. Why does that make a difference? Because the spec introduces an attribute time-limit on the chunk element, which specifies the number of seconds of reading and processing after which a chunk is complete. My guess is that in Spring Batch it will specify the number of seconds of reading after which a chunk is complete, because changing that behaviour would be too complex and wouldn't add much value.
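The two processing orders can be contrasted with a toy sketch (plain Java, not framework code); each method returns the sequence of operations performed, so the difference in interleaving becomes visible:

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of the two chunk processing orders; the returned list
// is the trace of operations, not real item processing.
class ChunkOrder {

    // JSR-352: read an item, process it immediately, then read the next;
    // finally write the whole chunk in one action.
    static List<String> jsr352Order(List<String> items) {
        List<String> ops = new ArrayList<>();
        for (String item : items) {
            ops.add("read " + item);
            ops.add("process " + item);
        }
        ops.add("write chunk");
        return ops;
    }

    // Spring Batch: read the whole chunk first, then process all items,
    // then write the whole chunk in one action.
    static List<String> springBatchOrder(List<String> items) {
        List<String> ops = new ArrayList<>();
        for (String item : items) {
            ops.add("read " + item);
        }
        for (String item : items) {
            ops.add("process " + item);
        }
        ops.add("write chunk");
        return ops;
    }
}
```

The written output is the same in both cases; only the point in time at which each item gets processed differs, and that is exactly what a read-and-process time limit is sensitive to.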
For batches that mostly do writing (and I know a lot of them), the time-limit attribute is not very helpful anyway.
Properties
The JSR-352 introduces an interesting concept for dealing with properties. On almost any level of the job XML you may define your own properties, and you can then reference them for substitution in property definitions that are defined later AND belong to the hierarchy in which the first property was defined. This example is taken from the spec:
```xml
<job id="job1">
    <properties>
        <property name="filestem" value="postings"/>
    </properties>
    <step id="step1">
        <chunk>
            <properties>
                <property name="infile.name" value="#{jobProperties['filestem']}.txt"/>
            </properties>
        </chunk>
    </step>
</job>
```
The resolution for infile.name would be postings.txt. If you want to access the property in some component that's referenced inside the chunk, for example the ItemReader, you need to inject it with the special annotation @BatchProperty:
```java
@Inject
@BatchProperty(name = "infile.name")
String fileName;
```
So far we have only seen how to define our own properties in the job XML, but the spec offers some more sources for properties. This is the complete list:
- jobParameters – specifies to use a named parameter from the job parameters.
- jobProperties – specifies to use a named property from among the job’s properties.
- systemProperties – specifies to use a named property from the system properties.
- partitionPlan – specifies to use a named property from the partition plan of a partitioned step.
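The same substitution syntax works with each of these sources; for illustration (the property names here are invented), a value can even combine several of them:

```xml
<properties>
    <!-- combines a JVM system property with a job parameter
         (outfile.name and run.id are invented names) -->
    <property name="outfile.name"
              value="#{systemProperties['java.io.tmpdir']}/#{jobParameters['run.id']}.txt"/>
</properties>
```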
This system reflects a somewhat different philosophy of dealing with properties. In a Spring application, properties are normally read from a file and/or system properties with a little help from the PropertyPlaceholderConfigurer and then used in bean definitions. In Spring Batch you may additionally access job parameters and the job and step execution contexts (the latter would be the location for partition plan parameters) in bean definitions. The JSR-352 does not specify any way of reading properties from an external file; instead, the job XML itself seems to be the property file. That's not very useful, so I guess every implementation will have its own solution for reading properties from an external file.
Anyway, the possibility to define properties directly in the job XML and to build them up hierarchically is new to Spring Batch and has to be implemented for the JSR-352. Using @Inject @BatchProperty to inject properties into a bean is new as well, but it's more or less the same thing the @Value annotation currently does, so the implementation shouldn't be much of a problem.
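For comparison, this is roughly what the @Value-based counterpart looks like on a step-scoped Spring Batch bean (the field name is illustrative):

```java
// Spring Batch: late binding of a job parameter; the bean must be in
// step scope for #{jobParameters[...]} to resolve.
@Value("#{jobParameters['infile.name']}")
private String fileName;
```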
Conclusion
Though the programming models in JSR-352 and Spring Batch are pretty much the same, there are some small differences between the spec and the implementation of Spring Batch. I’m curious about the way these differences are dealt with. Exciting times for batch programmers!
Blog author: Tobias Flohre, Senior Software Developer
Do you still have questions? Just send me a message.