Bug description
While using MongoItemReader, I have configured my Step bean to use the faultTolerant method with a skipLimit of 5 and a skip condition for IllegalArgumentException.
@Configuration
@RequiredArgsConstructor
public class PetJobConfig {

    private final PetRepo petRepo;
    private final JobRepository jobRepository;
    private final PlatformTransactionManager platformTransactionManager;
    private final MongoTemplate mongoTemplate;

    @Bean
    public Step readPetFromMongo() {
        return new StepBuilder("petReaderMongo", jobRepository)
                .allowStartIfComplete(true)
                .<PetDomain, PetDomain>chunk(1000, platformTransactionManager)
                .reader(petRepo.petReader())
                .writer(new PetWriter(mongoTemplate))
                .faultTolerant()
                .skipLimit(5)
                .skip(IllegalArgumentException.class)
                .build();
    }

    @Bean
    public Job readPetFromMongoJob() {
        return new JobBuilder("petReaderMongoJob", jobRepository)
                .start(readPetFromMongo())
                .build();
    }
}
In my current context, there could be data in my database that does not fully conform to the target type provided to the reader, and I would like to use the faultTolerant method to skip this dirty data.
Type provided to MongoItemReader
// Named PetDomain to match the targetType passed to the reader
public record PetDomain(
        String name,
        Animal animal
) {}

public enum Animal {
    CAT,
    DOG;
}
Repo class
@Repository
@RequiredArgsConstructor
public class PetRepo {

    private final MongoTemplate mongoTemplate;

    public MongoItemReader<PetDomain> petReader() {
        Map<String, Sort.Direction> sorts = new HashMap<>();
        Query query = new Query();
        var reader = new MongoItemReaderBuilder<PetDomain>()
                .name("petReader")
                .collection("pet")
                .pageSize(500)
                .template(mongoTemplate)
                .targetType(PetDomain.class)
                .sorts(sorts)
                .query(query)
                .build();
        return reader;
    }
}
Example of dirty data from MongoDB
{
    name: "Bingo",
    animal: "CAT2" // Does not conform to the enum provided
}
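To make the failure mode concrete, here is a minimal plain-Java sketch of why a value such as "CAT2" surfaces as an IllegalArgumentException, which is the exception type the skip condition targets. It does not call Spring Data; EnumMappingDemo and parse are hypothetical names used only for illustration, and it assumes the converter ultimately resolves the stored string with Enum.valueOf.

```java
import java.util.Optional;

final class EnumMappingDemo {

    // Stand-in for the Animal enum from the report
    enum Animal { CAT, DOG }

    // Hypothetical helper: returns the matching constant, or empty when the
    // stored string is not a valid constant name.
    static Optional<Animal> parse(String raw) {
        try {
            return Optional.of(Animal.valueOf(raw));
        } catch (IllegalArgumentException e) {
            // Enum.valueOf throws IllegalArgumentException for unknown names
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("CAT"));  // prints Optional[CAT]
        System.out.println(parse("CAT2")); // prints Optional.empty
    }
}
```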
However, because doPageRead uses MongoOperations to retrieve each page as a list instead of a stream, the whole page fails to map to the target type as soon as it contains one dirty document, and no iterator is returned for that page.
So, to be able to iterate over the items, I have to override the entire doPageRead method just to change the MongoOperations method from find to stream:
protected Iterator<T> doPageRead() {
    if (queryString != null) {
        Pageable pageRequest = PageRequest.of(page, pageSize, sort);
        String populatedQuery = replacePlaceholders(queryString, parameterValues);
        Query mongoQuery;
        if (StringUtils.hasText(fields)) {
            mongoQuery = new BasicQuery(populatedQuery, fields);
        }
        else {
            mongoQuery = new BasicQuery(populatedQuery);
        }
        mongoQuery.with(pageRequest);
        if (StringUtils.hasText(hint)) {
            mongoQuery.withHint(hint);
        }
        if (StringUtils.hasText(collection)) {
            // return (Iterator<T>) template.find(mongoQuery, type, collection).iterator();
            return (Iterator<T>) template.stream(mongoQuery, type, collection).iterator();
        }
        else {
            // return (Iterator<T>) template.find(mongoQuery, type).iterator();
            return (Iterator<T>) template.stream(mongoQuery, type).iterator();
        }
    }
    else {
        Pageable pageRequest = PageRequest.of(page, pageSize);
        query.with(pageRequest);
        if (StringUtils.hasText(collection)) {
            // return (Iterator<T>) template.find(query, type, collection).iterator();
            return (Iterator<T>) template.stream(query, type, collection).iterator();
        }
        else {
            // return (Iterator<T>) template.find(query, type).iterator();
            return (Iterator<T>) template.stream(query, type).iterator();
        }
    }
}
I was wondering if it's actually better to use stream instead, as find prevents the iterator from iterating at all if a single document from the database does not conform to the class type provided.
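The difference between the two approaches can be illustrated with a small plain-Java simulation. This is only a sketch of eager versus lazy conversion and does not call MongoTemplate; EagerVsLazyDemo, eager, lazy, and countReadable are hypothetical names, and it assumes that stream converts each document only when the iterator is advanced, as described above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class EagerVsLazyDemo {

    enum Kind { CAT, DOG }

    // Eager: converts the whole page up front (like building a List),
    // so one bad value makes the entire call throw.
    static List<Kind> eager(List<String> raw) {
        List<Kind> out = new ArrayList<>();
        for (String s : raw) {
            out.add(Kind.valueOf(s));
        }
        return out;
    }

    // Lazy: converts a value only when next() is called,
    // so a bad value fails that single call and iteration can continue.
    static Iterator<Kind> lazy(List<String> raw) {
        Iterator<String> source = raw.iterator();
        return new Iterator<>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public Kind next() {
                return Kind.valueOf(source.next());
            }
        };
    }

    // Counts the items that convert, skipping the ones that throw.
    static int countReadable(List<String> raw) {
        Iterator<Kind> it = lazy(raw);
        int ok = 0;
        while (it.hasNext()) {
            try {
                it.next();
                ok++;
            } catch (IllegalArgumentException e) {
                // dirty item: skip it and move on to the next one
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        List<String> page = List.of("CAT", "CAT2", "DOG");
        System.out.println(countReadable(page));
        try {
            eager(page);
        } catch (IllegalArgumentException e) {
            System.out.println("eager conversion failed for the whole page");
        }
    }
}
```

In the eager variant the caller never receives an iterator for the page, so there is nothing for a skip policy to act on item by item; in the lazy variant each failure is tied to a single item.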
Environment
- JDK 17
- Spring Batch 5.0.3
Expected behavior
If there is data in the database that does not conform to the type specified in MongoItemReader, the reader should be able to move on to the next item in the iterator.