ExpediaGroup/circus-train
Circus Train is a dataset replication tool that copies Hive tables between clusters and clouds.
Java · Apache-2.0
Issues
- 0
Out-of-date Spring Boot libraries when downloading libraries or building from source
#233 opened by mitnitskiyas - 0
Housekeeping requires `circus_train` database
#111 opened by baumandm - 19
Circus Train jobs are failing while adding metadata to the RDS housekeeping database.
#87 opened by rksangeeth007 - 0
Use PartitionIterator for getPartition calls
#215 opened by patduin - 1
Update of metadata during a replication fails when base path is not matched in all partitions
#212 opened by patduin - 1
Upgrade default Java version to 8
#170 opened by massdosage - 1
If table has large number of partitions to replicate, Replica.updateMetadata should send partition list in batches
#166 opened by barnharts4 - 0
partitionLimit does not trim partitionPredicate
#164 opened by patduin - 0
Implement Avro Schema Copier for S3-S3 replications
#162 opened by patduin - 1
Provide sts assume role support for s3-s3 copier
#151 opened by patduin - 0
Add references to downstream projects
#117 opened by massdosage - 1
Circus Train should also sync the external schema for Avro Tables when replication mode is METADATA_UPDATE
#144 opened by abhimanyugupta07 - 1
Don't override target cluster replication factor
#132 opened by patduin - 5
Circus Train uses the wrong database for housekeeping
#129 opened by AnanaMJ - 3
s3s3Copier fails for large data sets
#56 opened by yashrajrs - 1
Replication failure due to incompatible Hive versions when replication strategy is "PROPAGATE_DELETES"
#115 opened by dkc-bitsian - 0
Periodic GraphiteReporter Exception: Unable to report to Graphite Errors while running Circus Train
#120 opened by abhimanyugupta07 - 0
Upgrade AWS Java SDK to latest 1.x version (1.11.469) and fix AWS deprecated warning
#102 opened by massdosage - 3
Generic keystore implementation
#84 opened by revinjchalil - 1
Circus Train failing to get AWS Credentials while running on an ECS Container
#109 opened by abhimanyugupta07 - 4
Circus Train might be using too many database connections for the Housekeeping database
#88 opened by jmnunezizu - 0
Upgrade parent POM
#97 opened by massdosage - 2
Support replicating view definitions
#100 opened by spuranda123 - 0
Fix Jackson Databind alert
#91 opened by massdosage - 1
Wrongly parsed column values by Circus Train
#90 opened by andrispalfi - 1
Refactor duplicated SSH code
#85 opened by AnanaMJ - 0
Configurable working directory for GCP keys
#82 opened by courtsVII - 0
Avro schema replication failing when avro.schema.url without scheme is specified
#74 opened by courtsVII - 0
Integration tests fail (sometimes)
#60 opened by patduin - 1
Avro schema is not found from SerdeProperties avro.schema.url when replicating from Hadoop configured for HA
#69 opened by courtsVII - 1
Housekeeping fails when s3 location is deleted
#61 opened by yashrajrs - 0
Extract SSH logic into a separate library
#46 opened by ddcprg - 1
Select Copier using yaml file
#55 opened by yashrajrs - 2
S3-S3 Copier Hive Diff failure
#49 opened by courtsVII - 0
Circus Train job does not fail if exception is thrown on tableReplicationStart
#52 opened by courtsVII - 0
Improve replica check exception
#47 opened by ddcprg