Scripts to analyze CCDs
cd src
groovy <SCRIPT> <DATADIR>
For an example, look at runall.sh. It downloads data from CHB's Sample CCDAs repo and runs the scripts over them.
This is what your script should look like:
/**
 * This script prints the frequency of each section in the CCD
 */
import groovy.transform.BaseScript

@BaseScript CcdAnalysis script

// Appends each value in b to the list accumulated for its key.
// Groovy's + appends a single title or concatenates a list of titles,
// so this also merges two partial results from a parallel reduce.
def <K, V> Map<K, List<V>> combineMaps(Map<K, List<V>> accum, Map<K, V> b) {
    accum + b.collectEntries { k, v -> [k, (accum[k] ?: []) + v] }
}

def result = files.parallelStream().map { File file ->
    // Parsing the File directly means XmlSlurper opens and closes the stream itself
    def xml = new XmlSlurper().parse(file)
    def sections = xml.component.structuredBody.component.section
    // One map per document: section code -> section title
    sections.collectEntries { section ->
        [section.code.@code.text(), section.title.text()]
    }
}.reduce([:]) { accum, newMap -> combineMaps(accum, newMap) }.collect { k, v ->
    [k, [count: v.size(), title: v.head()]]
}

result.each { k, v -> println "$k - ${v.count} - ${v.title}" }
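The reduce step in the script above depends on how combineMaps grows a list of titles per section code. The following is a minimal sketch of that behaviour on hand-made maps; the helper is reproduced here without generics, and the codes and titles are illustrative stand-ins rather than output from real CCDs.

```groovy
// Same logic as combineMaps in the script above, untyped for brevity.
Map combineMaps(Map accum, Map b) {
    accum + b.collectEntries { k, v -> [k, (accum[k] ?: []) + v] }
}

// Folding one document's [code: title] map into an empty accumulator
def first = combineMaps([:], ['10160-0': 'Medications'])

// Folding a second document: a known code gains a title, a new code starts a list
def second = combineMaps(first, ['10160-0': 'Medication List', '48765-2': 'Allergies'])

// Merging two partial accumulators, as a parallel reduce may do:
// list + list concatenates, so the same helper still works
def merged = combineMaps(['10160-0': ['A']], ['10160-0': ['B', 'C']])

println first
println second
println merged
```

The size of each list is what the script later reports as the count for that section code.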
- Give a clear groovydoc on top that explains what the script does.
- Declare the @BaseScript to be CcdAnalysis. That lets users run all scripts in an identical way.
- Use the property files to access all CCDs available to you.
- Name your script in a meaningful way, e.g. SectionFrequency.groovy
- Optionally, if you need static typing, use @Grab('com.github.rahulsom:ihe-iti:0.8')
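To make the @BaseScript and files conventions above more concrete, here is a rough sketch of how a base script class can expose such a property. This is not the repository's CcdAnalysis: the class name CcdAnalysisSketch, the helper findCcds, and the rule of picking up every .xml file under DATADIR are all assumptions made for illustration.

```groovy
import groovy.io.FileType
import java.nio.file.Files

// Hypothetical stand-in for CcdAnalysis; the real class may work differently.
abstract class CcdAnalysisSketch extends Script {

    /** Collects every regular file ending in .xml below dataDir (assumed rule). */
    static List<File> findCcds(File dataDir) {
        def found = []
        dataDir.eachFileRecurse(FileType.FILES) { File f ->
            if (f.name.toLowerCase().endsWith('.xml')) found << f
        }
        found.sort { it.path }
    }

    /** Visible to scripts as the property `files`; reads DATADIR from the first argument. */
    List<File> getFiles() {
        String[] scriptArgs = binding.getVariable('args') as String[]
        findCcds(new File(scriptArgs[0]))
    }
}

// Try the helper on a throwaway directory
def tmp = Files.createTempDirectory('ccd-sketch').toFile()
new File(tmp, 'a.xml').text = '<ClinicalDocument/>'
new File(tmp, 'notes.txt').text = 'ignored'
println CcdAnalysisSketch.findCcds(tmp)*.name
```

Because a script that declares a base class inherits its getters, a getFiles() method like this is what lets the script body refer to files directly.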
For every other aspect, this is just another Groovy script.
Recommended:
- Try to parallelize script execution. Most machines have multi-core processors these days, and many people crunch thousands of CCDs with this.
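One way to follow the parallelization advice is to let the Java stream collectors do the merging, so no shared mutable map is needed. The sketch below counts section codes across documents; the docs list is stand-in data (each inner list plays the role of the codes found in one CCD), not something the base script provides.

```groovy
import java.util.function.Function
import java.util.stream.Collectors

// Stand-in data (hypothetical): section codes per document
def docs = [
    ['10160-0', '48765-2'],
    ['10160-0'],
    ['48765-2', '11450-4', '10160-0']
]

// flatMap flattens the per-document lists; groupingBy with counting
// merges the per-thread partial results, so the closures stay side-effect free.
Map<String, Long> counts = docs.parallelStream()
    .flatMap { List<String> codes -> codes.stream() }
    .collect(Collectors.groupingBy({ String code -> code } as Function, Collectors.counting()))

counts.sort().each { code, n -> println "$code - $n" }
```

In a real script, the stand-in list would be replaced by files.parallelStream() followed by a map step that extracts the codes from each document.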
- Fork the repo
- Clone it
- Write your script, or tweak an existing one
- Send a pull request
- Wait for Travis to run some tests
- Wait for me to merge it