`-format` not respected?
sdruskat opened this issue · 2 comments
Thanks for supplying this great tool!
When trying to harvest ArXiv using metha-sync http://export.arxiv.org/oai2 -format "arXiv" -rm
, I'm getting a log of harvest: &{BaseURL:http://export.arxiv.org/oai2 Format:oai_dc Set: From: Until: Client:<RETRACTED> MaxRequests:1048576 DisableSelectiveHarvesting:false CleanBeforeDecode:true IgnoreHTTPErrors:false MaxEmptyResponses:10 SuppressFormatParameter:false HourlyInterval:false DailyInterval:false ExtraHeaders:map[] KeepTemporaryFiles:false Delay:0 Identify:<RETRACTED> Started:0001-01-01 00:00:00 +0000 UTC Mutex:{state:0 sema:0}}
(note the Format:oai_dc
bit).
The tmp files also include the oai_dc
-formatted XML: <Response><responseDate>2024-01-08T11:29:07Z</responseDate><request verb="ListRecords" set="" metadataPrefix="oai_dc">...
.
Am I calling metha-sync the wrong way, or is this a bug?
Thanks for the bug report and glad you find metha useful.
What you encountered is a limitation of the default flag, that is, all options must come before any argument:
Flag parsing stops just before the first non-flag argument ("-" is a non-flag argument) or after the terminator "--".
This should be better:
$ metha-sync -format "arXiv" -rm http://export.arxiv.org/oai2