Deploying the zh (Chinese) version of Wikipedia fails with 'failed to parse input: OutOfBounds'
FledgeXu opened this issue · 8 comments
I'm attempting to deploy the zh (Chinese) version of Wikipedia, and the script fails with 'failed to parse input: OutOfBounds'.
OS version:
Linux localhost 4.19.0-5-amd64 #1 SMP Debian 4.19.37-5+deb10u2 (2019-08-08) x86_64 GNU/Linux
Rustc version:
rustc 1.49.0 (e1884a8e3 2020-12-29)
Logs:
root@localhost:~/distributed-wikipedia-mirror# ./mirrorzim.sh --languagecode=zh --wikitype=wikipedia
Download the zim file...
base64: invalid input
--2021-01-23 22:01:06-- https://download.kiwix.org/zim/wikipedia/wikipedia_zh_all_maxi_2021-01.zim
Resolving download.kiwix.org (download.kiwix.org)... 195.154.156.115
Connecting to download.kiwix.org (download.kiwix.org)|195.154.156.115|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://ftpmirror.your.org/pub/kiwix/zim/wikipedia/wikipedia_zh_all_maxi_2021-01.zim [following]
--2021-01-23 22:01:06-- https://ftpmirror.your.org/pub/kiwix/zim/wikipedia/wikipedia_zh_all_maxi_2021-01.zim
Resolving ftpmirror.your.org (ftpmirror.your.org)... 204.9.55.82, 2001:4978:1:420::cc09:3752
Connecting to ftpmirror.your.org (ftpmirror.your.org)|204.9.55.82|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable
The file is already fully retrieved; nothing to do.
Remove tmp directory ./tmp/wikipedia_zh_all_maxi_2021-01 before run ...
Unpack the zim file into ./tmp/wikipedia_zh_all_maxi_2021-01...
thread 'main' panicked at 'failed to parse input: OutOfBounds', src/bin/extract_zim.rs:56:36
stack backtrace:
0: 0x55d36d1e8360 - std::backtrace_rs::backtrace::libunwind::trace::h04d12fdcddff82aa
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:100:5
1: 0x55d36d1e8360 - std::backtrace_rs::backtrace::trace_unsynchronized::h1459b974b6fbe5e1
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x55d36d1e8360 - std::sys_common::backtrace::_print_fmt::h9b8396a669123d95
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:67:5
3: 0x55d36d1e8360 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he009dcaaa75eed60
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:46:22
4: 0x55d36d209aec - core::fmt::write::h77b4746b0dea1dd3
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/fmt/mod.rs:1078:17
5: 0x55d36d1e49f2 - std::io::Write::write_fmt::heb7e50902e98831c
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/io/mod.rs:1518:15
6: 0x55d36d1ea965 - std::sys_common::backtrace::_print::h2d880c9e69a21be9
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:49:5
7: 0x55d36d1ea965 - std::sys_common::backtrace::print::h5f02b1bb49f36879
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:36:9
8: 0x55d36d1ea965 - std::panicking::default_hook::{{closure}}::h658e288a7a809b29
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:208:50
9: 0x55d36d1ea608 - std::panicking::default_hook::hb52d73f0da9a4bb8
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:227:9
10: 0x55d36d1eb101 - std::panicking::rust_panic_with_hook::hfe7e1c684e3e6462
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:593:17
11: 0x55d36d1eac47 - std::panicking::begin_panic_handler::{{closure}}::h42939e004b32765c
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:499:13
12: 0x55d36d1e881c - std::sys_common::backtrace::__rust_end_short_backtrace::h9d2070f7bf9fd56c
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:141:18
13: 0x55d36d1eaba9 - rust_begin_unwind
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:495:5
14: 0x55d36d207a51 - core::panicking::panic_fmt::ha0bb065d9a260792
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/panicking.rs:92:14
15: 0x55d36d207873 - core::option::expect_none_failed::h7e1dd0a94971eb61
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/option.rs:1268:5
16: 0x55d36d0feef0 - extract_zim::main::h0d770a376a8e6eab
17: 0x55d36d0f9bd3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h5ecc56c6658a80dd
18: 0x55d36d0fa599 - std::rt::lang_start::{{closure}}::hb0d654310eb3e6ce
19: 0x55d36d1eb617 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h57e2a071d427b24c
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:259:13
20: 0x55d36d1eb617 - std::panicking::try::do_call::h81cbbe0c3b30a28e
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:381:40
21: 0x55d36d1eb617 - std::panicking::try::hbeeb95b4e1f0a876
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:345:19
22: 0x55d36d1eb617 - std::panic::catch_unwind::h59c48ccb40a0bf20
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panic.rs:396:14
23: 0x55d36d1eb617 - std::rt::lang_start_internal::ha53ab63f88fee728
at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:51:25
24: 0x55d36d100ba2 - main
25: 0x7f3fc854609b - __libc_start_main
26: 0x55d36d0f80da - _start
27: 0x0 - <unknown>
The problem is probably that the extractor (extract_zim) does not handle zstd compression, which was introduced to the ZIM format in early 2020.
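One way to check this guess without the extractor is to read the compression byte of the first cluster directly. The sketch below is only an illustration, not part of this repository: the helper name `first_cluster_compression` is made up, and the header offsets and compression codes (4 = xz, 5 = zstd) are taken from my reading of the openZIM file format description, so treat them as assumptions to verify.

```python
import struct

ZIM_MAGIC = 72173914   # 0x044D495A, stored little-endian at offset 0
HEADER_SIZE = 80       # fixed-size ZIM header (assumed from the openZIM spec)

# Low 4 bits of the first byte of a cluster (assumed mapping, see lead-in).
COMPRESSION_NAMES = {0: "none", 1: "none", 2: "zlib", 3: "bzip2", 4: "xz", 5: "zstd"}


def first_cluster_compression(f):
    """Return the compression name of the first cluster of a ZIM file.

    `f` is a seekable binary file object. Hypothetical helper for diagnosis only.
    """
    header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file too short for a ZIM header")

    (magic,) = struct.unpack_from("<I", header, 0)
    if magic != ZIM_MAGIC:
        raise ValueError("not a ZIM file (bad magic number)")

    (cluster_count,) = struct.unpack_from("<I", header, 28)
    (cluster_ptr_pos,) = struct.unpack_from("<Q", header, 48)
    if cluster_count == 0:
        raise ValueError("ZIM file has no clusters")

    # The cluster pointer list holds one 8-byte file offset per cluster.
    f.seek(cluster_ptr_pos)
    raw = f.read(8)
    if len(raw) < 8:
        raise ValueError("truncated cluster pointer list")
    (cluster_offset,) = struct.unpack("<Q", raw)

    # The first byte of a cluster carries the compression type in its low 4 bits.
    f.seek(cluster_offset)
    info = f.read(1)
    if len(info) < 1:
        raise ValueError("truncated cluster")
    return COMPRESSION_NAMES.get(info[0] & 0x0F, "unknown")
```

To try it on the downloaded snapshot, open the .zim file in binary mode (`open(path, "rb")`) and pass the file object to the function. A result of `zstd` would be consistent with the explanation above; only the first cluster is inspected, so this is a quick check rather than proof.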
@FledgeXu, if you have time, you can try again with the updated README from #77.
Readable version: https://github.com/ipfs/distributed-wikipedia-mirror/blob/8a3c7d1cc5b2f0b787a76776d0ae27d33b911472/README.md#how-to-add-new-wikipedia-snapshots-to-ipfs