Corrections to the translation of the data table headers

Question

Opened this issue 3 years ago · 5 comments

CutieDeng commented 3 years ago

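After an English word, please use "," (and other English punctuation marks) rather than full-width punctuation.
The unit for headers such as total_cases is 「人」 (persons).
total_cases and new_cases carry information that can be derived from each other, so the new_cases-related columns can be ignored.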
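stringency_index: 财政紧缩指数 ("fiscal austerity index"). Note that in the OWID data this column is the government response stringency index, which measures how strict the containment policies are rather than fiscal policy, so this rendering should be revisited.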
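handwashing_facilities: 卫生设施 ("sanitation facilities")
reproduction rate: 基本传染数, 基本再生数 ("basic reproduction number").
It is the average number of other people to whom a single initial case of an infectious disease passes the disease, assuming no epidemic-control intervention and no immunity in the population.
The basic reproduction number is usually written $R_0$. The larger the value, the harder the epidemic is to control.
Without any epidemic control:
- If $R_0 < 1$, the disease gradually dies out.
- If $R_0 > 1$, the disease spreads exponentially and becomes an epidemic.
- If $R_0 = 1$, the disease becomes endemic.
See: Wikipedia, "Basic reproduction number" (基本传染数)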
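
To make the threshold at 1 concrete, here is a rough worked sketch. It rests on a deliberately simplified assumption that is not in the post above: every case infects exactly $R_0$ others in each generation, and the symbol $I_n$ (new cases in generation $n$, starting from one initial case) is introduced only for this illustration.

$$
I_0 = 1, \qquad I_{n} = R_0 \, I_{n-1} \;\Longrightarrow\; I_n = R_0^{\,n}
$$

Under this idealization, $I_n \to 0$ when $R_0 < 1$, $I_n$ stays at 1 when $R_0 = 1$, and $I_n$ grows exponentially when $R_0 > 1$. For example, with $R_0 = 2$ the tenth generation alone has $2^{10} = 1024$ new cases.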

Additions:

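icu_patients: 进入 ICU 的病例数 ("number of cases admitted to the ICU")
hosp_patients: 入院病例数 ("number of hospitalized cases")
weekly_icu_admissions: 周进入 ICU 病例数 ("weekly number of cases admitted to the ICU")
new_tests: 检测数 ("number of tests")
positive_rate: （检测）阳性率 ("(test) positive rate")
tests_units: 检测单元 ("testing units")
total_vaccinations: 接种疫苗数 ("number of vaccinations administered")
total_boosters_per_hundred: （疫苗）加强针接种数 ("number of (vaccine) booster doses administered")
excess_mortality_cumulative: 超额死亡累计数 ("cumulative number of excess deaths")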

Answer 1 · 2021-11-21T06:29:20.000Z

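Adding the effective information columns

Split all of the information into two parts: one described as "raw information" (原始信息) and the other as "additional information" (附加信息). The condition is that the additional information can be derived automatically from the raw information. In the concrete implementation that follows, the additional information will therefore not actually be stored, which saves disk space and avoids inconsistency errors.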
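
A minimal sketch of this split, under stated assumptions: the class name `CountryDay`, the choice of `totalCases` and `population` as the raw fields, and the per-million formula are all hypothetical illustrations, not the layout actually used in the project.

```java
// Hypothetical sketch: raw values are stored, derived values are computed on demand.
final class CountryDay {
    // Raw information (stored).
    private final long totalCases;
    private final long population;

    CountryDay(long totalCases, long population) {
        if (population <= 0) {
            throw new IllegalArgumentException("population must be positive");
        }
        this.totalCases = totalCases;
        this.population = population;
    }

    long totalCases() {
        return totalCases;
    }

    // Additional information (never stored): cases per million inhabitants.
    // ASSUMPTION: "per million" means value * 1,000,000 / population.
    double totalCasesPerMillion() {
        return totalCases * 1_000_000.0 / population;
    }

    public static void main(String[] args) {
        CountryDay day = new CountryDay(2_500, 5_000_000);
        System.out.println(day.totalCasesPerMillion()); // prints 500.0
    }
}
```

Because the derived value has no field of its own, it cannot drift out of sync with the raw values it is computed from.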

Answer 2 · 2021-11-21T06:40:45.000Z

The keyword "smoothed" is still unresolved; my current understanding is "denoised" (降噪).
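
One plausible reading of "smoothed" is a trailing moving average over the last few days. The sketch below assumes that reading and uses a window of 7 in the usage example; both are assumptions, since the thread leaves the exact definition open, and the names `Smoothing` and `trailingAverage` are made up for illustration.

```java
final class Smoothing {
    // Trailing moving average: out[i] is the mean of the last `window` values
    // ending at i (fewer at the start of the series, where less data exists).
    // ASSUMPTION: this is what "smoothed" means; the source does not confirm it.
    static double[] trailingAverage(double[] values, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        double[] out = new double[values.length];
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            if (i >= window) {
                sum -= values[i - window]; // drop the value that left the window
            }
            int count = Math.min(i + 1, window);
            out[i] = sum / count;
        }
        return out;
    }
}
```

With daily new-case counts in `newCases`, a call such as `Smoothing.trailingAverage(newCases, 7)` would give a candidate for a "New cases smoothed" column that can be compared against the published values.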

Answer 3 · 2021-11-25T14:48:08.000Z

If this does not display properly, click Data atom model v1125.pdf to read it.

Data atom model v1125

Last modified at 22:44, Nov. 25, 2021.

Geography with country

ISO code
continent
location

[raw information] Some ISO codes starting with 'OWID', such as 'OWID_NAM', are summary rows for a whole continent. They store the position information in the 'location' column rather than in 'continent'.

Time Information

date

Country Information

<Geography>
population
population density
median age
aged 65/70 or older
GDP per capita ❓
extreme poverty
cardiovascular death rate
diabetes prevalence
female/male smokers
handwashing facilities
hospital beds per thousand
life expectancy
human development index

Epidemic Information

<Country>
<Time>
Total cases

You can use a difference equation to derive the 'new cases' values.
Total deaths
❔Total cases per million(total deaths, new deaths,
❕Reproduction rate
Patients in ICU
Patients in hospitals
Weekly ICU admissions
Weekly hospital admissions
Total tests
❕Positive rate
❕Tests per case
❕Tests units
Total vaccinations
People vaccinated
People fully vaccinated
❓Total boosters
Stringency index
Excess mortality cumulative absolute
Excess mortality cumulative
Excess mortality
Excess mortality cumulative per million
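
The note under Total cases (deriving 'new cases' by differencing) can be sketched as follows. This is only an illustration: the names `NewCases` and `fromTotals` are hypothetical, and treating the first day's new cases as equal to its total is an assumption, since the first day has no predecessor.

```java
final class NewCases {
    // Difference equation: new[t] = total[t] - total[t-1].
    // ASSUMPTION: for t = 0 there is no previous day, so new[0] = total[0].
    static long[] fromTotals(long[] totalCases) {
        long[] newCases = new long[totalCases.length];
        for (int t = 0; t < totalCases.length; t++) {
            long previous = (t == 0) ? 0L : totalCases[t - 1];
            newCases[t] = totalCases[t] - previous;
        }
        return newCases;
    }
}
```

For cumulative totals 10, 15, 15, 22 this yields 10, 5, 0, 7. A negative result would indicate that a cumulative total was revised downward, which is worth checking before relying on the derived column.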

Graph Information

New cases smoothed
New deaths smoothed
New cases smoothed per million
New deaths smoothed per million
New tests smoothed
New vaccinations smoothed

Answer 4 · 2021-11-26T04:30:14.000Z

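I have carefully read the data-cleaning code you have uploaded so far, and I have two questions:

How efficient is reading the input with Scanner? Would switching to BufferedReader be faster?
Please add comments before the important pieces of code so they are easier to understand later. For example: does the compareIndex method group the column at the given index by country?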

Answer 5 · 2021-11-26T05:19:23.000Z

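Thanks for the questions. Here are my answers:

Scanner comes with relatively convenient preprocessing (token parsing) built in, which makes it comparatively slow; where efficiency matters, I suggest switching to BufferedReader.
Understood. In future code I will add comments wherever they are needed or the code is likely to cause confusion.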
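
A minimal sketch of the BufferedReader approach mentioned above. It is not the project's actual loader: the class name `CsvLines`, the sample row, and the assumption that no field contains a quoted comma are all illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class CsvLines {
    // Reads every line and splits it on commas.
    // ASSUMPTION: fields never contain quoted commas; real CSV may need a parser.
    static List<String[]> readRows(Reader source) {
        List<String[]> rows = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                rows.add(line.split(",", -1)); // -1 keeps trailing empty fields
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: a header line and one row with a missing value.
        String sample = "iso_code,date,total_cases\nAFG,2021-11-25,\n";
        List<String[]> rows = readRows(new StringReader(sample));
        System.out.println(rows.size()); // prints 2
    }
}
```

For a file on disk, `java.nio.file.Files.newBufferedReader(path)` can be passed in as the `Reader`. Whether this is measurably faster than Scanner on this dataset is best confirmed by timing both on the real file.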

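In addition, here is the data-processing work planned next:

Check whether the "per" columns are redundant, for example whether New cases smoothed per million is directly proportional to New cases smoothed
Clarify the meaning of the columns that have caused confusion
Deliver the final data model architecture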
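
The first item, checking whether a "per million" column is redundant, could be tested along these lines. This is a sketch under assumptions: the names `RedundancyCheck` and `isPerMillionOf` are hypothetical, the expected relation is taken to be value × 1,000,000 / population, and a relative tolerance is used because published values are likely rounded.

```java
final class RedundancyCheck {
    // Returns true if perMillion[i] matches base[i] * 1e6 / population for every i,
    // within a relative tolerance.
    // ASSUMPTION: this is the relation between a column and its "per million" twin.
    static boolean isPerMillionOf(double[] base, double[] perMillion,
                                  double population, double relTol) {
        if (base.length != perMillion.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        for (int i = 0; i < base.length; i++) {
            double expected = base[i] * 1_000_000.0 / population;
            double scale = Math.max(1.0, Math.abs(expected));
            if (Math.abs(perMillion[i] - expected) > relTol * scale) {
                return false;
            }
        }
        return true;
    }
}
```

If the check holds for every country, the per-million column can be treated as additional information and dropped from storage.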