X-lab2017/open-digger

[OSS101] Task7: Open source domain visualization dashboard

Opened this issue · 19 comments

Description

The task objective is to create an open source domain visualization dashboard based on data provided by OpenDigger or obtained independently. The dashboard will primarily display the real-time status, trends, and relationships of large-scale data, enabling users to quickly and accurately grasp the overall state of the data.

In terms of visualization libraries, popular open-source options include D3.js, Echarts, Highcharts, AntV, etc. For frontend frameworks, React, Vue, Angular, and others are highly favored.
Additionally, there are numerous data visualization and analysis tools available. Open source tools like DataEase, Superset, Metabase, as well as commercial tools like Power BI, FineBI, Tableau, etc., can assist users in analyzing data rapidly and gaining insights into business trends.
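
For orientation, here is a minimal sketch of how such a dashboard might pull a single OpenDigger metric and chart it with ECharts in TypeScript. The static-data URL pattern and the { "YYYY-MM": value } layout of openrank.json are assumptions to be checked against the OpenDigger README; the repository name and container id are placeholders.

// Minimal sketch (TypeScript + ECharts): fetch one OpenDigger metric and render a trend line.
// Assumptions: the static-data URL pattern below and the monthly-keyed layout of openrank.json;
// 'main' is a placeholder container id.
import * as echarts from 'echarts';

const repo = 'X-lab2017/open-digger'; // placeholder repository
const metricUrl = `https://oss.x-lab.info/open_digger/github/${repo}/openrank.json`;

async function renderOpenRankTrend(containerId: string): Promise<void> {
  const resp = await fetch(metricUrl);
  const data: Record<string, number> = await resp.json();
  // keep only monthly keys such as "2023-01", ignoring quarterly/yearly aggregates
  const months = Object.keys(data).filter(k => /^\d{4}-\d{2}$/.test(k)).sort();
  const chart = echarts.init(document.getElementById(containerId) as HTMLDivElement);
  chart.setOption({
    title: { text: `OpenRank trend of ${repo}` },
    xAxis: { type: 'category', data: months },
    yAxis: { type: 'value' },
    series: [{ type: 'line', data: months.map(m => data[m]) }],
  });
}

renderOpenRankTrend('main');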

Refer to:

  1. https://dataease.nzcer.cn/#/delink?link=VBUnBtSHpoJsJKnGJWsDbhuX7Js9mi2jhQvVt5nqVlyHIyn2uAvAO0h%2BJCR7OiyfoxIqnVG7F7qo8HlMBBWNYw&user=gdj4ml2CihGOrtwIFyMIW%2BKZq1l7bZBb34K%2FTg3uKiph2AvSvsG%2FzjFBZ1TSJRFu1%2B3JYQ%2F4yOmgIasmNbUUgA

  2. https://dataease.nzcer.cn/link.html?link=KcbYHnTCu1UOtKDl6xM4AReMV8OTtVNM2w/Hbv+HKmEE9qvzp9q/FIMrJWr6OREd8D8HQhKDEKKLOABsNjwS8A==&user=tFKg7LbjBr9YjnHFoJ8JqcIxGljN2DDlxW1yJ25LtQ6h0KaUxl0OLCG989V6ED+JWCQME60MXNyI//Em3K2eKA==

I'd like to ask: if I use my own data obtained independently, are there any requirements on the data?

No requirements~ You can use the data provided by OpenDigger, or supplement it with your own data~

So if I use my own data, it's enough to build a leaderboard that reflects the specifics of my own data, right?

As long as it is related to the open source ecosystem, that's fine~

When I run npm run notebook in open-digger, I get the error below:
[error screenshot]

Never mind, it works after switching to the Python version.

Actually no: with the Python version the notebook can be opened, but it cannot be run because the code uses JS syntax; and if I use JS, npm run notebook throws the error above.

It seems to be a problem with the open-digger/src/cron/tasks/issue_reaction_importer.ts file; changes in some libraries apparently mean that certain methods are no longer supported. I've posted my revised version of the file below so you can take a look as well.

const task: Task = {
  cron: '0 3 * * *',
  callback: async () => {
    const logger = getLogger('IssueReactionImporterTask');

    const config = await getConfig();

    const githubApp = new App({
      appId: config.github.appId,
      privateKey: readFileSync(config.github.appPrivateKeyPath).toString(),
    });

    const getInstallationAccessToken = async (installationId) => {
      const octokit = await githubApp.getInstallationOctokit(installationId);
      return octokit;
    };

    const gettInstalledRepos = async (): Promise<{ iid: any, id: any, name: string }[]> => {
      // get repo list from GitHub App
      const octokit = new Octokit({
        auth: `Bearer ${githubApp.octokit.auth({ type: "app" })}`,
      });
      const repos: { iid: number, id: number, name: string }[] = [];
      const installations: any[] = await octokit.paginate('GET /app/installations');
      for (const i of installations) {
        const oct = new Octokit({
          auth: `Bearer ${await getInstallationAccessToken({ installationId: i.id })}`,
        });
        const installationRepos = await oct.paginate('GET /installation/repositories');
        logger.info(`Got ${installationRepos.length} repos for installation ${i?.account?.login}`);
        repos.push(...installationRepos.map(r => ({ iid: i.id, id: r.id, name: r.full_name })));
      }
      return repos;
    };

    const getOctokitClient = async (repo: { iid: number, id: number, name: string }): Promise<Octokit> => {
      return new Octokit({
        auth: `Bearer ${await getInstallationAccessToken({
          installationId: repo.iid,
          repositoryIds: [repo.id],
        })}`,
      });
    };

These are the main parts I changed:
[screenshot of the modified sections]

When building npm run notebook with different languages, different requirements should be used; this solution may be worth referring to~ #1578

getInstallationAccessToken

Good try! The getInstallationAccessToken method seems redundant, though; it could probably be done better~
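
To make that redundancy concrete, here is a hedged sketch of one possible simplification, reusing only the calls already present in the snippet above and assuming the App was constructed with an Octokit that includes the paginate plugin: App.getInstallationOctokit() already returns a client authenticated for that installation, so the wrapper function and the Bearer interpolation could likely be dropped.

// Sketch only: list all repos the GitHub App is installed on, without the token wrapper.
const getInstalledRepos = async (): Promise<{ iid: number, id: number, name: string }[]> => {
  const repos: { iid: number, id: number, name: string }[] = [];
  // paginate installations with the app-level client
  const installations: any[] = await githubApp.octokit.paginate('GET /app/installations');
  for (const i of installations) {
    // getInstallationOctokit already returns an authenticated Octokit for this installation
    const octokit = await githubApp.getInstallationOctokit(i.id);
    const installationRepos: any[] = await octokit.paginate('GET /installation/repositories');
    repos.push(...installationRepos.map(r => ({ iid: i.id, id: r.id, name: r.full_name })));
  }
  return repos;
};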

When building npm run notebook with different languages, different requirements should be used; this solution may be worth referring to~ #1578

This solution has now been implemented and tested on Windows and in a simulated Linux environment (try https://github.com/birdflyi/open-digger/tree/master in #1589). You are welcome to test it @Peng99999 @pry666 😆

Most importantly, the README.md of every kernel has been updated in sync~

OK, I'll give it another try myself.

[error screenshot]
What is this error I get when running handbook_py.ipynb?

Switching to the Node.js kernel seems to work, but when reading from ClickHouse I still get the following error:
[error screenshot]

[error screenshot] What is this error I get when running handbook_py.ipynb?

This is because that function in the Python kernel is outdated. The relevant function is in label_data_utils.py, whose last effective maintenance was last year's commit dc29bf3f617081d10d16f3e3b3a93c3a9345c470.

I traced the file structure at that time with the following commands:

git clone git@github.com:X-lab2017/open-digger.git
cd open-digger/
git checkout -b tmp dc29bf3
head labeled_data/companies/360/index.yml

and found that the label directory structure differs from the current version:
[screenshot comparing the old and current label structures]
The current version adds an extra level, with new label divisions such as labels and platforms, which the old Python kernel's logic cannot handle properly.

Suggestion: you could try to solve this either by maintaining this kernel, or by switching to node.js, pycjs, etc.

Switching to the Node.js kernel seems to work, but when reading from ClickHouse I still get the following error: [error screenshot]

I could not reproduce this.
It seems to be related to your host setting. For the ClickHouse setup you can refer to Clickhouse-sample-data.
Note that you need to install Docker, download the data, pull the ClickHouse server image with the docker pull and docker run commands given there, and start the ClickHouse server container; then fill in the host setting in the 'local_config' file for the kernel you are using, build that kernel's environment following the corresponding steps, and start the notebook server container.
If the error still occurs, reporting more details will help to further locate and resolve the problem~
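
If it helps to isolate whether the host setting is the problem, below is a minimal connectivity check, independent of the notebook kernels. It assumes the @clickhouse/client package and that the sample-data container maps ClickHouse's HTTP interface to localhost:8123; adjust the URL to whatever host you put in your kernel's local_config.

// Minimal ClickHouse connectivity check (a sketch, not part of the open-digger kernels).
// Assumption: the sample-data container exposes the HTTP interface on localhost:8123.
import { createClient } from '@clickhouse/client';

async function pingClickHouse(): Promise<void> {
  const client = createClient({ url: 'http://localhost:8123' }); // use the same host as your local_config
  const result = await client.query({ query: 'SELECT 1', format: 'JSONEachRow' });
  console.log(await result.json()); // expect [{ "1": 1 }] if the server is reachable
  await client.close();
}

pingClickHouse().catch(err => console.error('ClickHouse not reachable:', err));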