proposal: How to add support of local testing for a new language?

Question

proposal: How to add support of local testing for a new language?

j178 opened this issue 2 years ago · 11 comments

j178 commented 2 years ago

我最近一直在思考一个更加统一、规范的 local testing 的实现，减少支持新语言所需的开发工作量，让各种语言的 local testing 表现更加一致。

以生成 C++ 版本的 two-sum 为例，我提议做出以下定义：

一个 Generator 应该生成什么？

question.md [optional]
testcases.txt
solution.h
solution_test.cpp

`question.md`

question.md 包含 Markdown 格式的题目描述，只有当开启 code.separate_descrption_file 时才生成这个文件。否则题目描述包含在代码文件中(solution.h)。

See #94 .

`testcases.txt`

testcases.txt 由多个 case 组成，case 之间由一个空行分隔。

case 包括 input 和 output 两部分，如果 input 参数有多个，每行放置一个。

示例：

target_case: 0

input:
[3,9,20,null,null,15,7]
output:
3

input:
[1,null,2]
output:
2

testcases.txt 中还可能包含一个可选的 Key-value: target_case: 0，表示只执行某一个 case。case 的编号从 1 开始，向下递增。

target_case: 0 中 0 表示执行所有 case.
也可以为负数，-1 代表最后一个 case.

`solution.h`

主要包含经过模板替换、modifier 修改后的 question code snippet，示例：

// 省略描述部分

// @lc code=begin

class Solution {
public:
    int maxDepth(TreeNode* root) {
        
    }
};

// @lc code=end

`solution_test.cpp`

solution_test.cpp 应该是一个完整的程序，每次调用执行一个独立的 case。

它从 stdin 读入完整的 case input
将 case input 反序列化为函数入参需要的类型，比如 list, list of list, ListNode, TreeNode
调用 solution function，获取 function return value
将 return value 序列化，从 stdout 输出

solution_test.cpp 只是逻辑上的概念，并不一定需要生成为单独的文件，可以与代码文件存在于同一个文件中。

示例：

// Code generated by https://github.com/j178/leetgo
#include <bits/stdc++.h>
#include "LC_IO.h"
#include "solution.h"

using namespace std;

int main(int argc, char **argv) {
	// scan input args
	TreeNode* root; cin >> root;

	// initialize object
	Solution *obj = new Solution();

	// call method
	auto &&res = obj->maxDepth(root);

        // print result
	cout << "output: " << res << endl;

	delete obj;
	return 0;
}

为避免用户自己打印到 stdout 的内容影响 output 的解析，return value 的 output 应该包含 output: 前缀，例如：

output: [0,1]

序列化和反序列化的功能实现应该存在于一个公共的 library 中，由 solution_test.cpp 引用。这个 library 应该由 Generator 的Initialize() 拷贝到 $outDir/common 中。library 的原始代码应该放在仓库的 testutils/$lang 目录下，通过 embed 打包在程序中。示例：

Click to expand

#ifndef LC_IO_H
#define LC_IO_H

#include <iostream>
#include <queue>

/**
 * Definition for a singly-linked list.
 */
struct ListNode {
	int val;
	ListNode *next;
	ListNode() : val(0), next(nullptr) {}
	ListNode(int x) : val(x), next(nullptr) {}
	ListNode(int x, ListNode *next) : val(x), next(next) {}
};

/**
 * Function for deserializing a singly-linked list.
 */
std::istream &operator>>(std::istream &is, ListNode *&node) {
	node = nullptr;
	ListNode *now = nullptr;
L0: is.ignore();
L1: switch (is.peek()) {
	case ' ':
	case ',': is.ignore(); goto L1;
	case ']': is.ignore(); goto L2;
	default : int x; is >> x;
	          now = (now ? now->next : node) = new ListNode(x);
	          goto L1;
	}
L2: switch (is.peek()) {
	case '\r': is.ignore(); goto L2;
	case '\n': is.ignore(); goto L3;
	case EOF : goto L3;
	}
L3: return is;
}

/**
 * Function for serializing a singly-linked list.
 */
std::ostream &operator<<(std::ostream &os, ListNode *node) {
	os << '[';
	while (node != nullptr) {
		os << node->val << ',';
		node = node->next;
	}
	os.seekp(-1, std::ios_base::end);
	os << ']';
	return os;
}

/**
 * Definition for a binary tree node.
 */
struct TreeNode {
	int val;
	TreeNode *left;
	TreeNode *right;
    TreeNode(int x) : val(x), left(nullptr), right(nullptr) {}
    TreeNode(int x, TreeNode *left, TreeNode *right) : val(x), left(left), right(right) {}
};

/**
 * Function for deserializing a binary tree.
 */
std::istream &operator>>(std::istream &is, TreeNode *&node) {
	std::deque<TreeNode *> dq;
L0: is.ignore();
L1: switch (is.peek()) {
	case ' ':
	case ',': is.ignore(); goto L1;
	case 'n': is.ignore(5); dq.emplace_back(nullptr);
	          goto L1;
	case ']': is.ignore(); goto L2;
	default : int x; is >> x;
	          dq.emplace_back(new TreeNode(x));
	          goto L1;
	}
L2: switch (is.peek()) {
	case '\r': is.ignore(); goto L2;
	case '\n': is.ignore(); goto L3;
	case EOF : goto L3;
	}
L3: int n = dq.size();
	for (int i = 0, j = 1; i < n; ++i) {
		auto root = dq[i];
		if (root == nullptr) { continue; }
		root->left = j < n ? dq[j] : nullptr;
		root->right = j + 1 < n ? dq[j + 1] : nullptr;
		j += 2;
	}
	node = n ? dq[0] : nullptr;
	return is;
}

/**
 * Function for serializing a binary tree.
 */
std::ostream &operator<<(std::ostream &os, TreeNode *node) {
	std::queue<TreeNode *> q;
	int cnt_not_null_nodes = 0;
	auto push = [&](TreeNode *node) {
		q.emplace(node);
		if (node != nullptr) { ++cnt_not_null_nodes; }
	};
	auto pop = [&]() {
		auto front = q.front(); q.pop();
		if (front != nullptr) {
			--cnt_not_null_nodes;
			push(front->left);
			push(front->right);
			os << front->val << ',';
		} else {
			os << "null,";
		}
	};
	os << '[';
	if (node != nullptr) {
		push(node);
		while (cnt_not_null_nodes > 0) { pop();	}
		os.seekp(-1, std::ios_base::end);
	}
	os << ']';
	return os;
}

/**
 * Function for deserializing an array.
 */
template <typename T>
std::istream &operator>>(std::istream &is, std::vector<T> &v) {
L0: is.ignore();
L1: switch (is.peek()) {
	case ' ':
	case ',': is.ignore(); goto L1;
	case ']': is.ignore(); goto L2;
	default : v.emplace_back();
	          if constexpr (std::is_same_v<T, std::string>) {
	              is >> quoted(v.back());
	          } else {
	              is >> v.back();
	          }
	          goto L1;
	}
L2: switch (is.peek()) {
	case '\r': is.ignore(); goto L2;
	case '\n': is.ignore(); goto L3;
	case EOF : goto L3;
}
L3: return is;
}

/**
 * Function for serializing an array.
 */
template <typename T>
std::ostream &operator<<(std::ostream &os, const std::vector<T> &v) {
	os << '[';
	if constexpr (std::is_same_v<T, std::string>) {
		for (auto &&x : v) { os << quoted(x) << ','; }
	} else if constexpr (std::is_same_v<T, double>) {
		for (auto &&x : v) {
			char buf[320]; sprintf(buf, "%.5f,", x); os << buf;
		}
	} else {
		for (auto &&x : v) { os << x << ','; }
	}
	os.seekp(-1, std::ios_base::end);
	os << ']';
	return os;
}

#endif

test generator

这是 leetgo pick 的工作，它负责：

初始化测试环境

包括拷贝 library 代码，初始化语言相关的 workspace (go mod init, cargo init, python -m venv 等)
生成 question.md、testcases.txt、solution.h、solution_test.cpp 文件

test runner

这是 leetgo test -L 的工作，它负责：

编译生成 executable (可选)

对于编译型语言，需要现将 solution_test.cpp 编译为 binary 才能执行。生成的 binary 应该放在 /tmp/leetgo/$questionSlug-$langSlug.exec 。
解析 testcases.txt 抽出每个 case 的 input，作为 stdin 调用 executable，捕获 executable 的 stdout 并解析出 output
比对 output 与 expected output（现阶段是直接的字符串比对，后续可能会支持更智能的判断），展示比对结果。

支持新语言需要的工作

用新语言实现参数的序列化、反序列化
实现 solution_test.$lang 的生成
实现 solution_test.$lang 的编译和调用

大家怎么看？欢迎一起讨论 @w43322 @frostming @acehinnnqru

Answer 1 · 2023-02-16T11:50:49.000Z

关于solution_test.cpp 我理解你的目的，所有语言都变成一个 stdin -> program -> stdout的测试程序。抹掉了语言的差异，但这样有两个问题：

为了让用户可以专注于逻辑的实现，leetgo需要实现一个调用层

       ┌────────────────────────────────────────────┐
       │leetgo                                      │
       │                                            │
       │        ┌─────────────────────────┐         │
       │        │ Language native         │         │
  stdin│        │                         │         │stdout
──────►├────────┼────► class Solution─────┼────────►├─────►
       │        │                         │         │
       │        └─────────────────────────┘         │
       │                                            │
       └────────────────────────────────────────────┘

理想情况下用户只需要提供 class Solution，那么leetgo则负责完成(read stdin -> decode as language native)，每个语言需要实现一个互操作层

此外，这样把测试放到黑盒里跑了，不利于用户在IDE里打断点调试（指test runner）

Answer 2 · 2023-02-16T11:52:25.000Z

不过话说回来，leetcode的测试程序基本也是按这样实现的(stdin, stdout) @yihong0618

Answer 3 · 2023-02-16T11:53:54.000Z

是的，一个 docker 的 runtime 但据说要换掉，但这个据说是我两年前听到的

Answer 4 · 2023-02-16T12:22:14.000Z

理想情况下用户只需要提供 class Solution，那么leetgo则负责完成(read stdin -> decode as language native)，每个语言需要实现一个互操作层

read stdin -> decode as language native 这应该是 solution_test.cpp 自己做的，例如这样：

int main(int argc, char **argv) {
	// scan input args
	TreeNode* root; cin >> root;

	// initialize object
	Solution *obj = new Solution();

	// call method
	auto &&res = obj->maxDepth(root);

        // print result
	cout << "$output$" << res << "$output$";

	delete obj;
	return 0;
}

leetgo 的调用层应该是：

解析 testcases.txt 抽出每个 case 的 input，作为 stdin 调用 executable，捕获 executable 的 stdout 并解析出 output

是与语言无关的。

此外，这样把测试放到黑盒里跑了，不利于用户在IDE里打断点调试（指test runner）

因为 solution_test.cpp 是一个完整的程序，所以用户依然是可以打断点调试的。只是这种情况下没有 leetgo 提供 stdin input，用户需要自己复制粘贴 case 到控制台中。

Answer 5 · 2023-02-18T06:22:34.000Z

关于这个问题我有一些想法，但最近比较忙，下周我细说一下

Answer 6 · 2023-02-26T06:08:28.000Z

这个 proposal 中的大部分设计都已经在 #116 中实现了，接下来我会来更新 #90 #111 来适配新的测试框架，谢谢各位~

Answer 7 · 2023-03-02T01:02:06.000Z

同意这个proposal，但有一个小问题，output wrapper是不是复杂一点比较好？ output: 可能还是会和用户的输出有冲突

Answer 8 · 2023-03-02T03:38:37.000Z

想了想，把 output marker 改成了 @output:, 应该足够了，也不会特别奇怪，谢谢你的建议

Answer 9 · 2023-03-02T04:45:32.000Z

或者用 @lc output，跟 code begin/end marker 一致

Answer 10 · 2023-03-03T12:10:10.000Z

1. 编译生成 executable (可选)
   对于编译型语言，需要现将 `solution_test.cpp` 编译为 binary 才能执行。生成的 binary 应该放在 `/tmp/leetgo_cache/$slug.exec` 。

请问临时目录的路径如何获取，目前项目里是否有这种函数呢

Answer 11 · 2023-03-03T13:21:30.000Z

@w43322 Hi, 我在 #126 中添加了一个 getTempBinFile 函数，能帮忙 review 吗？