billryan/algorithm-exercise

Add problem description automatically from a URL

billryan opened this issue · 2 comments

This is a new feature that changes how problem descriptions are written. Adding a problem description by hand is tedious work.
As programmers, we should avoid monotonous, repetitive tasks. Here it comes: parse_source.py!

An example of the new problem description style can be found in Gray Code.

Usage

python scripts/parse_source.py problem_url

You can pipe the standard output to your clipboard (for example with pbcopy on macOS) or redirect it to a file on disk.

Example

python scripts/parse_source.py http://www.lintcode.com/en/problem/word-ladder/

You will get the following text on standard output.

Given two words (_start_ and _end_), and a dictionary, find the length of
shortest transformation sequence from _start_ to _end_, such that:

  1. Only one letter can be changed at a time
  2. Each intermediate word must exist in the dictionary

#### Example

Given:  
_start_ = `"hit"`  
_end_ = `"cog"`  
_dict_ = `["hot","dot","dog","lot","log"]`  

As one shortest transformation is `"hit" -> "hot" -> "dot" -> "dog" -> "cog"`,  
return its length `5`.

#### Note

  * Return 0 if there is no such transformation sequence.
  * All words have the same length.
  * All words contain only lowercase alphabetic characters.

Paste it or redirect it into your markdown file, and have fun with this new feature. 👍

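Generating the markdown source for a problem from a given lintcode URL is now done. The leetcode part would be much the same, but since I currently copy problem descriptions mainly from lintcode, I won't bother with it for now. If you are interested, adapt the Lintcode class and send me a PR.
It depends mainly on two libraries. pyquery handles the HTML queries; it is very fast and works well for finding things such as the tags, the title, and the problem difficulty. html2text then parses the HTML that pyquery outputs and generates markdown.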
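The two-step pipeline described above (select a fragment with pyquery, then convert it with html2text) can be sketched roughly as follows. This is only an illustration, not the actual contents of parse_source.py: the function names `site_from_url` and `description_to_markdown`, and the idea of passing in a CSS selector, are assumptions made for the example.

```python
from urllib.parse import urlparse


def site_from_url(url):
    """Return 'lintcode' or 'leetcode' based on the URL host (hypothetical helper)."""
    host = (urlparse(url).hostname or "").lower()
    for site in ("lintcode", "leetcode"):
        if site in host.split("."):
            return site
    raise ValueError("unsupported problem URL: " + url)


def description_to_markdown(page_html, selector):
    """Pick one fragment of the page with pyquery and convert it with html2text.

    `selector` is a CSS selector supplied by the caller; the selector that
    matches the description on the real site is not known here.
    """
    # Third-party imports are kept local so site_from_url works without them.
    from pyquery import PyQuery
    import html2text

    # .html() returns the inner HTML of the first match, or None if nothing matches.
    fragment = PyQuery(page_html)(selector).html() or ""
    return html2text.HTML2Text().handle(fragment).strip()
```

A call might look like `description_to_markdown(page_html, "#description")`, where `"#description"` is a placeholder selector. Dispatching on the host name is one way a leetcode variant could later sit next to the lintcode one.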

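Future plans:

  1. Rework the earlier writing style, especially the problem description sections.
  2. Add tags to make categorization easier. This will use Gitbook's glossary feature, and the tags will also be generated automatically by the program.
  3. Add difficulty labels. These have to be parsed from leetcode, because the ones on lintcode are too inaccurate. The feature is finished but not yet rolled out.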

Enhancements for html2text:

  • `<pre><code></code></pre>`
  • `<sup></sup>`