Could you implement an example of an archive website such as headbook or Weibo?
Opened this issue · 2 comments
Could you implement an example of an archive website such as headbook or Weibo?
I encountered some problems in the implementation process. For example, during the implementation of Weibo API, the Save option prompts backup processing, and the post content is not displayed after refresh.
Here are some code paths and contents I changed.
- packages\backend\src\Utility\parsePostId\parsePostId.ts
import { getUrlLastSegment } from '../../Utility/urlParse/getLastSegment';
.......
.......
export async function parsePostId(urlStr: string): Promise<string> {
return getUrlLastSegment(urlStr);
- packages/backend/src/Components/Post/Service/postApi.ts
function getPostApi(postId: string): AxiosPromise {
return crawlerAxios.get(`https://weibo.com/ajax/statuses/longtext?id=${postId}`);
- packages/backend/src/Components/Post/Service/postCrawler.ts
private transformData(res: any): {
repostingId: string;
postInfo: IPost;
userRaw: unknown;
embedImages: string[];
} {
/* extract the information here. If it's a html document,
try to manipulate the html with cheerio */
return res.postInfo;
Running platform : Gitpod
Hi @GoYiz , This is the instance of headbook https://github.com/Combo819/headbook-archiver
The transformData
is obviously incomplete. It should return an object of this type
{
repostingId: string;
postInfo: IPost;
userRaw: unknown;
embedImages: string[];
}
but you just return res.postInfo
.
You should make some transformation of the res
object, and convert it into the type IPost
. You can check the example here https://github.com/Combo819/headbook-archiver/blob/master/packages/backend/src/Components/Post/Service/postCrawler.ts#L162
The res
from Weibo may have very different structure. So you need to write your own transformation based on the res
structure.
BTW, if you're trying to implement a weibo instance, m.weibo.cn
is easier than weibo.com
.
You can also assign https://weibo.com
to BASE_URL
in https://github.com/Combo819/headbook-archiver/blob/master/packages/backend/src/Config/constants.ts#L1
So you can write the API function like
return crawlerAxios.get(`/ajax/statuses/longtext?id=${postId}`);