ControlNet/wt-data-project.data

Bypass Thunderskill

Opened this issue · 24 comments

There are a lot of complaints about the data on Thunderskill being selective and not representative of actual general performance.
Would you be interested in bypassing Thunderskill and collecting the data directly?
That way all games could be parsed and we could avoid arguments against the validity of the data.

Hi Bearddyy, is there any way we can do that legally?

@ControlNet We can extract scores etc. from replay files downloaded from their website.
That's no less legal than doing the same thing from Thunderskill, right?

Kind of interested in this. Any update?
My opinion is to find a way to extract the data from the original API. For example, in the game client we can see someone's profile; that data must come from somewhere.
I once obtained the real API URL for the game data through packet capture, but I don't know how to use it.

@axiangcoding Actually, someone has tried that, but it's not feasible here. If you want to download all the replay files from WT's official replay website, you need around 2~4 Gbit/s of bandwidth, and they will ban your IP if you download too much. So it's not a good way.

From that person's analysis, the data from the official replay website and the data collected from Thunderskill are strongly correlated, so the current data is still fine for some analysis.
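For what it's worth, that kind of agreement between two sources is easy to sanity-check yourself with a Pearson correlation. A minimal sketch in Python; the win-rate numbers below are made-up placeholders, not real data from either site:

```python
# Sanity-check how strongly two data sources agree using Pearson correlation.
# The per-vehicle win rates below are made-up placeholders, NOT real WT data.

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical win rates for the same five vehicles from the two sources.
official = [52.1, 48.3, 61.0, 45.5, 57.2]
thunderskill = [51.4, 49.0, 60.2, 46.1, 56.8]

r = pearson(official, thunderskill)
print(f"r = {r:.3f}")  # a value close to 1.0 means the sources agree
```

An r close to 1.0 across many vehicles would support using the Thunderskill numbers as a proxy for the official data.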

I believe you replied to the wrong guy...

No... I guess the person who contacted me via email was Bearddyy, so I shared the information here to let you know.

@ControlNet Okay then. I remember that the replay files are binary or encrypted. Is there a way to decrypt them now?

@axiangcoding I see Bearddyy's repository can handle it. Please have a look: https://github.com/Bearddyy/wtparser

Thanks. He has really made some progress on this.

I have found a way to get full player data without even needing an auth header, buuutttt it uses protobuf, and I need to transform compiled definitions to a file to be able to use it.
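For anyone stuck on the protobuf part: even without the compiled definitions you can walk the wire format by hand, since every field is just a varint tag (field number plus wire type) followed by a payload. A minimal sketch that handles the two most common wire types; the sample bytes are the classic protobuf documentation example, not real WT data:

```python
# Decode protobuf wire format without a .proto schema. Each field starts with
# a varint tag = (field_number << 3) | wire_type. This sketch handles
# wire type 0 (varint) and wire type 2 (length-delimited: strings, bytes,
# nested messages); fixed32/fixed64 are left out for brevity.

def read_varint(buf, pos):
    """Read one base-128 varint from buf starting at pos; return (value, new_pos)."""
    result, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7

def decode_fields(buf):
    """Yield (field_number, wire_type, value) triples from a serialized message."""
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field, wtype = tag >> 3, tag & 0x7
        if wtype == 0:                       # varint
            value, pos = read_varint(buf, pos)
        elif wtype == 2:                     # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unhandled wire type {wtype}")
        yield field, wtype, value

# Example message: field 1 = varint 150, field 2 = string "wt"
sample = bytes([0x08, 0x96, 0x01, 0x12, 0x02]) + b"wt"
for field, wtype, value in decode_fields(sample):
    print(field, wtype, value)
```

`protoc --decode_raw < file` does the same thing from the shell (with field numbers in place of names), which is handy for a first look at an unknown response body.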

I've also found some very useful endpoints, though, such as searching for player names (I've scraped for them too; I'll add the link soon) and fetching news.
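For context on what probing an endpoint like that looks like in practice, here is a sketch that only builds the query URL; the host and parameter names are invented placeholders, not the real API:

```python
# Build a query URL for a hypothetical player-search endpoint.
# HOST and the "name"/"limit" parameters are placeholders, NOT the real API.
from urllib.parse import urlencode

HOST = "https://example-wt-api.invalid"  # placeholder host

def search_url(nick, limit=10):
    """Return the GET URL that would search players by nickname."""
    query = urlencode({"name": nick, "limit": limit})
    return f"{HOST}/players/search?{query}"

print(search_url("SomePilot"))
# A real client would then issue the request (e.g. with urllib.request)
# and decode the response body, which may be protobuf rather than JSON.
```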

Mind sharing what that API is and how to use it? I tried to capture network packets, but it's a CDN URL based on AWS, so I'm not sure I can use it.

Looking forward to seeing those links!

@axiangcoding
It's from the assistant app

I'll make a public postman workspace and link it o7

@RaidFourms Thanks in advance for sharing! It helps a lot.

Thanks for sharing. Looking forward to your works.

I didn't email you; it must have been someone else.
As for the data rate limit, it could potentially be circumvented by distributing the scripts across VMs so that each one downloads less. Also, I have found that each replay has 2 types of files that alternate, so I suspect the data rate is further reduced. But again, it could just be blocked by Gaijin; it would need a fair amount of automation.

I would imagine proxies would be much more efficient
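A round-robin proxy pool like that is only a few lines to sketch, so each download goes out through a different exit IP; the proxy addresses below are placeholders:

```python
# Round-robin proxy rotation so no single IP hammers the replay server.
# The proxy addresses are placeholders, not real endpoints.
from itertools import cycle

PROXIES = cycle([
    "http://10.0.0.1:8080",
    "http://10.0.0.2:8080",
    "http://10.0.0.3:8080",
])

def proxy_for_next_request():
    """Return the proxy dict for the next download (urllib/requests style)."""
    addr = next(PROXIES)
    return {"http": addr, "https": addr}

# Each replay download would use the next proxy in the pool:
for replay_id in range(5):
    print(replay_id, proxy_for_next_request()["http"])
```

In practice you would still want a per-proxy rate limit on top of this, since the ban is presumably tracked per IP.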

Also, here is the username/userid scraper. Very inefficient, though 😭

Thanks for this, I didn't even know the app existed.
I think there's potentially a fair amount of data that could be scraped from that endpoint, but the specific data I was interested in, like per-vehicle or per-nation performance, doesn't appear to be available from navigating around the app. It looks like similar data to the user page.

wdym?

@axiangcoding Any update on it yet?

Sorry, I'm not very familiar with protobuf, so I need some time to try it out. I don't have much time recently, but I will share any progress with you.

I don't know much either lol

i'll still keep you updated o7

Hi guys, I have started a repo, https://github.com/axiangcoding/wt-profile-tool, to create a WT profile parsing library based on @RaidFourms's information and help. For now this is just the beginning.

I think I'll be able to parse out the player vehicle data in the near future.

Thank you all for sharing the information, it will greatly advance this work.


In fact, I am good at web server development, but I know nothing about parsing APKs and things like that. Special thanks to @RaidFourms for the hard work on this.