rebane2001/matterport-dl

[Bug]: When behind the great china firewall, matterport-dl.py fails

Closed this issue · 13 comments

One or more sample matterport IDs / urls that reproduce the issue

7aPE8e7JXFU

Python version

3.12.1 & 3.12.6

Steps to reproduce

pip install -r requirements.txt
python matterport-dl.py 7aPE8e7JXFU

What went wrong

Downloading capture of 7aPE8e7JXFU with base page... https://my.matterport.com/show/?m=7aPE8e7JXFU
Traceback (most recent call last):
File "D:\matterport-dl-main\matterport-dl.py", line 1206, in <module>
asyncio.run(initiateDownload(pageId))
File "C:\Users\xg\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "C:\Users\xg\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xg\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 684, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "D:\matterport-dl-main\matterport-dl.py", line 840, in initiateDownload
await downloadCapture(getPageId(url))
File "D:\matterport-dl-main\matterport-dl.py", line 614, in downloadCapture
staticbase = re.search(r'', base_page_text).group(1) # type: ignore - may be None
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'group'

Workarounds

I tried a different Python version and a different computer, but the download still failed.

Ha, I added the type-ignore comment right there to ignore the fact that it might be None, but clearly it can happen ;)
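For anyone hitting the same traceback: the crash comes from calling .group() on the None that re.search returns when nothing matches. Below is a minimal sketch of guarding such a lookup. The helper name first_match and its error message are hypothetical and are not part of matterport-dl; the real patterns used by the script are not reproduced here.

```python
import re


def first_match(pattern: str, text: str, what: str, group: int = 0) -> str:
    """Return one group of the first match, or raise a descriptive error.

    Hypothetical helper: it replaces the bare re.search(...).group(...)
    call so that a missing match is reported by name instead of
    surfacing as AttributeError on NoneType.
    """
    match = re.search(pattern, text)
    if match is None:
        # re.search returns None when the pattern is not found in the text
        raise ValueError(f"Could not find {what} in the downloaded page")
    return match.group(group)
```

With something like this, a page that lacks the expected markup would fail with a message naming the missing piece, which makes reports such as the one above easier to triage.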

It is odd that you get this result repeatedly, though; this is not typically a failure point. Can you attach the file .\downloads\7aPE8e7JXFU\index.html from your PC to this ticket?

It does work for me, so something must be a bit different about your download.

python matterport-dl.py 7aPE8e7JXFU
Downloading capture of 7aPE8e7JXFU with base page... https://my.matterport.com/show/?m=7aPE8e7JXFU
Downloading graph model data...
Doing advanced download of dollhouse/floorplan data...
Going to download tileset 3d asset models
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 49.18it/s]
Downloading textures and previews for tileset 3d models
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 507/507 [00:00<00:00, 2268.41it/s]
Downloading static files...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 332/332 [00:00<00:00, 346.82it/s]
Downloading model info...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 1117.81it/s]
Downloading plugins...
Downloading images...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 1799.14it/s]
Downloading primary model assets...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11730/11730 [00:38<00:00, 302.09it/s]
Done, Total fetches: 12647 Skipped: 3603 (28%) actual Request: 9044 (72%) Success: 16 (0%) Failed403: 8999 (71%) Failed404: 24 (0%) FailedUnknown: 5 (0%)!

I have attached the index.html file, please check it. @mitchcapper
7aPE8e7JXFU.zip

OK, so I think the issue is that Matterport seems to be redirecting IPs that are behind the GFW (I am guessing) to a Chinese server. I have tried a quick hack, so download the latest version of matterport-dl and try again.

I have no way of testing this and somewhat limited desire to debug downloading from behind the GFW. If this does not solve the problem, I recommend trying a proxy based outside of China; I think that will fix it.

When I use a Chinese IP, I still cannot download and there are the following errors:

Downloading capture of 7aPE8e7JXFU with base page... https://my.matterport.com/show/?m=7aPE8e7JXFU
Traceback (most recent call last):
File "D:\matterport-dl-main (2)\matterport-dl-main\matterport-dl.py", line 1217, in <module>
asyncio.run(initiateDownload(pageId))
~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\xg\AppData\Local\Programs\Python\Python313\Lib\asyncio\runners.py", line 194, in run
return runner.run(main)
~~~~~~~~~~^^^^^^
File "C:\Users\xg\AppData\Local\Programs\Python\Python313\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^
File "C:\Users\xg\AppData\Local\Programs\Python\Python313\Lib\asyncio\base_events.py", line 721, in run_until_complete
return future.result()
~~~~~~~~~~~~~^^
File "D:\matterport-dl-main (2)\matterport-dl-main\matterport-dl.py", line 851, in initiateDownload
await downloadCapture(getPageId(url))
File "D:\matterport-dl-main (2)\matterport-dl-main\matterport-dl.py", line 627, in downloadCapture
threeMin = re.search(rf"https://static.{BASE_MATTERPORT_DOMAIN}/webgl-vendors/three/[a-z0-9\-_/.]*/three.min.js", base_page_text).group() # type: ignore - may be None
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'group'

When I use a proxy, I still cannot download successfully.

@mitchcapper

OK, a few more China fixes. Try matterport-dl.py from https://github.com/mitchcapper/matterport-dl/tree/china_server_support_v2

Please post the run_report.log and index.html from the proxy run along with the log output for when you are using a proxy.

Thank you for your prompt support. I have already used china_server_support_v2, but I still cannot download it. I have attached separate log files for the run without a proxy and the run with a proxy. @mitchcapper
roWLLMMmPL8.zip
roWLLMMmPL8-proxy.zip

How are you using your proxy? Are you sure it is outside of China? The log file says it is still detecting the Chinese redirect in the main page.

Yes, I confirm that my proxy address is outside of China. The proxy address currently in use is in the United States. @mitchcapper

Is the entire computer under VPN, or is this a browser plugin? Are you using the --proxy option with matterport-dl.py? It is odd, as the proxy zip file you sent still had the China redirect (if you open the html file you can see <base href="https://static.matterportvr.cn/showcase/24.9.2_webgl-414-gfb230a4879/"> at the top; matterportvr.cn is not the normal server).
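To check a saved index.html for this yourself, here is a minimal sketch that reads the host out of the base tag and compares it against matterportvr.cn. The function names are hypothetical and this is not how matterport-dl itself performs the detection; it only assumes the base tag looks like the one quoted above.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Domain seen in the <base href> of the redirected page quoted above
CHINA_DOMAIN = "matterportvr.cn"


def base_href_host(html: str) -> Optional[str]:
    """Return the hostname from the first <base href="..."> tag, if any."""
    match = re.search(r'<base\s+href="([^"]+)"', html, re.IGNORECASE)
    if match is None:
        return None
    return urlparse(match.group(1)).hostname


def is_china_redirect(html: str) -> bool:
    """True when the page's base URL points at the China domain."""
    host = base_href_host(html)
    if host is None:
        return False
    # Accept the bare domain or any subdomain such as static.matterportvr.cn
    return host == CHINA_DOMAIN or host.endswith("." + CHINA_DOMAIN)
```

Running is_china_redirect on the contents of downloads\<id>\index.html after a proxied run is a quick way to tell whether the proxy was actually used for the main page request.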

I added some more logging so download from that same url again, and make sure to delete the folder in downloads before each run so it is clean.

Hello, the latest version still cannot download. In the last test, the entire computer was under VPN and the --proxy option was not used with matterport-dl.py.
I purchased a Windows server located in China for easier testing.
Remote Desktop Connection Address: 122.114.52.43:33890
Username: Administrator
Password: Your GitHub username
Please login and change your password
Attention: This is a temporary server and will be unavailable at 17:55, September 16, 2024, UTC-8. Afterwards, I can provide you with a new server. @mitchcapper

I was able to log in and have changed the password. You can find my email on my GH profile as well.

Success, the server was quite helpful. Download my branch again; I think it should work now. Once you confirm, I'll merge into main here.

matterport-dl [china_server_support_v2 +0 ~1 -0 !]> python matterport-dl.py 7aPE8e7JXFU
Started up a download run Running python 3.13.0rc2 on win32 with matterport-dl version: refs/heads/china_server_support_v2 (55d5091f7f21d2635ab2100de035c7f161cab938)
Downloading capture of 7aPE8e7JXFU with base page... https://my.matterport.com/show/?m=7aPE8e7JXFU
Chinese matterport url found in main page, will try China server, note if this does not work try a proxy outside china
Downloading graph model data...
Doing advanced download of dollhouse/floorplan data...
Going to download tileset 3d asset models
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:00<00:00, 85.18it/s]
Downloading textures and previews for tileset 3d models
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 507/507 [00:05<00:00, 87.85it/s]
Downloading static files...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 332/332 [00:07<00:00, 42.97it/s]
Downloading model info...
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 453.86it/s]
Downloading plugins...
Downloading images...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 95.67it/s]
Downloading primary model assets...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 11730/11730 [07:03<00:00, 27.69it/s]
Done, Total fetches: 12648 Skipped: 513 (4%) actual Request: 12135 (96%) Success: 3107 (25%) Failed403: 9004 (71%) Failed404: 24 (0%) FailedUnknown: 0 (0%)!
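The percentages in the final summary line appear to be relative to Total fetches. A minimal sketch for parsing such a line and recomputing them, useful when comparing runs; the function names are hypothetical and the line format is assumed to match the output shown above.

```python
import re
from typing import Dict


def parse_summary(line: str) -> Dict[str, int]:
    """Extract the 'Name: count' pairs from a 'Done, ...' summary line."""
    pairs = re.findall(r"([A-Za-z][A-Za-z0-9 ]*?):\s*(\d+)", line)
    return {name.strip(): int(count) for name, count in pairs}


def percent_of_total(stats: Dict[str, int], key: str) -> int:
    """Share of 'Total fetches' for one counter, rounded to a whole percent."""
    return round(100 * stats[key] / stats["Total fetches"])


summary = (
    "Done, Total fetches: 12648 Skipped: 513 (4%) actual Request: 12135 (96%) "
    "Success: 3107 (25%) Failed403: 9004 (71%) Failed404: 24 (0%) FailedUnknown: 0 (0%)!"
)
stats = parse_summary(summary)
for key in ("Skipped", "actual Request", "Success", "Failed403"):
    print(f"{key}: {stats[key]} ({percent_of_total(stats, key)}%)")
```

In this line, Skipped plus actual Request adds up to Total fetches, which is consistent with reading every percentage as a share of the total.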

Great! I have tried this branch and it downloaded and ran successfully. Thank you. @mitchcapper