Image
映像檔,可以把它想成是以前我們在玩 VM 的 Guest OS( 安裝在虛擬機上的作業系統 )。
Image 是唯讀( R\O )
Container
容器,利用映像檔( Image )所創造出來的,一個 Image 可以創造出多個不同的 Container,
Container 也可以被啟動、開始、停止、刪除,並且互相分離。
查看目前 images
docker image ls
刪除 Image
docker rmi [OPTIONS] IMAGE [IMAGE...]
查看目前運行的 container
docker ps
查看目前全部的 container( 包含停止狀態的 container )
docker ps -a
新建並啟動 Container
docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]
啟動 Container
docker start [OPTIONS] CONTAINER [CONTAINER...]
停止 Container
docker stop [OPTIONS] CONTAINER [CONTAINER...]
重新啟動 Container
docker restart [OPTIONS] CONTAINER [CONTAINER...]
删除 Container
docker rm [OPTIONS] CONTAINER [CONTAINER...]
離開 Container 但任務繼續執行
先 Ctrl-p 再 Ctrl-q.
進入 Container
docker exec [OPTIONS] CONTAINER COMMAND [ARG...]
docker exec -it <Container ID> bash (以root進入container)
docker exec -it --user $(id -u):$(id -g) <Container ID> bash (以自己的user name進入container)
查看目前 images
可以看到目前已有許多不同版本的Pytorch以及Tensorflow可供使用
r07127@Velkoz:~$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
nvidia/cuda 10.0-cudnn7-devel-ubuntu16.04 edbd625b9042 2 days ago 3.11GB
nvidia/cuda 10.0-cudnn7-devel-ubuntu18.04 bdc0497c2295 2 days ago 3.07GB
ubuntu latest 3556258649b2 9 days ago 64.2MB
tensorflow/tensorflow 1.13.2-gpu-py3 b5408c35298a 2 weeks ago 3.35GB
tensorflow/tensorflow 1.14.0-gpu-py3 a7a1861d2150 5 weeks ago 3.51GB
pytorch/pytorch 1.1.0-cuda10.0-cudnn7.5-devel 1be771bff6d1 3 months ago 6.94GB
pytorch/pytorch 1.0.1-cuda10.0-cudnn7-devel 1d67d1a473d2 5 months ago 6.17GB
pytorch/pytorch 1.0-cuda10.0-cudnn7-devel fa5b91571a44 7 months ago 5.45GB
tensorflow/tensorflow 1.12.0-gpu-py3 413b9533f92a 8 months ago 3.35GB
tensorflow/tensorflow 1.11.0-gpu-py3 888a203f096f 10 months ago 3.27GB
tensorflow/tensorflow 1.10.0-gpu-py3 0b4ceed1758b 11 months ago 3.08GB
pytorch/pytorch 0.4.1-cuda9-cudnn7-devel 3febd5b72fb0 12 months ago 5.23GB
pytorch/pytorch 0.4-cuda9-cudnn7-devel 63994d8624a2 13 months ago 4.71GB
查看目前全部的 container
可以發現目前沒有任何被建立的container
r07127@Velkoz:~$ docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
新建並啟動 Container (先用root權限創建並安裝所需套件) 以pytorch 1.1.0-cuda10.0-cudnn7.5-devel為範例
r07127@Velkoz:~$ docker run --gpus all --name=r07127_pytorch_1.1 -it --shm-size 16G -v /home/r07127:/r07127 pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel bash
--gpus all
使用全部的GPU--name
設定 container 的名稱-i
讓容器的標準輸入保持打開-t
讓Docker分配一個虛擬終端(pseudo-tty)並綁定到容器的標準輸入上-it
-i -t 的合併寫法, 搞不懂這2個在幹嘛, 打上去就對了--shm-size 16G
設定container中/dev/shm 的大小為16G, 沒有設定的話預設為64M, 此大小在pytorch中會有error-v /home/r07127:/r07127
把Velkoz中/home/r07127的資料夾link到Container中的/r07127資料夾pytorch/pytorch:1.1.0-cuda10.0-cudnn7.5-devel
要用來啟動container的Image檔bash
你要執行的指令, 可以先打bash, 進入container後在下其他指令
輸入pip list
確定pytorch版本為1.1.0並安裝自己所需的套件
root@336dc392dea0:/workspace# pip list
Package Version
---------------- --------
asn1crypto 0.24.0
backcall 0.1.0
beautifulsoup4 4.7.1
certifi 2019.3.9
cffi 1.12.3
chardet 3.0.4
conda 4.6.14
conda-build 3.17.8
cryptography 2.6.1
decorator 4.4.0
filelock 3.0.10
glob2 0.6
idna 2.8
ipython 7.5.0
ipython-genutils 0.2.0
jedi 0.13.3
Jinja2 2.10.1
libarchive-c 2.8
lief 0.9.0
MarkupSafe 1.1.1
mkl-fft 1.0.12
mkl-random 1.0.2
numpy 1.16.3
olefile 0.46
parso 0.4.0
pexpect 4.7.0
pickleshare 0.7.5
Pillow 6.0.0
pip 19.1
pkginfo 1.5.0.1
prompt-toolkit 2.0.9
psutil 5.6.2
ptyprocess 0.6.0
pycosat 0.6.3
pycparser 2.19
Pygments 2.3.1
pyOpenSSL 19.0.0
PySocks 1.6.8
pytz 2019.1
PyYAML 5.1
requests 2.21.0
ruamel-yaml 0.15.46
setuptools 41.0.1
six 1.12.0
soupsieve 1.8
torch 1.1.0
torchvision 0.2.2
tqdm 4.31.1
traitlets 4.3.2
urllib3 1.24.2
wcwidth 0.1.7
wheel 0.33.1
輸入nvcc --version
確認cuda版本為10.0
root@336dc392dea0:/workspace# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2018 NVIDIA Corporation
Built on Sat_Aug_25_21:08:01_CDT_2018
Cuda compilation tools, release 10.0, V10.0.130
確認以上版本都是自己需要的以後
移動到自己link進container的資料夾, 並且執行自己的程式(此處以pytorch_mnist.py為例)
root@336dc392dea0:/workspace# cd ../r07127/Docker-Tutorial/
root@336dc392dea0:/r07127/Docker-Tutorial# python pytorch_mnist.py
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./MNIST/MNIST/raw/train-images-idx3-ubyte.gz
97%|#########################################################################################################1 | 9650176/9912422 [00:12<00:00, 2049268.76it/s]Extracting ./MNIST/MNIST/raw/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./MNIST/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ./MNIST/MNIST/raw/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./MNIST/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ./MNIST/MNIST/raw/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./MNIST/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ./MNIST/MNIST/raw/t10k-labels-idx1-ubyte.gz
Processing...
Done!
Train Epoch: 1 [0/60000 (0%)] Loss: 2.300039
Train Epoch: 1 [640/60000 (1%)] Loss: 2.213450
Train Epoch: 1 [1280/60000 (2%)] Loss: 2.170429
Train Epoch: 1 [1920/60000 (3%)] Loss: 2.076518
Train Epoch: 1 [2560/60000 (4%)] Loss: 1.868044
Train Epoch: 1 [3200/60000 (5%)] Loss: 1.413765
Train Epoch: 1 [3840/60000 (6%)] Loss: 1.000020
Train Epoch: 1 [4480/60000 (7%)] Loss: 0.775462
Train Epoch: 1 [5120/60000 (9%)] Loss: 0.460300
Train Epoch: 1 [5760/60000 (10%)] Loss: 0.486124
Train Epoch: 1 [6400/60000 (11%)] Loss: 0.437753
Train Epoch: 1 [7040/60000 (12%)] Loss: 0.408149
Train Epoch: 1 [7680/60000 (13%)] Loss: 0.458510
Train Epoch: 1 [8320/60000 (14%)] Loss: 0.427587
Train Epoch: 1 [8960/60000 (15%)] Loss: 0.398676
Train Epoch: 1 [9600/60000 (16%)] Loss: 0.386188
Train Epoch: 1 [10240/60000 (17%)] Loss: 0.298156
Train Epoch: 1 [10880/60000 (18%)] Loss: 0.502031
Train Epoch: 1 [11520/60000 (19%)] Loss: 0.523246
Train Epoch: 1 [12160/60000 (20%)] Loss: 0.339011
Train Epoch: 1 [12800/60000 (21%)] Loss: 0.367665
Train Epoch: 1 [13440/60000 (22%)] Loss: 0.451440
Train Epoch: 1 [14080/60000 (23%)] Loss: 0.304708
Train Epoch: 1 [14720/60000 (25%)] Loss: 0.358396
Train Epoch: 1 [15360/60000 (26%)] Loss: 0.331111
Train Epoch: 1 [16000/60000 (27%)] Loss: 0.440046
Train Epoch: 1 [16640/60000 (28%)] Loss: 0.363632
Train Epoch: 1 [17280/60000 (29%)] Loss: 0.314917
Train Epoch: 1 [17920/60000 (30%)] Loss: 0.200644
Train Epoch: 1 [18560/60000 (31%)] Loss: 0.500168
Train Epoch: 1 [19200/60000 (32%)] Loss: 0.326233
Train Epoch: 1 [19840/60000 (33%)] Loss: 0.120071
Train Epoch: 1 [20480/60000 (34%)] Loss: 0.189754
Train Epoch: 1 [21120/60000 (35%)] Loss: 0.139856
Train Epoch: 1 [21760/60000 (36%)] Loss: 0.312687
Train Epoch: 1 [22400/60000 (37%)] Loss: 0.150788
Train Epoch: 1 [23040/60000 (38%)] Loss: 0.288335
Train Epoch: 1 [23680/60000 (39%)] Loss: 0.467620
9920512it [00:29, 2049268.76it/s] Train Epoch: 1 [24320/60000 (41%)] Loss: 0.215443
Train Epoch: 1 [24960/60000 (42%)] Loss: 0.153275
Train Epoch: 1 [25600/60000 (43%)] Loss: 0.224636
Train Epoch: 1 [26240/60000 (44%)] Loss: 0.263891
Train Epoch: 1 [26880/60000 (45%)] Loss: 0.233028
Train Epoch: 1 [27520/60000 (46%)] Loss: 0.264711
Train Epoch: 1 [28160/60000 (47%)] Loss: 0.210201
Train Epoch: 1 [28800/60000 (48%)] Loss: 0.132909
Train Epoch: 1 [29440/60000 (49%)] Loss: 0.277988
Train Epoch: 1 [30080/60000 (50%)] Loss: 0.094118
Train Epoch: 1 [30720/60000 (51%)] Loss: 0.128409
Train Epoch: 1 [31360/60000 (52%)] Loss: 0.246995
Train Epoch: 1 [32000/60000 (53%)] Loss: 0.336425
Train Epoch: 1 [32640/60000 (54%)] Loss: 0.153679
Train Epoch: 1 [33280/60000 (55%)] Loss: 0.091901
Train Epoch: 1 [33920/60000 (57%)] Loss: 0.145015
Train Epoch: 1 [34560/60000 (58%)] Loss: 0.197816
Train Epoch: 1 [35200/60000 (59%)] Loss: 0.218493
Train Epoch: 1 [35840/60000 (60%)] Loss: 0.062733
Train Epoch: 1 [36480/60000 (61%)] Loss: 0.135493
Train Epoch: 1 [37120/60000 (62%)] Loss: 0.116680
Train Epoch: 1 [37760/60000 (63%)] Loss: 0.236761
Train Epoch: 1 [38400/60000 (64%)] Loss: 0.063430
Train Epoch: 1 [39040/60000 (65%)] Loss: 0.106552
Train Epoch: 1 [39680/60000 (66%)] Loss: 0.160519
Train Epoch: 1 [40320/60000 (67%)] Loss: 0.107377
Train Epoch: 1 [40960/60000 (68%)] Loss: 0.177162
Train Epoch: 1 [41600/60000 (69%)] Loss: 0.229493
Train Epoch: 1 [42240/60000 (70%)] Loss: 0.073655
Train Epoch: 1 [42880/60000 (71%)] Loss: 0.153484
Train Epoch: 1 [43520/60000 (72%)] Loss: 0.277725
Train Epoch: 1 [44160/60000 (74%)] Loss: 0.142475
Train Epoch: 1 [44800/60000 (75%)] Loss: 0.115020
Train Epoch: 1 [45440/60000 (76%)] Loss: 0.120996
Train Epoch: 1 [46080/60000 (77%)] Loss: 0.077347
Train Epoch: 1 [46720/60000 (78%)] Loss: 0.194605
Train Epoch: 1 [47360/60000 (79%)] Loss: 0.069361
Train Epoch: 1 [48000/60000 (80%)] Loss: 0.211329
Train Epoch: 1 [48640/60000 (81%)] Loss: 0.138081
Train Epoch: 1 [49280/60000 (82%)] Loss: 0.093662
Train Epoch: 1 [49920/60000 (83%)] Loss: 0.107775
Train Epoch: 1 [50560/60000 (84%)] Loss: 0.120280
Train Epoch: 1 [51200/60000 (85%)] Loss: 0.146696
Train Epoch: 1 [51840/60000 (86%)] Loss: 0.068481
Train Epoch: 1 [52480/60000 (87%)] Loss: 0.024163
Train Epoch: 1 [53120/60000 (88%)] Loss: 0.260879
Train Epoch: 1 [53760/60000 (90%)] Loss: 0.092017
Train Epoch: 1 [54400/60000 (91%)] Loss: 0.129662
Train Epoch: 1 [55040/60000 (92%)] Loss: 0.189733
Train Epoch: 1 [55680/60000 (93%)] Loss: 0.034060
Train Epoch: 1 [56320/60000 (94%)] Loss: 0.035735
Train Epoch: 1 [56960/60000 (95%)] Loss: 0.078727
Train Epoch: 1 [57600/60000 (96%)] Loss: 0.115442
Train Epoch: 1 [58240/60000 (97%)] Loss: 0.192014
Train Epoch: 1 [58880/60000 (98%)] Loss: 0.206907
Train Epoch: 1 [59520/60000 (99%)] Loss: 0.063385
Test set: Average loss: 0.1013, Accuracy: 9668/10000 (97%)
...(略)
在執行pytorch_mnist.py的同時可以觀察2顆GPU是否正常運作
r07127@Velkoz:~$ nvidia-smi -l 1
Fri Aug 2 01:23:56 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 415.27 Driver Version: 415.27 CUDA Version: 10.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 108... Off | 00000000:01:00.0 Off | N/A |
| 0% 54C P2 67W / 275W | 581MiB / 11177MiB | 14% Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 108... Off | 00000000:02:00.0 Off | N/A |
| 0% 45C P2 62W / 275W | 565MiB / 11178MiB | 9% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1296 G /usr/lib/xorg/Xorg 9MiB |
| 0 1342 G /usr/bin/gnome-shell 6MiB |
| 0 23543 C python 553MiB |
| 1 23543 C python 553MiB |
+-----------------------------------------------------------------------------+