Document
- The simple C version, written by Eric
- Overlap Data Transfers in CUDA
A CNN accelerated with CUDA.
State-of-the-art results on popular datasets:
- MNIST: 99.76% test accuracy, 99.82% after voting (best 99.79%)
- CIFAR-10: 85.49% test accuracy (best 89%)
- Uses dropout to train the network.
- Supports checkpoints: the program keeps the best test result and saves the network weights in the file "Result/checkPoint.txt". If the program exits unexpectedly, you can resume training from this checkpoint.
- Transforms the MNIST data set (scaling, rotation, distortion), following Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis.
- The log will be saved in the file "Result/log.txt".
- In the convolutional layers, you can choose to combine feature maps, following Notes on Convolutional Neural Networks.
- Support local connection layers.
- If you want the program to run faster, set "TEST_EPOCH" to a larger value.
- Supports branchLayer and combineLayer, which are designed after GoogLeNet; the network structure is no longer a linear chain but a directed acyclic graph.
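
The MNIST figure quoted "after voting" suggests that the predictions of several networks are combined. The source does not say how the votes are counted, so the following C++ sketch assumes plain majority voting over the labels predicted by several independently trained networks. `majorityVote` and `ensembleAccuracy` are hypothetical names used for illustration, not functions from this project.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the project): majority vote over the class
// labels that several networks predict for ONE sample.
// Ties are resolved in favour of the smallest class index.
int majorityVote(const std::vector<int>& predictions, int numClasses) {
    assert(!predictions.empty() && numClasses > 0);
    std::vector<int> counts(numClasses, 0);
    for (int label : predictions) {
        assert(label >= 0 && label < numClasses);
        ++counts[label];
    }
    int best = 0;
    for (int c = 1; c < numClasses; ++c) {
        if (counts[c] > counts[best]) best = c;
    }
    return best;
}

// Accuracy of the voted ensemble. perNetwork[n][i] is the label that
// network n predicts for sample i; truth[i] is the correct label.
double ensembleAccuracy(const std::vector<std::vector<int>>& perNetwork,
                        const std::vector<int>& truth, int numClasses) {
    assert(!perNetwork.empty() && !truth.empty());
    std::size_t correct = 0;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        std::vector<int> votes;
        for (const auto& net : perNetwork) votes.push_back(net.at(i));
        if (majorityVote(votes, numClasses) == truth[i]) ++correct;
    }
    return static_cast<double>(correct) / truth.size();
}
```

With three networks that predict 3, 3 and 7 for the same digit, `majorityVote({3, 3, 7}, 10)` returns 3. A real ensemble might instead average the softmax outputs of the networks; that variant is equally an assumption here.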
Depends on OpenCV and CUDA.
You can compile the code on Windows or Linux.
###SDK include path(-I)
- linux: /usr/local/cuda/samples/common/inc/ (for the include file "helper_cuda"); /usr/local/include/opencv/ (depends on your installation)
- windows: X:/Program Files (x86)/NVIDIA Corporation/CUDA Samples/v6.5/common/inc (for the include file "helper_cuda"); X:/Program Files/opencv/vs2010/install/include (depends on your installation)
###Library search path(-L)
- linux: /usr/local/lib/
- windows: X:/Program Files/opencv/vs2010/install/x86/cv10/lib (depends on your installation)
###libraries(-l)
- opencv_core
- opencv_highgui
- opencv_imgproc
- opencv_imgcodecs (needed for OpenCV 3.0)
- cublas
- curand
- cudadevrt
###GPU compute
- Compute capability 2.0
###Windows
- Install Visual Studio 2010 (VS2010).
- Download and install OpenCV 2.4 or a later version.
- Download and install CUDA 5.0 or a later version.
- When you create a new project in VS2010, select the NVIDIA CUDA project template to create a CUDA project.
- View-> Property Pages-> Configuration Properties-> CUDA C/C++ -> Device-> Code Generation-> compute_20,sm_20
- View-> Property Pages-> Configuration Properties-> CUDA C/C++ -> Common-> Generate Relocatable Device Code-> Yes(-rdc=true)
- View-> Property Pages-> Configuration Properties-> Linker-> Input-> Additional Dependencies-> add the libraries (-l) listed above
- View-> Property Pages-> Configuration Properties-> VC++ Directories-> General-> Library Directories-> add the library search path (-L) listed above
- View-> Property Pages-> Configuration Properties-> VC++ Directories-> General-> Include Directories-> add the SDK include path (-I) listed above
###Linux
- Install OpenCV and CUDA.
- Start Nsight from the CUDA toolkit.
- Create an "empty CUDA" project and import the cloned code.
- Project-> Properties-> Build-> Settings-> CUDA-> Device linker mode: separate compilation
- Project-> Properties-> Build-> Settings-> CUDA-> Generate PTX code: 2.0
- Project-> Properties-> Build-> Settings-> CUDA-> Generate GPU code: 2.0
- Project-> Properties-> Build-> Settings-> Tool Settings-> NVCC Compiler-> Includes: add /usr/local/cuda/samples/common/inc/ and the OpenCV SDK include path
- Project-> Properties-> Build-> Settings-> Tool Settings-> NVCC Linker-> Libraries: add the libraries (-l) listed above
- Project-> Properties-> Build-> Settings-> Tool Settings-> NVCC Linker-> Library search path (-L): /usr/local/lib/
Config
- Author: zhxfl
- Mail: zhxfl@mail.ustc.edu.cn
- Suggestions are welcome!