This project lets users upload images and returns a related sentence for each image.
Java 8
Maven
Python 2.7
Numpy
PyTorch
pycocotools
- Maven - Dependency Management
If you do not want to train the model from scratch, you can use a pretrained model. Download the pretrained model here and the vocabulary file here, then use the unzip command to extract pretrained_model.zip to ./image_caption/models/ and vocab.pkl to ./image_caption/data/.
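Once the repository is cloned and the files are extracted, a quick check that they landed in the expected directories can save a failed start later. The following is a minimal sketch and not part of the project: check_assets is a hypothetical helper, and it assumes that pretrained_model.zip unpacks to at least one file under ./image_caption/models/.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the project): verifies that the
# pretrained assets are in the directories named in this README.
check_assets() {
    root="${1:-.}"                                  # repository root, defaults to the current directory
    models_dir="$root/image_caption/models"
    vocab_file="$root/image_caption/data/vocab.pkl"

    # The models directory must exist and contain at least one entry.
    if [ ! -d "$models_dir" ] || [ -z "$(ls -A "$models_dir" 2>/dev/null)" ]; then
        echo "missing pretrained model files in $models_dir" >&2
        return 1
    fi

    # The vocabulary file must exist as a regular file.
    if [ ! -f "$vocab_file" ]; then
        echo "missing $vocab_file" >&2
        return 1
    fi

    echo "pretrained assets found"
}
```

Call check_assets from the repository root, or pass the root as the first argument, for example check_assets ./cs8524-pictureteller. It returns a non-zero status and prints the missing path to stderr when something is not in place.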
First, clone the project together with its submodules and cd into the project directory:
git clone --recurse-submodules -j8 https://github.com/yaodongyu/cs8524-pictureteller.git
cd cs8524-pictureteller
Download the pretrained model here and the vocabulary file here. Using the unzip command, extract pretrained_model.zip to ./image_caption/models/ and vocab.pkl to ./image_caption/data/. Then build the jar and run it:
mvn package
mv target/pictureteller-0.0.1-SNAPSHOT.jar .
java -jar pictureteller-0.0.1-SNAPSHOT.jar
1). To browse the landing page, type
localhost:8080
2). To initialize a new user, type
localhost:8080/user/new
3). To add an image to user #id, such as user 1, type
localhost:8080/user/1/image
4). To show an image of user #id, such as user 1, type
localhost:8080/user/1/show
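The four addresses above follow one pattern, so they can be generated for any user id. The sketch below is an illustration only: endpoint_url and BASE_URL are hypothetical names, and the README does not say which HTTP methods the routes accept, so the curl call in the comment assumes a plain GET is enough to view a page.

```shell
#!/bin/sh
# Base address of the running application; override it if the server runs elsewhere.
BASE_URL="${BASE_URL:-http://localhost:8080}"

# Hypothetical helper: prints the URL for one of the routes listed in this README.
#   endpoint_url home        -> landing page
#   endpoint_url new         -> create a new user
#   endpoint_url image <id>  -> add an image to user <id>
#   endpoint_url show <id>   -> show an image of user <id>
endpoint_url() {
    action="${1:-}"
    user_id="${2:-}"
    case "$action" in
        home)
            printf '%s\n' "$BASE_URL" ;;
        new)
            printf '%s\n' "$BASE_URL/user/new" ;;
        image|show)
            # These two routes are per user, so an id is required.
            if [ -z "$user_id" ]; then
                echo "user id required for '$action'" >&2
                return 1
            fi
            printf '%s\n' "$BASE_URL/user/$user_id/$action" ;;
        *)
            echo "unknown action: $action" >&2
            return 1 ;;
    esac
}

# Example (assumes the server is running and the route answers GET):
#   curl -i "$(endpoint_url show 1)"
```

The same addresses can be pasted into a browser; the helper only saves retyping them when switching between users.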
1). Open the H2 console at
localhost:8080/h2-console
2). Enter the user name and the other connection settings, using the following values:
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.url=jdbc:h2:file:c:/temp/rcp_h2
spring.datasource.username=sa
spring.datasource.password=