Getting very low results on my custom dataset
Closed · 1 comment
Hi, I converted the Stanford Drone Dataset to Pascal VOC format and then converted it to TFRecords with the lumi dataset conversion command from the Adapting a Dataset guide (a rough sketch of the invocation is below the dataset summary).
Dataset contents:
train ==> 40000 images
trainval ==> 56000 images
val ==> 17000 images
test ==> 23500 images
I have 6 classes (pedestrian, biker, skater, car, cart, bus).
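Roughly, the conversion step looked like this, a minimal sketch assuming the flags from the Adapting a Dataset guide (the --data-dir path is a placeholder, not my actual path; --output-dir matches dataset.dir in the config below):

```bash
# Convert the Pascal VOC style annotations to TFRecords, one split per --split flag.
# --data-dir is a placeholder path for the converted Stanford Drone Dataset.
lumi dataset transform \
    --type pascal \
    --data-dir datasets/stanford_drone_voc/ \
    --output-dir tf/ \
    --split train --split val --split trainval --split test
```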
My config file:
```yaml
train:
  # Run on debug mode (which enables more logging).
  debug: False
  # Seed for random operations.
  seed:
  # Training batch size for images. FasterRCNN currently only supports 1.
  batch_size: 1
  # Base directory in which model checkpoints & summaries (for Tensorboard)
  # will be saved.
  job_dir: jobs/
  # Ignore scope when loading from checkpoint (useful when training RPN first
  # and then RPN + RCNN).
  ignore_scope:
  # Enables TensorFlow debug mode, which stops and lets you analyze Tensors
  # after each Session.run.
  tf_debug: False
  # Name used to identify the run. Data inside job_dir will be stored under
  # run_name.
  run_name: Stanford/
  # Disables logging and saving checkpoints.
  no_log: False
  # Displays debugging images with results every N steps. Debug mode must be
  # enabled.
  display_every_steps:
  # Display debugging images every N seconds.
  display_every_secs: 300
  # Shuffle the dataset. It should only be disabled when trying to reproduce
  # some problem on some sample.
  random_shuffle: True
  # Save Tensorboard timeline.
  save_timeline: False
  # The frequency, in seconds, that a checkpoint is saved.
  save_checkpoint_secs: 600
  # The frequency, in number of global steps, that the summaries are written
  # to disk.
  save_summaries_steps:
  # The frequency, in seconds, that the summaries are written to disk. If both
  # save_summaries_steps and save_summaries_secs are set to empty, then the
  # default summary saver isn't used.
  save_summaries_secs: 30
  # Run TensorFlow using full_trace mode for memory and running time logging.
  # Debug mode must be enabled.
  full_trace: False
  # Clip gradients by norm, making sure the maximum value is 10.
  clip_by_norm: False

  # Learning rate config.
  learning_rate:
    # Because we're using kwargs, we want the learning_rate dict to be
    # replaced as a whole.
    _replace: True
    # Learning rate decay method; can be: ((empty), 'none', piecewise_constant,
    # exponential_decay, polynomial_decay). You can define different decay
    # methods using decay_method and defining all the necessary arguments.
    decay_method:
    learning_rate: 0.0003

  # Optimizer configuration.
  optimizer:
    # Because we're using kwargs, we want the optimizer dict to be replaced as
    # a whole.
    _replace: True
    # Type of optimizer to use (momentum, adam, gradient_descent, rmsprop).
    type: momentum
    # Any options are passed directly to the optimizer as kwargs.
    momentum: 0.9

  # Number of epochs (complete dataset batches) to run.
  num_epochs: 1000

  # Image visualization mode, options = train, eval, debug, (empty).
  # Default=(empty).
  image_vis: train
  # Variable summary visualization mode, options = full, reduced, (empty).
  var_vis:

eval:
  # Image visualization mode, options = train, eval, debug, (empty).
  # Default=(empty).
  image_vis: eval

dataset:
  type: object_detection
  # From which directory to read the dataset.
  dir: 'tf/'
  # Which split of tfrecords to look for.
  split: train
  # Resize image according to min_size and max_size.
  image_preprocessing:
    min_size: 600
    max_size: 1024
  # Data augmentation techniques.
  data_augmentation:
    - flip:
        left_right: True
        up_down: False
        prob: 0.5
    # Also available:
    # # If you resize to too small images, you may end up not having any anchors
    # # that aren't partially outside the image.
    # - resize:
    #     min_size: 600
    #     max_size: 1024
    #     prob: 0.2
    # - patch:
    #     min_height: 600
    #     min_width: 600
    #     prob: 0.2
    # - distortion:
    #     brightness:
    #       max_delta: 0.2
    #     hue:
    #       max_delta: 0.2
    #     saturation:
    #       lower: 0.5
    #       upper: 1.5
    #     prob: 0.3

model:
  type: fasterrcnn
  network:
    # Total number of classes to predict.
    num_classes: 6
    # Use RCNN or just RPN.
    with_rcnn: True

  # Whether to use batch normalization in the model.
  batch_norm: False

  base_network:
    # Which type of pretrained network to use.
    architecture: resnet_v1_50
    # Should we train the pretrained network.
    trainable: True
    # From which file to load the weights.
    weights:
    # Should we download weights if not available.
    download: True
    # Which endpoint layer to use as feature map for network.
    endpoint:
    # Starting point after which all the variables in the base network will be
    # trainable. If not specified, then all the variables in the network will
    # be trainable.
    fine_tune_from: block2
    # Whether to train the ResNet's batch norm layers.
    train_batch_norm: False
    # Whether to use the base network's tail in the RCNN.
    use_tail: True
    # Whether to freeze the base network's tail.
    freeze_tail: False
    # Output stride for ResNet.
    output_stride: 16
    arg_scope:
      # Regularization.
      weight_decay: 0.0005

  loss:
    # Loss weights for calculating the total loss.
    rpn_cls_loss_weight: 1.0
    rpn_reg_loss_weights: 1.0
    rcnn_cls_loss_weight: 1.0
    rcnn_reg_loss_weights: 1.0

  anchors:
    # Base size to use for anchors.
    base_size: 256
    # Scale used for generating anchor sizes.
    scales: [0.25, 0.5, 1, 2]
    # Aspect ratios used for generating anchors.
    ratios: [0.5, 1, 2]
    # Stride depending on feature map size (of pretrained).
    stride: 16

  rpn:
    activation_function: relu6
    l2_regularization_scale: 0.0005  # Disable using 0.
    # Sigma for the smooth L1 regression loss.
    l1_sigma: 3.0
    # Number of filters for the RPN conv layer.
    num_channels: 512
    # Kernel shape for the RPN conv layer.
    kernel_shape: [3, 3]
    # Initializers for RPN weights.
    rpn_initializer:
      _replace: True
      type: random_normal_initializer
      mean: 0.0
      stddev: 0.01
    cls_initializer:
      _replace: True
      type: random_normal_initializer
      mean: 0.0
      stddev: 0.01
    bbox_initializer:
      _replace: True
      type: random_normal_initializer
      mean: 0.0
      stddev: 0.001

    proposals:
      # Total proposals to use before running NMS (sorted by score).
      pre_nms_top_n: 12000
      # Total proposals to use after NMS (sorted by score).
      post_nms_top_n: 2000
      # Option to apply NMS.
      apply_nms: True
      # NMS threshold used when removing "almost duplicates".
      nms_threshold: 0.7
      min_size: 0  # Disable using 0.
      # Run clipping of proposals after running NMS.
      clip_after_nms: False
      # Filter proposals from anchors partially outside the image.
      filter_outside_anchors: False
      # Minimum probability to be used as proposed object.
      min_prob_threshold: 0.0

    target:
      # Margin to crop proposals close to the border.
      allowed_border: 0
      # Overwrite positives with negative if threshold is too low.
      clobber_positives: False
      # How much IoU with GT proposals must have to be marked as positive.
      foreground_threshold: 0.7
      # High and low thresholds with GT to be considered background.
      background_threshold_high: 0.3
      background_threshold_low: 0.0
      foreground_fraction: 0.5
      # Ratio between background and foreground in minibatch.
      minibatch_size: 256
      # Assign to get consistent "random" selection in batch.
      random_seed:  # Only to be used for debugging.

  rcnn:
    layer_sizes: []  # Could be e.g. [4096, 4096].
    dropout_keep_prob: 1.0
    activation_function: relu6
    l2_regularization_scale: 0.0005
    # Sigma for the smooth L1 regression loss.
    l1_sigma: 1.0
    # Use average pooling before the last fully-connected layer.
    use_mean: True
    # Variances to normalize encoded targets with.
    target_normalization_variances: [0.1, 0.2]

    rcnn_initializer:
      _replace: True
      type: variance_scaling_initializer
      factor: 1.0
      uniform: True
      mode: FAN_AVG
    cls_initializer:
      _replace: True
      type: random_normal_initializer
      mean: 0.0
      stddev: 0.01
    bbox_initializer:
      _replace: True
      type: random_normal_initializer
      mean: 0.0
      stddev: 0.001

    roi:
      pooling_mode: crop
      pooled_width: 7
      pooled_height: 7
      padding: VALID

    proposals:
      # Maximum number of detections for each class.
      class_max_detections: 100
      # NMS threshold used to remove "almost duplicate" of the same class.
      class_nms_threshold: 0.5
      # Maximum total detections for an image (sorted by score).
      total_max_detections: 300
      # Minimum prob to be used as proposed object.
      min_prob_threshold: 0.5

    target:
      # Ratio between foreground and background samples in minibatch.
      foreground_fraction: 0.25
      minibatch_size: 256
      # Threshold with GT to be considered positive.
      foreground_threshold: 0.5
      # High and low threshold with GT to be considered negative.
      background_threshold_high: 0.5
      background_threshold_low: 0.0
```
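For completeness, training and evaluation were launched with the standard Luminoth CLI, roughly as follows (config.yml is a placeholder filename, and the eval flags are a sketch based on the Luminoth docs):

```bash
# Train Faster R-CNN with the config above (config.yml is a placeholder filename).
lumi train -c config.yml

# In a separate process, evaluate the saved checkpoints on the validation split
# (--split selects which set of tfrecords to evaluate on).
lumi eval -c config.yml --split val
```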
And the result: the Average Precision is terribly bad. Where did I make a mistake? How can I fix this? Please help.
What is the image?