kerlomz/captcha_trainer

Accuracy和Cost一直上下波动是什么原因

nq2010 opened this issue · 4 comments

请问训练了上百万次,没改动数据源,Accuracy和Cost一直上下波动是什么原因?总是附近徘徊,训练了几天都是这样
1183
1284
1285

配置文件

- requirement.txt - GPU: tensorflow-gpu, CPU: tensorflow

- If you use the GPU version, you need to install some additional applications.

System:
MemoryUsage: 0.8
Version: 2

CNNNetwork: [CNN5, ResNet, DenseNet]

RecurrentNetwork: [CuDNNBiLSTM, CuDNNLSTM, CuDNNGRU, BiLSTM, LSTM, GRU, BiGRU, NoRecurrent]

- The recommended configuration is CNN5+GRU

UnitsNum: [16, 64, 128, 256, 512]

- This parameter indicates the number of nodes used to remember and store past states.

Optimizer: Loss function algorithm for calculating gradient.

- [AdaBound, Adam, Momentum]

OutputLayer: [LossFunction, Decoder]

- LossFunction: [CTC, CrossEntropy]

- Decoder: [CTC, CrossEntropy]

NeuralNet:
CNNNetwork: CNNX
RecurrentNetwork: GRU
UnitsNum: 64
Optimizer: RAdam
OutputLayer:
LossFunction: CTC
Decoder: CTC

ModelName: Corresponding to the model file in the model directory

ModelField: [Image, Text]

ModelScene: [Classification]

- Currently only Image-Classification is supported.

Model:
ModelName: invver
ModelField: Image
ModelScene: Classification

FieldParam contains the Image, Text.

When you filed to Image:

- Category: Provides a default optional built-in solution:

-- [ALPHANUMERIC, ALPHANUMERIC_LOWER, ALPHANUMERIC_UPPER,

-- NUMERIC, ALPHABET_LOWER, ALPHABET_UPPER, ALPHABET, ALPHANUMERIC_CHS_3500_LOWER]

- or can be customized by:

-- ['Cat', 'Lion', 'Tiger', 'Fish', 'BigCat']

- Resize: [ImageWidth, ImageHeight/-1, ImageChannel]

- ImageChannel: [1, 3]

- In order to automatically select models using image size, when multiple models are deployed at the same time:

-- ImageWidth: The width of the image.

-- ImageHeight: The height of the image.

- MaxLabelNum: You can fill in -1, or any integer, where -1 means not defining the value.

-- Used when the number of label is fixed

When you filed to Text:

This type is temporarily not supported.

FieldParam:
Category: ALPHANUMERIC_CHS_DOCUMENT_LOWER
Resize: [90, 35]
ImageChannel: 3
ImageWidth: 90
ImageHeight: 35
MaxLabelNum: -1
OutputSplit:
AutoPadding: True

The configuration is applied to the label of the data source.

LabelFrom: [FileName, XML, LMDB]

ExtractRegex: Only for methods extracted from FileName:

- Default matching apple_20181010121212.jpg file.

- The Default is .?(?=_..)

LabelSplit: Only for methods extracted from FileName:

- The split symbol in the file name is like: cat&big cat&lion_20181010121212.png

- The Default is null.

Label:
LabelFrom: FileName
ExtractRegex: .*?(?=_)
LabelSplit: null

DatasetPath: [Training/Validation], The local absolute path of a packed training or validation set.

SourcePath: [Training/Validation], The local absolute path to the source folder of the training or validation set.

ValidationSetNum: This is an optional parameter that is used when you want to extract some of the validation set

- from the training set when you are not preparing the validation set separately.

SavedSteps: A Session.run() execution is called a Step,

- Used to save training progress, Default value is 100.

ValidationSteps: Used to calculate accuracy, Default value is 500.

EndAcc: Finish the training when the accuracy reaches [EndAcc*100]% and other conditions.

EndCost: Finish the training when the cost reaches EndCost and other conditions.

EndEpochs: Finish the training when the epoch is greater than the defined epoch and other conditions.

BatchSize: Number of samples selected for one training step.

ValidationBatchSize: Number of samples selected for one validation step.

LearningRate: [0.1, 0.01, 0.001, 0.0001]

- Use a smaller learning rate for fine-tuning.

Trains:
DatasetPath:
Training:
- ./projects/invver/dataset/Trains.4.tfrecords
#- ./projects/invver/dataset/Trains.7.tfrecords
#- ./projects/invver/dataset/Trains.6.tfrecords
#- ./projects/invver/dataset/Trains.5.tfrecords
- ./projects/invver/dataset/Trains.3.tfrecords
#- ./projects/invver/dataset/Trains.2.tfrecords
Validation:
- ./projects/invver/dataset/Validation.4.tfrecords
#- ./projects/invver/dataset/Validation.7.tfrecords
#- ./projects/invver/dataset/Validation.6.tfrecords
#- ./projects/invver/dataset/Validation.5.tfrecords
- ./projects/invver/dataset/Validation.3.tfrecords
#- ./projects/invver/dataset/Validation.2.tfrecords
SourcePath:
Training:
- D:/invver4
Validation:
ValidationSetNum: 300000
SavedSteps: 100
ValidationSteps: 500
EndAcc: 0.99
EndCost: 0.5
EndEpochs: 20
BatchSize: 64
ValidationBatchSize: 300
LearningRate: 0.001

Binaryzation: The argument is of type list and contains the range of int values, -1 is not enabled.

MedianBlur: The parameter is an int value, -1 is not enabled.

GaussianBlur: The parameter is an int value, -1 is not enabled.

EqualizeHist: The parameter is an bool value.

Laplace: The parameter is an bool value.

WarpPerspective: The parameter is an bool value.

Rotate: The parameter is a positive integer int type greater than 0, -1 is not enabled.

PepperNoise: This parameter is a float type less than 1, -1 is not enabled.

Brightness: The parameter is an bool value.

Saturation: The parameter is an bool value.

Hue: The parameter is an bool value.

Gamma: The parameter is an bool value.

ChannelSwap: The parameter is an bool value.

RandomBlank: The parameter is a positive integer int type greater than 0, -1 is not enabled.

RandomTransition: The parameter is a positive integer int type greater than 0, -1 is not enabled.

DataAugmentation:
Binaryzation: -1
MedianBlur: -1
GaussianBlur: -1
EqualizeHist: False
Laplace: False
WarpPerspective: False
Rotate: -1
PepperNoise: -1.0
Brightness: False
Saturation: False
Hue: False
Gamma: False
ChannelSwap: False
RandomBlank: -1
RandomTransition: -1
RandomCaptcha:
Enable: False
FontPath: None

Binaryzation: The parameter is an integer number between 0 and 255, -1 is not enabled.

ReplaceTransparent: Transparent background replacement, bool type.

HorizontalStitching: Horizontal stitching, bool type.

ConcatFrames: Horizontally merge two frames according to the provided frame index list, -1 is not enabled.

BlendFrames: Fusion corresponding frames according to the provided frame index list, -1 is not enabled.

- [-1] means all frames

Pretreatment:
Binaryzation: -1
ReplaceTransparent: True
HorizontalStitching: False
ConcatFrames: -1
BlendFrames: -1
ExecuteMap: {}

样本不够呗,波动上不去了就不用继续跑了因为是徒劳的。。

也不要太相信这个代码,不能确定是否有坑,毕竟楼主要挣辛苦钱,只能参考一下!hhhhhhhhhaahahh

配置文件

- requirement.txt - GPU: tensorflow-gpu, CPU: tensorflow

- If you use the GPU version, you need to install some additional applications.

System:
MemoryUsage: 0.8
Version: 2

CNNNetwork: [CNN5, ResNet, DenseNet]

RecurrentNetwork: [CuDNNBiLSTM, CuDNNLSTM, CuDNNGRU, BiLSTM, LSTM, GRU, BiGRU, NoRecurrent]

- The recommended configuration is CNN5+GRU

UnitsNum: [16, 64, 128, 256, 512]

- This parameter indicates the number of nodes used to remember and store past states.

Optimizer: Loss function algorithm for calculating gradient.

- [AdaBound, Adam, Momentum]

OutputLayer: [LossFunction, Decoder]

- LossFunction: [CTC, CrossEntropy]

- Decoder: [CTC, CrossEntropy]

NeuralNet:
CNNNetwork: CNNX
RecurrentNetwork: GRU
UnitsNum: 64
Optimizer: RAdam
OutputLayer:
LossFunction: CTC
Decoder: CTC

ModelName: Corresponding to the model file in the model directory

ModelField: [Image, Text]

ModelScene: [Classification]

- Currently only Image-Classification is supported.

Model:
ModelName: invver
ModelField: Image
ModelScene: Classification

FieldParam contains the Image, Text.

When you filed to Image:

- Category: Provides a default optional built-in solution:

-- [ALPHANUMERIC, ALPHANUMERIC_LOWER, ALPHANUMERIC_UPPER,

-- NUMERIC, ALPHABET_LOWER, ALPHABET_UPPER, ALPHABET, ALPHANUMERIC_CHS_3500_LOWER]

- or can be customized by:

-- ['Cat', 'Lion', 'Tiger', 'Fish', 'BigCat']

- Resize: [ImageWidth, ImageHeight/-1, ImageChannel]

- ImageChannel: [1, 3]

- In order to automatically select models using image size, when multiple models are deployed at the same time:

-- ImageWidth: The width of the image.

-- ImageHeight: The height of the image.

- MaxLabelNum: You can fill in -1, or any integer, where -1 means not defining the value.

-- Used when the number of label is fixed

When you filed to Text:

This type is temporarily not supported.

FieldParam:
Category: ALPHANUMERIC_CHS_DOCUMENT_LOWER
Resize: [90, 35]
ImageChannel: 3
ImageWidth: 90
ImageHeight: 35
MaxLabelNum: -1
OutputSplit:
AutoPadding: True

The configuration is applied to the label of the data source.

LabelFrom: [FileName, XML, LMDB]

ExtractRegex: Only for methods extracted from FileName:

- Default matching apple_20181010121212.jpg file.

- The Default is .?(?=._.)

LabelSplit: Only for methods extracted from FileName:

- The split symbol in the file name is like: cat&big cat&lion_20181010121212.png

- The Default is null.

Label:
LabelFrom: FileName
ExtractRegex: .*?(?=_)
LabelSplit: null

DatasetPath: [Training/Validation], The local absolute path of a packed training or validation set.

SourcePath: [Training/Validation], The local absolute path to the source folder of the training or validation set.

ValidationSetNum: This is an optional parameter that is used when you want to extract some of the validation set

- from the training set when you are not preparing the validation set separately.

SavedSteps: A Session.run() execution is called a Step,

- Used to save training progress, Default value is 100.

ValidationSteps: Used to calculate accuracy, Default value is 500.

EndAcc: Finish the training when the accuracy reaches [EndAcc*100]% and other conditions.

EndCost: Finish the training when the cost reaches EndCost and other conditions.

EndEpochs: Finish the training when the epoch is greater than the defined epoch and other conditions.

BatchSize: Number of samples selected for one training step.

ValidationBatchSize: Number of samples selected for one validation step.

LearningRate: [0.1, 0.01, 0.001, 0.0001]

- Use a smaller learning rate for fine-tuning.

Trains:
DatasetPath:
Training:

  • ./projects/invver/dataset/Trains.4.tfrecords
    #- ./projects/invver/dataset/Trains.7.tfrecords
    #- ./projects/invver/dataset/Trains.6.tfrecords
    #- ./projects/invver/dataset/Trains.5.tfrecords
  • ./projects/invver/dataset/Trains.3.tfrecords
    #- ./projects/invver/dataset/Trains.2.tfrecords
    Validation:
  • ./projects/invver/dataset/Validation.4.tfrecords
    #- ./projects/invver/dataset/Validation.7.tfrecords
    #- ./projects/invver/dataset/Validation.6.tfrecords
    #- ./projects/invver/dataset/Validation.5.tfrecords
  • ./projects/invver/dataset/Validation.3.tfrecords
    #- ./projects/invver/dataset/Validation.2.tfrecords
    SourcePath:
    Training:
  • D:/invver4
    Validation:
    ValidationSetNum: 300000
    SavedSteps: 100
    ValidationSteps: 500
    EndAcc: 0.99
    EndCost: 0.5
    EndEpochs: 20
    BatchSize: 64
    ValidationBatchSize: 300
    LearningRate: 0.001

Binaryzation: The argument is of type list and contains the range of int values, -1 is not enabled.

MedianBlur: The parameter is an int value, -1 is not enabled.

GaussianBlur: The parameter is an int value, -1 is not enabled.

EqualizeHist: The parameter is an bool value.

Laplace: The parameter is an bool value.

WarpPerspective: The parameter is an bool value.

Rotate: The parameter is a positive integer int type greater than 0, -1 is not enabled.

PepperNoise: This parameter is a float type less than 1, -1 is not enabled.

Brightness: The parameter is an bool value.

Saturation: The parameter is an bool value.

Hue: The parameter is an bool value.

Gamma: The parameter is an bool value.

ChannelSwap: The parameter is an bool value.

RandomBlank: The parameter is a positive integer int type greater than 0, -1 is not enabled.

RandomTransition: The parameter is a positive integer int type greater than 0, -1 is not enabled.

DataAugmentation:
Binaryzation: -1
MedianBlur: -1
GaussianBlur: -1
EqualizeHist: False
Laplace: False
WarpPerspective: False
Rotate: -1
PepperNoise: -1.0
Brightness: False
Saturation: False
Hue: False
Gamma: False
ChannelSwap: False
RandomBlank: -1
RandomTransition: -1
RandomCaptcha:
Enable: False
FontPath: None

Binaryzation: The parameter is an integer number between 0 and 255, -1 is not enabled.

ReplaceTransparent: Transparent background replacement, bool type.

HorizontalStitching: Horizontal stitching, bool type.

ConcatFrames: Horizontally merge two frames according to the provided frame index list, -1 is not enabled.

BlendFrames: Fusion corresponding frames according to the provided frame index list, -1 is not enabled.

- [-1] means all frames

Pretreatment:
Binaryzation: -1
ReplaceTransparent: True
HorizontalStitching: False
ConcatFrames: -1
BlendFrames: -1
ExecuteMap: {}

仔细一看,原来是税务的,看图的话,生成的不少,这么多步了才一个epoch,这个验证码我训练最终实测97.5%,你应该生成的有问题。