Caffe regression: loss does not converge

I am doing regression in Caffe. The dataset is 400 RGB images of size 128x128, and the labels contain floating-point numbers in the range (-1, 1). The only transformation I applied to the dataset is normalization (dividing every RGB pixel value by 255). However, the loss does not seem to converge at all.
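
For reference, a minimal sketch of how a dataset like this can be written to HDF5 for Caffe (assuming h5py and numpy; the placeholder arrays and the file name train.h5 are illustrative, not my actual script):

    import h5py
    import numpy as np

    # Illustrative placeholders: 400 RGB images of 128x128, labels in (-1, 1)
    images = np.random.randint(0, 256, (400, 3, 128, 128)).astype(np.float32)
    labels = np.random.uniform(-1, 1, (400, 1)).astype(np.float32)

    images /= 255.0  # the only preprocessing applied: scale pixels to [0, 1]

    with h5py.File('train.h5', 'w') as f:
        f.create_dataset('data', data=images)   # HDF5Data expects N x C x H x W
        f.create_dataset('label', data=labels)

    # train_hdf5file.txt then contains a single line: the path to train.h5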

What could be the reason? Could anyone offer me a suggestion?

Here is my training log:

Training.. 
Using solver: solver_hdf5.prototxt 
I0929 21:50:21.657784 13779 caffe.cpp:112] Use CPU. 
I0929 21:50:21.658033 13779 caffe.cpp:174] Starting Optimization 
I0929 21:50:21.658107 13779 solver.cpp:34] Initializing solver from parameters: 
test_iter: 100 
test_interval: 500 
base_lr: 0.0001 
display: 25 
max_iter: 10000 
lr_policy: "inv" 
gamma: 0.0001 
power: 0.75 
momentum: 0.9 
weight_decay: 0.0005 
snapshot: 5000 
snapshot_prefix: "lenet_hdf5" 
solver_mode: CPU 
net: "train_test_hdf5.prototxt" 
I0929 21:50:21.658143 13779 solver.cpp:75] Creating training net from net file: train_test_hdf5.prototxt 
I0929 21:50:21.658567 13779 net.cpp:334] The NetState phase (0) differed from the phase (1) specified by a rule in layer data 
I0929 21:50:21.658709 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression" 
state { 
    phase: TRAIN 
} 
layer { 
    name: "data" 
    type: "HDF5Data" 
    top: "data" 
    top: "label" 
    include { 
    phase: TRAIN 
    } 
    hdf5_data_param { 
    source: "train_hdf5file.txt" 
    batch_size: 64 
    shuffle: true 
    } 
} 
layer { 
    name: "conv1" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv1" 
    param { 
    lr_mult: 1 
    } 
    param { 
    lr_mult: 2 
    } 
    convolution_param { 
    num_output: 20 
    kernel_size: 5 
    stride: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "relu1" 
    type: "ReLU" 
    bottom: "conv1" 
    top: "conv1" 
} 
layer { 
    name: "pool1" 
    type: "Pooling" 
    bottom: "conv1" 
    top: "pool1" 
    pooling_param { 
    pool: MAX 
    kernel_size: 2 
    stride: 2 
    } 
} 
layer { 
    name: "dropout1" 
    type: "Dropout" 
    bottom: "pool1" 
    top: "pool1" 
    dropout_param { 
    dropout_ratio: 0.1 
    } 
} 
layer { 
    name: "fc1" 
    type: "InnerProduct" 
    bottom: "pool1" 
    top: "fc1" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 500 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "dropout2" 
    type: "Dropout" 
    bottom: "fc1" 
    top: "fc1" 
    dropout_param { 
    dropout_ratio: 0.5 
    } 
} 
layer { 
    name: "fc2" 
    type: "InnerProduct" 
    bottom: "fc1" 
    top: "fc2" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "loss" 
    type: "EuclideanLoss" 
    bottom: "fc2" 
    bottom: "label" 
    top: "loss" 
} 
I0929 21:50:21.658833 13779 layer_factory.hpp:74] Creating layer data 
I0929 21:50:21.658859 13779 net.cpp:96] Creating Layer data 
I0929 21:50:21.658871 13779 net.cpp:415] data -> data 
I0929 21:50:21.658902 13779 net.cpp:415] data -> label 
I0929 21:50:21.658926 13779 net.cpp:160] Setting up data 
I0929 21:50:21.658936 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: train_hdf5file.txt 
I0929 21:50:21.659220 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1 
I0929 21:50:21.920578 13779 net.cpp:167] Top shape: 64 3 128 128 (3145728) 
I0929 21:50:21.920656 13779 net.cpp:167] Top shape: 64 1 (64) 
I0929 21:50:21.920686 13779 layer_factory.hpp:74] Creating layer conv1 
I0929 21:50:21.920740 13779 net.cpp:96] Creating Layer conv1 
I0929 21:50:21.920774 13779 net.cpp:459] conv1 <- data 
I0929 21:50:21.920825 13779 net.cpp:415] conv1 -> conv1 
I0929 21:50:21.920877 13779 net.cpp:160] Setting up conv1 
I0929 21:50:21.921985 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280) 
I0929 21:50:21.922050 13779 layer_factory.hpp:74] Creating layer relu1 
I0929 21:50:21.922085 13779 net.cpp:96] Creating Layer relu1 
I0929 21:50:21.922108 13779 net.cpp:459] relu1 <- conv1 
I0929 21:50:21.922137 13779 net.cpp:404] relu1 -> conv1 (in-place) 
I0929 21:50:21.922185 13779 net.cpp:160] Setting up relu1 
I0929 21:50:21.922227 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280) 
I0929 21:50:21.922250 13779 layer_factory.hpp:74] Creating layer pool1 
I0929 21:50:21.922277 13779 net.cpp:96] Creating Layer pool1 
I0929 21:50:21.922298 13779 net.cpp:459] pool1 <- conv1 
I0929 21:50:21.922323 13779 net.cpp:415] pool1 -> pool1 
I0929 21:50:21.922418 13779 net.cpp:160] Setting up pool1 
I0929 21:50:21.922472 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320) 
I0929 21:50:21.922495 13779 layer_factory.hpp:74] Creating layer dropout1 
I0929 21:50:21.922534 13779 net.cpp:96] Creating Layer dropout1 
I0929 21:50:21.922555 13779 net.cpp:459] dropout1 <- pool1 
I0929 21:50:21.922582 13779 net.cpp:404] dropout1 -> pool1 (in-place) 
I0929 21:50:21.922613 13779 net.cpp:160] Setting up dropout1 
I0929 21:50:21.922652 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320) 
I0929 21:50:21.922672 13779 layer_factory.hpp:74] Creating layer fc1 
I0929 21:50:21.922709 13779 net.cpp:96] Creating Layer fc1 
I0929 21:50:21.922729 13779 net.cpp:459] fc1 <- pool1 
I0929 21:50:21.922757 13779 net.cpp:415] fc1 -> fc1 
I0929 21:50:21.922801 13779 net.cpp:160] Setting up fc1 
I0929 21:50:22.301134 13779 net.cpp:167] Top shape: 64 500 (32000) 
I0929 21:50:22.301193 13779 layer_factory.hpp:74] Creating layer dropout2 
I0929 21:50:22.301210 13779 net.cpp:96] Creating Layer dropout2 
I0929 21:50:22.301218 13779 net.cpp:459] dropout2 <- fc1 
I0929 21:50:22.3 13779 net.cpp:404] dropout2 -> fc1 (in-place) 
I0929 21:50:22.301244 13779 net.cpp:160] Setting up dropout2 
I0929 21:50:22.301254 13779 net.cpp:167] Top shape: 64 500 (32000) 
I0929 21:50:22.301259 13779 layer_factory.hpp:74] Creating layer fc2 
I0929 21:50:22.301270 13779 net.cpp:96] Creating Layer fc2 
I0929 21:50:22.301275 13779 net.cpp:459] fc2 <- fc1 
I0929 21:50:22.301285 13779 net.cpp:415] fc2 -> fc2 
I0929 21:50:22.301295 13779 net.cpp:160] Setting up fc2 
I0929 21:50:22.301317 13779 net.cpp:167] Top shape: 64 1 (64) 
I0929 21:50:22.301328 13779 layer_factory.hpp:74] Creating layer loss 
I0929 21:50:22.301338 13779 net.cpp:96] Creating Layer loss 
I0929 21:50:22.301343 13779 net.cpp:459] loss <- fc2 
I0929 21:50:22.301350 13779 net.cpp:459] loss <- label 
I0929 21:50:22.301360 13779 net.cpp:415] loss -> loss 
I0929 21:50:22.301374 13779 net.cpp:160] Setting up loss 
I0929 21:50:22.301385 13779 net.cpp:167] Top shape: (1) 
I0929 21:50:22.301391 13779 net.cpp:169]  with loss weight 1 
I0929 21:50:22.301419 13779 net.cpp:239] loss needs backward computation. 
I0929 21:50:22.301425 13779 net.cpp:239] fc2 needs backward computation. 
I0929 21:50:22.301430 13779 net.cpp:239] dropout2 needs backward computation. 
I0929 21:50:22.301436 13779 net.cpp:239] fc1 needs backward computation. 
I0929 21:50:22.301441 13779 net.cpp:239] dropout1 needs backward computation. 
I0929 21:50:22.301446 13779 net.cpp:239] pool1 needs backward computation. 
I0929 21:50:22.301452 13779 net.cpp:239] relu1 needs backward computation. 
I0929 21:50:22.301457 13779 net.cpp:239] conv1 needs backward computation. 
I0929 21:50:22.301463 13779 net.cpp:241] data does not need backward computation. 
I0929 21:50:22.301468 13779 net.cpp:282] This network produces output loss 
I0929 21:50:22.301482 13779 net.cpp:531] Collecting Learning Rate and Weight Decay. 
I0929 21:50:22.301491 13779 net.cpp:294] Network initialization done. 
I0929 21:50:22.301496 13779 net.cpp:295] Memory required for data: 209652228 
I0929 21:50:22.301908 13779 solver.cpp:159] Creating test net (#0) specified by net file: train_test_hdf5.prototxt 
I0929 21:50:22.301935 13779 net.cpp:334] The NetState phase (1) differed from the phase (0) specified by a rule in layer data 
I0929 21:50:22.302028 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression" 
state { 
    phase: TEST 
} 
layer { 
    name: "data" 
    type: "HDF5Data" 
    top: "data" 
    top: "label" 
    include { 
    phase: TEST 
    } 
    hdf5_data_param { 
    source: "test_hdf5file.txt" 
    batch_size: 30 
    } 
} 
layer { 
    name: "conv1" 
    type: "Convolution" 
    bottom: "data" 
    top: "conv1" 
    param { 
    lr_mult: 1 
    } 
    param { 
    lr_mult: 2 
    } 
    convolution_param { 
    num_output: 20 
    kernel_size: 5 
    stride: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "relu1" 
    type: "ReLU" 
    bottom: "conv1" 
    top: "conv1" 
} 
layer { 
    name: "pool1" 
    type: "Pooling" 
    bottom: "conv1" 
    top: "pool1" 
    pooling_param { 
    pool: MAX 
    kernel_size: 2 
    stride: 2 
    } 
} 
layer { 
    name: "dropout1" 
    type: "Dropout" 
    bottom: "pool1" 
    top: "pool1" 
    dropout_param { 
    dropout_ratio: 0.1 
    } 
} 
layer { 
    name: "fc1" 
    type: "InnerProduct" 
    bottom: "pool1" 
    top: "fc1" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 500 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "dropout2" 
    type: "Dropout" 
    bottom: "fc1" 
    top: "fc1" 
    dropout_param { 
    dropout_ratio: 0.5 
    } 
} 
layer { 
    name: "fc2" 
    type: "InnerProduct" 
    bottom: "fc1" 
    top: "fc2" 
    param { 
    lr_mult: 1 
    decay_mult: 1 
    } 
    param { 
    lr_mult: 2 
    decay_mult: 0 
    } 
    inner_product_param { 
    num_output: 1 
    weight_filler { 
     type: "xavier" 
    } 
    bias_filler { 
     type: "constant" 
     value: 0 
    } 
    } 
} 
layer { 
    name: "loss" 
    type: "EuclideanLoss" 
    bottom: "fc2" 
    bottom: "label" 
    top: "loss" 
} 
I0929 21:50:22.302146 13779 layer_factory.hpp:74] Creating layer data 
I0929 21:50:22.302158 13779 net.cpp:96] Creating Layer data 
I0929 21:50:22.302165 13779 net.cpp:415] data -> data 
I0929 21:50:22.302176 13779 net.cpp:415] data -> label 
I0929 21:50:22.302186 13779 net.cpp:160] Setting up data 
I0929 21:50:22.302191 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: test_hdf5file.txt 
I0929 21:50:22.302305 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1 
I0929 21:50:22.434798 13779 net.cpp:167] Top shape: 30 3 128 128 (1474560) 
I0929 21:50:22.434849 13779 net.cpp:167] Top shape: 30 1 (30) 
I0929 21:50:22.434864 13779 layer_factory.hpp:74] Creating layer conv1 
I0929 21:50:22.434895 13779 net.cpp:96] Creating Layer conv1 
I0929 21:50:22.434914 13779 net.cpp:459] conv1 <- data 
I0929 21:50:22.434944 13779 net.cpp:415] conv1 -> conv1 
I0929 21:50:22.434996 13779 net.cpp:160] Setting up conv1 
I0929 21:50:22.435084 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600) 
I0929 21:50:22.435119 13779 layer_factory.hpp:74] Creating layer relu1 
I0929 21:50:22.435205 13779 net.cpp:96] Creating Layer relu1 
I0929 21:50:22.435237 13779 net.cpp:459] relu1 <- conv1 
I0929 21:50:22.435292 13779 net.cpp:404] relu1 -> conv1 (in-place) 
I0929 21:50:22.435328 13779 net.cpp:160] Setting up relu1 
I0929 21:50:22.435371 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600) 
I0929 21:50:22.435400 13779 layer_factory.hpp:74] Creating layer pool1 
I0929 21:50:22.435443 13779 net.cpp:96] Creating Layer pool1 
I0929 21:50:22.435470 13779 net.cpp:459] pool1 <- conv1 
I0929 21:50:22.435511 13779 net.cpp:415] pool1 -> pool1 
I0929 21:50:22.435550 13779 net.cpp:160] Setting up pool1 
I0929 21:50:22.435597 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400) 
I0929 21:50:22.435626 13779 layer_factory.hpp:74] Creating layer dropout1 
I0929 21:50:22.435669 13779 net.cpp:96] Creating Layer dropout1 
I0929 21:50:22.435698 13779 net.cpp:459] dropout1 <- pool1 
I0929 21:50:22.435739 13779 net.cpp:404] dropout1 -> pool1 (in-place) 
I0929 21:50:22.435780 13779 net.cpp:160] Setting up dropout1 
I0929 21:50:22.435823 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400) 
I0929 21:50:22.435853 13779 layer_factory.hpp:74] Creating layer fc1 
I0929 21:50:22.435899 13779 net.cpp:96] Creating Layer fc1 
I0929 21:50:22.435926 13779 net.cpp:459] fc1 <- pool1 
I0929 21:50:22.435971 13779 net.cpp:415] fc1 -> fc1 
I0929 21:50:22.436018 13779 net.cpp:160] Setting up fc1 
I0929 21:50:22.816076 13779 net.cpp:167] Top shape: 30 500 (15000) 
I0929 21:50:22.816138 13779 layer_factory.hpp:74] Creating layer dropout2 
I0929 21:50:22.816154 13779 net.cpp:96] Creating Layer dropout2 
I0929 21:50:22.816160 13779 net.cpp:459] dropout2 <- fc1 
I0929 21:50:22.816170 13779 net.cpp:404] dropout2 -> fc1 (in-place) 
I0929 21:50:22.816182 13779 net.cpp:160] Setting up dropout2 
I0929 21:50:22.816192 13779 net.cpp:167] Top shape: 30 500 (15000) 
I0929 21:50:22.816197 13779 layer_factory.hpp:74] Creating layer fc2 
I0929 21:50:22.816208 13779 net.cpp:96] Creating Layer fc2 
I0929 21:50:22.816249 13779 net.cpp:459] fc2 <- fc1 
I0929 21:50:22.816262 13779 net.cpp:415] fc2 -> fc2 
I0929 21:50:22.816277 13779 net.cpp:160] Setting up fc2 
I0929 21:50:22.816301 13779 net.cpp:167] Top shape: 30 1 (30) 
I0929 21:50:22.816316 13779 layer_factory.hpp:74] Creating layer loss 
I0929 21:50:22.816329 13779 net.cpp:96] Creating Layer loss 
I0929 21:50:22.816337 13779 net.cpp:459] loss <- fc2 
I0929 21:50:22.816347 13779 net.cpp:459] loss <- label 
I0929 21:50:22.816359 13779 net.cpp:415] loss -> loss 
I0929 21:50:22.816370 13779 net.cpp:160] Setting up loss 
I0929 21:50:22.816381 13779 net.cpp:167] Top shape: (1) 
I0929 21:50:22.816388 13779 net.cpp:169]  with loss weight 1 
I0929 21:50:22.816407 13779 net.cpp:239] loss needs backward computation. 
I0929 21:50:22.816416 13779 net.cpp:239] fc2 needs backward computation. 
I0929 21:50:22.816426 13779 net.cpp:239] dropout2 needs backward computation. 
I0929 21:50:22.816433 13779 net.cpp:239] fc1 needs backward computation. 
I0929 21:50:22.816442 13779 net.cpp:239] dropout1 needs backward computation. 
I0929 21:50:22.816452 13779 net.cpp:239] pool1 needs backward computation. 
I0929 21:50:22.816460 13779 net.cpp:239] relu1 needs backward computation. 
I0929 21:50:22.816468 13779 net.cpp:239] conv1 needs backward computation. 
I0929 21:50:22.816478 13779 net.cpp:241] data does not need backward computation. 
I0929 21:50:22.816486 13779 net.cpp:282] This network produces output loss 
I0929 21:50:22.816500 13779 net.cpp:531] Collecting Learning Rate and Weight Decay. 
I0929 21:50:22.816510 13779 net.cpp:294] Network initialization done. 
I0929 21:50:22.816517 13779 net.cpp:295] Memory required for data: 98274484 
I0929 21:50:22.816565 13779 solver.cpp:47] Solver scaffolding done. 
I0929 21:50:22.816587 13779 solver.cpp:363] Solving MSE regression 
I0929 21:50:22.816596 13779 solver.cpp:364] Learning Rate Policy: inv 
I0929 21:50:22.870337 13779 solver.cpp:424] Iteration 0, Testing net (#0) 

[Screenshots: the loss curve at the beginning of training (BeginTrain) and after some time (AfterSomeTime)]
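
(A note on the solver settings echoed in the log above: with lr_policy: "inv", Caffe computes the effective learning rate as base_lr * (1 + gamma * iter)^(-power), so with these values the rate decays only very gently and a collapsing learning rate cannot explain the flat loss. A quick check in plain Python:)

    base_lr, gamma, power = 1e-4, 1e-4, 0.75

    def inv_lr(it):
        # Caffe's "inv" policy: base_lr * (1 + gamma * iter) ** (-power)
        return base_lr * (1.0 + gamma * it) ** (-power)

    for it in (0, 1000, 5000, 10000):
        print(it, inv_lr(it))
    # Even at iteration 10000 the rate has only dropped to ~0.59 * base_lr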

Update

After changing my training images (following @lejlot's answer below), here is my data:

It seems to be learning now:

[Updated loss plots: Train1Update and Train2Update]

Answer

The loss does go down. But there is clearly something wrong with your data. Before any learning (iteration 0) you already have a loss of 0.0006, which is an extremely small loss for a random model, so your data looks very strange. Take a look at your float labels: are they really nicely distributed between -1 and 1, or is it more like 99% zeros and a handful of other values?

There is nothing wrong with the approach itself; you simply need to analyze your data more. Do you actually scale it to the [-1, 1] interval? Once you fix that, there will be many small things to play around with, but this is the biggest problem right now: a random model is already on its way to a tiny error, so the issue is the data, not the algorithm/method/parameters. To speed up learning you can increase the learning rate from the 0.0001 you are currently using, but fix your data first, as said before.
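
One way to act on this advice: Caffe's EuclideanLoss computes 1/(2N) * sum ||prediction - label||^2. If the labels were really spread uniformly over (-1, 1) and a freshly initialized net predicted roughly 0, the iteration-0 loss should be around E[y^2]/2 = (1/3)/2 ≈ 0.17, far above the observed 0.0006. A small sketch for inspecting the labels (assuming h5py, and that train_hdf5file.txt points to a file named train.h5):

    import h5py
    import numpy as np

    with h5py.File('train.h5', 'r') as f:  # path taken from train_hdf5file.txt
        y = f['label'][:].ravel()

    print('min/max :', y.min(), y.max())
    print('mean/std:', y.mean(), y.std())
    print('near-zero fraction:', np.mean(np.abs(y) < 1e-3))
    # Uniform labels in (-1, 1) would give mean ~ 0, std ~ 0.58,
    # and an almost empty near-zero bucket.
    print(np.histogram(y, bins=10, range=(-1, 1))[0])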


Thanks for the reply, @lejlot. I found that I was dividing the image data by 255 twice: I now realize that caffe.io.load_image does the division itself, so I did not need to do it explicitly. Also, I could not increase the learning rate because of -nan errors. With a learning rate of 0.0001 I now get an iteration-0 loss of 0.08, and hopefully I will see a noticeable decrease in the loss. I hope it is fine now. Could you also tell me what a satisfactory loss would be in my case (i.e., when I should consider my training good enough)? I have attached the new training images. – magneto
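
(For anyone hitting the same double-scaling bug: caffe.io.load_image returns floats already scaled to [0, 1], since it uses skimage's img_as_float internally, so an extra division by 255 squashes all pixels into [0, ~0.004]. A quick check, with a hypothetical image path:)

    import caffe

    img = caffe.io.load_image('example.jpg')  # H x W x 3, floats in [0, 1]
    print(img.dtype, img.min(), img.max())

    # Dividing again by 255 shrinks the dynamic range almost to nothing:
    print((img / 255.0).max())  # at most ~0.0039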


You cannot say in advance what a "satisfactory loss" is; you have to test the model and analyze the problem from the perspective of your data. The number itself does not matter at all. In particular, the training loss means almost nothing: you can usually train down to zero training error (even though that is not advisable). – lejlot
