Bus error when training reaches iteration 4110 #9

@Nestarneal

Hi,
I am trying to reproduce your results from the Realtime_Multi-Person_Pose_Estimation repository. I followed the Training Steps in the README to download the LMDB data and this repository for training.

I've run training three times, and each time it fails with the following error when it reaches iteration 4110:

I1031 21:17:26.459512 32355 sgd_solver.cpp:106] Iteration 4110, lr = 2e-05
*** Aborted at 1509455849 (unix time) try "date -d @1509455849" if you are using GNU date ***
PC: @     0x7fc0ee06da5e (unknown)
*** SIGBUS (@0x7ee49acba50e) received by PID 32355 (TID 0x7fc0b33dd700) from PID 18446744072011621646; stack trace: ***
    @     0x7fc0ee009cb0 (unknown)
    @     0x7fc0ee06da5e (unknown)
    @     0x7fc0eee2b9de (unknown)
    @     0x7fc0eee2ba2b (unknown)
    @     0x7fc0eff43cea caffe::db::LMDBCursor::value()
    @     0x7fc0effc890e caffe::DataReader::Body::read_one()
    @     0x7fc0effc8ef4 caffe::DataReader::Body::InternalThreadEntry()
    @     0x7fc0eff3bea5 caffe::InternalThread::entry()
    @     0x7fc0e52e242f thread_proxy
    @     0x7fc0d6cb4184 start_thread
    @     0x7fc0ee0d0ffd (unknown)
    @                0x0 (unknown)
Bus error

I created a script for training, with the following contents:

#!/usr/bin/env sh
/path/to/caffe.bin train --solver=pose_solver.prototxt --gpu=1 --weights=../../../model/vgg/VGG_ILSVRC_19_layers.caffemodel 2>&1 | tee output/$(date +%y%m%d_%H%M).txt 

Do you have any idea what might be causing this?

Many thanks.
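
A note on the trace above: the crash is in caffe::db::LMDBCursor::value(), and LMDB reads its data file through a memory map. A SIGBUS on a memory-mapped read often points to a data file that is shorter than expected (for example an incomplete download) or to a failing read on the underlying disk, though the trace alone does not confirm either. The sketch below is one way to test that guess with standard tools. The helper name check_readable and the data.mdb path are placeholders, not part of this repository. It reads the file from start to end so that a disk-level read failure appears as an ordinary error, and it prints the byte count for comparison with the expected download size.

```shell
#!/usr/bin/env sh
# Hypothetical helper (not part of the repository): read a file end to end
# so that an unreadable region shows up as a normal I/O error instead of a
# SIGBUS in the middle of training.
check_readable() {
    file="$1"
    if [ ! -f "$file" ]; then
        echo "not a regular file: $file" >&2
        return 1
    fi
    # cat exits non-zero if any read fails (for example on a bad disk block).
    if ! cat "$file" > /dev/null; then
        echo "read error in: $file" >&2
        return 1
    fi
    # Print the size so it can be compared with the expected download size.
    size=$(wc -c < "$file")
    echo "ok: $file ($((size)) bytes)"
}

# Usage (placeholder path, in the same spirit as /path/to/caffe.bin):
# check_readable /path/to/lmdb/data.mdb
```

A clean pass only shows that the file can be read sequentially. It does not validate the LMDB structure itself, so a size mismatch against the original download would still be worth checking.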
