ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Training deep neural networks for stereo vision

Jure Žbontar (2016) Training deep neural networks for stereo vision. PhD thesis.

    Abstract

    We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We address this stage by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for learning a similarity measure on image patches. The first architecture is faster than the second, but produces disparity maps that are slightly less accurate. In both cases, the input to the network is a pair of small image patches and the output is a measure of similarity between them. Both architectures contain a trainable feature extractor that represents each image patch with a feature vector. The similarity between patches is measured on the feature vectors instead of the raw image intensity values. The fast architecture uses a fixed similarity measure to compare the two feature vectors, while the accurate architecture attempts to learn a good similarity measure on feature vectors. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follows: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and Middlebury stereo data sets and show that it outperforms other approaches on all three data sets.
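
To make the matching cost step in the abstract concrete, the following is a minimal sketch of how a fixed similarity measure on feature vectors could initialize a matching cost along one scanline. It rests on several assumptions that the abstract does not state: cosine similarity stands in for the unnamed fixed similarity measure, hand-written toy vectors stand in for the output of the trained feature extractor, the cost is taken as the negated similarity, and disparities are picked by a simple winner-take-all rule in place of the post-processing steps. All function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def matching_cost_row(left_feats, right_feats, max_disp):
    """Matching cost for one scanline of a rectified pair.

    left_feats[x] and right_feats[x] are per-pixel feature vectors
    (toy stand-ins for the feature extractor's output).
    cost[x][d] = -similarity(left[x], right[x - d]); positions where
    x - d falls outside the image get an infinite cost.
    """
    cost = []
    for x in range(len(left_feats)):
        row = []
        for d in range(max_disp + 1):
            if x - d < 0:
                row.append(math.inf)
            else:
                sim = cosine_similarity(left_feats[x], right_feats[x - d])
                row.append(-sim)
        cost.append(row)
    return cost


def winner_take_all(cost):
    """Pick, for every pixel, the disparity with the lowest cost."""
    return [min(range(len(c)), key=c.__getitem__) for c in cost]


# Toy scanline: the left features are the right features shifted by 2 pixels.
right_feats = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 0, 1]]
left_feats = [[1, 1, 1], [1, 1, 1]] + right_feats[:4]

cost = matching_cost_row(left_feats, right_feats, max_disp=3)
disparities = winner_take_all(cost)
```

In this toy example the pixels at positions 2 to 5 of the left scanline recover the shift of 2. The first two pixels have no true correspondence, so their values are not meaningful; in the method described above, the cost aggregation, semiglobal matching, and consistency check steps are what handle such cases.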

    Item Type: Thesis (PhD thesis)
    Keywords: stereo, matching cost, similarity learning, supervised learning, convolutional neural networks
    Number of Pages: 125
    Language of Content: English
    Mentor / Comentors:
    Name and Surname               ID     Function
    prof. dr. Yann LeCun                  Mentor
    izr. prof. dr. Janez Demšar    257    Comentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537065923)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3412
    Date Deposited: 15 Jul 2016 14:53
    Last Modified: 24 Aug 2016 08:24
    URI: http://eprints.fri.uni-lj.si/id/eprint/3412
