Weakly-supervised structured output learning with flexible and latent graphs using high-order loss functions

Date

2015

Authors

Carneiro, G.
Peng, T.
Bayer, C.
Navab, N.

Type

Conference paper

Citation

Proceedings / IEEE International Conference on Computer Vision, 2015, vol. 2015 International Conference on Computer Vision, ICCV 2015, pp. 648-656

Statement of Responsibility

Gustavo Carneiro, Tingying Peng, Christine Bayer, Nassir Navab

Conference Name

2015 IEEE International Conference on Computer Vision (ICCV 2015) (11 Dec 2015 - 18 Dec 2015 : Santiago, Chile)

Abstract

We introduce two new structured output models that use a latent graph that is flexible in both its number of nodes and its structure; the training process minimises a high-order loss function using a weakly annotated training set. These models are developed in the context of microscopy imaging of malignant tumours, where estimating the number and proportion of classes of microcirculatory supply units (MCSU) is important for assessing the efficacy of common cancer treatments (an MCSU is a region of the tumour tissue supplied by a microvessel). The proposed methodologies take as input multimodal microscopy images of a tumour and estimate the number and proportion of MCSU classes. This estimation is facilitated by an underlying latent graph (not present in the manual annotations), in which each MCSU is represented by a node labelled with the MCSU class and image location. The training process uses the available manual weak annotations, consisting of the number of MCSU classes per training image, and the training objective is the minimisation of a high-order loss function based on the norm of the error between the manual and estimated annotations. One of the proposed models is based on a new flexible latent structure support vector machine (FLSSVM) and the other on a deep convolutional neural network (DCNN). Using a dataset of 89 weakly annotated pairs of multimodal images from eight tumours, we show that the quantitative results from DCNN are superior, but the qualitative results from FLSSVM are better, and both display high correlation values regarding the number and proportion of MCSU classes compared to the manual annotations.
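
As an illustration of the kind of loss the abstract describes, the following minimal sketch computes the norm of the error between annotated and estimated per-class counts, where the estimate is read off the class labels of the latent graph nodes. It is not the authors' implementation. The function names, the choice of three classes, the reading of the weak annotation as a per-class count vector, and the use of a p-norm with p = 2 are all assumptions made for this example.

```python
import math
from collections import Counter


def class_counts(node_labels, num_classes):
    """Count how many latent-graph nodes carry each class label (0..num_classes-1)."""
    counts = Counter(node_labels)
    return [counts.get(c, 0) for c in range(num_classes)]


def count_loss(annotated_counts, node_labels, p=2):
    """p-norm of the difference between annotated and estimated per-class counts.

    The loss is 'high-order' in the sense that it depends on all node labels
    jointly rather than decomposing over individual nodes.
    """
    estimated = class_counts(node_labels, len(annotated_counts))
    total = sum(abs(a - e) ** p for a, e in zip(annotated_counts, estimated))
    return total ** (1.0 / p)


def proportions(counts):
    """Convert per-class counts to proportions (all zeros if there are no nodes)."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Hypothetical image: weak annotation says 2, 1 and 0 MCSUs of classes 0, 1 and 2,
# while the latent graph currently has four nodes labelled 0, 1, 1 and 2.
annotated = [2, 1, 0]
latent_labels = [0, 1, 1, 2]
print(count_loss(annotated, latent_labels))
print(proportions(class_counts(latent_labels, 3)))
```

Because the latent graph may have any number of nodes, the label list can be of any length; only its class histogram enters the loss, which is why per-image counts suffice as supervision in this sketch.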

Rights

© 2015 IEEE
