# Geometric Deep Learning (Introduction) & Applications

## Who am I

• From Iceland, live in Denmark
• Postdoc at DTU Compute focusing on geometric deep learning (GDL)
• Image Analysis & Computer Graphics Group
• PhD in Applied Mathematics defended in April 2018
• Focus on Statistical Learning, in particular sparse classification
• During the summer at deCODE genetics here in Iceland

## Collaborators in GDL

• Rasmus R. Paulsen
• PhD supervisor
• Line K. H. Clemmensen
• PhD co-supervisor
• Others in reading group (geo-dl.compute.dtu.dk)

## Prediction of Facial Landmarks

• We are interested in modeling the external anatomy of face and ear
• Used for acoustic simulations to find the optimal placement of microphones in hearing aids
• Accurate landmarks for phenotyping (what is a phenotype?)
• GDL caught our attention

## Outline

• What is geometric about GDL?
• Why do we need GDL?
• Different Data, Different Problems and Different Approaches
• Preliminary Results for landmark prediction on faces
• Deep Learning and Genetics

## What do we refer to as geometric?

• The geometry is part of the data itself, in contrast to images, which provide a single view and can be regarded as a Euclidean grid
• Different input data
• 3D-point clouds, e.g. from 3D scans
• Volumetric data, e.g. discretization of meshes or medical scans
• Meshes, e.g. Computer Aided Design drawings or Computer Graphics objects
• Usually surfaces embedded in 3D
• Non-Euclidean geometries: how do we calculate distances?
• Other graph structures, e.g. social networks or epidemiological networks.

## 3D point clouds

Stanford Bunny Point Cloud

## Volumetric data

Stanford Bunny Volumetric

## Meshes

Stanford Bunny Mesh

## What are the applications?

• Classification
• Big catalogues of CAD models
• Segmentation
• Semantic segmentation and scene segmentation
• Dense & sparse point correspondences
• Landmark annotations
• Shape analysis

## Why do we need DL on this?

• Better and faster methods for e.g. classification
• Challenges in generalizing DL methods to this kind of input
• We want improvements similar to image and text based problems
• Meet the increase in acquisition of 3D data
• Scanning of museum artifacts
• 3D-face and body scans
• Archeology scans
• Quality assurance in factories, real time scanning is on the way

## Structured Light Scanning

Scanning of Polar Bear Skulls

## Different Approaches for Different Data

• Different representations of data call for different approaches
• Different problems to tackle!
• We need invariances to different properties of the data, e.g.:
• Point Clouds should be invariant to permutations of the individual points
• Volumetric data should be invariant to orientation
• Meshes should be invariant to changes in triangulation of faces

## Point Clouds Example Approaches

• PointNet and PointNet++ from Stanford
• Implemented for 3D classification and segmentation
• Each point is treated independently as a 3D point in the input

## PointNet Requirements

• Unordered, invariant to the N! permutations of the input
• Interaction Among Points, need to capture combinatorial interactions in local structures
• Invariance Under Transformations, rotation and translation should not affect our predictions/classifications

## Strategies for learning with unordered data

• Sort input in canonical order
• No ordering is stable in high dimensional space for point perturbations
• Treat input as a sequence for an RNN and permute training data
• Hard to scale to long sequences: works for N = 10-100, but point clouds usually have at least 1000 points (often far more)
• Simple symmetric functions for information aggregation
• The authors choose this!
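
To make the contrast concrete, here is a minimal sketch comparing the first and third strategies on a toy cloud of two points. The points, the perturbation size, and the function names `canonical_order` and `max_pool` are made up for illustration; they are not from the PointNet paper or code.

```python
def canonical_order(points):
    """Strategy 1: sort points lexicographically (x, then y, then z)."""
    return sorted(points)


def max_pool(points):
    """Strategy 3: a symmetric function, here the coordinate-wise maximum."""
    return tuple(max(coords) for coords in zip(*points))


# Two hypothetical points whose x coordinates are almost equal.
a = (1.0000, 0.0, 0.0)
b = (1.0001, 5.0, 0.0)
cloud = [a, b]

# Nudge point a by 0.0002 along x, a tiny perturbation.
a_perturbed = (1.0002, 0.0, 0.0)
cloud_perturbed = [a_perturbed, b]

# Sorting: the two points swap places, so a network that reads the
# flattened sequence sees a very different input.
order_before = [cloud.index(p) for p in canonical_order(cloud)]
order_after = [cloud_perturbed.index(p) for p in canonical_order(cloud_perturbed)]

# Max pooling: the output moves by no more than the perturbation itself
# and does not depend on the order of the list.
pooled_before = max_pool(cloud)
pooled_after = max_pool(cloud_perturbed)
pooled_reversed = max_pool(cloud[::-1])

print(order_before, order_after)
print(pooled_before, pooled_after, pooled_reversed)
```

The sort order flips under the small nudge, while the pooled value barely changes and is identical for the reversed list.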

## PointNet Idea

• Approximate a general function on a point set by applying a symmetric function to transformed elements of the set: $f(\{ x_1, \dots, x_n \}) \approx g(h(x_1), \dots, h(x_n))$
• $f$ takes a set as input, so it must be invariant to permutations
• $h$ is a multi-layer perceptron
• $g$ is a composition of a single variable function and max pooling
• With several different $h$ functions we can create a global descriptor of the point cloud
• Global descriptors are used for classification
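
The idea above can be sketched in a few lines of plain Python. This is an untrained toy under stated assumptions, not the authors' implementation: the layer sizes, the random weights, the four output classes, and all function names are hypothetical.

```python
import random


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (untrained)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x, relu=True):
    """Apply one dense layer, optionally followed by a ReLU."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out


def h(point, layers):
    """Shared per-point MLP: the same weights are applied to every point."""
    x = list(point)
    for layer in layers:
        x = dense(layer, x)
    return x


def global_descriptor(points, layers):
    """Max pooling over points, one maximum per feature channel."""
    features = [h(p, layers) for p in points]
    return [max(channel) for channel in zip(*features)]


def f_approx(points, layers, head):
    """g(h(x_1), ..., h(x_n)): max pool, then a function of the pooled vector."""
    return dense(head, global_descriptor(points, layers), relu=False)


rng = random.Random(0)
layers = [make_layer(3, 8, rng), make_layer(8, 16, rng)]  # hypothetical sizes
head = make_layer(16, 4, rng)  # scores for 4 made-up classes

cloud = [(rng.random(), rng.random(), rng.random()) for _ in range(100)]
shuffled = cloud[:]
rng.shuffle(shuffled)

scores = f_approx(cloud, layers, head)
scores_shuffled = f_approx(shuffled, layers, head)
print(scores == scores_shuffled)
```

Because the only step that mixes points is a maximum over each feature channel, shuffling the cloud leaves the class scores unchanged.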

## Other Details for PointNet

• Affine transformation is predicted for canonical alignment
• Also applied to features deeper in the network
• Semantic segmentation
• Feed global descriptors to points
• Combine local and global features for point classification
• Theoretical justification for universal approximation of continuous set functions
• Trained on ModelNet40, 12k man-made CAD models from 40 categories
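
For the segmentation bullets, the step of feeding the global descriptor back to the points amounts to a concatenation. The sketch below assumes max pooling produces the global descriptor and uses made-up feature values; `segmentation_inputs` is a hypothetical helper name.

```python
def max_pool(point_features):
    """Global descriptor: one maximum per feature channel."""
    return [max(channel) for channel in zip(*point_features)]


def segmentation_inputs(point_features, global_feature):
    """Append the same global descriptor to every per-point feature vector."""
    return [list(local) + list(global_feature) for local in point_features]


# Three points with two local features each (made-up numbers).
point_features = [[0.1, 0.9], [0.4, 0.2], [0.7, 0.5]]
global_feature = max_pool(point_features)

combined = segmentation_inputs(point_features, global_feature)
for row in combined:
    print(row)
```

Each row now carries both what is specific to that point and a summary of the whole cloud, which a per-point classifier can then use.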

## PointNet Architecture

PointNet Architecture

## PointNet Results (Kinect Left, CAD Right)

Results from PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

## Volumetric and Multi-View Example Approaches

Results from Volumetric and Multi-View CNNs for Object Classification on 3D Data

## Main Ideas for Volumetric and Multi-View Approaches

• Predict labels from subvolumes
• Helps to prevent overfitting
• Data augmentation
• Multi-Orientation Pooling
• Layer which aggregates information from many different views
• Also, multi-view pooling

## Changing Voxel Resolution for Rendering and Volumetric Approaches

Results from Volumetric and Multi-View CNNs for Object Classification on 3D Data

## Invariance to isometry

• For point correspondences between two meshes, e.g. two scanned humans
• We need an intrinsic operator which only depends on the Riemannian metric of the manifold.
• Bronstein et al. propose the Laplace-Beltrami operator (LBO)
• LBO admits an eigendecomposition on a smooth compact manifold
• Generalization of Fourier series to non-Euclidean domains
• Allows for defining convolution on meshes

## Decomposition is global

Function, filter, same filter different mesh

## Solved With Localised Approaches

Different Localised Approaches

## Our Pipeline

Two stacked hourglass networks

## Performance

Performance on different landmarks

## Applications to MRI

Not restricted to 3D scans of faces

## Applications for Genetics (if time allows)

• Data-driven phenotypes (e.g. Big Five)
• Phenotypes from images
• Modern social phenotypes

## Leaders in the Field

These individuals have paved the way in collaboration with their research groups.

• Professor Leonidas Guibas, Stanford
• Innovated approaches to volumetric and point-cloud data
• Professor Michael Bronstein, University of Lugano & Intel perceptual computing
• Convolution on meshes and dense mesh correspondences

## Material for those interested

• Geometric Deep Learning SIGGRAPH ASIA 2016 course notes
• Geometric Deep Learning Webpage
• geo-dl.compute.dtu.dk, our reading group

## Thanks! Questions?

• Let me know if you come to Denmark!