CS-ME Joint Seminar

Abstract

In this talk we will present three strands of work on exploiting the
structure of man-made environments, both at small scale using RGB-D
images of scenes and at large scale using entire buildings.

Learning the 3D context of everyday objects

Previous work has shown that contextual cues can greatly help in
locating and identifying objects. In this talk, we argue that there
is a strong correlation between local 3D structure and object
placement in everyday scenes. We call this the 3D context of the
object. We present a method to capture the 3D context of different
object classes. For evaluation, we have collected a large dataset of
Microsoft Kinect frames from five different locations in Europe, which
we also make publicly available. Extensive experiments demonstrate the
plausibility of the 3D context idea and of our realization of it: the
3D structure surrounding objects in everyday scenes is a strong
indicator of their placement.
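
As a rough, hypothetical illustration of what a local 3D context
descriptor could look like (not the method presented in the talk), the
Python sketch below back-projects a Kinect depth frame into a point
cloud and bins the structure around a candidate object location into a
coarse voxel-occupancy histogram. The intrinsics, function names and
parameters are assumptions chosen for illustration only.

import numpy as np

def depth_to_points(depth_m, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    # Back-project a Kinect depth image (in metres) into a 3D point cloud.
    # The intrinsics are commonly quoted default Kinect values (an assumption).
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    pts = np.stack([x, y, depth_m], axis=-1).reshape(-1, 3)
    return pts[(pts[:, 2] > 0) & np.isfinite(pts).all(axis=1)]

def context_descriptor(points, center, radius=1.0, bins=8):
    # Coarse voxel-occupancy histogram of the 3D structure within `radius`
    # of `center` -- a stand-in for the descriptor discussed in the talk.
    local = points[np.linalg.norm(points - center, axis=1) < radius] - center
    edges = np.linspace(-radius, radius, bins + 1)
    hist, _ = np.histogramdd(local, bins=(edges, edges, edges))
    occupancy = (hist > 0).astype(np.float32).ravel()
    return occupancy / max(occupancy.sum(), 1.0)

A classifier relating such descriptors to object classes would then
exploit the correlation between local 3D structure and object placement
that the abstract refers to.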

Analysis and prediction of the large-scale structure of indoor
environments

Various robotics tasks ranging from exploration to fetch-and-carry
missions in partially known environments require the robot to predict
what lies in the unexplored part of the environment. In this talk we
first analyze a large set of indoor environments, drawing on two large
annotated floor plan data sets covering buildings from the MIT and KTH
campuses. Using tools from graph theory, we identify characteristic
properties that emerge from real-world indoor environments. Following
this analysis, we propose two methods for predicting both the topology
and the categories of rooms given a partial map. We provide extensive
experimental results that evaluate their performance; in particular, we
analyze the transferability of our models between the two data sets.
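
To make the graph-theoretic view concrete, here is a minimal,
hypothetical sketch using the networkx library, in which rooms are
nodes annotated with categories and doorways are edges. The toy floor
plan and the statistics computed on it are invented for illustration
and do not come from the MIT or KTH data sets.

import networkx as nx

# Rooms become nodes (with a category attribute), doorways become edges.
floorplan = nx.Graph()
floorplan.add_nodes_from([
    ("corridor_1", {"category": "corridor"}),
    ("office_1",   {"category": "office"}),
    ("office_2",   {"category": "office"}),
    ("kitchen_1",  {"category": "kitchen"}),
])
floorplan.add_edges_from([
    ("corridor_1", "office_1"),
    ("corridor_1", "office_2"),
    ("corridor_1", "kitchen_1"),
])

# A few of the kinds of characteristics such an analysis might examine.
print("rooms:", floorplan.number_of_nodes())
print("doorways:", floorplan.number_of_edges())
print("average degree:",
      sum(d for _, d in floorplan.degree()) / floorplan.number_of_nodes())
print("diameter:", nx.diameter(floorplan))
print("category counts:",
      {c: sum(1 for _, a in floorplan.nodes(data=True) if a["category"] == c)
       for c in ("corridor", "office", "kitchen")})

Aggregating such statistics over many annotated floor plans is one way
to characterize how real buildings are structured, and a partial map
can be represented as the same kind of graph when predicting the
missing rooms and their categories.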

Kinect@Home: Crowdsourcing a Large 3D Dataset of Real Environments

We present Kinect@Home, a project in collaboration with the MIT Media
Lab aimed at collecting a vast RGB-D dataset of real everyday living
spaces. The dataset is planned to become the largest real-world image
collection of everyday environments to date, taking advantage of a
widely adopted robotics sensor that is also in the homes of millions of
users: the Microsoft Kinect camera.

Bio:

Alper Aydemir

Computer Vision for Surface Applications, NASA JPL
Computer Vision and Active Perception Lab., KTH

Alper Aydemir is a recent PhD graduate from the Royal Institute of
Technology (KTH), where he worked on developing methods to efficiently
search for objects in large-scale environments. His research interests
include active vision, 3D SLAM, RGB-D processing and semantic mapping.
During his PhD, Alper collaborated with Prof. Patric Jensfelt and Prof.
Danica Kragic on the FP7 EU project CogX. Currently, he is a researcher
in the Computer Vision Group at NASA JPL. Alper previously interned at
the NanoRobotics Laboratory at Carnegie Mellon University and holds a
BSc in Mechatronics Engineering from Sabanci University, Turkey.

http://csc.kth.se/~aydemir