Motivation for the Project

With this project, our aim is to complete two tasks:

  1. Learn a universal representation of objects to be manipulated (e.g. pushed around, picked up).

  2. Learn a dynamics model in order to perform a manipulation task.

We approach both of these tasks with deep learning techniques.

Representation for Arbitrary Object Manipulation

Given an arbitrary object, we use its NeRF representation to decide the next position to move the robotic arm to. This is advantageous because a NeRF does not require a prior 3D model of the object and can be obtained from a set of images alone, whereas other representations require explicit models, such as CAD models or extracted meshes.

We obtain the NeRF representation using instant-ngp. From it, we take the contour of the median level set of the NeRF's occupancy values, which gives us an outline of the object to be manipulated.
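The outline step can be sketched as follows. This is a minimal illustration, not the project's code: it assumes the occupancy values have already been sampled from the NeRF onto a 2D grid, it reads "median level set" as thresholding at the median occupancy value (an assumption), and it uses a plain NumPy neighbour check in place of the OpenCV contour routine the project used. The function name extract_outline and its level parameter are hypothetical.

```python
import numpy as np


def extract_outline(occupancy, level=None):
    """Return (row, col) indices of cells on the boundary of {occupancy >= level}.

    Hypothetical helper: `occupancy` is assumed to be a 2D grid of occupancy
    values sampled from the NeRF.
    """
    occupancy = np.asarray(occupancy, dtype=float)
    if level is None:
        # Assumption: the "median level set" is the threshold at the median value.
        level = float(np.median(occupancy))
    inside = occupancy >= level

    # Pad with False so that occupied cells on the grid edge count as boundary.
    padded = np.pad(inside, 1, mode="constant", constant_values=False)

    # A cell is interior if all four axis-aligned neighbours are also inside.
    all_neighbours_inside = (
        padded[:-2, 1:-1] & padded[2:, 1:-1] & padded[1:-1, :-2] & padded[1:-1, 2:]
    )

    # Boundary cells are inside the level set but not interior.
    boundary = inside & ~all_neighbours_inside
    return np.argwhere(boundary)


# Synthetic example: a 3x3 occupied block in an empty 7x7 grid.
grid = np.zeros((7, 7))
grid[2:5, 2:5] = 1.0
outline = extract_outline(grid, level=0.5)
print(outline)
```

The result is an unordered set of boundary cells. An ordered contour, as a dedicated contour-tracing routine would return, needs an extra tracing pass that is omitted here.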

Learning The Dynamics Model for Manipulating Arbitrary Objects

To learn how pushing dynamics work for arbitrary objects, we first need to learn how our actions relate to changes in the object's state. To train this model, we apply a series of “random” actions. A more detailed explanation of how these actions are chosen is given in the report, section III.B.

A residual dynamics model is then learned, where the state of the object at a given time and a random action are input into the model and the model aims to predict the next state.
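A residual dynamics model of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the project's actual network: the class name ResidualDynamics, the layer sizes, the 3-dimensional state (for example a planar pose) and 3-dimensional action, and the synthetic training data are all hypothetical. The only idea taken from the text is that the network takes the current state and an action and predicts the next state, here by adding a learned correction to the current state.

```python
import torch
from torch import nn


class ResidualDynamics(nn.Module):
    """Predicts next_state = state + f(state, action) with a small MLP f."""

    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, state, action):
        # The network outputs only the change in state (the residual).
        delta = self.net(torch.cat([state, action], dim=-1))
        return state + delta


def train_step(model, optimizer, state, action, next_state):
    """One gradient step on the mean squared next-state prediction error."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(state, action), next_state)
    loss.backward()
    optimizer.step()
    return loss.item()


# Synthetic stand-in for (state, random action, next state) transitions.
torch.manual_seed(0)
model = ResidualDynamics(state_dim=3, action_dim=3)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
state = torch.randn(8, 3)
action = torch.randn(8, 3)
next_state = state + 0.1 * action  # made-up dynamics, for illustration only
loss = train_step(model, optimizer, state, action, next_state)
```

Predicting the change rather than the full next state means a network that outputs zeros already reproduces "nothing moved", which is often a reasonable starting point when individual pushes shift the object only slightly.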

After the model is trained, we evaluate it on a series of test objects to see whether it can perform a planar pushing task that moves the object to a target pose. We found that it succeeds 50% of the time!

Implementation

This project required learning several new technologies and skills. I used PyTorch to train the dynamics model and instant-ngp to generate the NeRF models. Training was integrated with PyBullet, which provided a well-defined simulated robotic arm for learning the pushing dynamics. The contour extraction was done using NumPy and OpenCV. All code was written in Python.