Augmented Reality

Introduction

In the past few years, Virtual Environments (a.k.a. Virtual Reality) have attracted a great deal of media attention. In Augmented Reality, by contrast, the user sees the real world, with computer graphics superimposed on or composited with it. The ultimate goal is a system in which the user cannot tell the difference between the real world and its virtual augmentation. To the user of such a system, the result would appear to be a single real scene.

Our work augments destination environments with objects cut from source environments (without prior information), rather than with 3-D models. This allows us to create virtual environments augmented with real objects such that the user cannot tell the difference between the real world and the virtual augmentation of it.

Previous Work

Previous work augments destination environments with 3-D models, using a vision-based position tracker [1].

Our Methodology

The current work models the source objects as planes. We developed a modified version of the ray tracing algorithm that renders the destination environment using these planar objects together with the possible occluding planes in the destination environment. The planes in the destination environment are modelled using single-view 3-D reconstruction methods [2].

On a per-frame basis, the object is extracted from the frame by background subtraction, and the destination image is sparsely modelled in 3-D. The 3-D and texture information of the modelled planes is determined, and the ground planes of the source and destination images are registered. The new 3-D points for the feet of the object are calculated, and a quadrilateral is constructed onto which the texture of the object is pasted. The final image is then formed using the developed rendering technique, which takes care of all possible occlusions.

Calibration of the destination image

The destination environment is calibrated using single-view 3-D reconstruction methods [2]. The calibration is used to model the possible occluding planes in this environment.

Registration of Ground planes in source and destination environments

The ground planes of the source and destination environments are registered and a homography (H1) is found. This homography is used to obtain the ground points of the object(s) in the destination environment from their counterparts in the source environment.

This registration is entirely user dependent. Registering the ground plane of the source with any plane of the destination makes the object move in that plane in the destination. For example, registering the ground of the source with the ceiling of the destination makes an object moving on the ground of the source appear to move on the ceiling of the destination.
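
As an illustration of the registration step, the following is a minimal sketch of estimating a homography such as H1 from four or more hand-clicked point pairs with the standard direct linear transform (DLT). The function names and the example coordinates are hypothetical, and the write-up does not say which estimation method was actually used; a practical version would also normalise the coordinates before solving.

```python
import numpy as np

def estimate_homography(src_pts, dst_pts):
    """Estimate a 3x3 H with dst ~ H @ src from >= 4 point pairs (plain DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is non-zero

def apply_homography(H, point):
    """Map a 2-D point through H, dividing out the homogeneous coordinate."""
    x, y = point
    p = H @ np.array([x, y, 1.0])
    return p[:2] / p[2]

# Hypothetical clicks: corners of a ground patch in source and destination.
source_clicks = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
destination_clicks = [(2.0, 1.0), (4.0, 1.2), (3.8, 2.5), (2.1, 2.4)]
H1 = estimate_homography(source_clicks, destination_clicks)
foot_in_destination = apply_homography(H1, (0.5, 0.5))
```

Clicking the destination points on a different plane (the ceiling, say) changes only the input pairs; the estimation itself is unchanged.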

Sparse Modelling of planes in destination environment

To handle occlusion, we need to model all the possible occluders in the destination environment. As the number of these occluders is generally small, they can be modelled easily. These are also modelled as planes, by one-time hand-clicking in the single-view 3-D reconstruction method [2].

Texture information of source object

The texture information of the source object(s) is extracted on a per-frame basis using techniques such as background subtraction and connectivity analysis. The texture is finally cut out by hand-clicking the points that determine the object plane. The hand-clicking needs to be done only on the first frame; a tracker can be used for the following frames.
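
The extraction step can be sketched as follows, assuming a static background image is available and that the object is the largest connected foreground region. The threshold value, the function names and the use of 4-connectivity are illustrative assumptions, not details taken from this work.

```python
from collections import deque
import numpy as np

def foreground_mask(frame, background, threshold=30.0):
    """Mark pixels that differ from the background by more than `threshold`."""
    diff = np.abs(frame.astype(float) - background.astype(float))
    if diff.ndim == 3:          # colour image: take the largest channel difference
        diff = diff.max(axis=2)
    return diff > threshold

def largest_component(mask):
    """Keep only the largest 4-connected region of a boolean mask."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    best_label, best_size, next_label = 0, 0, 1
    for sy in range(h):
        for sx in range(w):
            if not mask[sy, sx] or labels[sy, sx]:
                continue
            # Breadth-first flood fill from an unlabelled foreground pixel.
            labels[sy, sx] = next_label
            queue, size = deque([(sy, sx)]), 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not labels[ny, nx]:
                        labels[ny, nx] = next_label
                        queue.append((ny, nx))
            if size > best_size:
                best_label, best_size = next_label, size
            next_label += 1
    return labels == best_label if best_label else np.zeros_like(mask)
```

Small isolated regions caused by noise are discarded because only the largest region survives; the resulting mask selects the pixels whose texture is kept.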

Determination of object plane in destination environment

We need to determine the plane of the object in the destination environment for rendering. Due to constraints of the single-view reconstruction method, we can determine this plane only up to 2 degrees of freedom, i.e. we cannot determine the inclination of this plane to the ground plane. The plane is determined in 4 steps:

1. We click 2 points corresponding to the extremities of the object where it intersects the ground plane.
2. The homography H1 found above is applied to these points to obtain the corresponding points on the ground plane of the destination environment.
3. The single-view reconstruction method is applied to get the 3-D co-ordinates of these 2 points. This determines the plane up to the 2 degrees of freedom.
4. The height extent of the object is determined from the fixed aspect ratio.
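
Steps 3 and 4 can be sketched as below. Since the inclination cannot be recovered, this sketch assumes the object plane stands perpendicular to the ground and that the ground normal is the world z axis; both are assumptions made for illustration, as are the function and parameter names.

```python
import numpy as np

def object_quad(foot_a, foot_b, aspect_ratio, up=(0.0, 0.0, 1.0)):
    """Build the object quadrilateral from its two 3-D ground points.

    aspect_ratio is height / width of the cut-out texture.
    `up` is the assumed ground normal (the plane is assumed vertical).
    Returns corners in the order bottom-left, bottom-right, top-right, top-left.
    """
    a = np.asarray(foot_a, dtype=float)
    b = np.asarray(foot_b, dtype=float)
    up = np.asarray(up, dtype=float)
    up = up / np.linalg.norm(up)
    width = np.linalg.norm(b - a)       # extent along the ground
    height = aspect_ratio * width       # fixed aspect ratio gives the height
    return np.array([a, b, b + height * up, a + height * up])

# Hypothetical foot points 2 units apart, texture 1.5 times taller than wide.
quad = object_quad((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), aspect_ratio=1.5)
```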

Rendering

We now have all the information needed to render the augmented destination environment. On a per-frame basis, for each pixel in the target image, we find the direction of its corresponding ray (i.e. the ray whose projection is the pixel itself) with respect to the camera center.
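
For a pinhole camera, the ray direction can be computed as in the sketch below. It assumes the calibration yields an intrinsic matrix K and a rotation R with the convention x_cam = R (X - C); these symbols and the example values are assumptions for illustration rather than notation from this work.

```python
import numpy as np

def pixel_ray(K, R, pixel):
    """Unit direction, in world co-ordinates, of the ray through `pixel`."""
    u, v = pixel
    d_cam = np.linalg.solve(K, np.array([u, v, 1.0]))  # K^-1 [u, v, 1]^T
    d_world = R.T @ d_cam                              # rotate into the world frame
    return d_world / np.linalg.norm(d_world)

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
direction = pixel_ray(K, R, (320.0, 240.0))  # ray through the principal point
```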

We then find the intersection points of this ray with the modelled planes and check whether each point lies inside the boundaries of its plane. For the points that do, we calculate the distance from the camera center and choose the point with the minimum distance. The plane containing this point is the one visible at the pixel.

The co-ordinates of this point in the local co-ordinate system of the plane are then found to get the texture information, which is then pasted into the image.
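
The per-pixel test described in this section can be sketched as follows. Each modelled plane is represented here as a parallelogram given by a corner and two edge vectors, which is a representation chosen for this example; the dictionary keys and function names are hypothetical.

```python
import numpy as np

def intersect(origin, direction, plane, eps=1e-9):
    """Return (distance, s, w) if the ray hits inside the plane's bounds, else None.

    (s, w) are local co-ordinates in [0, 1] along the edge vectors 'u' and 'v'.
    """
    origin = np.asarray(origin, dtype=float)
    direction = np.asarray(direction, dtype=float)
    corner = np.asarray(plane["corner"], dtype=float)
    u = np.asarray(plane["u"], dtype=float)
    v = np.asarray(plane["v"], dtype=float)
    normal = np.cross(u, v)
    denom = normal @ direction
    if abs(denom) < 1e-12:
        return None                      # ray is parallel to the plane
    t = normal @ (corner - origin) / denom
    if t <= 0:
        return None                      # intersection is behind the camera
    hit = origin + t * direction
    # Express the hit point in the plane's local (s, w) co-ordinates.
    (s, w), *_ = np.linalg.lstsq(np.column_stack([u, v]), hit - corner, rcond=None)
    if -eps <= s <= 1 + eps and -eps <= w <= 1 + eps:
        return t * np.linalg.norm(direction), s, w
    return None

def nearest_hit(origin, direction, planes):
    """Return (plane index, distance, s, w) for the closest valid hit, or None."""
    best = None
    for index, plane in enumerate(planes):
        result = intersect(origin, direction, plane)
        if result is not None and (best is None or result[0] < best[1]):
            best = (index, *result)
    return best

def sample_texture(texture, s, w):
    """Nearest-neighbour lookup; w = 0 is the bottom row of the texture."""
    rows, cols = texture.shape[:2]
    row = int(round((1.0 - w) * (rows - 1)))
    col = int(round(s * (cols - 1)))
    return texture[row, col]
```

Because the nearest valid intersection wins, an occluding plane in front of the object plane hides it at that pixel without any special handling.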

Conclusion

We have presented a simple interactive method for augmenting a still scene with real moving objects, using a planar assumption for the objects as well as the occluders.

Bibliography:

1. R. A. Smith, A. W. Fitzgibbon, and A. Zisserman. Improving augmented reality using image and scene constraints.
2. A. M. Kushal, V. Bansal, and S. Banerjee. A simple method for interactive 3D reconstruction and camera calibration from a single view.