Real Time Stereoscopic Streaming

Special thanks

I would particularly like to thank my professor, Mr. Uchio, who greatly helped me with both the human and the technical sides of my internship. I felt really pampered in his laboratory.
I also want to thank Mr. Rafael Sierra for his friendship, wise advice, and accurate supervision ;-)
My stay in the lab would not have been so pleasant without Mrs. Toyokuni, who taught me about Japanese culture, gave me further explanations about the language, and even put up with such a bothersome French boy.
This leads me to someone else I want to thank: Professor Higashi, who knew the right words to use when teaching Japanese to non-native English speakers.
Last but not least, I thank my girlfriend, Naoko, for helping me out in this great but hard-to-understand country.

Of course, I am grateful to many others, but the full list would be too long to write out. So let me thank Mrs. Nakatani, Prof. Tokoi, Prof. Wada, Philippe Ngo, everyone in my laboratory, and all the others (and they are many) who helped me during this internship.

Introduction

The aim of this project is quite easy to understand. The way to implement it, however, is a bit more complex.
There are two main actors in the system: the expert and the farmer. The farmer can see objects, such as fruit or vegetables, but would sometimes appreciate advice from an expert in order to improve his farming. So he wears a special helmet containing two cameras. Those cameras record exactly what the farmer is looking at and transmit the two video streams over the Internet. Somewhere else, the expert receives those streams, and the application recomposes a 3D picture. In that way, the expert sees exactly what the farmer is seeing, provided that he has stereoscopic glasses or a special 3D screen.

To sum up, the project is real-time stereoscopic streaming.

Before explaining the project further, I will give a short description of the working context: the country and the university.

Project background

Japan

I did this internship at the University of Wakayama, in Wakayama Prefecture, Japan. This country is interesting from many points of view: cutting-edge technologies stand alongside traditional customs, glass buildings neighbor temples and shrines, and the peaceful character of Japanese people contrasts with the wildness of nature. Within six months, I experienced five earthquakes and four typhoons, and volcanoes and tsunamis were never far away. On the other hand, Japanese gardens, natural hot springs, Japanese food, and temples more than balance these drawbacks.

I was also able to find some time to study Japanese (2 hours per week), which greatly helped me, since most people here speak only Japanese.
Life here is quite expensive, but I could manage thanks to a short part-time job for LIMSI, a laboratory of the French national research center (CNRS). Since I was working mainly on the university project, and on the LIMSI project on weekends, there was not much time for tourism. However, I could visit enough to be impressed by the cultural background of Japan.

The university

The internship took place in the Networking lab of the university. The lab has several research axes, such as wireless networking, water activity control, and GPS data analysis. All those axes serve a more general purpose: agricultural improvement. Uchio sensei, the head of the lab, supervised all those projects, and Mr. Sierra supervised me. He is working on a thesis about network improvement, mainly using regions of interest, so my project was directly linked to his.
As far as hardware is concerned, I must say we were pampered by Uchio sensei. We could get any hardware we needed, regardless of the price. The difference between Japanese and French research is impressive from this point of view. I cannot forget that my own university (Orsay, Paris XI) once had to close in winter because it did not have enough money to pay for heating ;-)
So, with strong help on any Japanese question from both my lab mates and my Japanese teacher, and an ideal working environment, it was a real pleasure to go to work every day.

Technical difficulties

For this project, we identified several “hot points”:
– Streaming a video of a type we do not know (compressed, raw data, RGB, YUV…). We did not know the type mainly because, at the time we developed the application, we had no idea which camera would be used.
– Maintaining a good quality of video, even if the network is congested (congestion control).
– Adapting the receiving application to stereoscopic devices like shutter glasses or 3D screens, knowing that no two stereoscopic devices followed the same standard.
– Finding a way to improve the speed of the system, and allowing re-encoding “on the fly”.
– Modifying the IP header of our packets.
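
To illustrate the third point, here is a minimal Python sketch of how a receiver could hide device differences behind one function that packs a left and a right frame into the layout a given device expects. It is not the code of the application described in this report: the layout names ("side_by_side", "row_interleaved"), the function names, and the representation of a frame as a list of pixel rows are all assumptions chosen for the example.

```python
from typing import Callable, Dict, List

# Assumption for this sketch: a frame is a list of rows, each row a list of pixels.
Frame = List[list]


def _check_same_size(left: Frame, right: Frame) -> None:
    """Both eyes must deliver frames with identical dimensions."""
    if len(left) != len(right) or any(len(l) != len(r) for l, r in zip(left, right)):
        raise ValueError("left and right frames must have the same size")


def pack_side_by_side(left: Frame, right: Frame) -> Frame:
    """Left image in the left half of the output, right image in the right half."""
    _check_same_size(left, right)
    return [l_row + r_row for l_row, r_row in zip(left, right)]


def pack_row_interleaved(left: Frame, right: Frame) -> Frame:
    """Even rows are taken from the left eye, odd rows from the right eye."""
    _check_same_size(left, right)
    return [list((left if y % 2 == 0 else right)[y]) for y in range(len(left))]


# Hypothetical registry: one entry per output layout the receiver supports.
PACKERS: Dict[str, Callable[[Frame, Frame], Frame]] = {
    "side_by_side": pack_side_by_side,
    "row_interleaved": pack_row_interleaved,
}


def pack_for_device(layout: str, left: Frame, right: Frame) -> Frame:
    """Pack a stereo pair into the layout named by `layout`."""
    try:
        packer = PACKERS[layout]
    except KeyError:
        raise ValueError(f"unsupported stereo layout: {layout}") from None
    return packer(left, right)
```

With such a registry, supporting one more device would only mean adding one packing function, while the rest of the receiving application keeps calling pack_for_device.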

Knowing all those possible hot points, three solutions emerged. To simplify the network process, we would design a new networking model for streaming. To improve the speed of the application, we would use the processor of the 3D graphics card to perform some costly operations. To make everything simple for the final user, we would build a website and dynamically update the stream information.
However, it was not certain that all those solutions, particularly the first and the second, were feasible. So the secondary objective of this internship was to research and determine whether those solutions were suitable or not.

Preliminary study: the DV format and Linux drivers

In order to understand how to manipulate DV streams, the first thing we developed was a simple DV player, running on Linux (Red Hat 9), with a GTK+ interface.
This part involved three steps: installing the proper DV drivers, getting an API to decode a DV stream, and finally building a GTK+ interface to display the decoded stream in a window.
In order to get into the network part of the project, we first studied DVTS, an open-source application for transmitting DV packets over the Internet. DVTS uses RTP, so it looked quite similar to the application we wanted to build. In addition, DVTS works under many operating systems, so our first idea was to use the source code of DVTS to build our own application.
We had to give up this idea quite early, because we could not get the source code for the Windows version, and our initial stereoscopic glasses worked only under Windows. In addition, DVTS supports only DV, and knowing that with a little bit of patience we could use all the codecs used by media player, it would have been a pity to tie ourselves to this limitation.
Anyway, this preliminary study helped us to understand a basic media format, DV, to deal with Linux drivers and GTK+ user interfaces, and to get used to RTP transmissions.
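
As an illustration of the RTP transmissions mentioned above, the following Python sketch builds and parses the 12-byte fixed RTP header defined in RFC 3550. It is not taken from DVTS or from our application: the function names are hypothetical, and the payload type 96 used in the example is only an assumption (a value from the dynamic range), not the type DVTS actually uses.

```python
import struct

RTP_VERSION = 2
# Fixed RTP header (RFC 3550): two flag bytes, 16-bit sequence number,
# 32-bit timestamp, 32-bit SSRC, all in network byte order.
_HEADER = struct.Struct("!BBHII")


def build_rtp_header(payload_type: int, sequence: int, timestamp: int,
                     ssrc: int, marker: bool = False) -> bytes:
    """Build a fixed header with no padding, no extension and no CSRC entries."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return _HEADER.pack(byte0, byte1, sequence & 0xFFFF,
                        timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def parse_rtp_header(packet: bytes) -> dict:
    """Decode the fixed header found at the start of an RTP packet."""
    byte0, byte1, sequence, timestamp, ssrc = _HEADER.unpack_from(packet)
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Example: header for one packet, followed by a dummy payload.
packet = build_rtp_header(96, sequence=1, timestamp=90000, ssrc=0x1234) + b"video data"
fields = parse_rtp_header(packet)
```

The sequence number lets the receiver detect lost or reordered packets, and the timestamp lets it play the frames back at the right moment, which is why RTP suits this kind of streaming.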