Impressive

How Benjamin Button was made on a computer.

Interactive Ray-Tracing – Coherent Rays

My article for my UCE15 subject was submitted on 1 February, and the public presentation and evaluation took place yesterday.

Here’s the abstract and the conclusion.

Abstract. In this paper, we analyse two different approaches to ray-
triangle intersections in a ray tracing context. We first developed a single
ray shooting method and took measurements of its performance. Then,
based on the initial method, a second version was implemented using
ray packets to shoot several rays at a time, by making use of SIMD
instructions.
For both versions we considered, among several parameters, the execution
time and the number of frames, rays and intersections per second. A
comparison of the data obtained showed a speedup of 2.8 between the
single ray and the ray packet versions, which was less pronounced than
the results reported in previous literature [5]. This compels us to
invest further in optimizing the method and improving the results
obtained so far.

Conclusions
In this work we performed and analysed several measurements on two
approaches to ray shooting for interactive ray tracing. These triangle
intersection methods were single ray shooting and shooting packets of several
rays. They differed in that the first was sequential, selecting one ray at a time
and testing it against all of the scene's triangles, while the second was vectorial, i.e.,
packets of four rays were tested at the same time against all of the scene's triangles
using SIMD instructions. These versions were referred to as iRT and iRT 4pack
respectively.
Using counters in the source code of the software we developed and Intel's
VTune software, we were able to conclude that the vectorial version is 2.8 times faster
than the sequential one. As expected, the cost of the problem grows linearly with
the resolution and with the number of triangles present in a scene. This can be
turned into a roughly logarithmic problem in the number of triangles if we
implement an acceleration data structure.
We also confirmed that the time-consuming factor is the ray-triangle intersection
function and that the generation time of primary rays is negligible in relation to
the total time of execution.
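
To make the sequential case concrete, here is a minimal sketch of a single ray tested against every triangle. It assumes the well-known Möller-Trumbore intersection test, which the article may or may not have used; the names `Vec3`, `Ray`, `Triangle`, `intersect` and `closestHit` are hypothetical and not taken from the iRT source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical minimal types, not the iRT data structures.
struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin, dir; };
struct Triangle { Vec3 v0, v1, v2; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}

// Moller-Trumbore test: returns true and writes the hit distance to t
// when the ray hits the triangle in front of its origin.
bool intersect(const Ray& ray, const Triangle& tri, float& t) {
    const float eps = 1e-7f;
    const Vec3 e1 = sub(tri.v1, tri.v0);
    const Vec3 e2 = sub(tri.v2, tri.v0);
    const Vec3 p = cross(ray.dir, e2);
    const float det = dot(e1, p);
    if (std::fabs(det) < eps) return false;  // ray parallel to the triangle plane
    const float inv = 1.0f / det;
    const Vec3 s = sub(ray.origin, tri.v0);
    const float u = dot(s, p) * inv;         // first barycentric coordinate
    if (u < 0.0f || u > 1.0f) return false;
    const Vec3 q = cross(s, e1);
    const float v = dot(ray.dir, q) * inv;   // second barycentric coordinate
    if (v < 0.0f || u + v > 1.0f) return false;
    t = dot(e2, q) * inv;
    return t > eps;
}

// Brute force: every triangle is tested, so the cost per ray is linear
// in the triangle count. Returns the index of the nearest hit, or -1.
int closestHit(const Ray& ray, const std::vector<Triangle>& tris, float& tNear) {
    int hit = -1;
    tNear = std::numeric_limits<float>::infinity();
    for (std::size_t i = 0; i < tris.size(); ++i) {
        float t = 0.0f;
        if (intersect(ray, tris[i], t) && t < tNear) {
            tNear = t;
            hit = static_cast<int>(i);
        }
    }
    return hit;
}
```

Calling `closestHit` once per pixel gives the linear behaviour described in the conclusions: doubling the triangles or the pixel count doubles the number of `intersect` calls.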

I’ll update this post when I receive the grade.

As for what I’m working on now, I’m using Nate Robins’ GLM to parse OBJ files, including materials, groups and textures.

I’m also in the process of adapting a BVH code to work in my application. Initial tests gave a speedup of up to 100 times.
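
A BVH gets its speedup by skipping whole groups of triangles whose bounding box the ray misses. Here is a minimal sketch of that box test, assuming axis-aligned boxes and the standard slab method; it is not the BVH code being adapted, and `AABB` and `hitsBox` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical minimal types for illustration only.
struct Vec3 { float x, y, z; };
struct Ray { Vec3 origin, dir; };
struct AABB { Vec3 lo, hi; };  // axis-aligned box: lo <= hi on every axis

// Slab test: intersect the ray with the three pairs of parallel planes
// and check that the resulting parameter intervals overlap for t >= 0.
bool hitsBox(const Ray& ray, const AABB& box) {
    const float o[3] = {ray.origin.x, ray.origin.y, ray.origin.z};
    const float d[3] = {ray.dir.x, ray.dir.y, ray.dir.z};
    const float lo[3] = {box.lo.x, box.lo.y, box.lo.z};
    const float hi[3] = {box.hi.x, box.hi.y, box.hi.z};

    float tMin = 0.0f;
    float tMax = std::numeric_limits<float>::infinity();
    for (int a = 0; a < 3; ++a) {
        if (d[a] == 0.0f) {
            // Ray parallel to this slab: it must start inside it.
            if (o[a] < lo[a] || o[a] > hi[a]) return false;
            continue;
        }
        const float inv = 1.0f / d[a];
        float t0 = (lo[a] - o[a]) * inv;
        float t1 = (hi[a] - o[a]) * inv;
        if (t0 > t1) std::swap(t0, t1);
        tMin = std::max(tMin, t0);
        tMax = std::min(tMax, t1);
        if (tMax < tMin) return false;  // intervals no longer overlap
    }
    return true;
}
```

In a BVH traversal, a `false` result for a node means none of the triangles below that node need to be tested, which is where the large savings over the brute-force loop come from.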

After that I’m going to work on threads.

Summer School CDDIP 2008

As promised, here’s the post about my experience in Denmark last July: the Conceptual Design and Development of Innovative Products.

CDDIP08 was an intensive three-week Erasmus program focused on developing new products for Bang & Olufsen. This was done by teaming up people from different countries and backgrounds. Five countries participated in the program: Denmark, Czech Republic, Portugal, The Netherlands and Poland, represented by students of Electronics, Mechanics, Software, Human Technology, and Automatic and Robotics respectively.

The program venue was in Struer, Denmark, home of B&O. We worked at the B&O Factory, with constant input from our teachers and from B&O workers. We all stayed at a school for interns called “The Gymnasium”. It was a good place to stay, with a nice garden where we could all have dinner, hang out, get to know everybody and have a couple of those great Danish beers. Oh, and the kitchen lady was so friendly and the food was always great! No soup, though…

Before we arrived, everything was already planned for each participant, from the minute of departure to the minute of arrival back home. This was amazing considering Portugal is a mess. Just for fun, my Portuguese friends and I joked that the taxis would go slower just to arrive exactly on time!!

Back to topic. The teams had 4 or 5 members, each from a different country. I was in a 4-member team: me, Kim from Denmark, Matthijs from The Netherlands and Karel from Czech Republic. Unfortunately I cannot talk about the specifics of our work (B&O policies), but I can tell how it went and how cool it was. That’s for later in this post.

During our stay in Denmark we had social arrangements so that all students and teachers could get to know Denmark and each other. That is what the next paragraphs are about.

First we went on a B&O tour. We had the chance to see the Factory: how things were organized, what this and that machine did, how they got those designs out of aluminium, and historical details about B&O. It was interesting to learn that they didn’t team up with the Nazis during WWII. We also visited their store in the Factory and I was blown away. That same day we visited Struer’s Town Hall and had lunch with Struer’s Mayor. After lunch we got a very good insight into how Denmark works as a country, and we had the chance to ask a ton of questions about it and make comparisons between countries. The next day we all had dinner at a pub, followed by bowling. At this point I could see that the Dutch and Danish students were the most communicative, most probably a cultural thing. Only later did I start to relate more with the Polish and Czech guys.

We also went rowing and barbecuing near Struer. I liked it a lot. The next day we had dinner at the homes of local B&O people. I think this was a good cultural experience, not only because we could see what a typical home in Struer looks like, but also because we could experience first hand the kindness and family values of Danish people. After this dinner my host took me on a bicycle ride around Struer. Simply beautiful. (I’m getting all nostalgic while remembering all this.)

Then we had the excursion to Aarhus to visit the company Terma and IHA. Terma makes radars, software and hardware for airplanes and satellites.

12 July was reserved for a visit to the west coast! I have tons of photos of it. As for the rest of the activities, we had a cinema tour, a visit to the Struer museum and the farewell dinner just before the final presentation day at B&O.

At the B&O Factory we had to accomplish the milestones set for each week and attended talks by each teacher. There were 6 teams, and each pair of teams got the same project. This was a great idea, since later we could see different solutions to the same problem. We had plenty of material to play with: Lego, cardboard, electronic devices, Internet access and so on. Over the weeks we had to give small presentations to the teachers and the other teams. In the second week, the other team with the same project ended up with ideas very similar to ours, so after asking permission we joined efforts and formed a 9-member team. This was great because in the end we had a working prototype with a lot of functionality.

On the last day we had the final presentation. B&O workers, our teachers and IHA’s dean attended, and later the media.

As final thoughts, I think this experience was just awesome. I made a lot of new friends, expanded my network, and got to know new cultures and new places. I also learned a lot about teamwork, proactiveness and being responsible. I have to congratulate Peter Larson from IHA and B&O for making possible this great experience, which contributed to everyone’s personal and professional growth.

Single Ray and 4 Ray Packet Versions

After some days working around a bug in my SIMD code, I finally arrived at the end of this part. There is still one thing to do: decent benchmarks. I’m going to use Intel’s Performance Libraries.

A few hours ago, both versions were tested for performance with 3 different objects. The Single Ray and 4Packet versions each have 3 different output modes:

  • PToaster – windowed version that uses PixelToaster to display the intersection results (HDR);
  • EXR – writes an OpenEXR image as output (HDR);
  • Void – produces no output, to measure the intersection time alone.

Neither version has a space partitioning algorithm implemented, so all rays are tested against each triangle, and both versions were working with a 200×200 pixel window. The shading function is very simple, giving the value zero if there is no hit and 0.5f otherwise. That makes something like:
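
The shading rule just described is small enough to sketch directly. This is a minimal illustration assuming one float intensity per pixel; `shade` and `shadeFrame` are hypothetical names, not the actual iRT functions.

```cpp
#include <cassert>
#include <vector>

// Flat shading as described in the post: 0 when the ray hits nothing,
// 0.5f when it hits any triangle.
float shade(bool hit) { return hit ? 0.5f : 0.0f; }

// Applies the rule to a whole frame given one hit flag per pixel
// (40,000 flags for a 200x200 window).
std::vector<float> shadeFrame(const std::vector<bool>& hits) {
    std::vector<float> pixels;
    pixels.reserve(hits.size());
    for (bool h : hits) pixels.push_back(shade(h));
    return pixels;
}
```

With this rule every object shows up as a uniform grey silhouette on a black background, so the timings measure intersection work rather than shading.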

The 3 objects are:

  • sphere.geom – 759 triangles – 30,360,000 ray-triangle tests per frame;
  • 9spheres.geom – 6831 triangles – 273,240,000 ray-triangle tests per frame;
  • 63spheres.geom – 47817 triangles – 1,912,680,000 ray-triangle tests per frame.
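
The per-frame figures in the list above follow from the brute-force setup: each of the 200×200 rays is tested against every triangle. Here is a small sketch of that arithmetic, assuming one primary ray per pixel; `testsPerFrame` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Number of ray-triangle tests in one frame when every pixel shoots one
// ray and every ray is tested against every triangle (no acceleration
// structure). 64-bit arithmetic keeps larger scenes from overflowing.
std::uint64_t testsPerFrame(std::uint64_t width, std::uint64_t height,
                            std::uint64_t triangles) {
    return width * height * triangles;
}
```

For example, `testsPerFrame(200, 200, 759)` gives 30,360,000, the figure listed for sphere.geom.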

Also, tests were made on a dual 3.2GHz Xeon and on the Cluster (about which I promise full details pretty soon). The times below are the time taken to generate a single frame.

Single Ray Version

  • Xeon Machine, CPU@2800MHz, sphere.geom:

PToaster: 1.35 seconds || EXR: 1.386 seconds || Void: 1.35 seconds;

  • Cluster, (to be updated)

4Packet Version

  • Xeon Machine, CPU@2800MHz:

sphere.geom:

PToaster: 0.353 seconds || EXR: 0.380 seconds || Void: 0.340 seconds;

9spheres.geom:

PToaster: 2.95 seconds || EXR: 2.98 seconds || Void: 2.94 seconds;

63spheres.geom:

PToaster: 71.7 seconds || EXR: 75.7 seconds || Void: 71.5 seconds.

  • Cluster, CPU@2500MHz:

sphere.geom:

PToaster: 0.197 seconds || EXR: 0.205 seconds || Void: 0.201 seconds;

9spheres.geom:

PToaster: 1.637 seconds || EXR: 1.650 seconds || Void: 1.656 seconds;

63spheres.geom:

PToaster: 10.976 seconds || EXR: 11.062 seconds || Void: 11.037 seconds.
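
The only case measured with both versions is sphere.geom on the Xeon machine, so that is where a speedup can be worked out. Dividing the single ray time by the 4Packet time for each output mode (reading the decimal commas in the listed times as decimal points) gives approximately:

```latex
\begin{align*}
S_{\text{PToaster}} &= \frac{1.35}{0.353} \approx 3.82 \\
S_{\text{EXR}}      &= \frac{1.386}{0.380} \approx 3.65 \\
S_{\text{Void}}     &= \frac{1.35}{0.340} \approx 3.97
\end{align*}
```

These are single-frame timings, so the ratios are rough estimates rather than proper benchmark results.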

Not bad. The SIMD implementation gave an excellent speedup. I guess that after implementing multi-threading, an acceleration data structure and multi-core support, we’ll get excellent results! 😀 Here are some screenshots:

SingleRay on sphere.geom

4Pack on sphere.geom

4Pack on 9spheres.geom

4Pack on 63spheres.geom

I’ll update this post next week, with more details and more tests 🙂

Progressing

I haven’t written anything new for some time now; I’ve been very busy with the project and personal stuff.

Since the last time I posted, the iRT project has seen good improvement. But let’s start from the beginning.
The project structure for November and December is the following:

  • Starting point code: here; clean the code, leaving it nice to read,
  • Develop a true single-ray, sequential interactive ray tracer (not that interactive, you’ll see later why),
  • Develop the 4-ray packet version: develop the intersection methods and shading, and in the end apply Intel’s SIMD instructions,
  • Study and improve the above algorithms in order to get better times,
  • Testing and benchmarking,
  • Write an article about coherent rays and the 4-ray packets,
  • Complete OBJ loader (right now it’s as basic as it can be).

During the next couple of days I’ll be posting about the Single Ray and 4 RayPacket versions, with benchmark tests, details on how the process is done, screenshots and SIMD instructions.

Feels like writing… random thoughts.

Notice: the following paragraphs are pretty random so please bear with me. I’ve been told I have to learn to organize my ideas better so I can better express myself 😉 Enjoy!

Ever since I created this blog I’ve only written stuff about my project and posted some ted.com talks. I now have this strange urge to write about random stuff I’ve been doing lately. You know, I’m human after all, and I tend to be far away from “geek land”.

Right now I’m also doing a bit of research on the stock market and on investments, because I don’t want my money stagnating in some random bank. I want it to grow and help others grow. No, I’m not giving money to you, dear reader, ahahah!

Right now I’m going to participate in a virtual-money stock market competition called Global Investment Challenge. It’s all real, except the money: you buy stocks, then you sell them, you lose money, you make money. I’ve been reading a book on it, and reading some business and economy newspapers (on paper and online) to get some updates and better understand this crisis. You can find info here. Unfortunately it’s only for Portuguese people. In the end you gain experience of the real-world market, and you can even win prizes like cash and a brand new car!

About one year ago I gave forex trading a try, also with play money. It didn’t go so well, I must say. Still, I think I learned something out of it.

Another thing I love is music, and right now I’m listening to Nneka’s new album. Great stuff.

At the lab I’m starting to feel more at ease with the people around me. Some are very helpful and more friendly than I could ever have guessed.

I’m also a sucker for traveling. I was in Berlin during the 1st week of October (I might have mentioned it before), and what a city! Really, it’s unbelievable. It’s safe and cheap! Even Lisbon is more expensive than Berlin!! Berlin has such an electrifying historical tension in the air: all the monuments, museums, places and old buildings with scars from gunshots and explosions on the walls. Such a mixture of cultures, people and foods; I swear I wanted to stay there for more than a week. I’m going back there pretty soon. BTW, Ryanair is having some special offers right now, so grab your ticket quick!!

So I think I’ll hit Berlin in the beginning of 2009 and later maybe Dublin or Paris. I’d like that. 🙂 Someday I’ll post about my experience on the CDDIP08 Erasmus program I went to last July, in Denmark.

PixelToaster on the move!

After some work and testing, here are the first images of the ray tracer. Right now, through PixelToaster, I am able to open a window, refresh it with a new frame, and use the keyboard to lower or raise the exposure of the image. It was quite easy to implement, because PixelToaster already works with floating-point pixel values (like OpenEXR does).

raytracing n.1

raytracing n.2

raytracing n.3
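
The exposure control mentioned above can be sketched as a simple scale on the floating-point pixel values. This is only an illustration: it assumes exposure is measured in stops (each stop doubles or halves the brightness) and clamps to [0, 1] for display. It does not use the PixelToaster API, and `applyExposure` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Scales a linear HDR intensity by 2^stops and clamps it to the
// displayable range [0, 1]. Positive stops brighten, negative darken.
float applyExposure(float hdr, float stops) {
    const float scaled = hdr * std::exp2(stops);
    return std::min(std::max(scaled, 0.0f), 1.0f);
}
```

A key handler would then only need to add or subtract a small amount from `stops` and redraw the frame with the new value.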

Now I’m starting to work with the ray packets. The first thing I’ll do is clean the code, and I think I’m going to write some documentation along the way. Then I’ll have to make 2 interactive ray tracer versions: the first one with no packets, pure sequential ray shooting/calculation, the other one using packets of 4 rays optimized for SIMD code.
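
To give an idea of what a packet of 4 rays looks like in SIMD code, here is a minimal sketch using SSE intrinsics, with one ray per lane. It assumes the Möller-Trumbore test and a structure-of-arrays packet layout. `RayPacket4`, `Triangle`, `mulSub`, `dot3` and `intersect4` are hypothetical names, and this is not the planned iRT implementation.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics; x86/x86-64 only

// Four rays in structure-of-arrays form: lane i of every register is ray i.
struct RayPacket4 {
    __m128 ox, oy, oz;  // origins
    __m128 dx, dy, dz;  // directions
};

struct Triangle { float v0[3], v1[3], v2[3]; };

// a*b - c*d for four lanes at once (one component of a cross product).
static inline __m128 mulSub(__m128 a, __m128 b, __m128 c, __m128 d) {
    return _mm_sub_ps(_mm_mul_ps(a, b), _mm_mul_ps(c, d));
}

// Dot product of two 3-vectors for four lanes at once.
static inline __m128 dot3(__m128 ax, __m128 ay, __m128 az,
                          __m128 bx, __m128 by, __m128 bz) {
    return _mm_add_ps(_mm_add_ps(_mm_mul_ps(ax, bx), _mm_mul_ps(ay, by)),
                      _mm_mul_ps(az, bz));
}

// Tests four rays against one triangle. Returns a 4-bit mask (bit i set
// when ray i hits) and writes the distances to tOut; entries of tOut
// are only meaningful for lanes whose bit is set.
int intersect4(const RayPacket4& r, const Triangle& tri, float tOut[4]) {
    const __m128 zero = _mm_setzero_ps();
    const __m128 one = _mm_set1_ps(1.0f);
    const __m128 eps = _mm_set1_ps(1e-7f);

    // Triangle edges, broadcast to all four lanes.
    const __m128 e1x = _mm_set1_ps(tri.v1[0] - tri.v0[0]);
    const __m128 e1y = _mm_set1_ps(tri.v1[1] - tri.v0[1]);
    const __m128 e1z = _mm_set1_ps(tri.v1[2] - tri.v0[2]);
    const __m128 e2x = _mm_set1_ps(tri.v2[0] - tri.v0[0]);
    const __m128 e2y = _mm_set1_ps(tri.v2[1] - tri.v0[1]);
    const __m128 e2z = _mm_set1_ps(tri.v2[2] - tri.v0[2]);

    // p = dir x e2, det = e1 . p
    const __m128 px = mulSub(r.dy, e2z, r.dz, e2y);
    const __m128 py = mulSub(r.dz, e2x, r.dx, e2z);
    const __m128 pz = mulSub(r.dx, e2y, r.dy, e2x);
    const __m128 det = dot3(e1x, e1y, e1z, px, py, pz);
    const __m128 absDet = _mm_max_ps(det, _mm_sub_ps(zero, det));
    __m128 mask = _mm_cmpgt_ps(absDet, eps);   // reject parallel rays
    const __m128 inv = _mm_div_ps(one, det);   // lanes with det == 0 stay masked out

    // s = origin - v0, u = (s . p) / det
    const __m128 sx = _mm_sub_ps(r.ox, _mm_set1_ps(tri.v0[0]));
    const __m128 sy = _mm_sub_ps(r.oy, _mm_set1_ps(tri.v0[1]));
    const __m128 sz = _mm_sub_ps(r.oz, _mm_set1_ps(tri.v0[2]));
    const __m128 u = _mm_mul_ps(dot3(sx, sy, sz, px, py, pz), inv);
    mask = _mm_and_ps(mask, _mm_and_ps(_mm_cmpge_ps(u, zero), _mm_cmple_ps(u, one)));

    // q = s x e1, v = (dir . q) / det
    const __m128 qx = mulSub(sy, e1z, sz, e1y);
    const __m128 qy = mulSub(sz, e1x, sx, e1z);
    const __m128 qz = mulSub(sx, e1y, sy, e1x);
    const __m128 v = _mm_mul_ps(dot3(r.dx, r.dy, r.dz, qx, qy, qz), inv);
    mask = _mm_and_ps(mask, _mm_and_ps(_mm_cmpge_ps(v, zero),
                                       _mm_cmple_ps(_mm_add_ps(u, v), one)));

    // t = (e2 . q) / det, accepted only in front of the origin
    const __m128 t = _mm_mul_ps(dot3(e2x, e2y, e2z, qx, qy, qz), inv);
    mask = _mm_and_ps(mask, _mm_cmpgt_ps(t, eps));

    _mm_storeu_ps(tOut, t);
    return _mm_movemask_ps(mask);
}
```

The branches of the scalar test become comparison masks here, so all four rays always do the full computation and the early exits are lost. That is one reason a 4-wide packet usually gives less than a 4x speedup.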

Things are getting interesting!