Zalo DS Blog

Thursday, November 16, 2006

Floats, more and more...

I have redone my profiler.
First I have redone my counter. Now I have a new one that I have tested against the DS clock, and the synchronization is perfect :D. The new timer can now give me more info about the time passed since the last time I called it.
I have also deleted the additions I made to avoid compiler optimizations. I could do this by changing my int data to volatile.
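Why volatile helps can be sketched on a PC. This is only an illustration under my own assumptions: std::chrono stands in for the counter class, and TimeMultiplications is a hypothetical name, not part of the profiler.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical host-side stand-in for the profiling loop.
// Returns the elapsed milliseconds for numOps multiplications and
// writes the last product into result.
inline std::int64_t TimeMultiplications(int numOps, int& result)
{
    assert(numOps >= 0);

    // volatile forces a real load of a and b and a real store to c on
    // every iteration, so the compiler cannot hoist or drop the multiply.
    volatile int a = 1234;
    volatile int b = 5678;
    volatile int c = 0;

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < numOps; ++i)
    {
        c = a * b;
    }
    const auto end = std::chrono::steady_clock::now();

    result = c;
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

Keep in mind that the volatile loads and stores are measured together with the operation, so the numbers are good for comparing operations with each other, not as the exact cost of a single multiply.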
So, for any operation the basic code is:

counter->Reset();
for(int i = 0; i < NUM_OPS; ++i)
{
    c = a * b;
}
printf("MUL32: %d %d\n", counter->GetTimeSinceLastCall(), c);

Simpler, as you can see. I have tested all the previous operations again and these are the results (remember NUM_OPS = 5000000):

- Multiplications: 671 ms
- Multiplications with mulf32: 896 ms
- Divisions: 2163 ms
- Divisions with divf32: 16265 ms
- 64 bits multiplications: 672 ms
- 64 bits divisions: 15817 ms
- Square roots: 8207 ms


Float operations:
- Multiplications: 3208 ms
- Divisions: 10297 ms
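The totals above can be turned into a rough cost per iteration. The helpers below are only a sketch with hypothetical names, and the clock speed is a parameter because the commonly quoted ARM9 figure of about 67 MHz is my assumption, not something measured here.

```cpp
#include <cassert>
#include <cstdint>

// Average cost of one loop iteration in nanoseconds, from a total in ms.
inline double NanosecondsPerIteration(double totalMs, std::int64_t numOps)
{
    assert(numOps > 0);
    return totalMs * 1.0e6 / static_cast<double>(numOps);
}

// Rough cycle count per iteration for a clock given in MHz
// (1 MHz = 1 cycle per microsecond = 0.001 cycles per nanosecond).
inline double CyclesPerIteration(double totalMs, std::int64_t numOps, double clockMHz)
{
    return NanosecondsPerIteration(totalMs, numOps) * clockMHz / 1000.0;
}
```

For example, 671 ms over 5000000 iterations is about 134 ns per iteration, which would be about 9 cycles at 67 MHz. That includes the loop itself and the volatile loads and stores, not only the multiplication.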

I hope it will be useful for somebody.

Wednesday, November 15, 2006

Unexpected profiling results

I am starting to hate this XDDDD

This morning I realized that I hadn't timed any float operations. When you talk about floats in the forums, everybody says: "Nooooo, stay away from them. Fixed point, fixed point!!"

I have added a multiplication and a division with float numbers to yesterday's program.
Basically the added code is like this:

float fa = 1000000000.0f;
float fb = 0.0000000001f;
float fc;
counter->Reset();
int t8 = 0;
for(int i = 0; i < NUM_OPS; ++i)
{
    fc = fa++ * fb++;
    t8 += counter->GetTimeSinceLastCall();
}
printf("FMUL T8: %d %f\n", t8, fc);

counter->Reset();
int t9 = 0;
for(int i = 0; i < NUM_OPS; ++i)
{
    fc = fa++ / fb++;
    t9 += counter->GetTimeSinceLastCall();
}
printf("FDIV T9: %d %f", t9, fc);

And... the results obtained are
FMUL 11446ms
FDIV 18619ms

Strange, isn't it? Personally I was expecting something else... I mean... fixed point is a pain in the neck and there are always problems with it (that annoying overflow is not the only one). For multiplications floats take almost 3 times as long as fixed point (11446ms against 4273ms), but with divs... they take less time than divf32 (18619ms against 20450ms)??

With these results... I don't know what to do :P Should I use floats or fixed point? Should I continue wasting my time on a FP class? I think that I am going to ask in some for... I mean, I am going to investigate a little bit.
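One detail of the float loops above may be worth checking before trusting the comparison: what the ++ does to the operands. Assuming float is IEEE-754 single precision (24 bits of mantissa), adding 1 to 1000000000.0f changes nothing, and adding 1 to 0.0000000001f gives exactly 1. The helper below is a hypothetical sketch to try this on a PC.

```cpp
#include <cassert>
#include <limits>

static_assert(std::numeric_limits<float>::digits == 24,
              "sketch assumes IEEE-754 single precision floats");

// Applies "value = value + 1.0f" count times, like repeated value++.
// volatile forces each intermediate result to be rounded to float.
inline float AfterIncrements(float value, int count)
{
    assert(count >= 0);
    volatile float v = value;
    for (int i = 0; i < count; ++i)
    {
        v = v + 1.0f;
    }
    return v;
}
```

If that assumption holds, fa stays at 1000000000 for the whole test and fb is 1, 2, 3... from the second iteration on, so the loops never operate on the very big and very small values they were initialized with.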

Tuesday, November 14, 2006

Some f32 profiling

Hi, remember that math coprocessor I wrote about yesterday? I have been doing some testing on it, to check whether it is worth using if you are not doing parallel processing.

I have written a little program that performs some simple operations and measures the time they take. Basically the code for each operation is like this:

counter->Reset();
int t1 = 0;
for(int i = 0; i < NUM_OPS; ++i)
{
    c = a++ * b++;
    t1 += counter->GetTimeSinceLastCall();
}

So, basically what I do is:
1 - Reset my counter (that's not the NDS counter but one of my own classes)
2 - Create a variable to store the time
3 - Perform the operation, plus a pair of increments so that the compiler does not delete the code for not doing anything.
4 - My counter cannot store a very long amount of time because of the DS hardware, so it is a good idea to accumulate the time passed on each iteration
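Step 4 can be sketched like this. It is not my real counter class: WrapSafeCounter and TicksToMilliseconds are hypothetical names, and the sketch assumes a free-running 16-bit tick source (as far as I know the DS hardware timers are 16 bits wide) that is read at least once before it wraps.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter fed with readings of a free-running 16-bit timer.
class WrapSafeCounter
{
public:
    explicit WrapSafeCounter(std::uint16_t now) : last_(now) {}

    void Reset(std::uint16_t now) { last_ = now; }

    // Ticks since the previous call (or Reset). The cast back to 16 bits
    // makes the subtraction wrap modulo 65536, so one timer overflow
    // between two calls still gives the right difference.
    std::uint32_t GetTicksSinceLastCall(std::uint16_t now)
    {
        const std::uint16_t delta = static_cast<std::uint16_t>(now - last_);
        last_ = now;
        return delta;
    }

private:
    std::uint16_t last_;
};

// Converts an accumulated tick total to milliseconds.
// ticksPerSecond depends on how the timer is configured (assumption).
inline std::uint64_t TicksToMilliseconds(std::uint64_t ticks, std::uint64_t ticksPerSecond)
{
    assert(ticksPerSecond > 0);
    return ticks * 1000 / ticksPerSecond;
}
```

The small deltas are added into a wider variable (like t1 in the loop), so the total can grow far beyond what the 16-bit timer can hold.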

Well, that is not going to give me the time that the Nintendo DS spends on each operation, but it gives me something to compare the operations with each other.

Each operation is run 5000000 times (NUM_OPS = 5000000).
The operations I have checked out are:
- MUL32-64: multiplications with int (4 bytes) versus multiplications with long long int (8 bytes, using mulf32 from ndslib)
- DIV32: 32-bit (4 bytes) division, first with normal division and then with the math coprocessor (using divf32 from ndslib)
- Mul64: the same operation as mulf32 from ndslib but without the displacement (shift)
- Div64: (64 / 32) bits division, with the math coprocessor (using divf64 from ndslib)
- Sqrt32: 32-bit square root, with the math coprocessor (using sqrt32 from ndslib)
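To make clear what the "displacement" is, here is a sketch of fixed point multiplication and division in plain C++. I am assuming the 20.12 format (12 fractional bits) that the f32 functions use as far as I know. FixedMul and FixedDiv are hypothetical names, and the division here is done in software, not with the coprocessor.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 20.12 fixed point, i.e. 12 fractional bits.
constexpr int kFracBits = 12;

constexpr std::int32_t IntToFixed(std::int32_t n) { return n * (1 << kFracBits); }

// Multiply: the 64-bit product has 24 fractional bits, so shift 12 away.
// (Right-shifting a negative value is arithmetic on the usual compilers.)
inline std::int32_t FixedMul(std::int32_t a, std::int32_t b)
{
    const std::int64_t wide = static_cast<std::int64_t>(a) * b;
    return static_cast<std::int32_t>(wide >> kFracBits);
}

// Divide: scale the numerator up first so the quotient keeps 12
// fractional bits. This needs a 64 / 32 bits division.
inline std::int32_t FixedDiv(std::int32_t num, std::int32_t den)
{
    assert(den != 0);
    const std::int64_t wide = static_cast<std::int64_t>(num) * (1 << kFracBits);
    return static_cast<std::int32_t>(wide / den);
}
```

In this picture Mul64 is FixedMul without the final shift, which is why comparing both shows how much the shift costs.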

After running it on my Nintendo DS I have got:
- For Mul32/64: 4425ms and 4273ms
- For Div32: 4273ms and 20450ms with math cop.
- For Mul64: 4273ms as expected... that means displacements are very fast
- For Div64: 20450ms
- For Sqrt32: 12361ms

So, draw your own conclusions, but it seems that using the math coprocessor is a little slower than not using it. For sqrt there is no problem because there is no other way to do it (well, you can implement your own sqrt with Newton's or another method and check if it is faster). But for the division the time multiplies by almost 5! That doesn't mean you shouldn't use it. In fact, while these operations are being calculated you can do other things instead of "while(...);", but that won't be possible if you use ndslib directly.
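For reference, a software sqrt with Newton's method could look like the sketch below. It is an integer version that I have not timed on the DS, so whether it beats the coprocessor is an open question. NewtonSqrt is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>

// Integer square root with Newton's method: returns floor(sqrt(n)).
inline std::uint32_t NewtonSqrt(std::uint32_t n)
{
    if (n < 2)
    {
        return n;
    }

    // Start above the real root so the sequence only goes down.
    std::uint32_t x = n / 2 + 1;
    std::uint32_t y = (x + n / x) / 2;   // Newton step for x*x - n = 0
    while (y < x)
    {
        x = y;
        y = (x + n / x) / 2;
    }

    // Sanity check: x is the largest value whose square does not exceed n.
    assert(static_cast<std::uint64_t>(x) * x <= n);
    return x;
}
```

Each step needs one division, so on a machine where division is expensive the number of iterations matters a lot. A better first guess would reduce it.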

The good news is that 64-bit multiplication (used for FP) can be done almost as fast as the 32-bit one. This is very useful for a fixed point class. What can I do with divisions... that is something I need to check a little deeper. I have written a fixed point division which uses normal 32-bit division, but it makes some evaluations and additions that surely make it slower.

If anybody wants the code one of these days to see what I have done, just ask for it. I still don't know how to upload files to this blog (maybe it can't be done XDD)

Well, that's all for today. It's been fun this time ^_^

Monday, November 13, 2006

4 days working on fixed point for nothing!! XD

Wow, it's been a long time. A week, I think... XD I was supposed to have a game by this date but the real truth is that I have nothing

It's not that I haven't been working on it. I have spent a lot of time in the past 4 days (I have been on a little vacation) working with the DS. But now I have that feeling that I have completely wasted my time. Why? Well, I have been working on a fixed point class for the DS. Really... is it a secret or something? Why did nobody in the forums say anything about this... LIBNDS HAS SUPPORT FOR FIXED POINT!!! Why does nobody mention that the NDS HAS HARDWARE SUPPORT FOR SQRT, or that even though the NDS is a 32-bit machine IT HAS SUPPORT FOR 64-BIT DIV, MUL AND SQRT!!!!!!!!!!

Well, as always... I've learnt things the hard way... haha. Well, at least it has been worth it to remember how Newton's method for sqrt works XD. And I understand how hard it is to work without floating point (really).

From now on, my first point of reference is going to be the ndslib headers. But I see something that I don't like very much: mulf32, div32 and all the others have a while(...); inside. And not only that, the methods for normalization and sqrt can be improved to be more accurate.

If you would like more info about all this stuff you can check here and also in math.h from ndslib.

Well, now I think that I can finish my fixed point class (I am not using f32, because in C++ I prefer a new type and operator overloading... maybe it is a little slower, but it looks better :P). So, let's go!
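The idea of a new type with overloaded operators can be sketched like this. It is not the real class, only a minimal illustration: the name Fixed, the 12 fractional bits and the software division are all my assumptions for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed point type with 12 fractional bits (20.12).
class Fixed
{
public:
    static constexpr int kFracBits = 12;

    Fixed() : raw_(0) {}
    static Fixed FromInt(std::int32_t n) { return Fixed(n * (1 << kFracBits)); }

    std::int32_t Raw() const { return raw_; }
    // Integer part, truncated toward zero.
    std::int32_t ToInt() const { return raw_ / (1 << kFracBits); }

    friend Fixed operator+(Fixed a, Fixed b) { return Fixed(a.raw_ + b.raw_); }
    friend Fixed operator-(Fixed a, Fixed b) { return Fixed(a.raw_ - b.raw_); }

    // 64-bit intermediate product, then shift the extra fraction bits away.
    friend Fixed operator*(Fixed a, Fixed b)
    {
        const std::int64_t wide = static_cast<std::int64_t>(a.raw_) * b.raw_;
        return Fixed(static_cast<std::int32_t>(wide >> kFracBits));
    }

    // Scale the numerator up before dividing to keep the fraction bits.
    friend Fixed operator/(Fixed a, Fixed b)
    {
        assert(b.raw_ != 0);
        const std::int64_t wide = static_cast<std::int64_t>(a.raw_) * (1 << kFracBits);
        return Fixed(static_cast<std::int32_t>(wide / b.raw_));
    }

    friend bool operator==(Fixed a, Fixed b) { return a.raw_ == b.raw_; }

private:
    explicit Fixed(std::int32_t raw) : raw_(raw) {}
    std::int32_t raw_;
};
```

With this, code like Fixed::FromInt(3) / Fixed::FromInt(2) reads like normal arithmetic, and the private raw constructor keeps plain ints from being mixed in by accident.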

Monday, November 06, 2006

Another Nintendo DS development blog

Hi, there.

Finally I have decided to write a blog about all the stuff I am doing with my DS. When I bought this wonderful machine one and a half years ago I did it because it was a Nintendo machine and I had the money for it. Yeah, I bought it because it was so promising, but I couldn't realize how promising it was going to be.

After all these months of fun (especially with Mario Kart), two months ago I decided to read some tutorials showing how to program it. It didn't look very hard, so after reading all of Chris Double's tutorials I finally started coding some stuff.

At first (and at this time too) my aim is to know how this machine works. I don't want to make a wonderful game or something, only investigation. During these last few weeks I have been developing my own library, mainly for video stuff, and now I have started coding a little game (I am not going to spend more than a week on it, so it is very simple).

Why haven't I used the wonderful palib? - you may be asking - Because, as I said, my aim is to know how the Nintendo DS works. Now I have another lib called ZaloDSLib (:D) and it is what I am going to use from now on. Maybe I will take a look at some aspects of palib in the future :D (it is not copying, it is investigation)

So, what is this blog going to be about? About my research on the Nintendo DS. Thank you for reading (if someone is going to do it)

PS: ahm... by the way, sorry for my English, I am doing my best ^_^U