How hard could smoothing be?
d3x0r:
Thanx for your long, detailed response.
Speaking of converting to enum...
It is better to use enums instead of defines... the set of FACEDRAW_ values, for instance, can all be associated by putting them in an enum; I dunno; maybe it's more useful in C#... there you reference the enum type and all the values you can pick are available in a drop list. I guess the other reason I started converting defines to enums was for the Doc-o-matic code documentation tool; then all the values are grouped on the same documentation page. To the compiler enums are constants, so there's no performance hit.
I do appreciate the care for optimization during development...it's apparent to me as I was learning where things were.
My only criticism is... some lines are too long :) When breaking long lines, operators should be prefixed on the continuation line and not left on the tail of the prior line... it makes reading the expression easier because you can see what the op is, in case the line is just a little too long; it also makes it clear whether a line is a new line (no operator prefix) or a continuation.... This is a habit no one uses or teaches; but I've found it of great benefit since the early 2000's...
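A small sketch of that line-break habit, for illustration only. The constants, the axis order and both function names are hypothetical and not taken from the Blackvoxel source:

```cpp
#include <cassert>

// Hypothetical sector bounds; which axis is the long one is an assumption.
const int kDemoSizeX = 16;
const int kDemoSizeY = 64;
const int kDemoSizeZ = 16;

// Each continuation line starts with its operator, so the left margin shows
// both that the line continues and how it combines with the previous one.
inline bool InsideSector(int x, int y, int z)
{
    return x >= 0 && x < kDemoSizeX
        && y >= 0 && y < kDemoSizeY
        && z >= 0 && z < kDemoSizeZ;
}

// Same habit applied to an arithmetic expression (hypothetical memory layout).
inline int VoxelIndex(int x, int y, int z)
{
    assert(InsideSector(x, y, z));
    return z * kDemoSizeX * kDemoSizeY
         + x * kDemoSizeY
         + y;
}
```

A line that begins with `&&` or `+` can only be a continuation, which is the point being made above.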
Re: multiple rendering paths... yes, I saw the multiple paths, but the difference is how the voxels are iterated, one uses a foreach(x,y,z) the other uses foreach( in sorted list) and then did the same drawing... so I broke out the triangle emitter into a separate function that both use.
--- Code: ---void ZRender_Smooth::EmitFaces( ZVoxelType ** VoxelTypeTable, UShort &VoxelType, UShort &prevVoxelType, ULong info
, Long x, Long y, Long z
, Long Sector_Display_x, Long Sector_Display_y, Long Sector_Display_z )
--- End code ---
https://github.com/d3x0r/Blackvoxel/blob/master/src/ZRender_Smooth.cpp#L670
Guess I ended up reverting that in ZRender_Basic; so it's back to mostly original code (added culler init)
---------
My issue is... I realize that voxelSector should be a low level class, and not know things like World... and now after each new ZVoxelSector another call to InitFaceCullData has to be made... makes it less clean to just get a sector... need something like a sector factory in render or world... one in render could use one in world.... But then; Render right now is a game(?) a world(?) ... If it was a factory from render, then smooth sectors could be requested, or basic...
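A rough sketch of what such a factory could look like. Everything here is hypothetical (DemoSector, DemoSectorFactory and the stand-in body of InitFaceCullData); it only shows the idea that the factory runs the init step so callers cannot forget it:

```cpp
#include <cassert>
#include <memory>

// Hypothetical low-level sector: knows nothing about world or renderer.
struct DemoSector {
    bool cullDataReady = false;
    bool smooth = false;
};

// Hypothetical factory owned by a renderer: it decides which flavour of
// sector to hand out and performs the extra initialisation itself.
class DemoSectorFactory {
public:
    explicit DemoSectorFactory(bool smooth) : smooth_(smooth) {}

    std::unique_ptr<DemoSector> Create() const
    {
        std::unique_ptr<DemoSector> sector(new DemoSector());
        sector->smooth = smooth_;
        InitFaceCullData(*sector);   // always done here, never by the caller
        return sector;
    }

private:
    // Stand-in for the real culler setup, which is not reproduced here.
    static void InitFaceCullData(DemoSector& sector) { sector.cullDataReady = true; }

    bool smooth_;
};
```

With this shape, "just get a sector" stays a single call, and whether the result is smooth or basic depends on which factory was asked.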
------
I was having issues getting the yaw/pitch/roll reverse calculation working, so I ended up making zmatrix.cpp for those so I could just recompile the one source; since the basic types get included by like everything, a change meant having to rebuild everything. Need to get rid of the .cpp and make the yaw/pitch/roll inlines. *done* still some fixups to do in planePhysics I think...
-----
Oh, was playing blackice, and like most FPS they have a pitch lock at +90 degrees... was playin in modified blackvoxel for so long, expected the camera to auto rotate as the angle went over my head. I'm not 100% sure how this works... and I can't seem to get to exactly 90 degrees up or down, and I'm not sure what would happen if you did get there... since I'm always just a hair to the left or right of exactly down, it results in a roll that undoing results in turning you around in yaw... it's really kinda natural... anyway the snap to 'up' annoyed me yet again... it's not like that actually saves you anything, since you later have to compute sin/cos values to apply motion instead of just having the axis array (matrix) available to use....
I thought quaternions would be way complex; but their usage is really to and from storage, and as a way to smoothly move from one rotation matrix to another through a simple linear interpolation of the quaternion (which is just a vec4) that can be read from/written to the 3x3 rotation matrix. (The matrix is 4x4 because it has a translation/origin row)... would be nice if d3d and opengl used sane matrices... laid out so the x/y/z axis normal-vectors are lined up as arrays of 3 values, so you get the vectors immediately without applying a row-pitch offset thing...
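A minimal sketch of that round trip, assuming unit quaternions and a column-vector convention (v' = M v). The type and function names are made up for the example, and the interpolation shown is the normalized linear kind (often called nlerp), not a full slerp:

```cpp
#include <cassert>
#include <cmath>

struct DemoQuat { double w, x, y, z; };   // just a vec4
struct DemoMat3 { double m[3][3]; };      // m[row][col]

inline DemoQuat Normalize(const DemoQuat& q)
{
    const double len = std::sqrt(q.w * q.w + q.x * q.x + q.y * q.y + q.z * q.z);
    assert(len > 0.0);
    return DemoQuat{ q.w / len, q.x / len, q.y / len, q.z / len };
}

// Standard unit-quaternion to rotation-matrix conversion.
inline DemoMat3 ToMatrix(const DemoQuat& q)
{
    DemoMat3 r;
    r.m[0][0] = 1 - 2 * (q.y * q.y + q.z * q.z);
    r.m[0][1] = 2 * (q.x * q.y - q.w * q.z);
    r.m[0][2] = 2 * (q.x * q.z + q.w * q.y);
    r.m[1][0] = 2 * (q.x * q.y + q.w * q.z);
    r.m[1][1] = 1 - 2 * (q.x * q.x + q.z * q.z);
    r.m[1][2] = 2 * (q.y * q.z - q.w * q.x);
    r.m[2][0] = 2 * (q.x * q.z - q.w * q.y);
    r.m[2][1] = 2 * (q.y * q.z + q.w * q.x);
    r.m[2][2] = 1 - 2 * (q.x * q.x + q.y * q.y);
    return r;
}

// Linear interpolation of the four components, then renormalisation.
// The sign flip keeps the interpolation on the shorter arc.
inline DemoQuat Nlerp(const DemoQuat& a, DemoQuat b, double t)
{
    const double dot = a.w * b.w + a.x * b.x + a.y * b.y + a.z * b.z;
    if (dot < 0.0) { b.w = -b.w; b.x = -b.x; b.y = -b.y; b.z = -b.z; }
    const DemoQuat mixed = { a.w + t * (b.w - a.w), a.x + t * (b.x - a.x),
                             a.y + t * (b.y - a.y), a.z + t * (b.z - a.z) };
    return Normalize(mixed);
}
```

Interpolating halfway between the identity and a 90 degree turn about z gives a 45 degree turn about z once converted back with ToMatrix.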
------------
--- Quote ---Yep, that's a more complex algorithm and it needs more sectors. That's a great work.
Full Culling for actual unsmoothed voxels are working with 7 sectors (1 + 6 adjacent).
With need of diagonal voxel data, it will need 27 sectors (1 + 26 adjacent).
--- End quote ---
but... marching cubes is the 8 bits... which is the 8 corners... I guess
and I don't need what's diagonal... that is, the 8 corner cubes around the center... because those don't affect the shape of this or any of the near cubes... so only 18: the 6 original, plus 12 in the ordinal directions from each of the original 6 (each accessible by 2 names; FRONT_HAS_ABOVE and ABOVE_HAS_FRONT are the same cube)..... since I don't use the diagonals of the center, I don't need diagonals from the first mates either... but then I do use 18 bits instead of 8
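The counts can be checked by walking the 3x3x3 neighbourhood and classifying each offset by how many axes it moves along. This is only a counting sketch with hypothetical names, not the culler code:

```cpp
#include <cassert>

struct NeighborCounts {
    int faces;    // offsets along exactly one axis
    int edges;    // offsets along exactly two axes
    int corners;  // offsets along all three axes (the full diagonals)
};

inline NeighborCounts CountNeighbors()
{
    NeighborCounts counts = { 0, 0, 0 };
    for (int dx = -1; dx <= 1; ++dx)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dz = -1; dz <= 1; ++dz) {
                const int axes = (dx != 0) + (dy != 0) + (dz != 0);
                if (axes == 1) ++counts.faces;
                else if (axes == 2) ++counts.edges;
                else if (axes == 3) ++counts.corners;
                // axes == 0 is the center itself
            }
    return counts;
}
```

That gives 6 face neighbours and 12 edge neighbours, so 18 when the 8 corners are left out, and 26 with them (27 counting the center).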
----------------------------
Glad to meet you :) don't actually have many peers :)
can I add you on facebook maybe? I'd share my name, but it's not like John Smith... it would immediately indicate an individual so it's like posting my SSN# here :)
Uhmm... I had a TI-99/4a with speech synthesizer (why don't I have a simple speech synthesizer today?) then several years later a Tandy 1000 (no letters after, original - PC jr clone) and turbo pascal
olive:
--- Quote from: d3x0r on October 30, 2014, 07:39:24 am ---Thanx for your long, detailed response.
Speaking of converting to enum...
It is better to use enums instead of defines... the set of FACEDRAW_ values, for instance, can all be associated by putting them in an enum; I dunno; maybe it's more useful in C#... there you reference the enum type and all the values you can pick are available in a drop list. I guess the other reason I started converting defines to enums was for the Doc-o-matic code documentation tool; then all the values are grouped on the same documentation page. To the compiler enums are constants, so there's no performance hit.
--- End quote ---
We use enums a lot in Blackvoxel, but also some defines ;).
Our IDE doesn't work well with auto-completion on enum-typed parameters (though it works well on most things).
We had some issues while debugging with enums that make them uncomfortable, because the debugger only shows the enum name and not the value. So, in some cases, we used defines.
Another problem with enums is the inability to specify an underlying integer type other than the compiler's default (usually int). That's why we rarely use the enum type itself with enums. In some cases that's because we want a different storage size, but also to fix the integer size and avoid 32/64-bit issues. C++11 allows specifying the enum storage type, but we don't want to switch to it yet.
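To make the two options concrete, here is a small sketch with hypothetical names (these are not the real FACEDRAW_ values). The first half is the define plus sized typedef approach; the second half needs a C++11 compiler:

```cpp
#include <cassert>
#include <cstdint>

// Option 1: defines, with the storage type chosen separately.
typedef std::uint8_t FaceMask;
#define DEMO_FACE_LEFT  0x01
#define DEMO_FACE_RIGHT 0x02

// Option 2 (C++11): an enum with a fixed underlying type, so its size no
// longer depends on what the compiler picks by default.
enum DemoFace : std::uint8_t {
    DemoFace_Left  = 0x01,
    DemoFace_Right = 0x02,
    DemoFace_Above = 0x04
};

static_assert(sizeof(DemoFace) == 1, "DemoFace is stored in exactly one byte");

// '|' promotes both operands to int, so the result is narrowed explicitly.
inline FaceMask CombineFaces(DemoFace a, DemoFace b)
{
    return static_cast<FaceMask>(a | b);
}
```

Either way the values are compile-time constants; the difference is only in naming, grouping and control over storage size.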
About documentation, we hope we'll have time to write it one day ;).
--- Quote ---I do appreciate the care for optimization during development...it's apparent to me as I was learning where things were.
My only criticism is... some lines are too long :) When breaking long lines, operators should be prefixed on the continuation line and not left on the tail of the prior line... it makes reading the expression easier because you can see what the op is, in case the line is just a little too long; it also makes it clear whether a line is a new line (no operator prefix) or a continuation.... This is a habit no one uses or teaches; but I've found it of great benefit since the early 2000's...
--- End quote ---
We agree with you about putting the operators first on the continuation line.
About "long lines": we don't like code that is spread out too much; we find it breaks concentration.
Rather than using a single dimension, we write code in 2 dimensions to make it more compact, so we can view more of it at the same time.
As a priority, the most important code spreads vertically to be more readable, and the less important code spreads horizontally to avoid bloat as much as possible.
But we know every programmer has their own personal opinion on coding style and presentation. Whatever the arguments, it's always a matter of personal taste and programming style.
--- Quote ---Re: multiple rendering paths... yes, I saw the multiple paths, but the difference is how the voxels are iterated, one uses a foreach(x,y,z) the other uses foreach( in sorted list) and then did the same drawing... so I broke out the triangle emitter into a separate function that both use.
--- Code: ---void ZRender_Smooth::EmitFaces( ZVoxelType ** VoxelTypeTable, UShort &VoxelType, UShort &prevVoxelType, ULong info
, Long x, Long y, Long z
, Long Sector_Display_x, Long Sector_Display_y, Long Sector_Display_z )
--- End code ---
https://github.com/d3x0r/Blackvoxel/blob/master/src/ZRender_Smooth.cpp#L670
Guess I ended up reverting that in ZRender_Basic; so it's back to mostly original code (added culler init)
--- End quote ---
The render code is considered a critical portion, so adding a function call in a loop should be avoided.
There are 16x16x64 voxels in a sector, so that makes 16384 function calls.
If the engine needs to render 10 sectors per frame at 60 frames per second, that gives:
16384 * 10 * 60 = 9830400 function calls per second.
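The same arithmetic written as compile-time constants (C++11 constexpr; the constant names are only for illustration, the figures are the ones above):

```cpp
#include <cassert>
#include <cstdint>

// Sector size and example render load, as quoted in the post.
constexpr std::uint64_t kVoxelsPerSector = 16ULL * 16 * 64;
constexpr std::uint64_t kSectorsPerFrame = 10;
constexpr std::uint64_t kFramesPerSecond = 60;

// One extra call per voxel, per sector, per frame.
constexpr std::uint64_t kExtraCallsPerSecond =
    kVoxelsPerSector * kSectorsPerFrame * kFramesPerSecond;

static_assert(kVoxelsPerSector == 16384, "16 x 16 x 64 voxels");
static_assert(kExtraCallsPerSecond == 9830400, "about 9.8 million calls per second");
```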
In fact this overhead must be put in perspective, because the called function is very long and does a lot of work. But it's still an overhead that could be avoided.
You could always inline the code to remove this overhead. But some compilers may choose to ignore the request based on their own criteria. Inlining is never guaranteed.
So, in parts where you want to be sure things will be compiled the way you want, in a fully predictable way, the best is to write them as they must be compiled, without making any assumptions about what optimizations the compiler could do.
In doing so, you'll also eliminate possible performance issues that depend on the inlining heuristic oddities of a particular compiler.
Of course, I would recommend such methods only in critical code sections that really need maximum speed, because reducing code factorization also has its downsides.
--- Quote ------------
My issue is... I realize that voxelSector should be a low level class, and not know things like World... and now after each new ZVoxelSector another call to InitFaceCullData has to be made... makes it less clean to just get a sector... need something like a sector factory in render or world... one in render could use one in world.... But then; Render right now is a game(?) a world(?) ... If it was a factory from render, then smooth sectors could be requested, or basic...
------
--- End quote ---
Video games are very particular programs. That's the kind of stuff that rarely ends up being coded in a perfectly academic way.
In a game, there are a lot of inner interactions, a lot of technical complexity, a lot of technical constraints, and permanent evolution caused by new ideas.
Doing anything requires tough decisions to balance a lot of conflicting considerations.
In most cases, even the best decision will only lead to the best compromise.
Whatever you do, it will always leave you with an impression of imperfection.
--- Quote ---I was having issues getting the yaw/pitch/roll reverse calculation working, so I ended up making zmatrix.cpp for those so I could just recompile the one source; since the basic types get included by like everything, a change meant having to rebuild everything. Need to get rid of the .cpp and make the yaw/pitch/roll inlines. *done* still some fixups to do in planePhysics I think...
-----
Oh, was playing blackice, and like most FPS they have a pitch lock at +90 degrees... was playin in modified blackvoxel for so long, expected the camera to auto rotate as the angle went over my head. I'm not 100% sure how this works... and I can't seem to get to exactly 90 degrees up or down, and I'm not sure what would happen if you did get there... since I'm always just a hair to the left or right of exactly down, it results in a roll that undoing results in turning you around in yaw... it's really kinda natural... anyway the snap to 'up' annoyed me yet again... it's not like that actually saves you anything, since you later have to compute sin/cos values to apply motion instead of just having the axis array (matrix) available to use....
I thought quaternions would be way complex; but their usage is really to and from storage, and as a way to smoothly move from one rotation matrix to another through a simple linear interpolation of the quaternion (which is just a vec4) that can be read from/written to the 3x3 rotation matrix. (The matrix is 4x4 because it has a translation/origin row)... would be nice if d3d and opengl used sane matrices... laid out so the x/y/z axis normal-vectors are lined up as arrays of 3 values, so you get the vectors immediately without applying a row-pitch offset thing...
------------
but... marching cubes is the 8 bits... which is the 8 corners... I guess
and I don't need what's diagonal... that is, the 8 corner cubes around the center... because those don't affect the shape of this or any of the near cubes... so only 18: the 6 original, plus 12 in the ordinal directions from each of the original 6 (each accessible by 2 names; FRONT_HAS_ABOVE and ABOVE_HAS_FRONT are the same cube)..... since I don't use the diagonals of the center, I don't need diagonals from the first mates either... but then I do use 18 bits instead of 8
--- End quote ---
It's possible we'll use quaternions in the future, at least for some kind of airplane or rocket that needs to avoid gimbal lock. As you said, they have pros and cons. A lot of stuff can be done without them.
For the airplane, fine-tuning and balancing its behavior is a long job. Tuning in fact took longer than implementing the aircraft itself.
----------------------------
--- Quote ---Glad to meet you :) don't actually have many peers :)
can I add you on facebook maybe? I'd share my name, but it's not like John Smith... it would immediately indicate an individual so it's like posting my SSN# here :)
--- End quote ---
We are also glad to meet you :)
Yes, you can add us on Facebook. We use the Blackvoxel account as we don't have personal Facebook accounts. (But our names and the company info are here: http://www.blackvoxel.com/view.php?node=14)
--- Quote ---Uhmm... I had a TI-99/4a with speech synthesizer (why don't I have a simple speech synthesizer today?) then several years later a Tandy 1000 (no letters after, original - PC jr clone) and turbo pascal
--- End quote ---
I remember the TI-99, but it wasn't very common in our country. Here, most computers were C64, Atari, Amstrad and the local manufacturer, Thomson (these had an integrated light pen).
At that time, we were programming in Basic and assembly language. Also doing some electronics.
The C language came a few years later with the Commodore Amiga.
The Blackvoxel Team
d3x0r:
--- Quote from: olive on November 01, 2014, 01:20:42 am ---
The render code is considered a critical portion, so adding a function call in a loop should be avoided.
There are 16x16x64 voxels in a sector, so that makes 16384 function calls.
If the engine needs to render 10 sectors per frame at 60 frames per second, that gives:
16384 * 10 * 60 = 9830400 function calls per second.
--- End quote ---
True enough.... but that function itself makes at least 16 calls... to emit the 4 points with 4 texture coords, set up the primitive etc... and that's 1 face of 1 voxel... so figure perfect flatness... 256*16 ... so a 1 in 4096 increase in function calls is not significant :) Edit: I got that a little wrong... If something NEEDS to be inlined... there are always defines :) Converted some of the emitting to defines so I could have consistent texture coord references ... like point 0, face 0,1,2 from it has a texture coord defined for it....
If a compiler can't do the job 'right' then don't use it... LCC for instance; wicked fast, but linker cripples real usage. Wish I could play with icc (intel) but they're so closed for some reason.
--- Quote from: olive on November 01, 2014, 01:20:42 am ---It's possible we'll use quaternions in the future, at least for some kind of airplane or rocket that needs to avoid gimbal lock. As you said, they have pros and cons. A lot of stuff can be done without them.
For the airplane, fine-tuning and balancing its behavior is a long job. Tuning in fact took longer than implementing the aircraft itself.
--- End quote ---
Well again, quaternions shouldn't be 'used'... since to be applied for a transformation one needs to be converted to a 3x3 matrix... might as well just keep the matrix. Linear interpolation only matters for follow cams moving from arbitrary points to other arbitrary points... but mostly a follow cam will be a relative translation of an existing view camera and not really require a quaternion representation either... and the 4 coordinates don't translate into something comprehensible... so expressing any constant quat-vec4's is not really possible... only retrieving one from an existing 3x3 rotation state.... well, the i,j,k,l vectors work; but once they get combined, multiple coordinates change at once for a simple rotation.
-----------------------
But; I guess I'm really looking at the wrong scale of things...
I know you mentioned some things already... but how do I really add a new block behavior? Maybe creation of a block should create a voxel body :) A voxel brontosaurus or something... so a simple block spawns all of the others when created... and generates motion in voxel units... was considering a ground displacement for footprints... so many thoughts.
The bomb in black forest uses world distance for detection and transformation into red liquid distance.... the x64 factor for reduced world size caused much too many red blocks to be created :)
Also attempted to simply add more threads to process things; wasn't as simply extendable as I would have hoped :)
InterlockedExchange()/*windows, not sure what linux equiv is... gcc __asm...*/... "lock xchg ax, [mem] " is a cheap multiprocessor safe lock. For things like queues and lists to add/remove items to process, a simple spin-on-lock with a Relinquish() ... sched_yield() /*linux*/ or Sleep(0); /* windows */ is sufficient and doesn't require things like scheduling.. like if( locked ) sleep( until_lock_is_unlocked ); ..
static size_t lock; /* register sized variable */
while( locked_exchange( &lock, 1 ) ) { Relinquish(); };
/* do stuff */
lock = 0;
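For comparison, a portable C++11 sketch of the same spin-then-yield idea. It swaps InterlockedExchange for std::atomic_flag and Relinquish() for std::this_thread::yield(); the class and function names are hypothetical:

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

class DemoSpinLock {
public:
    // Spin until the previous value of the flag was "clear", giving up the
    // time slice between attempts instead of sleeping on a scheduler object.
    void lock()
    {
        while (flag_.test_and_set(std::memory_order_acquire))
            std::this_thread::yield();
    }

    bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

    // The release ordering makes the protected writes visible to the next owner.
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Usage sketch: the lock works with std::lock_guard like any other mutex type.
inline int GuardedAdd(DemoSpinLock& lock, int& counter, int delta)
{
    std::lock_guard<DemoSpinLock> hold(lock);
    counter += delta;   // the "do stuff" part
    return counter;
}
```

One difference from the snippet above: unlocking goes through an atomic release store rather than a plain `lock = 0;`, which is the safer choice on CPUs with weaker memory ordering than x86.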
d3x0r:
The other example of 'poor relationships' I ran into is that C# has a type called DataTable, which is a representation of a SQL table, with dynamic typed columns, rows, etc. DataTables contain columns and rows, but all columns know the DataTable they are in, and so do all rows; DataTables can also be contained in DataSets, which group DataTables and add the foreign key relationships between tables. So from any row you can get to any other row in any other table that row is remotely related to... so there are sometimes merits in having... say worlds know all sectors, but sectors know their world, and hence their renderer... or something.
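In C++ terms that two-way relationship could be sketched roughly like this. DemoWorld and DemoLinkedSector are hypothetical names; the world owns its sectors and each sector keeps a non-owning pointer back to it:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

class DemoWorld;

struct DemoLinkedSector {
    DemoWorld* world;   // non-owning back-pointer to the container
    int id;
};

class DemoWorld {
public:
    // The world creates the sector, so the back-pointer is always set.
    DemoLinkedSector& AddSector()
    {
        const int id = static_cast<int>(sectors_.size());
        sectors_.push_back(
            std::unique_ptr<DemoLinkedSector>(new DemoLinkedSector{ this, id }));
        return *sectors_.back();
    }

    DemoLinkedSector& SectorAt(std::size_t index) { return *sectors_[index]; }
    std::size_t SectorCount() const { return sectors_.size(); }

private:
    // unique_ptr keeps sector addresses stable while the vector grows.
    std::vector<std::unique_ptr<DemoLinkedSector>> sectors_;
};
```

From any sector you can then reach a sibling through `sector.world->SectorAt(n)`, the same way a DataTable row reaches other rows through its table.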
olive:
--- Quote from: d3x0r on November 01, 2014, 09:40:28 am ---True enough.... but that function itself makes at least 16 calls... to emit the 4 points with 4 texture coords, set up the primitive etc... and that's 1 face of 1 voxel... so figure perfect flatness... 256*16 ... so a 1 in 4096 increase in function calls is not significant :) Edit: I got that a little wrong... If something NEEDS to be inlined... there are always defines :) Converted some of the emitting to defines so I could have consistent texture coord references ... like point 0, face 0,1,2 from it has a texture coord defined for it....
--- End quote ---
There are 14 function calls per face, so 84 calls for all faces. So, roughly 1/100 more function calls. But there are a lot of parameters, so the cost is more like 1/50 of the calls. Sure, it's not huge.
The idea is more to write critical pieces of code in the spirit of maximum optimisation.
Defines are often used for code snippets, but they are not very handy for large portions of code, and they also have some drawbacks.
We also studied the idea of metaprogramming. But it has disadvantages too.
--- Quote ---If a compiler can't do the job 'right' then don't use it... LCC for instance; wicked fast, but linker cripples real usage. Wish I could play with icc (intel) but they're so closed for some reason.
--- End quote ---
Doing the job "right" for a compiler can also mean taking into account a lot of contradictory considerations, because modern CPUs are very complex.
As an example, a compiler must avoid over-inlining big pieces of code throughout non-critical code, because growing the code too much could lead to bad cache usage and worse performance.
But a heavily used little piece of code is a very special case that can suffer from this general behaviour.
The problem is that even the best compiler has no real means of knowing whether a code portion is critical. Its decisions are based on "general considerations". That's optimal for general code, but might not be the best for special cases. That's why the compiler sometimes needs help.
Of course, there are a lot of other means to help the compiler in such cases. Some compilers have a lot of flags for tuning. There are special non-standard extensions, like the "always_inline" attribute of gcc.
Profiling is also used to tell the compiler which pieces of code are heavily used, so it can change its rules for them.
But all these means have big drawbacks. Profiling adds a complicated phase, and special compiler flags and extensions don't work across compilers.
Here again, whichever way you choose has a price to pay.
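As an illustration of the non-standard route, here is a sketch of a wrapper macro around gcc's always_inline attribute, with MSVC's __forceinline as the counterpart. DEMO_FORCE_INLINE, PackVoxelOffset and the memory layout are hypothetical, chosen only for the example:

```cpp
#include <cassert>

// Compiler-specific "really inline this" request, falling back to plain
// 'inline' (only a hint) on compilers that offer neither extension.
#if defined(__GNUC__) || defined(__clang__)
#  define DEMO_FORCE_INLINE inline __attribute__((always_inline))
#elif defined(_MSC_VER)
#  define DEMO_FORCE_INLINE __forceinline
#else
#  define DEMO_FORCE_INLINE inline
#endif

// Small helper called once per voxel in the loop below.
DEMO_FORCE_INLINE int PackVoxelOffset(int x, int y, int z)
{
    return (z * 16 + x) * 64 + y;
}

// Hot loop over one 16x16x64 sector; with the attribute honoured, no call
// instruction should remain for PackVoxelOffset.
long long SumAllOffsets()
{
    long long sum = 0;
    for (int z = 0; z < 16; ++z)
        for (int x = 0; x < 16; ++x)
            for (int y = 0; y < 64; ++y)
                sum += PackVoxelOffset(x, y, z);
    return sum;
}
```

The #if ladder is exactly the portability cost mentioned above: each compiler needs its own spelling, and the fallback branch guarantees nothing.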
--- Quote ---Well again, quaternions shouldn't be 'used'... since to be applied for a transformation one needs to be converted to a 3x3 matrix... might as well just keep the matrix. Linear interpolation only matters for follow cams moving from arbitrary points to other arbitrary points... but mostly a follow cam will be a relative translation of an existing view camera and not really require a quaternion representation either... and the 4 coordinates don't translate into something comprehensible... so expressing any constant quat-vec4's is not really possible... only retrieving one from an existing 3x3 rotation state.... well, the i,j,k,l vectors work; but once they get combined, multiple coordinates change at once for a simple rotation.
--- End quote ---
I think it's good advice. We'll certainly follow it.
--- Quote ---But; I guess I'm really looking at the wrong scale of things...
I know you mentioned some things already... but how do I really add a new block behavior? Maybe creation of a block should create a voxel body :) A voxel brontosaurus or something... so a simple block spawns all of the others when created... and generates motion in voxel units... was considering a ground displacement for footprints... so many thoughts.
--- End quote ---
Yep, that's exactly the principle of how things work with MVI.
In Blackvoxel, a voxel can have a program defining its behavior, which could be called a voxel-shader.
In this program, you can manipulate other voxels around: make animations, chemical reactions, transformations.... nearly everything is possible.
Yes, there are amazing things to do with this.
Making a "block behavior" simply means adding the behavior code in the switch in ZVoxelReactor.cpp (the case is the voxel type). Then add the BvProp_Active=1 statement in the corresponding voxelinfo file.
As I said in a post some days ago, there are also some other classes to add if your voxel needs its own data storage for internal state, inventory or whatever you want.
We think we'll make this simpler in the future by adding an "On_Execution" kind of function in ZVoxelType. This is less efficient, but only massively used voxels need high performance.
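To give an idea of the shape of such a switch, here is a toy sketch. It is not the ZVoxelReactor code: the voxel ids, the single column of voxels and the function name are all invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::uint16_t DemoVoxelType;

// Hypothetical voxel ids; real ids come from the voxelinfo files.
const DemoVoxelType kDemoEmpty = 0;
const DemoVoxelType kDemoSand  = 1;
const DemoVoxelType kDemoStone = 2;

// One reaction pass over a vertical column (index 0 is the bottom).
// Returns how many voxels moved during the pass.
inline int ReactColumn(std::vector<DemoVoxelType>& column)
{
    int moved = 0;
    for (std::size_t y = 1; y < column.size(); ++y) {
        switch (column[y]) {          // the case is the voxel type
        case kDemoSand:               // an "active" voxel: falls into empty space
            if (column[y - 1] == kDemoEmpty) {
                column[y - 1] = kDemoSand;
                column[y] = kDemoEmpty;
                ++moved;
            }
            break;
        default:                      // inactive voxels have no behavior
            break;
        }
    }
    return moved;
}
```

Each call is one step of the simulation; a sand voxel sitting two cells above the floor needs two passes to land.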
--- Quote ---The bomb in black forest uses world distance for detection and transformation into red liquid distance.... the x64 factor for reduced world size caused much too many red blocks to be created :)
Also attempted to simply add more threads to process things; wasn't as simply extendable as I would have hoped :)
InterlockedExchange()/*windows, not sure what linux equiv is... gcc __asm...*/... "lock xchg ax, [mem] " is a cheap multiprocessor safe lock. For things like queues and lists to add/remove items to process, a simple spin-on-lock with a Relinquish() ... sched_yield() /*linux*/ or Sleep(0); /* windows */ is sufficient and doesn't require things like scheduling.. like if( locked ) sleep( until_lock_is_unlocked ); ..
static size_t lock; /* register sized variable */
while( locked_exchange( &lock, 1 ) ) { Relinquish(); };
/* do stuff */
lock = 0;
--- End quote ---
Blackvoxel uses such concepts in the message queues used for inter-thread communication. We use slightly different instructions because we work "lockless" (while ensuring multithread correctness).
In Gcc there are "compiler intrinsics": special functions converted directly to assembly. These intrinsics are translated to the right instructions for different processors.
So, we used the __sync_bool_compare_and_swap() intrinsic for our stuff.
From what I could read on the web, the InterlockedExchange() equivalent should be the __sync_lock_test_and_set() intrinsic in Gcc. But it must be verified.
Be careful to add the -march=i686 flag with gcc on Windows, otherwise the intrinsics won't compile.
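A small sketch of how these two intrinsics are typically used (GCC and Clang only). The helper names are hypothetical and this is not the Blackvoxel message queue code, just the basic patterns:

```cpp
#include <cassert>

// Lockless update: read, compute, and publish only if nobody changed the
// value in between; otherwise retry. Returns the new value.
inline long AtomicAdd(volatile long* value, long delta)
{
    for (;;) {
        const long seen = *value;
        const long wanted = seen + delta;
        if (__sync_bool_compare_and_swap(value, seen, wanted))
            return wanted;
    }
}

// Exchange-style lock: writes 1 and returns the previous content, so the
// caller owns the lock only if that previous content was 0.
inline bool TryAcquire(volatile int* lock)
{
    return __sync_lock_test_and_set(lock, 1) == 0;
}

// Matching release: writes 0 with release semantics.
inline void Release(volatile int* lock)
{
    __sync_lock_release(lock);
}
```

The compare-and-swap loop is the "lockless" pattern; the test-and-set pair is the closest match to the InterlockedExchange() spin lock discussed above.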
For information, Blackvoxel was nearly entirely developed on Linux. Even the Windows package is compiled on Linux :).
The Blackvoxel Team