V P U

PROCESSING




Just thinking about a hypothetical "VPU", or Voxel Processing Unit. This is actually possible with an FPGA. It would also be possible to have a popcount operation. More research is needed, but keeping as many options open as possible makes it easier to adapt to future changes in technology.

For me the ultimate computer would be an 8-core Orange Pi RV2 with the AI garbage ripped out and replaced by an FPGA.


It would only require 16 instructions:

NOT - bitwise NOT of register. 

OR - bitwise OR of register with register.

AND - bitwise AND of register with register.

ADD - register addition with register. 

SUB - register subtraction with register. 

MUL - register multiplied with register.

RS11 - right shift register 11 bits. 

RS13 - right shift register 13 bits. 

RS8 - right shift register 8 bits. 

ZSUB - register subtracted from zero (negation).

AND255 - register AND 255.

256SUB - register subtracted from 256.

ADD256 - 256 added to register.

SUB256 - 256 subtracted from register.

JELZ - jump if equal or less than zero. 

JEGZ - jump if equal or greater than zero. 

A total of 23 variables are used in the calculations, so they could all be loaded into a set of 32 general-purpose registers at the start, with the final pixel colour returned at the end.


Alternatively a RISC-V approach can be taken. First registers are constants:

register0 = 0

register1 = 256

register2 = 255

register3 = 8

register4 = 11

register5 = 13

register6 - 31 = variables. 

This way every instruction involves two registers, which makes fetching and decoding the next instruction easier.

Now there are only 9 instructions. 

NOT, AND, OR, ADD, SUB, MUL, RS, JELZ, JEGZ. 
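The reduced instruction set can be tried out in software before committing anything to an FPGA. The following is only a sketch: the [op, a, b] tuple encoding, the "a = a op b" convention, the arithmetic right shift, and the use of b as an instruction index for the two jumps are all assumptions made for illustration, not part of the design above.

```javascript
// Sketch of the 9-instruction machine. Encoding and jump convention are assumed.
function makeRegisters() {
  const r = new Int32Array(32);      // 32 registers, 32-bit signed
  r.set([0, 256, 255, 8, 11, 13]);   // register0..register5 hold the constants
  return r;                          // register6..register31 are free for variables
}

function run(program, r) {
  let pc = 0;
  while (pc < program.length) {
    const [op, a, b] = program[pc++];
    switch (op) {
      case "NOT":  r[a] = ~r[b]; break;
      case "AND":  r[a] &= r[b]; break;
      case "OR":   r[a] |= r[b]; break;
      case "ADD":  r[a] += r[b]; break;
      case "SUB":  r[a] -= r[b]; break;
      case "MUL":  r[a] = Math.imul(r[a], r[b]); break;
      case "RS":   r[a] >>= r[b]; break;            // arithmetic shift (assumed)
      case "JELZ": if (r[a] <= 0) pc = b; break;    // b = instruction index (assumed)
      case "JEGZ": if (r[a] >= 0) pc = b; break;    // b = instruction index (assumed)
      default: throw new Error("unknown op " + op);
    }
  }
  return r;
}

// RS8 then AND255 from the 16-instruction list, expressed with constant registers.
const regs = makeRegisters();
regs[6] = 0x1234;
run([["RS", 6, 3], ["AND", 6, 2]], regs);
console.log(regs[6]); // 18 (0x12)
```

RS8, RS11 and RS13 become RS with register3, register4 and register5, and AND255 becomes AND with register2, which is where the drop from 16 to 9 instructions comes from.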




OPTIMIZATIONS FOR ESP32



3D VERSION OF OFFSET CHANGE MODE



DCT IV



Middle set of numbers is encoded data.


Has good accuracy.
0.005x gives a range of 100*2pi.
0.0005x gives a range of 1000*pi.


Small error.


Precision for 3D layers (offset).







POPCOUNT


Adding a POPCOUNT instruction will allow me to decompress terrain data (offset) at a ratio of 32:1 in real time! Big thanks to high school friend Daniel Dennett for that breakthrough.



Popcount ( n: 00110011001100110011001100110011 ) 15 with 32 bits:

return ( ( n - ( ( n >>> 1 ) & 0x55555555 ) ) * 0x11111111 ) >>> 28


Popcount ( n: 00110011001100110011001100110011 ) 16 with 32 bits:

n =  (( n - ( ( n >>> 1 ) & 0x55555555 ) ) * 0x11111111) >>> 28

return n | ( ( !n ) << 4 )


Only 8 operations to count 16 bits. The '>>>' operator allows signed integer popcount! Blocks of zero are skipped before the popcount.
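As a runnable sketch of the two formulas above: the function names are made up here, and Math.imul is swapped in for the plain '*' because the full product can exceed 2^53, beyond which JavaScript numbers are no longer exact integers. Both functions assume every set bit lies under the mask 0x33333333, as in the example input.

```javascript
// Counts up to 15 set bits in a word whose bits all lie under 0x33333333.
function popcount15(n) {
  const pairs = n - ((n >>> 1) & 0x55555555);   // each 2-bit field now holds its own bit count
  return Math.imul(pairs, 0x11111111) >>> 28;   // top nibble = sum of all 8 nibbles, mod 16
}

// Counts up to 16. A full word wraps to 0, and zero words are skipped
// before the popcount, so a result of 0 is read as 16.
function popcount16(n) {
  const c = popcount15(n);
  return c | ((c === 0 ? 1 : 0) << 4);
}

console.log(popcount15(0x13131313)); // 12
console.log(popcount16(0x33333333)); // 16
```

The multiply works without stray carries because each nibble holds at most 2 after the first step, so no running sum below the top nibble can reach 16.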



Popcount ( n: 11111111111111111111111111111111 ) 32 with 32 bits:

m =   n - ( ( n >>> 1 ) & 0x55555555 ) 

n = ( ( ( m >>> 2 ) & 0x33333333 ) * 0x11111111 ) >>> 28

m = ( ( m & 0x33333333 ) * 0x11111111 ) >>> 28

return n + m + ( ( ( !n ) + ( !m ) ) << 4 )


The above map has no solid blocks, so a faster version can be used:


Popcount ( n: 11111111111111111111111111111111 ) 15 + 15 with 32 bits:

m =  n - ( ( n >>> 1 ) & 0x55555555 ) 

n = ( ( ( m >>> 2 ) & 0x33333333 ) * 0x11111111 ) >>> 28

m = ( ( m & 0x33333333 ) * 0x11111111 ) >>> 28

return n + m
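Both 32-bit variants can be written out as below. This is a sketch with invented function names, again using Math.imul in place of '*' to stay in exact 32-bit arithmetic. The word is split into two interleaved halves of 16 bits (the upper and the lower 2-bit pair of every nibble). The full version reads a half count of 0 as 16, so it assumes neither half is empty; the fast version assumes neither half is completely full.

```javascript
// Returns the set-bit counts (mod 16) of the two interleaved 16-bit halves.
function halfCounts(n) {
  const pairs = n - ((n >>> 1) & 0x55555555);
  const upper = Math.imul((pairs >>> 2) & 0x33333333, 0x11111111) >>> 28;
  const lower = Math.imul(pairs & 0x33333333, 0x11111111) >>> 28;
  return [upper, lower];
}

// Up to 32: a half that wrapped to 0 is counted as 16 (halves must not be empty).
function popcount32(n) {
  const [upper, lower] = halfCounts(n);
  return upper + lower + (((upper === 0 ? 1 : 0) + (lower === 0 ? 1 : 0)) << 4);
}

// 15 + 15: no fix-up needed when no half is completely full ("no solid blocks").
function popcountNoSolid(n) {
  const [upper, lower] = halfCounts(n);
  return upper + lower;
}

console.log(popcount32(0xFFFFFFFF));      // 32
console.log(popcountNoSolid(0x0F0F0F0F)); // 16
```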


The +m is for positive (red) gradients, and -m is for negative (blue) gradients.


For a 256x256 landscape (offset):

Each horizontal strip has a SEED with 4 relative displacements plus 2 popcount relative displacements.


SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |

SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |

SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |

SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |


0<------------------------------------- X --------------------------------->256

SEED  >   >   p   p  r  >   >   <   <  r  >   >   <   <  r  >  >  <  <  r

SEED + r - p - p

SEED  >   >   <   <  r  >   >   p   p  r  >   >   <   <  r  >  >  <  <  r

SEED + r + r - p - p

SEED  >   >   <   <  r  >   >   <   <  r  >   >   <   <  r  p  p  <  <  r

SEED + r + r + r + p + p

SEED  p   p   <   <  r  >   >   <   <  r  >   >   <   <  r  >  >  <  <  r

SEED + p + p


The SEED is decoded, then a popcount is decoded starting from the seed closest to the landscape x-coordinate, in either direction.

A 256x256 block of 65536 numbers can be created using 2048 numbers.
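The SEED layout packs exactly 8 + 4 * 6 = 32 bits, so it fits one 32-bit word. A minimal sketch of packing and unpacking follows. The bit order (height in the top 8 bits, then the four r fields from left to right) and treating each r as an unsigned 6-bit value are assumptions, since the layout above only gives the field widths.

```javascript
// Assumed layout, most significant bits first: | height:8 | r0:6 | r1:6 | r2:6 | r3:6 |
function packSeed(height, r) {
  return (((height & 0xff) << 24) |
          ((r[0] & 0x3f) << 18) |
          ((r[1] & 0x3f) << 12) |
          ((r[2] & 0x3f) << 6) |
          (r[3] & 0x3f)) >>> 0;      // >>> 0 keeps the word unsigned
}

function unpackSeed(seed) {
  return {
    height: seed >>> 24,
    r: [(seed >>> 18) & 0x3f, (seed >>> 12) & 0x3f, (seed >>> 6) & 0x3f, seed & 0x3f],
  };
}

const seed = packSeed(200, [1, 63, 0, 32]);
console.log(unpackSeed(seed)); // { height: 200, r: [ 1, 63, 0, 32 ] }

// Storage check from the text: 65536 heights at 32:1.
console.log((256 * 256) / 32); // 2048
```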




LANDSCAPE (OFFSET) RELATIVE DISPLACEMENT


IMAGE EDITOR VERSION


Heightmap of elevation data.


Move the heightmap right one pixel. Subtract this heightmap from the original to give this:


This is the relative displacement of the terrain. 
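The same shift-and-subtract step can be sketched for one row of the heightmap. The function name is invented here, and giving the first pixel a displacement of 0 (it has no left neighbour) is an assumption.

```javascript
// Shifting the row right by one pixel puts heights[x - 1] under heights[x],
// so the subtraction leaves the step between neighbouring pixels.
function relativeDisplacement(heights) {
  return heights.map((h, x) => (x === 0 ? 0 : h - heights[x - 1]));
}

console.log(relativeDisplacement([10, 11, 11, 10, 9])); // [ 0, 1, 0, -1, -1 ]
```

For terrain that never changes by more than one level per pixel, every value is -1, 0 or +1, which is what makes the red, blue and black colouring possible.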



Emboss the original heightmap:


Select the dark area and fill with red, invert the selection and fill with blue:



Black pixel = add 0

Red pixel = add +1

Blue pixel = add -1

If travelling in the opposite direction:

Black pixel = add 0

Red pixel = add -1

Blue pixel = add +1
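The colour rules amount to a running sum from a known height, with the signs swapped when walking the other way. A small sketch, assuming each pixel stores the step taken when entering it from the left, and using the letters K, R and B for black, red and blue (names chosen here for illustration):

```javascript
const STEP = { K: 0, R: +1, B: -1 };   // black, red, blue

// Height at x, starting from a known height at seedX and walking either way.
function heightAt(pixels, seedX, seedHeight, x) {
  let h = seedHeight;
  if (x >= seedX) {
    for (let i = seedX + 1; i <= x; i++) h += STEP[pixels[i]];   // travelling right
  } else {
    for (let i = seedX; i > x; i--) h -= STEP[pixels[i]];        // opposite direction: signs flip
  }
  return h;
}

const row = "KRKBB";                     // encodes the heights 10, 11, 11, 10, 9
console.log(heightAt(row, 0, 10, 4));    // 9
console.log(heightAt(row, 4, 9, 0));     // 10
```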



JAVASCRIPT VERSION









The shaded green areas are single gradients (32 voxels per count). The non-shaded areas are dual gradients (16 voxels per count), subtracted from each other.



Interestingly, isometric depthmaps (offset) will mostly be blocks of 1's or blocks of 0's, allowing for large compression through RLE (run-length encoding).
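To show why long blocks of identical bits compress so well, here is a generic run-length encoder and decoder. It is a textbook sketch with an invented [bit, length] pair format, not the format used by the engine.

```javascript
// Collapse a bit sequence into [bit, runLength] pairs.
function rleEncode(bits) {
  const runs = [];
  for (const b of bits) {
    const last = runs[runs.length - 1];
    if (last && last[0] === b) last[1]++;
    else runs.push([b, 1]);
  }
  return runs;
}

// Expand [bit, runLength] pairs back into the original sequence.
function rleDecode(runs) {
  return runs.flatMap(([b, len]) => Array(len).fill(b));
}

const strip = Array(200).fill(1).concat(Array(56).fill(0));   // one 256-wide strip
console.log(rleEncode(strip)); // [ [ 1, 200 ], [ 0, 56 ] ]
```

A 256-bit strip made of two blocks shrinks to two pairs, and the saving grows with the size of the blocks.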

Will be writing a Python version because of newer MicroPython features like the inline assembler and the Viper code emitter.



RUN LENGTH ENCODED SMOOTHING





By combining popcount with RLE it may be possible to decompress a terrain offset with a compression ratio of 16000:1 at runtime.




LANDSCAPE SYNTHESIS




Synthesize a landscape (offset) with:

z = 2*sin(x-2y)+2*sin(4x+3y)+2*sin(3y-5x)+2*sin(7x+5y)+sin(5x+11y)+sin(10x+7y)

It has a period of 2 pi in both x and y.
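The formula translates directly into code. In the sketch below the function names and the choice of mapping one full period onto the grid width are assumptions. Because every coefficient of x and y is an integer, sampling exactly one period gives a heightmap that tiles without seams.

```javascript
// z = 2*sin(x-2y)+2*sin(4x+3y)+2*sin(3y-5x)+2*sin(7x+5y)+sin(5x+11y)+sin(10x+7y)
function terrain(x, y) {
  const s = Math.sin;
  return 2 * s(x - 2 * y) + 2 * s(4 * x + 3 * y) + 2 * s(3 * y - 5 * x) +
         2 * s(7 * x + 5 * y) + s(5 * x + 11 * y) + s(10 * x + 7 * y);
}

// Sample one 2*pi period across a size-by-size grid (mapping assumed).
function sampleGrid(size) {
  const step = (2 * Math.PI) / size;
  const grid = [];
  for (let j = 0; j < size; j++) {
    const row = new Float64Array(size);
    for (let i = 0; i < size; i++) row[i] = terrain(i * step, j * step);
    grid.push(row);
  }
  return grid;
}

const grid = sampleGrid(256);
console.log(grid.length, grid[0].length); // 256 256
```

The six amplitudes add up to 10, so the raw values stay within -10 and +10 before any scaling.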




Change 2 to 4 to get a mountain range. The sand is made by dividing the formula by double the divisor used for the mountain range.






Divide the formula by 64 to get the jungle, and divide the same formula by 128 to get the sand. Subtract 0.3 from the same formula and divide by 32 to get the mountains.
After the formula is completed it only takes a single shift to create a new terrain layer.





Mountain range with 4 layers.

Each layer has its own 3D texture from one line of code.





Simple shading effect.




ISOMETRIC MAPPING



Using JavaScript you can get isometric maps from Google Earth like so:




MODE1 : Isometric RLE offset.

MODE2 : Synthesized offset.

MODE3 : DCT IV offset.

MODE4 : Popcount Relative displacement offset.

MODE5 : Full bitmap offset.

MODE6 : MODE1 + MODE2

MODE7 : MODE1 + MODE3

MODE8 : MODE1 + MODE4

MODE9 : MODE1 + MODE5





3D BLITTER



The best thing about voxels is I can do a "3d" BitBLT. This allows me to perform some really cool effects, that are simple to implement and are on a massive scale. I can also simulate a 3d page flipping system by superimposing two worlds on top of each other. By changing a single bit I can perform a pseudo 3d flip instantaneously. 
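One way to picture the single-bit flip is sketched below. The storage layout is purely hypothetical (bit 0 of each cell for one world, bit 1 for the other) and all names are invented; the only point is that switching worlds touches one selector bit and no voxel data.

```javascript
// Hypothetical layout: bit 0 of each cell belongs to world A, bit 1 to world B.
const SIZE = 4;
const cells = new Uint8Array(SIZE * SIZE * SIZE);
let activeWorld = 0;                        // the single bit that picks the visible world

function index(x, y, z) { return x + SIZE * (y + SIZE * z); }

function setVoxel(world, x, y, z, solid) {
  const i = index(x, y, z);
  if (solid) cells[i] |= 1 << world;
  else cells[i] &= ~(1 << world);
}

function isSolid(x, y, z) {
  return ((cells[index(x, y, z)] >> activeWorld) & 1) === 1;
}

function flipWorld() { activeWorld ^= 1; }  // the whole "page flip"

setVoxel(0, 1, 2, 3, true);                 // solid only in world A
console.log(isSolid(1, 2, 3));              // true
flipWorld();
console.log(isSolid(1, 2, 3));              // false
```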



In 2004, when I was exploring uses for "world flipping", I envisioned a time travel effect similar to the Doom "melt" effect, replacing the current 3d world with a future 3d world. Just finished watching Silent Hill on DVD and realised I could easily replicate the school morphing into the coal mine with a simple Cellular Automaton. Only just scratching the surface of voxel power.


Because this has been a 30-year hobby, the cost of running 8 machines per year is $100.

Just finished one hour of testing batches on a Dimensity 7020. Running 5 threads at 100% used 8% battery charge with no drop in performance. The phone remained at room temperature for the whole hour. Extremely good results, no more testing required. The space leaper is a rough approximation and the actual space leaping system could yield another 50% performance boost.

Moving the iso-surface extractor out of the renderer for a non-multiply 3D rendering engine. This will reduce the number of variables in the renderer and reduce garbage collection. JavaScript's window.location.replace function has given me a theoretically infinite number of 3D engines; all I have to do is copy the file and rename it. Change the name of the image file and I have just created a new 3D world and all the 3D models. This is the best thing about JavaScript, I love it! Just finishing a "mo-cap" library with a $200 phone!
