V P U
PROCESSING
Just thinking about a hypothetical "VPU", or Voxel Processing Unit. This is actually possible with an FPGA. A popcount operation would also be possible. More research is needed, but keeping as many options open as possible makes it easier to adapt to future changes in technology.
For me the ultimate computer would be an 8-core Orange Pi RV2 with the AI garbage ripped out and replaced by an FPGA.
It would only require 16 instructions:
NOT - bitwise NOT of register.
OR - bitwise OR of register.
AND - bitwise AND of register.
ADD - register addition with register.
SUB - register subtraction with register.
MUL - register multiplied with register.
RS11 - right shift register 11 bits.
RS13 - right shift register 13 bits.
RS8 - right shift register 8 bits.
ZSUB - zero subtracted from register.
AND255 - register AND 255.
256SUB - 256 subtracted from register.
ADD256 - 256 added to register.
SUB256 - register subtracted 256.
JELZ - jump if equal or less than zero.
JEGZ - jump if equal or greater than zero.
A total of 23 variables are used in the calculations, so they could be loaded into a system of 32 general-purpose registers at the start, with the final pixel colour returned at the end.
Alternatively, a RISC-V approach can be taken. The first registers hold constants:
register0 = 0
register1 = 256
register2 = 255
register3 = 8
register4 = 11
register5 = 13
register6 - 31 = variables.
This way every instruction involves two registers, which makes the next-instruction logistics easier.
Now there are only 9 instructions:
NOT, AND, OR, ADD, SUB, MUL, RS, JELZ, JEGZ.
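As a rough sketch of how the 9-instruction version might behave, here is a tiny emulator in JavaScript. The [op, a, b] encoding, writing the result back to register a, the arithmetic right shift, and treating b as an immediate jump target are all assumptions made for illustration; none of them are fixed by the design above.

```javascript
// Hypothetical emulator sketch. Assumed encoding: [op, a, b], result goes to
// register a. For JELZ/JEGZ, b is assumed to be an instruction index.
function makeRegisters() {
  const r = new Int32Array(32);
  // Constant registers as listed in the text.
  r[0] = 0; r[1] = 256; r[2] = 255; r[3] = 8; r[4] = 11; r[5] = 13;
  return r; // r[6]..r[31] are free for variables
}

function run(program, r) {
  let pc = 0;
  while (pc < program.length) {
    const [op, a, b] = program[pc];
    pc += 1;
    switch (op) {
      case "NOT": r[a] = ~r[b]; break;
      case "AND": r[a] = r[a] & r[b]; break;
      case "OR": r[a] = r[a] | r[b]; break;
      case "ADD": r[a] = r[a] + r[b]; break;
      case "SUB": r[a] = r[a] - r[b]; break;
      case "MUL": r[a] = Math.imul(r[a], r[b]); break;
      case "RS": r[a] = r[a] >> r[b]; break; // arithmetic shift (assumed)
      case "JELZ": if (r[a] <= 0) pc = b; break;
      case "JEGZ": if (r[a] >= 0) pc = b; break;
      default: throw new Error("unknown op " + op);
    }
  }
  return r;
}

// The old RS8 followed by AND255 becomes two generic instructions
// that use the constant registers 3 (= 8) and 2 (= 255).
const regs = makeRegisters();
regs[6] = 0x12345;
run([["RS", 6, 3], ["AND", 6, 2]], regs);
console.log(regs[6]); // 35 (0x23)
```

With the constants held in registers, the dedicated RS8, RS11, RS13, AND255 and 256 variants collapse into the generic RS, AND, ADD and SUB.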
OPTIMIZATIONS FOR ESP32
3D VERSION OF OFFSET CHANGE MODE
DCT IV
POPCOUNT
Adding a POPCOUNT instruction will allow me to decompress terrain data (offset) at a ratio of 32:1 in real time! Big thanks to high school friend Daniel Dennett for that breakthrough.
Popcount ( n: 00110011001100110011001100110011 ) 15 with 32 bits:
return ( ( n - ( ( n >>> 1 ) & 0x55555555 ) ) * 0x11111111 ) >>> 28
Popcount ( n: 00110011001100110011001100110011 ) 16 with 32 bits:
n = (( n - ( ( n >>> 1 ) & 0x55555555 ) ) * 0x11111111) >>> 28
return n | ( ( !n ) << 4 )
Only 8 operations to count 16 bits. The unsigned '>>>' shift makes the popcount work on signed integers too! Blocks of zero are skipped before the popcount, so a result of 0 from the first step can only mean 16.
Popcount ( n: 11111111111111111111111111111111 ) 32 with 32 bits:
m = n - ( ( n >>> 1 ) & 0x55555555 )
n = ( ( ( m >>> 2 ) & 0x33333333 ) * 0x11111111 ) >>> 28
m = ( ( m & 0x33333333 ) * 0x11111111 ) >>> 28
return n + m + ( ( ( !n ) + ( !m ) ) << 4 )
The above map has no solid blocks, so a faster version can be used:
Popcount ( n: 11111111111111111111111111111111 ) 15 + 15 with 32 bits:
m = n - ( ( n >>> 1 ) & 0x55555555 )
n = ( ( ( m >>> 2 ) & 0x33333333 ) * 0x11111111 ) >>> 28
m = ( ( m & 0x33333333 ) * 0x11111111 ) >>> 28
return n + m
The +m is for positive (red) gradients, and -m is for negative (blue) gradients.
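The popcount variants above can be tried out with the following sketch. The function names are made up for illustration. Math.imul is used instead of a plain `*` so the multiply stays in 32-bit integer arithmetic, since an ordinary JavaScript multiply goes through a double and the product here can exceed 2^53. As far as I can tell, the 16-bit variants expect the set bits to sit inside the mask 0x33333333, and the 32-bit variant expects each of its two interleaved halves to contain at least one set bit, because a half-sum of 0 is read as 16.

```javascript
// Pairwise bit counts: every 2-bit field of the result holds 0, 1 or 2.
function pairCounts(n) {
  return n - ((n >>> 1) & 0x55555555);
}

// Sum of the eight nibbles of x, modulo 16 (top nibble of x * 0x11111111).
function nibbleSum(x) {
  return Math.imul(x, 0x11111111) >>> 28;
}

// "15 with 32 bits": bits inside 0x33333333, at most 15 of them set.
function popcount15(n) {
  return nibbleSum(pairCounts(n));
}

// "16 with 32 bits": as above, but a wrapped 0 is read as 16 (n must be non-zero).
function popcount16(n) {
  const c = popcount15(n);
  return c | ((c === 0 ? 1 : 0) << 4);
}

// "32 with 32 bits": any bits, each half-sum of 0 is read as 16.
function popcount32(n) {
  const m = pairCounts(n);
  const hi = nibbleSum((m >>> 2) & 0x33333333);
  const lo = nibbleSum(m & 0x33333333);
  return hi + lo + (((hi === 0 ? 1 : 0) + (lo === 0 ? 1 : 0)) << 4);
}

// "15 + 15 with 32 bits": faster, valid while each half-sum stays below 16.
function popcount30(n) {
  const m = pairCounts(n);
  const hi = nibbleSum((m >>> 2) & 0x33333333);
  const lo = nibbleSum(m & 0x33333333);
  return hi + lo;
}

console.log(popcount16(0x33333333)); // 16
console.log(popcount32(0xffffffff)); // 32
console.log(popcount30(0x12345678)); // 13
```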
For a 256x256 landscape (offset):
Each horizontal strip has a SEED with 4 relative displacements plus 2 popcount relative displacements.
SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |
SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |
SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |
SEED = | 8 bits. Height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |
0<------------------------------------- X --------------------------------->256
SEED > > p p r > > < < r > > < < r > > < < r
SEED + r - p - p
SEED > > < < r > > p p r > > < < r > > < < r
SEED + r + r - p - p
SEED > > < < r > > < < r > > < < r p p < < r
SEED + r + r + r + p + p
SEED p p < < r > > < < r > > < < r > > < < r
SEED + p + p
The SEED is decoded, and a popcount is decoded starting from the seed closest to the landscape x-coordinate, in either direction.
A 256x256 block of 65536 numbers can be created using 2048 numbers, which is the 32:1 ratio.
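To make the SEED layout concrete, here is a small sketch that packs and unpacks the 8 + 6 + 6 + 6 + 6 bit fields in one 32-bit word. The bit order (height in the top 8 bits, then the four r fields from left to right) and the function names are assumptions, and the r fields are returned as raw unsigned values because the text does not say how they are signed.

```javascript
// Assumed layout, most significant bits first:
// | 8 bits height | 6 bits r | 6 bits r | 6 bits r | 6 bits r |
function packSeed(height, r) {
  return (
    ((height & 0xff) << 24) |
    ((r[0] & 0x3f) << 18) |
    ((r[1] & 0x3f) << 12) |
    ((r[2] & 0x3f) << 6) |
    (r[3] & 0x3f)
  ) >>> 0; // keep the word unsigned
}

function unpackSeed(seed) {
  return {
    height: seed >>> 24,
    r: [
      (seed >>> 18) & 0x3f,
      (seed >>> 12) & 0x3f,
      (seed >>> 6) & 0x3f,
      seed & 0x3f,
    ],
  };
}

// Compression ratio quoted in the text: 65536 heights from 2048 numbers.
const ratio = (256 * 256) / 2048;
console.log(ratio); // 32
```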
LANDSCAPE (OFFSET) RELATIVE DISPLACEMENT
IMAGE EDITOR VERSION
Move the heightmap right one pixel. Subtract this heightmap from the original to give this:
Red pixel = add +1
Blue pixel = add -1
If travelling in the opposite direction:
Black pixel = add 0
Red pixel = add -1
Blue pixel = add +1
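The image editor steps map onto a simple difference along each row. The sketch below (function names are hypothetical) subtracts the row shifted right by one pixel from the original, then rebuilds the row by walking in either direction. Walking left flips the sign of each step, which is the red and blue swap listed above.

```javascript
// d[x] = row[x] - row[x - 1]: original minus the copy shifted right by one.
// d[0] is unused and left at 0.
function rowDeltas(row) {
  const d = new Array(row.length).fill(0);
  for (let x = 1; x < row.length; x++) d[x] = row[x] - row[x - 1];
  return d;
}

// Travel left to right from the height at x = 0, adding each delta.
function walkRight(start, d) {
  const row = [start];
  for (let x = 1; x < d.length; x++) row.push(row[x - 1] + d[x]);
  return row;
}

// Travel right to left from the height at the last x, subtracting each delta.
function walkLeft(end, d) {
  const row = new Array(d.length);
  row[d.length - 1] = end;
  for (let x = d.length - 1; x > 0; x--) row[x - 1] = row[x] - d[x];
  return row;
}

const heights = [5, 6, 6, 5, 4];
const d = rowDeltas(heights);
console.log(d); // [0, 1, 0, -1, -1]
```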
JAVASCRIPT VERSION
Interestingly, isometric depthmaps (offset) will mostly be blocks of 1's or blocks of 0's, allowing for large compression through RLE (run-length encoding).
I will be writing a Python version to take advantage of new technologies like inline assembly and VIPER.
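A minimal run-length encoder for such a bit row could look like this. The output format (the first bit plus a list of run lengths) is an assumption chosen for illustration, not necessarily the format used in the engine.

```javascript
// Encode an array of 0/1 values as the first bit plus alternating run lengths.
function rleEncode(bits) {
  const runs = [];
  if (bits.length === 0) return { first: 0, runs };
  let count = 1;
  for (let i = 1; i < bits.length; i++) {
    if (bits[i] === bits[i - 1]) {
      count++;
    } else {
      runs.push(count);
      count = 1;
    }
  }
  runs.push(count);
  return { first: bits[0], runs };
}

// Rebuild the bit array: each run flips the value.
function rleDecode({ first, runs }) {
  const bits = [];
  let value = first;
  for (const len of runs) {
    for (let i = 0; i < len; i++) bits.push(value);
    value ^= 1;
  }
  return bits;
}

console.log(rleEncode([1, 1, 1, 0, 0, 1])); // { first: 1, runs: [3, 2, 1] }
```

Long blocks of identical bits turn into a single run length each, which is where the large compression comes from.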
RUN LENGTH ENCODED SMOOTHING
LANDSCAPE SYNTHESIS
Synthesize a landscape (offset) with:
z = 2*sin(x-2y)+2*sin(4x+3y)+2*sin(3y-5x)+2*sin(7x+5y)+sin(5x+11y)+sin(10x+7y)
It has a period of 2 pi in both x and y.
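The formula translates directly into JavaScript. In the sketch below, the sampling of one full period onto a square grid and the mapping of z (which lies within -10 to 10, since the amplitudes add up to 10) onto 0..255 are assumptions for illustration.

```javascript
// Height function from the text.
function z(x, y) {
  return (
    2 * Math.sin(x - 2 * y) +
    2 * Math.sin(4 * x + 3 * y) +
    2 * Math.sin(3 * y - 5 * x) +
    2 * Math.sin(7 * x + 5 * y) +
    Math.sin(5 * x + 11 * y) +
    Math.sin(10 * x + 7 * y)
  );
}

// Sample one period (0 to 2 pi) on a size x size grid, scaled to 0..255.
function synthesize(size) {
  const map = new Uint8Array(size * size);
  const step = (2 * Math.PI) / size;
  for (let j = 0; j < size; j++) {
    for (let i = 0; i < size; i++) {
      const h = z(i * step, j * step);
      map[i + j * size] = Math.round(((h + 10) / 20) * 255);
    }
  }
  return map;
}

const landscape = synthesize(256); // 65536 heights that tile seamlessly
```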
Mountain range with 4 layers.
Each layer has its own 3D texture from one line of code.
Simple shading effect.
ISOMETRIC MAPPING
Using JavaScript you can get isometric maps from Google Earth like so:
MODE1 : Isometric RLE offset.
MODE2 : Synthesized offset.
MODE3 : DCT IV offset.
MODE4 : Popcount Relative displacement offset.
MODE5 : Full bitmap offset.
MODE6 : MODE1 + MODE2
MODE7 : MODE1 + MODE3
MODE8 : MODE1 + MODE4
MODE9 : MODE1 + MODE5
3D BLITTER
The best thing about voxels is that I can do a "3D" BitBLT. This allows me to perform some really cool effects that are simple to implement and work on a massive scale. I can also simulate a 3D page-flipping system by superimposing two worlds on top of each other. By changing a single bit I can perform a pseudo-3D flip instantaneously.
In 2004, when I was exploring uses for "world flipping", I envisioned a time travel effect similar to the Doom "melt" effect, replacing the current 3D world with a future 3D world. I just finished watching Silent Hill on DVD and realised I could easily replicate the school morphing into the coal mine with a simple cellular automaton. I am only just scratching the surface of voxel power.
Because this has been a 30-year hobby, the cost of running 8 machines per year is $100.
Just finished one hour of testing batches on a Dimensity 7020. Running 5 threads at 100% used 8% of the battery charge with no drop in performance. The phone remained at room temperature for the whole hour. Extremely good results, no more testing required. The space leaper is a rough approximation, and the actual space leaping system could yield another 50% performance boost.
I am moving the iso-surface extractor out of the renderer for a non-multiply 3D rendering engine. This will reduce the number of variables in the renderer and reduce garbage collection. JavaScript's window.location.replace function has given me a theoretically infinite number of 3D engines: all I have to do is copy the file and rename it. Change the name of the image file and I have just created a new 3D world and all the 3D models. This is the best thing about JavaScript, I love it! Just finishing the "mo-cap" library with a $200 phone!
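A rough idea of what a 3D BitBLT and a one-bit world flip could look like is sketched below. The flat Uint8Array layout, the function names, the raster-op callback and the use of bit 0 and bit 1 of each voxel for the two superimposed worlds are all assumptions, not the engine's actual data format.

```javascript
// Cubic voxel volume stored flat: index = x + size * (y + size * z).
function makeWorld(size) {
  return { size, data: new Uint8Array(size * size * size) };
}

function voxelIndex(world, x, y, z) {
  return x + world.size * (y + world.size * z);
}

// Copy a w x h x d box from src to dst, combining voxels with a raster op.
// The caller must keep the box inside both volumes (no bounds checks here).
function blit3d(src, sx, sy, sz, dst, dx, dy, dz, w, h, d, op = (s, t) => s) {
  for (let z = 0; z < d; z++) {
    for (let y = 0; y < h; y++) {
      for (let x = 0; x < w; x++) {
        const si = voxelIndex(src, sx + x, sy + y, sz + z);
        const di = voxelIndex(dst, dx + x, dy + y, dz + z);
        dst.data[di] = op(src.data[si], dst.data[di]);
      }
    }
  }
}

// Two worlds superimposed: bit 0 is world A, bit 1 is world B.
// Flipping worlds only means changing `page` between 0 and 1.
function isSolid(world, x, y, z, page) {
  return (world.data[voxelIndex(world, x, y, z)] >> page) & 1;
}

const a = makeWorld(4);
const b = makeWorld(4);
a.data.fill(1);
blit3d(a, 0, 0, 0, b, 1, 1, 1, 2, 2, 2); // plain copy into world A of b
blit3d(a, 0, 0, 0, b, 0, 0, 0, 1, 1, 1, (s, t) => t | (s << 1)); // into world B
```

The renderer would call isSolid with the current page, so the flip costs nothing beyond toggling that one bit.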




























