Created by Fabien Cellier
I don't care about any tech in particular; all I want is a solution (a good one). If someone has a better idea (with arguments) or wants to discuss, I am always available :-).
Because today we don't solve all of these issues
Change asm.js for parallelism? (or another subset?)
Implementation details?
GPGPU is never far away: a transcompiler from asm.js to OpenCL C
Studies on having the same directly in IonMonkey
Kinds of parallelism: data and tasks
We need parallelism for two purposes:
avoid freezing web pages during long computations
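To make the data/tasks distinction concrete, here is a minimal sketch in plain JavaScript (run sequentially here; the function names are ours, not part of any proposal): data parallelism applies the same kernel to every element, while task parallelism runs independent functions.

```javascript
// Data parallelism: one kernel, many elements; each index plays the
// role of one "thread" id, like an OpenCL work-item.
function dataParallel(kernel, array) {
  return array.map(function (x, id) { return kernel(x, id); });
}

// Task parallelism: independent units of work, each its own function.
function taskParallel(tasks) {
  return tasks.map(function (t) { return t(); });
}
```

A real ComputeWorker would distribute these across threads instead of running them in a loop.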
We have some requirements:
* Interaction with WebGL
Need a way to automatically pick the right number of "threads"
(depends on the device, the number of cores... -> should be handled by the JavaScript VM, not the user)
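As a sketch of what the VM would do under the hood: browsers already expose the core count via `navigator.hardwareConcurrency`; the helper name, fallback value, and clamping policy below are illustrative assumptions of ours, not part of any spec.

```javascript
// Hypothetical helper: clamp a requested task count to the device's
// reported core count. navigator.hardwareConcurrency is a real browser
// API; everything else here is an illustrative assumption.
function pickWorkerCount(nav, requested) {
  var cores = (nav && nav.hardwareConcurrency) || 2; // conservative fallback
  return Math.max(1, Math.min(cores, requested));
}
```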
Big investment:
the API is good (if a little complex), but the cost is high
80% of the features of WebCL could be done with ComputeWorkers, which are much easier to use
The transcompiler is hard to implement
-> WebCL could compile to a language which is safe, but the compiler itself must also be safe!
Relies on the OpenCL driver
-> with ComputeWorker we already have 80% of the fallback
ComputeWorker seems to be the best first step (the risk is limited, the investment too, and it will provide good security groundwork for WebCL)
will allow Emscripten to use more kinds of threads
Why ComputeWorker?
API similar to WebWorkers
Code uses asm.js with some limitations
Does not depend on a single tech:
could be implemented with OpenCL, DirectCompute, CUDA (, a JIT?)
var pw = new ComputeWorker("source.js");
pw.post(data,      /* typed array or scalar;
                      typed array with a mutex if it is shared memory (new type) */
        n,         /* number of tasks which could be launched in parallel */
        ownership);
pw.onmessage = function (oEvent) {
  oEvent.data; /* typed array or scalar */
};
// interaction with WebGL
var pw = new ComputeWorker("source.js", webGLContext);
pw.post(webGLMemoryObject, n);
pw.onmessage = function (oEvent) {
  oEvent.data; /* typed array or scalar */
};
// only the asm.js API plus postMessage/onmessage are available in the worker
"use pasm"; // parallel asm.js
"use GPU";  // hint to use GPGPU if possible

var priv = new Int32Array(16); // private memory, accessible to one "thread" only;
                               // could be seen as the heap/stack
// no copy needed: the accessibility is already known
var sharedMem = new BufferArray(16 | 0); // global memory, shared between threads
                                         // (transactional memory?)
function test(array, scalar) {
  scalar = scalar | 0;           // hint for the type
  // BufferArrays from the main page live in "global memory"
  // and are shared between "threads"
  array = new Int16Array(array); // hint for the type
  // id is the thread id, between 0 and n-1, as in OpenCL
  array[id] += id;
  return array; // no postMessage needed
}
onmessage = test;
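Since ComputeWorker is only a proposal, the intended semantics of the kernel above can be shown with a sequential emulation (ours): run the body once per "thread" id in [0, n), the way an OpenCL-style dispatch would.

```javascript
// Sequential emulation of the kernel body: each iteration of the loop
// stands in for one "thread" executing array[id] += id.
function dispatch(n, array) {
  for (var id = 0; id < n; id++) {
    array[id] += id;
  }
  return array;
}
```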
No FFI
it is hard to interact with native binaries through GPGPU drivers
No DataView on the heap
not a problem, because the heap is IN the worker and can be sent to the main page if needed
That's why we are in a worker!
No function pointers, no nested definitions
possible
No recursion?
but this allows more optimisation during compilation and removes heap checks at runtime
Note that we can emulate recursion and function pointers in OpenCL and DirectCompute
with a combination of a "frame pointer", break/continue, and switch to emulate addresses, but the heap has a static length
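A sketch (ours) of that emulation technique in JavaScript: recursion is replaced by an explicit frame array of static length plus a loop, the way one would do it in OpenCL C, which forbids recursive calls.

```javascript
// Factorial without real recursion: frames is the static-length "heap",
// fp the emulated frame pointer; pushing a frame emulates a call.
function fact(n) {
  var frames = new Int32Array(32); // heap with a static length
  var fp = 0;                      // emulated frame pointer
  var acc = 1;
  frames[fp++] = n;                // emulated initial call
  while (fp > 0) {
    var arg = frames[--fp];        // pop the current frame
    if (arg > 1) {
      acc = (acc * arg) | 0;
      frames[fp++] = arg - 1;      // emulated recursive call
    }
  }
  return acc;
}
```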
By default, all variables are "thread-local"
Subset of asm.js for parallelism?
Private memory should be as small as possible:
global memory is used when private memory is too large (and global memory is much slower)