Target architecture

From Assothink Wiki

Revision as of 27 October 2012, 11:55

Summary 

The hardware suited to run Assothink adequately is very simple but massively parallel: millions of basic active components, highly interconnected.

This kind of hardware architecture is not currently available.

The IPSE architecture is suited to this purpose.

It is described below, and may be considered as a basic set of specifications.   

These specifications might be proposed to a company producing large-scale integrated circuit (IC) products. It would be an expensive project, but it would be the most accomplished and best-performing version of Assothink.

Context and purpose 

This is part of the Assothink project.

... emergence...

... perform tasks that conventional CPU and computers do not handle efficiently...

... associative computing...

... mimic human brain...

Integration

The target architecture describes a board to be added to conventional computers, working with them as an "integrated programmable synaptic engine" (IPSE).

It could also be the heart of a standalone new kind of computer (with specific devices, operating system,...), but this ambitious option is not considered here. 

Components

The IPSE board mainly contains two components:

  • a synaptic shared memory (SM)
  • a set of synaptic micro-engines (working in parallel)

Each synaptic micro-engine includes:

  • a synaptic micro-engine processor
  • a local memory (LM) 

Interaction with hosting computer

The hosting computer controls the IPSE through

  • a set of C functions
  • a Java package offering the same functions

The functions are:

  • read the local memory of a synaptic micro-engine
  • write the local memory of a synaptic micro-engine
  • run 1 processing cycle of all micro-engines
  • start continuous cycling of all micro-engines
  • stop continuous cycling of all micro-engines

Other functions may be available as nice-to-haves; they are not directly needed.

Component description

Synaptic Shared Memory (SM)

The synaptic shared memory is a set of addressable 32-bit memory registers (32-bit integers).

It is a kind of RAM.

The number of registers is Nsm. Nsm is always smaller than 2^32, so a 32-bit integer is sufficient to identify any of the registers.

Each register is individually readable/writable by all micro-engines.

Local Memory (LM)

There is a local memory for each synaptic micro-engine. 

The local memory is a set of addressable 32-bit memory registers (32-bit integers).

It is a kind of RAM.

The number of registers is Nlm. Nlm is always (much) smaller than 2^32, so a 32-bit integer is more than sufficient to identify any of the local memory registers.

Each register is readable/writable by the micro-engine itself, and when the IPSE is not active, it is accessible (read/write) by the hosting machine.

Micro-engine

The number of micro-engines is Ne. Ne is always (much) smaller than 2^32, so a 32-bit integer is more than sufficient to identify any of the micro-engines.

Each micro-engine is able to read and write in its own LM, and in the SM. 

Working cycle of the micro-engine processor

The micro-engines operate in cycles.

They work in parallel.

They work on the SM, on their individual LM, and process some simple calculations (add, subtract, multiply). 

Shared memory organisation

The shared memory registers are written SM[i]. The index value ranges from 0 (included) to Nsm (excluded).  

Local memory organisation

Assuming Nlm integers per LM, let us define K = Nlm/5.

Each LM contains:

  • K input addresses, written IN[i]
  • K output addresses, written OUT[i]
  • K output values, written SIG[i]
  • K permeability values, written PER[i]
  • 1 excitation value, EXC
  • 1 signal factor, FAC
  • 1 input sizing value, INSIZE (INSIZE is smaller than K)  
  • 1 output sizing value, OUTSIZE (OUTSIZE is smaller than K)  
  • possibly other, less important values
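One possible way to map this content onto the Nlm registers of a local memory is sketched below. The ordering of the blocks and the placement of the scalar values are assumptions made for illustration (the text only fixes K = Nlm/5), and the names LmLayout and lm_layout are hypothetical. The four arrays occupy the first 4K registers, which leaves at least K registers for EXC, FAC, INSIZE, OUTSIZE and any other values.

```c
#include <assert.h>
#include <stdint.h>

/* Register offsets inside one local memory (assumed ordering). */
typedef struct {
    uint32_t k;        /* K = Nlm / 5 (integer division) */
    uint32_t in;       /* first register of IN[]  */
    uint32_t out;      /* first register of OUT[] */
    uint32_t sig;      /* first register of SIG[] */
    uint32_t per;      /* first register of PER[] */
    uint32_t exc;      /* register holding EXC     */
    uint32_t fac;      /* register holding FAC     */
    uint32_t insize;   /* register holding INSIZE  */
    uint32_t outsize;  /* register holding OUTSIZE */
} LmLayout;

/* Compute the offsets for a local memory of nlm registers. */
static LmLayout lm_layout(uint32_t nlm)
{
    LmLayout l;
    assert(nlm >= 20u); /* K >= 4, so the four scalars fit after the arrays */
    l.k       = nlm / 5u;
    l.in      = 0u;
    l.out     = l.k;
    l.sig     = 2u * l.k;
    l.per     = 3u * l.k;
    l.exc     = 4u * l.k;
    l.fac     = l.exc + 1u;
    l.insize  = l.exc + 2u;
    l.outsize = l.exc + 3u;
    return l;
}
```

With Nlm = 4096, for instance, this gives K = 819, and the scalar values start at register 3276.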

Steps  

Step 1 - Values are transferred and summed from SM to EXC in LM

EXC += SM[IN[i]]        (for 0<=i<INSIZE) 

Step 2 - computation of output values

SIG[i] = (EXC*PER[i])/FAC     (for 0<=i<OUTSIZE) 

Step 3 - Decrease of local excitation

EXC -= SIG[i]        (for 0<=i<OUTSIZE) 

Step 4 - Values are transferred from LM to SM

SM[OUT[i]] = SIG[i]          (for 0<=i<OUTSIZE) 
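The four steps above can be sketched in C for a single micro-engine as follows. This is an illustrative sketch, not the actual emulator code: the struct and function names, the small value of K, the use of signed 32-bit values and the 64-bit intermediate product in step 2 are all assumptions. Each step is completed for all indices before the next one starts, as in the description.

```c
#include <assert.h>
#include <stdint.h>

#define K 4 /* illustrative; the text defines K = Nlm/5 */

/* One local memory, following the content list of the LM (assumed types). */
typedef struct {
    uint32_t in[K];   /* SM addresses read in step 1    */
    uint32_t out[K];  /* SM addresses written in step 4 */
    int32_t  sig[K];  /* output values                  */
    int32_t  per[K];  /* permeability values            */
    int32_t  exc;     /* excitation value               */
    int32_t  fac;     /* signal factor, assumed nonzero */
    uint32_t insize;  /* number of inputs used,  <= K   */
    uint32_t outsize; /* number of outputs used, <= K   */
} LocalMemory;

/* Run one processing cycle of one micro-engine on shared memory sm. */
static void engine_cycle(LocalMemory *lm, int32_t *sm)
{
    uint32_t i;

    assert(lm->fac != 0);
    assert(lm->insize <= K && lm->outsize <= K);

    /* Step 1: sum the input values from SM into EXC. */
    for (i = 0; i < lm->insize; i++)
        lm->exc += sm[lm->in[i]];

    /* Step 2: compute the output values (64-bit product avoids overflow). */
    for (i = 0; i < lm->outsize; i++)
        lm->sig[i] = (int32_t)(((int64_t)lm->exc * lm->per[i]) / lm->fac);

    /* Step 3: decrease the local excitation by the emitted signals. */
    for (i = 0; i < lm->outsize; i++)
        lm->exc -= lm->sig[i];

    /* Step 4: transfer the output values from LM to SM. */
    for (i = 0; i < lm->outsize; i++)
        sm[lm->out[i]] = lm->sig[i];
}
```

For example, with inputs SM[0] = 5 and SM[1] = 7, EXC initially 0, PER = {1, 2} and FAC = 4, step 1 gives EXC = 12, step 2 gives SIG = {3, 6}, and step 3 leaves EXC = 3.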

Non-overlapping contract

A given SM address may not be present in the IN[] set of more than 1 of the LMs. 

A given SM address may not be present in the OUT[] set of more than 1 of the LMs. 

In other words, any shared memory address is writable by at most one micro-engine, and readable by at most one micro-engine.

As a result, the read and write operations of all micro-engines may operate simultaneously without conflict.
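A host program could verify this contract before starting the engines. The sketch below is one possible check, under assumptions: the function name, the flat array layout (engine e stores its addresses at addr[e*k .. e*k + size[e] - 1]) and the return codes are all hypothetical. It would be called once for the IN[] sets and once for the OUT[] sets. An address repeated within the same LM is accepted, since the contract only forbids sharing between different LMs.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NO_OWNER UINT32_MAX /* never a valid engine index, since Ne < 2^32 */

/* Returns 1 if no SM address appears in the lists of two different engines,
   0 if the contract is violated, -1 if the work array cannot be allocated. */
static int lists_are_disjoint(const uint32_t *addr, const uint32_t *size,
                              uint32_t ne, uint32_t k, uint32_t nsm)
{
    uint32_t *owner = malloc((size_t)nsm * sizeof *owner);
    uint32_t e, i;
    int ok = 1;

    if (owner == NULL)
        return -1;
    for (i = 0; i < nsm; i++)
        owner[i] = NO_OWNER;

    for (e = 0; e < ne && ok; e++) {
        assert(size[e] <= k);
        for (i = 0; i < size[e]; i++) {
            uint32_t a = addr[(size_t)e * k + i];
            assert(a < nsm);
            if (owner[a] != NO_OWNER && owner[a] != e) {
                ok = 0; /* address already claimed by another engine */
                break;
            }
            owner[a] = e;
        }
    }
    free(owner);
    return ok;
}
```

The check visits each listed address once and uses one work register per SM address, so it is linear in the total size of the lists plus Nsm.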

Sizing and performance figures 

Here are the targeted numbers:

  • 10^6 cycles per second
  • Nsm = 2^24 (~16 000 000) 

This implies 2^24 x 32 bits, thus 2^29 bits, thus 2^26 bytes, thus the equivalent of 64 MB of RAM. 

  • Nlm = 2^12 (~4 000) 

This implies for each local memory 2^12 x 16 bits, thus 2^16 bits, thus 2^13 bytes, thus the equivalent of 8 KB of RAM.

  • Ne = 2^18 (~256 000) 

This implies for the sum of all local memories 2^30 x 16 bits, thus 2^34 bits, thus 2^31 bytes, thus the equivalent of 2 GB of RAM. 

Higher figures are welcome, but these are reasonably ambitious figures for a first target (quite small compared to figures in biological systems, such as mammal brains :-)). 
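The memory sizes quoted in this section follow from a single multiplication, which the small helper below reproduces. The function name is hypothetical and the helper is only a worked check of the arithmetic. Note that the local memory figures above count 16 bits per register; with the 32-bit registers described in the Local Memory section, the same helper gives twice those sizes (16 KB per local memory, 4 GB in total).

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a memory made of `registers` registers
   of `bits_per_register` bits each (a multiple of 8). */
static uint64_t memory_bytes(uint64_t registers, unsigned bits_per_register)
{
    assert(bits_per_register % 8u == 0u);
    return registers * (bits_per_register / 8u);
}
```

For the shared memory, memory_bytes(2^24, 32) yields 2^26 bytes (64 MB). For one local memory, memory_bytes(2^12, 16) yields 2^13 bytes (8 KB). For all 2^18 local memories together, memory_bytes(2^30, 16) yields 2^31 bytes (2 GB).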

Emulation of the IPSE with classical programming

PG has written two simple programs.

The first program emulates the IPSE board in Java (running on the JVM). 

The second program emulates the IPSE board in ANSI C.

The figures handled by the emulators (to be compared with the target figures) are:

  • 10^3 cycles per second
  • Nsm = 2^19
  • Nlm = 2^5 (average value permitted by dynamic allocation)
  • Ne = 2^16 (working sequentially, not in parallel)