===============================================================================
SPECIFICATION OF MULTI-RATE COMPILATIONS RULES
===============================================================================

Types definition
T = int | float | [n]T

N : symbolic name

W : vector
W : [n]T N  (or W : T n N)

Id : access indexes
Id : int | N | Binop(i1, i2)      (with Binop : +,-,*,/)

E : signal expression
E = InX | P(E1,..) | Num(x) | Vectorize(n,E) | Serialize(E) | E1#E2 | E1[E2] | DL(E1)@E2 | Proj(n, g = (E0,E1,...)) | Decimate(n,o,E)

A : address
A : W | Cast(A, T) | A[I]

Cast(A, T) : cast the vector with the type T
A[I] : get the element of A at index I

V : compilation value
V = int | float | Load(A) | P(V1,..)

S : compilation statement
S : Dec(W) | Store(A, V) | { S* } | Loop(i, n) { S* } | SubLoop(i, n) { S* } | If(e) { S* }

An address A has a type given with typeAddr : A ==> T

typeAddr : A ==> T

typeAddr(W = T n N)     = [n]T
typeAddr(Cast(A, T))    = T
typeAddr(A[I])          = typeIndexAddr(A)

typeIndexAddr : T ==> T

typeIndexAddr int   = error
typeIndexAddr float = error
typeIndexAddr [n]T  = T

A compilation value V has a type given with typeVal : V ==> T

typeVal : V ==> T

typeVal(int)        = int
typeVal(float)      = float
typeVal(Load(A))    = typeAddr(A)
typeVal(P(V1,..))   = combType(typeVal(V1),typeVal(V2),...)

combType2 : T1 x T2 ==> T

combType2 int float     =  float
combType2 float int     =  float

combType2 [n]T float    =  [n]T
combType2 [n]T int      =  [n]T

combType2 int [n]T      =  [n]T
combType2 float [n]T    =  [n]T

combType2 [n]T T        =  [n]T
combType2 T [n]T        =  [n]T

combType2 [n]T [n]T     =  [n]T

combType2 [n]T [m]T     =  error

In each compilation rule:

- the "+" symbol means "adding a statement in the current loop"
- the ">" symbol means "returning a value"

===============================================================================
COMPLEX VERSION WITH DESTINATION AND INDEX
===============================================================================

CTOP[E0, E1...]     = compile each output signal
CV[O, E]            = compile Vector, fill vector O with count*rate samples of signal E, no value
CA[O, E, I]   	    = compile Assignment, store in O at index Os , the value of signal E at index I, no value
CS[E, I]            = compile Sample, compute the value of signal E at index I, returns a value
CD[DL(E)]           = create a delay line, write signal, returns the delay line

VS = vector size
count = implicit parameter indicating how many samples at rate 1 we will compute
count <= VS

-------------------------------------------------------------------------------
CTOP[E0, E1, E2, ...] = compile each output signal
-------------------------------------------------------------------------------
   	ri = rate(Ei)
   	Ti = int | float = type(Ei)
	[ri*VS]Ti = type(OUTi)

+	CV[OUT0, E0]
+	CV[OUT1, E1]
+	....

-------------------------------------------------------------------------------
CV[O, E] = compile vector, fill vector W with count*rate samples of signal E
-------------------------------------------------------------------------------
   	r = rate(E)
   	T = type(E)
	[r*VS]T = type(O)
	i = fresh index

+	Loop(i, r*count) {
+		CA[O[i], E, i]
+	}


===============================================================================
Num(x) : number (int or float) of value 'x'
===============================================================================

-------------------------------------------------------------------------------
CA[O, Num(x), I] : A x E x Id ==> void
-------------------------------------------------------------------------------
+	Store(O, CS[Num(x), I])

-------------------------------------------------------------------------------
CS[Num(x), I] : E x Id ==> V
-------------------------------------------------------------------------------
>	x (of type int | float)


===============================================================================
InX : input x
===============================================================================

-------------------------------------------------------------------------------
CA[O, InX, I] : A x E x Id ==> void
-------------------------------------------------------------------------------
+	Store(O, CS[InX, I])

-------------------------------------------------------------------------------
CS[InX, I]: E x Id ==> V
-------------------------------------------------------------------------------
>   Load(InX[I]) (of type float)


===============================================================================
P(E1,E2,..) : primitive operation on signals
===============================================================================
	r = rate(P(E1,E2,..))
	T = type(P(E1,E2,..))

-------------------------------------------------------------------------------
CA[O, P(E1,E2,..), I] : A x E x Id ==> void
-------------------------------------------------------------------------------
+	Store(O, CS[P(E1,E2,..), I])

-------------------------------------------------------------------------------
CS[P(E1,..), I] : E x Id ==> V
-------------------------------------------------------------------------------
	j is a fresh loop index

if not shared (P(E1,E2,..)) : we can inline the expression

>	P(CS[E1,I],CS[E2,I],...)

if shared (P(E1,E2,..)) : we need an intermediate buffer W to store
+ 	T 	W[r*VS]
+	Loop(j, r*count) {
+		Store(W[j], P(CS[E1,j],CS[E2,j],...))
+	}

>	Load(W[I])  (of type T)


===============================================================================
Vectorize(n,E) : create a signal of vectors of size n and rate r/n
===============================================================================
   n*m = rate(E)
   	 T = type(E)

 	 m = rate(Vectorize(n,E))
  [n]T = type(Vectorize(n,E))

-------------------------------------------------------------------------------
CA[O, Vectorize(n,E), I] : A x E x Id ==> void
-------------------------------------------------------------------------------
	k is a fresh loop index

if not shared (Vectorize(n,E)) :

+	SubLoop(k, n) {
+		CA[O[k], E, I*n+k]
+ 	}

if shared (Vectorize(n,E)) :

+	Store(O, CS[Vectorize(n,E),I])


-------------------------------------------------------------------------------
CS[Vectorize(n,E),I] : E x Id ==> V
-------------------------------------------------------------------------------
	j and k are fresh loop indexes

+	[n]T W[m*VS]
+	Loop(j, m*VS) {
+		SubLoop(k, n) {
+			CA[W[j][k], E, j*n+k]
+		}
+	}

>	Load(W[I]) (of type [n]T)

===============================================================================
Serialize(E) : Serialize a signal of vectors
===============================================================================
     m = rate(E)
  [n]T = type(E)

   n*m = rate(Serialize(E))
     T = type(Serialize(E))

-------------------------------------------------------------------------------
CA[O, Serialize(E), I] : A x E x Id ==> void
-------------------------------------------------------------------------------

if not shared (Serialize(E)) :

+   If (I%n == 0) {
+       CA[Cast(O,[n]T), E, I/n]  (here we "divide" I by n...)
+   }

if shared (Serialize(E)) :

+   Store(O, CS[Serialize(E),I])

-------------------------------------------------------------------------------
CS[Serialize(E), I] : E x Id ==> V
-------------------------------------------------------------------------------
    j and k are fresh loop indexes

+	T W[n*m*VS]
+	Loop(j, n*m*VS) {
+       If (j%n == 0) {
+           CA[Cast(W[j],[n]T), E, j/n]  (here we "divide" j by n..)
+       }
+	}

> Load(W[I]) (of type T)


===============================================================================
E1#E2 : concatenate two signals of vectors
===============================================================================
       r = rate(E1#E2) = rate(E1) = rate(E2)
    [n]T = type(E1)
    [m]T = type(E2)
  [n+m]T = type(E1#E2)

-------------------------------------------------------------------------------
CA[O, E1#E2, I] : A x E x Id ==> void
-------------------------------------------------------------------------------

if not shared (E1#E2) :

+	CA[Cast(O[0],[n]T), E1, I]
+	CA[Cast(O[n],[m]T), E2, I]  (here we "move" O by n...)

if shared (E1#E2) :

+	Store(O, CS[E1#E2, I])

-------------------------------------------------------------------------------
CS[E1#E2, I] : E x Id ==> V
-------------------------------------------------------------------------------
	j and k are fresh loop indexes

+	[n+m]T W[r*VS]
+	Loop(j, r*VS) {
+		CA[Cast(W[j][0],[n]T), E1, j]
+		CA[Cast(W[j][n],[m]T), E2, j]
+   }

>   Load(W[I])  (of type [n+m]T)


===============================================================================
E1[E2] : take value of a signal at a given index
===============================================================================
       r = rate(E1[E2]) = rate(E1) = rate(E2)
    [n]T = type(E1)
     int = type(E2) in (0, n-1)
       T = type(E1[E2])

-------------------------------------------------------------------------------
CA[O, E1[E2], I] : A x E x Id ==> void
-------------------------------------------------------------------------------

+   Store(O, CS[E1[E2], I])

-------------------------------------------------------------------------------
CS[E1[E2], I] : E x Id ==> V
-------------------------------------------------------------------------------

>   CS[E1,I][CS[E2,I]]  (of type T)


===============================================================================
Proj(n, g = (E0, E1, ...)) : projection of a recursive group
===============================================================================
       r = rate(E0) = rate(E1) = ...
      T0 = type(E0), T1 = type(E1), ...
      m0 = maxdelay(E0), m1 = maxdelay(E1), ...

-------------------------------------------------------------------------------
CA[O, Proj(n, g = (E0, E1, ...)), I] : A x E x Id ==> void
-------------------------------------------------------------------------------

+   Store(O, CS[Proj(n, g = (E0, E1, ...)), I])

-------------------------------------------------------------------------------
CS[Proj(n, g = (E0, E1, ...)), I] : E x Id ==> V
-------------------------------------------------------------------------------
       r = rate(E0) = rate(E1) = ...
      T0 = type(E0), T1 = type(E1), ...
      m0 = maxdelay(E0), m1 = maxdelay(E1), ...

+	T0 W0[r*VS + m0]
+	T1 W1[r*VS + m1]
+   ....
+	Loop(j, r*VS) {
+       CA[W0[j], E0, j]
+       CA[W1[j], E1, j]
+       ....
+	}

>   Load(Wn[I]) (of type Tn)


===============================================================================
DL(E1)@E2 : read in a delay line
===============================================================================
       r = rate(DL(E1)@E2) = rate(DL(E1)) = rate(E1) = rate(E2)
       T = type(DL(E1)) = type(E1)
     int = type(E2)
       d = maxdelay(DL(E1))

-------------------------------------------------------------------------------
CA[O, DL(E1)@E2, I] : A x E x Id ==> void
-------------------------------------------------------------------------------

+   Store(O, CS[DL(E1)@E2, I])

-------------------------------------------------------------------------------
CS[DL(E1)@E2, I] : E x Id ==> V
-------------------------------------------------------------------------------

if d is small :
+   W = CD[DL(E1)]

>   Load(W[I-CS[E2,I]]))  (of type T)

TODO : when d is not small


===============================================================================
CD[DL(E)] : create a delay line and write signal
===============================================================================
       r = rate(DL(E)) = rate(E)
       T = type(DL(E)) = type(E)
       d = maxdelay(DL(E))

-------------------------------------------------------------------------------
CD[DL(E)] : E ==> A
-------------------------------------------------------------------------------

if d is small :

+	T M[d]
+	T RM[r*VS + d]
+	T* R = RM[d]
+	Loop(j, d) {
+       Store(RM[j], Load(M[j]))
+	}
+	Loop(j, r*VS) {
+       CA[RM[j+d], E, j]
+	}
+	Loop(j, d) {
+       Store(M[j], Load(RM[r*VS+j]))
+	}

>   R  (of type [r*VS]T)

TODO : when d is not small
