NAME
    rcsq - REDC squaring

SYNOPSIS
    rcsq(x, m)

TYPES
    x		integer
    m		odd positive integer

    return	integer v, 0 <= v < m.

DESCRIPTION
    Let B be the base calc uses for representing integers internally
    (B = 2^16 for 32-bit machines, 2^32 for 64-bit machines)
    and N the number of words (base-B digits) in the representation
    of m.  Then rcsq(x,m) returns the value of B^-N * x^2 % m,
    where the inverse implicit in B^-N is modulo m
    and the modulus operator % gives the least non-negative residue.
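
    The definition above can be modelled directly.  The following is a
    minimal Python sketch of the formula only, not calc's implementation;
    the names word_count and rcsq_model and the base_bits parameter are
    hypothetical, introduced for illustration.  It assumes Python 3.8 or
    later, where pow() accepts a negative exponent together with a modulus.

```python
def word_count(m, base_bits=32):
    """Number of base-B digits (B = 2**base_bits) needed to represent m."""
    return max(1, (m.bit_length() + base_bits - 1) // base_bits)


def rcsq_model(x, m, base_bits=32):
    """Model of the definition: B^-N * x^2 % m for odd positive m."""
    if m <= 0 or m % 2 == 0:
        raise ValueError("m must be an odd positive integer")
    n = word_count(m, base_bits)
    b_pow_n = pow(2, base_bits * n, m)
    # The inverse exists because m is odd, hence coprime to B = 2**base_bits.
    inv = pow(b_pow_n, -1, m)
    # Python's % already yields the least non-negative residue.
    return inv * x * x % m


print(rcsq_model(2, 9))  # 1, since B^-1 % 9 = 7 and 7 * 4 % 9 = 1
```

    With base_bits = 32 and m = 9, N is 1 and B % 9 is 4, so the inverse
    of B modulo 9 is 7.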

    rcsq() is normally used to square, modulo m, a value that has been
    encoded by rcin() or produced by other REDC functions, as in:

	    rcin(x^2, m) = rcsq(rcin(x,m), m)

    from which we get:

	    x^2 % m = rcout(rcsq(rcin(x,m), m), m)

    Alternatively, x^2 % m can usually be evaluated more quickly by:

	    x^2 % m = rcin(rcsq(x,m), m).
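
    The three identities above can be checked numerically.  The sketch
    below is a Python model, not calc code.  It assumes B = 2^32 and the
    usual definitions rcin(x,m) = B^N * x % m and rcout(x,m) = B^-N * x % m;
    the *_model names and the helper _bn are hypothetical.

```python
BASE_BITS = 32  # assumed word size, so B = 2**32


def _bn(m):
    """B^N % m, where N is the number of base-B words in m."""
    n = max(1, (m.bit_length() + BASE_BITS - 1) // BASE_BITS)
    return pow(2, BASE_BITS * n, m)


def rcin_model(x, m):
    return _bn(m) * x % m


def rcout_model(x, m):
    return pow(_bn(m), -1, m) * x % m


def rcsq_model(x, m):
    return pow(_bn(m), -1, m) * x * x % m


m = 1000003  # arbitrary odd modulus chosen for the check
for x in (0, 1, 2, 12345, 999999, 123456789):
    direct = x * x % m
    # rcin(x^2, m) = rcsq(rcin(x,m), m)
    assert rcin_model(x * x, m) == rcsq_model(rcin_model(x, m), m)
    # x^2 % m = rcout(rcsq(rcin(x,m), m), m)
    assert rcout_model(rcsq_model(rcin_model(x, m), m), m) == direct
    # x^2 % m = rcin(rcsq(x,m), m)
    assert rcin_model(rcsq_model(x, m), m) == direct
print("all three identities hold")
```

    The last form needs one conversion instead of two, which is consistent
    with it usually being the quicker route.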

RUNTIME
    If the value of m in rcsq(x,m) is being used for the first time in
    a REDC function, the information required for the REDC algorithms
    is calculated and stored for future use, possibly replacing an
    already stored value, in a table covering up to 5 (i.e. MAXREDC)
    values of m.  The runtime required for this is about two times that
    required for multiplying two N-word integers.

    Two algorithms are available for evaluating rcsq(x, m).  The one
    that is usually faster for small N is used when N <
    config("redc2"); the other is usually faster for larger N.  If
    config("redc2") is set at about 90 and 0 <= x < m, the runtime
    required for rcsq(x, m) is at most about f times the runtime
    required for an N-word by N-word multiplication, where f increases
    from about 1.1 for N = 1 to near 2.8 for N > 90.  More runtime may
    be required if x has to be reduced modulo m.
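
    This help file does not describe the two algorithms.  For orientation
    only, the following Python sketch shows the textbook word-by-word
    Montgomery reduction applied to x^2; it is an assumption that either
    of calc's algorithms resembles it.  The name redc_square and the
    base_bits parameter are hypothetical, and Python 3.8 or later is
    assumed for pow() with a negative exponent.

```python
def redc_square(x, m, base_bits=32):
    """Textbook word-by-word Montgomery reduction applied to x*x.

    Returns B^-N * x^2 % m for odd m > 0 and 0 <= x < m, where
    B = 2**base_bits and N is the number of base-B words in m.
    """
    b = 1 << base_bits
    n = max(1, (m.bit_length() + base_bits - 1) // base_bits)
    # Per-modulus setup: m_prime = -m^-1 mod B (computed once per m).
    m_prime = (-pow(m, -1, b)) % b
    t = x * x
    for _ in range(n):
        # Choose q so that t + q*m is divisible by B, then drop one word.
        q = (t % b) * m_prime % b
        t = (t + q * m) >> base_bits
    # For 0 <= x < m the value is below 2*m, so one subtraction suffices.
    return t - m if t >= m else t


print(*[redc_square(i, 9) for i in range(9)])  # 0 7 1 0 4 4 0 1 7
```

    The sketch requires 0 <= x < m; a larger x would first have to be
    reduced modulo m, which is the extra cost mentioned above.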

EXAMPLE
    Using a 64-bit machine with B = 2^32:

    > for (i = 0; i < 9; i++) print rcsq(i,9),:; print;
    0 7 1 0 4 4 0 1 7

    > for (i = 0; i < 9; i++) print rcin(rcsq(i,9),9),:; print;
    0 1 4 0 7 7 0 4 1
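
    Because B enters the definition, the first row depends on the word
    size, while the second row (plain squares modulo 9) does not.  The
    Python sketch below models the definition for both values of B named
    in DESCRIPTION; it is not calc code, and the helper rcsq_row is
    hypothetical.

```python
def rcsq_row(m, base_bits):
    """Values of B^-N * i^2 % m for i = 0 .. m-1, with B = 2**base_bits."""
    n = max(1, (m.bit_length() + base_bits - 1) // base_bits)
    inv = pow(pow(2, base_bits * n, m), -1, m)
    return [inv * i * i % m for i in range(m)]


print(*rcsq_row(9, 32))  # 0 7 1 0 4 4 0 1 7
print(*rcsq_row(9, 16))  # 0 4 7 0 1 1 0 7 4
```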

LIMITS
    none

LIBRARY
    void zredcsquare(REDC *rp, ZVALUE z1, ZVALUE *res)

SEE ALSO
    rcin, rcout, rcmul, rcpow
