In addition to the VFFT arguments defined in
Arguments for One-Dimensional FFT and VFFT Routines
, the
VCFFTF
,
VZFFTF
,
VCFFTB
, and
VZFFTB
routines use one additional argument called
ROWCOL.
ROWCOL
specifies whether to transform the rows or columns of
X(M,N)
. Set
ROWCOL
equal to
`R'
or
`r'
perform the transform or inverse transform on the rows of
X(M,N)
. Set
ROWCOL
equal to
`C'
or
`c'
perform the transform or inverse transform on the columns of
X(M,N)
.
CODE EXAMPLE 4
uses
RFFTF
to compute the FFT of a real sequence and
RFFTB
to compute the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is unnormalized and can be normalized by dividing each value by
N
.
CODE EXAMPLE 4
Fast Fourier Transform and Inverse Transform for Real Values
CODE EXAMPLE 5
uses
CFFTF
to compute the FFT of a complex sequence and
CFFTB
to compute the inverse transform. Because the number of calculated Fourier coefficients equals the number of input values, the real and imaginary values of the Fourier coefficients can be stored in the original array without additional storage manipulations. The inverse transform is unnormalized and can be normalized by dividing each value by
N
.
Sequence length
For
EZFFTF
, a real array containing the sequence to be transformed, unchanged on exit. For
EZFFTB
, a real array containing the Fourier coefficients of the inputs.
AZERO
The Fourier constant A
0
Real array containing the real parts of the complex Fourier coefficients. If
N
is even, then
A
is length
N
/2, otherwise
A
is length (
N
-1)/2.
Real array containing the imaginary parts of the complex Fourier coefficients. If
N
is even, then
B
is length
N
/2, otherwise
B
is length (
N
-1)/2.
WSAVE
Work array initialized by
EZFFTI
CODE EXAMPLE 6
uses
EZFFTF
to compute a Fourier transform of a real sequence and
EZFFTB
to compute the inverse transform. When using
EZFFTF
, the computed Fourier coefficients are stored in the arrays
A
and
B
. The input array
R
is not overwritten. Unlike the output of
RFFTF
and
DFFTF
, no packing is performed, and the complex conjugates are retained.
CODE EXAMPLE 5
Fast Fourier Transform and Inverse Transform for Complex Values
The EZFFT routines use the arguments shown in
TABLE 13
.
|
TABLE 13
Arguments for EZFFT Routines
Argument
|
Definition
|
CODE EXAMPLE 6
EZ Fourier Transform and Inverse Transform
CODE EXAMPLE 7
uses
COSQF
to compute the cosine quarter-wave transform of a real sequence and
COSQB
to compute the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is unnormalized and can be normalized by dividing each value by
4*N
.
CODE EXAMPLE 7
Cosine Quarter-Wave Transform and Inverse Transform
CODE EXAMPLE 8
uses
VCOSQF
to compute the cosine quarter-wave transform of a single real sequence and
VCOSQB
to compute the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is normalized.
CODE EXAMPLE 8
Cosine Quarter-Wave Transform and Inverse Transform Using Vector Routines
CODE EXAMPLE 9
uses
VCOSQF
to compute the cosine quarter-wave transform of multiple real sequences and
VCOSQB
to compute the inverse transforms. The computed Fourier coefficients of each sequence are packed and stored in the rows of the original real array. The inverse transforms are normalized.
CODE EXAMPLE 9
Cosine Quarter-Wave Transform and Inverse Transform Using Vector Routines
CODE EXAMPLE 10
uses
COST
to compute the cosine even-wave transform of a real sequence and the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is unnormalized and can be normalized by dividing each value by
2*(N-1)
.
CODE EXAMPLE 10
Cosine Even-Wave Transform and Inverse Transform
CODE EXAMPLE 11
uses
SINQF
to compute sine quarter-wave transform of a real sequence and
SINQB
to compute the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is unnormalized and can be normalized by dividing each value by
4*N
.
CODE EXAMPLE 11
Sine Quarter-Wave Transform and Inverse Transform
CODE EXAMPLE 12
uses
VSINQF
to compute the sine quarter-wave transform of a single real sequence and
VSINQB
to compute the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is normalized.
CODE EXAMPLE 12
Sine Quarter-Wave Transform and Inverse Transform Using Vector Routines
CODE EXAMPLE 13
uses
VSINQF
to compute the sine quarter-wave transform of multiple real sequences and
VSINQB
to compute the inverse transforms. The computed Fourier coefficients of each sequence are packed and stored in the rows of the original real array. The inverse transforms are normalized.
CODE EXAMPLE 13
Sine Quarter-Wave Transform and Inverse Transform Using Vector Routines
CODE EXAMPLE 14
uses
SINT
to compute the sine odd-wave transform of a real sequence and the inverse transform. The computed Fourier coefficients are packed and stored in the original real array. The inverse transform is unnormalized and can be normalized by dividing each value by
2*(N+1)
.
Number of rows to be transformed
Number of columns to be transformed
Two-dimensional array
A(LDA,N)
containing the sequences to be transformed and the results of an in-place transform
Leading dimension of array containing data to be transformed
Work array initialized by
x
FFT2I
LWORK
Dimension of work array
WORK
Real two-dimensional FFT routines use the arguments shown in
TABLE 15
.
PLACE
`I'
or
`i'
specifies that an in-place transform is performed.
`O'
or
`o'
specifies that an out-of-place transform is performed.
RFFT2F
or
DFFT2F
only:
`F'
or
`f'
specifies that a full result matrix is generated.
Any other character specifies that a partial result matrix is generated.
Number of rows to be transformed
Number of columns to be transformed
Two-dimensional array
A(LDA,N)
containing the sequences to be transformed and the results of an in-place transform
Leading dimension of array containing data to be transformed
Two-dimensional array
B(2*LDB,N
) that stores the results of an out-of-place transform
One half of the actual leading dimension of array that stores results of out-of-place transform
Work array initialized by
x
FFT2I
LWORK
Dimension of work array
WORK
In-place or Out-of-place.
When using In-Place, the results are stored in the modified input array that contains one or two additional rows, depending upon whether
M
is odd or even. When using Out-of-Place, the results are stored in a separate array.
Full or Partial.
When using Full, the complex conjugates are retained. When using Partial, the complex conjugates are discarded.
When computing a real one-dimensional FFT, the complex result can be packed and stored in the original array, because the values identically equal to zero and the complex conjugates are not stored. When computing the real two-dimensional FFT using the in-place and partial storage options, the complex conjugates are not stored, but the values identically equal to zero are stored. Saving the values identically equal to zero simplifies the indexing that occurs when computing the two-dimensional FFT. However, the size of the original array is modified to contain one or two additional rows, which are needed to store the values identically equal to zero.
The values of the arguments used with the real two-dimensional FFT routines depend upon whether an in-place or out-of place transform is performed, and whether the results are stored in a full or partial result matrix, as shown in
TABLE 16
.
In-Place Transform
B
unused
B
unused
When computing the real two-dimensional FFT of an input sequence of
M
rows and
N
columns, the computed Fourier coefficients will be stored in a result matrix with
2*M
rows and
N
columns when using the Full storage option. When using the Partial storage option, the Fourier coefficients will be stored in a result matrix with
M+2
rows and
N
columns when
M
is even, or in a result matrix with
M+1
rows and
N
columns when
M
is odd.
Sun Performance Library provides the
[S,D,C,Z]CNVCOR
routines for computing the convolution or correlation of a filter with one or more input vectors and the
[S,D,C,Z]CNVCOR2
routines for computing the two-dimensional convolution or correlation of two matrices. These routines are described in
Section Convolution and Correlation Routines
. The two-dimensional FFT routines can also be used to compute the two-dimensional convolutions of the two two-dimensional arrays
A
and
B
, as described in the following procedure.
1. Compute the two-dimensional FFT of
A
.
2. Compute the two-dimensional FFT of
B
.
3. Perform pointwise multiplication of
A
and
B
.
4. Compute the inverse two-dimensional FFT of the previous result.
The second transpose can be avoided for increased performance by using the
VFFT
and
[SDCZ]TRANS
routines to explicitly compute the transposed two-dimensional FFT, as described in the following procedure.
1. Use
VFFT
to compute one dimensional FFTs along the columns of
A
.
2. Use
ZTRANS
to transpose
A
.
3. Use
VFFT
to compute one-dimensional FFTs along the columns of the new
A
.
4. Use
VFFT
to compute one-dimensional FFTs along the columns of
B
.
5. Use
ZTRANS
to transpose
B
.
6. Use
VFFT
to compute one-dimensional FFTs along the columns of the new
B
.
7. Perform pointwise multiplication of
A
and
B
.
8. Use
VFFT
to compute inverse one-dimensional FFTs along the columns of the result.
9. Use
ZTRANS
to transpose the result back into its original order.
Sample Program: Two-Dimensional FFT and Inverse Transform
CODE EXAMPLE 15
uses
CFFT2F
to compute the two-dimensional FFT of a two-dimensional complex sequence and
CFFT2B
to compute the inverse transform. The computed Fourier coefficients are stored in the original complex array. The inverse transform is unnormalized and can be normalized by dividing each value by
M
*
N
.
CODE EXAMPLE 14
Sine Odd-Wave Transform and Inverse Transform
Perform a one-dimensional transform of the columns of the input vector.
Transpose the result matrix.
Perform a one-dimensional transform of the columns of the result matrix.
Transpose the result matrix to restore the original order of the data points.
Arguments for Two-Dimensional FFT Routines
Complex two-dimensional FFT routines use the arguments shown in
TABLE 14
.
|
TABLE 14
Arguments for Complex Two-Dimensional FFT Routines
Argument
|
Definition
|
TABLE 15
Arguments for Real Two-Dimensional FFT Routines
Argument
|
Definition
|
TABLE 16
Relationships Between Values of Arguments for Real Two-Dimensional FFT Routines
Full Result Matrix
|
Partial Result Matrix
|
Re(
X
0
)
Im(
X
0
)
Re(
X
0
)
Im(
X
0
)
Re(
X
1
)
Im(
X
1
)
Re(
X
1
)
Im(
X
1
)
The inverse transform is unnormalized and can be normalized by dividing each value by
M
*
N
.
CODE EXAMPLE 15
Two-Dimensional FFT and Inverse of Complex Sequence
The computed Fourier coefficients are stored in a
(2*M,
N)
array where one row contains the real part of the complex coefficient and the next row contains the imaginary part of the complex coefficient. In
CODE EXAMPLE 15
, to better display the complex conjugate symmetry, the real and imaginary parts of each complex coefficient are displayed on one line. For example, the following output:
Transformed Out-of-Place, Full
( 6.241, 0.000) ( 1.173, 0.000)
( -0.018, 1.169) ( 0.304, 0.111)
represents the following values for the Fourier coefficients.
|
Column 1
|
Column 2
|
CODE EXAMPLE 16
RFFT2F
and
RFFT2B
Example Showing In-Place and Out-of-Place Storage
CODE EXAMPLE 17
is a C example that uses
zfft2f
to compute the two-dimensional FFT of a two-dimensional complex sequence and
zfft2b
to compute the inverse transform. The computed Fourier coefficients are stored in the original complex array. The inverse transform is unnormalized and can be normalized by dividing each value by
m
*
n
.
CODE EXAMPLE 17
ZFFT2F
and
ZFFT2B
Example Using C
CODE EXAMPLE 18
is a C example that uses
rfft2f
to compute the two-dimensional FFT of a two-dimensional real sequence and
rfft2b
to compute the inverse transform. The computed Fourier coefficients are stored in the original real array using the partial storage option. The inverse transform is unnormalized and can be normalized by dividing each value by
m
*
n
.
Number of rows to be transformed
Number of columns to be transformed
Number of planes to be transformed
Three-dimensional array
A(LDA,N,K)
containing the sequences to be transformed and the results of an in-place transform
Leading dimension of array containing data to be transformed, where
LDA
M
Second dimension of array to be transformed, where
LD2A
N
Work array initialized by
x
FFT3I
LWORK
Dimension of work array
WORK
Real three-dimensional FFT routines use the arguments shown in
TABLE 18
.
PLACE
`I'
or
`i'
specifies that an in-place transform is performed.
`O'
or
`o'
specifies that an out-of-place transform is performed.
RFFT3F
or
DFFT3F
only:
`F'
or
`f'
specifies that a full result matrix is generated.
Any other character specifies that a partial result matrix is generated.
Number of rows to be transformed
Number of columns to be transformed
Number of planes to be transformed
Three-dimensional array
A(LDA,N,K)
containing the sequences to be transformed and the results of an in-place transform
Leading dimension of array containing data to be transformed
Three-dimensional array
B(2*LDB,N,K)
that stores the results of an out-of-place transform
Leading dimension of array that stores results of out-of-place transform
Work array initialized by
x
FFT3I
LWORK
Dimension of work array
WORK
In-place or Out-of-place.
When using In-Place, the results are stored in the modified input array that contains one or two additional rows, depending upon whether
M
is odd or even. When using Out-of-Place, the results are stored in a separate array.
Full or Partial.
When using Full, complex conjugates are retained. When using Partial, the complex conjugates are discarded.
When computing a real one-dimensional FFT, the complex result can be packed and stored in the original array, because the values identically equal to zero and the complex conjugates are not stored. When computing the real three-dimensional FFT using the in-place and partial storage options, the complex conjugates are not stored, but the values identically equal to zero are stored. Saving the values identically equal to zero simplifies the indexing that occurs when computing the three-dimensional FFT. However, the size of the original array is modified to contain one or two additional rows, which are needed to store the values identically equal to zero.
The values of the arguments used with the real three-dimensional FFT routines depend upon whether an in-place or out-of place transform is performed, and whether the results are stored in a full or partial result matrix, as shown in
TABLE 19
.
In-Place Transform
B
unused
B
unused
When computing the real 3D FFT of an input sequence of
M
rows,
N
columns, and
K
planes, the computed Fourier coefficients will be stored in a result matrix with
2*M
rows,
N
columns for each value of
K
when using the Full storage option. When using the Partial storage option, the Fourier coefficients will be stored in a result matrix with
M+2
rows and
N
columns for each value of
K
when
M
is even, or in
M+1
rows and
N
columns when
M
is odd. For each value of
K
, the storage format of the Fourier coefficients in the
M
rows and
N
columns is the same as for the real two-dimensional FFT routines. See
Storage of Real Two-Dimensional Sequences
.
CODE EXAMPLE 19
uses
CFFT3F
to compute the three-dimensional FFT of a three-dimensional complex sequence and
CFFT3B
to compute the inverse transform. The computed Fourier coefficients are stored in the original complex array. The inverse transform is unnormalized and can be normalized by dividing each value by
M
*
N*K
.
CODE EXAMPLE 18
Example of Using the Partial Storage Option
Perform a one-dimensional transform of the columns of the input vector.
Transpose the result matrix.
Perform a one-dimensional transform of the columns of the result matrix.
Reflect the result matrix so that the planes become columns.
Perform a one-dimensional transform of the columns of the result matrix.
Reflect and transpose the result matrix to restore the original order of the data points.
Arguments for Three-Dimensional FFT Routines
Complex three-dimensional FFT routines use the arguments shown in
TABLE 17
.
|
TABLE 17
Arguments for Complex Three-Dimensional FFT Routines
Argument
|
Definition
|
TABLE 18
Arguments for Real Three-Dimensional FFT Routines
Argument
|
Definition
|
TABLE 19
Relationship Between Values of Arguments for Real Three-Dimensional FFT Routines
Full Result Array
|
Partial Result Array
|
CNVCOR
`V'
or
`v'
specifies that convolution is computed.
`R'
or
`r'
specifies that correlation is computed.
`T'
or
`t'
specifies that the Fourier transform method is used.
`D'
or
`d'
specifies that the direct method is used, where the convolution or correlation is computed from the definition of convolution and correlation. (See Note 1)
Length of filter vector, where
NX
0.
Filter vector
Index of first element of
X
, where
NX
IFX
1
Stride between elements of the vector in
X
, where
INCX
>
0.
Length of input vectors, where
NY
0.
Number of implicit zeros prefixed to the
Y
vectors, where
NPRE
0.
Number of input vectors, where
M
0.
Input vectors.
Index of the first element of
Y
, where
NY
IFY
1
INC1Y
Stride between elements of the input vectors in
Y
, where
INC1Y
>
0.
INC2Y
Stride between input vectors in
Y
, where
INC2Y
> 0.
Length of the output vectors, where
NZ
0.
Number of
Z
vectors, where
K
0. If
K
<
M
, only the first
K
vectors will be processed. If
K
>
M
, all input vectors will be processed and the last
M
-
K
output vectors will be set to zero on exit.
Result vectors
Index of the first element of
Z
, where
NZ
IFZ
1
INC1Z
Stride between elements of the output vectors in
Z
, where
INCYZ
>
0.
INC2Z
Stride between output vectors in
Z
, where
INC2Z
> 0.
Work array
LWORK
Length of work array
Note 1. When the lengths of the two sequences to be convolved are similar, the FFT method is faster than the direct method. However, when one sequence is much larger than the other, such as when convolving a large time-series signal with a small filter, the direct method performs faster than the FFT-based method.
The two-dimensional convolution and correlation routines use the arguments shown in
TABLE 21
.
CNVCOR
`V'
or
`v'
specifies that convolution is computed.
`R'
or
`r'
specifies that correlation is computed.
METHOD
`T'
or
`t'
specifies that the Fourier transform method is used.
`D'
or
`d'
specifies that the direct method is used, where the convolution or correlation is computed from the definition of convolution and correlation. (See Note 1)
TRANSX
`N'
or
`n'
specifies that
X
is the filter matrix
`T'
or
`t'
specifies that the transpose of
X
is the filter matrix
SCRATCHX
`N'
or
`n'
specifies that
X
must be preserved
`S'
or
`s'
specifies that
X
can be used for scratch space. The contents of
X
are undefined after returning from a call where
X
is used for scratch space.
TRANSY
`N'
or
`n'
specifies that
Y
is the input matrix
`T'
or
`t'
specifies that the transpose of
Y
is the input matrix
SCRATCHY
`N'
or
`n'
specifies that
Y
must be preserved
`S'
or
`s'
specifies that
Y
can be used for scratch space. The contents of
X
are undefined after returning from a call where
Y
is used for scratch space.
Number of rows in the filter matrix
X
, where
MX
0
Number of columns in the filter matrix
X
, where
NX
0
Filter matrix.
X
is unchanged on exit when
SCRATCHX
is
`N'
or
`n'
and undefined on exit when
SCRATCHX
is
`S'
or
`s'
.
Leading dimension of array containing the filter matrix
X
.
Number of rows in the input matrix
Y
, where
MY
0.
Number of columns in the input matrix
Y
, where
NY
0
Number of implicit zeros prefixed to each row of the input matrix
Y
vectors, where
MPRE
0.
Number of implicit zeros prefixed to each column of the input matrix
Y
, where
NPRE
0.
Input matrix.
Y
is unchanged on exit when
SCRATCHY
is
`N'
or
`n'
and undefined on exit when
SCRATCHY
is
`S'
or
`s'
.
Leading dimension of array containing the input matrix
Y
.
Number of output vectors, where
MZ
0.
Length of output vectors, where
NZ
0.
Result vectors
Leading dimension of the array containing the result matrix
Z
, where
LDZ
MAX(1,MZ)
.
WORKIN
Work array
LWORK
Length of work array
Note 1. When the sizes of the two matrices to be convolved are similar, the FFT method is faster than the direct method. However, when one sequence is much larger than the other, such as when convolving a large data set with a small filter, the direct method performs faster than the FFT-based method.
The minimum dimensions for the
WORK
work arrays used with the one-dimensional and two-dimensional convolution and correlation routines are shown in
TABLE 24
. The minimum dimensions for one-dimensional convolution and correlation routines depend upon the values of the arguments
NPRE
,
NX
,
NY
, and
NZ
.
The minimum dimensions for two-dimensional convolution and correlation routines depend upon the values of the arguments shown
TABLE 22
.
Number of rows in the filter matrix
Number of rows in the input matrix
Number of output vectors
Number of columns in the filter matrix
Number of columns in the input matrix
Length of output vectors
Number of implicit zeros prefixed to each row of the input matrix
Number of implicit zeros prefixed to each column of the input matrix
MPOST
MAX(0,MZ-MYC)
NPOST
MAX(0,NZ-NYC)
MPRE
+
MPOST
+
MYC_INIT
, where
MYC_INIT
depends upon filter and input matrices, as shown in
TABLE 23
NPRE
+
NPOST
+
NYC_INIT
, where
NYC_INIT
depends upon filter and input matrices, as shown in
TABLE 23
MYC_INIT
and
NYC_INIT
depend upon the following, where
X
is the filter matrix and
Y
is the input matrix.
MYC_INIT
MAX(MX,MY)
MAX(NX,MY)
MAX(MX,NY)
MAX(NX,NY)
NYC_INIT
MAX(NX,NY)
MAX(MX,NY)
MAX(NX,MY)
MAX(MX,MY)
The values assigned to the minimum work array size is shown in
TABLE 24
.
SCNVCOR, DCNVCOR
4*(MAX(NX,NPRE+NY)
+
MAX(0,NZ-NY))
REAL, REAL*8
CCNVCOR, ZCNVCOR
2*(MAX(NX,NPRE+NY)
+
MAX(0,NZ-NY)))
COMPLEX, COMPLEX*16
The most common form of the computation, and the case that executes fastest, is applying a filter vector
X
to a series of vectors stored in the columns of
Y
with the result placed into the columns of
Z
. In that case,
INCX
= 1,
INC1Y
= 1,
INC2Y
NY
,
INC1Z
= 1,
INC2Z
NZ
. Another common form is applying a filter vector
X
to a series of vectors stored in the rows of
Y
and store the result in the row of
Z
, in which case
INCX
= 1,
INC1Y
NY
,
INC2Y
= 1,
INC1Z
NZ
, and
INC2Z
= 1.
Convolution can be used to compute the products of polynomials.
CODE EXAMPLE 21
uses
SCNVCOR
to compute the product of 1 + 2
x
+ 3
x
2
and 4 + 5
x
+ 6
x
2
.
CODE EXAMPLE 19
Three-Dimensional Fast Fourier Transform and Inverse Transform
The one-dimensional convolution and correlation routines use the arguments shown in
TABLE 20
.
|
TABLE 20
Arguments for One-Dimensional Convolution and Correlation Routines
SCNVCOR
,
DCNVCOR
,
CCNVCOR
, and
ZCNVCOR
Argument
|
Definition
|
TABLE 21
Arguments for Two-Dimensional Convolution and Correlation Routines
SCNVCOR2
,
DCNVCOR2
,
CCNVCOR2
, and
ZCNVCOR2
Argument
|
Definition
|
TABLE 22
Arguments Affecting Minimum Work Array Size for Two-Dimensional Routines:
SCNVCOR2
,
DCNVCOR2
,
CCNVCOR2
, and
ZCNVCOR2
Argument
|
Definition
|
TABLE 23
MYC_INIT
and
NYC_INIT
Dependencies
Transpose(Y)
|
Transpose(X)
|
Transpose(X)
|
TABLE 24
Minimum Dimensions and Data Types for
WORK
Work array Used With Convolution and Correlation Routines
Routine
|
Minimum Work Array Size (
WORK
)
|
SCNVCOR2
1
, DCNVCOR2
1
|
MY
+
NY
+
30
|
COMPLEX, COMPLEX*16
|
CODE EXAMPLE 21
One-Dimensional Convolution Using Fourier Transform Method and
REAL
Data
Making the output vector longer than the input vectors, as in the example above, implicitly adds zeros to the end of the input. No zeros are actually required in any of the vectors, and none are used in the example, but the padding provided by the implied zeros has the effect of an end-off shift rather than an end-around shift of the input vectors.
CODE EXAMPLE 22
will compute the product between the vector [ 1, 2, 3 ] and the circulant matrix defined by the initial column vector [ 4, 5, 6 ]:
CODE EXAMPLE 22
Convolution Used to Compute the Product of a Vector and Circulant Matrix
The difference between this example and the previous example is that the length of the output vector is the same as the length of the input vectors, so there are no implied zeros on the end of the input vectors. With no implied zeros to shift into, the effect of an end-off shift from the previous example does not occur and the end-around shift results in a circulant matrix product.
CODE EXAMPLE 23
Two-Dimensional Convolution Using Direct Method
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|