Image Processing : Transforming Between Domains |
The Fast Fourier Transform (FFT) is used in numerical analysis to transform an image between spatial and frequency domains. The FFT decomposes an image into sines and cosines of varying amplitudes and phases. The values of the resulting transform represent the amplitudes of particular horizontal and vertical frequencies. This image information in the frequency domain shows how often patterns are repeated within an image. Low frequencies represent gradual variations in an image, while high frequencies correspond to abrupt variations in the image.
Low frequencies tend to contain the most information because they determine the overall shape or pattern in the image. High frequencies provide detail in the image, but they are often contaminated by the spurious effects of noise. Masks can be easily applied to the image within the frequency domain to remove the noise.
The following sections introduce the concepts needed to work with images and Fast Fourier Transforms (FFTs):
The FFT process is the basis for many filters used in image processing. One of the easiest FFT filters to understand is the one used for background noise removal. This filter is simply a mask applied to the image in the frequency domain. See Removing Noise with the FFT for an example of how to use this type of filter.
When an image is transformed with FFT from the spatial domain to the frequency domain, the transformation process is referred to as a forward FFT. The forward FFT process can be performed with IDL's FFT function.
In the frequency domain, the lowest frequencies usually contain most of the information, which is shown by the large peak in the center of the data. If the transform is shown as a surface, the peak of low frequencies appears as a spike. If the transform is shown as an image, the peak of low frequencies is composed of the brightest pixels.
If the image does not contain any background noise, the rest of the data frequencies are very close to zero. However, the results of the FFT function have a very wide range. An initial display may not show any variations from zero, but a smaller range will show that the image does actually contain background noise. Since scaling a range can sometimes be quite arbitrary, different methods are used. See Displaying Images in the Frequency Domain for more information on displaying the results of a forward FFT.
The following example shows how to use IDL's FFT function to compute a forward FFT. This example uses the first image within the abnorm.dat
file in the examples/data
directory. The results of the FFT function are shifted to move the origin (0, 0) of the x- and y-frequencies to the center of the data. Frequency magnitude then increases with distance from the origin. If the results are not centered, then the negative frequencies appear after the positive frequencies because of the storage scheme of the FFT process. See the FFT description in the IDL Reference Guide for more information on this storage scheme. Complete the following steps for a detailed description of the process.
Example Code See forwardfft.pro in the examples/doc/image subdirectory of the IDL installation directory for code that duplicates this example. |
abnorm.dat
file:imageSize = [64, 64] file = FILEPATH('abnorm.dat', $ SUBDIRECTORY = ['examples', 'data']) image = READ_BINARY(file, DATA_DIMS = imageSize)
displaySize = 2*imageSize
DEVICE, DECOMPOSED = 0 LOADCT, 0
WINDOW, 0, XSIZE = displaySize[0], $ YSIZE = displaySize[1], TITLE = 'Original Image' TVSCL, CONGRID(image, displaySize[0], $ displaySize[1])
The following figure shows the original image.
ffTransform = FFT(image)
center = imageSize/2 + 1 fftShifted = SHIFT(ffTransform, center)
interval = 1. hFrequency = INDGEN(imageSize[0]) hFrequency[center[0]] = center[0] - imageSize[0] + $ FINDGEN(center[0] - 2) hFrequency = hFrequency/(imageSize[0]/interval) hFreqShifted = SHIFT(hFrequency, -center[0]) vFrequency = INDGEN(imageSize[1]) vFrequency[center[1]] = center[1] - imageSize[1] + $ FINDGEN(center[1] - 2) vFrequency = vFrequency/(imageSize[1]/interval) vFreqShifted = SHIFT(vFrequency, -center[1])
Note The previous two steps were performed because of the storage scheme of the FFT process. See the FFT description in the IDL Reference Guide for more information on this storage scheme. |
WINDOW, 1, TITLE = 'FFT: Transform' SHADE_SURF, fftShifted, hFreqShifted, vFreqShifted, $ /XSTYLE, /YSTYLE, /ZSTYLE, $ TITLE = 'Transform of Image', $ XTITLE = 'Horizontal Frequency', $ YTITLE = 'Vertical Frequency', $ ZTITLE = 'Real Part of Transform', CHARSIZE = 1.5
The following figure shows the results of applying the FFT to the image. The data at the high frequencies seem to be close to zero, but the peak (spike) along the z-axis is so large that a closer look is needed.
Note The data type of the array returned by the FFT function is complex, which contains real and imaginary parts. The amplitude is the absolute value of the FFT, while the phase is the angle of the complex number, computed using the arctangent. In the above surface, we are only displaying the real part. In most cases, the imaginary part will look the same as the real part. |
WINDOW, 2, TITLE = 'FFT: Transform (Closer Look)' SHADE_SURF,fftShifted, hFreqShifted, vFreqShifted, $ /XSTYLE, /YSTYLE, /ZSTYLE, $ TITLE = 'Transform of Image', $ XTITLE = 'Horizontal Frequency', $ YTITLE = 'Vertical Frequency', $ ZTITLE = 'Real Part of Transform', CHARSIZE = 1.5, $ ZRANGE = [0., 5.]
The following figure shows the resulting transform after scaling the z-axis range from 0 to 5. You can now see that the central peak is surrounded by smaller peaks containing both high frequency information and noise.
Within the frequency domain, the range of values from the peak to the high frequency noise is extreme. You can use a logarithmic scale to retain the shape of the surface, but reduce its range. Since the logarithmic scale only applies to positive values, you should first compute the power spectrum, which is the absolute value squared of the transform.
The following example shows how to display the results of IDL's FFT function. This example also uses the first image within the abnorm.dat
file in the examples/data
directory. The results of the transform are shifted to move the origin (0, 0) of the horizontal and vertical frequencies to the center of the display. If the results are not centered then the negative frequencies appear after the positive frequencies because of the storage scheme of the FFT process. See FFT for more information on its storage scheme. Complete the following steps for a detailed description of the process.
Example Code See displayfft.pro in the examples/doc/image subdirectory of the IDL installation directory for code that duplicates this example. |
abnorm.dat
file:imageSize = [64, 64] file = FILEPATH('abnorm.dat', $ SUBDIRECTORY = ['examples', 'data']) image = READ_BINARY(file, DATA_DIMS = imageSize)
displaySize = 2*imageSize
DEVICE, DECOMPOSED = 0 LOADCT, 0
WINDOW, 0, XSIZE = displaySize[0], YSIZE = displaySize[1], $ TITLE = 'Original Image' TVSCL, CONGRID(image, displaySize[0], $ displaySize[1])
The following figure shows the original image.
ffTransform = FFT(image)
center = imageSize/2 + 1 fftShifted = SHIFT(ffTransform, center)
interval = 1. hFrequency = INDGEN(imageSize[0]) hFrequency[center[0]] = center[0] - imageSize[0] + $ FINDGEN(center[0] - 2) hFrequency = hFrequency/(imageSize[0]/interval) hFreqShifted = SHIFT(hFrequency, -center[0]) vFrequency = INDGEN(imageSize[1]) vFrequency[center[1]] = center[1] - imageSize[1] + $ FINDGEN(center[1] - 2) vFrequency = vFrequency/(imageSize[1]/interval) vFreqShifted = SHIFT(vFrequency, -center[1])
Note The previous two steps were performed because of the storage scheme of the FFT process. See the FFT description in the IDL Reference Guide for more information on this storage scheme. |
powerSpectrum = ABS(fftShifted)^2
scaledPowerSpect = ALOG10(powerSpectrum)
WINDOW, 1, TITLE = 'FFT Power Spectrum: '+ $ 'Logarithmic Scale (surface)' SHADE_SURF, scaledPowerSpect, hFreqShifted, vFreqShifted, $ /XSTYLE, /YSTYLE, /ZSTYLE, $ TITLE = 'Log-scaled Power Spectrum', $ XTITLE = 'Horizontal Frequency', $ YTITLE = 'Vertical Frequency', $ ZTITLE = 'Log(Squared Amplitude)', CHARSIZE = 1.5
The following figure shows the log-scaled power spectrum as a surface. Both low and high frequency information are visible in this display.
|
Note The data type of the array returned by the FFT function is complex, which contains real and imaginary parts. The amplitude is the absolute value of the FFT, while the phase is the angle of the complex number, computed using the arctangent. In the above surface, we are only displaying the real part. In most cases, the imaginary part will look the same as the real part. |
WINDOW, 2, XSIZE = displaySize[0], YSIZE = displaySize[1], $ TITLE = 'FFT Power Spectrum: Logarithmic Scale (image)' TVSCL, CONGRID(scaledPowerSpect, displaySize[0], $ displaySize[1])
The following figure shows the log-scaled power spectrum as an image. The brighter pixels near the center of the display represent the low frequency peak of information-containing data. The noise appears as random darker pixels within the image.
After manipulating an image within the frequency domain, you will need to transform the image back to the spatial domain. This transformation process is referred to as an inverse FFT. The inverse FFT process can be performed with IDL's FFT function by setting the INVERSE keyword.
The following example shows how to use IDL's FFT function to compute an inverse FFT. This example uses the first image within the abnorm.dat
file in the examples/data
directory. The image is not manipulated in this example while it is in the frequency domain to show that no information is lost when using the FFT. However, manipulating spurious high frequency data within the frequency domain is a useful way to remove background noise from an image, as shown in Removing Noise with the FFT. Complete the following steps for a detailed description of the process.
Example Code See inversefft.pro in the examples/doc/image subdirectory of the IDL installation directory for code that duplicates this example. |
abnorm.dat
file:imageSize = [64, 64] file = FILEPATH('abnorm.dat', $ SUBDIRECTORY = ['examples', 'data']) image = READ_BINARY(file, DATA_DIMS = imageSize)
displaySize = 2*imageSize
DEVICE, DECOMPOSED = 0 LOADCT, 0
ffTransform = FFT(image)
center = imageSize/2 + 1 fftShifted = SHIFT(ffTransform, center)
Note This step was performed because of the storage scheme of the FFT process. See the FFT description in the IDL Reference Guide for more information on this storage scheme. |
powerSpectrum = ABS(fftShifted)^2
scaledPowerSpect = ALOG10(powerSpectrum)
WINDOW, 0, XSIZE = displaySize[0], YSIZE = displaySize[1], $ TITLE = 'Power Spectrum Image' TVSCL, CONGRID(scaledPowerSpect, displaySize[0], $ displaySize[1])
The following figure shows the log-scaled power spectrum.
|
fftInverse = REAL_PART(FFT(ffTransform, /INVERSE))
Note The data type of the array returned by the FFT function is complex, which contains real and imaginary parts. The amplitude is the absolute value of the FFT, while the phase is the angle of the complex number, computed using the arctangent. In the above surface, we are only displaying the real part. In most cases, the imaginary part will look the same as the real part. |
WINDOW, 1, XSIZE = displaySize[0], YSIZE = displaySize[1], $ TITLE = 'FFT: Inverse Transform' TVSCL, CONGRID(fftInverse, displaySize[0], $ displaySize[1])
The inverse transform is the same as the original image as shown in the following figure. Unlike some domain transformations, all image information is retained when transforming data to and from the frequency domain.
This example uses IDL's FFT function to remove noise from an image. The image comes from the abnorm.dat
file found in the examples/data
directory. The first display contains the original image and its transform. The noise is very evident in the transform. A surface representation of the power spectrum helps to determine the threshold necessary to remove the noise from the image. In the surface representation, the noise appears random and below a ridge containing the spike. The ridge and spike represent coherent information within the image. A mask is applied to the transform to remove the noise and the inverse transform is applied, resulting in a clearer image. Complete the following steps for a detailed description of the process.
Example Code See removingnoisewithfft.pro in the examples/doc/image subdirectory of the IDL installation directory for code that duplicates this example. |
abnorm.dat
file:imageSize = [64, 64] file = FILEPATH('abnorm.dat', $ SUBDIRECTORY = ['examples', 'data']) image = READ_BINARY(file, DATA_DIMS = imageSize)
displaySize = 2*imageSize
DEVICE, DECOMPOSED = 0 LOADCT, 0
WINDOW, 0, XSIZE = 2*displaySize[0], $ YSIZE = displaySize[1], $ TITLE = 'Original Image and Power Spectrum' TVSCL, CONGRID(image, displaySize[0], displaySize[1]), 0
ffTransform = FFT(image)
center = imageSize/2 + 1 fftShifted = SHIFT(ffTransform, center)
interval = 1. hFrequency = INDGEN(imageSize[0]) hFrequency[center[0]] = center[0] - imageSize[0] + $ FINDGEN(center[0] - 2) hFrequency = hFrequency/(imageSize[0]/interval) hFreqShifted = SHIFT(hFrequency, -center[0]) vFrequency = INDGEN(imageSize[1]) vFrequency[center[1]] = center[1] - imageSize[1] + $ FINDGEN(center[1] - 2) vFrequency = vFrequency/(imageSize[1]/interval) vFreqShifted = SHIFT(vFrequency, -center[1])
Note The previous two steps were performed because of the storage scheme of the FFT process. See the FFT description in the IDL Reference Guide for more information on this storage scheme. |
powerSpectrum = ABS(fftShifted)^2
scaledPowerSpect = ALOG10(powerSpectrum)
TVSCL, CONGRID(scaledPowerSpect, displaySize[0], $ displaySize[1]), 1
The following figure shows the original image and its log-scaled power spectrum. The black pixels (which appear random) in the power spectrum represent noise.
|
scaledPS0 = scaledPowerSpect - MAX(scaledPowerSpect)
WINDOW, 1, $ TITLE = 'Power Spectrum Scaled to a Zero Maximum' SHADE_SURF, scaledPS0, hFreqShifted, vFreqShifted, $ /XSTYLE, /YSTYLE, /ZSTYLE, $ TITLE = 'Zero Maximum Power Spectrum', $ XTITLE = 'Horizontal Frequency', $ YTITLE = 'Vertical Frequency', $ ZTITLE = 'Max-Scaled(Log(Power Spectrum))', $ CHARSIZE = 1.5
The following figure shows the resulting log-scaled power spectrum as a surface.
|
Note The data type of the array returned by the FFT function is complex, which contains real and imaginary parts. The real part is the amplitude, and the imaginary part is the phase. In image processing, we are more concerned with the amplitude, which is the only part represented in the surface and displays of the results of the transformation. However, the imaginary part is retained for the inverse transform back into the spatial domain. |
mask = REAL_PART(scaledPS0) GT -5.25
maskedTransform = fftShifted*mask
WINDOW, 2, XSIZE = 2*displaySize[0], $ YSIZE = displaySize[1], $ TITLE = 'Power Spectrum of Masked Transform and Results' TVSCL, CONGRID(ALOG10(ABS(maskedTransform^2)), $ displaySize[0], displaySize[1]), 0, /NAN
maskedShiftedTrans = SHIFT(maskedTransform, -center)
inverseTransform = REAL_PART(FFT(maskedShiftedTrans, $ /INVERSE))
TVSCL, CONGRID(inverseTransform, displaySize[0], $ displaySize[1]), 1
The following figure shows the power spectrum of the masked transform and its inverse, which contains less noise than the original image.
IDL Online Help (March 06, 2007)