IDL Reference Guide: Procedures and Functions

FILE_SEARCH


The FILE_SEARCH function returns a string array containing the names of all files matching the input path specification. Input path specifications may contain wildcard characters, enabling them to match multiple files. FILE_SEARCH's interpretation of wildcard characters in the path specification is described in Supported Wildcards and Expansions.

A relative path is a path specification that can only be unambiguously interpreted by basing it relative to some other known location. Usually, this location is the current working directory for the process. A fully qualified path is a complete and unambiguous path that can be interpreted directly. For example, bin/idl is a relative path, while /usr/local/rsi/idl/bin/idl is a fully qualified path. By default, FILE_SEARCH follows the format of the input to decide whether to return relative or fully-qualified paths.


Note
In most cases, the operation of FILE_SEARCH is straightforward. There are, however, numerous options available; while these options make the routine more powerful, they may also make its behavior less intuitive. Read the keyword descriptions for additional details.

In addition, there are platform-specific behaviors of which you should be aware, especially if you work in a multiplatform environment. See Platform-Specific Filename Matching Issues for details.

Syntax

Result = FILE_SEARCH(Path_Specification)

or for recursive searching,

Result = FILE_SEARCH(Dir_Specification, Recur_Pattern)

Keywords: [, COUNT=variable ] [, /EXPAND_ENVIRONMENT ] [, /EXPAND_TILDE ] [, /FOLD_CASE ] [, /FULLY_QUALIFY_PATH ] [, /ISSUE_ACCESS_ERROR ] [, /MARK_DIRECTORY ] [, /MATCH_ALL_INITIAL_DOT | /MATCH_INITIAL_DOT ] [, /NOSORT ] [, /QUOTE ] [, /TEST_DIRECTORY ] [, /TEST_EXECUTABLE ] [, /TEST_READ ] [, /TEST_REGULAR ] [, /TEST_WRITE ] [, /TEST_ZERO_LENGTH ] [, /WINDOWS_SHORT_NAMES ]

UNIX-Only Keywords: [, /TEST_BLOCK_SPECIAL ] [, /TEST_CHARACTER_SPECIAL ] [, /TEST_DANGLING_SYMLINK ] [, /TEST_GROUP ] [, /TEST_NAMED_PIPE ] [, /TEST_SETGID ] [, /TEST_SETUID ] [, /TEST_SOCKET ] [, /TEST_STICKY_BIT ] [, /TEST_SYMLINK ] [, /TEST_USER ]

Return Value

Returns all matched filenames in a string array, one file name per array element. If no files exist with names matching the input arguments, a null scalar string is returned instead of a string array.

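If the input path is relative, the results will be relative. If the input is fully qualified, the results will also be fully qualified. If you specify the FULLY_QUALIFY_PATH keyword, the results will be fully qualified no matter which form of input is used. FILE_SEARCH supports two kinds of search: a standard search, in which the single Path_Specification argument is matched, and a recursive search, in which Recur_Pattern is matched in each directory given by Dir_Specification and in every directory found below it.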

Arguments

Any of the arguments described in this section can contain wildcard characters, as described in Supported Wildcards and Expansions.

Path_Specification

A scalar or array variable of string type, containing file paths to match. If Path_Specification is not supplied, or if it is supplied as a null string, FILE_SEARCH uses a default pattern of '*', which matches all files in the current directory.

Dir_Specification

A scalar or array variable of string type, containing directory paths within which FILE_SEARCH will perform recursive searching for files matching the Recur_Pattern argument. FILE_SEARCH examines Dir_Specification, and any directory found below it, and returns the paths of any files in those directories that match Recur_Pattern. If Dir_Specification is supplied as a null string, FILE_SEARCH searches the current directory.

Recur_Pattern

A scalar string containing a pattern for files to match in any of the directories specified by the Dir_Specification argument. If Recur_Pattern is supplied as a null string, FILE_SEARCH uses a default pattern of '*', which matches all files in the specified directories.

Keywords

COUNT

A named variable into which the number of files found is placed. If no files are found, a value of 0 (zero) is returned.
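Because FILE_SEARCH returns a null scalar string when nothing matches, the number of elements in the result cannot distinguish "no matches" from "one match". The following is a minimal sketch of using COUNT for that purpose; the procedure name list_sav_files and the '*.sav' pattern are hypothetical and used only for illustration.

```idl
PRO list_sav_files
   ; COUNT receives the number of matches (0 if nothing matched).
   files = FILE_SEARCH('*.sav', COUNT=nfiles)

   IF nfiles EQ 0 THEN BEGIN
      ; With no matches, files holds the null scalar string ''.
      PRINT, 'No .sav files found.'
   ENDIF ELSE BEGIN
      ; Print one matched name per line.
      FOR i = 0L, nfiles - 1 DO PRINT, files[i]
   ENDELSE
END
```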

EXPAND_ENVIRONMENT

By default, FILE_SEARCH follows the conventions of the underlying operating system to determine whether it should expand environment variable references in input file specification patterns. The default is to do such expansions under UNIX, and not to do them under Microsoft Windows. The EXPAND_ENVIRONMENT keyword is used to change this behavior. Set it to a non-zero value to cause FILE_SEARCH to perform environment variable expansion on all platforms. Set it to zero to disable such expansion.

The syntax for expanding environment variables in an input file pattern is based on that supported by the standard UNIX shell (/bin/sh), as described in Supported Wildcards and Expansions.
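As an illustration, the following sketch sets the keyword explicitly in both directions, so that the behavior does not depend on the platform default. It assumes that a HOME environment variable is defined, which may not be the case on every system.

```idl
; Request environment variable expansion explicitly, so the call
; behaves the same under UNIX and Microsoft Windows.
; Assumption: HOME is defined in the environment.
files = FILE_SEARCH('$HOME/*.pro', /EXPAND_ENVIRONMENT, COUNT=nfiles)
PRINT, 'Matches with expansion enabled: ', nfiles

; Setting the keyword to zero disables expansion, so $HOME is treated
; as literal text in the pattern.
literal = FILE_SEARCH('$HOME/*.pro', EXPAND_ENVIRONMENT=0, COUNT=nliteral)
PRINT, 'Matches with expansion disabled:', nliteral
```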

EXPAND_TILDE

Users of the UNIX C-shell (/bin/csh), and other tools influenced by it, are familiar with the use of a tilde (~) character at the beginning of a path to denote a home directory. A tilde by itself at the beginning of the path (e.g. ~/directory/file) is equivalent to the home directory of the user executing the command, while a tilde followed by the name of a user (e.g. ~user/directory/file) is expanded to the home directory of the named user.

By default, FILE_SEARCH follows the conventions of the underlying operating system in deciding whether to expand a leading tilde or to treat it as a literal character. Hence, the default is to expand the leading tilde under UNIX, and not under Microsoft Windows. The EXPAND_TILDE keyword is used to change this behavior.

Set EXPAND_TILDE to 0 (zero) to disable tilde expansion on all platforms. Set it to a non-zero value to enable tilde expansion.


Note
Under Microsoft Windows, only the plain form of tilde is recognized. Attempts to use the ~user form will cause IDL to issue an error. IDL uses the HOME and HOMEPATH environment variables to obtain a home directory for the current Windows user.

FOLD_CASE

By default, FILE_SEARCH follows the case sensitivity policy of the underlying operating system. By default, matches are case sensitive on UNIX platforms, and case insensitive on Microsoft Windows platforms. The FOLD_CASE keyword is used to change this behavior. Set it to a non-zero value to cause FILE_SEARCH to do all file matching case insensitively. Explicitly set FOLD_CASE equal to zero to cause all file matching to be case sensitive.

RSI does not recommend changing the default value of FOLD_CASE, for the following reasons:

FULLY_QUALIFY_PATH

If set, FILE_SEARCH expands all returned file paths so that they are complete. Under UNIX, this means that all files are specified relative to the root of the file system. On Windows platforms, it means that all files are specified relative to the drive on which they are located. By default, FILE_SEARCH returns fully qualified paths when the input specification is fully qualified, and returns relative paths otherwise. For example:

CD, '/usr/local/rsi/idl/bin'  
PRINT, FILE_SEARCH('idl')  
idl  
PRINT, FILE_SEARCH('idl',/FULLY_QUALIFY_PATH)  
/usr/local/rsi/idl/bin/idl  

Under Microsoft Windows, any use of a drive letter colon (:) character implies full qualification, even if the path following the colon does not start with a slash character.

ISSUE_ACCESS_ERROR

If the IDL process lacks the necessary permission to access a directory included in the input specification, FILE_SEARCH will normally skip over it quietly and not include it in the generated results. Set ISSUE_ACCESS_ERROR to cause an error to be issued instead.

MARK_DIRECTORY

If set, all directory paths are returned with a path separator character appended to the end. This allows the caller to concatenate a file name directly to the end without having to supply a separator character first. This is convenient for cross-platform programming, as the separator characters differ between operating systems:

PRINT, FILE_SEARCH(!DIR)  
/usr/local/rsi/idl  
PRINT, FILE_SEARCH(!DIR, /MARK_DIRECTORY)  
/usr/local/rsi/idl/  
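A short sketch of the concatenation this keyword is meant to simplify follows. The file name notes.txt is hypothetical; the sketch only builds the path and does not check that the file exists.

```idl
; Locate the IDL distribution directory, with a trailing separator.
dir = FILE_SEARCH(!DIR, /MARK_DIRECTORY)

; Append a file name directly; no explicit separator is needed.
; 'notes.txt' is a hypothetical name used only for illustration.
path = dir[0] + 'notes.txt'
PRINT, path
```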

MATCH_ALL_INITIAL_DOT

By default, wildcards do not match leading dot (.) characters, and FILE_SEARCH does not return the names of files that start with the dot (.) character unless the leading dot is actually contained within the search string. Set MATCH_ALL_INITIAL_DOT to change this policy so that wildcards will match all files starting with a dot, including the special "." (current directory) and ".." (parent directory) entries. RSI recommends the use of the MATCH_INITIAL_DOT keyword instead of MATCH_ALL_INITIAL_DOT for most purposes.

MATCH_INITIAL_DOT

MATCH_INITIAL_DOT serves the same function as MATCH_ALL_INITIAL_DOT, except that the special "." (current directory) and ".." (parent directory) directories are not included.
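The difference between the default behavior and the two keywords can be seen by comparing counts in the same directory. This is a sketch only; the variable names are arbitrary and the counts depend on the directory contents.

```idl
; Default: wildcards skip names that begin with a dot.
visible = FILE_SEARCH('*', COUNT=nvisible)

; Include dot files, but not the "." and ".." entries.
with_dot = FILE_SEARCH('*', /MATCH_INITIAL_DOT, COUNT=nwith_dot)

; Include dot files and the "." and ".." entries as well.
everything = FILE_SEARCH('*', /MATCH_ALL_INITIAL_DOT, COUNT=neverything)

PRINT, nvisible, nwith_dot, neverything
```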

NOSORT

Normally, FILE_SEARCH sorts the list of files returned by the operating system in a case-sensitive manner. If the NOSORT keyword is set, FILE_SEARCH will not perform the sort, instead returning exactly what is returned by the underlying operating system calls. On some operating systems, this can make FILE_SEARCH execute faster. The order of the results returned when NOSORT is set depends on the implementation of the operating system, and should not be relied upon.
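One case in which the order of the results does not matter is when only the number of matches is needed. A minimal sketch, using a hypothetical '*.dat' pattern:

```idl
; Skip the sort, since only the number of matches is used here.
files = FILE_SEARCH('*.dat', /NOSORT, COUNT=nfiles)
PRINT, 'Number of .dat files:', nfiles
```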

QUOTE

FILE_SEARCH usually treats all wildcards found in the input specification as having the special meanings described in Supported Wildcards and Expansions. This means that such characters cannot normally be used as plain literal characters in file names. For example, it is not possible to match a file that contains a literal asterisk character in its name because asterisk is interpreted as the "match zero or more characters" wildcard.

If the QUOTE keyword is set, the backslash character can be used to escape any character so that it is treated as a plain character with no special meaning. In this mode, FILE_SEARCH replaces any two-character sequence starting with a backslash with the second character of the pair. In the process, any special wildcard meaning that character might have had disappears, and the character is treated as a literal.

If QUOTE is set, any literal backslash characters in your path must themselves be escaped with a backslash character. This is especially important for Microsoft Windows users, because the directory separator character for that platform is the backslash. Windows IDL also accepts UNIX-style forward slashes for directory separators, so Windows users have two choices in handling this issue:

Result = FILE_SEARCH('C:\\home\\bob\\\*.dat', /QUOTE)  
Result = FILE_SEARCH('C:/home/bob/\*.dat', /QUOTE)  

On a Windows system, either of these options gives the path to a file named *.dat.
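The same escaping applies to the other wildcard characters. The following sketch matches a file whose name contains literal square brackets; the file name data[1].txt is hypothetical.

```idl
; Match a file literally named data[1].txt. Without QUOTE, the
; brackets would be interpreted as a wildcard that matches one of
; the enclosed characters.
files = FILE_SEARCH('data\[1\].txt', /QUOTE, COUNT=nfiles)
PRINT, 'Matches found:', nfiles
```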

TEST_BLOCK_SPECIAL

This keyword is only available on UNIX platforms.

Only include a matching file if it is a block special device.

TEST_CHARACTER_SPECIAL

This keyword is only available on UNIX platforms.

Only include a matching file if it is a character special device.

TEST_DANGLING_SYMLINK

This keyword is only available on UNIX platforms.

Only include a matching file if it is a symbolic link that points at a non-existent file.

TEST_DIRECTORY

Only include a matching file if it is a directory.

TEST_EXECUTABLE

Only include a matching file if it is executable. The source of this information differs between operating systems:

UNIX: IDL checks the per-file information (the execute bit) maintained by the operating system.

Microsoft Windows: The determination is made on the basis of the file name extension (e.g. .exe).

TEST_GROUP

This keyword is only available on UNIX platforms.

Only include a matching file if it belongs to the same effective group ID (GID) as the IDL process.

TEST_NAMED_PIPE

This keyword is only available on UNIX platforms.

Only include a matching file if it is a named pipe (fifo) device.

TEST_READ

Only include a matching file if it is readable by the user.


Note
This keyword does not support Access Control List (ACL) settings for files.

TEST_REGULAR

Only include a matching file if it is a regular disk file and not a directory, pipe, socket, or other special file type.

TEST_SETGID

This keyword is only available on UNIX platforms.

Only include a matching file if it has its Set-Group-ID bit set.

TEST_SETUID

This keyword is only available on UNIX platforms.

Only include a matching file if it has its Set-User-ID bit set.

TEST_SOCKET

This keyword is only available on UNIX platforms.

Only include a matching file if it is a UNIX domain socket.

TEST_STICKY_BIT

This keyword is only available on UNIX platforms.

Only include a matching file if it has its sticky bit set.

TEST_SYMLINK

This keyword is only available on UNIX platforms.

Only include a matching file if it is a symbolic link that points at an existing file.

TEST_USER

This keyword is only available on UNIX platforms.

Only include a matching file if it belongs to the same effective user ID (UID) as the IDL process.

TEST_WRITE

Only include a matching file if it is writable by the user.


Note
This keyword does not support Access Control List (ACL) settings for files.

TEST_ZERO_LENGTH

Only include a matching file if it has zero length.


Note
The length of a directory is highly system-dependent and does not necessarily correspond to the number of files it contains. In particular, it is possible for an empty directory to report a non-zero length. RSI does not recommend using the TEST_ZERO_LENGTH keyword on directories, as the information returned cannot be used in a meaningful way.

TEST_* Keywords

The keywords with names that start with the TEST_ prefix allow you to filter the list of resulting file paths based on various criteria. With the TEST_ prefix removed, each of these keywords corresponds directly to the FILE_TEST keyword of the same name, and is internally implemented by the same test code. One could therefore use FILE_TEST instead of the TEST_ keywords to FILE_SEARCH. For example, the following statement locates all subdirectories of the current directory:

Result = FILE_SEARCH(/TEST_DIRECTORY)  

It is equivalent to the following statements, using FILE_TEST:

result = FILE_SEARCH()  
idx = where(FILE_TEST(result, /DIRECTORY), count)  
result = (count eq 0) ? '' : result[idx]  

The TEST_* keywords are more succinct, and can be more efficient in the common case in which FILE_SEARCH generates a long list of results, only to have FILE_TEST discard most of them.

WINDOWS_SHORT_NAMES

By default, FILE_SEARCH ignores Microsoft Windows 8.3 short names when performing file matching and only considers the real file names. Set the WINDOWS_SHORT_NAMES keyword to change this policy. If this keyword is set, FILE_SEARCH looks at both the real and 8.3 short names associated with each file as it checks for a match. See Microsoft Windows "8.3 Short Names" for more information on this subject. This keyword is quietly ignored on non-Windows platforms.

You should be aware that turning on short name support can lead to confusing results. Such use should be considered carefully. For example, consider running the following IDL statement in a directory containing a single file named file_search.html:

PRINT, FILE_SEARCH('*.htm')  

Because this statement does not enable short name support, no files matching the specified pattern are found, and IDL does not print any filenames. This is the result most people would expect. If short name support is enabled, however:

PRINT, FILE_SEARCH('*.htm', /WINDOWS_SHORT_NAMES)  

IDL prints:

file_search.html   

In this case, IDL checks the short names as well as the real names. The 8.3 short name for file_search.html will be similar to FILE_S~1.HTM, which matches the Path_Specification. As such, FILE_SEARCH reports the real name for the matched file. This is the correct answer, but probably not the expected result.

Supported Wildcards and Expansions

The wildcards understood by FILE_SEARCH are based on those used by the standard UNIX shell /bin/sh (the ?, *, [, and ], characters, and environment variables) with some enhancements commonly found in the C-shell /bin/csh (the ~, {, and } characters). These wildcards are processed identically across all IDL supported platforms. The supported wildcards are shown in the following table:

Table 3-44: Supported Wildcards and Expansions

| Wildcard | Description |
|---|---|
| * | Matches any string, including the null string. |
| ? | Matches any single character. |
| [...] | Matches any one of the enclosed characters. A pair of characters separated by "-" matches any character lexically between the pair, inclusive. If the first character following the opening bracket ( [ ) is a ! or ^, any character not enclosed is matched. |
| {str, str, ...} | Expand to each string (or filename-matching pattern) in the comma-separated list. |
| ~, ~user | If used at start of input file specification, is replaced with the path to the appropriate home directory. See the description of the EXPAND_TILDE keyword for details. |
| $var | Replace with value of the named environment variable. See the description of the EXPAND_ENVIRONMENT keyword for full details. |
| ${var} | Replace ${var} with the value of the var environment variable. If var is not found in the environment, ${var} is replaced with a null string. This format is useful when the environment variable reference sits directly next to unrelated text, as the use of the {} brackets makes it possible for IDL to determine where the environment variable ends and the remaining text starts (e.g. ${mydir}other_text). |
| ${var:-alttext} | If environment variable var is present in the environment and has a non-NULL value, then substitute that value. If var is not present, or has a NULL value, then substitute the alternative text (alttext) provided instead. |
| ${var-alttext} | If environment variable var is present in the environment (even if it has a NULL value) then substitute that value. If var is not present, then substitute the alternative text (alttext) provided instead. |
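As a sketch of how the expansions in this table can be combined, the following call uses the ${var:-alttext} form together with a {str, str} list. DATA_DIR and the /tmp fallback are hypothetical names chosen only for illustration.

```idl
; Search the directory named by the DATA_DIR environment variable,
; falling back to /tmp when DATA_DIR is not present or has a null
; value. DATA_DIR and /tmp are hypothetical.
files = FILE_SEARCH('${DATA_DIR:-/tmp}/*.{dat,txt}', $
                    /EXPAND_ENVIRONMENT, COUNT=nfiles)
PRINT, 'Matches found:', nfiles
```

EXPAND_ENVIRONMENT is set explicitly here because environment variable expansion is not enabled by default on every platform.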

These wildcards can appear anywhere in an input file specification, with the following exceptions:

Tilde (~)

The tilde character is only considered to be a wildcard if it is the first character in the input file specification and the EXPAND_TILDE keyword is set. Otherwise, it is treated as a regular character.

Initial Dot Character

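The default is for wildcards not to match the dot (.) character if it occurs as the first character of a directory or file name. This follows the convention of UNIX shells, which treat such names as hidden files. In order to match such files, you can take any of the following actions:

  • Include the leading dot explicitly in the search string.
  • Set the MATCH_INITIAL_DOT keyword, so that wildcards match names that start with a dot, excluding the special "." and ".." entries.
  • Set the MATCH_ALL_INITIAL_DOT keyword, so that wildcards match all names that start with a dot, including the special "." and ".." entries.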

Microsoft Windows UNC Paths

On a local area network, Microsoft Windows offers an alternative to the drive letter syntax for accessing files. The Universal Naming Convention (UNC) allows for specification of paths on other hosts using the syntax:

\\hostname\sharename\dir\dir\...\file  

UNC paths are distinguished from normal paths by the use of two initial backslashes in the path. FILE_SEARCH can process such paths, but wildcard characters are not allowed in the hostname or sharename segments. Wildcards are allowed for specifying directories and files. For performance reasons, RSI does not recommend using the recursive form of FILE_SEARCH with UNC paths on very large directory trees.

Platform-Specific Filename Matching Issues

When using FILE_SEARCH, you should be aware of the following platform-specific issues.

File Path Syntax

The syntax allowed for file paths differs between operating systems. FILE_SEARCH always processes file paths using the syntax rules for the platform on which the IDL session is running. As a convenience for Microsoft Windows users, Windows IDL accepts UNIX style forward slashes as well as the usual backslashes as path separators.

Differing Defaults Between Platforms

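The different operating systems supported by IDL have some conventions for processing file paths that are inherently incompatible. If FILE_SEARCH attempted to force an identical default policy for these features across all platforms, the resulting routine would be inconvenient to use on all platforms. FILE_SEARCH resolves this inherent tension between convenience and control in the following way:

  • By default, each of these features follows the convention of the underlying operating system.
  • Each of these features is controlled by a keyword that can be set explicitly to override the default, giving the same behavior on all platforms.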

The keywords that have different defaults on different platforms are listed in the following table:

 

Table 3-45: FILE_SEARCH Defaults that Differ Between Platforms

| Wildcard | Keyword | Default UNIX | Default Win |
|---|---|---|---|
| $var, ${var}, ${var:-alttext}, ${var-alttext} | EXPAND_ENVIRONMENT | yes | no |
| ~ | EXPAND_TILDE | yes | no |
| (none) | FOLD_CASE | no | yes |
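For code that must behave the same way on every platform, the expansion keywords can be given explicitly rather than left to their defaults. The following is a sketch under that assumption; the '~/*.pro' pattern is only an example, and FOLD_CASE is deliberately left at its default.

```idl
; State the expansion policy explicitly instead of relying on the
; platform defaults. FOLD_CASE is not set, so case sensitivity still
; follows the underlying operating system.
files = FILE_SEARCH('~/*.pro', /EXPAND_TILDE, /EXPAND_ENVIRONMENT, $
                    COUNT=nfiles)
PRINT, 'Matches found:', nfiles
```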

Microsoft Windows "8.3 Short Names"

Older versions of Microsoft operating systems limited files to very short names: up to eight characters were allowed for the file name, followed by a dot (.), followed by an extension of up to three characters. This scheme is often referred to as "8.3 short names" or just "8.3". Newer releases of the Windows operating system have moved past the 8.3 limits, and allow much longer names. In order to allow programs written for older systems to run on newer systems without first being rebuilt, these newer versions of Windows actually maintain two separate and distinct file names for each file. Every file has, in addition to its real (potentially long) name, an automatically generated "8.3 short name". If the real name fits within the 8.3 limits, the real and short names are the same. If the real name does not fit within the 8.3 limits, the operating system constructs an 8.3 short name for it by applying a set of heuristic rules (see Microsoft's documentation for more detail on how these names are constructed). For example, the file file_search.html will be given a short name that looks something like FILE_S~1.HTM.


Note
8.3 short names are strictly a Windows backwards compatibility feature, and are not generally useful in newer software.

8.3 short names are an issue for FILE_SEARCH if the Path_Specification argument includes them. FILE_SEARCH handles this situation using the following rules:

  1. Before starting the process of file name matching, FILE_SEARCH examines the portion of Path_Specification between the first character and the first wildcard character (or the entire string if there are no wildcards) for non-wildcarded short names. Any such names are replaced by their real names.

  2. By default, FILE_SEARCH only considers the real name during the process of matching Path_Specification with files, ignoring the 8.3 short names. The WINDOWS_SHORT_NAMES keyword can be set to change this policy. If WINDOWS_SHORT_NAMES is set, FILE_SEARCH looks at both the real and 8.3 short names associated with each file as it checks for a match.

  3. If WINDOWS_SHORT_NAMES is set and FILE_SEARCH matches the 8.3 short name for a file, the real file name is returned. For instance, in the above example if FILE_SEARCH matches FILE_S~1.HTM, it will return file_search.html.


Warning
Windows 8.3 short names can be very confusing to understand. RSI recommends not using them unless absolutely necessary. Most modern applications will not encounter a need to match 8.3 short names.

Examples

Example 1

Find all files in the current working directory:

Result = FILE_SEARCH()  

Example 2

Find all IDL program (*.pro) files in the current working directory:

Result = FILE_SEARCH('*.pro')  

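To determine the number of IDL procedure files that exist in the current directory, use the COUNT keyword. Applying N_ELEMENTS to the result is not reliable, because FILE_SEARCH returns a null scalar string when no files match, and N_ELEMENTS reports 1 for that string:

Result = FILE_SEARCH('*.pro', COUNT=count)
PRINT, '# IDL pro files:', count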

Example 3

Under Microsoft Windows, find all files in the top level directories of all drives other than the floppy drives:

Result=FILE_SEARCH('[!ab]:*')  

This example relies on the following:

  • FILE_SEARCH allows wildcards within the drive letter part of an input file specification.

  • Drives A and B are always floppies, and are not used by Windows for any other type of drive.

Example 4

Find all files in the user's home directory that start with the letters A-D. Match both upper and lowercase letters:

Result = FILE_SEARCH('~/[a-d]*', /EXPAND_TILDE, /FOLD_CASE)  

Example 5

Find all directories in the user's home directory that start with the letters A-D. Match both upper and lowercase letters:

Result = FILE_SEARCH('~/[a-d]*', /EXPAND_TILDE, /FOLD_CASE, $  
/TEST_DIRECTORY)  

Example 6

Recursively find all subdirectories found underneath the user's home directory that do not start with a dot character:

Result = FILE_SEARCH('$HOME', '*', /EXPAND_ENVIRONMENT, $  
/TEST_DIRECTORY)  

Example 7

Recursively find all subdirectories found underneath the user's home directory, including those that start with a dot character, but excluding the special "." and ".." directories:

Result = FILE_SEARCH('$HOME', '*', /MATCH_INITIAL_DOT, $  
/EXPAND_ENVIRONMENT, /TEST_DIRECTORY)  

Example 8

Find all .pro and .sav files in an IDL library search path, sorted by directory, in the order IDL searches for them:

Result = FILE_SEARCH(STRSPLIT(!PATH, PATH_SEP(/SEARCH_PATH), $  
   /EXTRACT) + '/*.{pro,sav}')  

PATH_SEP(/SEARCH_PATH) returns the character that separates the directories in the IDL search path (under UNIX, a colon), so the call to STRSPLIT breaks the IDL search path into an array of directories. To each directory name, we concatenate the wildcards necessary to match any .pro or .sav files in that directory. When this array is passed to FILE_SEARCH, it locates all files that match these specifications. FILE_SEARCH sorts the files found for each input string separately. The files for each string are then placed into the output array in the order in which the input strings were searched.

Example 9

Recursively find all directories in your IDL distribution:

Result = FILE_SEARCH(!DIR, '*', /TEST_DIRECTORY)  

Example 10

Under Microsoft Windows, FILE_SEARCH can be used to convert a path that uses 8.3 short names into an equivalent path that uses the real names. For example, Windows provides a directory for each user where applications are expected to create temporary files. Consider a user of Windows 2000 named Scott. The short name for Scott's temporary directory (which is usually available via the Windows "Temp" environment variable) will typically be something like C:\DOCUME~1\scott\LOCALS~1\Temp. This can be converted to real names using a statement like the following:

PRINT, FILE_SEARCH(GETENV('Temp'))  

IDL prints:

C:\Documents and Settings\scott\Local Settings\Temp   

There are some noteworthy facts about this example:

  1. Short names are relatively rare in modern Windows systems. Environment variables are one of the few ways in which they are still seen.

  2. It was not necessary to specify the WINDOWS_SHORT_NAMES keyword in this example, although it would have been harmless to do so, because the path being converted contains no wildcard characters. FILE_SEARCH will always attempt to convert any non-wildcarded path components at the beginning of the path to their long names before it begins searching.

  3. The GETENV function supports a special token (IDL_TMPDIR) that should be used to obtain the directory where temporary files go. This provides a portable cross platform way to find a good temporary directory. Under Windows, GETENV automatically converts the result of translating the IDL_TMPDIR preference to real names in a manner similar to that shown in this example.

  4. People find the 8.3 short names confusing, and you may wish to translate them to real names for display purposes. However, most file handling functions and procedures in IDL will accept either without issue. There is little reason to translate them before using them in programs.

Version History

5.5
Introduced
6.1
Added WINDOWS_SHORT_NAMES keyword

See Also

FILE_TEST, FILEPATH, GETENV

  IDL Online Help (June 16, 2005)