To find all files with a .txt extension in a directory, you can use the glob module in Python. The glob module provides a function called glob that finds files and directories matching a specified pattern.

Here is an example of how to use glob to find all .txt files in a directory:

				
import glob

# Find all .txt files in the current directory
files = glob.glob('*.txt')

# Print the list of files
print(files)

				
			

The glob function returns a list of file paths matching the pattern, in no guaranteed order. In this case, the pattern '*.txt' matches every name in the current directory that ends in .txt, except hidden files whose names begin with a dot.

You can also specify a specific directory to search by including the path in the pattern. For example, to find all .txt files in the /path/to/dir directory, you can use the following pattern: '/path/to/dir/*.txt'
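As a minimal sketch of searching a specific directory, the pattern can be built from the directory path rather than typed by hand. The helper name find_txt_files is made up for this illustration, and '.' stands in for a real path such as '/path/to/dir'.

```python
import glob
import os


def find_txt_files(directory):
    """Return the .txt files directly inside `directory` (no subdirectories)."""
    # glob.escape guards against directory names that contain *, ? or [
    pattern = os.path.join(glob.escape(directory), '*.txt')
    # glob makes no ordering guarantee, so sort for stable output
    return sorted(glob.glob(pattern))


# '.' is a placeholder; pass any directory path you want to search
print(find_txt_files('.'))
```

Building the pattern with os.path.join keeps the path separator correct on both Windows and Unix-like systems.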

If you also want to search subdirectories, use the ** pattern together with the recursive=True option:

				
import glob

# Find all .txt files in the current directory and its subdirectories
files = glob.glob('**/*.txt', recursive=True)

# Print the list of files
print(files)

				
			

3 Ways to Find All Files by Extension in Python

Here are three ways to find all files with a specific extension in Python:

  1. The ‘glob’ Module
  2. The ‘os.listdir’ Function
  3. The ‘os.walk’ Function

1. The ‘glob’ Module

The glob module in Python provides a function called glob that can be used to find files and directories matching a specified pattern. The glob function returns a list of file paths matching the pattern.

Here is an example of how to use the glob function to find all .txt files in a directory:

				
import glob

# Find all .txt files in the current directory
files = glob.glob('*.txt')

# Print the list of files
print(files)

				
			

You can also specify a specific directory to search by including the path in the pattern. For example, to find all .txt files in the /path/to/dir directory, you can use the following pattern: '/path/to/dir/*.txt'

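If you want to search for .txt files in subdirectories as well, use the ** pattern with the recursive=True option. With that option, ** matches zero or more directory levels, so files in the starting directory are included too: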

				
import glob

# Find all .txt files in the current directory and its subdirectories
files = glob.glob('**/*.txt', recursive=True)

# Print the list of files
print(files)

				
			

The glob module also provides a function called iglob that works similarly to glob, but returns an iterator instead of a list. This can be more memory-efficient when working with large directories.

				
import glob

# Find all .txt files in the current directory
files = glob.iglob('*.txt')

# Print each file as the iterator yields it
for file in files:
    print(file)

				
			
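One place where the iterator form can help is when you only need the first few matches. The sketch below is an illustration rather than a required pattern: the helper name first_matches is made up, and itertools.islice is used to stop consuming results once the limit is reached.

```python
import glob
from itertools import islice


def first_matches(pattern, limit):
    """Return at most `limit` paths matching `pattern`."""
    # iglob yields paths one at a time; islice stops consuming after `limit` items
    return list(islice(glob.iglob(pattern, recursive=True), limit))


# Fetch at most five .txt files from the current directory tree
print(first_matches('**/*.txt', 5))
```

With glob.glob the full list would be built before any slicing could happen, so the iterator version is likely to do less work on a large tree.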

2. Recursive Search with ‘os.listdir()’

To perform a recursive search for files with a specific extension using the os module, you can use the following approach:

				
import os

def find_files(extension, path):
    # Initialize an empty list to store the paths of the matching files
    matching_files = []

    # Iterate over the files and directories in the specified path
    for item in os.listdir(path):
        # Get the full path of the item
        full_path = os.path.join(path, item)

        # If the item is a file and has the desired extension, add it to the list of matching files
        if os.path.isfile(full_path) and full_path.endswith(extension):
            matching_files.append(full_path)
        # If the item is a directory, recursively search for files with the desired extension
        elif os.path.isdir(full_path):
            matching_files.extend(find_files(extension, full_path))

    return matching_files

# Find all .txt files in the current directory and its subdirectories
files = find_files('.txt', '.')

# Print the list of files
print(files)

				
			

This function uses the os.listdir function to iterate over the files and directories in the specified path. If an item is a file with the desired extension, it is added to the list of matching files. If an item is a directory, the function calls itself recursively to search for files with the desired extension in that directory.

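This approach can be slow for large directory trees: every item costs extra os.path.isfile and os.path.isdir checks on top of the directory listing, and the whole result list is built in memory. Because os.path.isdir follows symbolic links, a link that points back up the tree can also cause endless recursion. The os.walk function, covered next, is usually a better fit for these cases.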
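If you prefer to keep a hand-written recursive function, one way to reduce the per-item cost is os.scandir, which returns entries that cache their file type. The following is a sketch under that assumption; the name find_files_scandir is made up here, and the choice not to follow symlinked directories is a design decision of this example, not something the original function does.

```python
import os


def find_files_scandir(extension, path):
    """Recursively collect files under `path` whose names end with `extension`."""
    matching_files = []
    # os.scandir yields DirEntry objects that cache file-type information
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(extension):
                matching_files.append(entry.path)
            # follow_symlinks=False avoids looping through symlinked directories
            elif entry.is_dir(follow_symlinks=False):
                matching_files.extend(find_files_scandir(extension, entry.path))
    return matching_files
```

Calling find_files_scandir('.txt', '.') returns the same kind of list as find_files('.txt', '.'), apart from skipping directories that are reached only through symbolic links.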

3. The ‘os.walk’ Function

The os.walk function in Python is a convenient tool for traversing directory trees. It returns a generator that yields one three-item tuple for each directory in the tree, starting with the root directory you pass in.

Here is an example of how to use the os.walk function to find all .txt files in a directory tree:

				
import os

# Find all .txt files in the directory tree rooted at '.'
for root, dirs, files in os.walk('.'):
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(root, file))

				
			

The os.walk function iterates over all subdirectories, starting from the specified root directory, and yields a tuple for each directory containing the following information:

  • The path of the current directory (root)
  • A list of subdirectories in the current directory (dirs)
  • A list of files in the current directory (files)

In the example above, we use a nested loop to iterate over the files list and print the full path of each file that ends with .txt.
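The same nested loop can be wrapped in a reusable generator. The sketch below is one possible variation, not the only way to do it: the name iter_files_by_extension is made up, and matching several extensions case-insensitively is an extra assumption added for illustration.

```python
import os


def iter_files_by_extension(root, extensions):
    """Yield paths under `root` whose names end with any of `extensions`."""
    # str.endswith accepts a tuple, so several extensions can be tested at once
    suffixes = tuple(ext.lower() for ext in extensions)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            # Lower-casing the name makes 'NOTES.TXT' match '.txt'
            if name.lower().endswith(suffixes):
                yield os.path.join(dirpath, name)
```

For example, list(iter_files_by_extension('.', ['.txt', '.md'])) collects both text and Markdown files from the current directory tree.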

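You can also modify the dirs list in place to control which subdirectories are visited. This works because os.walk traverses top-down by default, so it reads dirs only after your loop body has run. For example, to skip subdirectories starting with '.', you can use the following code: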

				
import os

# Find all .txt files in the directory tree rooted at '.', but skip subdirectories starting with '.'
for root, dirs, files in os.walk('.'):
    dirs[:] = [d for d in dirs if not d.startswith('.')]
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(root, file))

				
			

The os.walk function is generally more efficient than a recursive function built on os.listdir: since Python 3.5 it uses os.scandir internally, which on most platforms avoids a separate file-type check for each entry. It is also more flexible, allowing you to prune the traversal by editing the dirs list in place.
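As a further sketch of that flexibility, the pruning step can take a set of directory names to skip. The helper name find_txt_skipping and the example names passed to it are hypothetical, chosen only for illustration.

```python
import os


def find_txt_skipping(root, skip_names):
    """Return .txt paths under `root`, never descending into directories in `skip_names`."""
    found = []
    # topdown=True is the default, which is what makes in-place pruning work
    for dirpath, dirnames, filenames in os.walk(root):
        # Slice assignment edits the very list os.walk reads next
        dirnames[:] = [d for d in dirnames if d not in skip_names]
        found.extend(
            os.path.join(dirpath, name)
            for name in filenames
            if name.endswith('.txt')
        )
    return sorted(found)
```

A call such as find_txt_skipping('.', {'.git', 'node_modules'}) would search the current tree while leaving those two directories unvisited.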

Wrap up

There are several ways to find all files with a specific extension in Python, depending on your needs and the complexity of the directory tree you are working with.

The glob module provides a simple and efficient way to find files matching a specified pattern, and the recursive=True option also allows you to search for files in subdirectories.

The os module provides several functions that can be used to search for files, including os.listdir and os.walk. These functions allow you to iterate over the files and directories in a specified directory and can be used to perform a recursive search if needed.

The pathlib module provides a convenient and object-oriented interface for working with filesystem paths and includes a glob method that can be used to find files matching a specified pattern.
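Since pathlib is not shown elsewhere in this article, here is a minimal sketch of its glob and rglob methods. The helper name find_with_pathlib and its recursive flag are made up for this example.

```python
from pathlib import Path


def find_with_pathlib(directory, extension, recursive=False):
    """Return sorted Path objects in `directory` whose names end with `extension`."""
    base = Path(directory)
    pattern = f'*{extension}'
    # rglob searches subdirectories too; glob stays in `directory` itself
    matches = base.rglob(pattern) if recursive else base.glob(pattern)
    # Keep regular files only, in case a directory name also matches the pattern
    return sorted(p for p in matches if p.is_file())


# List .txt files in the current directory only
print(find_with_pathlib('.', '.txt'))
```

Both methods return generators of Path objects, so the results can be used directly with attributes such as name, suffix, and parent.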

Regardless of your chosen approach, consider your solution’s performance and memory efficiency, especially if you are working with large directory trees.


Thanks for reading. Happy coding!