In this tutorial, you’ll learn how to loop through files in a directory using Python. This is a common task when working with files. Whether you want to perform operations on each file, read their contents, or process them in some way, you often need to iterate over files in a directory.
5 Ways to Loop Through Files in a Directory Using Python
We will cover the following methods and compare their advantages and disadvantages at the end.
- Using the os.listdir() function
- Using the os.scandir() function
- Using the pathlib module
- Using the glob module
- Using the os.walk() function
Let us now go through each method and see how it loops through the files in a directory.
1. Using the os.listdir() function to iterate over files
To loop through files in a directory, you can use os.listdir() to get a list of names and then iterate through them. The list contains the names of subdirectories as well as files, in arbitrary order.
Python Code:
import os
# Get the current working directory
cwd = os.getcwd()
# Get the names of all entries (files and subdirectories) in the current directory
entries = os.listdir(cwd)
# Loop through the entries and print their names
for entry in entries:
    print(entry)
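Because os.listdir() returns the names of subdirectories as well as files, a common follow-up is to keep only regular files. The sketch below shows one way to do that with os.path.isfile(); the helper name list_regular_files is hypothetical and not part of the original example.

```python
import os

def list_regular_files(directory):
    """Return the sorted names of regular files directly inside `directory`."""
    names = []
    for name in os.listdir(directory):
        # os.listdir() returns bare names, so join each one with the
        # directory before asking the filesystem what kind of entry it is.
        if os.path.isfile(os.path.join(directory, name)):
            names.append(name)
    return sorted(names)

# Print only the regular files in the current working directory
for name in list_regular_files(os.getcwd()):
    print(name)
```

Sorting is optional; it is included here only because os.listdir() does not guarantee any particular order.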
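2. Using the os.scandir() function to loop through files
Starting from Python 3.5, the os.scandir() function provides a more efficient way to loop through files and directories. Instead of plain names, it yields os.DirEntry objects that expose the file type and attribute information the operating system provides while scanning the directory, so checks such as is_file() often need no extra system call.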
Python Code:
import os
# Get the current working directory
cwd = os.getcwd()
# Open an iterator over the entries in the current directory;
# the with statement (Python 3.6+) closes it when the loop finishes
with os.scandir(cwd) as entries:
    # Loop through the entries and print their names
    for entry in entries:
        print(entry.name)
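The objects yielded by os.scandir() carry more than a name. As a minimal sketch, the hypothetical helper file_sizes below uses DirEntry.is_file() and DirEntry.stat() to collect the size of every regular file in a directory. It assumes Python 3.6 or later, where the iterator can be used in a with statement.

```python
import os

def file_sizes(directory):
    """Map each regular file name in `directory` to its size in bytes."""
    sizes = {}
    # The with statement closes the iterator and frees its resources
    with os.scandir(directory) as entries:
        for entry in entries:
            # is_file() skips subdirectories; stat() returns an os.stat_result
            if entry.is_file():
                sizes[entry.name] = entry.stat().st_size
    return sizes

# Print the size of each regular file in the current working directory
for name, size in sorted(file_sizes(os.getcwd()).items()):
    print(f"{name}: {size} bytes")
```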
3. Using the pathlib module to iterate over files
The pathlib module represents filesystem paths as Path objects rather than strings. Here is a complete example of using pathlib to iterate over files in a directory:
Python Code:
from pathlib import Path
# Get the current working directory
cwd = Path.cwd()
# Get an iterator over the entries in the current directory
entries = cwd.glob('*')
# Loop through the entries and print their names
for entry in entries:
    print(entry.name)
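Path objects also make filtering straightforward. The following sketch uses Path.iterdir() together with is_file() and the suffix attribute to keep only files with a given extension; the helper name files_with_suffix is an assumption made for illustration.

```python
from pathlib import Path

def files_with_suffix(directory, suffix):
    """Return sorted Path objects for regular files in `directory` ending in `suffix`."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        # is_file() drops subdirectories; suffix includes the leading dot
        if path.is_file() and path.suffix == suffix
    )

# Print the name and the extension-free stem of each .py file in the current directory
for path in files_with_suffix(Path.cwd(), ".py"):
    print(path.name, path.stem)
```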
4. Using the glob module to loop through files
The glob module finds paths whose names match a shell-style wildcard pattern such as *.txt, which makes it well suited to filtering files by name. glob.glob() returns the matches as a list of strings that we can then loop through.
Python Code:
import glob
# Get the entries in the current directory that match the pattern
# (by default, names starting with a dot are not matched by '*')
matches = glob.glob('*')
# Loop through the matches and print their names
for match in matches:
    print(match)
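Patterns can also reach into subdirectories. As a hedged sketch, the hypothetical helper find_by_pattern below wraps glob.glob() and passes its recursive flag, which is what allows a "**" component in the pattern to span any number of directory levels.

```python
import glob
import os

def find_by_pattern(directory, pattern, recursive=False):
    """Return sorted paths under `directory` that match the glob `pattern`."""
    full_pattern = os.path.join(directory, pattern)
    # "**" is only treated as "any depth" when recursive=True
    return sorted(glob.glob(full_pattern, recursive=recursive))

# Text files directly inside the current directory
for path in find_by_pattern(".", "*.txt"):
    print(path)
```

Calling find_by_pattern(".", "**/*.txt", recursive=True) would also include text files in nested subdirectories, at the cost of scanning the whole tree.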
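5. Using the os.walk() function to iterate over files
The os.walk() function is useful when you need to loop through files in subdirectories recursively. For each directory in the tree, it yields a tuple containing the directory path, the names of its subdirectories, and the names of its files.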
Python Code:
import os
# Get the current working directory
cwd = os.getcwd()
# Walk through the directory tree and print the full path of every file
for root, dirs, files in os.walk(cwd):
    for file in files:
        print(os.path.join(root, file))
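In its default top-down mode, os.walk() lets you edit the dirs list in place to control which subdirectories it visits next. The sketch below uses that to skip directories whose names start with a dot (for example .git); the helper name walk_visible_files and the choice of what to skip are assumptions made for illustration.

```python
import os

def walk_visible_files(top):
    """Yield paths of files under `top`, skipping directories whose names start with a dot."""
    for root, dirs, files in os.walk(top):
        # Slice assignment modifies the same list object that os.walk() holds,
        # so the removed directories are never descended into.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            yield os.path.join(root, name)
```

Because the function is a generator, paths are produced one at a time, for example with for path in walk_visible_files(os.getcwd()): print(path).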
Comparison of methods
Here is a brief comparison of the Python methods for looping through files in a directory.
Method | Advantages | Disadvantages |
---|---|---|
os.listdir() | Simple to use | Returns only names, with no file information |
os.scandir() | Returns file type and attribute information along with each name | Slightly more verbose; the iterator should be closed or fully consumed |
pathlib | Object-oriented interface | Some older APIs expect plain string paths, so conversion with str() may be needed |
glob | Supports wildcards | Shell-style wildcards only; more complex filters need extra code |
os.walk() | Recursive | Can be slow for large directory trees |
Recommendation
The best method to use for looping through files in a directory depends on the specific needs of the application. If you only need the names of the entries in a single directory, the os.listdir() function is the simplest option. If you also need file type or size information, the os.scandir() function is usually the more efficient choice, because it can reuse information the operating system returns while scanning the directory. The pathlib module provides a more object-oriented interface for working with files and directories. The glob module can be used for matching files with wildcards. The os.walk() function can be used for recursively walking through directory trees.
Combining all methods used to loop through files in a directory
The snippet below brings together all the methods we have seen above, with each one applied to a slightly different use case.
Python Code:
import datetime
import glob
import os
from pathlib import Path
# Get the names of files in the current directory that end with the `.txt` extension
txt_files = [name for name in os.listdir(os.getcwd()) if name.endswith('.txt')]
# Get the names of files in the current directory that are larger than 1 megabyte
with os.scandir(os.getcwd()) as entries:
    large_files = [entry.name for entry in entries if entry.is_file() and entry.stat().st_size > 1048576]
# Get the files in the current directory that were modified in the last 24 hours
# (st_mtime is a timestamp in seconds, so the cutoff is converted to a timestamp too)
cutoff = (datetime.datetime.now() - datetime.timedelta(hours=24)).timestamp()
recent_files = [path for path in Path.cwd().glob('*') if path.is_file() and path.stat().st_mtime > cutoff]
# Get the list of files in the current directory that match the pattern `*.jpg`
jpg_files = glob.glob('*.jpg')
# Get the list of all files in the current directory tree
all_files = []
for root, dirs, files in os.walk(os.getcwd()):
    for file in files:
        all_files.append(os.path.join(root, file))
Conclusion
Looping through files in a directory is a common task in Python. By understanding the different methods available, you can choose the best method for the specific needs of your application.
Happy coding!