Walking¶
Walking a filesystem means recursively visiting a directory and any sub-directories. It is a fairly common requirement for copying, searching etc.
To walk a filesystem (or directory) you can construct a Walker
object and use its methods to do the walking. Here’s an example that prints the path to every Python file in your projects directory:
>>> from fs import open_fs
>>> from fs.walk import Walker
>>> home_fs = open_fs('~/projects')
>>> walker = Walker(filter=['*.py'])
>>> for path in walker.files(home_fs):
... print(path)
Generally speaking, however, you will only need to construct a Walker object if you want to customize some behavior of the walking algorithm. This is because you can access the functionality of a Walker object via the walk
attribute on FS objects. Here’s an example:
>>> from fs import open_fs
>>> home_fs = open_fs('~/projects')
>>> for path in home_fs.walk.files(filter=['*.py']):
... print(path)
Note that the files
method above doesn’t require a fs
parameter. This is because the walk
attribute is a property which returns a BoundWalker
object, which associates the filesystem with a walker.
Walk Methods¶
If you call the walk
attribute on a BoundWalker
it will return an iterable of Step
named tuples with three values; a path to the directory, a list of Info
objects for directories, and a list of Info
objects for the files. Here’s an example:
for step in home_fs.walk(filter=['*.py']):
print('In dir {}'.format(step.path))
print('sub-directories: {!r}'.format(step.dirs))
print('files: {!r}'.format(step.files))
Note
Methods of BoundWalker
invoke a corresponding method on a Walker
object, with the bound filesystem.
The walk
attribute may appear to be a method, but is in fact a callable object. It supports other convenient methods that supply different information from the walk. For instance, files()
, which returns an iterable of file paths. Here’s an example:
for path in home_fs.walk.files(filter=['*.py']):
print('Python file: {}'.format(path))
The compliment to files
is dirs()
which returns paths to just the directories (and ignoring the files). Here’s an example:
for dir_path in home_fs.walk.dirs():
print("{!r} contains sub-directory {}".format(home_fs, dir_path))
The info()
method returns a generator of tuples containing a path and an Info
object. You can use the is_dir
attribute to know if the path refers to a directory or file. Here’s an example:
for path, info in home_fs.walk.info():
if info.is_dir:
print("[dir] {}".format(path))
else:
print("[file] {}".format(path))
Finally, here’s a nice example that counts the number of bytes of Python code in your home directory:
bytes_of_python = sum(
info.size
for info in home_fs.walk.info(namespaces=['details'])
if not info.is_dir
)
Search Algorithms¶
There are two general algorithms for searching a directory tree. The first method is "breadth"
, which yields resources in the top of the directory tree first, before moving on to sub-directories. The second is "depth"
which yields the most deeply nested resources, and works backwards to the top-most directory.
Generally speaking, you will only need the a depth search if you will be deleting resources as you walk through them. The default breadth search is a generally more efficient way of looking through a filesystem. You can specify which method you want with the search
parameter on most Walker
methods.