Python: iterate (and read) all files in a directory (folder)
12th August 2007
To iterate through all the files within a specified directory (folder), with the ability to use wildcards (*, ?, and [ ]-style ranges), use the following code snippet:
    import os
    import glob

    path = 'sequences/'
    # glob() expands the wildcard and returns a list of matching paths
    for infile in glob.glob(os.path.join(path, '*.fasta')):
        print("current file is: " + infile)
If you do not need wildcards, then there is a simpler way to list all items in a directory:
    import os

    path = 'sequences/'
    # listdir() returns bare names (files and sub-directories), not full paths
    listing = os.listdir(path)
    for infile in listing:
        print("current file is: " + infile)
print was promoted from a statement to a function in Python 3 (use print(infile) instead of print infile).
Use ‘os.path.join()’ to keep the script portable across platforms: different operating systems use different path separators, and a hard-coded separator can stop the script from working under another OS.
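Since the names returned by os.listdir() do not include the directory, they need to be joined with the path before a file can be opened. A minimal sketch of reading every file, assuming the directory holds plain-text files (the function name read_all is made up for this illustration):

```python
import os

def read_all(path):
    """Yield (full_path, text) for every regular file directly inside path.

    read_all is a hypothetical helper; it assumes the files are plain text.
    """
    for name in os.listdir(path):
        full_path = os.path.join(path, name)  # listdir() gives bare names
        if os.path.isfile(full_path):         # skip sub-directories
            with open(full_path) as handle:
                yield full_path, handle.read()
```

Calling it as `for infile, text in read_all('sequences/'):` gives one file's path and contents per iteration.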
The Python docs mention that there is also iglob(), which returns an iterator. On directories with a very large number of files it can save memory by yielding a single result per iteration instead of building the whole list of files, as glob() does.
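A short sketch of the iglob() variant, using the same '*.fasta' pattern as above (the wrapper name iter_fasta is only an illustration):

```python
import glob
import os

def iter_fasta(path):
    """Return an iterator over the *.fasta files in path.

    iter_fasta is a hypothetical wrapper around glob.iglob().
    """
    # iglob() yields matches one at a time instead of returning a list
    return glob.iglob(os.path.join(path, '*.fasta'))
```

It is used like the glob() loop: `for infile in iter_fasta('sequences/'): print(infile)`.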
December 23rd, 2008 at 11:38
Works just fine for me. The only important change to the code that I had to make was turning print into a function, because I'm using Python 3.0. I also set it to read files with *all* extensions.
December 23rd, 2008 at 13:21
Dt, thanks, I’ve updated the code.
May 14th, 2009 at 3:52
May 18th, 2009 at 22:25
Thank you very much…
This was exactly what I was looking for!
November 24th, 2009 at 3:38
marvellous
December 7th, 2009 at 2:11
#lists all files in directory script is in
January 3rd, 2010 at 20:14
Is there a way to change this script so that it also runs through sub-directories under the given path name?
January 3rd, 2010 at 21:07
Make that code into a function – e.g. scan_dirs(path) – and add a single line of code to it (pseudocode below):
if os.path.isdir(infile): scan_dirs(infile)
This will do exactly what you want.
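Spelled out, that suggestion could look roughly like the sketch below. It uses the plain '*' wildcard so that directory names are matched too; symbolic links that point back up the tree are not handled here.

```python
import glob
import os

def scan_dirs(path):
    """Print every file under path, descending into sub-directories."""
    for infile in glob.glob(os.path.join(path, '*')):
        if os.path.isdir(infile):
            scan_dirs(infile)  # recurse into the sub-directory
        else:
            print("current file is: " + infile)
```

The standard library's os.walk() performs the same kind of traversal without explicit recursion, if that reads better in your script.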
January 11th, 2010 at 22:07
Bogdan,
Thanks for the help. I’m still not getting the code to look at the directories within the path. Here’s my code, it still only looks at the files under the initial path.
January 11th, 2010 at 23:44
Dan,
script below seems to work perfectly for me:
Basically, I’ve changed the ‘*.*’ wildcard to just ‘*’.
January 12th, 2010 at 4:43
Ahh… My *.* as opposed to a * had it so it wasn’t looking at folders, thus the problem. Thanks again!
November 30th, 2010 at 1:47
Is there a way to also do this in Windows? What I need to do is
process every *.txt file in a directory, one at a time, inside
a Python script.
January 26th, 2011 at 21:05
Thanks. This snippet helped a bunch.
May 13th, 2011 at 13:51
Is there a possibility to list the files in order, by name ?
For example :
/path/file01.txt
/path/file02.txt
…………..
If I use the code you presented here, I get a scrambled order.
May 13th, 2011 at 14:20
I found it:
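One common way to get the files in name order, shown here only as a sketch and not necessarily what the commenter used, is to pass the result through the built-in sorted() (the helper name sorted_files is made up):

```python
import glob
import os

def sorted_files(path, pattern='*.txt'):
    """Return the paths matching pattern in path, in alphabetical order.

    sorted_files is a hypothetical helper for illustration.
    """
    return sorted(glob.glob(os.path.join(path, pattern)))
```

Note that this is a plain string sort, so file10.txt comes before file2.txt; zero-padded names such as file01.txt and file02.txt sort as expected.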
August 10th, 2011 at 14:54
Hi…
I am working in ubuntu. I have a bunch of commands (say 10 commands like cmd1, cmd2, cmd3…………..cmd10)
I want to write a Python script which can achieve the following:
It should traverse through the directory structure and apply a command at particular directory path.
The location and the commands are already known to me.
/local/mnt/myspace/sample1$ cmd1
/local/mnt/myspace/sample2$ cmd2
/local/mnt/myspace$ cmd3
/local/mnt$cmd4
/local/mnt/myspace/sample9$ cmd 8
/local/mnt/myspace/sample3$ dmd10
Can someone please provide the script, as I am not even a beginner in Python?
September 14th, 2011 at 11:28
Thank you very much for the explanation. I had a problem when trying to list files or directories in Python. You solved my problem.
October 13th, 2011 at 14:40
hi …
I have been messing around with a Python program to browse through images in a directory and display them in a canvas. Can anybody help?
March 16th, 2012 at 13:07
Is there a way to open and read many PDB files (e.g. 1ASD.pdb, 2sew.pdb, 5res.pdb) from a folder (e.g. protein) on a drive (e.g. E:/) automatically, without entering the name of each PDB file? There are up to 14,000 PDB files.
March 22nd, 2012 at 17:55
First off, this is great! Can’t begin to tell you how helpful it is. One question: Is there a way to have it loop through only visible files? For example, in every folder, Mac OSX creates a .DS_Store file. When I iterate through, it picks up this file, which gets included in any subsequent arrays, lists, etc.
Thanks
March 22nd, 2012 at 23:21
@Vaishu: just use the script with a proper mask, like *.pdb. Maybe also make it recursive (see comment 10), if you have PDB files in sub-directories.
@Adam: just use the proper filename mask. For example, *.* should not include any files which start with a dot (like .DS_Store). Another way is to check the filename in Python, e.g.
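A minimal sketch of such a filename check, assuming "visible" means "does not start with a dot" (the Unix and Mac OS X convention; the Windows hidden attribute is a different mechanism and is not covered). The name visible_files is made up for this example:

```python
import os

def visible_files(path):
    """Return the names in path that do not start with a dot.

    visible_files is a hypothetical helper; it only applies the
    dot-file convention and ignores the Windows 'hidden' attribute.
    """
    return [name for name in os.listdir(path) if not name.startswith('.')]
```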
March 23rd, 2012 at 5:01
Thanks @Bogdan. That definitely helps, but is there no way to systematically look for only visible files?
March 23rd, 2012 at 8:00
Hi,
Please help me with this simple program:
Drive:F:/
folder:X
files:x1.txt,x2.txt,x3.txt,x4.txt,x5.txt (5 separate files)
I have to read all these files quickly, so I generated a list as list.txt=['F:/X/x1.txt','F:/X/x2.txt','F:/X/x3.txt','F:/X/x4.txt','F:/X/x5.txt']
Now I have to read the list.txt file, and I want to generate a listres.txt file with ‘w’,
where
listres.txt=['F:/X/x1res.txt','F:/X/x2res.txt','F:/X/x3res.txt','F:/X/x4res.txt','F:/X/x5res.txt']
I expect the result for x1.txt to be written to x1res.txt alone (and x2.txt to x2res.txt), but unfortunately the results of all of x1+x2+x3+x4+x5 are written to x1res.txt, and the same result to x2res.txt. How do I separate them?
March 24th, 2012 at 19:52
@Adam, I’m not currently aware of such a method. If it exists, then it should be either somewhere in os.path, or in collections. Please report back if you find it.
@Priya, you should probably use http://stackoverflow.com/ or http://codereview.stackexchange.com/ to post your code and have volunteers help you with it.
March 26th, 2012 at 6:48
Hello sir,
Thank you very much to Bogdan and Adam.
April 2nd, 2012 at 8:13
Hello,
Which Python commands are used for arranging floating-point numbers in ascending order?
For example,
In Drive=C:/ Folder=r Textfile=seq.txt
Contents of seq.txt=
9.45
6.346
2.5632
8.1452
My aim is to get the result as
2.5632
6.346
8.1452
9.45
What Python code should be used for this type of processing?
April 4th, 2012 at 13:29
Sorry,
the contents of seq.txt are
9.45
9SEQ
6.346
4CGF
2.5632
3RES
8.1452
2HAB
and I want only the floating-point values in the result,
i.e.
2.5632
6.346
8.1452
9.45
April 10th, 2012 at 15:08
Hey, I’m having a different problem.
I have two or more folders with lots of files in them. Both folders contain some files that are exactly the same, but the files have different names. I want to use a Python script to match files in these two folders by size, because when the size of the files is the same, I think there is a high enough probability that the files are the same.
The best script would merge these two folders together and delete duplicates, based on name or size, or name and size.
Does anyone know how to write one of these scripts?
That would be really helpful!
Sorry about my bad English.
April 10th, 2012 at 20:00
@Vaishu, you could wrap the conversion to float() into a try..except block, and thus separate purely-numeric values from alphanumeric.
@Paal, you could use a program like ‘fdupes’, which does exactly what you want – de-duplicates the contents of two arbitrary directories.
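The try..except idea for separating numeric lines from alphanumeric ones could be sketched like this (the function name sorted_floats is made up for this example):

```python
def sorted_floats(lines):
    """Return the lines that parse as floats, in ascending order.

    sorted_floats is a hypothetical helper for illustration.
    """
    values = []
    for line in lines:
        try:
            values.append(float(line))  # float() ignores surrounding whitespace
        except ValueError:
            pass  # not a number (e.g. '9SEQ'), so skip it
    return sorted(values)
```

An open file can be passed directly, since iterating over it yields its lines, e.g. `sorted_floats(open('C:/r/seq.txt'))`.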
April 11th, 2012 at 13:41
@Bogdan, Yeah, but I’m on a Windows platform at work. Know of any similar program for Windows? Or a Python script?
April 11th, 2012 at 23:40
I guess fdupes could be compiled/run in Cygwin.
There are tons of similar programs for Windows, and some are even good enough, but I’m not up-to-date on that software, so I cannot advise.
You could write the required python script by, e.g., first creating two dicts of {filename: filesize}, and then comparing them to find identical filesizes (and yield the two filenames). This is a suboptimal approach, but that probably doesn’t matter for low numbers of files; for higher numbers, you would want a different approach. (One more suboptimal idea, but slightly better, would be to populate the 1st dict with {size:name}, and then iterate over the files in the 2nd dir, checking for “size in dict”.)
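A rough sketch of the second idea (a dict keyed by size for the first directory, then a lookup for each file in the second). It deviates slightly from the description by mapping each size to a list of paths, since several files can share a size; the function name same_size_pairs is made up for this example.

```python
import os

def same_size_pairs(dir_a, dir_b):
    """Yield (path_a, path_b) for files in the two directories with equal size.

    same_size_pairs is a hypothetical helper; equal size is only a hint
    that two files might be identical, not proof.
    """
    sizes = {}  # size in bytes -> list of paths in dir_a
    for name in os.listdir(dir_a):
        full = os.path.join(dir_a, name)
        if os.path.isfile(full):
            sizes.setdefault(os.path.getsize(full), []).append(full)
    for name in os.listdir(dir_b):
        full = os.path.join(dir_b, name)
        if os.path.isfile(full):
            for match in sizes.get(os.path.getsize(full), []):
                yield match, full
```

Before deleting anything, it would be safer to confirm each candidate pair byte by byte, for instance with filecmp.cmp(path_a, path_b, shallow=False) from the standard library.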
April 16th, 2012 at 13:51
Hey, I want to change the file path dynamically. E.g.: “/cygdrive/d/Python_Study/Snehal/xyz/1.xml”
Here xyz may be anything, like default, config, etc. I want to read this path depending on the value of xyz. How can I do that?
April 24th, 2012 at 8:52
@adam
This should work:
August 9th, 2012 at 16:37
Hi
Can anybody help me?
I have some images in one folder, and I want to open all the images one by one,
do some processing, and save them. I tried this code, but got the error ‘No such file or directory’.
On the other hand, when I print the list of images, it works nicely.
August 21st, 2012 at 21:37
Hi Guys,
I have a problem similar to the situation currently being discussed here. I have some CAD files in different folders arranged by year, like
C:\CADfile\1990_dwg
C:\CADfile\1991_dwg
C:\CADfile\1992_dwg
C:\CADfile\1993_dwg
C:\CADfile\1994_dwg
C:\CADfile\1995_dwg
etc., up to the year 2012.
The case is this: I want my Python script to iterate through these folders, create a geodatabase under each named folder, and then populate the geodatabase with the feature classes stored in the folder.
I have been able to produce a script that can create the geodatabase for a single file and populate it with feature datasets, but the problem is that I cannot get the script to go through all the folders and do the same thing.
Please, I will be glad to get help on this.
Here is my script so far
November 11th, 2012 at 10:49
Hi, I am very new to programming, especially Python. I have been given a task: I have to define a function which looks like my_func(input_db, output_directory).
It has to be robust and:
Check if there are any matching NS**** or T**** files existing.
Only for those files the function has to be executed!
All results should be stored inside a user-definable directory. The function should print out how many data sets were found and how many data sets were processed.
Can anyone help, please?
December 7th, 2012 at 20:12
Thank you!
May 20th, 2014 at 12:39
October 24th, 2016 at 18:00
Hello, I’m also new to programming.
I’m trying to replace certain characters (like “A” with “U”) in all files in a specific directory.
I’m sure my code is a mess, so don’t be mad at me, but can you tell me how to do this?
Thanks