Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc


Python: iterate (and read) all files in a directory (folder)

12th August 2007

I found a sample of Python code that iterates through all the files in a specified folder (directory), with the ability to use wildcards (*, ?, and [ ]-style ranges). Below is a portion of code from a working script:

PYTHON:
    import os, glob

    path = 'sequences/'
    for infile in glob.glob( os.path.join(path, '*.fasta') ):
        # print became a function in Python 3.0; the parenthesised form
        # with a single argument also runs under Python 2
        print("current file is: " + infile)

Thanks Dt for mentioning print being promoted from a statement to a function.

The only reason to use the os.path.join() call is to keep the script portable across platforms: different systems use different path separators, and a hard-coded separator may stop the script from working under a different OS.
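
To see what os.path.join() actually produces, here is a minimal sketch that builds the same pattern with the POSIX and Windows flavours of os.path, both of which ship with Python and can be imported on any OS. The variable names are mine, chosen for illustration only.

```python
import ntpath      # Windows flavour of os.path, importable on any OS
import os
import posixpath   # POSIX flavour of os.path, importable on any OS

folder, wildcard = 'sequences', '*.fasta'

# os.path.join picks the separator of the OS the script is running on
native_pattern = os.path.join(folder, wildcard)
posix_pattern = posixpath.join(folder, wildcard)
windows_pattern = ntpath.join(folder, wildcard)

print("this OS:  " + native_pattern)
print("POSIX:    " + posix_pattern)     # sequences/*.fasta
print("Windows:  " + windows_pattern)   # sequences\*.fasta
```

Whichever of the two forms matches your system is the one glob.glob() receives in the script above.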

The Python docs mention that there is also iglob(), which returns an iterator. On directories with a very large number of files it saves memory by yielding a single result per iteration rather than building the whole list of files, as glob() does.
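
Since the title promises reading as well as iterating, here is a rough sketch that combines iglob() with opening each matched file. The helper name first_lines and the choice to read only the first line are my assumptions for illustration; adapt the body of the loop to whatever you need to do with each file.

```python
import glob
import os

def first_lines(folder, pattern='*.fasta'):
    """Yield (path, first line) for each file in folder matching pattern."""
    # iglob hands back matches one at a time instead of building a list
    for infile in glob.iglob(os.path.join(folder, pattern)):
        with open(infile) as handle:
            yield infile, handle.readline().rstrip('\n')

# 'sequences' is the folder from the example above; nothing is printed
# if the folder is missing or holds no .fasta files
for name, line in first_lines('sequences'):
    print(name + ": " + line)
```

Swapping glob.iglob for glob.glob here gives the same results; the difference only matters when the list of matches is large.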


13 Responses to “Python: iterate (and read) all files in a directory (folder)”

  1. seun Says:

    This stuff isn't working on my Windows system

  2. Bogdan Says:

    It should. You might want to modify the example to suit your exact needs: it assumes there is a 'sequences' folder with at least one .fasta file. If that's not true then, evidently, you are not going to see this script work, be it Windows or Linux.

  3. Dt Says:

    Works just fine for me. The only important change I had to make to the code was turning print into a function, because I'm using Python 3.0. I also set it to read files with *all* extensions.

    import os, glob

    path = 'insert your own path you lazy bastards '

    for infile in glob.glob( os.path.join(path, '*.*') ):
        print("current file is: " + infile)

  4. Bogdan Says:

    Dt, thanks, I've updated the code.

  5. Mike Says:

    import os, glob

    def dir(path):
        for infile in glob.glob( os.path.join(path) ):
            print "current file is: " + infile

    path = dir(raw_input("Enter the path: "))

  6. Ferralll Says:

    Thank you very much...
    This was exactly what I was looking for!

  7. Richard Says:

    marvellous

  8. Kris Says:

    import os, glob
    path = './'
    for infile in glob.glob( os.path.join(path, '*.*') ):
        print("current file is: " + infile)

    #lists all files in directory script is in

  9. Dan Says:

    Is there a way to change this script so that it also runs through sub-directories under the given path name?

  10. Bogdan Says:

    Make that code into a function - e.g. scan_dirs(path) - and add a single line of code to it (pseudocode below):

    if is_directory(infile): scan_dirs(infile)

    This will do exactly what you want.

  11. Dan Says:

    Bogdan,

    Thanks for the help. I'm still not getting the code to look at the directories within the path. Here's my code, it still only looks at the files under the initial path.

    
    def scandirs(path):
        for currentFile in glob.glob( os.path.join(path, '*.*') ):
            if os.path.isdir(currentFile):
                scandirs(currentFile)
            print "processing file: " + currentFile
    
    scandirs('XML/')
    

  12. Bogdan Says:

    Dan,

    script below seems to work perfectly for me:

    
    import os, glob
    
    def scandirs(path):
        for currentFile in glob.glob( os.path.join(path, '*') ):
            if os.path.isdir(currentFile):
                print 'got a directory: ' + currentFile
                scandirs(currentFile)
            print "processing file: " + currentFile
    
    scandirs('Desktop')
    

    Basically, I've changed the '*.*' wildcard to just '*'.

  13. Dan Says:

    Ahh... My *.* as opposed to a * had it so it wasn't looking at folders, thus the problem. Thanks again!
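
The sub-directory question from the comments can also be handled without explicit recursion. The sketch below is not the code anyone in the thread used: it swaps glob for os.walk() plus fnmatch, both from the standard library, and find_files is a hypothetical helper name. It also shows why '*.*' skipped folders: that pattern only matches names containing a dot.

```python
import fnmatch
import os

def find_files(root, pattern='*'):
    """Yield paths of files under root (recursively) whose name matches pattern."""
    # os.walk visits root and every sub-directory beneath it
    for dirpath, dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            yield os.path.join(dirpath, name)

# '*.*' needs a literal dot in the name, so a folder called 'XML' is
# skipped by it, while '*' matches any name
print(fnmatch.fnmatch('XML', '*.*'))   # False
print(fnmatch.fnmatch('XML', '*'))     # True

# 'XML' is the folder name from the comments; '*.xml' is an assumed pattern
for path in find_files('XML', '*.xml'):
    print("processing file: " + path)
```

Unlike the glob-based version, this only reports files, so directories never show up as "processing file".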
