Comments on: Simple substring counting script in Python

By: Bogdan

Bogdan — Thu, 07 Jan 2010 16:25:46 +0000

Nice, thanks.

By: pete

pete — Thu, 07 Jan 2010 15:26:55 +0000


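def getSubStringPositionsList(text=None, subStr=None):
    """Return the start positions of non-overlapping matches of subStr in text."""
    # An empty subStr would match at position 0 forever, so bail out early.
    if not text or not subStr:
        return []
    positions = []
    posCount = 0  # offset of the remaining slice within the original text
    while text:
        pos = text.find(subStr)
        if pos == -1:
            break
        positions.append(pos + posCount)
        # Drop everything up to and including this match, then keep searching.
        text = text[pos + len(subStr):]
        posCount += pos + len(subStr)
    return positions


def getSubStringCount(text=None, subStr=None):
    return len(getSubStringPositionsList(text, subStr))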
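
For comparison, here is a rough sketch of the same idea that passes a start offset to str.find instead of re-slicing the string after every hit. The name find_positions is made up for this illustration. Like the snippet above, it only reports non-overlapping matches, so the length of its result should agree with the built-in str.count.

```python
def find_positions(text, sub):
    """Return start indexes of non-overlapping occurrences of sub in text."""
    # Hypothetical helper for illustration; not part of the original script.
    if not text or not sub:
        return []
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Resume right after the end of this match, so matches never overlap.
        start = text.find(sub, start + len(sub))
    return positions


sample = 'abababa'
print(find_positions(sample, 'aba'))
print(len(find_positions(sample, 'aba')) == sample.count('aba'))
```

Searching from an offset avoids copying the tail of the string on each match, which may matter for long lines with many hits.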

By: Bogdan

Bogdan — Mon, 12 Oct 2009 09:38:03 +0000

Thanks for the contribution!

* please note that the script listed is 3+ years old; Python has certainly changed since then
* your argument handling is less flexible than an options parser (and less informative to the end user)
* your main loop is indeed several lines shorter, but mostly because the with statement (imported from __future__) replaces the explicit open() and file.close() calls, and because it skips the ‘--verbose’ option processing that dumps extra data to the terminal

Overall, I find your submission useful, but not quite as short and simple as your “That’s not simple :)” implied.
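
To make the trade-off in the points above concrete, here is a minimal sketch of a version that keeps an options parser and a verbose flag while still using the with statement. It is not the original script, which is not shown in this thread. It assumes the standard-library argparse module (newer than this discussion), and apart from --verbose every name in it (path, substring, count_in_lines, build_parser) is an assumption made for illustration.

```python
import argparse


def build_parser():
    # Only --verbose is mentioned in the thread; the positional names
    # 'path' and 'substring' are assumptions made for this sketch.
    parser = argparse.ArgumentParser(
        description='Count occurrences of a substring in a file.')
    parser.add_argument('path', help='file to scan')
    parser.add_argument('substring', help='text to look for')
    parser.add_argument('--verbose', action='store_true',
                        help='print every matching line')
    return parser


def count_in_lines(lines, substring, verbose=False):
    """Return (lines_read, lines_matched, total_hits) for an iterable of lines."""
    lines_read = lines_matched = total_hits = 0
    # Number lines from 1; lines_read keeps the last number seen.
    for lines_read, line in enumerate(lines, 1):
        hits = line.count(substring)
        if hits:
            lines_matched += 1
            total_hits += hits
            if verbose:
                print('%d\t%s' % (lines_read, line.rstrip('\r\n')))
    return lines_read, lines_matched, total_hits


def main(argv=None):
    args = build_parser().parse_args(argv)
    # The with statement closes the file even if counting raises.
    with open(args.path) as f:
        read, matched, hits = count_in_lines(f, args.substring, args.verbose)
    print('Lines read from file:', read)
    print('Lines with substring found:', matched)
    print('Total substrings detected:', hits)
```

Calling main() with no arguments reads the command line; keeping the counting in its own function means it can be exercised on a plain list of strings without touching a file.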

By: spiderlama

spiderlama — Mon, 12 Oct 2009 04:56:08 +0000

That’s not simple


from __future__ import print_function, with_statement
import sys

if __name__ == '__main__':
    # Expect exactly two arguments: the file path and the substring.
    if len(sys.argv) != 3:
        sys.exit('invalid arguments. usage: file str')
    filePath = sys.argv[1]
    substr = sys.argv[2]
    linesRead = 0  # stays 0 for an empty file
    linesFound = 0
    substrFound = 0
    with open(filePath) as f:
        # Number lines from 1 so the output matches editor line numbers.
        for lineNumber, lineString in enumerate(f, 1):
            linesRead = lineNumber
            count = lineString.count(substr)
            if count:
                linesFound += 1
                substrFound += count
                print(filePath + ':' + str(lineNumber) + '\t' +
                      lineString.rstrip('\r\n'))
    print('Lines read from file:', linesRead)
    print('Lines with substring found:', linesFound)
    print('Total substrings detected:', substrFound)