Elegantly converting Python list into another list with changed items
30th August 2007
I just had a small problem with my Python script:
1. I have a list of tuples, called records_cache; each tuple looks like this:
(note: for copy-pasting, click the PLAIN TEXT box headers)
- (ensembl_transcript_id, ensembl_gene_id, chrom_start, chrom_end, utr_start, utr_end, sequence, strand)
2. I need to INSERT only some of those values into MySQL table, using the MySQLdb executemany() method. The values I need are utr_start, utr_end, ensembl_gene_id (in exactly this order).
To do this, I need to create another list of tuples, but with shorter 3-item tuples. Let us call the new list genes_update.
Here is the elegant, though probably not the most efficient, solution:
- genes_update = [(onetuple[4],onetuple[5],onetuple[1]) for onetuple in records_cache]]
Let’s see what is happening here:
- First, note the brackets around the r-expression. These brackets force the result to be a list.
- Next, note the construct used, it is from the Python manual: list_element.method() for list_element in list. This construct iterates all items of the list, in our case – of the records_cache list. This construct also applies the list_element.method() to each list method.
- Instead of applying some list_element.method(), I just use tuple indices to select the values I need, and then (using parenthesis) join those values into a tuple again.
- So for each tuple in the records_cache list, another (shorter) tuple is returned, and those shorter tuples are joined into the new list, which is assigned to the genes_update list.
After this conversion, I can happily write that list of tuples into MySQL table, using cursor.executemany():
(note: error_log() is a small custom function for appending to script’s error.log file)
- try:
- cursor.executemany("""UPDATE genes SET
- 5utr_start = %s,
- 5utr_end = %s
- WHERE ensembl_gene_id = %s""", genes_update)
- db.commit()
- except MySQLdb.Error, (errno, strerror):
- db.rollback()
- error_log("Error %d: %s (%s)\n" % (errno, strerror, insertq))
- return False # failed to update
- return True # updated
Comments are welcome, especially on the efficiency improvement.
October 6th, 2007 at 4:51
Thank you for sharing!