15th August 2011
Sometimes you need to be sure that no identifier is processed twice: for example, when parsing a file into a database, where the file may contain duplicate records. An obvious solution is to wrap the DB insertion code in a try…except block and handle the duplicate primary key exceptions. Another, sometimes preferable, solution is to keep a set or list of processed IDs internally and check against it before attempting any insertion. So is it a set or a list?
There are already quite a few internet resources discussing “python set vs list”, but probably the simplest yet still elegant way to test that is below.
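As a rough illustration (not necessarily the test from the full entry), here is a minimal sketch that keeps processed IDs in a set and then times a membership check against a list and a set with `timeit`. The `unique_records` helper, the sample records, and the sizes are all made up for the example.

```python
import timeit


def unique_records(records):
    """Yield (record_id, payload) pairs, skipping IDs that were already seen."""
    seen = set()
    for record_id, payload in records:
        if record_id in seen:
            continue  # duplicate ID: skip instead of attempting the insert
        seen.add(record_id)
        yield record_id, payload


# Hypothetical input containing one duplicate ID.
records = [(1, "a"), (2, "b"), (1, "a again"), (3, "c")]
deduped = list(unique_records(records))

# Time a lookup of an ID that is absent, which is the worst case for a list
# because every element has to be compared before giving up.
n = 20_000
ids_list = list(range(n))
ids_set = set(ids_list)
list_time = timeit.timeit(lambda: -1 in ids_list, number=200)
set_time = timeit.timeit(lambda: -1 in ids_set, number=200)
print(f"list: {list_time:.4f}s  set: {set_time:.6f}s")
```

The list lookup scans every element, while the set lookup is a hash probe, so the gap widens as the number of processed IDs grows. Exact timings depend on the machine.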
Read the rest of this entry »
Posted in Notepad, Programming, Python | 1 Comment »
17th May 2011
If you are a Python zealot, and Java doesn’t feel right, but the project you are working on is a Java project – try
- Jython – Python for the Java platform; compile your Python scripts into Java bytecode
- Groovy – not Python, but still a scripting language, and it compiles to JVM bytecode
Posted in Links, Movies, Programming, Python | No Comments »
25th February 2011
This overview presentation is two years old, but still a highly valuable resource: modules and tools mentioned are alive and useful.
I think this is the second presentation by Giovanni I’m embedding (the first one being about GNU make for bioinformatics).
Posted in Bioinformatics, Links, Python, Software | No Comments »
16th February 2011
Imagine you need to install pycassa (which uses easy_install). Here are the two (at most) very simple steps to have it properly debianized and installed on your Debian/Ubuntu system:
- if you don’t have the python-stdeb package: sudo aptitude install python-stdeb
- pypi-install pycassa
That’s it.
Refer to the stdeb README for more information. You will need it if there are dependencies, which stdeb might not resolve automatically.
Before stdeb, it wasn’t exactly trivial to make a .deb from a Python module.
Posted in *nix, how-to, Notepad, Python, Software | 1 Comment »
11th August 2010
DreamPie: the Python shell you’ve always dreamt about!
• Type your code in the lower pane of the window. To execute, press Ctrl+Enter. One-liners can be executed by simply pressing Enter; if you don’t want them executed, press Space and then Enter.
• Use Ctrl+Up and Ctrl+Down to navigate between code segments you’ve already executed. You can write a few letters before pressing Ctrl+Up, and DreamPie will only search through code segments starting with those letters.
• Press Tab or Ctrl+Space to show a list of completions to the current expression. It will also complete file names!
• Your results are stored in variables named _0, _1, and so on.
• Type a function name and press the space key and DreamPie will automatically add parentheses for you!
Posted in Links, Programming, Python | No Comments »
27th January 2010
Sometimes you need to remove all probesets whose expression values fall below the minimal spike-in intensity on an Affymetrix microarray. The reasoning behind this procedure is simple: the minimal-expression spike-ins mark the lower limit of microarray sensitivity, and anything below that limit cannot be reliably quantified. This also means that both the fold-change and the p-value of expression variance will be unreliable for these probesets.
Here’s a simple R script to do just that. It is abundantly commented, and also contains an optional (commented out) fragment which allows the removal of more low-variance, low-intensity probesets.
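Independently of that R script, the filtering idea itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the probeset IDs and intensities are invented, the floor is taken as the smallest intensity observed across all spike-in probesets and samples, and a probeset is kept if at least one sample reaches that floor. The actual script may define these differently.

```python
def filter_by_spike_in_floor(expression, spike_in_ids):
    """Drop probesets whose values never reach the lowest spike-in intensity.

    expression: dict mapping probeset ID -> list of per-sample intensities.
    spike_in_ids: IDs of the spike-in control probesets (assumed to be known).
    """
    # Sensitivity floor: the smallest intensity seen across all spike-ins.
    floor = min(min(expression[pid]) for pid in spike_in_ids)
    # Assumption: keep a probeset if at least one sample reaches the floor.
    kept = {
        pid: values
        for pid, values in expression.items()
        if max(values) >= floor
    }
    return floor, kept


# Hypothetical log-scale intensities for three samples.
expression = {
    "spike_low": [6.1, 5.9, 6.3],
    "spike_high": [12.0, 11.8, 12.2],
    "gene_a": [8.4, 8.9, 7.7],
    "gene_b": [4.2, 5.1, 4.8],
    "gene_c": [5.5, 6.0, 5.2],
}
floor, kept = filter_by_spike_in_floor(expression, ["spike_low", "spike_high"])
print(floor)         # 5.9
print(sorted(kept))  # ['gene_a', 'gene_c', 'spike_high', 'spike_low']
```

Here `gene_b` is dropped because none of its samples reaches the floor, while `gene_c` survives thanks to a single sample at 6.0. A stricter rule, such as requiring the mean to reach the floor, would be an equally reasonable choice.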
Read the rest of this entry »
Posted in Bioinformatics, Programming, Science | No Comments »