Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    • Archives

    • Recent comments

    Archive for the 'Programming' Category

    MySQL: INSERT IF NOT EXISTS syntax

    18th October 2007

    To start: as of the latest MySQL, syntax presented in the title is not possible. But there are several very easy ways to accomplish what is expected using existing functionality.

    There are 3 possible solutions: using INSERT IGNORE, REPLACE, or INSERT … ON DUPLICATE KEY UPDATE.

    Imagine we have a table:

    1. CREATE TABLE `transcripts` (
    2.  `ensembl_transcript_id` varchar(20) NOT NULL,
    3.  `transcript_chrom_start` int(10) unsigned NOT NULL,
    4.  `transcript_chrom_end` int(10) unsigned NOT NULL,
    5.  PRIMARY KEY  (`ensembl_transcript_id`)
    6. ) ENGINE=InnoDB DEFAULT CHARSET=latin1;

    Now imagine that we have an automatic pipeline importing transcripts meta-data from Ensembl, and that due to various reasons the pipeline might be broken at any step of execution. Thus, we need to ensure two things: 1) repeated executions of the pipeline will not destroy our database, and 2) repeated executions will not die due to ‘duplicate primary key’ errors.

    Method 1: using REPLACE
    Read the rest of this entry »

    Share

    Posted in Programming | 46 Comments »

    GoDaddy: undocumented 20-second CPU time maximal execution limit? (python, ELF, etc)

    16th October 2007

    Today, setting up a relatively serious (in CPU resources needed) web-system, I ran into a weird problem of python scripts ending prematurely. After some investigation, it looked like any process which uses up more than 20 seconds of CPU time, is automatically killed. To verify this, I wrote an infinite loop in C,

    int main () {
    unsigned int i;

    for (i = 0; i < 2 ; i++ ) { i = 0; } return 0; } [/c] compiled it and executed several times on the GoDaddy shared hosting server. I did observe the program running for the maximum of 20 seconds of CPU time, not a second more. Please note, that 20 seconds of CPU time can be much more of “real” time, if the script isn’t using 100% of CPU, which often the case for shared hosting. Thus if you have in your php.ini max_execution_time set to, say, 60 seconds, your php script may actually execute as long as one minute; but I’m pretty sure that if your script has lots of CPU-intensive procedures, then as soon as it uses 20 seconds of CPU time, it will be terminated (however, this statement still needs checking – anyone?). To verify, I also created a cron job with the same file. It ran for 30 seconds CPU time. Strangely, this behaviour is not documented anywhere. This limit may also explain a number of other problems, if you have heavy web-applications: they just might be killed before they are finished, causing errors. I do understand the reason for this limitation, and am sure similar limitations exist in other shared hosting environments. The only important thing here is that this limit should have been documented and even put upfront somewhere in the hosting plans descriptions. I also wonder if the limit is the same for all godaddy shared hosting plans, or if it differs. 20 seconds when executed from PHP, and 30 seconds when executed as a cron job were observed on the Deluxe Linux Hosting plan. Extensions, additions and comments are welcome.

    Share

    Posted in *nix, Misc, Programming, Web | 22 Comments »

    Useful Python looping techniques

    26th September 2007

    These are all excerpts from the Python documentation.

    To synchronously and simultaneously loop over two sequences:

    1. questions = ['name', 'quest', 'favourite colour']
    2. answers = ['Lancelot', 'the holy grail', 'blue']
    3.  
    4. for q, a in zip(questions, answers):
    5.     print 'What is your %s?  It is %s.' % (q, a)

    To loop over a sequence with both key and value:
    Read the rest of this entry »

    Share

    Posted in Programming, Python | No Comments »

    How to sort Python dict (dictionary)

    26th September 2007

    Sample script (copypasted from Well House Consultants training course):
    click the PLAIN TEXT header for copy-pasteable version

    1. #!/usr/local/bin/python
    2.  
    3. author = {"php":"Rasmus Lerdorf",\
    4.     "perl":"Larry Wall",\
    5.     "tcl":"John Ousterhout",\
    6.     "awk":"Brian Kernighan",\
    7.     "java":"James Gosling",\
    8.     "parrot":"Simon Cozens",\
    9.     "python":"Guido van Rossum"}
    10.  
    11. langs = author.keys()
    12. langs.sort()
    13.  
    14. for language in langs:
    15.     print language,"is the child of",author[language]

    You can also define the Python ksort() function similar to that found in PHP:
    Read the rest of this entry »

    Share

    Posted in Programming, Python | 1 Comment »

    PHP proxy

    19th September 2007

    PHP proxy is simple but good. I converted it into a proxy-function for one of my projects.

    Do pay attention to the comments, especially these two:

    I had issues with this script (and others) returning 0 for the bytesTotal in flash. Basically, the Content-Length header was absent from the response. By simply adding
    header(“Content-length: “.strlen($response)) before the echo, it resolved the issue. I don’t know if there is a more appropriate fix to account for character encoding, etc, but it seems to work.

    @Schimmi: Well, if you can add some checks there (like who is referring your script) and allow the access to whitelisted clients (served from your domain)… I think, you can totally make it used applications from same-domain….So it would not be open to world. Yeah above script doesn’t have those things.

    Share

    Posted in Links, Notepad, PHP, Programming | No Comments »

    MySQL – Python: good MySQLdb tutorial (examples)

    7th September 2007

    Andy Dustman (used to blog at dustman.net) gave a presentation on Python and MySQL at the MySQL Users Conference 2005, Santa Clara, CA. The presentation is an excellent collection of examples for those who use the MySQLdb Python module.
    Read the rest of this entry »

    Share

    Posted in Links, Notepad, Programming, Python | No Comments »

    Date and time in Python scripts

    7th September 2007

    Here’s an excellent description of various methods to handle date and time in Python scripts.

    Share

    Posted in Links, Notepad, Programming, Python | No Comments »