Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Archive for the 'Bioinformatics' Category

    Bioinformatics is a general term for applying computers and computational or mathematical methods to problems in biology.

    Position Frequency Matrix to Position Weight Matrix (PFM2PWM)

    11th September 2006

    In the course of my current research, I was searching for transcription factor binding sites (TFBS). To perform the search, one needs a position weight matrix (PWM) for each TFBS. The TRANSFAC database of transcription factors (and matrices), however, provides position frequency matrices (PFM), so each PFM has to be converted into a PWM.

    I did find a couple of conversion formulas, but it took quite an effort to figure out which one is correct: I had seen two different variations of the formula. Here I will share what I found.
    Read the rest of this entry »
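
To make the PFM-to-PWM step concrete, here is a minimal Python sketch of one commonly used log-odds conversion with a pseudocount. It is not necessarily the formula the full post settles on. The function name `pfm_to_pwm`, the default pseudocount of 1.0 and the uniform background frequencies are all assumptions made for illustration.

```python
import math

def pfm_to_pwm(pfm, background=None, pseudocount=1.0):
    """Convert a position frequency matrix to log-odds weights.

    pfm maps each base to a list of counts, one count per motif position.
    Assumed formula (illustrative, not taken from the post):
        p(b, i) = (f(b, i) + pseudocount * bg(b)) / (N_i + pseudocount)
        w(b, i) = log2(p(b, i) / bg(b))
    """
    bases = sorted(pfm)
    if background is None:
        # Assumption: uniform background, 0.25 per base for DNA
        background = {b: 1.0 / len(bases) for b in bases}
    length = len(next(iter(pfm.values())))
    pwm = {b: [] for b in bases}
    for i in range(length):
        total = sum(pfm[b][i] for b in bases)  # N_i: sequences counted at position i
        for b in bases:
            prob = (pfm[b][i] + pseudocount * background[b]) / (total + pseudocount)
            pwm[b].append(math.log2(prob / background[b]))
    return pwm

# Hypothetical two-position matrix built from four aligned sites
example_pfm = {"A": [4, 1], "C": [0, 1], "G": [0, 1], "T": [0, 1]}
example_pwm = pfm_to_pwm(example_pfm)
```

The pseudocount keeps zero counts from producing log(0). A position where all bases are equally frequent gets a weight of zero for every base, while a strongly conserved base gets a positive weight.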


    Posted in Bioinformatics | 25 Comments »

    Allow posting duplicate form-name entries with different values

    6th September 2006

    Sometimes, when writing automatic HTML form processors, you need to post several values under the same form field name, e.g.:
    collection_gene = str_chrom_name
    collection_gene = gene_stable_id

    This goes against the RFC on form field design and submission, but the approach is used in practice, for example by Ensembl. I spent some time figuring out how to make HTTP_Client and HTTP_Request submit multiple ‘name-value’ pairs instead of just one (the last one defined, which overrides the previous ones). The solution is extremely simple:
    Read the rest of this entry »
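
As a language-neutral illustration of what such a request body looks like on the wire, here is a small sketch using only Python's standard library. It is not the PHP solution from the full post, and the variable names are chosen for illustration. Passing a list of pairs (rather than a dictionary) is what lets the same field name appear more than once.

```python
from urllib.parse import urlencode, parse_qs

# A list of (name, value) pairs keeps both entries; a dict would keep only one.
fields = [
    ("collection_gene", "str_chrom_name"),
    ("collection_gene", "gene_stable_id"),
]

# Encode as an application/x-www-form-urlencoded request body
body = urlencode(fields)

# Decoding groups repeated names into a list of values
decoded = parse_qs(body)
```

Here `body` is the string `collection_gene=str_chrom_name&collection_gene=gene_stable_id`, and `decoded["collection_gene"]` holds both values in their original order.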


    Posted in Bioinformatics, how-to, PHP, Programming, Science | No Comments »

    Avoiding out of memory fatal error when using HTTP_Client or HTTP_Request

    6th September 2006

    If you fetch large amounts of data (e.g. over 2 MB per request) using HTTP_Client (or HTTP_Request), you may get “out of memory” fatal errors, especially if:

    1. memory_limit is set to the default 8M, and
    2. you process multiple pages using a single HTTP_Client instance that is never reset.

    The problem typically shows up as a fatal error after a number of successful page retrievals. Run with the same parameters, it always fails after the same, or only a slightly varying, number of cycles.

    In my case the problem was that HTTP_Request (a dependency of HTTP_Client) was holding in memory all the previously fetched pages of the current session (the ‘history’ feature). To force it to hold only the most recent page, you need to ‘disable’ history after creating the HTTP_Client or HTTP_Request object instance:

    // create the client; $params and $headers are the defaults applied to every request
    $req = new HTTP_Client($params, $headers);
    // disable history to save memory
    $req->enableHistory(false);

    Hope this helps you.


    Posted in Bioinformatics, how-to, PHP, Programming, Science | No Comments »

    How to compare promoter structure of several genes

    16th August 2006

    I needed to find the transcription factor (TF) binding sites (TFBS) shared by the promoters of interferon-regulated genes. I tried several approaches, from simple sequence-to-sequence comparisons using BLAST and ClustalW alignments to searching each promoter for transcription factor binding sites and then comparing the results to find the common ones.

    But the easiest way was to use Genomatix’s “Gene2Promoter” tool. With it, the whole procedure is extremely simple:
    Read the rest of this entry »


    Posted in Bioinformatics | No Comments »