27th January 2010
Sometimes there is a need to remove all probesets whose expression values are below the minimal spike-in intensity on an Affymetrix microarray. The reasoning behind this procedure is simple: minimal-expression spike-ins represent the lower limit of microarray sensitivity, and anything below that limit cannot be reliably quantified – which also means that both the fold-change and the p-value of expression variance will be unreliable for these probesets.
Here’s a simple R script to do just that. It is abundantly commented, and also contains an optional (commented out) fragment which allows the removal of more low-variance, low-intensity probesets.
Posted in Bioinformatics, Programming, Science | No Comments »
13th November 2009
A nice report on the cost of brute-forcing variable-length and variable-complexity passwords using cloud computing services (e.g. Amazon’s EC2). There’s a kind of a tutorial in their previous post.
A slow DoS attack launched from just one computer against a number of web servers, including Apache: slowloris. There is a solution for Apache, packaged for RedHat and also available for Debian.
Finally, there’s the Go programming language. The most inspiring promise to me personally is the ease of parallelizing execution with the language’s built-in syntactic constructs. That is something highly desired. Also, I like that it is a compiled language. However, it might be 10%-20% slower than pure C. Let’s see how it grows.
Posted in Links, Misc, Programming, Security, Web | No Comments »
25th October 2009
Production: see http://www.howtoforge.com/how-to-set-up-apache2-with-mod_fcgid-and-php5-on-debian-etch – it is for Debian Etch (which is old-stable), but many of the steps apply equally well to Debian Lenny (current-stable). Also, this is a very basic guide: if you are going to host multiple sites for multiple clients, you will most definitely need some hosting control panel.
Development: see http://www.ruzee.com/blog/2009/01/apache-virtual-hosts-a-clean-setup-for-php-developers. This setup works very well, unless you need to create several virtual hosts every day – in which case the necessary steps could be partially scripted.
Posted in Links, Notepad, PHP, Programming, Software | No Comments »
10th October 2009
Regular expressions (regexps) are powerful indeed. But debugging non-trivial regexps is a burden even if you understand how regexps work, and remember most (if not all) regexp syntax.
Various tools exist to ease this task. This post was inspired by redet’s comparison of regexp helper tools – reading only that could be sufficient if you’re going to try the mentioned tools yourself. Otherwise, read on.
Posted in *nix, Notepad, Programming, Software | No Comments »
20th August 2009
Screem HTML/XML editor has tag-specific auto-complete, and is a nice editor for web-developers (at least as long as Quanta is not available for Debian testing).
However, version 0.16.1 is very unstable, and dies with
***MEMORY-ERROR***: screem[5527]: GSlice: assertion failed: sinfo->n_allocated > 0
As a workaround (initially suggested for the highly similar Firestarter crashes), try running screem with this command:
G_SLICE=always-malloc screem
Too bad the last development version of Screem is dated March 2006.
Posted in *nix, Software, XHTML/CSS | No Comments »
13th July 2009
The two graphs below show CPU and RAM use while a program went wild between 23:17 and 23:41 (24+ minutes of server downtime). The program was run as non-root; it simply consumed all the memory it could. It was eventually killed by the kernel, so the server started responding again without any intervention – which would have been hard to perform anyway, because none of the services (including ssh) were responding during the downtime.


If you happen to be developing a C/C++ program, do use mtrace and valgrind: they are huge helpers against problems like the one shown in the graphs.
Posted in *nix, Programming, Software | No Comments »
11th June 2009
There is no way I'm aware of to do what the title says. However...
I'm sure you are aware that the floating-point representation in any programming language is limited by the precision of the underlying binary format. In other words, most values cannot be represented exactly - there is always some finite precision associated with the float you are working with. The simplest example is the difference in precision between the float and double types in C.
Suppose I have the following code fragment:
C:
if ( result.score >= input->raw_cut_off )
Both result.score and input->raw_cut_off are of type float, and can take positive and negative values. When they are compared with the greater-than-or-equal ( >= ) operator, the condition is not always true when you would expect it to be - for the precision reasons briefly mentioned above.
As I already said, there is no precision specification for equality operators in C. But it is quite simple to "invent" precision specification; e.g. if I wanted to test for equality only, I could write
C:
if ( fabsf( result.score - input->raw_cut_off ) < 0.000001 )
In this example, I'm effectively asking for an absolute tolerance of 0.000001 (roughly 6 decimal places) in the equality comparison of floating-point values. Note that if you replace 0.000001 with the actual precision limit of the floating type you are using (e.g. FLT_EPSILON from <float.h>, which applies to values near 1.0), you will be "exactly" comparing floating-point numbers - up to that type's precision, of course.
The very first example, with the >= operator, can be rewritten as
C:
if ( result.score > ( input->raw_cut_off - precision) )
where precision is exactly what it is named, e.g. precision = 0.000001.
Posted in Programming, how-to | No Comments »
8th June 2009
Prompted by a bug in a complex and unfamiliar PHP web application with heaps of custom tweaks by other programmers, I decided to try a more professional approach to PHP programming and debugging than the standard var_dump() and family.
As a result, I'm now using Eclipse PDT with Xdebug and Xdebug Helper (Firefox extension). Now I don't understand how I used to debug my PHP programs before!
After proper configuration (I'm using local Apache, but it is also possible to debug remotely), my work flow is rather simple:
- use my web-app as usual, e.g. tweaking and testing here and there
- if something server-side goes wrong: click the XDebug helper icon in Firefox, and perform some server-request action (e.g. load a page)
- debugging is started in Eclipse PDT, where I can step through the code, set breakpoints, and examine all variables
- as soon as the problem is fixed - click the XDebug helper icon again to continue using the site normally (w/o invoking the debugger)
It takes some time to get used to, but then it's a breeze.
Some advice:
- don't use apt-get/aptitude to install Eclipse; it will be much easier both in the short and long run to use an all-in-one package from the Eclipse PDT site; all you need to do is download, extract, and run!
- before actually starting to do anything, tweak the eclipse.ini file by increasing the heap size from 40 MiB (default) to some larger value (I used 128 MiB). If you don't do this, then at some point your debugging will become painfully sloooow, and then you'll start getting tons of "out of heap memory" errors, each one suggesting that you quit Eclipse immediately
- install XDebug with apt-get/aptitude; this worked perfectly, and it provides /etc/php5/conf.d/xdebug.ini, so there is no need to mess with php.ini
- do read XDebug guide for PDT 2.x (I'm assuming you got the 2.x version); it should be the only document you will really need to configure everything
I only wish Eclipse was faster - that is, written not in Java but e.g. C or C++.
Posted in Links, PHP, Programming, Software | 3 Comments »
5th June 2009
Giovanni Dall’olio has recently posted a presentation on using make.
Although it has "bioinformatics" on the title page, this is a good and very easy to understand make intro.
Original post is here.
Posted in Bioinformatics, Links, Programming | 1 Comment »
30th May 2009
We are pleased to announce the release of GNAT GPL 2009, the Ada Toolset for Academic users and FLOSS developers. It introduces many new features including:
- Ability to generate byte code for the JVM
- Improved support for the .NET Framework
- Addition of the Ada-Java Interfacing Suite (AJIS) that enables native Ada code to be called from Java:
http://www.adacore.com/2008/06/17/ada-java_interfacing_suite
- Availability on the Mac OS X (64 bit) platform
- Automatic C/C++ binding generators
- Addition of the GNAT Component Collection (GNATcoll) providing new APIs that can be extended by the user community:
http://www.adacore.com/2008/06/17/gnat_component_collection
GNAT GPL 2009 comes with version 4.3.1 of the GNAT Programming Studio IDE and GNATbench 2.3, the GNAT plug-in for Eclipse.
It is available for the GNU Linux, Mac OS X (64 bit), .NET, JVM and Windows platforms.
GNAT GPL 2009 can be downloaded from the "Download" section on the new Libre website:
https://libre.adacore.com.
I wonder if the new JVM bytecode generation feature was frequently requested by Ada developers, or is just a move towards popularizing Ada as a highly capable programming language. Either way, it's good.
Hopefully, I will find the time and a matching project to finally learn Ada properly - for a couple of years now I have believed that Ada is a very good programming language. And the D language is better than C and C++ (holy war, anyone?)
Posted in Ada, Links, Programming | No Comments »