Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Archive for the 'Software' Category

    How to update a multisite Drupal 6/7 installation using Drush

    25th August 2014

    There are quite a lot of posts on how to do this, but mine differs a little, so I’m saving it for my own future reference, and also for the benefit of a wider audience.

    I am updating a multisite Drupal 6 installation. To the best of my knowledge, the only difference for Drupal 7 is that instead of the site_offline D6 variable the maintenance_mode variable is used in D7.

    On Debian stable and later, you can sudo aptitude install drush and then just use it immediately after that.

    Note: I recommend su webuser (or sudo -s followed by sudo -s -u webuser) before you run any non-testing drush commands, where webuser is the user which owns your web-exposed files (e.g. Debian’s default is, I think, www-data). I’ve seen a lot of recommendations to run drush as a super-user, but that does not make sense, and may actually cause problems with file ownership.

    One last thing before we start: if your drush seems to work fine but hangs when untarring modules – check this solution.
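As a rough sketch of what such an update run can look like (not necessarily the exact procedure from the full entry), the helpers below only print the drush commands, so they can be reviewed and then run by hand as the web user. The function names offline_var and update_plan are made up for this illustration; the sketch assumes drush is on the PATH and that the @sites alias covers every site of the installation.

```shell
#!/bin/sh
# Sketch only: prints a plan, does not touch any site.

# Name of the "offline" variable: site_offline on D6, maintenance_mode on D7.
offline_var() {
    case "$1" in
        6) echo site_offline ;;
        7) echo maintenance_mode ;;
        *) echo "unsupported Drupal major version: $1" >&2; return 1 ;;
    esac
}

# Print the drush commands for one update run of Drupal major version $1.
update_plan() {
    var=$(offline_var "$1") || return 1
    echo "drush @sites vset $var 1 --yes"   # take all sites offline
    echo "drush @sites pm-update"           # update code, then pending DB updates
    echo "drush @sites vset $var 0 --yes"   # bring the sites back online
}
```

Running update_plan 6 prints three lines; the commands are printed rather than executed because pm-update asks for confirmation and is better run interactively.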

    Read the rest of this entry »

    Posted in *nix, Drupal, how-to, Notepad, PHP, Programming, Software, Web | 1 Comment »

    drush pm-update fails: tar hangs when extracting *.tar.gz module archives from drupal.org

    25th August 2014

    Drush is awesome, especially for updating multisite Drupal installations.
    I started using it only a few days ago and immediately hit a problem, for which I did find a workaround.

    Symptoms

    • running drush @sites pm-update results in normal execution until after answering ‘y[es]’; then drush seems to hang indefinitely (I haven’t waited beyond about 10 minutes, so maybe it does produce an error after a long while);
    • running the same command with --debug shows that drush hangs when trying to untar the downloaded module.tar.gz archive; there are no errors/warnings, it just hangs with no CPU usage;
    • trying to untar any of the modules downloaded from drupal.org manually is also unsuccessful: tar -xzvf module.tar.gz seems to do nothing, it also hangs with zero CPU usage/time and no warnings/errors;
    • interestingly, if I create some test.tar.gz locally, tar does happily extract that;
    • finally, running strace tar -xzvf module.tar.gz shows a number of unexpected lines, such as references to NSS and libnss files (I am only showing some of the lines of strace output, including the last line):

      open("/etc/nsswitch.conf", O_RDONLY) = 4
      read(4, "# /etc/nsswitch.conf\n#\n# Example"..., 4096) = 683
      open("/lib/x86_64-linux-gnu/libnss_nis.so.2", O_RDONLY) = 4
      open("/lib/x86_64-linux-gnu/libnss_files.so.2", O_RDONLY) = 4
      open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 4
      open("/usr/lib/x86_64-linux-gnu/libnss_mysql.so.2", O_RDONLY) = 4
      open("/etc/group", O_RDONLY|O_CLOEXEC) = 4
      open("/etc/libnss-mysql.cfg", O_RDONLY) = -1 EACCES (Permission denied)
      open("/etc/libnss-mysql-root.cfg", O_RDONLY) = -1 EACCES (Permission denied)
      futex(0x7fd0816e8c48, FUTEX_WAIT_PRIVATE, 2, NULL
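Since the hang is silent (no CPU usage, no error), it helps to put a time limit on the extraction when reproducing it. The helper below is a hypothetical sketch, not part of drush or tar: it relies on GNU coreutils' timeout, which exits with status 124 when the limit is reached, and the name hangs is made up here.

```shell
#!/bin/sh
# hangs SECONDS COMMAND [ARGS...]
# Succeeds only if COMMAND was still running when the time limit expired.
hangs() {
    limit=$1; shift
    if timeout "$limit" "$@" >/dev/null 2>&1; then
        return 1            # finished successfully within the limit
    else
        [ $? -eq 124 ]      # 124: limit hit; any other code is an ordinary failure
    fi
}

# Example (module.tar.gz stands for any archive downloaded from drupal.org):
#   if hangs 30 tar -xzf module.tar.gz; then
#       echo "tar is stuck; inspect it with: strace tar -xzvf module.tar.gz"
#   fi
```

A 30-second limit is an arbitrary choice; a healthy extraction of a module archive normally finishes far sooner.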

    Read the rest of this entry »

    Posted in *nix, Drupal, Notepad, Software | No Comments »

    How to cite PHYLIP

    10th January 2014

    The official PHYLIP FAQ does suggest a few ways to cite the software, but I believe that the best citation is the one mentioned in the Wikipedia PHYLIP article: the PubMed reference for PMID 7288891. This PubMed citation seems the best, because

    • it does mention the software tool implementing the maximum likelihood approach,
    • it is likely the earliest mention of the PHYLIP software (which was distributed since around 1980),
    • it refers to a journal indexed by pubmed, and
    • according to Google Scholar, it has already been cited over 6660 times :)

    Posted in Links, Science, Software | No Comments »

    Brief comparison: Dropbox vs BitTorrent Sync vs AeroFS vs SparkleShare

    24th November 2013

    Right now I’m mostly using Dropbox, and recently started using BitTorrent Sync to sync my music collection between all the PCs and my backup server, as well as for sharing larger files at work (thanks to direct LAN connections, this is much faster with BTSync than with Dropbox, which has to first upload the file to a Dropbox server). I’m also considering syncing a TrueCrypt container of my photo archive using BTSync. SparkleShare is potentially interesting, but given my tendency to move to free code-hosting services, I do not yet see a need for it.

    Below is a short summary table I’ve used to compare available solutions. Feel free to contribute to the table in the comments – I’ll update the post, then.

    Read the rest of this entry »

    Posted in Links, Software | No Comments »

    Alternatives to GNU make

    19th October 2013

    Right now, when I see that I often have to repeat or retype some sets and sequences of commands, I try to wrap them up into some kind of a script, every time choosing the most appropriate language – shell when I need to start lots of existing command-line tools, Python when there’s some data handling and processing involved, and R when I’m invoking commands from R packages. So far I have been avoiding the fairly popular makefile-based approach to automating pipelines and workflows which rely heavily on existing tools. However, being curious, I’ve compiled a short list of modern make-like alternatives, to possibly explore… sometime later…

    • First comes make itself – the oldest and the most widely used software build tool. Stable and powerful. Still, even people who are used to make have some gripes about it. The most detailed list of gripes is probably here.
    • SCons is a build tool written in Python. I guess I like that “configuration files are Python scripts” – maybe knowing Python is enough to use SCons, which makes SCons a better choice than make for me. SCons seems to have gained some support (scroll down for comments/discussion). There were some doubts about SCons performance (1, 2, and 3); not sure where SCons is at right now in that regard.
    • waf, a Python-based framework for configuring, compiling and installing applications.
    • pyDoIt is a Python automation tool. It seems to use Python syntax. It aims at bringing the power of build-tools to execute any kind of task, where a task describes some computation to be done (actions), and contains some extra meta-data. Based on the description alone, I’m quite intrigued! I wonder if anyone has already worked with pyDoIt and can share experiences?…
    • Rake – Ruby make – is a simple build program with capabilities similar to those of make. I have seen a lot of positive feedback about this one – mostly regarding simplicity of use. Still, [py]DoIt so far looks more attractive to me personally.
    • Ruffus is a lightweight python module for running computational pipelines. Sounds like some good competition to [py]DoIt!
    • Anduril is an open source component-based workflow framework for scientific data analysis. Sounds promising, though the latest downloadable version is over 400 MB… It probably already contains a bunch of binaries and maybe even data and complete workflows for data analysis. Probably worth a look, but may turn out a little overweight for simple pipelining.
    • snakemake is a scalable bioinformatics workflow engine. I get the feeling that Python is truly dominating the pipelines/workflows world: snakemake, as even the name suggests, is in Python, too. The front-page example is so simple and clear, that snakemake immediately pushes DoIt down from the 1st place! Awesome.
    • Paver is yet another Python-based software project scripting tool along the lines of Make or Rake, designed to help out with repetitive tasks with the convenience of Python’s syntax. Sounds similar to DoIt. I have no idea how they actually compare to each other.
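What all of these tools add on top of a plain script is, at the core, a staleness check: a step is re-run only when its output is missing or older than its inputs. A minimal shell sketch of that one idea follows; the function name is_stale and the file names are made up for illustration, and the -nt test is not strictly POSIX, though bash, dash and ksh all support it.

```shell
#!/bin/sh
# is_stale TARGET SOURCE...
# Succeeds when TARGET is missing or older than at least one SOURCE.
is_stale() {
    target=$1; shift
    [ -e "$target" ] || return 0
    for src in "$@"; do
        [ "$src" -nt "$target" ] && return 0
    done
    return 1
}

# Usage (hypothetical files): recount only when the input has changed.
#   is_stale counts.txt reads.txt && wc -l reads.txt > counts.txt
```

Real build tools also track dependencies between steps and can run independent steps in parallel, which is where a hand-rolled script stops being convenient.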

    That is it for now.

    What were your experiences with automating repetitive tasks and building simple pipelines?

    Posted in *nix, Notepad, Programming, Software | No Comments »

    GUIs for R

    17th October 2013

    I’ve tried [briefly] Cantor (which also supports Octave and KAlgebra as backends), rkward, deducer/JGR, R Commander, and RStudio.

    My personal choice was RStudio: it is good-looking, intuitive, and easy to use, yet powerful.

    The next step would be using some R equivalent of IPython’s excellent Mathematica-like Notebook web interface…

    Posted in *nix, Notepad, Programming, Science, Software | No Comments »

    The favourite file compressor: gzip, bzip2, or 7z?

    17th October 2013

    Here comes a heap of assorted web-links!

    I had personally settled on using pbzip2 for these simple reasons:

    • performance scales quasi-linearly with the number of CPU cores (until one hits an I/O bottleneck);
    • when an archive is damaged, you are only guaranteed to lose the damaged block(s) of size 100-900 KiB – the remaining information might be salvageable.

    Compared to pbzip2, neither gzip nor 7z (lzma) offers quasi-linear speedups proportional to the number of CPU cores.
    pigz, the parallel gzip, does parallelize compression, but gzip does not compress as well as bzip2, and pigz decompression is not parallel, unlike in pbzip2.
    7z is multi-threaded, but it tops out at using 2 CPU cores (see links below for tests).

    pbzip2 is also quite a good choice for FASTQ data files: even if a few blocks get lost due to data corruption, this should not noticeably affect the entire dataset.
    Specialized tools for FASTQ compression do exist (see e.g. this article, also the Fastqz, fqzcomp, and samcomp project pages). I think I liked fastqz quite a bit, but I still have to examine data recoverability in the case of archive damage. It is possible to use external parity tools which support file repair using pre-calculated recovery files – like the Linux par2 utility, which also works for bzip2 archives and any other files in general – but adding a parity file may negate the benefits of a higher compression ratio. Also, if the archive has no independent block structure, an insufficient parity file may lead to the loss of the entire archive. In other words, this still has to be tested.
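A small sketch of how this setup could be scripted. The helper names bz_tool and bz_compress are made up for this example, the file name reads.fastq is a placeholder, and the 10% redundancy in the par2 lines is an arbitrary choice rather than a tested recommendation.

```shell
#!/bin/sh
# Pick the parallel implementation when installed, plain bzip2 otherwise.
bz_tool() {
    if command -v pbzip2 >/dev/null 2>&1; then echo pbzip2; else echo bzip2; fi
}

# Compress FILE to FILE.bz2, keeping the original (-k) and using
# 900k blocks (-9), the largest block size both tools offer.
bz_compress() {
    "$(bz_tool)" -k -9 "$1"
}

# Optional parity data with par2, to be weighed against the extra size:
#   par2 create -r10 reads.fastq.bz2.par2 reads.fastq.bz2
#   par2 verify reads.fastq.bz2.par2
#   par2 repair reads.fastq.bz2.par2
```

Falling back to plain bzip2 keeps the script usable on machines without pbzip2; the output format is the same, only the compression speed differs.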

    Now the long-promised web-links!
    Read the rest of this entry »

    Posted in *nix, Links, Notepad, Software | 1 Comment »