Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc


    Archive for the 'Software' Category

    Debian: how to whitelist IP addresses in tumgreyspf

    7th August 2013

    SPF is nice for protecting your mail server from spam, but sometimes there is a need to bypass SPF checking, for example if you rely on third-party servers to do spam protection for you :)

    Current setup:

    • MX records point to the spam protection mail servers, which then
    • connect to my server and deliver (hopefully spam-free) mail.

    Problem: some senders (like last.fm) do have proper, strict SPF records. Tumgreyspf on my server then rejects emails relayed through the spam-protection service, because the relay’s IP address is not among the hosts authorized by the sender’s SPF record.

    If these spam protection relay servers are the only ones that send mail to your server, then it makes sense to fully disable/uninstall tumgreyspf. Putting tumgreyspf into permanent “learning mode” (set defaultSeedOnly = 1 in /etc/tumgreyspf/tumgreyspf.conf) may not fix the SPF problem described above, as SeedOnly seems to affect only greylisting, not the rejection of unauthorized senders.

    Solution: whitelist relay server IPs.
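
    The decision behind such a whitelist can be sketched in a few lines of Python. This is only an illustration of the check being made, not tumgreyspf’s own configuration mechanism; the function name and the relay addresses (taken from the reserved documentation ranges) are hypothetical.

```python
import ipaddress

# Hypothetical relay networks; substitute your spam-protection provider's real ranges.
RELAY_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
]

def should_skip_spf(client_ip, whitelist=RELAY_NETWORKS):
    """Return True if the connecting client is a trusted relay, so SPF is bypassed."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in whitelist)

if __name__ == "__main__":
    for ip in ("192.0.2.25", "203.0.113.9"):
        print(ip, "skip SPF" if should_skip_spf(ip) else "check SPF")
```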

    Posted in *nix, how-to, Software | No Comments »

    MultiParanoid vs. QuickParanoid: pro et contra for each

    9th July 2013

    MultiParanoid

    Here we present a new proteome-scale analysis program called MultiParanoid that can automatically find orthology relationships between proteins in multiple proteomes. The software is an extension of the InParanoid program that identifies orthologs and inparalogs in pairwise proteome comparisons. MultiParanoid applies a clustering algorithm to merge multiple pairwise ortholog groups from InParanoid into multi-species ortholog groups.

    QuickParanoid

    QuickParanoid is a suite of programs for automatic ortholog clustering and analysis. It takes as input a collection of files produced by InParanoid and finds ortholog clusters among multiple species. For a given dataset, QuickParanoid first preprocesses each InParanoid output file and then computes ortholog clusters. It also provides a couple of programs qa1 and qa2 for analyzing the result of ortholog clustering.

    So… both use InParanoid… Are there any differences? Let me list those which I’ve found.


    Posted in *nix, Bioinformatics, Software | 2 Comments »

    Hands-on examination of Linux disk caching effects

    8th July 2013

    LinuxAteMyRAM :) (also as a PDF: Linux disk caching effects)

    To examine how disk caching behaves on your Linux box under specific loads, see Linux write cache mystery (PDF).

    To understand what is going on, see also The Linux Page Cache and pdflush (PDF) by the same author, Gregory Smith.

    Another useful resource is OpenSUSE’s Tuning the Memory Management Subsystem, which nicely explains some of the kernel cache/memory-related configuration options.
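
    For a quick look at what the cache is doing on your own machine, the relevant counters can be read from /proc/meminfo. The following is a minimal sketch, assuming a Linux system; the helper names are made up for this example and are not taken from any of the linked articles.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values

def cache_summary(meminfo):
    """Pick out the fields most relevant to disk caching (missing ones become 0)."""
    fields = ("MemFree", "Buffers", "Cached", "Dirty", "Writeback")
    return {field: meminfo.get(field, 0) for field in fields}

if __name__ == "__main__":
    with open("/proc/meminfo") as handle:  # Linux only
        summary = cache_summary(parse_meminfo(handle.read()))
    for field, kilobytes in summary.items():
        print(f"{field:>10}: {kilobytes / 1024:.1f} MiB")
```

    Running it before and after a large file copy shows Cached growing and Dirty rising and falling as data is written back to disk.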


    Posted in *nix, Software | No Comments »

    Converting an existing Windows XP installation to a VirtualBox image

    29th June 2013

    This can be done in two steps, the second of which is optional:

    1. From Windows itself: use Disk2vhd to create the .vhd image (e.g. NOVA.VHD).
    2. (optional, requires VirtualBox) convert the VHD to VirtualBox-native VDI with VBoxManage clonehd NOVA.VHD nova.vdi --format VDI --variant Standard
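
    If you have several images to convert, step 2 is easy to script. Below is a small sketch that wraps the same VBoxManage call; the function names are hypothetical, and it assumes VBoxManage is on your PATH.

```python
import subprocess

def build_clonehd_command(vhd_path, vdi_path):
    """Build the VBoxManage call from step 2 as an argument list."""
    return [
        "VBoxManage", "clonehd", vhd_path, vdi_path,
        "--format", "VDI", "--variant", "Standard",
    ]

def convert_vhd_to_vdi(vhd_path, vdi_path):
    """Run the conversion; raises CalledProcessError if VBoxManage fails."""
    subprocess.run(build_clonehd_command(vhd_path, vdi_path), check=True)

if __name__ == "__main__":
    convert_vhd_to_vdi("NOVA.VHD", "nova.vdi")
```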

    Posted in Software | 2 Comments »

    Free private git repository hosting

    29th August 2012

    GitHub is awesome and still improving, but sometimes I’d prefer to keep some of my repositories hidden from the public. This is not so much because of the code’s value (though that also matters sometimes), but rather because those repositories are all “work in progress” or “short-lived” and may at times contain so much junk that publishing this untidiness would simply be too embarrassing.

    Previously, I used gitosis to set up git repository hosting on my server. I’m still using it for long-lived projects, but I’m now lazy enough to dislike the steps needed to set up a new repo (and I’m creating more and more new repos, some of which are likely to die very young). Some kind of GUI would help, but gitweb doesn’t seem that useful to me (here’s how to make it work with gitosis, and another recipe, or maybe just try gitosis-web or gitosis-web-admin).

    Another downside is that gitosis is no longer actively maintained and was even removed from the Ubuntu repositories. The suggested course of action for gitosis users is to migrate to gitolite. However, the basic design of gitolite is the same, so personally (looking for something easier to use) I see only minor gains in this migration (which I’ll have to perform anyway sooner or later).

    Another interesting self-hosted option is girocco. Too bad I have absolutely no experience with http://repo.or.cz/, so it’s hard to tell if girocco is convenient to use or not… Comments are welcome.

    Using Dropbox for git repositories (also here) seems a nice and fairly easy option, with only a few downsides: it’ll eat your Dropbox space (which is still much more than you get from free git hosters), and it isn’t that easy in a multi-user environment. Also, you will have to set up Dropbox on any headless servers where you may want to run your code, which isn’t exactly what I’d want to do. The same arguments apply to git on Google Drive.

    An alternative to various self-hosted systems would be to use an existing system with free private projects. Git wiki has a list of hosts to start with.

    Here’s a brief summary of the options I’ve found relatively attractive (see below for my experience with 3 of the listed services). (See also this recent comparison.)

    Providers \ Features  Repositories  Users      Space      Paid plans?
    BitBucket             Unlimited     5          Unlimited  +
    Assembla              Unlimited     Unlimited  1 GB       +
    GIT Enterprise        Unlimited     10         1 GB       +
    ProjectLocker         1             2          0.2 GB     +

    Initially, I found GIT Enterprise and Assembla to be the most attractive options to try. After trying both, I found Assembla faster and generally more attractive to work with. It wasn’t immediately obvious how to create more than one source repository, but after figuring that out everything is smooth.

    However, after trying BitBucket, I immediately switched all my Assembla repositories to it :) BitBucket is just like GitHub, but with free private repositories. It also has an issue tracker and a wiki. It even allows small teams to work on private repositories!


    Posted in *nix, Links, Software | 1 Comment »

    Beanstalkd and related tools for easy parallelizing and backgrounding

    18th February 2012

    beanstalkd: a simple, fast work queue.
    Jack and the Beanstalkd: a web-app for basic work queue administration.
    beanstalkc: a simple beanstalkd client library for Python.
    queueit: a CLI tool which helps to integrate beanstalkd into shell scripts.
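
    To give an idea of how these pieces fit together, here is a minimal producer/consumer sketch using beanstalkc’s Connection, put, reserve and delete calls. The helper names and the sample job bodies are hypothetical; it assumes a beanstalkd server listening on localhost at the default port 11300, and note that beanstalkc was written with Python 2 in mind.

```python
def connect(host="localhost", port=11300):
    """Open a connection to a beanstalkd server (11300 is its default port)."""
    import beanstalkc  # imported lazily so the helpers below work without it
    return beanstalkc.Connection(host=host, port=port)

def enqueue(queue, payloads):
    """Producer side: put each payload on the queue as a separate job."""
    for payload in payloads:
        queue.put(payload)

def work_one(queue, handler):
    """Consumer side: reserve one job, process it, then delete it from the queue."""
    job = queue.reserve()
    result = handler(job.body)
    job.delete()  # a reserved job that is never deleted goes back to the queue
    return result

if __name__ == "__main__":
    queue = connect()
    enqueue(queue, ["sample-1.fasta", "sample-2.fasta"])
    print(work_one(queue, lambda body: "processed " + body))
```

    Running several consumer processes that each loop over work_one is the simplest way to parallelize: beanstalkd hands every job to only one worker at a time.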


    Posted in Links, Programming, Python, Software | No Comments »

    Megahack of Stratfor

    9th January 2012

    If you haven’t heard yet: stratfor.com was hacked in December 2011, leaking full details of 75k credit cards (including owners’ addresses and CVV codes) and 860k (right, almost a million) user accounts. All Stratfor email archives were also reportedly stolen (around 160-200 GB of data), but those were not made publicly available on the internet, unlike the credit card and user account information, which is still relatively easy to find and download.

    I can’t really recall anything that large. Well, not counting Dropbox’s 4-hour window of “any password fits all accounts”, but that was different.

    Here are some of the news items about this seriously large hacking incident:

    Here come more technical reports:

    TheTechGerald’s analysis linked to above got my attention. Unfortunately, a while ago I subscribed to Stratfor’s “free intelligence mailing list”, and was wondering if my account information was now publicly available. I was most worried about the password I had used to subscribe, because of the risk of having used the same password somewhere else.

    Unlike TheTechGerald, I didn’t use any dictionaries, just the default configuration of a well-known tool for finding weak passwords. Within a single hour, ~100k password hashes were cracked (~12% of all). By the end of the day, ~50k more had been cracked (totalling 17.4% of 860k). At this point my password was still safe, and I had found a way to verify that it was not used anywhere else, so I aborted further cracking.

    There are a few simple conclusions:

    • anybody who had a Stratfor account must verify that he/she isn’t using that password anywhere else, because if 1 PC can recover 17% of all the passwords in less than a day, it is only a matter of time until all the leaked passwords are cracked and made publicly available in various “md5 decryption databases”
    • system owners should run periodic screenings for weak passwords (and implement policies to prevent creating obviously weak passwords from the very beginning)
    • MD5 is very fast to brute-force, so a much slower hashing function wouldn’t hurt; also, using a more complex hashing approach, maybe even with a closed-source shared library, could help
    • single-factor (password-based) authentication is likely to be replaced with 2-factor authentication in the near future
    • one may enjoy increased personal data safety by using throw-away passwords in conjunction with antispam mailboxes like spam.la and mailinator.com (at least 1600 users, or 0.186%, did use these services).
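
    To illustrate the point about slower hashing, here is a minimal sketch contrasting unsalted MD5 with a salted, deliberately slow alternative from Python’s standard library. PBKDF2 is just one example of such a function, chosen here for illustration; the function names and the iteration count are assumptions, not a recommendation from any of the reports above.

```python
import hashlib
import hmac
import os

def fast_md5(password):
    """Unsalted MD5: one cheap hash per guess, which is what makes cracking quick."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def slow_hash(password, salt=None, iterations=200_000):
    """Salted PBKDF2-HMAC-SHA256; the iteration count is an arbitrary example value."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest

def verify(password, salt, iterations, expected):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, _, digest = slow_hash(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

if __name__ == "__main__":
    salt, iterations, digest = slow_hash("correct horse")
    print(verify("correct horse", salt, iterations, digest))
    print(verify("wrong guess", salt, iterations, digest))
```

    Each guess against the slow hash costs an attacker hundreds of thousands of hash computations instead of one, and the per-password salt makes precomputed lookup databases useless.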


    Posted in Links, Misc, Security, Software, Web | No Comments »