Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc


    Brief comparison: Dropbox vs BitTorrent Sync vs AeroFS vs SparkleShare

    24th November 2013

    Right now I’m mostly using Dropbox, and I recently started using BitTorrent Sync to sync my music collection between all my PCs and my backup server, as well as to share larger files at work (thanks to direct LAN connections, this is much faster with BTSync than with Dropbox, which first has to upload the file to the Dropbox servers). I’m also considering syncing a TrueCrypt container with my photo archive using BTSync. SparkleShare is potentially interesting, but since I keep moving my code to free code-hosting services, I do not yet see a need for it.

    Below is a short summary table I’ve used to compare the available solutions. Feel free to contribute to the table in the comments, and I’ll update the post.

    Read the rest of this entry »

    Posted in Links, Software | No Comments »

    The Mysteries of BitCoin

    24th November 2013

    Did you know that the creator(s) of BitCoin is/are unknown?
    Did you know that the account which generated the Genesis Block is estimated to hold 0.6 to 1 million BitCoins?
    Did you know that the creator(s) of BitCoin disappeared from all BitCoin-related discussion and development forums a long time ago?
    Did you know that three journalistic investigations aiming to identify the creator(s) of BitCoin all reached different conclusions?

    This information is not collected in any single place, but some of the pieces can be found at the following URLs:

    A mystery hidden in the Genesis Block
    Who is Satoshi Nakamoto?
    Four years and $100 million later, Bitcoin’s mysterious creator remains anonymous
    Ted Nelson Says That Bitcoin’s Satoshi Nakamoto Is Shinichi Mochizuki
    Where in the World is Satoshi Nakamoto?
    The Rise and Fall of Bitcoin

    Below is a large infographic with a brief history of Bitcoin, including the continued growth of its exchange rate.
    Read the rest of this entry »

    Posted in Links, Misc | No Comments »

    Outlook 2010: MAPI was unable to load the information service gwmsp1.dll

    24th November 2013

    If you try to start Outlook 2010 and get an error like this:
    “Outlook 2010 cannot open your default e-mail folders. An unexpected error has occurred. MAPI was unable to load the information service gwmsp1.dll”

    you can fix the problem by opening the Control Panel, clicking Mail, and then clicking the Show Profiles button.
    Remove every profile listed there, then start Outlook again.

    Note: removing all the mail profiles will disable your Novell GroupWise client.
    If you still want to use non-Outlook email profiles, the better solution is to manually create a new mail profile for Outlook.

    Source: CNET forums.

    Posted in Misc, Notepad | No Comments »

    A list of spammers’ emails

    13th November 2013

    All sane people agree that spam is a blight of the internet, be it email spam or comments spam or forum spam or any other form of unsolicited, blatant, shameless, out-of-context advertising. Multiple spam-fighting and spam-stopping systems are being developed.

    With automated spam, automated spam-fighting systems might be the only choice. Sending rightfully angry emails to ISPs to notify them that their customers are violating service agreements is probably a waste of effort (something tells me most of these complaints end up in the trash folder, or even in the… spam folder). However, I get the feeling that some spam is not automated: it appears to have been prepared and sent by a human. (Alternatively, the spammers behind those messages simply have better software.) Either way, some spam seems to contain valid contact data of the advertised entity, such as an email address.

    The resulting idea is very simple and has probably already been implemented somewhere by someone: publish online the contact emails of the entities which have apparently chosen spam as their primary means of advertising. These emails will sooner or later be harvested by spammers, added to spam databases, and start receiving progressively more spam.

    There are a few drawbacks to this approach:

    • knowing the spam-collection points enables “black PR”-like mass-mailings in the name of one’s competitor, double-hurting the innocent; I do not see a clear method of preventing this, other than concealing the spam-collection methods;
    • human intelligence is required to determine whether the email in a spam message truly belongs to the advertised entity; this is fairly time-consuming, especially when scaled up; a possible solution (with its own problems) would be to build an online gateway for submitting curated spam samples, thus distributing the workload among all the participating volunteers;
    • the next logical step is harvesting and then publishing all the emails from the advertised website;
    • the biggest drawback, however, is the low efficiency of this approach; a higher share of spam in the mailbox will only be a mild nuisance, which is unlikely to reach the people who actually decide to advertise via spam; also, indirectly spamming someone’s mailbox costs time which could otherwise have been used for facebook and other important activities :)

    What do you think? Should such a method be used?

    Below are a few sample records from real spam comments which contained genuine-looking emails. I’m including some extra metadata. Ideally, all this should be stored in some kind of database.

    Submitted on 2013/11/13 at 15:23 GMT
    Author : Виктор (IP: 95.134.110.37 , 37-110-134-95.pool.ukrtel.net)
    E-mail : aionind@yandex.ru
    E-mail : sale@aion-industry.ru
    E-mail : info@aion-industry.ru

    Submitted on 2013/11/26 at 8:53 GMT
    Author : Виктор (IP: 95.134.146.235 , 235-146-134-95.pool.ukrtel.net)
    E-mail : kvazargr@yandex.ru
    E-mail : info@kvazar-gr.ru

    Submitted on 2013/11/28 at 7:24 GMT
    Author : Виктор (IP: 95.134.117.155 , 155-117-134-95.pool.ukrtel.net)
    E-mail : relevater@yandex.ru
    E-mail : info@relevate.ru
    E-mail : support@relevate.ru
    E-mail : billing@relevate.ru

    There’s definitely a need for a public database, API keys, and quorum algorithms…

    Author : casinoworka (IP: 91.207.4.201 , 201.4.207.91.unknown.SteepHost.Net)
    E-mail : pharmacywork7777777@gmail.com
    E-mail : info@prowessmedical.com
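
    As a rough illustration of the database-plus-quorum idea, here is a minimal Python sketch. It parses records in the format shown above into an in-memory SQLite table and lists only those emails that were seen in spam from several distinct IPs. Everything here is an assumption made for illustration: the function names, the table layout, the sample addresses (reserved example values), and especially the reading of “quorum” as “reported from at least N distinct source IPs”; confirmation by N independent volunteers would be an equally valid reading.

```python
import re
import sqlite3

# Patterns for the "Author" and "E-mail" lines of the record format above.
AUTHOR_RE = re.compile(r"^Author\s*:\s*(?P<name>.+?)\s*\(IP:\s*(?P<ip>[\d.]+)")
EMAIL_RE = re.compile(r"^E-mail\s*:\s*(?P<email>\S+)")


def parse_records(text):
    """Yield (ip, email) pairs; each 'Author' line starts a new record."""
    ip = None
    for line in text.splitlines():
        line = line.strip()
        author = AUTHOR_RE.match(line)
        if author:
            ip = author.group("ip")
            continue
        email = EMAIL_RE.match(line)
        if email and ip is not None:
            yield ip, email.group("email").lower()


def emails_with_quorum(pairs, quorum=2):
    """Return the emails reported from at least `quorum` distinct IPs."""
    db = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per (source IP, advertised email) pair.
    db.execute("CREATE TABLE report (ip TEXT, email TEXT, UNIQUE (ip, email))")
    db.executemany("INSERT OR IGNORE INTO report VALUES (?, ?)", pairs)
    rows = db.execute(
        "SELECT email FROM report GROUP BY email "
        "HAVING COUNT(DISTINCT ip) >= ? ORDER BY email",
        (quorum,),
    )
    return [email for (email,) in rows]


# Made-up sample data using reserved example addresses.
SAMPLE = """\
Author : Spammer One (IP: 192.0.2.10 , host-10.example.net)
E-mail : info@shop.example
Author : Spammer Two (IP: 192.0.2.77 , host-77.example.net)
E-mail : info@shop.example
E-mail : sales@other.example
"""

print(emails_with_quorum(parse_records(SAMPLE)))
```

    A threshold like this would also soften the “black PR” drawback a little, since a single forged submission would not be enough to get an address published.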

    Posted in Misc, Web | No Comments »

    Alternatives to GNU make

    19th October 2013

    Right now, whenever I see that I have to retype the same sets and sequences of commands over and over, I try to wrap them up into some kind of script, each time choosing the most appropriate language: shell when I need to start lots of existing command-line tools, Python when there’s some data handling and processing involved, and R when I’m invoking commands from R packages. So far I have been avoiding the fairly popular makefile-based approach to automating pipelines and workflows which rely heavily on existing tools. However, being curious, I’ve compiled a short list of modern make-like alternatives, to possibly explore… sometime later…

    • First comes make itself – the oldest and the most widely used software build tool. Stable and powerful. Still, even people who are used to make have some gripes about it. The most detailed list of gripes is probably here.
    • SCons is a build tool written in Python. I guess I like that “configuration files are Python scripts” – maybe knowing Python is enough to use SCons, which makes SCons a better choice than make for me. SCons seems to have gained some support (scroll down for comments/discussion). There were some doubts about SCons performance (1, 2, and 3); I’m not sure where SCons stands right now in that regard.
    • waf is a Python-based framework for configuring, compiling and installing applications.
    • pyDoIt is a Python automation tool. It seems to use Python syntax. It aims at bringing the power of build tools to the execution of any kind of task, where a task describes some computation to be done (actions) and contains some extra metadata. Based on the description alone, I’m quite intrigued! I wonder if anyone has already worked with pyDoIt and can share their experience?…
    • Rake – Ruby make – is a simple build program with capabilities similar to those of make. I have seen a lot of positive feedback about this one, mostly regarding its simplicity of use. Still, [py]DoIt so far looks more attractive to me personally.
    • Ruffus is a lightweight Python module for running computational pipelines. Sounds like some good competition to [py]DoIt!
    • Anduril is an open source component-based workflow framework for scientific data analysis. Sounds promising, though the latest downloadable version is over 400 MB… It probably already contains a bunch of binaries and maybe even data and complete workflows for data analysis. Probably worth a look, but may turn out to be a little overweight for simple pipelining.
    • snakemake is a scalable bioinformatics workflow engine. I get the feeling that Python is truly dominating the pipelines/workflows world: snakemake, as even the name suggests, is in Python, too. The front-page example is so simple and clear that snakemake immediately pushes DoIt down from the 1st place! Awesome.
    • Paver is yet another Python-based software project scripting tool along the lines of Make or Rake, designed to help out with repetitive tasks with the convenience of Python’s syntax. Sounds similar to DoIt. I have no idea how they actually compare to each other.
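
    To give an idea of what “tasks with actions and metadata” can look like in practice, here is a minimal sketch of a task file for [py]DoIt (conventionally named dodo.py). The pipeline and all the file names (raw.txt, sorted.txt, counts.txt) are made up for illustration; the only tool-specific parts are the task_ function prefix and the actions, file_dep and targets keys of the returned dictionaries. I have not benchmarked or compared this against the other tools.

```python
# dodo.py -- a hypothetical two-step pipeline; all file names are made up.


def sort_lines(src, dst):
    """Python action: write the lines of `src` to `dst` in sorted order."""
    with open(src) as infile:
        lines = sorted(infile)
    with open(dst, "w") as outfile:
        outfile.writelines(lines)


def task_sort():
    """Produce sorted.txt from raw.txt."""
    return {
        "file_dep": ["raw.txt"],      # inputs the task depends on
        "targets": ["sorted.txt"],    # files the task creates
        # A Python action is given as (callable, positional arguments).
        "actions": [(sort_lines, ["raw.txt", "sorted.txt"])],
    }


def task_count():
    """Count repeated lines using an existing command-line tool."""
    return {
        "file_dep": ["sorted.txt"],   # produced by task_sort above
        "targets": ["counts.txt"],
        # A string action is run as a shell command.
        "actions": ["uniq -c sorted.txt > counts.txt"],
    }
```

    With such a file in place, running doit in the same directory should execute both tasks in dependency order, and skip tasks whose inputs have not changed on later runs, much like make does.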

    That is it for now.

    What were your experiences with automating repetitive tasks and building simple pipelines?

    Posted in *nix, Notepad, Programming, Software | No Comments »

    Saving and restoring the list of packages installed on a Debian system using aptitude or deborphan

    18th October 2013

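    The usual, or even classical, way is to create the list of installed packages with sudo dpkg --get-selections > package_list, and then restore it when/if necessary with sudo dpkg --set-selections < package_list followed by sudo apt-get dselect-upgrade. (Each line printed by dpkg --get-selections contains a package name followed by its selection state, e.g. install, so the file cannot simply be piped into apt-get install via xargs.)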

    As VihangD points out in his serverfault answer, the same can be achieved with aptitude, while also excluding dependent, automatically installed packages (which are included by the classical method). To create the list of packages, run aptitude search -F '%p' '~i!~M' > package_list. Here, -F '%p' asks aptitude to print only package names (instead of the default output, which also contains package state and description); the search term '~i!~M' asks for all packages that are installed but were not installed automatically.

    To install packages using the created list, run xargs aptitude --schedule-only install < package_list; aptitude install. The first of these two commands instructs aptitude to mark all the packages from the list as scheduled for installation. The second command actually performs the installation.

    Hamish Downer suggests an alternative way of getting the initial package_list, using the deborphan utility: deborphan -a --no-show-section > package_list. This command asks deborphan to list the packages which no other package depends on. This sounds very similar to what we did with aptitude above, but deborphan will most likely produce a much shorter list of packages (on my system, deborphan printed 291 package names, aptitude printed 847, and dpkg printed 3650). One more potentially important difference between the aptitude- and deborphan-produced package lists is that aptitude only specifies the package architecture when it differs from the native one (e.g. 'googleearth:i386' on a 64-bit system), while deborphan specifies the architecture for all packages (resulting in e.g. 'google-talkplugin:amd64' and 'googleearth-package:all' on a 64-bit system).
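
    Because of the architecture suffixes, the two lists cannot be compared line by line. Below is a small Python sketch of one way to normalize the names before comparing; the function names and the assumption that amd64 is the native architecture are mine, and the sample names are the ones mentioned above plus a made-up 'vim' entry.

```python
def normalize(name, native_arch="amd64"):
    """Drop the ':arch' suffix when it is the native architecture or 'all'."""
    package, sep, arch = name.partition(":")
    if sep and arch not in (native_arch, "all"):
        return name  # keep foreign architectures explicit, as aptitude does
    return package


def compare(aptitude_names, deborphan_names, native_arch="amd64"):
    """Return (only in the aptitude list, only in the deborphan list)."""
    apt = {normalize(n, native_arch) for n in aptitude_names}
    orphan = {normalize(n, native_arch) for n in deborphan_names}
    return sorted(apt - orphan), sorted(orphan - apt)


aptitude_list = ["googleearth:i386", "google-talkplugin", "vim"]
deborphan_list = ["google-talkplugin:amd64", "googleearth-package:all"]

only_aptitude, only_deborphan = compare(aptitude_list, deborphan_list)
print(only_aptitude)
print(only_deborphan)
```

    In real use, the two lists would be read from the files produced by the aptitude and deborphan commands above, one package name per line.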

    Posted in *nix, how-to, Notepad | 2 Comments »

    GUIs for R

    17th October 2013

    I’ve [briefly] tried Cantor (which also supports Octave and KAlgebra as backends), rkward, deducer/JGR, R Commander, and RStudio.

    My personal choice is RStudio: it is good-looking, intuitive and easy to use, yet powerful.

    The next step would be to find an R equivalent of IPython’s excellent Mathematica-like Notebook web interface…

    Posted in *nix, Notepad, Programming, Science, Software | No Comments »