Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Archive for the 'Notepad' Category

    Short miscellaneous notes

    Ukrainian PrivatBank launches WordPress blog

    16th April 2007

    PrivatBank is one of the largest Ukrainian banks, and is very active in the field of e-commerce.

    Today I came across their new blog, which is powered by WordPress.
    At first glance, I’d say this is where PrivatBank now publishes its news; before the blog launch, news was published on PrivatBank’s website. The main, and apparently only, difference is the comments: visitors can now comment on the news, and even get a response.

    I wonder how many “comment-responders” PrivatBank had to hire to launch this blog :). Or was tech support simply given one more duty? :)

    Posted in Links, Notepad, Web | 1 Comment »

    Does background-image display on top of background-color?

    10th April 2007

    Yes, it should.

    Posted in Notepad, XHTML/CSS | No Comments »

    Regular blog updates delayed until mid-April

    2nd April 2007

    Being heavily loaded with work, and with a number of tasks overdue, I’ll temporarily stop regular updates of this blog until approximately the 23rd of April. However, if something urgent comes up, I’ll post it here. If not, expect a bunch of posts in late April! :)

    Posted in Notepad | No Comments »

    Busy with GSoC-2007

    29th March 2007

    The blog isn’t being updated at the moment, as I’m quite busy with several sudden opportunities, the main one being the Google Summer of Code 2007 (announcement here). I applied to WordPress with the “Improve the performance of WordPress” project. I’m currently working on a detailed week-by-week, three-month implementation plan for the project. I do feel that I should have learned about GSoC-2007 earlier than Friday, March 23rd :( . Well, at least the application deadline was extended until the 27th of March, so I could register and submit the one application that I consider a good fit for my current activities and skills.

    This year the competition doesn’t seem overly tough, with around 3000 students, over 6100 applications, and 800 stipends for successful applicants. However, among the 131 open-source projects, some will definitely get more attention and applications than others. I wonder how many applications there are for WordPress :)

    Posted in Notepad | No Comments »

    My blog is now no-www Class B

    23rd March 2007

    Update: WordPress 2.5+ does not require adding rewrite rules to .htaccess, as it now redirects the browser to the correct (configured) URL itself. However, if you would like the redirection to be done by Apache’s mod_rewrite rather than by PHP’s header() function (which I suspect is slower than mod_rewrite), you can still use the instructions below. (Another reason to stick with mod_rewrite is other software installed in the root of the same domain as the WP blog: the mod_rewrite solution works for everything, while WP’s own redirect works only for WP.)

    See important update at the end of this post!

    In the early days of my acquaintance with the internet, I considered it obligatory to add the “www.” part in front of every site (domain) name. As a matter of fact, without those three mysterious letters most websites “didn’t work”.

    Now, nearly a decade later, it is clear to me that the www part is redundant. But it was only today that I finally switched my blog to the Class B no-www policy. Earlier it was Class A, the most common.
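
To make the Class B idea concrete: a request for the www host gets redirected to the same URL on the bare domain, and everything else is left alone. Below is a minimal Python sketch of just that decision. It is not the mod_rewrite rule or WordPress's own code; the function name `no_www_redirect` and the example.com URLs are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def no_www_redirect(url):
    """Return the no-www URL to redirect to, or None if no redirect is needed.

    Hypothetical helper: it only models the Class B rule (www -> bare domain).
    Any user:password part of the URL is dropped, which is fine for a sketch.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if not host.startswith("www."):
        return None  # already on the bare domain (or some other subdomain)
    bare_host = host[len("www."):]
    # Keep an explicit port if the original URL had one.
    netloc = bare_host if parts.port is None else f"{bare_host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


print(no_www_redirect("http://www.example.com/blog/?p=1"))
print(no_www_redirect("http://example.com/blog/"))
```

A real server would answer the first request with a permanent (301) redirect to the returned URL and serve the second one directly.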

    Posted in Misc, Notepad, Web | 3 Comments »

    SQL tips

    2nd February 2007

    This is a copy of the Top 1000 SQL Performance Tips by Jay Pipes, Sheeri Kritzer, Bill Karwin, Ronald (“Jeremy Basher”) Bradford, Farhan “Frank Mash” Mashraqi, Taso Du Val, Ron Hu, Klinton Lee, Rick James, Alan Kasindorf, Eric Bergen, Kaj Arno, Joel Seligstein, and Amy Lee. Some tips were dropped as not useful for me personally. This post is kept as a personal note/reference.

    Specific Query Performance Tips:
    1. Use EXPLAIN to profile the query execution plan
    2. Use Slow Query Log (always have it on!)
    3. Don’t use DISTINCT when you have or could use GROUP BY
    4. Insert performance
        1. Batch INSERT and REPLACE
        2. Use LOAD DATA instead of INSERT
    5. LIMIT m,n may not be as fast as it sounds
    6. Don’t use ORDER BY RAND() if you have > ~2K records
    7. Use SQL_NO_CACHE when you are SELECTing frequently updated data or large sets of data
    8. avoid wildcards at the start of LIKE queries
    9. avoid correlated subqueries in the SELECT and WHERE clauses (try to avoid IN)
    10. no calculated comparisons — isolate indexed columns
    11. ORDER BY and LIMIT work best with equalities and covered indexes
    12. separate text/blobs from metadata, don’t put text/blobs in results if you don’t need them
    13. derived tables (subqueries in the FROM clause) can be useful for retrieving BLOBs without sorting them. (A self-join can speed up a query if the first part finds the IDs and uses them to fetch the rest)
    14. ALTER TABLE…ORDER BY can take data sorted chronologically and re-order it by a different field — this can make queries on that field run faster
    15. Know when to split a complex query and join smaller ones
    16. Delete small amounts at a time if you can
    17. make similar queries consistent so cache is used
    18. Have good SQL query standards
    19. Don’t use deprecated features
    20. Turning OR on multiple index fields (<5.0) into UNION may speed things up (with LIMIT); after 5.0, index_merge should pick this up.
    21. Don’t use COUNT(*) on InnoDB tables for every search; do it a few times and/or use summary tables, or if you need the total number of rows, use SQL_CALC_FOUND_ROWS and SELECT FOUND_ROWS()
    22. Use INSERT ... ON DUPLICATE KEY UPDATE (or INSERT IGNORE) to avoid having to SELECT
    23. Use groupwise maximum instead of subqueries

    Scaling Performance Tips:
    1. Use benchmarking
    2. isolate workloads: don’t let administrative work (e.g. backups) interfere with customer performance.
    3. as your data grows, indexing may change (cardinality and selectivity change). Structuring may want to change. Make your schema as modular as your code. Make your code able to scale. Plan and embrace change, and get developers to do the same.
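
As a note to self on “Batch INSERT and REPLACE” and “Use benchmarking”: the two go together nicely, since batching is easy to measure. Here is a rough Python sketch that times row-by-row inserts against one batched insert. It uses the standard-library `sqlite3` module as a stand-in for MySQL (my assumption, so that it runs anywhere), and the `hits` table is made up; actual timings will differ on a real MySQL server.

```python
import sqlite3
import time


def make_db():
    """Create an in-memory database with one small, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
    return conn


def insert_one_by_one(conn, rows):
    """One statement and one commit per row: the slow pattern."""
    for row in rows:
        conn.execute("INSERT INTO hits (id, url) VALUES (?, ?)", row)
        conn.commit()


def insert_batched(conn, rows):
    """All rows in one executemany() call inside a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany("INSERT INTO hits (id, url) VALUES (?, ?)", rows)


def timed(func, *args):
    """Return the wall-clock seconds taken by func(*args)."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def count_rows(conn):
    """Number of rows currently in the hits table."""
    return conn.execute("SELECT COUNT(*) FROM hits").fetchone()[0]


rows = [(i, f"/page/{i}") for i in range(1, 5001)]
slow_conn, fast_conn = make_db(), make_db()
slow_seconds = timed(insert_one_by_one, slow_conn, rows)
fast_seconds = timed(insert_batched, fast_conn, rows)
print(f"row-by-row: {slow_seconds:.4f}s, batched: {fast_seconds:.4f}s")
```

Both variants end up with the same 5000 rows; only the number of statements and commits differs, which is the whole point of the tip.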

    Network Performance Tips:
    1. Minimize traffic by fetching only what you need.
        1. Paging/chunked data retrieval to limit
        2. Don’t use SELECT *
        3. Be wary of lots of small quick queries if a longer query can be more efficient
    2. use multi_query if appropriate to reduce round-trips
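
For the paging/chunked retrieval and “Don’t use SELECT *” tips, one possible approach is to page by primary key: each query names only the columns it needs and continues after the last id seen. The Python sketch below shows the idea with the standard-library `sqlite3` module standing in for MySQL (an assumption on my part); the `posts` table and the `fetch_in_chunks` helper are made up for illustration.

```python
import sqlite3


def fetch_in_chunks(conn, chunk_size):
    """Yield lists of (id, title) rows, at most chunk_size rows at a time.

    Each query lists only the needed columns (the large body column is never
    transferred) and resumes after the last id seen instead of using an offset.
    Assumes ids are positive integers.
    """
    last_id = 0
    while True:
        chunk = conn.execute(
            "SELECT id, title FROM posts WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, chunk_size),
        ).fetchall()
        if not chunk:
            return
        yield chunk
        last_id = chunk[-1][0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
with conn:
    conn.executemany(
        "INSERT INTO posts (id, title, body) VALUES (?, ?, ?)",
        [(i, f"Post {i}", "long text " * 100) for i in range(1, 26)],
    )

chunks = list(fetch_in_chunks(conn, 10))
print([len(chunk) for chunk in chunks])  # 25 rows in chunks of 10 -> [10, 10, 5]
```

This also sidesteps the “LIMIT m,n may not be as fast as it sounds” problem, because the server does not have to walk past the first m rows on every page.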

    OS Performance Tips:
    1. Use proper data partitions
        1. For Cluster. Start thinking about Cluster *before* you need them
    2. Keep the database host as clean as possible. Do you really need a windowing system on that server?
    3. Utilize the strengths of the OS
    4. pare down cron scripts
    5. create a test environment
    6. source control schema and config files
    7. for LVM innodb backups, restore to a different instance of MySQL so Innodb can roll forward
    8. partition appropriately
    9. partition your database when you have real data — do not assume you know your dataset until you have real data

    MySQL Server Overall Tips:
    1. innodb_flush_log_at_trx_commit=0 can help slave lag
    2. Optimize for data types, use consistent data types. Use PROCEDURE ANALYSE() to help determine the smallest data type for your needs.
    3. use optimistic locking, not pessimistic locking. try to use shared lock, not exclusive lock. share mode vs. FOR UPDATE
    4. if you can, compress text/blobs
    5. compress static data
    6. don’t back up static data as often
    7. enable and increase the query and buffer caches if appropriate
    8. config params — http://docs.cellblue.nl/easy_mysql_performance_tweaks/ is a good reference
    9. Config variables & tips:
        1. use one of the supplied config files
        2. key_buffer, unix cache (leave some RAM free), per-connection variables, innodb memory variables
        3. be aware of global vs. per-connection variables
        4. check SHOW STATUS and SHOW VARIABLES (GLOBAL|SESSION in 5.0 and up)
        5. be aware of swapping, esp. with Linux “swappiness” (bypass the OS filecache for innodb data files; use innodb_flush_method=O_DIRECT if possible (this is also OS-specific))
        6. defragment tables, rebuild indexes, do table maintenance
        7. If you use innodb_flush_log_at_trx_commit=1, use a battery-backed hardware cache write controller
        8. more RAM is good, and so is faster disk speed
        9. use 64-bit architectures
        10. --skip-name-resolve
        11. increase myisam_sort_buffer_size to optimize large inserts (this is a per-connection variable)
        12. look up memory tuning parameter for on-insert caching
        13. increase temp table size in a data warehousing environment (default is 32Mb) so it doesn’t write to disk (also constrained by max_heap_table_size, default 16Mb)
        14. Run in SQL_MODE=STRICT to help identify warnings
        15. /tmp dir on battery-backed write cache
        16. consider battery-backed RAM for innodb logfiles
        17. use --safe-updates for client
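
The “use optimistic locking, not pessimistic locking” tip is easier to remember with an example. One common way to do it (my assumption, the tip itself doesn't say how) is a version column: read the row without locking it, then make the UPDATE conditional on the version you read. The sketch below uses Python's standard-library `sqlite3` as a stand-in for MySQL; the `accounts` table and the `update_balance` helper are made up.

```python
import sqlite3


def update_balance(conn, account_id, new_balance, expected_version):
    """Write new_balance only if the row still has expected_version.

    Returns True on success, False if another writer changed the row first.
    """
    with conn:
        cursor = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, expected_version),
        )
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
with conn:
    conn.execute("INSERT INTO accounts VALUES (1, 100, 0)")

# Two clients read the same row (version 0) without taking any lock.
balance, version = conn.execute(
    "SELECT balance, version FROM accounts WHERE id = 1"
).fetchone()

first_ok = update_balance(conn, 1, balance + 50, version)   # version matches, applied
second_ok = update_balance(conn, 1, balance - 30, version)  # stale version, rejected
final_row = conn.execute("SELECT balance, version FROM accounts WHERE id = 1").fetchone()
print(first_ok, second_ok, final_row)
```

The rejected writer is expected to re-read the row and retry, so nobody holds a lock while thinking.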

    Storage Engine Performance Tips:
    1. InnoDB ALWAYS keeps the primary key as part of each index, so do not make the primary key very large
    2. Utilize different storage engines on master/slave, e.g. if you need fulltext indexing on a table.
    3. BLACKHOLE engine and replication is much faster than FEDERATED tables for things like logs.
    4. Know your storage engines and what performs best for your needs, know that different ones exist.
        1. e.g. use MERGE tables or ARCHIVE tables for logs
        2. Archive old data; don’t be a pack-rat! 2 common engines for this are ARCHIVE tables and MERGE tables
    5. use row-level instead of table-level locking for OLTP workloads
    6. try out a few schemas and storage engines in your test environment before picking one.

    Database Design Performance Tips:
    1. Design sane query schemas. don’t be afraid of table joins, often they are faster than denormalization
    2. Don’t use boolean flags
    3. Use Indexes
    4. Don’t Index Everything
    5. Do not duplicate indexes
    6. Do not use large columns in indexes if the ratio of SELECTs:INSERTs is low.
    7. be careful of redundant columns in an index or across indexes
    8. Use a clever key and ORDER BY instead of MAX
    9. Normalize first, and denormalize where appropriate.
    10. use INET_ATON and INET_NTOA for IP addresses, not char or varchar
    11. make it a habit to REVERSE() email addresses, so you can easily search domains (this will help avoid wildcards at the start of LIKE queries if you want to find everyone whose e-mail is in a certain domain)
    12. In 5.1 BOOL/BIT NOT NULL type is 1 bit, in previous versions it’s 1 byte.
    13. A NULL data type can take more room to store than NOT NULL
    14. Choose appropriate character sets & collations: UTF16 will store each character in at least 2 bytes, whether it needs them or not; latin1 is faster than UTF8.
    15. Use Triggers wisely
    16. use min_rows and max_rows to specify approximate data size so space can be pre-allocated and reference points can be calculated.
    17. Use HASH indexing for indexing across columns with similar data prefixes
    18. Use myisam_pack_keys for int data
    19. be able to change your schema without ruining functionality of your code
    20. segregate tables/databases that benefit from different configuration variables
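
Two of the design tips above (INET_ATON/INET_NTOA for IP addresses, and REVERSE()d e-mail addresses) are easy to try out. The Python sketch below mimics both: the `ipaddress` module does the same conversion as INET_ATON/INET_NTOA for IPv4, and a reversed e-mail column turns “everyone at this domain” into a prefix search. It uses standard-library `sqlite3` as a stand-in for MySQL; the `users` table and the helper names are made up. In MySQL the lookup would be a LIKE with the wildcard at the end; here I use an equivalent range comparison to stay clear of SQLite's own LIKE rules.

```python
import ipaddress
import sqlite3


def ip_to_int(address):
    """Dotted-quad IPv4 string -> unsigned integer (what INET_ATON returns)."""
    return int(ipaddress.IPv4Address(address))


def int_to_ip(number):
    """Unsigned integer -> dotted-quad IPv4 string (what INET_NTOA returns)."""
    return str(ipaddress.IPv4Address(number))


def find_by_domain(conn, domain):
    """Return e-mail addresses in `domain` via a prefix range on the reversed column."""
    prefix = ("@" + domain)[::-1]  # "@example.com" -> "moc.elpmaxe@"
    # Smallest string that sorts after every value starting with prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    rows = conn.execute(
        "SELECT email_reversed FROM users "
        "WHERE email_reversed >= ? AND email_reversed < ?",
        (prefix, upper),
    ).fetchall()
    return sorted(row[0][::-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, email_reversed TEXT, last_ip INTEGER)"
)
conn.execute("CREATE INDEX idx_email_reversed ON users (email_reversed)")
people = [
    ("alice@example.com", "192.168.0.1"),
    ("bob@example.com", "10.0.0.7"),
    ("carol@example.org", "172.16.5.4"),
    ("dave@notexample.com", "192.168.0.2"),
]
with conn:
    conn.executemany(
        "INSERT INTO users (email_reversed, last_ip) VALUES (?, ?)",
        [(email[::-1], ip_to_int(ip)) for email, ip in people],
    )

print(ip_to_int("192.168.0.1"), int_to_ip(3232235521))
print(find_by_domain(conn, "example.com"))
```

Note that dave@notexample.com is not matched, because the reversed prefix includes the “@”.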

    Other:
    1. Read and post to MySQL Planet at http://www.mysqlplanet.org/

    Posted in Notepad | No Comments »

    419 Scambaiting

    30th January 2007

    Came across the 419 eater scambaiting site. I did get tons of scam emails, but never thought of them as an opportunity to unveil the identity (or at least get a photo) of the scammer. This must be fun, unveiling scammers :) , especially when the scammer is pushed to waste his money on some fictional “hotel booking” or “identity proof donations”. It’s like reversing the roles of cat and mouse :) . Nice work, 419eater!

    Posted in Notepad | No Comments »