<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
> <channel><title>Autarchy of the Private Cave &#187; awk</title> <atom:link href="https://bogdan.org.ua/tags/awk/feed" rel="self" type="application/rss+xml" /><link>https://bogdan.org.ua</link> <description>Tiny bits of bioinformatics, [web-]programming etc</description> <lastBuildDate>Wed, 28 Dec 2022 16:09:04 +0000</lastBuildDate> <language>en-US</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>https://wordpress.org/?v=3.8.27</generator> <item><title>How to replace newlines with commas, tabs etc (merge lines)</title><link>https://bogdan.org.ua/2010/11/16/how-to-replace-newlines-with-commas-tabs-etc-merge-lines.html</link> <comments>https://bogdan.org.ua/2010/11/16/how-to-replace-newlines-with-commas-tabs-etc-merge-lines.html#comments</comments> <pubDate>Tue, 16 Nov 2010 08:20:45 +0000</pubDate> <dc:creator><![CDATA[Bogdan]]></dc:creator> <category><![CDATA[*nix]]></category> <category><![CDATA[Bioinformatics]]></category> <category><![CDATA[how-to]]></category> <category><![CDATA[Notepad]]></category> <category><![CDATA[Software]]></category> <category><![CDATA[awk]]></category> <category><![CDATA[grep]]></category> <category><![CDATA[linux]]></category> <category><![CDATA[paste]]></category> <category><![CDATA[sed]]></category> <category><![CDATA[sort]]></category> <category><![CDATA[tr]]></category> <guid
isPermaLink="false">http://bogdan.org.ua/?p=1208</guid> <description><![CDATA[Imagine you need to get a few lines from a group of files with missing identifier mappings. I have a bunch of files with content similar to this one: ENSRNOG00000018677 1368832_at 25233 ENSRNOG00000002079 1369102_at 25272 ENSRNOG00000043451 25353 ENSRNOG00000001527 1388013_at 25408 ENSRNOG00000007390 1389538_at 25493 In the example above I need &#8216;25353&#8217;, which does not have corresponding [&#8230;]]]></description> <content:encoded><![CDATA[<p>Imagine you need to get a few lines from a group of files with missing identifier mappings. I have a bunch of files with content similar to this one:</p><blockquote><p> ENSRNOG00000018677      1368832_at      25233<br
/> ENSRNOG00000002079      1369102_at      25272<br
/> ENSRNOG00000043451                            25353<br
/> ENSRNOG00000001527      1388013_at      25408<br
/> ENSRNOG00000007390      1389538_at      25493</p></blockquote><p>In the example above I need &#8216;25353&#8217;, which does not have a corresponding affy_probeset_id in the 2nd column.</p><p>Extracting such IDs is straightforward:</p><div
id="ig-sh-1" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}'</div></li></ol></div></div><p>This outputs a column of the required IDs (EntrezGene in this example). Because awk splits fields on runs of whitespace, the ID becomes field 2 on lines where the 2nd column is empty:</p><blockquote><p> 116720<br
/> 679845<br
/> 309295<br
/> 364867<br
/> 298220<br
/> 298221<br
/> 25353</p></blockquote><p>However, I need these IDs as a comma-separated list, not as a newline-separated one.</p><p>There are several ways to achieve the desired result (only the last command in the pipeline differs):</p><div
id="ig-sh-2" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}' | gawk '$1=$1' ORS=', '</div></li></ol></div></div><div
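><p>A minimal sketch of what the one-liner above does, run on made-up input. The <code>ids</code> helper and the 0 entry are assumptions for illustration, not part of the original data, and plain <code>awk</code> is used on the assumption that it behaves like gawk here. The second command is a hypothetical alternative, not one of the methods compared in this post.</p>

```shell
# Sample IDs standing in for the pipeline output; the 0 is a made-up edge case.
ids() { printf '%s\n' 116720 0 25353; }

# '$1=$1' is used as a pattern: its value is the assigned field, so a record is
# printed only when that field is non-empty and not 0. ORS=', ' is written
# after every printed record, including the last one.
with_ors=$(ids | awk '$1=$1' ORS=', ')
echo "[$with_ors]"    # [116720, 25353, ]

# Hypothetical alternative: emit the separator before every record except the
# first, so no record is skipped and no separator trails.
joined=$(ids | awk 'NR > 1 { printf ", " } { printf "%s", $0 } END { print "" }')
echo "[$joined]"      # [116720, 0, 25353]
```

<p>For numeric IDs that are never 0 the two commands differ only in the trailing separator.</p></div><div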
id="ig-sh-3" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}' | tr '\n' ','</div></li></ol></div></div><div
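><p>A short sketch of the <strong>tr</strong> output on made-up input (the <code>ids</code> helper is an assumption for illustration). Stripping the trailing comma with sed is one possible clean-up, not part of the original pipeline.</p>

```shell
# Two sample IDs standing in for the pipeline output.
ids() { printf '%s\n' 116720 25353; }

# tr maps every newline, including the last one, so the list ends with a comma
# and the output has no final newline.
raw=$(ids | tr '\n' ',')
echo "$raw"      # 116720,25353,

# One possible clean-up: remove the trailing comma.
clean=$(ids | tr '\n' ',' | sed 's/,$//')
echo "$clean"    # 116720,25353
```

<p>tr translates single characters only, so it cannot produce a two-character &#8220;, &#8221; separator.</p></div><div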
id="ig-sh-4" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}' | sed ':a;N;$!ba;s/\n/, /g'</div></li></ol></div></div><div
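><p>The sed script above is easier to follow with one command per <code>-e</code> expression. This is a sketch on made-up input (the <code>ids</code> helper is an assumption for illustration); the script itself is unchanged.</p>

```shell
# Three sample IDs standing in for the pipeline output.
ids() { printf '%s\n' 116720 679845 25353; }

#   :a          define a label named a
#   N           append the next input line to the pattern space
#   $!ba        if this is not the last line, branch back to label a
#   s/\n/, /g   once all lines are collected, replace every newline
joined=$(ids | sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/, /g')
echo "$joined"    # 116720, 679845, 25353
```

<p>The substitution runs only once, after the loop has collected all of the input.</p></div><div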
id="ig-sh-5" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}' | sed ':q;N;s/\n/, /g;t q'</div></li></ol></div></div><div
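><p>The second sed variant loops with <code>t</code> instead of testing for the last line. This is a sketch on made-up input (the <code>ids</code> helper is an assumption for illustration). It relies on <code>N</code> printing the pattern space when no input is left, which is the default behaviour of GNU sed.</p>

```shell
# Three sample IDs standing in for the pipeline output.
ids() { printf '%s\n' 116720 679845 25353; }

#   :q          define a label named q
#   N           append the next input line to the pattern space
#   s/\n/, /g   join what has been collected so far
#   t q         branch back to q if the substitution replaced anything
joined=$(ids | sed -e ':q' -e 'N' -e 's/\n/, /g' -e 't q')
echo "$joined"    # 116720, 679845, 25353
```

<p>Here the substitution runs once per appended line rather than once at the end.</p></div><div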
id="ig-sh-6" class="syntax_hilite"><div
class="code"><ol
class="code" style="font-family:monospace;"><li
style="font-weight: normal; vertical-align:top;"><div
style="font: normal normal 1em/1.2em monospace; margin:0; padding:0; background:none; vertical-align:top;">sort -u *_affy_ensembl.txt | grep -v '_at' | awk '{print $2}' | paste -s -d &quot;,&quot;</div></li></ol></div></div><p>These solutions differ in efficiency and (slightly) in output. <strong>sed</strong> reads the whole input into its pattern space before replacing the newlines, so it may not be the best choice for large files. <strong>tr</strong> is probably the most efficient, but I haven&#8217;t tested that; it also converts the final newline, so its output ends with a trailing comma, just as the <strong>gawk</strong> output ends with a trailing &#8220;, &#8221;. <strong>paste</strong> takes a list of single-character delimiters and cycles through them, so you cannot really get comma-space &#8220;, &#8221; separation with it.</p><p>Sources: <a
href="http://www.linuxquestions.org/questions/programming-9/sed-how-do-you-replace-end-of-line-with-a-space-637013/" class="broken_link" rel="nofollow">linuxquestions 1 (explains the sed commands used)</a>, <a
href="http://www.linuxquestions.org/questions/programming-9/merge-lines-in-a-file-using-sed-191121/" class="broken_link" rel="nofollow">linuxquestions 2</a>, <a
href="http://www.cyberciti.biz/faq/linux-unix-sed-replace-newline/">nixcraft</a>.</p>]]></content:encoded> <wfw:commentRss>https://bogdan.org.ua/2010/11/16/how-to-replace-newlines-with-commas-tabs-etc-merge-lines.html/feed</wfw:commentRss> <slash:comments>2</slash:comments> </item> </channel> </rss>