<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Schadenfreude &#187; dump</title>
	<atom:link href="http://www.ralree.com/tag/dump/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ralree.com</link>
	<description>Malicious enjoyment derived from observing someone else's misfortune</description>
	<lastBuildDate>Thu, 09 Feb 2012 01:49:15 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Reading compressed files with postgres using named pipes</title>
		<link>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/</link>
		<comments>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/#comments</comments>
		<pubDate>Fri, 04 Sep 2009 06:38:55 +0000</pubDate>
		<dc:creator>Erik</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[awesome]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[dba]]></category>
		<category><![CDATA[dump]]></category>
		<category><![CDATA[postgres]]></category>

		<guid isPermaLink="false">http://www.ralree.com/?p=22661</guid>
		<description><![CDATA[Postgres has the same type of ability MySQL has to read in files, yet much nicer syntax. LOAD DATA INFILE from MySQL is just COPY in postgres. I decided to try having it read from a named pipe today, and it worked out nicely. I started out making a test db and making a nice little schema: postgres@tardis:~$ createdb test postgres@tardis:~$ psql test psql (8.4.0) Type "help" for help. test=# CREATE TYPE rank AS ENUM ('general', 'sergeant', 'private'); CREATE TYPE [...]]]></description>
			<content:encoded><![CDATA[<p>Postgres can bulk-load files much as MySQL can, but with much nicer syntax: MySQL&#8217;s <code>LOAD DATA INFILE</code> is simply <code>COPY</code> in postgres.  I decided to try having it read from a named pipe today, and it worked out nicely.<br />
<span id="more-22661"></span><br />
I started by creating a test database and a small schema:</p>
<pre><code>
postgres@tardis:~$ createdb test
postgres@tardis:~$ psql test
psql (8.4.0)
Type "help" for help.

test=# CREATE TYPE rank AS ENUM ('general', 'sergeant', 'private');
CREATE TYPE
test=# CREATE TABLE military (id SERIAL PRIMARY KEY,
test(#   name VARCHAR(128),
test(#   rank rank);
NOTICE:  CREATE TABLE will create implicit sequence "military_id_seq" for serial column "military.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "military_pkey" for table "military"
CREATE TABLE
</code></pre>
<p>Notice the use of <code>SERIAL</code>?  That&#8217;s postgres&#8217; <code>AUTO_INCREMENT</code>, basically.  I like it better.  Next, it&#8217;s time to make a text file with some data and compress it.  Here&#8217;s what I put in the file (note that the spaces between the words are <code>TAB</code> characters):</p>
<pre><code>
general	Lee
sergeant	Hartman
private	Pyle
</code></pre>
<p>Then I compressed it with <code>gzip</code> and checked the result with <code>zcat</code>:</p>
<pre><code>
hank@tardis:/tmp$ gzip file
hank@tardis:/tmp$ zcat file.gz
general	Lee
sergeant	Hartman
private	Pyle
</code></pre>
<p>Now let&#8217;s actually make a named pipe for postgres to read from:</p>
<pre><code>
hank@tardis:/tmp$ mkfifo namedpipe
</code></pre>
<p>Now that we have our named pipe, let&#8217;s start reading from it.  The <code>COPY</code> will sit and wait until something writes to the pipe:</p>
<pre><code>
test=# COPY military (rank, name) FROM '/tmp/namedpipe' WITH DELIMITER E'\t';
</code></pre>
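<p>The <code>E'\t'</code> part is postgres&#8217; escape-string syntax: backslash sequences inside the single quotes are interpreted, so <code>\t</code> becomes an actual tab character.  All that we have to do now is feed the pipe with <code>zcat</code> from another terminal:</p>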
<pre><code>
hank@tardis:/tmp$ zcat file.gz > namedpipe
</code></pre>
<p>Immediately, there&#8217;s some output in the psql session:</p>
<pre><code>
COPY 3
</code></pre>
<p>So, postgres says it got 3 records successfully.  Yay!  Now, let&#8217;s display them:</p>
<pre><code>
test=# select * from military;
 id |  name   |   rank
----+---------+----------
  1 | Lee     | general
  2 | Hartman | sergeant
  3 | Pyle    | private
(3 rows)
</code></pre>
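<p>The pipe mechanics can also be tried without a database.  This is a minimal sketch, not part of the original session: <code>cat</code> stands in for the <code>COPY</code> reader, and the scratch directory and the <code>loaded.tsv</code> file name are assumptions made for illustration.  It uses <code>gzip -dc</code>, which does the same job as <code>zcat</code>:</p>
<pre><code>
#!/bin/sh
# Sketch only: `cat` plays the role of the COPY reader from the post.
set -eu

# Scratch directory (hypothetical location), removed when the script exits.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir"

# Tab-delimited sample data, the same three rows as above.
printf 'general\tLee\nsergeant\tHartman\nprivate\tPyle\n' > file
gzip file                        # leaves file.gz behind

mkfifo namedpipe

# Reader: blocks until a writer opens the pipe, like the waiting COPY.
cat namedpipe > loaded.tsv &
reader=$!

# Writer: decompress straight into the pipe; nothing is extracted to disk.
gzip -dc file.gz > namedpipe

# The reader sees end-of-file once the writer closes the pipe.
wait "$reader"
rows=$(awk 'END { print NR }' loaded.tsv)
echo "rows=$rows"                # prints rows=3
</code></pre>
<p>The reader blocks until the writer opens the pipe, which mirrors how the <code>COPY</code> session waits until <code>zcat</code> starts writing.</p>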
<p>So, this is a pretty good way to read compressed files into postgres.  I&#8217;ve seen many articles that use similar methods with postgres dump files, but it&#8217;s just as useful for bulk loading delimited data, since it&#8217;s often prudent to keep bulk data files compressed rather than extract them before loading.  See the postgres <a href="http://www.postgresql.org/docs/8.4/interactive/sql-copy.html">COPY</a> page for more information about this awesome command.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

