Postgres has the same ability MySQL has to read in files, with much nicer syntax: MySQL's LOAD DATA INFILE is simply COPY in postgres. I decided to try having it read from a named pipe today, and it worked out nicely.
I started out by creating a test db and a nice little schema:
postgres@tardis:~$ createdb test
postgres@tardis:~$ psql test
psql (8.4.0)
Type "help" for help.

test=# CREATE TYPE rank AS ENUM ('general', 'sergeant', 'private');
CREATE TYPE
test=# CREATE TABLE military (id SERIAL PRIMARY KEY,
test(# name VARCHAR(128),
test(# rank rank);
NOTICE:  CREATE TABLE will create implicit sequence "military_id_seq" for serial column "military.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "military_pkey" for table "military"
CREATE TABLE
Notice the use of SERIAL? That's basically postgres' AUTO_INCREMENT. I like it better. Next, it's time to make a text file with some data and compress it. Here's what I put in the file (note that the separators between the words are tabs):
general	Lee
sergeant	Hartman
private	Pyle
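Since tab characters are invisible and easily mangled by copy/paste, one reliable way to produce the same file is with printf (a sketch; the filename `file` matches the session below):

```shell
# Write the same three rows with explicit tab characters, so the
# delimiter survives copy/paste.
printf 'general\tLee\nsergeant\tHartman\nprivate\tPyle\n' > file

# Double-check the tabs (GNU cat shows tabs as ^I and line ends as $):
cat -A file
# general^ILee$
# sergeant^IHartman$
# private^IPyle$
```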
And compress it with gzip, making a nice little file:
hank@tardis:/tmp$ gzip file
hank@tardis:/tmp$ zcat file.gz
general	Lee
sergeant	Hartman
private	Pyle
Now let’s actually make a named pipe for postgres to read from:
hank@tardis:/tmp$ mkfifo namedpipe
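A named pipe passes bytes between processes without landing on disk, and opening it blocks until both a reader and a writer attach. A quick way to see that behavior without postgres at all (using cat as a stand-in reader; the names here are illustrative):

```shell
mkfifo /tmp/demopipe

# Start a reader in the background; its open blocks until a writer shows up.
cat /tmp/demopipe > /tmp/out &

# Writing to the pipe unblocks the reader; the data never hits the disk.
echo "hello through the pipe" > /tmp/demopipe

wait            # let the background cat finish
cat /tmp/out    # → hello through the pipe
rm /tmp/demopipe
```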
Now that we have our named pipe, let’s start reading from it:
test=# COPY military (rank, name) FROM '/tmp/namedpipe' WITH DELIMITER E'\t';
The E'\t' part tells postgres to interpret escape sequences inside the single-quoted string, turning \t into an actual tab character. All we have to do now is use zcat:
hank@tardis:/tmp$ zcat file.gz > namedpipe
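The whole pattern can be rehearsed end to end without a database, with wc -l standing in for the blocked COPY session (a sketch; all file and pipe names here are illustrative):

```shell
# Build a small compressed, tab-delimited file.
printf 'general\tLee\nsergeant\tHartman\nprivate\tPyle\n' | gzip > /tmp/demo.gz
mkfifo /tmp/demo.pipe

# "Database" side: block on the pipe and count the records that come out.
wc -l < /tmp/demo.pipe &

# Loader side: decompress straight into the pipe; the data is never
# written out uncompressed.
zcat /tmp/demo.gz > /tmp/demo.pipe
wait   # the background wc prints 3

rm /tmp/demo.pipe /tmp/demo.gz
```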
Immediately, postgres reports in the psql session that it read 3 records successfully. Yay! Now, let's display them:
test=# select * from military;
 id |  name   |   rank
----+---------+----------
  1 | Lee     | general
  2 | Hartman | sergeant
  3 | Pyle    | private
(3 rows)
So, this is a pretty good method for reading compressed files into postgres. I've seen many articles that use similar methods with postgres dump files, but it's useful for bulk delimited data loading as well, since it's often prudent to keep bulk data files compressed rather than extracting them before loading. See the postgres COPY page for more information about this awesome command.