<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Schadenfreude &#187; databases</title>
	<atom:link href="http://www.ralree.com/tag/databases/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ralree.com</link>
	<description>Malicious enjoyment derived from observing someone else's misfortune</description>
	<lastBuildDate>Sun, 28 Feb 2010 04:18:37 +0000</lastBuildDate>
	
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Reading compressed files with postgres using named pipes</title>
		<link>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/</link>
		<comments>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/#comments</comments>
		<pubDate>Fri, 04 Sep 2009 06:38:55 +0000</pubDate>
		<dc:creator>Erik</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[awesome]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[dba]]></category>
		<category><![CDATA[dump]]></category>
		<category><![CDATA[postgres]]></category>

		<guid isPermaLink="false">http://www.ralree.com/?p=22661</guid>
		<description><![CDATA[Postgres has the same type of ability MySQL has to read in files, yet much nicer syntax.  LOAD DATA INFILE from MySQL is just COPY in postgres.  I decided to try having it read from a named pipe today, and it worked out nicely.

I started out making a test db and making a [...]]]></description>
			<content:encoded><![CDATA[<p>Postgres can bulk-load files just as MySQL can, but with much nicer syntax.  MySQL&#8217;s <code>LOAD DATA INFILE</code> is simply <code>COPY</code> in postgres.  I decided to try having it read from a named pipe today, and it worked out nicely.<br />
<span id="more-22661"></span><br />
I started out making a test db and making a nice little schema:</p>
<pre><code>
postgres@tardis:~$ createdb test
postgres@tardis:~$ psql test
psql (8.4.0)
Type "help" for help.

test=# CREATE TYPE rank AS ENUM ('general', 'sergeant', 'private');
CREATE TYPE
test=# CREATE TABLE military (id SERIAL PRIMARY KEY,
test(#   name VARCHAR(128),
test(#   rank rank);
NOTICE:  CREATE TABLE will create implicit sequence "military_id_seq" for serial column "military.id"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "military_pkey" for table "military"
CREATE TABLE
</code></pre>
<p>Notice the use of <code>SERIAL</code>?  That&#8217;s postgres&#8217; <code>AUTO_INCREMENT</code>, basically.  I like it better.  Next, it&#8217;s time to make a text file with some data and compress it.  Here&#8217;s what I put in the file (note that the spaces between the words are <code>TAB</code> characters):</p>
<pre><code>
general Lee
sergeant  Hartman
private Pyle
</code></pre>
<p>And compress it with <code>gzip</code>, making a nice little file:</p>
<pre><code>
hank@tardis:/tmp$ gzip file
hank@tardis:/tmp$ zcat file.gz
general	Lee
sergeant	Hartman
private	Pyle
</code></pre>
<p>Now let&#8217;s actually make a named pipe for postgres to read from:</p>
<pre><code>
hank@tardis:/tmp$ mkfifo namedpipe
</code></pre>
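<p>Now that we have our named pipe, let&#8217;s start reading from it.  Since <code>COPY</code> with a file name is read by the server process, the pipe has to be readable by the postgres user.  The command will block until something opens the pipe for writing:</p>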
<pre><code>
test=# COPY military (rank, name) FROM '/tmp/namedpipe' WITH DELIMITER E'\t';
</code></pre>
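<p>The <code>E</code> prefix tells postgres to interpret backslash escapes inside the single-quoted string, so <code>E'\t'</code> is an actual tab character.  (Tab is already the default delimiter for <code>COPY</code>&#8217;s text format, so the clause is optional here, but it makes the intent explicit.)  All that we have to do now is use <code>zcat</code> to write the decompressed data into the pipe:</p>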
<pre><code>
hank@tardis:/tmp$ zcat file.gz > namedpipe
</code></pre>
<p>Immediately, there&#8217;s some output in the psql session:</p>
<pre><code>
COPY 3
</code></pre>
<p>So, postgres says it got 3 records successfully.  Yay!  Now, let&#8217;s display them:</p>
<pre><code>
test=# select * from military;
 id |  name   |   rank
----+---------+----------
  1 | Lee     | general
  2 | Hartman | sergeant
  3 | Pyle    | private
(3 rows)
</code></pre>
<p>So, this is a pretty good method for reading compressed files into postgres.  I&#8217;ve seen many articles that use similar methods with postgres dump files, but it&#8217;s useful for bulk loading of delimited data as well, since bulk data files are often kept compressed and it&#8217;s wasteful to extract them to disk just to load them.  See the postgres <a href="http://www.postgresql.org/docs/8.4/interactive/sql-copy.html">COPY</a> page for more information about this awesome command.</p>
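<p>To see the pipe mechanics in isolation, here is a minimal Python sketch with no postgres involved.  It assumes a Unix-like system (<code>os.mkfifo</code> isn&#8217;t available elsewhere), the function name <code>load_through_fifo</code> is made up for illustration, and the reader loop merely stands in for <code>COPY</code>: a writer thread decompresses into the pipe while the reader consumes plain tab-delimited lines.</p>
<pre><code>
```python
import gzip
import os
import tempfile
import threading


def load_through_fifo(rows):
    """Write rows to a gzip file, then stream them back through a named pipe."""
    workdir = tempfile.mkdtemp()
    gz_path = os.path.join(workdir, "file.gz")
    fifo_path = os.path.join(workdir, "namedpipe")

    # Stand-in for creating the tab-delimited file and running `gzip file`.
    with gzip.open(gz_path, "wt") as gz:
        for rank, name in rows:
            gz.write(rank + "\t" + name + "\n")

    os.mkfifo(fifo_path)  # stand-in for `mkfifo namedpipe`

    def writer():
        # Stand-in for `zcat file.gz > namedpipe`. Opening the FIFO for
        # writing blocks until a reader has opened the other end.
        with gzip.open(gz_path, "rb") as src, open(fifo_path, "wb") as pipe:
            for chunk in iter(lambda: src.read(65536), b""):
                pipe.write(chunk)

    thread = threading.Thread(target=writer)
    thread.start()

    # The reader plays the role of COPY: it only ever sees decompressed text,
    # and its open() blocks until the writer shows up.
    with open(fifo_path, "r") as pipe:
        loaded = [tuple(line.rstrip("\n").split("\t")) for line in pipe]

    thread.join()
    os.remove(fifo_path)
    os.remove(gz_path)
    os.rmdir(workdir)
    return loaded


if __name__ == "__main__":
    sample = [("general", "Lee"), ("sergeant", "Hartman"), ("private", "Pyle")]
    print(load_through_fifo(sample))
```
</code></pre>
<p>The decompressed data never lands on disk; it only passes through the pipe, which is the whole point of the trick.</p>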
]]></content:encoded>
			<wfw:commentRss>http://www.ralree.com/2009/09/04/reading-compressed-files-with-postgres-using-named-pipes/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Pitfalls with digital health records</title>
		<link>http://www.ralree.com/2009/04/08/pitfalls-with-digital-health-records/</link>
		<comments>http://www.ralree.com/2009/04/08/pitfalls-with-digital-health-records/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 01:36:38 +0000</pubDate>
		<dc:creator>Erik</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[government]]></category>
		<category><![CDATA[health]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[security]]></category>

		<guid isPermaLink="false">http://www.ralree.com/?p=22497</guid>
		<description><![CDATA[The more I hear about digital national health records, the more I worry about them with regards to security.  Various interpretations of the new legislation in the 2009 Stimulus bill could mean anything from implementing something like SAFEHealth, a decentralized system, to something like Google Health, which would centralize medical records.  I expect [...]]]></description>
			<content:encoded><![CDATA[<p>The more I hear about digital national health records, the more I worry about them with regard to security.  Various interpretations of the new legislation in the 2009 Stimulus bill could mean anything from implementing something like <a href="http://www.safehealthinfo.org/default.htm">SAFEHealth</a>, a decentralized system, to something like Google Health, which would centralize medical records.  I expect that the government will not choose a decentralized system.  Proper usage of a decentralized system would be fine, but would remove a lot of the utility promised by proponents of electronic health records, such as the possibility of access to updated health records from anywhere.  I&#8217;d like to start off with an alarming quote I found in <a href="http://www.technologyreview.com/biotech/21428/">this interview</a> with Karen Bell, director of the Office of Health IT Adoption at the U.S. Department of Health and Human Services:</p>
<blockquote><p>TR: What about the public-health benefits? Systems that house large quantities of patient data could enable new types of research studies.</p>
<p>KB: Absolutely, that&#8217;s something I get really excited about. It will totally break open our knowledge base. For example, I have been diagnosed with low-pressure glaucoma, which is fairly unusual. No one knows what causes it. I would love to be able to search the system for anyone with this form of glaucoma and start to look for similarities.</p></blockquote>
<p><span id="more-22497"></span><br />
I&#8217;d love to be able to do that too, except it would potentially violate the privacy rights of all of those individuals if they hadn&#8217;t specifically agreed to let you see their records.  If they were to elect to share their information to help others find similarities as she suggested, that would be fine, but we should not assume everyone will do that, and we would need a process for making this election upon diagnosis.</p>
<p>The first issue to cover is whether the Internet will be used as the medium of record transfer, whether point-to-point connections will be established using the phone system or another network, or whether an entirely new network will be created to facilitate these transfers, like the financial network.  This article assumes the first option, especially since citizens will supposedly have access to the information online.  A separate network would be a much better solution, but would cost much more to deploy.</p>
<h2>Why does it matter?</h2>
<p>What are the non-privacy-related implications of Internet-accessible health records vs. them being on paper in a drawer?  Most of them have to do with hacking, bribery and blackmail.  Let&#8217;s say someone pays the Database Administrator of the health system $1,000 to change your health records to indicate you saw the doctor about gonorrhea (or they simply use a stolen doctor&#8217;s account, or they&#8217;re a hacker, etc. etc.).  Now, they give you a call, letting you know that they&#8217;ll tell your wife unless you pay $10,000.  Blackmail is a huge possibility.  This is possible now, but only by those who work with patient information physically.</p>
<p>One effect of a centralized, hackable database of health records would be the illegal issuance of prescriptions for drugs like Valium, Oxycontin, etc. for a fee. All that would have to happen is a falsified entry into the database, and you could go down to the store and pick up your bottle. I don&#8217;t necessarily oppose loosening rules on prescription drugs, but creating a new electronic black market for health record falsification could prove dangerous. Taking this possibility further, an attacker could also remove prescriptions from the system, possibly endangering lives.</p>
<h2>So we should just use paper?  Come on!</h2>
<p>Now, I&#8217;m not necessarily against digitizing health records.  If each citizen, on the initialization of their record, were given a private key, and all the records were encrypted with the matching public key and kept in a large central database, that would be fine.  Yet there are problems with this too, since in an emergency the health records wouldn&#8217;t be accessible unless the patient was conscious and able to type their passphrase.  Therefore, there would have to be an override of some sort, which would destroy the security of the system.  This override could be a &#8220;health safety deposit box&#8221; provided to patients optionally by a private corporation, which would contain their passphrase for emergency use, and would be authorized for query by the living will of the patient.  This is the only possible way I can see for centralized health records to be implemented securely, but it seems to be unworkable at the moment.</p>
<p>So what about decentralization, which is what we currently have with paper and with the SAFEHealth system?  If the records were kept by the doctor, and encrypted with both his and the patient&#8217;s public keys (for patient confidentiality), that would be secure.  Of course, assuming the medium of transfer is the Internet, the decryption and changes would have to be done on a standalone computer to prevent the cleartext from being retrievable from the Internet, and any transfer to another office would involve re-encrypting the files with the other doctor&#8217;s public key, transferring the result to an Internet-enabled machine, and reversing the process on the other end to read the records.  Because this is painful and time-consuming, doctors and administrative assistants (I like &#8216;secretaries&#8217; better, but whatever) would obviously skirt the security here.  And human involvement to decrypt would still be needed in emergencies.  I&#8217;ve just sent an email to SAFEHealth asking for more information about their system:</p>
<blockquote><p>Hello,<br />
I&#8217;m interested in learning more about how your system works at a deeply technical level.  Could you please point me to an explanation of exactly how records are stored, accessed, encrypted, decrypted, which keys are used, who generates those keys, and what network protocols are used to access the information?  Thanks.</p></blockquote>
<p>The only workable solution seems to be for the patient to sign over to the government the right to make his/her health records potentially public information.  We&#8217;ve seen various scandals involving medical industry employees already, like <a href="http://www.scmagazineus.com/Octomoms-hospital-records-accessed-15-workers-fired/article/129820/">Octomom</a>.  Many people would sign this, especially if they went to different doctors all the time.  Many people don&#8217;t care about their medical records being public, so they&#8217;d do it for convenience.  But a process for removal from the system at any time should be available to all patients.  A system like this might complicate seeking diagnosis for things like alcoholism, opiate addiction, and mental health for fear that one&#8217;s employer might find out about the condition.  Any patient should be able to elect to use paper instead, and be responsible for the transfer of his/her medical records to medical professionals for treatment.</p>
<h2>But it&#8217;s for your own good!</h2>
<p>A dangerous assumption is that we must force the patient to allow doctors access to their medical records for his/her own good.  The Fourth Amendment exists to prevent this very thing from happening.  It could also be argued that random searches of homes would discover meth labs and would save children, but such searches are unacceptable in this country because of our natural right to privacy.  One way to assure access in case of emergency for those who have privacy concerns is by using a living will to allow access to the paper records, assuming they&#8217;re filed somewhere accessible.  Private companies could provide medical record storage facilities for profit, and could be called in case of emergency need of the records (or as I described before, the passphrase to unlock the records).</p>
<p>If one thinks this article is scathing to the whole idea of digital health records, he/she should have a look at <a href="http://www.campaignforliberty.com/article.php?view=36">this one</a>.  While some of the same concerns and many more are brought up, different fears are addressed.  The corruption of government employees would also be a danger (which I touched on with the DBA bribe example earlier), but some of the later examples (the police officer having access) are a little unfounded and paranoid.</p>
<p>Microsoft and Google both have products for storing large amounts of health information.  When stories like <a href="http://www.technewsworld.com/story/Microsoft-Debuts-IE8-Only-to-Have-It-Hacked-66557.html">this</a> are appearing all the time, that really concerns me.  I&#8217;ll finish up with a good quote from <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&amp;articleId=9126279&amp;source=NLT_AM">this article</a>.</p>
<blockquote><p>&#8220;Ironically, HIPAA creates felony penalties if a doctor or hospital abuses the data, but there&#8217;s absolutely no penalties for a Microsoft or a Google because they&#8217;re not covered by the law,&#8221; Brailer said. &#8220;It&#8217;s nothing that they&#8217;re doing wrong. It just shows you the state of mind of Congress when that rule was written 10 years ago, because they never ever envisioned there would be online services managing health information.</p>
<p>&#8220;I think that&#8217;s a very high priority, because one consequence of the President-elect ramping up people&#8217;s attention to this is that people will come back to a lot of their fundamental worries about the protection of their health information,&#8221; Brailer said.</p></blockquote>
<p>I look forward to comments and suggestions for this post.  This is definitely a hot-button issue at the moment, and any constructive criticism will be appreciated, and probably responded to.</p>
<h2>Update 4/10/2009</h2>
<p>So, Lawrence Garber, Principal Investigator for SAFE Health, and I have had a great email thread going on the security details of their system.  It sounds pretty good, but there are still concerns.  Apparently, they use HTTPS over a VPN, which isn&#8217;t a bad solution for network traffic security.  Yet, the last response I received from him indicated the following:</p>
<blockquote><p>There&#8217;s no need or requirement to encrypt the data on the server because it&#8217;s within our physically, password-protected, and firewall secured datacenter. However passwords and credentials are encrypted.</p></blockquote>
<p>So, again, we&#8217;re back to the unhackable datacenter with unencrypted data idea, which, <a href="http://media.www.thenorthernlight.org/media/storage/paper960/news/2006/07/25/News/Uaf-Server.Hack.Discovered.Last.Year-2542582.shtml">from personal experience</a>, isn&#8217;t a good idea.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ralree.com/2009/04/08/pitfalls-with-digital-health-records/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>OSCON Sessions, Day 1</title>
		<link>http://www.ralree.com/2008/07/24/oscon-sessions-day-1/</link>
		<comments>http://www.ralree.com/2008/07/24/oscon-sessions-day-1/#comments</comments>
		<pubDate>Thu, 24 Jul 2008 08:08:00 +0000</pubDate>
		<dc:creator>Erik</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[africa]]></category>
		<category><![CDATA[chisimba]]></category>
		<category><![CDATA[cloud computing]]></category>
		<category><![CDATA[databases]]></category>
		<category><![CDATA[ejabberd]]></category>
		<category><![CDATA[erlang]]></category>
		<category><![CDATA[foscon]]></category>
		<category><![CDATA[mysql]]></category>
		<category><![CDATA[oscon]]></category>
		<category><![CDATA[oscon08]]></category>
		<category><![CDATA[philanthropy]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[ruby]]></category>
		<category><![CDATA[xmpp]]></category>

		<guid isPermaLink="false">http://www.ralree.info/2008/07/24/oscon-sessions-day-1</guid>
		<description><![CDATA[I went to 5 sessions today, and I was pleasantly surprised by most of them.  
CouchDB
CouchDB is a distributed non-relational database written in Erlang.  It is unique in that its main query interface is simply HTTP REST, and for every UPDATE, it simply creates a new version of the row.  Additionally, you [...]]]></description>
			<content:encoded><![CDATA[<p>I went to 5 sessions today, and I was pleasantly surprised by most of them.  </p>
<h2><a href="http://github.com/hank/life/tree/master/oscon/2008/sessions/CouchDB.rdoc">CouchDB</a></h2>
<p>CouchDB is a distributed non-relational database written in Erlang.  It is unique in that its main query interface is simply HTTP REST, and for every UPDATE, it simply creates a new version of the row.  Additionally, you can request the entire history of a row very simply.</p>
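<p>The &#8220;every update creates a new version&#8221; idea can be illustrated with a toy append-only store.  This is only a sketch of the concept as described in the session, not CouchDB&#8217;s actual API or storage format; the class <code>RevisionStore</code> and its methods are hypothetical names.</p>
<pre><code>
```python
class RevisionStore:
    """Toy document store that never overwrites: each update appends a revision."""

    def __init__(self):
        # Maps a document id to its list of revisions, oldest first.
        self._history = {}

    def put(self, doc_id, body):
        revisions = self._history.setdefault(doc_id, [])
        revisions.append(dict(body))  # copy so later edits by the caller don't leak in
        return len(revisions)  # 1-based number of the revision just written

    def get(self, doc_id, rev=None):
        revisions = self._history[doc_id]
        # No revision requested means "latest"; otherwise look up the old one.
        return revisions[-1] if rev is None else revisions[rev - 1]

    def history(self, doc_id):
        return list(self._history[doc_id])


if __name__ == "__main__":
    store = RevisionStore()
    store.put("pyle", {"rank": "private"})
    store.put("pyle", {"rank": "sergeant"})
    print(store.get("pyle"))
    print(store.history("pyle"))
```
</code></pre>
<p>Because nothing is overwritten, reading the latest version and reading the full history are both simple lookups in the same list.</p>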
<h2><a href="http://github.com/hank/life/tree/master/oscon/2008/sessions/Hypertable.rdoc">Hypertable</a></h2>
<p>An open-source implementation of Google&#8217;s Bigtable.  Hypertable uses techniques such as Bloom filters to significantly decrease query times, as well as smart messaging to distribute a database across many nodes.  It is also non-relational.</p>
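<p>For anyone who hasn&#8217;t met Bloom filters before, here is a minimal Python sketch of the general idea.  It is not Hypertable&#8217;s implementation: the class name, the sizes, and the salted SHA-256 hashing are all assumptions chosen for illustration.  The filter can say &#8220;definitely not present&#8221; or &#8220;possibly present&#8221;, which lets a database skip expensive lookups for keys that were never stored.</p>
<pre><code>
```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch simple; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        # Derive num_hashes positions by salting a single hash function.
        for i in range(self.num_hashes):
            salted = ("%d:%s" % (i, key)).encode("utf-8")
            digest = hashlib.sha256(salted).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # If any position is unset, the key was certainly never added.
        return all(self.bits[pos] for pos in self._positions(key))


if __name__ == "__main__":
    seen = BloomFilter()
    seen.add("row-17")
    print(seen.might_contain("row-17"))
```
</code></pre>
<p>A negative answer is always trustworthy, so the costly lookup only has to happen when the filter answers &#8220;possibly&#8221;.</p>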
<h2><a href="http://github.com/hank/life/tree/master/oscon/2008/sessions/Africa.rdoc">Creating and supporting Free Software in Africa</a></h2>
<p>A group of CS professors hailing from Africa have gotten together to create a community that fosters creativity and innovation from people in Africa.  People in first-world countries can participate by acting as mentors, or directly contribute to the projects involved.  <a href="http://en.wikipedia.org/wiki/Chisimba">Chisimba</a> is an open-source MVC framework for rapid application development.  I am very interested in contributing to this project.</p>
<h2><a href="http://github.com/hank/life/tree/master/oscon/2008/sessions/LucidDB.rdoc">LucidDB</a></h2>
<p>I thought going in that this would be somehow in the same ballpark as Hypertable and CouchDB, but I was disappointed.  Basically, they are using compression and some fairly neat indexing to speed up traditional database queries.  The main problem is that they only have a Java API, which completely turned me off after 30 minutes.  Before that, it seemed like they were getting some pretty promising results.  If they add some more APIs in the future, this may be another one to take a look at.</p>
<h2><a href="http://github.com/hank/life/tree/master/oscon/2008/sessions/History.of.Failure.rdoc">A History of Failure</a></h2>
<p>An awesome talk by Paul Fenwick from Australia, generally detailing failures in computer science and engineering going back into the 20th century and even back to Roman times.  This was a wonderful presentation &#8211; he&#8217;s a really good speaker &#8211; and it poked a <em>lot</em> of fun at New Zealand.</p>
<p>All in all, I must say that this OSCON is much better than last year&#8217;s, at least according to what I was looking for in the sessions.  The exhibit hall is also very good this year &#8211; I&#8217;m pretty loaded down with swag at the moment.  </p>
<p>I know someone who would have gotten a kick out of <a href="http://en.oreilly.com/oscon2008/public/schedule/detail/4549">Temporally Quaquaversal Virtual Nanomachine Programming In Multiple Topologically Connected Quantum-Relativistic Parallel Timespaces&#8230;Made Easy!</a> had they been here.  He needs to come next year (you know who you are..)</p>
<p>Tonight, I also attended <a href="http://pdxfoscon.org/">FOSCON 4: Cooking with Ruby</a>.  This was a spectacular event hosted by Cubespace.  I have to say that the live coding competition was a great spectacle, and held everyone&#8217;s attention for hours.  It was an epic battle between Symfony, Rails, Smalltalk/Seaside, and Drupal.  The rankings ended up being the following:</p>
<ol>
<li>Rails</li>
<li>Drupal</li>
<li>Symfony</li>
<li>Smalltalk/Seaside</li>
</ol>
<p>The presentations were good as well for the most part (notes <a href="http://github.com/hank/life/tree/master/oscon/2008/foscon/Notes.rdoc">here</a>).  <strong>AND THEY HAD BEER!</strong> I had some of the best keg beer imaginable &#8211; I thought it would be crap like you usually get out of a keg, but this was real quality Northwestern hopped pale ale.  My cup says Bridgeport Ales, so I&#8217;ll have to investigate.  If anyone knows the exact beer that was available in the left-side keg tonight, I&#8217;d appreciate a comment.  I also met some cool people, some of whom are really into XMPP and ejabberd.  I may have to check all of that out now&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ralree.com/2008/07/24/oscon-sessions-day-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

