CSV import to postgres

If you're interested in the relative performance of various CSV parsing libraries, take a look at the csv-game.

But as @sfackler mentioned (and as I mentioned in your follow-up thread), if you want to reduce the end-to-end time of your ETL job, use Postgres' COPY command. If you need to, dump the transformed CSV to a new file and COPY from that. Alternatively, have COPY read from stdin and feed it through a pipeline.
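To make the pipeline idea concrete, here is a minimal sketch of the transform step, assuming the transformation can be done row by row. It is written in Python only for illustration; `transform_row`, the file name `transform.py`, the table `my_table`, and the database `mydb` are all hypothetical names, and the whitespace trimming is a stand-in for whatever your job really does.

```python
import csv


def transform_row(row):
    # Hypothetical transformation: trim whitespace from every field.
    # Replace this with the real per-row logic of your ETL job.
    return [field.strip() for field in row]


def transform_csv(src, dst):
    """Stream CSV rows from src to dst, transforming each one.

    Rows are handled one at a time, so memory use stays flat
    regardless of input size. Returns the number of rows written.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    count = 0
    for row in reader:
        writer.writerow(transform_row(row))
        count += 1
    return count
```

Saved as `transform.py`, it could be wired into COPY roughly like this: `python -c "import sys, transform; transform.transform_csv(sys.stdin, sys.stdout)" < raw.csv | psql mydb -c "COPY my_table FROM STDIN WITH (FORMAT csv, HEADER true)"`. The header row passes through the transform unchanged here, which is why `HEADER true` is set; drop that option if your input has no header.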

I've seen people ingest more than 50 GB of data into a database using INSERT statements, and it took a few weeks each time. Once the job was converted to a bulk upload, the same load took hours.
