Skip to content

Dump

PostgreSQL Migrator names dump the process of exporting schema and data from a source database. Export lands in stdout, files or directly in target PostgreSQL server.

pg_migrate dump tries to have a semantic similar to pg_dump with flags like --data-only, --section= and more. When dumping directly to PostgreSQL, pg_migrate dump crosses the line to reach pg_restore features.

Table priority

Table priority is the average row size of the table. As soon as a table is ready to receive data, PostgreSQL Migrator queue COPY task.

Streamed COPY

PostgreSQL Migrator implements data transfer using COPY statements. Unlike Foreign Data Wrapper, COPY has absolutely no dependency, making it the most universal data transfer method. PostgreSQL Migrator COPY implementation is very fast and efficient.

When a batch reaches 8MB of data, PostgreSQL Migrator rotates the batch: a new COPY is executed. The actual size of the batch varies depending on the size of the last row of the batch. e.g. a huge row of 9MB fills a batch of one single row. Three rows of 5MB each will produce two batchs: one of 10MB and another for the last row. PostgreSQL driver sends data by messages of 64k to PostgreSQL server.

To summarize: PostgreSQL Migrator sends data to PostgreSQL by chunks of 64k. Once the batch hits 8MB of data, PostgreSQL Migrator finish last row and issues a new COPY statement for next rows..

No Foreign Data Wrapper

Foreign Data Wrappers have a few limitations:

  • often requires non-free software
  • installation is often painful
  • intrusive: require creating temporary objects in PostgreSQL
  • slow: no bulk copy
  • memory hungry: entire batch of row is in memory

Until FDW implements a streamed bulk copy, PostgreSQL Migrator won’t consider it.

Large Object

PostgreSQL Migrator chunks LOBs and long columns to prevent OOM. Chunk size varies seconding source DBMS, see table below.

Column type Chunk size
LONG Not chunked
LONG RAW Not chunked
CLOB 128kB
BLOB 128kB
Column type Chunk size
mediumtext Not chunked
mediumblob Not chunked
longtext 128kB
longblob 128kB

PostgreSQL Migrator inlines long columns for tables with average row size below 64KiB.

Table constraints

PostgreSQL Migrator serializes creation of table constraints:

  • primary key
  • unique keys, one by one.
  • indexes, one by one.
  • foreign keys, one by one.

The dump scheduler parallelize constraints of different tables, but not constraints on the same table. This prevents deadlocks. Each constraint runs in its own transaction.