Dump¶
PostgreSQL Migrator names dump the process of exporting schema and data from a source database. Export lands in stdout, files or directly in target PostgreSQL server.
pg_migrate dump tries to have a semantic similar to pg_dump
with flags like --data-only, --section= and more.
When dumping directly to PostgreSQL,
pg_migrate dump crosses the line to reach pg_restore features.
Table priority¶
Table priority is the average row size of the table. As soon as a table is ready to receive data, PostgreSQL Migrator queue COPY task.
Streamed COPY¶
PostgreSQL Migrator implements data transfer using COPY statements. Unlike Foreign Data Wrapper, COPY has absolutely no dependency, making it the most universal data transfer method. PostgreSQL Migrator COPY implementation is very fast and efficient.
When a batch reaches 8MB of data, PostgreSQL Migrator rotates the batch: a new COPY is executed. The actual size of the batch varies depending on the size of the last row of the batch. e.g. a huge row of 9MB fills a batch of one single row. Three rows of 5MB each will produce two batchs: one of 10MB and another for the last row. PostgreSQL driver sends data by messages of 64k to PostgreSQL server.
To summarize: PostgreSQL Migrator sends data to PostgreSQL by chunks of 64k. Once the batch hits 8MB of data, PostgreSQL Migrator finish last row and issues a new COPY statement for next rows..
No Foreign Data Wrapper¶
Foreign Data Wrappers have a few limitations:
- often requires non-free software
- installation is often painful
- intrusive: require creating temporary objects in PostgreSQL
- slow: no bulk copy
- memory hungry: entire batch of row is in memory
Until FDW implements a streamed bulk copy, PostgreSQL Migrator won’t consider it.
Large Object¶
PostgreSQL Migrator chunks LOBs and long columns to prevent OOM. Chunk size varies seconding source DBMS, see table below.
| Column type | Chunk size |
|---|---|
| LONG | Not chunked |
| LONG RAW | Not chunked |
| CLOB | 128kB |
| BLOB | 128kB |
| Column type | Chunk size |
|---|---|
| mediumtext | Not chunked |
| mediumblob | Not chunked |
| longtext | 128kB |
| longblob | 128kB |
PostgreSQL Migrator inlines long columns for tables with average row size below 64KiB.
Table constraints¶
PostgreSQL Migrator serializes creation of table constraints:
- primary key
- unique keys, one by one.
- indexes, one by one.
- foreign keys, one by one.
The dump scheduler parallelize constraints of different tables, but not constraints on the same table. This prevents deadlocks. Each constraint runs in its own transaction.