Tuesday, 12 November 2019 14:38

Reading External File

Written by http://blog.mclaughlinsoftware.com/2019/11/13/reading-external-file/
Rate this item
(0 votes)

I’m working on items for migrating my database class from Oracle to PostgreSQL. I ran into an interesting limitation when I tried using the COPY command to read an external CSV

file.

I had prepared the system by creating a new directory hierarchy owned by the postgres user on top of a /u01/app mount point. I set the ownership of the directories and files with the following command from the /u01/app mount point:

chown-R postgres:postgres postgres

chown -R postgres:postgres postgres

After running the following command:

COPY transaction_uploadFROM'/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

The command raised the following error:

COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;ERROR: must be superuser or a member of the <code>pg_read_server_files</code> role to COPY from a fileHINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV; ERROR: must be superuser or a member of the <code>pg_read_server_files</code> role to COPY from a file HINT: Anyone can COPY to stdout or from stdin. psql's \copy command also works for anyone.

The two options for fixing the problem are: Changing the student user to a superuser, and granting the pg_read_server_files role to the student user. Changing the student user to a superuser isn’t really a practical option. So, I connected as the postgres superuser and granted the pg_read_server_files role to the student user. It is a system level role and therefore doesn’t limit the role to only the videodb database.

As the postgres user, type the following command to grant the pg_read_server_files role to the system user:

GRANT pg_read_server_files TO student;

GRANT pg_read_server_files TO student;

After granting the role to the student user, I created a small test case. The test table definition is:

CREATETABLE test( id INTEGER, first_name VARCHAR(20), last_name VARCHAR(20));

CREATE TABLE test ( id integer , first_name varchar(20) , last_name varchar(20));

I created a test.csv file in the /u01/app/upload/postgres directory, like:

1,Simon,Bolivar2,Peter,Davenport3,Michael,Swan

1,Simon,Bolivar 2,Peter,Davenport 3,Michael,Swan

The test.csv file requires the following permissions and ownerships:

-rw-r--r--. 1 postgres postgres 49 Nov 1310:56 test.csv

-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 test.csv

The permissions are user read-write, groups read, and others read. The ownership should be granted to postgres and the primary group for the postgres user, which should also be postgres.

You can then connect to psql as the student user with the database set to videodb and run the following copy command:

COPY transaction_uploadFROM'/u01/app/upload/postgres/test.csv' DELIMITERS ',' CSV;

COPY transaction_upload FROM '/u01/app/upload/postgres/test.csv' DELIMITERS ',' CSV;

If you put a comma at the end of each line, like you would do in MySQL, it raises an error. The trailing comma raises the following error:

ERROR: extra data after last expected column

ERROR: extra data after last expected column

If you forget a delimiting commas somewhere on a line, the copy command raises the following error:

ERROR: missing data for column "last_name"CONTEXT: COPY tester, line 3: "3,Michael Swan"

ERROR: missing data for column "last_name" CONTEXT: COPY tester, line 3: "3,Michael Swan"

The error points to the column after the missing column. The context points to the line number while displaying the text.

You should take careful note that the copy command is an appending command. If you run it a second time, you insert a duplicate set of values in the target table.

After experimenting, its time to fix my student instance. The transaction_upload_mysql.csv file has two critical errors that need to be fixed. They are:

  1. A comma terminates each line, which would raise an extra data after last expected column error.
  2. A comma terminates each line followed by some indefinite amount of whitespace, which would also raise an extra data after last expected column error.

Since I have students with little expertise in Unix or Linux commands, I must provide a single command that they can use to convert the file with problems to one without problems. However, they should copy the transaction_upload_mysql.csv file to ensure they don’t disable the equivalent functionality for the MySQL solution space.

They should copy two files as the root user from the mysql directory to the postgres directory, as follows:

cp/u01/app/upload/mysql/transaction_upload_mysql.csv /u01/app/upload/postgres/transaction_upload_postgres.csvcp/u01/app/upload/mysql/transaction_upload2_mysql.csv /u01/app/upload/postgres/transaction_upload2_postgres.csv

cp /u01/app/upload/mysql/transaction_upload_mysql.csv /u01/app/upload/postgres/transaction_upload_postgres.csv cp /u01/app/upload/mysql/transaction_upload2_mysql.csv /u01/app/upload/postgres/transaction_upload2_postgres.csv

As the root user in the /u01/app/upload/postgres directory, run the following command:

cat transaction_upload_postgres.csv |sed-e's/\,$//g'> x; cat x |sed-e's/\,[[:space:]]$//g'> y; mv y transaction_upload_postgres.csv; rm x

cat transaction_upload_postgres.csv | sed -e 's/\,$//g' > x; cat x | sed -e 's/\,[[:space:]]$//g' > y; mv y transaction_upload_postgres.csv; rm x

Please check the file permissions and ownerships with the ll (long list) command. If the file isn’t like this:

-rw-r--r--. 1 postgres postgres 49 Nov 1310:56 transaction_upload_postgres.csv

-rw-r--r--. 1 postgres postgres 49 Nov 13 10:56 transaction_upload_postgres.csv

Then, they should be able to change it as the root user with these commands:

chown postgres:postgres transaction_upload_postgres.csvchmod544 transaction_upload_postgres.csv

chown postgres:postgres transaction_upload_postgres.csv chmod 544 transaction_upload_postgres.csv

Lastly, they should connect to the psql as the student user, using the videodb database and run the following command:

COPY transaction_uploadFROM'/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

COPY transaction_upload FROM '/u01/app/upload/postgres/transaction_upload_postgres.csv' DELIMITERS ',' CSV;

A query of the import table with this:

SELECTCOUNT(*)FROM transaction_upload;

SELECT COUNT(*) FROM transaction_upload;

should return:

 count ------- 11520(1 row)

count ------- 11520 (1 row)

As always, I hope this helps those looking for some explanation and example on the copy feature of PostgreSQL.

Read 43 times

Leave a comment

Make sure you enter all the required information, indicated by an asterisk (*). HTML code is not allowed.