Java JDBC Postgres copyIn not recognizing end of line and padding double quotes

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP



Java JDBC Postgres copyIn not recognizing end of line and padding double quotes



I am trying to load data from Oracle to Greenplum using Java. I store the result set as comma separated values in to byte array input stream and then load it using copy in.


import java.sql.*;
import au.com.bytecode.opencsv.CSVWriter;
import java.io.*;
import org.postgresql.copy.CopyManager;
import org.postgresql.core.BaseConnection;

public class ORtoGP
public static void main(String args) throws SQLException
try
String dbURL = "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxxxx)(PORT = 1521))) (CONNECT_DATA = (SERVICE_NAME = xxxxxx) (SRVR = DEDICATED)))";
String strUserID = "xxxxxx";
String strPassword = "xxxxxx";
Connection myConnection=DriverManager.getConnection(dbURL,strUserID,strPassword);
Statement sqlStatement = myConnection.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
String readRecordSQL = "select id,name from table where rownum <= 10 ";
ResultSet rs = sqlStatement.executeQuery(readRecordSQL);

StringWriter stringWriter = new StringWriter();
CSVWriter csvWriter = new CSVWriter(stringWriter);

rs.first();
csvWriter.writeAll(rs, true);
String orresult = stringWriter.toString();
System.out.println(orresult);

byte bytes = orresult.getBytes();
ByteArrayInputStream orinput = new ByteArrayInputStream(bytes);


String dbURL1 = "jdbc:postgresql://xxxxx:5432/xxxxx";
String user = "xxxx";
String pass = "xxxx";
Connection conn2 = DriverManager.getConnection(dbURL1, user, pass);

CopyManager copyManager = new CopyManager((BaseConnection) conn2);
copyManager.copyIn("copy java_test from stdin with DELIMITER ','",orinput);

rs.close();
myConnection.close();
csvWriter.close();

catch (Exception e)
System.out.println(e);





However, I run in to two issues:





There is an easier way. There are two open source projects that make it a lot easier to load data from Oracle into Greenplum. One is Outsourcer and the other is gplink. Both can be found here: pivotalguru.com
– Jon Roberts
Aug 13 at 14:18





Thanks.. I have already checked Outsourcer.. it has a dependency that you need to run it from a server where gpfdist is installed or something like that. I don't remember the specifics as it has been a while since I tried, but it just cannot be run from 'any machine'. I still tried to get it to work in a specific server, but was getting some Java errors, so I just left it.
– CuriP
Aug 13 at 15:10






Outsourcer has to run on a machine that every host in the cluster can access. So the Master or Standby are the best options since the segment hosts are in a private network.
– Jon Roberts
Aug 13 at 16:43





Thanks.. we have some challenges in getting it run from Master or Standby. DBA has pushed back master option.. we have operational issues between business groups in using stand by. So that option is also a ruled out. So I gave up on it.
– CuriP
Aug 13 at 18:25




1 Answer
1



According to the documentation the default format is text, which does not handle quoting.


text



You need to specify FORMAT csv in your command.


FORMAT csv





Thanks.. I added this - copyManager.copyIn("copy java_test from stdin with csv header",orinput);, and it resolved those issues. However, I get a different error, org.postgresql.util.PSQLException: ERROR: invalid byte sequence for encoding "UTF8": 0x92 Hint: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". I checked the database, and noticed that it does have UTF8 encoding. So does this error imply that incoming data is in a different form of encoding? How can this be fixed? Thanks.
– CuriP
Aug 12 at 5:43





Did you look at the documentation link I provided? Did you see the ENCODING option?
– Jim Garrison
Aug 12 at 5:50


ENCODING





Yes, I did try encoding as first thing. Unfortunately encoding not supported in my Postgres version - 8.2 (postgresql.org/docs/8.2/static/sql-copy.html). I also tried this - orresult.replaceAll("u0000", "");.. but doesn't seem to make much difference.
– CuriP
Aug 12 at 5:59





You will have to re-encode the data as you read it. The issue is not u0000 but 0x92.
– Jim Garrison
Aug 12 at 6:02


u0000


0x92





I add this to the code, and it resolved the encoding issue too.. - byte bytes = orresult.getBytes("UTF8"); . Thank you for taking time.
– CuriP
Aug 12 at 20:39






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

Firebase Auth - with Email and Password - Check user already registered

Dynamically update html content plain JS

How to determine optimal route across keyboard