Most efficient solution for reading CLOB to String, and String to CLOB in Java?

I have a big CLOB (more than 32 kB) that I want to read into a String, using StringBuilder. How do I do this in the most efficient way? I cannot use the "int length" constructor for StringBuilder since the length of my CLOB is longer than an "int" and needs a "long" value. I am not that comfortable with the Java I/O classes, and would like to get some guidance. Edit - I have tried with this code for clobToString():

    private String clobToString(Clob data) {
        StringBuilder sb = new StringBuilder();
        try {
            Reader reader = data.getCharacterStream();
            BufferedReader br = new BufferedReader(reader);
            String line;
            while (null != (line = br.readLine())) {
                sb.append(line);
            }
            br.close();
        } catch (SQLException e) {
            // handle this exception
        } catch (IOException e) {
            // handle this exception
        }
        return sb.toString();
    }
asked Jan 30, 2010 at 22:36

What exactly do you want to do once you have read the CLOB into a String?

Commented Jan 30, 2010 at 22:38

Do you mean CLOB in the database sense, or just "large string"?

Commented Jan 30, 2010 at 22:41

Yes, it is a CLOB from a DB2 database.

Commented Jan 30, 2010 at 22:42

The CLOB contains a large XML string that will be passed to JAXB.

Commented Jan 30, 2010 at 22:56

I am wondering if there are any helpful classes in Java NIO for this.

Commented Jan 31, 2010 at 9:47

12 Answers

OK, I will assume general use. First you have to download Apache Commons IO; there you will find a utility class named IOUtils, which has a method named copy().

Now the solution is: get the input stream of your CLOB object using getAsciiStream() and pass it to the copy() method.

    InputStream in = clobObject.getAsciiStream();
    StringWriter w = new StringWriter();
    IOUtils.copy(in, w);
    String clobAsString = w.toString();
answered Jan 30, 2010 at 22:54 Omar Al Kababji

Thanks, that looks nice. But I will leave the question open a little longer, because I would prefer a solution that only uses the standard library.

Commented Jan 31, 2010 at 9:46

I already have the Apache Commons library loaded so this is the perfect solution. Thanks!

Commented Jun 2, 2011 at 14:47

getAsciiStream will give you headaches if you use Unicode (or any characters falling outside of ASCII).

Commented Sep 29, 2011 at 12:41

I changed InputStream to Reader and clobObject.getAsciiStream() to clobObject.getCharacterStream() to prevent encoding issues.

Commented Jun 11, 2014 at 7:52

IOUtils.copy(in, w) is deprecated, use IOUtils.copy(in, w, StandardCharsets.UTF_8) instead.

Commented Oct 5, 2021 at 11:35

What's wrong with:

clob.getSubString(1, (int) clob.length()); 

For example, Oracle's oracle.sql.CLOB performs getSubString() on an internal char[] that is defined in oracle.jdbc.driver.T4CConnection: it just calls System.arraycopy() and then wraps the result in a String. You will never get faster reading than System.arraycopy().

UPDATE: Get the ojdbc6.jar driver, decompile the CLOB implementation, and study which approach could be faster based on knowledge of the internals.
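As a rough sketch of the getSubString() approach with the length question from the comments handled explicitly: the class name ClobSubstring and the choice to throw IllegalArgumentException are assumptions made for illustration, not something taken from a driver. It only shows that the cast to int is safe once the length has been checked.

```java
import java.sql.Clob;
import java.sql.SQLException;

// Hypothetical helper class, named for illustration only.
final class ClobSubstring {

    /** Reads the whole CLOB in one call; rejects values that cannot fit in a String. */
    static String toString(Clob clob) throws SQLException {
        long length = clob.length();
        if (length == 0) {
            // Some Clob implementations reject position 1 on an empty value.
            return "";
        }
        if (length > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                    "CLOB too large for a String: " + length + " characters");
        }
        // CLOB positions are 1-based, so the first character is at position 1.
        return clob.getSubString(1, (int) length);
    }
}
```

Whether this beats a streaming read depends on the driver, so it is worth measuring against your own database.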

answered Jul 1, 2014 at 15:41

Leaves a lot of newline characters in the string.

Commented Sep 16, 2014 at 19:58

@Gervase Newlines can be significant in XML. Anyway, you should trim useless spaces and newlines before storing it in the DB.

Commented May 23, 2016 at 14:43

Some points to clear up: What happens if clob.length() is greater than Integer.MAX_VALUE? Which jar contains oracle.sql.CLOB?

Commented May 26, 2016 at 14:33

@Stephan I studied ojdbc6.jar. Integer.MAX_VALUE is the limit for array length in the JDK, and String holds its chars in an array. So you are out of luck for CLOBs larger than 2 GiB. Try a streaming approach, because you can't hold that much data within the pure Java memory model (unless you use some native extension and a 64-bit platform with enough system memory).

Commented May 27, 2016 at 14:24

@MarekBernád OK. I believe you have the problem because you cross transaction / connection boundaries. That is the problem with cumbersome frameworks that hide resource management. If you are in a managed EE environment, access the getter inside @Transactional. If you care about efficiency, Hibernate is not a good framework.

Commented Jul 8, 2020 at 16:12

My answer is just a flavor of the same. But I tested it by serializing zipped content and it worked. So I can trust this solution, unlike the one offered first (which uses readLine), because that one ignores line breaks and corrupts the input.

    /**
     * From CLOB to String.
     *
     * @return string representation of clob
     */
    private String clobToString(java.sql.Clob data) {
        final StringBuilder sb = new StringBuilder();
        try {
            final Reader reader = data.getCharacterStream();
            final BufferedReader br = new BufferedReader(reader);
            int b;
            while (-1 != (b = br.read())) {
                sb.append((char) b);
            }
            br.close();
        } catch (SQLException e) {
            log.error("SQL. Could not convert CLOB to string", e);
            return e.toString();
        } catch (IOException e) {
            log.error("IO. Could not convert CLOB to string", e);
            return e.toString();
        }
        return sb.toString();
    }
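To make the difference between the two loops concrete, here is a small standalone comparison. It uses a plain Reader rather than a Clob so it runs without a database; the class and method names (LineBreakDemo, viaReadLine, viaRead) are made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical demo class, named for illustration only.
final class LineBreakDemo {

    /** Same loop shape as the question's code: readLine() strips line terminators. */
    static String viaReadLine(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line);
            }
        }
        return sb.toString();
    }

    /** Same loop shape as this answer's code: read() returns every character, including line breaks. */
    static String viaRead(Reader source) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(source)) {
            int c;
            while ((c = br.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }
}
```

For the input "a\nb\r\nc", viaReadLine(new StringReader(...)) returns "abc", while viaRead returns the input unchanged.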
answered Dec 10, 2012 at 3:42 Stan Sokolov

Good job, thank you

Commented Feb 2, 2022 at 17:34

I can not use the "int length" constructor for StringBuilder since the length of my CLOB is longer than a int and needs a long value.

If the CLOB length is greater than fits in an int, the CLOB data won't fit in a String either. You'll have to use a streaming approach to deal with this much XML data.

If the actual length of the CLOB is smaller than Integer.MAX_VALUE , just force the long to int by putting (int) in front of it.
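A minimal sketch of that idea, assuming you want a presized StringBuilder: Math.toIntExact is swapped in here for the plain (int) cast, because it throws ArithmeticException on overflow instead of silently truncating. The class name PresizedClobReader and the 8192-char buffer size are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;

// Hypothetical helper class, named for illustration only.
final class PresizedClobReader {

    static String read(Clob clob) throws SQLException, IOException {
        // Throws ArithmeticException if the length does not fit in an int.
        int capacity = Math.toIntExact(clob.length());
        StringBuilder sb = new StringBuilder(capacity);
        char[] buffer = new char[8192];
        try (Reader reader = clob.getCharacterStream()) {
            int n;
            // Copy in chunks rather than one character at a time.
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        }
        return sb.toString();
    }
}
```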

answered Jan 31, 2010 at 11:19

Indeed, if the CLOB size is bigger than 2^32 bytes, you've got big problems.

Commented Jan 31, 2010 at 11:55

I would suggest writing it to a file if he needs the whole CLOB to process.

Commented Jan 5, 2016 at 10:39
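If the data really is too large for a String, one possible shape for the file-based streaming approach is sketched below. The class name ClobToFile, the UTF-8 target encoding, and the buffer size are assumptions for illustration; the point is only that the CLOB never has to sit in memory as a whole.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Clob;
import java.sql.SQLException;

// Hypothetical helper class, named for illustration only.
final class ClobToFile {

    /** Streams the CLOB to a UTF-8 file and returns the number of chars copied. */
    static long copy(Clob clob, Path target) throws SQLException, IOException {
        try (Reader reader = clob.getCharacterStream();
             Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            long total = 0;
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
                total += n;
            }
            return total;
        }
    }
}
```

The resulting file can then be handed to whatever processes the XML as a stream.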

If you really must use only standard libraries, then you just have to expand on Omar's solution a bit. (Apache's IOUtils is basically just a set of convenience methods that save a lot of coding.)

You are already able to get the input stream through clobObject.getAsciiStream()

You just have to "manually transfer" the characters to the StringWriter:

    InputStream in = clobObject.getAsciiStream();
    Reader read = new InputStreamReader(in);
    StringWriter write = new StringWriter();
    int c = -1;
    while ((c = read.read()) != -1) {
        write.write(c);
    }
    write.flush();
    String s = write.toString();

Bear in mind that

  1. If your clob contains more characters than would fit in a String, this won't work.
  2. Wrap the InputStreamReader and StringWriter with BufferedReader and BufferedWriter respectively for better performance.
answered Jan 31, 2010 at 12:09

That looks similar to the code I provided in my question. Are there any key differences between them that I don't see? For example, from a performance point of view?

Commented Jan 31, 2010 at 12:41

Oops, I missed your code fragment! It's somewhat similar, but bear in mind that by just grabbing BufferedReader.readLine(), you'll miss out on the line breaks.

Commented Feb 1, 2010 at 0:36

Small correction: line 2 should be Reader read = new InputStreamReader(in);

Commented Jun 4, 2012 at 7:15

No, no, no. getAsciiStream() forces ASCII encoding and corrupts all non-ASCII-characters. What you're doing is getting an InputStream (bytes) from a character source, and then immediately turning them back into characters using a random (platform default) encoding on InputStreamReader . It's a redundant operation except for the fact that it corrupts non-ASCII data. Just read from the getCharacterStream() Reader directly and write to the StringWriter .

Commented Sep 20, 2012 at 12:02
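A minimal standard-library sketch of what that comment describes, reading from getCharacterStream() straight into a StringWriter with no byte stream in between. It assumes Java 10 or later for Reader.transferTo; on older versions a char[] buffer loop does the same job. The class name ClobCharacters is made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringWriter;
import java.sql.Clob;
import java.sql.SQLException;

// Hypothetical helper class, named for illustration only.
final class ClobCharacters {

    static String read(Clob clob) throws SQLException, IOException {
        StringWriter out = new StringWriter();
        // Characters in, characters out: no charset is involved at any point.
        try (Reader reader = clob.getCharacterStream()) {
            reader.transferTo(out); // Reader.transferTo requires Java 10+
        }
        return out.toString();
    }
}
```

Because no bytes are decoded, non-ASCII characters and line breaks come through untouched.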

If using Mule, follow the steps below.

Enable streaming in the connector i.e. progressiveStreaming=2

Typecast the CLOB returned by DB2 to java.sql.Clob (IBM supports this type cast)

Convert that to a character stream (an ASCII stream sometimes may not support some special characters), so you may need to use getCharacterStream()

That will return a "reader" object which can be converted to a "String" using commons-io (IOUtils).

So in short, use a Groovy component and add the code below.

    clobTest = (java.sql.Clob) payload.field1
    bodyText = clobTest.getCharacterStream()
    targetString = org.apache.commons.io.IOUtils.toString(bodyText)
    payload.PAYLOADHEADERS = targetString
    return payload

Note: Here I'm assuming "payload.field1" is holding clob data.