I'm working on some SocketChannel-to-SocketChannel code which will do best with a direct byte buffer: long-lived and large (tens to hundreds of megabytes per connection). While hashing out the exact loop structure with FileChannels, I ran some micro-benchmarks comparing ByteBuffer.allocate() and ByteBuffer.allocateDirect() performance.
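For context, the socket-to-socket relay I'm ultimately targeting boils down to the standard read/flip/write/clear loop. A minimal sketch (the channels here are array-backed stand-ins for sockets, and all names are illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public class RelayLoopSketch {
    /** Copies everything from source to target through the given buffer. */
    static long relay(ReadableByteChannel source, WritableByteChannel target,
                      ByteBuffer buffer) throws IOException {
        long total = 0;
        while (source.read(buffer) != -1) {
            buffer.flip();                    // switch the buffer to draining mode
            while (buffer.hasRemaining()) {   // a channel write may be partial
                total += target.write(buffer);
            }
            buffer.clear();                   // pos = 0, limit = capacity
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[300_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = relay(
                Channels.newChannel(new ByteArrayInputStream(data)),
                Channels.newChannel(sink),
                ByteBuffer.allocateDirect(8192)); // long-lived direct buffer
        System.out.println(copied);           // prints 300000
    }
}
```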
There was a surprise in the results that I can't really explain. In the graph below, there is a very pronounced cliff at 256KB and 512KB for the ByteBuffer.allocate() transfer implementation: the performance drops by ~50%! There also seems to be a smaller performance cliff for ByteBuffer.allocateDirect(). (The %-gain series helps to visualize these changes.)
[Graph: Buffer Size (bytes) versus Time (ms)]
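For reference, the %-gain series plotted above is just the relative difference between the two timings at each buffer size; it was computed as something like this (the method name and sign convention are illustrative):

```java
public class GainSeries {
    /** Relative advantage of the direct transfer over the heap transfer, in percent. */
    static double percentGain(long heapMillis, long directMillis) {
        return 100.0 * (heapMillis - directMillis) / heapMillis;
    }

    public static void main(String[] args) {
        // e.g. heap copy took 100ms, direct copy took 50ms
        System.out.println(percentGain(100, 50)); // prints 50.0
    }
}
```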
Why the odd performance curve differential between ByteBuffer.allocate() and ByteBuffer.allocateDirect()? What exactly is going on behind the curtain?
It may very well be hardware- and OS-dependent, so here are those details:
- MacBook Pro w/ Dual-core Core 2 CPU
- Intel X25M SSD drive
- OSX 10.6.4
Source code, by request:
package ch.dietpizza.bench;

import static java.lang.String.format;
import static java.lang.System.out;
import static java.nio.ByteBuffer.*;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public class SocketChannelByteBufferExample {
    private static WritableByteChannel target;
    private static ReadableByteChannel source;
    private static ByteBuffer          buffer;

    public static void main(String[] args) throws IOException, InterruptedException {
        long timeDirect;
        long normal;
        out.println("start");

        for (int i = 512; i <= 1024 * 1024 * 64; i *= 2) {
            buffer = allocateDirect(i);
            timeDirect = copyShortest();

            buffer = allocate(i);
            normal = copyShortest();

            out.println(format("%d, %d, %d", i, normal, timeDirect));
        }

        out.println("stop");
    }

    private static long copyShortest() throws IOException, InterruptedException {
        int result = 0;
        for (int i = 0; i < 100; i++) {
            int single = copyOnce();
            result = (i == 0) ? single : Math.min(result, single);
        }
        return result;
    }

    private static int copyOnce() throws IOException, InterruptedException {
        initialize();

        long start = System.currentTimeMillis();

        while (source.read(buffer) != -1) {
            buffer.flip();  // prepare the buffer to be drained
            target.write(buffer);
            buffer.clear(); // pos = 0, limit = capacity
        }

        long time = System.currentTimeMillis() - start;

        rest();
        return (int) time;
    }

    private static void initialize() throws UnknownHostException, IOException {
        InputStream  is = new FileInputStream(new File("/Users/stu/temp/robyn.in")); // 315 MB file
        OutputStream os = new FileOutputStream(new File("/dev/null"));

        target = Channels.newChannel(os);
        source = Channels.newChannel(is);
    }

    private static void rest() throws InterruptedException {
        System.gc();
        Thread.sleep(200);
    }
}
jmh didn't exist in 2010. (Neat project, but I'm not going to run this again 11 years later.) The warmup is in the code, so JIT is not a factor. But all that is irrelevant: the above discussion is not about raw numbers, but about why there is a strange curve difference between two different IO models. – Photocopy