java – JNI, the fastest way to cast byte[] to int[] and vice versa


Good day to all!

The task is this: there is a byte[] array received over the network. These bytes actually hold an int[] array that occupies the same amount of memory, so it has four times fewer elements. Ideally one would just take the memory area and reinterpret it in place by rewriting the array's service information (type, length, etc.) instead of copying. The arrays are relatively large (~100 KB), there are many of them, and it is critical to process them quickly.

Solutions tried and their timings (after JIT warm-up):

  1. Looping over the bytes and assembling each int with bitwise operations (dropped immediately as too slow)
  2. ByteBuffer.wrap(byte[]).asIntBuffer().get(int[]): 49200 nanoseconds
  3. JNI with GetByteArrayRegion / SetIntArrayRegion: 14700 nanoseconds
  4. The closed Unsafe API: about 200 nanoseconds on average
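For reference, options 1 and 2 from the list can be sketched roughly as follows. This is only an illustration: the class and method names are made up for this sketch, and big-endian (network) byte order is an assumption, since the post does not say which order the sender uses. Note that a raw in-memory reinterpretation reads the bytes in the machine's native order, so its results match these only when the orders agree.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper class, not part of the original post.
class ByteIntCopy {

    // Option 1: assemble each int from four bytes (big-endian order assumed).
    static int[] viaShifts(byte[] src) {
        int[] dst = new int[src.length / 4];
        for (int i = 0, j = 0; i < dst.length; i++, j += 4) {
            dst[i] = ((src[j] & 0xFF) << 24)
                   | ((src[j + 1] & 0xFF) << 16)
                   | ((src[j + 2] & 0xFF) << 8)
                   |  (src[j + 3] & 0xFF);
        }
        return dst;
    }

    // Option 2: bulk copy through an IntBuffer view of the byte array.
    static int[] viaByteBuffer(byte[] src, ByteOrder order) {
        int[] dst = new int[src.length / 4];
        ByteBuffer.wrap(src).order(order).asIntBuffer().get(dst);
        return dst;
    }
}
```

Both methods allocate and fill a new int[], which is exactly the copy the question is trying to avoid; they are useful mainly as a correctness baseline for the faster variants.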

The research shows that the most interesting option is the cast through the closed Unsafe API. However, it has problems that are related either to caching or to the fact that I am not writing all the information into the array header correctly. Please help me figure it out if anyone knows how this works or feels like digging into it =)

import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class UnsafeUtil {

    public static Unsafe unsafe;
    private static final long INT_ARRAY_HEADER_OFFSET;
    private static final long BYTE_ARRAY_HEADER_OFFSET;
    private static final long ADDRESS_SIZE;
    private static final long ARRAY_TYPE_OFFSET;
    private static final long ARRAY_LENGTH_OFFSET;
    private static final int ARRAY_TYPE_INTS;
    private static final int ARRAY_TYPE_BYTES;
    private static final int ARRAY_LENGTH_INTS = 32768;
    private static final int ARRAY_LENGTH_BYTES = ARRAY_LENGTH_INTS<<2;
    private static final long ARRAY_HEADER_INTS;
    private static final long ARRAY_HEADER_BYTES;

    private static int[] test;

    private static UnsafeUtil instance = new UnsafeUtil();

    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true); // theUnsafe is private; f.get(null) fails without this
            unsafe = (Unsafe) f.get(null);
            int[] ints = new int[ARRAY_LENGTH_INTS];
            byte[] bytes = new byte[ARRAY_LENGTH_BYTES];
            ADDRESS_SIZE = unsafe.addressSize();
            // Offset of the first element in each array type
            INT_ARRAY_HEADER_OFFSET = unsafe.arrayBaseOffset(ints.getClass());
            BYTE_ARRAY_HEADER_OFFSET = unsafe.arrayBaseOffset(bytes.getClass());
            System.err.println("INFO: Initialized binary caster.");
            System.err.println("INFO: address size " + ADDRESS_SIZE);
            System.err.println("INFO: offset int[] " + INT_ARRAY_HEADER_OFFSET);
            System.err.println("INFO: offset byte[] " + BYTE_ARRAY_HEADER_OFFSET);
            if (INT_ARRAY_HEADER_OFFSET != BYTE_ARRAY_HEADER_OFFSET) {
                System.err.println("CRITICAL ERROR: Sorry, i don't know your environment. It seems" +
                        " like you are using something rare (int[] offset != byte[] offset, i don't know " +
                        "how to cast it fast in binary level.)");
            }
            // searching length offset
            // ASSUMPTION: the listing never assigns these two offsets. The values below
            // presume a header of mark word, then a 4-byte class word, then a 4-byte length,
            // which is what the 8-byte reads at ADDRESS_SIZE further down also rely on.
            ARRAY_TYPE_OFFSET = ADDRESS_SIZE;
            ARRAY_LENGTH_OFFSET = ADDRESS_SIZE + 4;
            ARRAY_TYPE_BYTES = unsafe.getInt(bytes, ARRAY_TYPE_OFFSET);
            ARRAY_TYPE_INTS = unsafe.getInt(ints, ARRAY_TYPE_OFFSET);
            // Class word and length of the sample arrays, read as a single 8-byte value
            ARRAY_HEADER_BYTES = unsafe.getLong(bytes, ADDRESS_SIZE);
            ARRAY_HEADER_INTS = unsafe.getLong(ints, ADDRESS_SIZE);
            // warming-up
            long l = 0L;
            for (int i = 0; i < 10000; i++) {
                l += testReinterpretBtI();
            }
            for (int i = 0; i < 10000; i++) {
                l += testReinterpretItB();
            }
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        long l = 0;
        for (int i = 0; i < 10; i++) {
            l += testReinterpretBtI();
        }
        for (int i = 0; i < 10; i++) {
            l += testReinterpretItB();
        }
        System.err.println("AVG CAST TIME WITH WARM-UP: " + (l / 20) + " ns");
    }

    public static byte[] modifyArrayLength(byte[] array, int newLength, long lengthOffset) {
        unsafe.putInt(array, lengthOffset, newLength);
        return array;
    }

    // Overwrites the 8 bytes at ADDRESS_SIZE in the header with those of a sample int[].
    // This copies the sample's length too, so it only fits arrays of ARRAY_LENGTH_BYTES bytes.
    public static int[] reinterpretAsIntArray(byte[] array) {
        unsafe.putLong(array, ADDRESS_SIZE, ARRAY_HEADER_INTS);
        Object o = array;
        return (int[]) o;
    }

    // The reverse operation: stamps the header of a sample byte[] onto an int[].
    public static byte[] reinterpretAsByteArray(int[] array) {
        unsafe.putLong(array, ADDRESS_SIZE, ARRAY_HEADER_BYTES);
        Object o = array;
        return (byte[]) o;
    }

    private static long testReinterpretBtI() {
        byte[] b = new byte[ARRAY_LENGTH_BYTES];
        long t1 = System.nanoTime();
        int[] i = reinterpretAsIntArray(b);
        long t2 = System.nanoTime();
        test = i; // keep a reference so the JIT cannot drop the cast
        return (t2 - t1);
    }

    private static long testReinterpretItB() {
        int[] i = new int[ARRAY_LENGTH_INTS];
        long t1 = System.nanoTime();
        byte[] b = reinterpretAsByteArray(i);
        long t2 = System.nanoTime();
        test = i;
        return (t2 - t1);
    }
}

UPDATE: I now have a rough idea of where to dig and what the problem is. Probably, by writing through Unsafe, I overwrote the object's reference counter, and at some point the garbage collector deleted the object, which then led to an access violation on an attempt to touch the already freed memory. I will think about how to organize the exchange stack so that references do not get messed up. I will probably count references manually, or add +1 to the counter and reuse the same objects, or use an off-heap array. Presumably the latter is the best option.
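The off-heap idea can be tried without Unsafe by using a direct ByteBuffer together with an IntBuffer view of the same memory. The sketch below is only one possible shape for it: the class name and fields are hypothetical, and the byte order has to be chosen to match whatever the sender uses, which the post does not state.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

// Hypothetical wrapper, not part of the original post.
class OffHeapIntView {
    final ByteBuffer bytes; // fill this from the network
    final IntBuffer ints;   // the same memory, seen as ints; nothing is copied

    OffHeapIntView(int intCount, ByteOrder order) {
        // allocateDirect places the storage outside the Java heap
        bytes = ByteBuffer.allocateDirect(intCount * Integer.BYTES).order(order);
        // The view is created once and keeps its own position and limit
        ints = bytes.asIntBuffer();
    }
}
```

Data written through `bytes` is immediately visible through `ints.get(index)` and the other way around, so no object header is touched and the garbage collector always sees consistent objects. The trade-off is that the consumer works with an IntBuffer rather than a plain int[].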

UPDATE: There is a Deque of fixed-size byte[] arrays. The byte[] arrays from the queue serve as buffers for receiving data over the network, to avoid memory allocation. The byte[] data coming from the server to the client is placed into buffers taken from the queue. The client casts the contents of a byte[] buffer to an int[] array and uses it for its own purposes. The cast is done via Unsafe, as in the code above. When the int[] arrays are no longer needed, they are cast back to byte[] and returned to the queue. At some point (apparently after a GC pass) an access violation occurs on an attempt to access the int[].
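A pool like the one described could keep an int view next to every byte[] buffer, so that nothing needs to be cast in either direction. The following is a minimal sketch under stated assumptions: `BufferPool`, `Slot`, `acquire` and `release` are names invented here, the view uses ByteBuffer's default big-endian order, and the pool falls back to allocating when it runs empty.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool, not part of the original post.
class BufferPool {
    /** One pooled buffer: the byte[] that receives data plus an int view of the same array. */
    static final class Slot {
        final byte[] raw;
        final IntBuffer ints;

        Slot(int sizeBytes) {
            raw = new byte[sizeBytes];
            ints = ByteBuffer.wrap(raw).asIntBuffer(); // big-endian view, no copy
        }
    }

    private final Deque<Slot> free = new ArrayDeque<>();
    private final int sizeBytes;

    BufferPool(int slots, int sizeBytes) {
        this.sizeBytes = sizeBytes;
        for (int i = 0; i < slots; i++) {
            free.push(new Slot(sizeBytes));
        }
    }

    // Takes a buffer from the pool, allocating a new one only if the pool is empty.
    synchronized Slot acquire() {
        Slot s = free.poll();
        return s != null ? s : new Slot(sizeBytes);
    }

    // Returns a buffer to the pool once its ints are no longer needed.
    synchronized void release(Slot s) {
        free.push(s);
    }
}
```

The network code writes into `slot.raw`, the client reads through `slot.ints.get(index)`, and the slot goes back with `release`. Since the byte[] stays a byte[] for its whole life, a GC pass cannot run into an inconsistent header.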


Hmm, what about parsing the data received over the network directly into ints? That would really require a low-level implementation, though. And it may not fit, because I don't know how you receive the data.
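One way this suggestion might look with NIO, assuming the data arrives through a ReadableByteChannel (for example a SocketChannel) and the message size is known in advance. The class and method names are hypothetical, and the ints are decoded in the buffer's configured byte order (big-endian unless changed).

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.nio.channels.ReadableByteChannel;

// Hypothetical helper, not part of the original post.
class ChannelIntReader {

    // Fills buf completely from the channel and returns an int view of what was read.
    static IntBuffer readInts(ReadableByteChannel ch, ByteBuffer buf) throws IOException {
        buf.clear();
        while (buf.hasRemaining()) {
            if (ch.read(buf) < 0) {
                throw new EOFException("stream ended before the buffer was full");
            }
        }
        buf.flip();
        return buf.asIntBuffer(); // shares the buffer's memory, no copy
    }
}
```

The same ByteBuffer can be reused for every message, so no per-message allocation or copy is needed; whether this fits depends on how the data is actually received.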
