UPC Collections

The Java programming language is versatile and robust. However, for large-scale applications with memory-intensive processes, such as those common in scientific and high-performance computing, it still leaves much to be desired. The public Java Development Kit (JDK) provides no means of forcibly destroying objects, and its collections are limited to \(2^{31}-1\) elements, unlike computational languages of choice such as Fortran and C/C++. Within Java’s HotSpot, though, the non-public \(\texttt{sun.misc.Unsafe}\) class fills these gaps with basic off-heap (C heap) functionality. The submitted UPC collections are fully integrated with this class to provide off-heap arrays, lists, hash sets, hash maps, and matrices, allowing the allocation and destruction of very large collections (i.e., up to \(2^{63}-1\) in size) consistent with current advanced computational needs. A customized version of OpenJDK 1.9 with advanced bulk-operation \(\texttt{Unsafe}\) support and companion UPC collections are also provided.

Stock JDK Implementation

The stock version runs on standard Java and relies on six \(\texttt{Unsafe}\) methods. Collections are allocated and destroyed with \(\texttt{allocateMemory}\) and \(\texttt{freeMemory}\). Bulk operations are achieved through \(\texttt{setMemory}\) and \(\texttt{copyMemory}\). Lastly, single entries are read and written with \(\texttt{get<Type>}\) and \(\texttt{put<Type>}\), where \(\texttt{<Type>}\) is the data type: \(\texttt{byte}\), \(\texttt{char}\), \(\texttt{double}\), \(\texttt{float}\), \(\texttt{int}\), \(\texttt{long}\), or \(\texttt{short}\). Therefore, any version of Java supporting these operations can employ the stock UPC library. For example, the authors have successfully run UPC on Windows, Mac, and Linux systems using Oracle and OpenJDK versions of Java.
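
As an illustration of how the six stock methods fit together, the following is a minimal sketch of an off-heap \(\texttt{long}\) array. It is not UPC's own source: the class name \(\texttt{OffHeapLongArray}\) and its methods are hypothetical, and obtaining the \(\texttt{Unsafe}\) instance through the \(\texttt{theUnsafe}\) field by reflection is an assumption about how client code gains access. Newer JDK releases may emit deprecation warnings for these calls.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical off-heap long array using only the six stock Unsafe methods. */
final class OffHeapLongArray {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long ELEMENT_BYTES = 8L; // bytes per Java long

    private final long length; // element count held in a long, not an int
    private long address;      // base address of the block; 0 once freed

    OffHeapLongArray(long length) {
        if (length < 1 || length > Long.MAX_VALUE / ELEMENT_BYTES) {
            throw new IllegalArgumentException("invalid length: " + length);
        }
        this.length = length;
        long bytes = length * ELEMENT_BYTES;
        address = UNSAFE.allocateMemory(bytes);
        // allocateMemory leaves the block uninitialized, so clear it explicitly.
        UNSAFE.setMemory(address, bytes, (byte) 0);
    }

    long get(long index) {
        return UNSAFE.getLong(offset(index));
    }

    void set(long index, long value) {
        UNSAFE.putLong(offset(index), value);
    }

    /** Bulk-copies this array into dest, which must be at least as long. */
    void copyTo(OffHeapLongArray dest) {
        if (dest.address == 0 || dest.length < length) {
            throw new IllegalArgumentException("destination freed or too small");
        }
        UNSAFE.copyMemory(offset(0), dest.address, length * ELEMENT_BYTES);
    }

    /** Releases the block immediately instead of waiting for garbage collection. */
    void free() {
        if (address != 0) {
            UNSAFE.freeMemory(address);
            address = 0;
        }
    }

    // Translates an element index into an absolute address, with bounds checking.
    private long offset(long index) {
        if (address == 0) {
            throw new IllegalStateException("array already freed");
        }
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return address + index * ELEMENT_BYTES;
    }

    // Assumption: the Unsafe singleton is reachable through its private static field.
    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        OffHeapLongArray a = new OffHeapLongArray(4);
        OffHeapLongArray b = new OffHeapLongArray(4);
        a.set(3, 42L);
        a.copyTo(b);
        System.out.println(b.get(0) + " " + b.get(3)); // prints "0 42"
        a.free();
        b.free();
    }
}
```

Because the index and length are \(\texttt{long}\) values and the address is computed by hand, nothing in this layout ties the collection to the \(2^{31}-1\) element limit of on-heap arrays; the practical limit is the memory the operating system will grant.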

Modified JDK Implementation

The modified version requires a customized OpenJDK implementation, which provides advanced, native support for the bulk operations listed below. Its intent is to offload repetitive JNI calls to the JVM, providing a performance boost for tasks such as sorting and searching. These enhancements are not limited to UPC; they are intended as general-purpose methods that integrate easily with existing \(\texttt{Unsafe}\) code.

Bulk operations:

  • A memory zeroing allocator: Java collections are zeroed by default, but \(\texttt{Unsafe}\) simply reserves memory without overwriting existing values, forcing the user to clear the range manually, a potentially expensive operation that is easy to overlook. This addition employs HotSpot’s \(\texttt{Copy::zero_to_bytes}\) routine, which performs the zeroing at the OS level (very fast) and is the routine used by Java’s zeroing collections.
  • Quicksort and counting sort: Basic non-recursive quicksort (for \(\texttt{int}\)-sized and larger data types) and counting sort (for \(\texttt{boolean}\), \(\texttt{byte}\), \(\texttt{char}\), and \(\texttt{short}\)).
  • Linear and binary search: Searches a memory range for a value, linearly or by bisection.
  • Set and remove all: Sets or removes all values in a memory range.
  • Reverse: Reverses the order of the values in a memory range.
  • Contains all: Checks whether one memory range contains all of the values in another.
  • Rehash: Rehashes primitive hash maps and hash sets when growing.
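
For comparison, the sketch below shows how a few of these operations look when written against the stock \(\texttt{Unsafe}\) methods only, with one \(\texttt{Unsafe}\) call per element; this is the kind of repetitive work the modified JDK is described as moving into native code. It does not show the modified JDK's API, whose method names are not given here. The class \(\texttt{StockRangeOps}\) and all of its method names are hypothetical, and the ranges are assumed to hold \(\texttt{int}\) values.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/** Hypothetical Java-side versions of three bulk operations over int ranges. */
final class StockRangeOps {
    private static final Unsafe UNSAFE = loadUnsafe();
    private static final long INT_BYTES = 4L;

    /** Allocates a block and clears it by hand, as stock code must. */
    static long allocateZeroed(long bytes) {
        long address = UNSAFE.allocateMemory(bytes);
        UNSAFE.setMemory(address, bytes, (byte) 0);
        return address;
    }

    static int getInt(long address, long index) {
        return UNSAFE.getInt(address + index * INT_BYTES);
    }

    static void setInt(long address, long index, int value) {
        UNSAFE.putInt(address + index * INT_BYTES, value);
    }

    static void free(long address) {
        UNSAFE.freeMemory(address);
    }

    /** Binary search over count ascending ints; returns the index or -1. */
    static long binarySearch(long address, long count, int key) {
        long lo = 0;
        long hi = count - 1;
        while (lo <= hi) {
            long mid = (lo + hi) >>> 1; // unsigned shift avoids overflow
            int value = getInt(address, mid);
            if (value < key) {
                lo = mid + 1;
            } else if (value > key) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    /** Reverses count ints in place by swapping from both ends. */
    static void reverse(long address, long count) {
        for (long i = 0, j = count - 1; i < j; i++, j--) {
            int tmp = getInt(address, i);
            setInt(address, i, getInt(address, j));
            setInt(address, j, tmp);
        }
    }

    // Assumption: the Unsafe singleton is reachable through its private static field.
    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        long base = allocateZeroed(5 * INT_BYTES);
        for (int i = 0; i < 5; i++) {
            setInt(base, i, (i + 1) * 10); // 10, 20, 30, 40, 50
        }
        System.out.println(binarySearch(base, 5, 40)); // prints 3
        reverse(base, 5);
        System.out.println(getInt(base, 0)); // prints 50
        free(base);
    }
}
```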

Download UPC

There are seven files available for download. To download a zip file, select its link, then select download in the upper right-hand corner of Google Drive. The larger files will not open in Google Drive's preview mode.

Compile OpenJDK

The following steps will assist in compiling OpenJDK. Note that a working installation of JDK 1.8 is required to compile this project.

  1. Extract the files to a location of your choosing and \(\texttt{cd}\) to it
  2. Execute the configuration script: \(\texttt{./configure}\)
  3. If you need to specify the JDK path, append \(\texttt{--with-boot-jdk=/<path to JDK directory>}\)
  4. More than likely, you will need to install dependencies for the configuration process to complete. Unfortunately, the only real way to know which libraries you need is to wait for an error. However, the following were required on the two test systems:
    • Linux Mint: \(\texttt{libc6-dev build-essential libX11-dev libxext-dev}\) \(\texttt{libxrender-dev libxtst-dev libxt-dev libcups2-dev}\) \(\texttt{libfreetype6-dev libasound2-dev}\)
    • Rocks: \(\texttt{cups-devel}\)
  5. Once successful, execute \(\texttt{make all}\)
  6. The compiled JDK will be located under: \(\texttt{OpenJDK_1_9_Dev_UPC/build/linux-x86_64-normal-server-release/jdk}\)
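
Once the build finishes, it can be useful to confirm that a program is actually running on the newly built JDK rather than on a system-wide installation. The following is a small, hypothetical check (the class name \(\texttt{JdkCheck}\) is not part of UPC) that prints the running JDK's location and version and uses reflection to look up the six stock \(\texttt{Unsafe}\) methods. It assumes the build output follows the usual JDK layout, so that it can be launched with the \(\texttt{bin/java}\) found under the directory from step 6.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sanity check for the JDK that is currently running. */
final class JdkCheck {

    /** Returns the names of any stock Unsafe methods that cannot be found. */
    static List<String> missingStockMethods() {
        List<String> missing = new ArrayList<>();
        Class<?> unsafe;
        try {
            unsafe = Class.forName("sun.misc.Unsafe");
        } catch (ClassNotFoundException e) {
            missing.add("sun.misc.Unsafe");
            return missing;
        }
        check(unsafe, missing, "allocateMemory", long.class);
        check(unsafe, missing, "freeMemory", long.class);
        check(unsafe, missing, "setMemory", long.class, long.class, byte.class);
        check(unsafe, missing, "copyMemory", long.class, long.class, long.class);
        // getInt/putInt stand in for the whole get<Type>/put<Type> family.
        check(unsafe, missing, "getInt", long.class);
        check(unsafe, missing, "putInt", long.class, int.class);
        return missing;
    }

    // Records the method name if no public method with that signature exists.
    private static void check(Class<?> type, List<String> missing,
                              String name, Class<?>... params) {
        try {
            type.getMethod(name, params);
        } catch (NoSuchMethodException e) {
            missing.add(name);
        }
    }

    public static void main(String[] args) {
        System.out.println("java.home    = " + System.getProperty("java.home"));
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("missing stock Unsafe methods: " + missingStockMethods());
    }
}
```

If \(\texttt{java.home}\) points somewhere other than the build directory, the launcher in use is not the compiled JDK.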

Generating Stock or Modified Source

UPC_1_0.tar.gz contains the NetBeans project, which supports both \(\texttt{stock}\) and \(\texttt{modified}\) instances. UPC is generator driven and can therefore create either source internally by following these steps:

  1. Unzip UPC_1_0.tar.gz.
  2. Open the project with NetBeans.
  3. Add the modified OpenJDK instance to the Java Platform Manager (Project properties -> Libraries -> Manage Platforms…)
  4. Set the Java Platform to the platform instance created in Step 3.
  5. Execute the project.
  6. At the first prompt, enter either \(\texttt{stock}\) or \(\texttt{modified}\)
  7. At the second prompt, enter the packages to compile or \(\texttt{all}\). The options are \(\texttt{arrays}\), \(\texttt{lists}\), \(\texttt{hashsets}\), \(\texttt{hashmaps}\), and/or \(\texttt{matrices}\). Enter these as space-separated parameters.
    • These options are a developmental convenience; use \(\texttt{all}\) for final output to ensure that a uniform UPC instance is generated.
  8. Once completed, clean and build the project.

Contributors

Project lead: Ray Hylock