Data and Software

Welcome to the data and software home for the HSIM Computational Lab. This page provides brief descriptions of, and links to, the data and software projects that lab members have created or contributed to.

Data Listing:

Alliance Regional Medical Center (armc)

Description: A simulated healthcare database in ten varieties. Patient counts range from 10 to 1,000,000, with upwards of 18 million encounters in the larger sets.

Publication: Hylock, R., Harris, S. T. (2017). Healthcare Database Management for Health Informatics and Information Management Students: Challenges and Instruction Strategies—Part 2. Educational Perspectives in Health Information Management.

Developed by: Ray Hylock
Lab page: HSIM Computational Lab


Software Listing:

Java Optimization Test Function Suite

Description: The Java Optimization Test Function Suite provides 102 JUnit-validated test functions crafted from a consensus of 52 published sources. This is the most comprehensive Java-based collection of its kind.

Developed by: Ray Hylock
Lab page: HSIM Computational Lab
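
For readers unfamiliar with optimization test functions, the sketch below shows two classic examples, Sphere and Rastrigin, each with a known global minimum of 0 at the origin. It is only an illustration of the kind of function such a suite contains: the class name BenchmarkSketch and its method names are hypothetical and are not taken from the suite's API.

```java
/**
 * Illustrative sketch only: two well-known optimization test functions.
 * The class and method names are hypothetical, not the suite's API.
 */
final class BenchmarkSketch {
    private BenchmarkSketch() {}

    /** Sphere function: sum of squares; global minimum 0 at the origin. */
    static double sphere(double[] x) {
        double sum = 0.0;
        for (double xi : x) {
            sum += xi * xi;
        }
        return sum;
    }

    /**
     * Rastrigin function: 10n + sum(x_i^2 - 10 cos(2 pi x_i));
     * highly multimodal, global minimum 0 at the origin.
     */
    static double rastrigin(double[] x) {
        double sum = 10.0 * x.length;
        for (double xi : x) {
            sum += xi * xi - 10.0 * Math.cos(2.0 * Math.PI * xi);
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] origin = new double[5]; // all zeros
        System.out.println("sphere(origin)    = " + sphere(origin));
        System.out.println("rastrigin(origin) = " + rastrigin(origin));
    }
}
```

A validation test for a function like these would typically assert the known minimum value at the known minimizer, within a small tolerance.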


siSPOTR

Abstract: RNA interference (RNAi) serves as a powerful and widely used gene silencing tool for basic biological research and is being developed as a therapeutic avenue to suppress disease-causing genes. However, the specificity and safety of RNAi strategies remain under scrutiny because small inhibitory RNAs (siRNAs) induce off-target silencing. Currently, the tools available for designing siRNAs are biased toward efficacy as opposed to specificity. Prior work from our laboratory and others’ supports the potential to design highly specific siRNAs by limiting the promiscuity of their seed sequences (positions 2-8 of the small RNA), the primary determinant of off-targeting. Here, a bioinformatic approach to predict off-targeting potentials was established using publicly available siRNA data from more than 50 microarray experiments. With this, we developed a specificity-focused siRNA design algorithm and accompanying online tool which, upon validation, identifies candidate sequences with minimal off-targeting potentials and potent silencing capacities. This tool offers researchers unique functionality and output compared with currently available siRNA design programs. Furthermore, this approach can greatly improve genome-wide RNAi libraries and, most notably, provides the only broadly applicable means to limit off-targeting from RNAi expression vectors.

Publication: Boudreau, R. L., Spengler, R. M., Hylock, R. H., Kusenda, B. J., Davis, H. A., Eichmann, D. A., Davidson, B. L. (equal credit to first two authors) (2013). siSPOTR: a tool for designing highly specific and potent siRNAs for human and mouse. Nucleic Acids Research, 41(1).
Developed by: the Davidson Laboratory (now at UPenn) and the Institute for Clinical and Translational Science (ICTS) at the University of Iowa
Software: There are two options: ICTS’s hosted service or source from GitHub
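
The abstract defines the seed sequence as positions 2-8 of the small RNA. As a minimal sketch of that definition only (not siSPOTR's scoring or design algorithm), the snippet below extracts the seed from a guide strand. The class name SeedSketch, the method seed, and the example sequence are hypothetical; accepting DNA-style T in place of U is an assumption made for convenience.

```java
import java.util.Locale;

/**
 * Illustrative sketch only: extracts the seed (positions 2-8, 1-based)
 * from a small RNA guide strand. Names here are hypothetical and this is
 * not siSPOTR's algorithm.
 */
final class SeedSketch {
    private SeedSketch() {}

    static String seed(String guide) {
        if (guide == null || guide.length() < 8) {
            throw new IllegalArgumentException("guide strand must have at least 8 nucleotides");
        }
        // Assumption: normalize case and treat DNA-style T as RNA U.
        String rna = guide.toUpperCase(Locale.ROOT).replace('T', 'U');
        if (!rna.matches("[ACGU]+")) {
            throw new IllegalArgumentException("guide strand may contain only A, C, G, U (or T)");
        }
        // 1-based positions 2..8 correspond to 0-based indices 1..7.
        return rna.substring(1, 8);
    }

    public static void main(String[] args) {
        String guide = "UAGCUUAUCAGACUGAUGUUG"; // hypothetical 21-nt example
        System.out.println("seed = " + seed(guide));
    }
}
```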


UPC Collections and OpenJDK Extension

Abstract: The Java programming language is versatile and robust; however, for large-scale applications with memory-intensive processes, much is still to be desired. The public Java Development Kit (JDK) supports neither forcible object destruction nor large collection sizes (limited to 2^31-1) as found in choice computational languages such as Fortran and C/C++. Within Java’s HotSpot, however, the non-public sun.misc.Unsafe class allows such features through basic off-heap functionality. The submitted UPC collections are fully integrated with Unsafe to provide large (i.e., 2^63-1), destructible arrays, lists, hash sets, hash maps, and matrices, consistent with advanced computational needs. Additionally, a customized version of OpenJDK 1.9 with bulk-operation Unsafe support and companion collections are provided. These tools are compared to Java and extant third-party primitive collections. Testing indicates UPC performs in a consistent to superior manner compared to the state-of-the-art, with greater improvements realized by way of the modified virtual machine.

Publication: Hylock, R. (2016). UPC: Large-Scale Memory Efficient Java Primitive Collections. Journal of Software, 11(3), pp. 251-271.
Developed by: Ray Hylock
Lab page: HSIM Computational Lab
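
To make the off-heap idea in the abstract concrete, here is a minimal sketch of a long-indexed, explicitly destructible array of long values built on sun.misc.Unsafe. It is not UPC's implementation: the class name OffHeapLongArray and its methods are hypothetical, and it assumes a HotSpot-based JDK where sun.misc.Unsafe is reachable via reflection. Newer JDK releases deprecate these memory-access methods, so warnings may appear.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

/**
 * Illustrative sketch only: an off-heap array of longs indexed by long and
 * freed explicitly via close(). Hypothetical names; not the UPC API.
 */
final class OffHeapLongArray implements AutoCloseable {
    private static final Unsafe UNSAFE = loadUnsafe();

    private final long length;
    private long address; // 0 once the memory has been freed

    private static Unsafe loadUnsafe() {
        try {
            // The singleton is held in a private static field named "theUnsafe".
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is unavailable", e);
        }
    }

    OffHeapLongArray(long length) {
        if (length <= 0 || length > Long.MAX_VALUE / Long.BYTES) {
            throw new IllegalArgumentException("invalid length: " + length);
        }
        this.length = length;
        long bytes = length * Long.BYTES;
        this.address = UNSAFE.allocateMemory(bytes);
        UNSAFE.setMemory(address, bytes, (byte) 0); // zero-fill, like a Java array
    }

    long length() {
        return length;
    }

    long get(long index) {
        check(index);
        return UNSAFE.getLong(address + index * Long.BYTES);
    }

    void set(long index, long value) {
        check(index);
        UNSAFE.putLong(address + index * Long.BYTES, value);
    }

    private void check(long index) {
        if (address == 0) {
            throw new IllegalStateException("array has been freed");
        }
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    /** Forcibly releases the off-heap memory; safe to call more than once. */
    @Override
    public void close() {
        if (address != 0) {
            UNSAFE.freeMemory(address);
            address = 0;
        }
    }
}
```

Because the index and length are long values, the addressable size is bounded by available memory rather than by the 2^31-1 limit of built-in Java arrays, and close() releases the memory immediately instead of waiting for garbage collection.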