ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Optimizing OpenCL programs for different hardware architectures

Jure Šemrov (2017) Optimizing OpenCL programs for different hardware architectures. EngD thesis.



    The main question this thesis addresses is how to write an OpenCL program that runs efficiently on different architectures. The obstacle is the architectural differences between systems: to maximize efficiency, the program has to be adapted to each one. The adaptations concern the number of compute units, the number of threads in a work-group, the use of the vector unit, and the use of local memory and cache to minimize latency. In short, we need to exploit both instruction-level and thread-level parallelism as well as other architectural advantages. We used five programs: histogram, matrix multiplication, prefix sum, the n-body problem and bitonic sort. We adapted them to three systems: an Intel Core i5-2450M CPU, a Xeon Phi 5110P manycore processor and a Tesla K20 GPU. To test these adaptations in practice, we measured program runtime for different work-group sizes and tried to explain the results. Our conclusions show that we need at least as many work-groups as there are compute units. Work-groups have to be large enough to reduce the overhead of maintaining a work-group and to hide memory latency. At the same time, they should be small enough to reduce communication overhead and to allow several work-groups to execute simultaneously on each compute unit. To execute programs efficiently on CPUs and manycore processors, we need to take into account the caches and the width of the vector unit, while on a GPU we need to exploit high memory throughput and hide latency with large work-groups and local memory.
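The sizing rules stated in the abstract can be illustrated with a small host-side helper. This is only a minimal sketch and not code from the thesis: the function name `candidate_work_group_sizes` and its parameters are hypothetical, and the inputs stand in for values a host program would typically query from OpenCL (`CL_DEVICE_MAX_COMPUTE_UNITS`, `CL_DEVICE_MAX_WORK_GROUP_SIZE`, `CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE`). It assumes the global size must be an exact multiple of the work-group size, as in OpenCL 1.x, and it only narrows down the sizes worth benchmarking; the best one still has to be found by measuring runtime.

```python
def candidate_work_group_sizes(global_size, compute_units,
                               max_wg_size, preferred_multiple):
    """Return work-group sizes worth benchmarking, smallest first.

    Hypothetical helper, not taken from the thesis. A size is kept if it
    - is a multiple of the preferred multiple (vector width or warp size),
    - does not exceed the device limit,
    - divides the global size evenly, and
    - yields at least as many work-groups as there are compute units.
    """
    candidates = []
    size = preferred_multiple
    while size <= max_wg_size:
        divides_evenly = global_size % size == 0
        enough_groups = global_size // size >= compute_units
        if divides_evenly and enough_groups:
            candidates.append(size)
        size += preferred_multiple
    return candidates


if __name__ == "__main__":
    # Illustrative numbers only (assumed, not measurements from the thesis):
    # 4096 work-items, 13 compute units, limit 1024, preferred multiple 32.
    print(candidate_work_group_sizes(4096, 13, 1024, 32))
```

With these illustrative inputs, sizes 512 and 1024 are dropped because they would produce only 8 and 4 work-groups, fewer than the 13 compute units assumed here.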

    Item Type: Thesis (EngD thesis)
    Keywords: OpenCL, heterogeneous systems, compute unit, work-groups, work-items, SIMD, SIMT, local memory
    Number of Pages: 78
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname | ID | Function
    izr. prof. dr. Uroš Lotrič | 270 | Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1537344195)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3750
    Date Deposited: 11 Jan 2017 10:45
    Last Modified: 02 Feb 2017 10:02
    URI: http://eprints.fri.uni-lj.si/id/eprint/3750
