ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Strategies for Balanced Selection from Retrospective Data for Simulation of Prospective Studies

Aleš Smodiš (2016) Strategies for Balanced Selection from Retrospective Data for Simulation of Prospective Studies. EngD thesis.


    The growth of medical research produces ever more findings that can lead to new treatments or improvements to existing ones. This growth makes it difficult to recruit a sufficient number of patients for prospective studies of all the promising treatments. A prospective study can, however, be simulated to a certain degree by a retrospective study on existing data. The main problem with this approach is that existing data usually have unbalanced distributions of characteristics across the sets of patients, which makes it difficult to evaluate the effects of treatment. The thesis describes an algorithm for balancing sets of patients with given characteristics; it creates balanced subsets by pairing patients and eliminating selected ones. The algorithm uses Pearson's chi-squared test to measure the quality of balance between two sets, and the sum of weighted differences between characteristics to define element pairs between the sets. Two new element-pairing strategies are introduced: a greedy method that uses an element similarity matrix, and the minimin algorithm, which uses a depth-limited state tree to choose the next elements to pair. A measure of the quality of a match between two sets is also introduced. Results show that the greedy method gives better results than the original algorithm, whereas the minimin algorithm turns out to be time-consuming because of its combinatorial complexity. At depths at which it is still practical to use, the minimin algorithm gives results that are at best comparable to the original algorithm and worse than the greedy method. The methods were compared experimentally on real data from medical studies of cancer treatment.
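The abstract names three building blocks: a sum of weighted differences between characteristics, a greedy pairing over a pairwise matrix, and Pearson's chi-squared test as a balance measure. The following is a minimal Python sketch of how those pieces could fit together; it is not the thesis's implementation. The function names, the numeric encoding of characteristics, the weights, and the use of a distance (rather than similarity) matrix are all assumptions made for illustration, and the elimination step and the minimin search are not shown.

```python
from itertools import product


def weighted_difference(a, b, weights):
    """Sum of weighted absolute differences between two patients' characteristics.

    Assumes characteristics are already encoded as numbers (hypothetical encoding).
    """
    return sum(w * abs(x - y) for x, y, w in zip(a, b, weights))


def greedy_pairs(set_a, set_b, weights):
    """Greedily pair elements of set_a with elements of set_b, closest pairs first.

    Returns a list of (index_in_a, index_in_b); elements left unpaired are dropped.
    """
    # Pairwise distance "matrix", flattened to (distance, i, j) triples and sorted.
    candidates = sorted(
        (weighted_difference(set_a[i], set_b[j], weights), i, j)
        for i, j in product(range(len(set_a)), range(len(set_b)))
    )
    used_a, used_b, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            pairs.append((i, j))
            used_a.add(i)
            used_b.add(j)
    return pairs


def chi_squared_statistic(counts_a, counts_b):
    """Pearson's chi-squared statistic for a 2 x k table of category counts.

    Assumes both sets are non-empty; categories absent from both sets are skipped.
    A value near 0 means the two sets are distributed similarly over the categories.
    """
    total_a, total_b = sum(counts_a), sum(counts_b)
    total = total_a + total_b
    stat = 0.0
    for obs_a, obs_b in zip(counts_a, counts_b):
        column = obs_a + obs_b
        if column == 0:
            continue
        exp_a = total_a * column / total
        exp_b = total_b * column / total
        stat += (obs_a - exp_a) ** 2 / exp_a + (obs_b - exp_b) ** 2 / exp_b
    return stat
```

For example, with patients encoded as hypothetical (age, sex) tuples, `greedy_pairs([(60, 1), (45, 0)], [(44, 0), (61, 1), (80, 1)], (1.0, 10.0))` pairs each patient in the first set with its nearest counterpart in the second and leaves the third patient of the second set unpaired. A greedy choice of this kind is fast but not guaranteed to minimise the total distance over all pairs.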

    Item Type: Thesis (EngD thesis)
    Keywords: retrospective studies, simulated prospective studies, pairing, data set balancing, heuristic search, heuristic evaluation of balance quality, Pearson's chi-squared test
    Number of Pages: 57
    Language of Content: Slovenian
    Mentor / Comentors:
        Name and Surname: akad. prof. dr. Ivan Bratko
        ID: 77
        Function: Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536769731)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 3229
    Date Deposited: 29 Jan 2016 11:55
    Last Modified: 16 Feb 2016 10:12
    URI: http://eprints.fri.uni-lj.si/id/eprint/3229
