ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

SPEAKER DIARIZATION OF AUDIO RECORDINGS

Gašper Žejn (2013) SPEAKER DIARIZATION OF AUDIO RECORDINGS. EngD thesis.

    Abstract

    Speech analysis is a broad research area in computer science. Diarization is the process of answering the question "who spoke when" by analyzing speech and extracting speaker-specific information from it. This thesis evaluates freely available speaker diarization tools for use on Slovenian speech, with an emphasis on recordings of meetings. Two tools are evaluated: SHoUT and LIUM SpkDiarization. Both rely on similar theoretical foundations, which are explained in chapter 3. The tools, their use, and the test recordings are introduced in chapter 4. The results show that SHoUT is also useful for Slovenian speech, even though it was not evaluated on Slovenian speech during its development. LIUM SpkDiarization is less stable and shows peculiar anomalies, such as merging all speakers of the same gender into one, which indicates that additional research and parameter tuning should be done before using LIUM SpkDiarization on Slovenian speech.

    Item Type: Thesis (EngD thesis)
    Keywords: speech analysis, diarization, diarisation, speaker indexing
    Number of Pages: 33
    Language of Content: Slovenian
    Mentor / Comentors:
    Name and Surname         ID     Function
    prof. dr. Dušan Kodek    236    Mentor
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=9963348)
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 2075
    Date Deposited: 28 Jun 2013 17:05
    Last Modified: 16 Jul 2013 14:41
    URI: http://eprints.fri.uni-lj.si/id/eprint/2075
