ePrints.FRI - University of Ljubljana, Faculty of Computer and Information Science

Design and Development of Web Crawler for Dynamic Web Application Analysis

Luka Šarc (2014) Design and Development of Web Crawler for Dynamic Web Application Analysis. EngD thesis.



    Dynamic web applications represent the largest share of the web application ecosystem. They integrate with each other inside the web browser, so users are often unaware of the connections made to third-party service providers and may unknowingly reveal their browsing data. In this thesis, a web crawler for dynamic web application analysis was designed and implemented to address this problem. Traditional crawlers are not sufficient for this task, since they focus on the semantics of web applications rather than on how those applications execute. Our crawler executes each web application in a sandbox within a virtual web browser. This allows the crawler to track the resources needed for execution and to detect integration between web applications. We conducted a crawl of 100,000 web applications. The results revealed a high level of web application integration: on average, a web application integrates with six third-party providers. The results confirmed that the proposed solution provides effective analysis for the described problem domain.
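
As an illustration of the detection step described in the abstract, the following is a minimal sketch, not the thesis implementation. It assumes the sandboxed browser has already produced a list of requested resource URLs for one page, and it counts the distinct third-party sites among them. The function names, the example URLs, and the "last two hostname labels" heuristic for grouping hosts into sites are all assumptions made for this example; a real crawler would more likely use a public suffix list.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return a naive 'site' key: the last two labels of the hostname.

    Assumption for illustration only; this mishandles suffixes such as
    'co.uk'. URLs without a hostname (e.g. relative URLs) yield ''.
    """
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else ""


def third_party_sites(page_url: str, resource_urls: list[str]) -> set[str]:
    """Return the set of sites contacted by the page other than its own."""
    first_party = site_of(page_url)
    sites = {site_of(u) for u in resource_urls}
    # Drop the page's own site and URLs with no usable hostname.
    return {s for s in sites if s and s != first_party}


# Hypothetical resource log for a single crawled page.
requested = [
    "http://www.example.com/app.js",
    "http://cdn.example.com/style.css",
    "http://widgets.social.test/like.js",
    "http://stats.tracker.test/pixel.gif",
    "http://api.tracker.test/collect",
]
providers = third_party_sites("http://www.example.com/", requested)
print(sorted(providers))  # ['social.test', 'tracker.test']
```

Averaging the size of this set over all crawled applications would give a figure comparable in kind to the "six third-party providers" statistic reported above, although the thesis may define a provider differently.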

    Item Type: Thesis (EngD thesis)
    Keywords: web crawler, dynamic web application, third-party service providers, sandbox, integration
    Number of Pages: 62
    Language of Content: Slovenian
    Mentor / Comentors: prof. dr. Matjaž Branko Jurič (Mentor)
    Link to COBISS: http://www.cobiss.si/scripts/cobiss?command=search&base=51012&select=(ID=1536136643 )
    Institution: University of Ljubljana
    Department: Faculty of Computer and Information Science
    Item ID: 2718
    Date Deposited: 17 Sep 2014 18:44
    Last Modified: 06 Jan 2015 13:45
    URI: http://eprints.fri.uni-lj.si/id/eprint/2718
