Modeling of Nonlinear Dynamical System using Information Theoretic Methods

Marko Bratina (2009) Modeling of Nonlinear Dynamical System using Information Theoretic Methods. PhD thesis.

Abstract

General-purpose models of dynamical systems are built by extracting important information about the underlying processes directly from measurable process data. Commonly used methods for system analysis and modeling are based on second-order statistics. Lately, however, approaches that overcome the limitations of second-order statistics have been proposed, and the growing power of contemporary computer systems has encouraged the use of methods originating from information theory in this field. The basic measures of information theory, i.e., entropy, divergence and average mutual information, are defined in terms of probability theory and statistics. Each of these measures quantifies, in its own way, the information and uncertainty of random variables and can therefore also be used in modeling.
In a set of measured process data, some variables may be mutually dependent and thus provide no new information about the system. Moreover, if this dependency is not detected, it results in larger and more complicated models. Applying feature extraction methods to the input data prior to modeling is therefore expected to simplify the modeling procedure and improve the generalization properties of the resulting model. Average mutual information and divergence, which both measure mutual dependency among data, are the most suitable criteria for feature extraction. Mutual information can be used to determine how much information about a given output is contained in each set of inputs; in combination with optimization methods, the most appropriate set of features can thus be found. Similarly, divergence may be used as a measure in independent component analysis, where features are obtained as linear combinations of inputs. Independent component analysis may also be used as a method for localizing exceptional events, or shocks, that strongly influence the future behavior of the system. If the process is repeatable, the results of such an analysis are also applicable to prediction. Therefore, a method that uses the inputs at the times of shocks as features in further modeling was proposed and tested.
Methods for feature extraction based on information theory were tested in combination with neural network modeling and compared to several classical approaches. Neural networks are general-purpose models composed of identical simple units, called neurons, each of which maps the weighted sum of its inputs to an output. A variety of neural networks, differing in the topology of connections among neurons, have been proposed. In this work, only the multilayer perceptron is used, in which neurons are arranged in layers and each neuron is connected to all neurons in the neighboring layers, but not to neurons in its own layer. The great advantage of this topology is that very effective gradient-based learning techniques are available for setting the free parameters of the model. Learning is based on minimizing a selected criterion function, most commonly the mean squared error. However, it may be replaced by the entropy of the error, which measures the uncertainty of the errors at the output of the model: the lower this uncertainty, the better the model fits the data.
The performance of information-theoretic measures in feature extraction and in learning was tested on time series prediction problems and on the problem of building model predictive control for a rubber compound mixer. Analyses of the time series prediction problems revealed that models using independent component analysis for feature extraction give the best predictions of the future value. If the problem is modified slightly so that only the classification of the future value into one of the predefined classes is predicted, the method of maximally discriminative projections with conditional entropy as the measure also performed well. Feature extraction based on mutual information yielded poorer results, mainly due to the small amount of data. It was also shown that learning by minimizing the entropy of the error gives good results for future value prediction but is not suitable for classification.
In rubber compound production, the quality of the final product depends strongly on several parameters, such as the quality of the input materials, the ambient temperature and the chamber temperature. To reduce deviations in the quality of the final product despite changes in the input and process parameters, a need for closed-loop control emerged. The closed-loop control is based on information about the viscosity of the compound, measured as the motor torque. A model predictive controller was designed that predicts the class of the torque curve from the time course of the viscosity. According to this classification, the motor torque is changed by changing the rotation rate. In this problem, the model using the proposed feature extraction method based on shock detection performed well. The closed-loop controller of the rubber compound mixing process did reduce the variation in the quality of the final rubber compound.
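
The abstract describes using average mutual information to judge how much each input tells us about a given output. The following is only a minimal sketch of that idea, not the estimator or the search procedure used in the thesis. It assumes a simple histogram estimate with a fixed number of bins, ranks inputs one at a time rather than as sets, and runs on synthetic data. The function names `mutual_information` and `rank_inputs` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of the mutual information I(X;Y) in bits.

    Assumption: equal-width bins; the thesis may use a different estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()                # joint probabilities
    p_x = p_xy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nonzero = p_xy > 0                        # empty cells contribute nothing
    ratio = p_xy[nonzero] / (p_x @ p_y)[nonzero]
    return float(np.sum(p_xy[nonzero] * np.log2(ratio)))


def rank_inputs(inputs, output, bins=8):
    """Score every column of `inputs` against `output`; best column first."""
    scores = [
        mutual_information(inputs[:, j], output, bins)
        for j in range(inputs.shape[1])
    ]
    order = [int(j) for j in np.argsort(scores)[::-1]]
    return order, scores


# Synthetic illustration: the output depends only on the second input column.
rng = np.random.default_rng(0)
n = 2000
relevant = rng.uniform(-1.0, 1.0, n)
noise = rng.uniform(-1.0, 1.0, n)
output = np.sin(3.0 * relevant) + 0.05 * rng.normal(size=n)
inputs = np.column_stack([noise, relevant])

order, scores = rank_inputs(inputs, output)
print("ranking (best first):", order)
print("estimated mutual information in bits:", scores)
```

Histogram estimates are biased when the sample is small relative to the number of bins. This is one plausible reading of why the abstract reports poorer results for mutual-information-based feature extraction when little data is available.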

Item Type:

Thesis (PhD thesis)

Keywords:

nonlinear dynamical system, information theory, feature extraction, neural network modeling, time series forecasting, model predictive control of rubber mixing process.

Number of Pages:

100

Language of Content:

Slovenian

Mentor / Co-mentors:

Name and Surname	ID	Function
doc. dr. Uroš Lotrič	270	Mentor

Link to COBISS:

http://www.cobiss.si/scripts/cobiss?command=search&base=50070&select=(ID=7391572)

Institution:

University of Ljubljana

Department:

Faculty of Computer and Information Science

Item ID:

907

Date Deposited:

11 Sep 2009 10:21

Last Modified:

13 Aug 2011 00:35

URI:

http://eprints.fri.uni-lj.si/id/eprint/907
