Project: Tinere Echipe (Young Teams) 2011–2014
Project data:

Project type: National

Project code: PN-II-RU-TE-2011-3-0278

Contract number: 045 / October 5, 2011

Project financed by: CNCS – UEFISCDI

Project title:
Nonparametric methods for machine learning: applications to robotics and data analysis
(ro: Metode neparametrice in instruirea automata a masinilor: aplicatii in robotica si analiza datelor)

Project period: October 5, 2011 – October 4, 2014
Project abstract:
Nonparametric methods became popular due to the explosion of the computational and storage capacity of computers. Applying nonparametric methods to, e.g., bioinformatics data is becoming standard: nonparametric Support Vector Machines (SVMs) are used as benchmarks when a new algorithm is tested. Using the nonparametric Dirichlet process model for automated clustering of data brings increased flexibility in modelling: the number of components can be inferred automatically from the data. Inference with Gaussian processes (again nonparametric) provides further flexibility: they are usually better suited to cases where the model complexity is not known in advance. Irrespective of the method used, there are problems that need to be solved, and most of them relate to the nonparametric nature of these algorithms: the number of parameters grows as data are processed. Even though applying nonparametric models is straightforward, to apply them successfully we need to consider approximations or simplifications to the respective models, and we intend to study these approximations. Reinforcement learning considers agents in an unknown environment, whose complexity is therefore also unknown; we will use sparse Gaussian processes to keep the model both flexible and efficient. Dirichlet processes rely on sampling, and we want to construct approximations (e.g. variational or Laplace) to make them amenable to large datasets.
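To illustrate the nonparametric modelling style described above, here is a minimal Gaussian-process regression sketch in plain NumPy. This is illustrative only, not project code: the RBF kernel, unit prior variance, and noise level are assumptions.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential (RBF) kernel between the rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_predict(X, y, X_star, noise=1e-2, lengthscale=1.0):
    """GP posterior mean and variance at the test inputs X_star."""
    K = rbf_kernel(X, X, lengthscale) + noise * np.eye(len(X))
    K_s = rbf_kernel(X_star, X, lengthscale)
    alpha = np.linalg.solve(K, y)           # K^{-1} y
    mean = K_s @ alpha
    v = np.linalg.solve(K, K_s.T)
    var = 1.0 - np.sum(K_s * v.T, axis=1)   # prior variance k(x, x) = 1
    return mean, var

# Toy 1-D example; the "model" is the training set itself, so its
# size grows with the data -- the nonparametric trait discussed above.
X = np.linspace(0, 2 * np.pi, 20)[:, None]
y = np.sin(X).ravel()
mean, var = gp_predict(X, y, np.array([[np.pi / 2]]))
```

The cubic cost of the `solve` on the full kernel matrix is exactly what the sparse approximations mentioned in the abstract aim to avoid.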
Project objectives:
We want to apply nonparametric methods to data analysis:

The application of nonparametric Dirichlet processes to the analysis of large datasets is not entirely novel, but the variational speedup, used instead of sampling procedures, can be considered a significant step forward.

Gaussian processes have been used in reinforcement learning with great success, and using the sparse extension can be an important step forward.

Data-dependent kernels in data processing, within or outside of the reinforcement learning area, can also be important. The ultimate aim of data processing is the extraction of information from the observations, and we can consider data-dependent kernels a summary of the information present in the data.
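One simple way to make a kernel depend on the data is to whiten the inputs with the empirical covariance before applying an RBF kernel, so the metric adapts to the observed distribution. This is an illustrative sketch of the general idea, not the project's specific construction:

```python
import numpy as np

def data_dependent_rbf(X):
    """Build an RBF kernel whose metric is adapted to the sample:
    inputs are whitened with the empirical mean and covariance, so the
    resulting kernel depends on the observed data distribution."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # jitter
    L = np.linalg.cholesky(np.linalg.inv(cov))                 # whitening map
    def k(A, B):
        Aw, Bw = (A - mu) @ L, (B - mu) @ L
        d2 = ((Aw[:, None, :] - Bw[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2)
    return k

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2)) * np.array([10.0, 0.1])  # anisotropic data
k = data_dependent_rbf(X)
K = k(X, X)   # Gram matrix; unit diagonal, scale-adapted off-diagonal
```

On strongly anisotropic data such as the example above, the whitened kernel treats both directions on an equal footing, whereas a plain RBF kernel would be dominated by the high-variance coordinate.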
Project synthesis:
Detailed research results can be found in the publications related to the project (below).
A short classification of the research results:

Approximation algorithms for reinforcement learning
We used Gaussian processes to improve existing reinforcement learning algorithms.
We experimented with different methods for creating data-dependent kernels that speed up learning and, at the same time, help obtain a more accurate robot model.

Robot model learning for forward control

Input noise models

Approximate hashing methods for fast search
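As background for the hashing line of work, here is a minimal sketch of the classic random-hyperplane LSH scheme (sign-of-projection binary codes): a standard baseline that is simpler than the spectral and kernel hashing methods developed in the project, but illustrates the core idea of approximate fast search.

```python
import numpy as np

def lsh_codes(X, n_bits=16, seed=0):
    """Random-hyperplane LSH: the sign pattern of random projections
    yields binary codes whose Hamming distance approximates the
    angular distance between the original vectors."""
    rng = np.random.default_rng(seed)
    H = rng.normal(size=(X.shape[1], n_bits))  # random hyperplane normals
    return (X @ H > 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

rng = np.random.default_rng(1)
x = rng.normal(size=32)
near = x + 0.01 * rng.normal(size=32)   # almost the same direction as x
far = rng.normal(size=32)               # an unrelated direction
cx, cn, cf = (lsh_codes(v[None, :], n_bits=64)[0] for v in (x, near, far))
```

Nearby vectors rarely fall on opposite sides of a random hyperplane, so their codes agree on most bits; searching over short binary codes is then much faster than exact nearest-neighbour search in the original space.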
Synthetic reports:
Scientific synthesis reports in Romanian (PDF file) and English (PDF file) are available via the respective links.
Publications:
>2011 

Jakab Hunor, Csató Lehel. Guided exploration in gradient-based policy search with Gaussian processes. Technical Report, 2011

gpguided.pdf 

Zalán Bodó. Fast Large-Scale Kernel k-Nearest Neighbors. Technical Report, 2011

knn3.pdf 

Bócsi, B.; Csató, L. and Peters, J. Structured Output Gaussian Processes. Technical Report, 2011

sogp.pdf 
>2012 

Bócsi, B.; Hennig, P.; Csató L. and Peters, J. Learning Tracking Control with Forward Models. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), St. Paul, USA, 2012.

Bocsietal12.txt 

Zalán Bodó, Lehel Csató. Improving Kernel Locality-Sensitive Hashing Using Pre-Images and Bounds. IJCNN 2012.

BodoCsato12a.txt 

H. Jakab and L. Csató. Manifold-based nonparametric learning of action-value functions. In European Symposium on Artificial Neural Networks (ESANN), 2012.

JakabCsato12a.pdf 

H. Jakab and L. Csató. Reinforcement learning with guided policy search using Gaussian processes. In International Joint Conference on Neural Networks (IJCNN), 2012.

JakabCsato12b.pdf 
>2013 

Bócsi, B. and Csató, L. Hessian Corrected Input Noise Models. In Proceedings of the International Conference on Artificial Neural Networks (ICANN), LNCS 8131, pp. 1-8, Springer, Sofia, Bulgaria, 2013

BocsiCsato13a.txt 

Bócsi, B.; Csató, L. and Peters, J. Alignment-based Transfer Learning for Robot Models. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN), pp. 173G, Dallas, USA, 2013.

BocsiCsato13b.txt 

Zalán Bodó, Lehel Csató. Linear Spectral Hashing. In Proceedings of the 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2013, pp. 303-308.

BodoCsato13a.txt 

Jakab, H. and Csató, L. Novel feature selection and kernel-based value approximation method for reinforcement learning. In Proceedings of the International Conference on Artificial Neural Networks (ICANN), Sofia, Bulgaria, 2013, pp. 170-177

JakabCsato13a.pdf 
>2014 

Botond A. Bócsi, Lehel Csató and Jan Peters. Indirect robot model learning for tracking control, Advanced Robotics, Taylor & Francis, 2014, 28, pp. 589-599

BocsiCsatoPeters14.pdf 

Botond A. Bócsi, Hunor S. Jakab and Lehel Csató. Simulation extrapolation Gaussian processes for input noise modeling. In 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2014), Timisoara, Romania, September 21-25, 2014, pages to be published, 2014.

BocsiJakabCsato14.pdf 

Zalán Bodó and Lehel Csató. Linear Spectral Hashing, Neurocomputing, 2014, 141, pp. 117-123

BodoCsato14a.txt 

Zalán Bodó and Lehel Csató. Augmented hashing for semi-supervised scenarios, Proceedings of the 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2014, pp. 53-58

BodoCsato14b.pdf 

Zalán Bodó and Lehel Csató. A note on label propagation for semi-supervised learning. 2014. (submitted)

BodoCsato14c.txt 

Hunor S. Jakab and Lehel Csató. Sparse approximations to value functions in reinforcement learning. In Artificial Neural Networks, Springer Series in Bio-/Neuroinformatics, volume 4, pages 295–315. Springer International Publishing Switzerland, 2014.

JakabCsato14.pdf 
