Project: Tinere Echipe 2011 - 2014


Project data:

  • Project type: National
  • Project code: PN-II-RU-TE-2011-3-0278
  • Contract number: 045 / October 5, 2011
  • Project financed by: CNCS - UEFISCDI
  • Project title:
    Non-parametric methods for machine learning: applications to robotics and data analysis
    (ro: Metode neparametrice in instruirea automata a masinilor: aplicatii in robotica si analiza datelor)
  • Project period: October 5, 2011 -- October 4, 2014

Project abstract:

Non-parametric methods became popular thanks to the rapid growth in the computational and storage capacity of computers. Applying them to, e.g., bioinformatics data is becoming standard: non-parametric Support Vector Machines (SVM) are used as benchmarks when a new algorithm is tested. Using the non-parametric Dirichlet process model for automated clustering brings increased flexibility in modelling: the number of components can be inferred automatically from the data. Inference with Gaussian processes, again non-parametric, is likewise more flexible: these models are usually better suited to cases where the model complexity is not known in advance. Irrespective of the method, there are problems to be solved, and most of them stem from the non-parametric nature of the algorithm: the number of parameters grows as data are processed. Even though applying non-parametric models is conceptually straightforward, applying them successfully requires approximations or simplifications of the respective models, and we intend to study these approximations. Reinforcement learning considers agents in an unknown environment, whose complexity is therefore not known in advance. We will use sparse Gaussian processes to keep the model flexible and efficient at the same time. Inference in Dirichlet process models typically relies on sampling, and we want to construct approximations (e.g. variational, Laplace) that make them amenable to large data sets.
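The growth in the number of parameters can be illustrated with a small Gaussian process regression sketch. It is not the project's method: the squared-exponential kernel, the noise variance, the toy sine data and all function names are assumptions made for illustration, and the "sparse" model simply keeps every fifth training point (a subset-of-data approximation) as a stand-in for the more refined sparse Gaussian processes mentioned above.

```python
import math

def rbf(x, y, length=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x - y) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_gp(xs, ys, noise_var=0.01):
    """Return (basis points, weights): one weight per training point."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    return list(xs), solve(K, list(ys))

def predict(model, x):
    """GP posterior mean at x: a kernel expansion over the basis points."""
    basis, weights = model
    return sum(w * rbf(x, b) for w, b in zip(weights, basis))

# Toy data (assumed): 40 noise-free samples of a sine curve.
xs = [0.1 * i for i in range(40)]
ys = [math.sin(x) for x in xs]

full = fit_gp(xs, ys)               # 40 weights, one per observation
sparse = fit_gp(xs[::5], ys[::5])   # 8 weights, subset-of-data stand-in

print(len(full[1]), predict(full, 1.0))
print(len(sparse[1]), predict(sparse, 1.0))
```

The full model stores one weight per observation, so its size and the cost of the linear solve grow with the data; the reduced model trades some accuracy for a fixed, smaller basis.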

Members

Project objectives:

We want to apply non-parametric methods to data analysis:
  • Applying non-parametric Dirichlet processes to the analysis of large data sets is not entirely novel, but replacing sampling procedures with a variational speed-up can be considered a significant step forward.
  • Gaussian processes have been used in reinforcement learning with great success; using their sparse extension can be an important further step.
  • Data-dependent kernels in data processing, within or outside the reinforcement learning area, can also be important. The ultimate aim of data processing is to extract information from the observations, and data-dependent kernels can be viewed as a summary of the information present in the data.
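The first objective rests on the Dirichlet process, whose prior over clusterings can be simulated sequentially with the Chinese restaurant process. The sketch below only draws from that prior; it performs no inference, neither by sampling nor variationally, and the function name, the concentration value `alpha` and the seed are assumptions chosen for illustration.

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Seat n items one by one; return the list of cluster sizes.

    Item i (counting from 0) joins an existing cluster k with probability
    size_k / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha).
    """
    rng = random.Random(seed)
    tables = []
    for i in range(n):
        r = rng.uniform(0, i + alpha)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size
            if r < acc:
                tables[k] += 1
                break
        else:
            # r fell in the last alpha-wide slice: open a new cluster.
            tables.append(1)
    return tables

sizes = chinese_restaurant_process(n=200, alpha=2.0)
print(len(sizes), "clusters with sizes", sizes)
```

The number of clusters is not fixed in advance: it is a random outcome that tends to grow slowly as more items arrive, which is the flexibility the objective refers to.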

Project synthesis:

Detailed research results can be found in the publications related to the project (below).
A short classification of the research results:
  • Approximating algorithms for reinforcement learning
    We combined reinforcement learning algorithms with Gaussian processes to improve existing reinforcement learning algorithms. We experimented with different methods of creating data-dependent kernels that speed up learning and at the same time help to obtain a more accurate robot model.
  • Robot model learning for forward control
  • Input noise models
  • Approximate hashing methods for fast search
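As a generic illustration of approximate hashing for fast search, the sketch below uses random-hyperplane locality-sensitive hashing, where each bit records on which side of a random hyperplane a vector lies. This is a textbook technique shown under stated assumptions, not the kernelized or spectral hashing methods developed in the project; all names, the dimension and the number of bits are hypothetical.

```python
import random

def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random Gaussian normal vectors in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

def hash_vector(v, planes):
    """One bit per hyperplane: 1 if v lies on its non-negative side."""
    return tuple(
        1 if sum(a * b for a, b in zip(v, p)) >= 0 else 0
        for p in planes
    )

def hamming(h1, h2):
    """Number of differing bits between two hash codes."""
    return sum(a != b for a, b in zip(h1, h2))

planes = make_hyperplanes(dim=3, bits=16)
a = [1.0, 2.0, 3.0]
same_direction = [2.0, 4.0, 6.0]    # same angle as a
opposite = [-1.0, -2.0, -3.0]       # angle of 180 degrees to a

print(hamming(hash_vector(a, planes), hash_vector(same_direction, planes)))
print(hamming(hash_vector(a, planes), hash_vector(opposite, planes)))
```

Vectors separated by a small angle agree on most bits, so candidate neighbours can be found by comparing short binary codes instead of the original vectors.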

Synthetic reports:

The scientific synthetic reports in Romanian (PDF file) and English (PDF file) are available behind the respective links.

Publications:

2011
Jakab Hunor, Csató Lehel. Guided exploration in gradient based policy search with Gaussian processes. Technical Report, 2011
gpguided.pdf
Zalan Bodo. Fast Large-Scale Kernel k-Nearest Neighbors. Technical Report, 2011
knn3.pdf
Bócsi, B.; Csató L. and Peters, Jan. Structured Output Gaussian Processes. Technical Report, 2011
sogp.pdf
2012
Bócsi, B.; Hennig, P.; Csató L. and Peters, J. Learning Tracking Control with Forward Models. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), St. Paul, USA, 2012.
Bocsietal12.txt
Zalán Bodó, Lehel Csató. Improving Kernel Locality-Sensitive Hashing Using Pre-Images and Bounds. IJCNN 2012.
BodoCsato12a.txt
H. Jakab and L. Csató. Manifold-based non-parametric learning of action-value functions. In European Symposium on Artificial Neural Networks (ESANN), 2012.
JakabCsato12a.pdf
H. Jakab and L. Csató. Reinforcement learning with guided policy search using Gaussian processes. In International Joint Conference on Neural Networks (IJCNN), 2012.
JakabCsato12b.pdf
2013
Bócsi, B. and Csató L. Hessian Corrected Input Noise Models. In Proceedings of the International Conference on Artificial Neural Networks (ICANN), pp. 1-8, Springer, Sofia, Bulgaria, LNCS 8131, 2013.
BocsiCsato13a.txt
Bócsi, B.; Csató L. and Peters, J. Alignment-based Transfer Learning for Robot Models. In Proceedings of the 2013 International Joint Conference on Neural Networks (IJCNN), pp. 1-73G, Dallas, USA, 2013.
BocsiCsato13b.txt
Zalán Bodó and Lehel Csató. Linear Spectral Hashing. In Proceedings of the 21st European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning, 2013, pp. 303-308.
BodoCsato13a.txt
Jakab, H. and Csató L. Novel feature selection and kernel-based value approximation method for reinforcement learning. In Proceedings of the International Conference on Artificial Neural Networks (ICANN), Sofia, Bulgaria, 2013, pp. 170-177.
JakabCsato13a.pdf
2014
Botond A. Bócsi and Lehel Csató and Jan Peters. Indirect robot model learning for tracking control, Advanced Robotics, Taylor & Francis, 2014, 28, pp. 589-599
BocsiCsatoPeters14.pdf
Botond A. Bócsi, Hunor S. Jakab, and Lehel Csató. Simulation extrapolation Gaussian processes for input noise modeling. In 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing, SYNASC 2014, Timisoara, Romania, September 21-25, 2014, pages to be published, 2014.
BocsiJakabCsato14.pdf
Zalán Bodó and Lehel Csató. Linear Spectral Hashing, Neurocomputing, 2014, 141, pp. 117-123.
BodoCsato14a.txt
Zalán Bodó and Lehel Csató. Augmented hashing for semi-supervised scenarios, Proceedings of the 22nd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN), 2014, pp. 53-58.
BodoCsato14b.pdf
Zalán Bodó and Lehel Csató. A note on label propagation for semi-supervised learning. 2014. (submitted)
BodoCsato14c.txt
Hunor S. Jakab and Lehel Csató. Springer Series in Bio-/Neuroinformatics, Artificial Neural Networks, volume 4, chapter Sparse approximations to value functions in reinforcement learning, pages 295–315. Springer International Publishing Switzerland, 2014.
JakabCsato14.pdf