NOVA Information Management School

Data Mining II

Code

200029

Academic unit

NOVA Information Management School

Credits

7.5

Teacher in charge

Teaching language

Portuguese. If Erasmus students are enrolled, classes are taught in English.

Objectives

To introduce the main concepts and methods of supervised Machine Learning.

Prerequisites

No prerequisites.

Subject matter

1. Introduction to Machine Learning
- The concept of learning. Learning a function.
- The concept of generalization. Training set and test set.
- Supervised and unsupervised learning.
- Classification and clustering.
- Performance of a classifier. Data splitting. Cross-validation and its variants. Precision and recall. F-measure. K-statistic.
- The concept of feature. Feature selection.
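
As an illustration of the precision, recall, and F-measure topics listed above, a minimal Python sketch follows. The function name, the example labels, and the choice of 1 as the positive class are assumptions made for illustration and are not taken from the course material.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F-measure (F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    # True positives: predicted positive and actually positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: predicted positive but actually negative.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: predicted negative but actually positive.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical test-set labels and classifier outputs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

In this example there are 3 true positives, 1 false positive, and 1 false negative, so all three measures equal 0.75.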

2. Decision Trees
- General functioning of the method
- Examples of application

3. Neural Networks
- Introduction
- Perceptron:
- One neuron model
- Perceptron Learning Rule.
- Perceptron convergence theorem.
- Main activation functions.
- Adaline:
- General structure.
- Delta rule. The concept of gradient descent.
- Linearly separable and non-linearly separable problems.
- Layers of hidden neurons.
- Universal Approximation Theorem.
- Backpropagation
- Cyclic (recurrent) Neural Networks:
- Jordan Networks
- Elman Networks
- Hopfield Networks (the concept of associative memory, Hebb learning rule).
- Examples of application
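
As an illustration of the Perceptron Learning Rule listed above, a minimal Python sketch follows. It trains a single neuron with a step activation on the logical AND problem, which is linearly separable. The function names, the learning rate, the epoch limit, and the AND data set are assumptions chosen for illustration and are not taken from the course material.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum exceeds 0, else 0."""
    total = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if total > 0 else 0


def train_perceptron(samples, learning_rate=1.0, max_epochs=100):
    """Train one neuron with the perceptron learning rule.

    samples is a list of (inputs, target) pairs with targets in {0, 1}.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Move the weights toward (or away from) the misclassified input.
                weights = [w + learning_rate * delta * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * delta
                errors += 1
        if errors == 0:
            # A full pass with no mistakes: the training set is separated.
            break
    return weights, bias


# Logical AND: linearly separable, so training stops with zero errors.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
print([predict(weights, bias, x) for x, _ in AND_SAMPLES])
```

Replacing the data set with XOR, which is not linearly separable, would make the loop run until the epoch limit without reaching zero errors; this is the motivation for the hidden layers listed above.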

4. Support Vector Machines
- General functioning
- Kernel functions
- Examples of application

5. Genetic Programming
- Representation of solutions and main differences from Genetic Algorithms.
- Genetic Operators
- Fitness Calculation
- Closure and sufficiency properties.
- Steady State.
- Automatically Defined Functions (ADF).
- GP Benchmarks (even parity, multiplexer, symbolic regression, artificial ant on the Santa Fe trail).
- Parallel and Distributed Genetic Programming (definition and experimental study).
- Diversity and premature convergence
- Open issues and new trends in GP
- Integration of semantic awareness in GP
 

Bibliography

- Tom Mitchell, "Machine Learning", McGraw-Hill, 1997.
- D. Kriesel, "A Brief Introduction to Neural Networks", 2007.
- Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, "Introduction to Data Mining", Chapter 4, 2006.
- Robert Burbidge and Bernard Buxton, "An Introduction to Support Vector Machines for Data Mining", 2001.
- Riccardo Poli, William B. Langdon, and Nicholas Freitag McPhee, "A Field Guide to Genetic Programming", 2008.

Teaching method

Theoretical classes: board and slides. Practical classes: slides and projection of exercises and examples using various software environments.

Evaluation method

20% project number 1, 20% project number 2, 60% final exam.

Courses