Programme Details
Rationale
Data processing is a complex but crucial task in any research project. Dedicated software, such as spreadsheet or statistical analysis applications, is widely available, but a basic knowledge of programming and of suitable libraries opens up many additional possibilities for data analysis. Furthermore, programming allows the automation of tasks such as file processing, the computation of statistics and the production of charts and reports, making the whole process more reliable, reproducible and efficient.
This course does not assume prior knowledge of programming or of the Python language. It therefore starts with an introduction to Python and to basic programming techniques, with applications to file processing, data aggregation and organization, and the creation of simple scripts. After this introduction, the course focuses on data analysis and visualization libraries such as pandas and matplotlib, and on becoming familiar with the IPython interactive console for quick data analysis. All software used is free and open source.
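As a rough illustration of the kind of automation described above, the following minimal sketch reads a small data file and computes a per-group mean using only the Python standard library. The sample data, the column names (`site`, `temperature`) and the helper `mean_by_group` are hypothetical, chosen only for this example and not taken from the course materials.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical sample data standing in for a real data file on disk.
SAMPLE_CSV = """site,temperature
A,20.5
A,21.5
B,18.0
B,19.0
B,20.0
"""


def mean_by_group(csv_file, group_column, value_column):
    """Return {group: mean of value_column} for the rows of an open CSV file."""
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Each row is a dict keyed by the header names; values arrive as text.
        values[row[group_column]].append(float(row[value_column]))
    return {group: statistics.mean(vals) for group, vals in values.items()}


# io.StringIO lets the sample text be read like a file; a real script
# would pass open("some_file.csv", newline="") instead.
means = mean_by_group(io.StringIO(SAMPLE_CSV), "site", "temperature")
for site, mean in sorted(means.items()):
    print(f"{site}: {mean:.2f}")
```

Libraries such as pandas express the same aggregation more compactly, which is one reason the course moves on to them after the introduction.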
Syllabus
- Programming in Python and data structures
- Functions, classes and modules
- Processing of data files
- Libraries for statistics and numerical processing
- Data visualization libraries
- Brief introduction to advanced topics in data processing, such as classification, clustering and image segmentation
Learning outcomes
- Understand the fundamentals of programming in Python
- Learn to use data analysis, statistical and visualization libraries
- Implement scripts to automate data processing
- Automate file conversion and the import and export of data
- Automate the production of reports and charts from data files
References
To be indicated