Data processing is a complex but crucial task in any research project. Despite the availability of dedicated software, such as spreadsheet or statistical analysis applications, a basic knowledge of programming and the use of adequate libraries can open many additional possibilities for data analysis. Furthermore, programming allows the automation of tasks such as file processing, the computation of statistics and the production of charts and reports, making the whole process more reliable, reproducible and efficient.
This course does not assume prior knowledge of programming or of the Python language, so it starts with an introduction to the language and to basic programming techniques, with applications to file processing, data aggregation and organization, and the creation of simple scripts. After this introduction, the course focuses on data analysis and visualization libraries such as pandas and matplotlib, and on using the IPython interactive console for quick data analysis. All software used is free and open source.
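As a rough illustration of the kind of simple script the introductory part describes, the sketch below reads tabular data, groups it and computes a mean per group using only the Python standard library. The column names, the sample values and the `mean_by_group` helper are hypothetical and chosen only for this example; they are not taken from the course materials.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical measurements (assumption for illustration only). In practice
# this text would usually come from a file, e.g. open("data.csv", newline="").
RAW_DATA = """sample,group,value
s1,control,4.0
s2,control,6.0
s3,treated,9.0
s4,treated,11.0
s5,treated,10.0
"""


def mean_by_group(csv_text):
    """Return a dict mapping each group name to the mean of its values."""
    values = defaultdict(list)
    # DictReader uses the first line as column names for every later row.
    for row in csv.DictReader(io.StringIO(csv_text)):
        values[row["group"]].append(float(row["value"]))
    return {group: statistics.mean(vals) for group, vals in values.items()}


if __name__ == "__main__":
    for group, mean in mean_by_group(RAW_DATA).items():
        print(f"{group}: mean = {mean:.2f}")
```

With pandas, the same aggregation is typically written as a `groupby` on the grouping column followed by `mean`, which is the style of analysis the later part of the course concentrates on.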
PhD students at Universidade NOVA de Lisboa.
PhD holders working at NOVA (professors, researchers, ...).
No prior knowledge of programming is required, but participants should have a basic knowledge of algebra and statistics. Participants must bring their own laptop and be familiar with its use.
There are currently no scheduled editions of this course.
Course Coordinator and Lecturer:
Ludwig Krippahl, PhD
2 ECTS | 3 days
24 contact hours: lectures and practical activities
32 independent working hours: exercises and final work
Assessment:
Worksheets (30%) and a final work (70%).
McKinney, W. Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython. O'Reilly Media, Inc., 2012.