Academic Year:
2022/23
3377 - Bachelor's Degree in Computer Engineering
25432 - Introduction to Natural Language Processing Techniques for Everyday Applications
Teaching Plan Information
Academic Course:
2022/23
Academic Center:
337 - Engineering School
Study:
3377 - Bachelor's Degree in Computer Engineering
Subject:
25432 - Introduction to Natural Language Processing Techniques for Everyday Applications
Ambit:
---
Credits:
5.0
Course:
3 and 4
Teaching languages:
Theory: Group 1, English
Practice: Group 101, English
Seminar: Group 101, English
Teachers:
Leo Wanner
Teaching Period:
First quarter
Schedule:
Presentation
Natural Language Processing (NLP) techniques are present in many applications that we use in our daily routine, ranging from simple spellcheckers to personal assistants such as Alexa, Siri, Google Now, or Cortana. The objective of the course is twofold: (i) to provide an overview of the state of the art in the individual techniques used in selected common applications, including, e.g., automatic text summarization, sentiment analysis, hate speech classification, automatic report generation, or the above-mentioned personal assistants; and (ii) to teach the skills necessary to design and build prototypical NLP-based applications using off-the-shelf modules that are now widely available as open-source software.
The activities of the course are of three different types:
- Lectures, in which the theoretical background of fundamental notions and techniques in NLP is introduced and their use is demonstrated by drawing upon state-of-the-art applications;
- Laboratory assignments, in which students work collaboratively in small groups on experiments with off-the-shelf techniques and implement their own solutions;
- Seminars, in which students present and discuss an NLP solution described in a scientific publication of their choice.
Associated skills
The course contributes to the basic skills and expertise acquired during the undergraduate studies:
- The capacity to collect and interpret relevant data in the area of Computer Science and Artificial Intelligence in general, and Natural Language Processing in particular, in order to be able to assess and comment on relevant topics from the scientific, ethical and social points of view.
- The capacity to communicate information, ideas, problems and solutions in the area of Natural Language Processing to the general public and NLP scholars alike.
Furthermore, the course contributes to transversal skills related to:
CE1. Solving the mathematical problems that can arise in engineering and applying knowledge of: linear algebra; differential and integral calculus; numerical methods, numerical algorithms, statistics, and optimization.
CE8. Mastering the concepts of programming and data structures, including principles of secure design and defensive programming, program verification and error detection.
CE10. Recognizing basic algorithmic procedures and applying them for the resolution of computational problems, analyzing the solution’s suitability and complexity.
CE11. Solving complex computational problems using the principles and techniques of intelligent systems.
Learning outcomes
Students are expected to gain knowledge of state-of-the-art NLP techniques and to acquire the skills both to integrate publicly available off-the-shelf modules into applications and to develop simple applications of their own that use state-of-the-art techniques. In particular:
RA.CE1.5 Using knowledge of statistics to solve problems that can arise in engineering.
RA.CE8.3 Designing and using advanced data structures and the most suitable algorithms for solving a problem.
RA.CE10.3 Applying basic techniques of artificial intelligence.
RA.CE11.2 Solving complex problems using machine learning techniques.
RA.CE11.3 Applying advanced intelligent computation techniques for the design and development of intelligent applications.
Sustainable Development Goals
Natural Language Processing applications contribute to the achievement of most of the 17 UN Sustainable Development Goals (including, e.g., Goal 1 – No Poverty, Goal 3 – Ensure Healthy Lives and Promote Well-Being for All at all Ages, Goal 4 – Quality Education, etc.).
Prerequisites
The basic prerequisites for successfully following the course are programming skills, familiarity with the basic principles of Artificial Intelligence, and basic knowledge of logic. A closer acquaintance with classical and deep (neural-network-based) machine learning techniques is also highly desirable.
Contents
The theoretical lectures of the course are grouped into four main thematic blocks (which are not necessarily taught in this order):
1. Introduction
2. Representations and metrics in NLP:
   - How do we represent words, sentences and texts in NLP? Particular emphasis will be put on modern deep embedding techniques.
   - What are the metrics to measure their distribution, correlation, similarity, etc. and thus assess their relevance, derive their structure or semantics?
3. Instruments for the implementation of NLP techniques:
   - Classical Machine Learning models
   - Deep Neural Network models
4. Applications that build upon the thematic blocks 2 and 3, including, e.g. (the studied applications may vary, depending on the preference of the audience):
   - Text analysis
   - Text summarization
   - Sentiment analysis / opinion mining
   - Author profiling / author identification
   - Hate speech classification
   - Dialogue management in personal assistant applications
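As a small illustration of the kind of similarity metric named in the block on representations and metrics, the following sketch computes the cosine similarity between word vectors. It is a minimal example under stated assumptions: the three-dimensional vectors are invented toy values, not real embeddings, and the function name `cosine_similarity` is chosen here for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy "embeddings" (made-up numbers, for illustration only).
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

With these toy values, "cat" scores higher against "dog" than against "car", which is the behaviour one would hope for from embeddings that capture semantic relatedness.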
Teaching Methods
The methodology of the course immerses the student in NLP through theoretical lessons, practical lab sessions and seminars. In the theoretical lessons, the student will learn about the theoretical (and, to the extent needed, mathematical) background of selected NLP techniques and their use in applications. For illustration, some well-known applications will also be studied. In the practical lab, the student will implement selected techniques in restricted form and learn how to use the code of open-source implementations as stand-alone modules and/or as modules integrated into more complex applications. In the seminars, each student will read a state-of-the-art scientific article on an NLP field of their choice, summarize it, and present it in front of the class.
Evaluation
The evaluation consists of three parts:
- an exam, which can be either written or oral and which assesses the student's understanding of the theoretical notions and NLP techniques presented during the theory classes;
- assessment of the work carried out by the student in the context of the lab sessions;
- assessment of the seminar presentation and active participation in the seminar discussions.
The final grade is weighted as follows: 40% exam + 40% lab + 20% seminar. To pass the course, the student must pass both the exam and the combined lab and seminar part (the last two are counted together). The make-up exam, held later in the academic year, covers only the theoretical part of the course.
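The weighting can be illustrated with a short sketch. It assumes that all three parts are marked on the same scale (a 0 to 10 scale is used here as an assumption; the plan does not state the scale), and it deliberately does not model the pass requirements, since no pass threshold is given. The function name `final_grade` is hypothetical.

```python
def final_grade(exam: float, lab: float, seminar: float) -> float:
    """Weighted final grade: 40% exam + 40% lab + 20% seminar."""
    return 0.4 * exam + 0.4 * lab + 0.2 * seminar


# Example with made-up marks on an assumed 0-10 scale.
print(round(final_grade(exam=7.0, lab=8.0, seminar=6.0), 2))  # 7.2
```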
Bibliography and information resources
A bibliography for each topic will be provided in class.
Students are also encouraged to consult:
Y. Goldberg (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool Publishers. The e-version of the book will be uploaded to Aula Global of the course.