CS3TM: Text Mining and Natural Language Processing

Module code: CS3TM

Module provider: Computer Science; School of Mathematical, Physical and Computational Sciences

Credits: 20

Level: Level 3 (Honours)

When you'll be taught: Semester 2

Module convenor: Professor Xia Hong, email: x.hong@reading.ac.uk

Pre-requisite module(s):

Co-requisite module(s):

Pre-requisite or Co-requisite module(s):

Module(s) excluded:

Placement information: NA

Academic year: 2024/5

Available to visiting students: Yes

Talis reading list: Yes

Last updated: 21 May 2024

Overview

Module aims and purpose

The aim of this module is to introduce the field of text mining and natural language processing. A key focus of the module is the theory and practice of processing text data at the lexical, syntactic, and semantic levels.

This module also encourages students to develop a set of professional skills, such as problem solving, critical thinking, scientific evaluation, creativity, technical report writing, organization and time management, and self-reflection.

Module learning outcomes

By the end of the module, it is expected that students will be able to: 

  1. Understand and apply the fundamental principles of text mining and natural language processing; 
  2. Apply methods and algorithms to process different types of textual data; 
  3. Empirically evaluate the performance of methods and algorithms using accuracy and efficiency metrics; and 
  4. Apply analytical and programming skills using existing NLP methods and tools such as NLTK and scikit-learn (Python). 

Module content

The module covers the following topics: 

  • Regular expressions, text normalization 
  • N-grams and language models, part-of-speech tagging 
  • Lexical semantics, word senses and WordNet 
  • Syntactic and semantic parsing 
  • Text classification, sentiment analysis 
  • Information extraction, including named entity recognition and relation extraction 
  • Advanced topics: machine learning for NLP, word embeddings, hidden Markov models and the Viterbi algorithm 
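
As a rough illustration of the first two topics listed above, the following sketch normalizes a short text with a regular expression and estimates a bigram probability by maximum likelihood. It is not taken from the module materials: the function names and the sample sentence are hypothetical, and only the Python standard library is used, not NLTK or scikit-learn.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase the text and return its word tokens.

    A token here is a run of letters, digits, or apostrophes; this is a
    deliberately simple rule chosen for illustration.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def bigram_counts(tokens):
    """Count each pair of adjacent tokens."""
    return Counter(zip(tokens, tokens[1:]))


def bigram_probability(tokens, first, second):
    """Maximum-likelihood estimate of P(second | first).

    Computed as count(first, second) / count(first); returns 0.0 when
    `first` never occurs in the token list.
    """
    unigrams = Counter(tokens)
    if unigrams[first] == 0:
        return 0.0
    return bigram_counts(tokens)[(first, second)] / unigrams[first]


# Hypothetical sample text, used only to exercise the functions.
tokens = normalize("The cat sat. The cat ran!")
print(tokens)
print(bigram_probability(tokens, "cat", "sat"))
```

In this sample, "cat" occurs twice and is followed by "sat" once, so the estimate is one half. A practical language model would also need smoothing for unseen bigrams, which this sketch omits.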

Structure

Teaching and learning methods

The lectures will introduce students to the theories, concepts and underpinning principles specified in the indicative content. In the practical sessions, students will be supervised as they apply these concepts and principles to given problems.

The lectures and practical sessions will enable students to gain practice with established NLP software, perform analysis, and write reports.

Digital learning materials will also be provided where they are needed to support learning.

There are two types of assessment (formative and summative), which will support and reinforce students’ learning. Formative assessment is carried out through weekly learning activities, either exemplar questions or sample programming problems.

Summative assessment consists of one written coursework assignment and one written examination. The coursework assignment requires students to demonstrate scientific writing in an individual report. Appropriate feedback will be communicated to students in a timely manner to enhance their learning.

Study hours

At least 38 hours of scheduled teaching and learning activities will be delivered in person, with the remaining hours for scheduled and self-scheduled teaching and learning activities delivered either in person or online. You will receive further details about how these hours will be delivered before the start of the module.


 Scheduled teaching and learning activities  Semester 1  Semester 2  Summer
Lectures 22
Seminars 8
Tutorials
Project Supervision
Demonstrations
Practical classes and workshops 8
Supervised time in studio / workshop
Scheduled revision sessions
Feedback meetings with staff
Fieldwork
External visits
Work-based learning


 Self-scheduled teaching and learning activities  Semester 1  Semester 2  Summer
Directed viewing of video materials/screencasts
Participation in discussion boards/other discussions
Feedback meetings with staff
Other
Other (details)


 Placement and study abroad  Semester 1  Semester 2  Summer
Placement
Study abroad

Please note that the hours listed above are for guidance purposes only.

 Independent study hours  Semester 1  Semester 2  Summer
Independent study hours 162

Please note the independent study hours above are notional numbers of hours; each student will approach studying in different ways. We would advise you to reflect on your learning and the number of hours you are allocating to these tasks.

Semester 1: The hours in this column may include hours during the Christmas holiday period.

Semester 2: The hours in this column may include hours during the Easter holiday period.

Summer: The hours in this column will take place during the summer holidays and may be at the start and/or end of the module.

Assessment

Requirements for a pass

Students need to achieve an overall module mark of 40% to pass this module.

Summative assessment

Type of assessment | Detail of assessment | % contribution towards module mark | Size of assessment | Submission date | Additional information
Online written examination | Exam | 50 | 2 hours | Semester 2 Assessment Period | Answer 3 out of 4 questions
Set exercise | Technical report | 50 | 7 pages (excluding appendices). 20 hours | Semester 2, Teaching Week 11 |

Penalties for late submission of summative assessment

The Support Centres will apply the following penalties for work submitted late:

Assessments with numerical marks

  • where the piece of work is submitted after the original deadline (or any formally agreed extension to the deadline): 10% of the total marks available for that piece of work will be deducted from the mark for each working day (or part thereof) following the deadline up to a total of three working days;
  • the mark awarded due to the imposition of the penalty shall not fall below the threshold pass mark, namely 40% in the case of modules at Levels 4-6 (i.e. undergraduate modules for Parts 1-3) and 50% in the case of Level 7 modules offered as part of an Integrated Masters or taught postgraduate degree programme;
  • where the piece of work is awarded a mark below the threshold pass mark prior to any penalty being imposed, and is submitted up to three working days after the original deadline (or any formally agreed extension to the deadline), no penalty shall be imposed;
  • where the piece of work is submitted more than three working days after the original deadline (or any formally agreed extension to the deadline): a mark of zero will be recorded.
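
The four rules above for numerically marked work can be sketched as a short function. This is an illustrative reading of the rules, not an official calculator: the function name and parameters are hypothetical, marks are assumed to be out of 100, and the default pass mark of 40 is the threshold stated for Levels 4-6.

```python
import math


def apply_late_penalty(raw_mark, working_days_late, pass_mark=40, max_mark=100):
    """Return the recorded mark after the late-submission rules.

    `working_days_late` may be fractional; any part of a working day
    counts as a full day, so it is rounded up.
    """
    days = math.ceil(working_days_late)
    if days <= 0:
        return raw_mark  # submitted by the deadline: no penalty
    if days > 3:
        return 0  # more than three working days late: mark of zero
    if raw_mark < pass_mark:
        return raw_mark  # already below the pass mark: no penalty imposed
    # Deduct 10% of the total marks available for each working day late.
    penalised = raw_mark - max_mark * days / 10
    # The penalty cannot take the mark below the threshold pass mark.
    return max(penalised, pass_mark)


print(apply_late_penalty(65, 2))
print(apply_late_penalty(55, 3))
```

Under this reading, a report marked 65 and submitted two working days late is recorded as 45, while one marked 55 and submitted three working days late is held at the pass mark of 40 rather than falling to 25.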

Assessments marked Pass/Fail

  • where the piece of work is submitted within three working days of the deadline (or any formally agreed extension of the deadline): no penalty will be applied;
  • where the piece of work is submitted more than three working days after the original deadline (or any formally agreed extension of the deadline): a grade of Fail will be awarded.

The University policy statement on penalties for late submission can be found at: /cqsd/-/media/project/functions/cqsd/documents/qap/penaltiesforlatesubmission.pdf

You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.

Formative assessment

Formative assessment is any task or activity which creates feedback (or feedforward) for you about your learning, but which does not contribute towards your overall module mark.

Each weekly topic has defined learning tasks that enable students to reflect on their learning.

Outcomes of the formative assessment for each topic may be given in the tutorial guidance notes or in online test feedback.

Pseudocode and executable Python code for basic algorithms are provided weekly.

Reassessment

Type of reassessment | Detail of reassessment | % contribution towards module mark | Size of reassessment | Submission date | Additional information
Online written examination | Exam | 100 | 3 hours | During the University resit period | Answer 4 out of 6 questions

Additional costs

Item Additional information Cost
Computers and devices with a particular specification
Required textbooks They are specified in Talis.
Specialist equipment or materials
Specialist clothing, footwear, or headgear
Printing and binding
Travel, accommodation, and subsistence

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT'S CONTRACT.
