EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING
This course is an introduction to data-driven methods applied to natural language processing. The emphasis is on methods, but we will survey applications such as syntactic parsing, text classification, information extraction, tagging, and summarization. The final lectures will deal with statistical machine translation.
Lecturer: Philipp Koehn
TA: Tommy Herbert
Lectures: Monday and Thursday, 5:10pm, WRB room G.11 (room changed)
Tutorials: Tuesday and Friday, 1pm, AT 4.12
Tutorial group assignments.
TUTORIALS
- Tutorial 1 was discussed on January 22 (Tuesday) and 25 (Friday).
- Tutorial 2 will be discussed on February 1 (Friday) and 5 (Tuesday).
- Tutorial 3 will be discussed on February 8 (Friday) and 12 (Tuesday).
- The project baseline systems will be presented on February 22 (Friday) and 26 (Tuesday).
- Tutorial 4 will be discussed on March 4 (Tuesday) and 7 (Friday).
- Tutorial 5 will be discussed on March 11 (Tuesday) and 14 (Friday).
ASSESSMENT
A single assessment, worth 30% of the course mark, will be given out in late January. You will have to turn in your paper and code in class at the end of March. If you have a problem accessing the data from the web site, it is also available at /home/miles/projects/ner/data-eng/ (English) and /home/miles/projects/ner/data-deu/ (German).
The rest of the marks (70%) will go on the exam. Past exam, solutions.
SYLLABUS
Exact dates are subject to change. Topics may shift and change during flight.
MS refers to "Manning and Schütze", JM refers to "Jurafsky and Martin", K to "Koehn", the three textbooks listed below.
REFERENCES
When possible, online papers will be made available. As for books, the key references are: