You can sometimes find open topics posted here. If no topics are currently posted, you may still get in touch with Professor Demberg or one of the postdocs or PhD students to ask about current topics. If you have an idea of your own, please feel free to suggest it to Prof. Demberg.
Analysis of English-French translation of discourse relations using automatic word alignments
Discourse relations are logical relations between segments of a text that make the text coherent. They are often marked by discourse connectives. For example, the connective "because" marks a "reason" relation. Each language has its own inventory of connectives, and connectives often lack one-to-one correspondences across languages. For instance, the French connective "en effet" can be translated as "indeed" or "in fact" in English. When it is used to mark a "cause" relation, it is often omitted (implicitated) in the English translation (Zufferey 2016). Previous work used automatic word alignment to induce French connective lexicons and discourse annotations by projecting annotations from English (Laali and Kosseim 2014, Laali 2017). In this project, we would like to use this technique to study how discourse relations are marked in English-French translations.
Laali, Majid, and Leila Kosseim. "Inducing discourse connectives from parallel texts." Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014.
Zufferey, Sandrine. "Discourse connectives across languages: Factors influencing their explicit or implicit translation." Languages in Contrast. International Journal for Contrastive Linguistics 16.2 (2016): 264-279.
Laali, Majid. Inducing discourse resources using annotation projection. Diss. Concordia University, 2017.
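The projection idea described above can be illustrated with a minimal Python sketch. It is not the method of the cited works, only a toy version under stated assumptions: word alignments are given as "i-j" index pairs (English token index, French token index), English connectives are already annotated as token spans with a relation label, and the names parse_alignment, project_connective, tally and the two example sentence pairs are hypothetical. An English connective with no aligned French token is treated as a candidate for an implicit translation, although in practice it may also be an alignment error.

```python
from collections import Counter

# Placeholder label for English connectives with no aligned French token.
UNALIGNED = "<no aligned token>"


def parse_alignment(line):
    """Parse 'i-j' pairs into a dict: English index -> set of French indices."""
    en_to_fr = {}
    for pair in line.split():
        i, j = pair.split("-")
        en_to_fr.setdefault(int(i), set()).add(int(j))
    return en_to_fr


def project_connective(en_span, fr_tokens, en_to_fr):
    """Return the French tokens aligned to an English connective span.

    Returns None if no token of the span is aligned, i.e. the connective
    may have been left implicit (or the aligner may have missed it).
    """
    fr_indices = sorted({j for i in en_span for j in en_to_fr.get(i, ())})
    if not fr_indices:
        return None
    return " ".join(fr_tokens[j].lower() for j in fr_indices)


def tally(examples):
    """Count (relation, English connective, projected French string) triples."""
    counts = Counter()
    for ex in examples:
        en_tokens = ex["en"].split()
        fr_tokens = ex["fr"].split()
        en_to_fr = parse_alignment(ex["alignment"])
        for span, relation in ex["connectives"]:
            en_conn = " ".join(en_tokens[i].lower() for i in span)
            fr_conn = project_connective(span, fr_tokens, en_to_fr) or UNALIGNED
            counts[(relation, en_conn, fr_conn)] += 1
    return counts


# Hand-made toy data (hypothetical, not taken from any corpus).
EXAMPLES = [
    {
        "en": "He stayed home because he was ill .",
        "fr": "Il est resté à la maison parce qu' il était malade .",
        "alignment": "0-0 1-1 1-2 2-3 2-4 2-5 3-6 3-7 4-8 5-9 6-10 7-11",
        "connectives": [([3], "reason")],
    },
    {
        "en": "It rained , so we stayed .",
        "fr": "Il pleuvait , nous sommes restés .",
        "alignment": "0-0 1-1 2-2 4-3 5-4 5-5 6-6",
        "connectives": [([3], "result")],
    },
]

counts = tally(EXAMPLES)
for (relation, en_conn, fr_conn), n in sorted(counts.items()):
    print(relation, en_conn, "->", fr_conn, n)
```

On real data, the alignments would come from an automatic word aligner and the English annotations from a discourse-annotated corpus or a discourse parser; the resulting counts per relation could then be inspected for explicit versus implicit translations.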
Bachelor / Master Seminar and Thesis
- You need to do our Bachelor/Master Seminar before you can register for the thesis.
- The seminar is used to further specify the topic of the thesis, perform a literature review, identify suitable methods and formulate the hypotheses you want to test as part of your thesis. It consists of two deliverables: 1) a talk (approx. 30 min, followed by questions); 2) a seminar paper including the introduction to your topic, a literature review and the specification of methods and hypotheses (10-20 pages).
- After completing both parts of the seminar, you (!) need to write an email to sek-vd(at)lst.uni-saarland.de. In this email, please put Prof. Demberg and your advisor in CC and provide the following information: your full name, your matriculation number, the date of your seminar talk and the title of the talk. Only then will the processing of the LSF data begin.
- You need to register your thesis. This is only possible once you have finished the Bachelor / Master seminar and the data has been entered in the LSF.
- You need to defend your thesis (this can be shortly before or shortly after handing in your final thesis).
- You need to write your thesis (a German and an English template can be found here).
The following questions are considered when grading a thesis, where applicable to the specific thesis topic; further questions may be considered if they are relevant to it. This list is intended to give you an overview of the aspects that are important to a thesis. If you have any further questions, ask your advisor for more information.
- Is the thesis topic (as agreed upon initially) properly addressed?
- Does the thesis show that the student applied appropriate scientific methods (i.e., decisions were made in an informed manner and documented properly, etc.)?
- Related work
- Is the selection of related work applicable and comprehensive?
- Was the feedback on the related work during the bachelor / master seminar properly integrated in the thesis?
- Is the related work appropriately presented (i.e., is it described in a focused way, is it clearly shown why it is relevant to the student's own work and which aspects have flowed into that work, etc.)?
- Are citations used correctly and wherever needed?
- Is the bibliography complete with consistent formatting?
- Execution of the written part
- Does the abstract properly describe the thesis?
- Is the thesis structured correctly and comprehensibly?
- Is the motivation of the thesis clearly elaborated on?
- Does the thesis contain a clear summary of the results achieved?
- Is there a critical discussion on the performance and the limitations of the work to reflect on the choices made?
- Is future work thoroughly described, and are its connections to the student's own work well presented?
- Is the language appropriate and free of spelling mistakes?
- Is the thesis internally consistent in its terminology (e.g., special terms are always written in the same form)?
- Is the thesis consistent and free of incorrect descriptions (i.e., there are no contradictions within the thesis, etc.)?
- Is the thesis presented clearly and are the means of presentation appropriate (e.g., short sentences, images are used where reasonable, images are easy to understand, etc.)?
- Is the layout of the thesis appropriate (i.e., all images are referenced, no widows and orphans, tables are properly formatted, etc.)?
- Is the concept (in relation to the thesis topic) presented thoroughly in the thesis?
- Are the hypotheses formulated clearly?
- Is the chosen and described solution novel?
- Is the concept appropriately presented with a motivation why this solution is the correct one to target the goal of the work?
- For theses addressing an NLP task:
- Has the task been addressed comprehensively?
- Has the dataset been chosen appropriately?
- Is the chosen method / algorithm suitable for the task?
- Were training and testing conducted correctly (e.g., were hyperparameters chosen based on the dev set and were different random initializations tried, if applicable)?
- Were evaluation measures chosen appropriately?
- Is an error analysis provided?
- Were statistical tests conducted to check whether the obtained differences are statistically significant?
- Is the code base made available (GitHub or similar) and is it documented following ACL guidelines / best practices?
- Are the descriptions in the thesis sufficient to allow for replicability?
- Have the results of the evaluation been discussed with respect to the hypotheses of the thesis?
- For theses in experimental psycholinguistics:
- Are the experimental materials of good quality and well documented (well designed, no confounds), if applicable?
- Is the methodological approach appropriate (addresses the hypotheses, suitable experimental design)?
- Was the experiment implemented correctly (with respect to randomization, counter-balancing, choice of fillers, task instructions, practice trials, etc.)?
- Was the number of participants in the study chosen in a well-motivated way?
- Were the participants selected appropriately?
- Has the study been pre-registered?
- Did the study follow ethical guidelines?
- Was the data handled in a way that is in line with data protection (pseudonymization or anonymization, appropriate storage, etc.)?
- Was the data analysed correctly (statistics)?
- Is the study described well enough so that it could be replicated?
- Are the results presented clearly, and discussed with respect to the hypotheses?
- For theses that contain data set collection:
- Was the pre-processing and/or post-processing of the data performed appropriately and correctly?
- Were the instructions given to annotators clear (annotation scheme / instructions to crowd-workers)?
- Was the data source chosen in a well-motivated way?
- Is the quality of the data good and have data quality checks been performed?
- Is the dataset described using descriptive statistics?