A quick overview of my PhD research project

February 12, 2018 — Published in Computational linguistics, Natural language generation

I’m working as a PhD student at the University of Twente, where I research natural language generation for adaptive games. People often ask me what my research is about, so I figured I should write a blog post to explain a bit more about my research field.

What is a “natural language”?

A natural language is a language that is used by humans, such as English, Dutch or Japanese, as opposed to the formal languages of mathematics and logic. Note that this is a very informal definition, since I’m a computer scientist and not a linguist. If you ask a linguist to define a natural language, you will probably get many different answers, such as the definition from Wikipedia.

What do you mean by natural language generation?

When I say “natural language generation”, I mean the task of automatically generating (creating) human language text, for example with a piece of software. The software takes data as input, and outputs text in a natural language.

You can create all kinds of texts with natural language generation: dialogues, riddles, jokes, social media messages, novels, poetry or more formal texts like weather forecasts, product descriptions and sports match summaries.

In some sense, natural language generation (NLG) is the opposite or inverse of natural language processing (NLP): with NLG you generate text from data (which can be text as well), whereas with NLP you extract data from natural language texts. Some people hope that you can apply NLP techniques to NLG problems and vice versa, by designing algorithms that work in both directions.

Natural language generation is part of computational linguistics, the interdisciplinary research field where computer science and linguistics meet.

Can you give me an example of natural language generation?

An example of a natural language generation program is a program that greets the user by name, with a greeting that depends on the time. The input data in this case is a name and the current time, and the output is a sentence in English, which is a natural language.

Input                Output
("Judith", 11:06)    "Good morning, Judith!"
("Hector", 14:45)    "Good afternoon, Hector!"
("Rachel", 19:25)    "Good evening, Rachel!"
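To make this a bit more concrete, here is a minimal sketch of such a greeting program in Python. The function name greet and the cut-off times (morning before 12:00, afternoon before 18:00, evening after that) are my own assumptions for illustration; a real system could draw these boundaries differently.

```python
from datetime import time


def greet(name: str, now: time) -> str:
    """Return an English greeting that depends on the time of day."""
    # Assumed boundaries (not fixed by any standard): morning before 12:00,
    # afternoon before 18:00, evening otherwise.
    if now < time(12, 0):
        part_of_day = "morning"
    elif now < time(18, 0):
        part_of_day = "afternoon"
    else:
        part_of_day = "evening"
    # Fill the chosen words into a fixed sentence template.
    return f"Good {part_of_day}, {name}!"


if __name__ == "__main__":
    for name, now in [("Judith", time(11, 6)),
                      ("Hector", time(14, 45)),
                      ("Rachel", time(19, 25))]:
        print(greet(name, now))
```

This is about the simplest form of natural language generation: a fixed sentence template with two slots that are filled in from the input data. Most of the text types mentioned earlier need a lot more than a template, but the basic idea of data in and text out stays the same.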

What is your research project about?

In my specific research project, I try to apply natural language generation techniques to adaptive training games, i.e. training games that adapt their content to the player.

The two subjects that make up my research project (natural language generation and adaptive games) are each too extensive for one PhD project. This means I’ll have to choose a more specific focus for the upcoming four years. At the moment (February 2018), I’m mostly reading (survey) papers and thinking up as many relevant open problems as possible. Later this year I will decide on more specific research questions together with my supervisors and the rest of the project team.