I’m working as a PhD student at the University of Twente, where I research natural language generation for adaptive games. People often ask me what my research is about, so I figured I should write a blog post to explain a bit more about my research field.
A natural language is a language that is used by humans, such as English, Dutch or Japanese, as opposed to the formal languages of mathematics and logic. Note that I use this very informal definition, since I’m a computer scientist and not a linguist. If you ask a linguist to define a natural language, you will probably get many different answers, such as the definition from Wikipedia.
When I say “natural language generation”, I mean the task of automatically generating (creating) human language text, for example with a piece of software. The software takes data as input, and outputs text in a natural language.
You can create all kinds of texts with natural language generation: dialogues, riddles, jokes, social media messages, novels, poetry, or more formal texts like weather forecasts, product descriptions and sports match summaries.
In some sense, natural language generation (NLG) is the opposite or inverse of natural language processing (NLP): with NLG you generate text from data (which can be text as well), whereas with NLP you extract data from natural language texts. Some people hope that you can apply NLP techniques to NLG problems and vice versa, by designing algorithms that work in both directions.
Natural language generation is part of computational linguistics, the interdisciplinary research field where computer science and linguistics meet.
An example of a natural language generation program is a program that greets the user by name, with a greeting that depends on the time. The input data in this case is a name and the current time, and the output is a sentence in English, which is a natural language.
Input               Output
("Judith",11:06)    "Good morning, Judith!"
("Hector",14:45)    "Good afternoon, Hector!"
("Rachel",19:25)    "Good evening, Rachel!"
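To make this a bit more concrete, here is a minimal sketch of what such a greeting program could look like in Python. The function name `greet` and the exact boundaries between morning, afternoon and evening (12:00 and 18:00) are my own assumptions for illustration; the examples above only tell us that 11:06 is morning, 14:45 is afternoon and 19:25 is evening.

```python
from datetime import time


def greet(name: str, now: time) -> str:
    """Return an English greeting for `name` that depends on the time of day."""
    # Assumed boundaries (not fixed by the examples): morning ends at 12:00,
    # afternoon ends at 18:00, and everything after that counts as evening.
    if now < time(12, 0):
        part_of_day = "morning"
    elif now < time(18, 0):
        part_of_day = "afternoon"
    else:
        part_of_day = "evening"
    return f"Good {part_of_day}, {name}!"


# The input data: pairs of a name and a time.
examples = [
    ("Judith", time(11, 6)),
    ("Hector", time(14, 45)),
    ("Rachel", time(19, 25)),
]

for name, now in examples:
    print(greet(name, now))
```

Even in this tiny program you can see the general shape of natural language generation: structured data goes in (a name and a time), a decision is made about what to say (which part of the day it is), and a sentence in a natural language comes out.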
In my specific research project, I try to apply natural language generation techniques to adaptive training games, i.e. training games that adapt their content to the player.
The two subjects that make up my research project (natural language generation and adaptive games) are both too extensive for one PhD project. This means I’ll have to choose a more specific focus for the upcoming four years. At the moment (February 2018), I’m mostly reading (survey) papers and thinking up as many relevant open problems as possible. Later this year I will decide on more specific research questions together with my supervisors and the rest of the project team.
Judith van Stegeren is a Dutch computer scientist. She is working as a PhD candidate at the University of Twente, where she researches natural language generation for the video games industry. She occasionally works as a consultant in data engineering for textual data.