This is a short report of my visit to ICCC, or the International Conference on Computational Creativity 2019, in Charlotte, NC. It’s not meant as a summary, but as an overview of the things I found interesting during the conference. If you want more information about the talks, check the conference program, or read the live-tweets from Matthew Guzdial, Christoph Salge and me.

I’ve marked some places in this text with a (*), which means you can find an accompanying paper in the ICCC 2019 proceedings.

ICCC is a small conference: single track and about 150 participants. The conference concentrates on the automation of creativity in all its aspects, from creative computer science (how can we generate paintings, poems or dance?) to conceptual AI (how can we formalize the cognitive aspects of human creativity?), cultural sciences (does a computer generated text differ in cultural value from a human-written one?) and philosophy (can computers be creative? What does creativity even mean?). Despite the varying backgrounds of the participants, most of the talks were understandable and relevant for my own research. ICCC turned out to be a pretty good fit for my research projects, which deal with coherence and context of automatically generated texts.

I found that Computational Creativity doesn’t just value a generative system for its raw output, but also pays attention to the conceptual idea behind a system. Attendees seemed to appreciate the thought process behind a project: what input do we use and why, which approach do we choose, what libraries and datasets are available, do we make the project open-source or not, etc. I visited ICCC to present a poster about Churnalist, a headline generator for creating fictional headlines around a context. Churnalist is certainly not state of the art when it comes to language generation, but the system is ideal for testing hypotheses about text generation for a specific context. At ICCC I noticed that this topic interested other participants as well, and that people could appreciate the conceptual side of Churnalist, even though its outputs are not that sophisticated yet.

I also noticed a certain openness about negative results in research work at ICCC, which I think is very important, as transparency in research leads to more progress. If you know which approaches work and why, it’s easy to adopt part of someone else’s approach in your own research, even if their work is in a different application domain. I know this sounds like basic science methodology, but in practice researchers tend to hide their bad results or ugly academic code, and use the approaches that sound most promising to funding agencies. In other words, the ICCC community matched my own ideas about science methodology and research best practices.

Workshop on Generative Deep Learning

On 18 June, I attended the workshop on deep learning and computational creativity (DL4CC). The workshop was an informal meeting with about 30 people and consisted of presentations and discussions. Even though I don’t use deep learning in my current work, the quality of the discussions made it worthwhile to attend.

Highlights of the workshop:

  • AI-driven systems for co-creation should no longer be seen as unintelligent tools. Instead, they should be designed as articulate creative partners. Ideally, these systems would not only give the user more creative possibilities, but also make suggestions and critical remarks during the creative process, and correct or even interrupt the user. The biggest problem in deep generative systems is a human-computer-interaction problem. Actively collaborating with AI fits the new focus on reflection and framing in Computational Creativity: “simply creating something” is no longer enough for a creative system. Systems should present generated output together with a justification, so that the user can follow the underlying reasoning and generative process. See also the recent revival of “explainable AI”.
  • Some speakers defined “creativity” of outputs using the parameter “surprise”, but I was critical of this. Creativity judgments defined in terms of surprise can never be objective, because surprise is strongly linked to experience and consequently to age, social environment, etc. Nick Montfort’s keynote on day 3 (see below) made me slightly less sceptical.
  • Mikhail Jacob presented a project with virtual agents that play improv theatre in VR. He emphasized the importance of generating a “creative arc” instead of the optimal artifact. If a virtual agent always chose the most creative artifact (for example, the weirdest or most surprising act), you would not get a coherent story over time. I can imagine that this is important for generating any artifact with a sequence, such as music, dance or animations. I recognize this problem from my research in language generation, where, for example, consecutively choosing the most beautiful word does not always lead to the most coherent text.
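The difference between greedily maximizing surprise and following a creative arc can be illustrated with a toy sketch. All names and numbers below are hypothetical stand-ins, not Jacob’s actual method:

```python
def pick_greedy(options, surprise):
    """Always pick the option with the highest surprise score."""
    return max(options, key=surprise)

def pick_with_arc(options, surprise, target):
    """Pick the option whose surprise is closest to the arc's target level."""
    return min(options, key=lambda o: abs(surprise(o) - target))

# Toy setup: options are their own surprise scores (0 = dull, 1 = weird).
options = [0.1, 0.3, 0.5, 0.7, 0.9]
surprise = lambda o: o

# A rising-then-falling arc of desired surprise levels over five story beats.
arc = [0.15, 0.55, 0.85, 0.45, 0.25]

greedy_story = [pick_greedy(options, surprise) for _ in arc]
arc_story = [pick_with_arc(options, surprise, t) for t in arc]

print(greedy_story)  # always the weirdest act: [0.9, 0.9, 0.9, 0.9, 0.9]
print(arc_story)     # builds and releases tension: [0.1, 0.5, 0.9, 0.5, 0.3]
```

The greedy agent repeats the weirdest act forever; the arc-following agent produces a sequence whose tension rises and falls, which is the coherence-over-time property Jacob emphasized.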

After the talks, there was a discussion session in which we brainstormed about the current challenges for deep generative systems, to find fruitful directions for new research. The following five main points emerged from this session:

  • How do we evaluate creative systems and generated artifacts? How do we define creativity?
  • How do we deal with human-computer co-creation? How should new generative systems behave to facilitate collaboration? How do we build the next generation of explainable, reflective, framing and collaborative generative systems?
  • What do we do with gaps in the latent space of generative systems? How do we find these gaps, how do we classify them and how do we address them? In other words, how do we “escape” from the latent space learned from a training set?
  • As deep learning is applied to more application domains, we run into domains for which no ‘big data’ is available (to use as training data). How do we deal with this?
  • How can we build personalized models in deep generative systems? What additional training data do we need from the user, how do we get that data and how do we incorporate it in the general model?

The results of this discussion have been shared with the workshop participants. I hope they will soon be published online for the rest of the scientific community.

Day 1

The first day of the conference was opened by Mary Lou Maher. She explained ICCC’s approach for welcoming new people to the conference (which I loved as a first-time attendee): first-time ICCC participants were explicitly invited to ask questions after a talk, and session chairs were asked to favor new people over acquaintances when picking audience members for questions.

Rebecca Fiebrink gave the first keynote. Her work centers on building usable creative systems that use supervised machine learning. Her approach to researching creative systems reminded me of Kate Compton’s, which also places the user first. Fiebrink gave a demo of The Wekinator, a supervised machine learning system that can be used for creating music. The system solves the problem of lack of training data by asking users on the fly for examples.
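The idea of collecting training examples interactively can be sketched in a few lines. This is a toy nearest-neighbour mapping, not the Wekinator’s actual implementation (which supports richer models and real sensor input):

```python
def predict(examples, x):
    """Return the output of the training example closest to input x."""
    closest = min(examples, key=lambda ex: abs(ex[0] - x))
    return closest[1]

# The user demonstrates a few (sensor value, desired output) pairs on the fly;
# no pre-existing dataset is needed.
examples = [
    (0.0, "low note"),   # e.g. hand held low  -> low note
    (1.0, "high note"),  # e.g. hand held high -> high note
]

print(predict(examples, 0.2))  # closest to 0.0 -> "low note"
print(predict(examples, 0.8))  # closest to 1.0 -> "high note"
```

Because the model is rebuilt from whatever examples the user has given so far, the user can immediately hear the effect of each new demonstration and correct the mapping on the spot.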

In her keynote, Fiebrink emphasized that sometimes it is better for a system to focus on the mistakes in the model. In machine learning, we tend to focus too much on positive metrics like accuracy. However, for a user it’s more useful to know where the gaps and mistakes in the underlying model are, and where those mistakes come from. I recognized this difference from a CLIN talk by Erik Tjong Kim Sang about historians using machine learning to classify historical documents. Big surprise: the historians only wanted to use machine learning if it was 100% transparent where classification errors came from.

I also thought the HiBot was very cute: an Arduino robot with a motion sensor that cheerfully waves back whenever someone waves at it.

Kyle Booten presented work* inspired by Erasmus’ book De Copia, a book of exercises for improving rhetoric and text variation. I liked both that Booten drew inspiration from a historical Dutch figure (and his texts and methods) and that he conducted small, elegant crowdsourcing experiments to test his hypotheses.

Tony Veale discussed a new Twitterbot* that responds to any Twitter account that uses the phrase “Read me like a book”. The bot analyses tweets from these accounts to build an 11-parameter personality and language model for that account, and uses the model to tweet book recommendations to the user. The bot builds on Veale’s earlier work on metaphor generation and computational humor. An example conversation between Veale’s Trumpbot and the book bot can be read here.

Although I like work related to Twitter bots and language generation, I’m critical of the bot’s social behavior. It tends to tweet at unsuspecting users who have not explicitly invited the bot to interact with them, and its output can contain sensitive or controversial content, which is not in line with best practices for Twitter bots.

Razvan Bunescu mentioned a paper by West and Horvitz, in which the researchers turned a classic linguistic crowdsourcing experiment around: they presented headlines from the satirical news site ‘The Onion’ to crowd workers and asked them to change one word, turning the humorous headline into a serious one, for example “BP ready to resume oil {spilling, drilling}”. This is much easier for humans than coming up with satirical headlines on the spot, but the result is the same: a high-quality dataset that can be used to study satirical language.

Poster session and demos

At the poster session I presented my prototype of Churnalist*. The session coincided with the demo session (in another, adjacent room) and a social event with good food and wine. I spoke to various people (Kyle Booten, Sarah Harmon, Eyal Gruss, Pablo Gervás and many others) about my work, their work, art, music, our experiences visiting the USA and the future of computational creativity, which I enjoyed immensely.

Kyle Booten presented a demo called Fragile Pulse*, which seemed like a satirical version of meditation apps. I didn’t have time to try it out, but the demo revolves around a text written by Booten that is paired with a microphone and motion sensor. If the reader of the text makes noise or moves a lot, the text is modified by a text generator to include all kinds of anxiety-related words. In other words: the only way to read the original text is by sitting still and being silent.

Day 2

On day two of the conference, artists Lilla LoCurto and Bill Outcault opened with a keynote about conceptual art that incorporates technology. The keynote started with a bit of contemporary art history, with lots of examples by other artists. I particularly liked Beuys Voice by Nam June Paik (a robot built from old television sets) and Stochastic Process Painting by Cheyney Thompson (a painting consisting of tiny hand-painted squares, with each square’s color picked algorithmically by a computer). LoCurto and Outcault also discussed their own works the willful marionette and cat’s cradle – it was fascinating to hear about the development of their artistic work. In order to realise their ideas, they needed to collaborate with organisations, programmers, and researchers throughout the entire process.

Mike Cook recorded a talk about his survey* on framing in computational creativity. Framing can be seen as the sign next to an artwork in a museum. It gives the audience extra context in which they can interpret the artwork: the name of the artist, the year in which it was made, methods and materials used and the story behind the piece. Cook argues that creative systems should provide something similar when presenting output to the user. I figure that providing some kind of framing for an output is the minimum requirement for creating more explainable generative systems. It can also help us take a step in the next direction: creating generative systems that can reflect on the things that they’re generating. I have not read the accompanying paper yet, but I estimate that it will be important in my work – and that of the rest of the Computational Creativity community – in the coming years.

Alison Pease made an interesting observation about framing in academia: “Papers frame systems, talks frame papers.” Very true, and it illustrates clearly why framing is such an important concept in Computational Creativity.

Day 3

Nick Montfort opened the third and final day of the conference. In his keynote he distinguished between H-creativity (global, universal) and P-creativity (personal, individual) and posed this question to the audience: what lies in between? Drawing on examples, Montfort argued that subjectivity plays a larger role in computational creativity than we (scientists) like to admit, and that it should get more attention, especially during evaluation. In the evaluation phase, we should keep in mind the cultural background of the evaluators. Similarly, the definition of “creativity” is a cultural artifact, and consequently not something objective. We shouldn’t abandon the notion of objective research, but we should be aware of the culturally determined aspects of creativity. Montfort finished with slides with advice for Computational Creativity researchers, which are well worth a read.

A presentation by Maya Ackerman about field work in computational creativity* listed creative ways of obtaining feedback during every phase of the research process. She also did a small reading from the published book of computer-generated stories by the MEXICA story generation system by Rafael Perez y Perez.

Christoph Salge gave a short talk about the Generative Design in Minecraft Competition. I was happy to hear that this year there will be a bonus assignment focussing on chronicle generation*! The goal is to generate a chronicle for the procedurally generated Minecraft settlements, i.e. a text that describes the historical background of the settlement. It should be clear from the generated text which chronicle belongs to which settlement. I’m looking forward to the results of this new bonus challenge, as it has some overlap with my PhD topic: natural language generation for games. If you are interested in participating in the competition, be sure to check out the website. The organizing team made it really easy to participate. As long as you can code basic Python, you can experiment with their bootstrapping scripts and create a submission for the competition.

Pablo Gervás talked about INES, a storytelling system that can create multiplot stories*. I particularly liked the architecture of the system, which is based on multiple microservices. I also recognized Gervás’ story about the development process from my own experience: although the goal was to design and implement the whole system, the primary researcher ended up working on one tiny but complex part of it during his PhD.

At some point in the talk, Gervás also mentioned the following dilemma for researchers: are we going to build tiny systems that tackle one specific problem, or are we going to investigate large difficult problems that we encounter in many different problem spaces? Building solutions to tiny problems leads to completable projects with a clear scope and thus quicker/more publications. Solving the large difficult problems is probably more valuable for science and society in the long run, but not as “profitable” (career-wise) for early career researchers because of the high pressure to publish. An interesting observation.

The final conference session was about computational humor and colorful language.

Khalid Alnajjar presented work on a creative headline generator built by a team from the University of Helsinki. Although their goal was different from that of Churnalist, our implementations used the same libraries and datasets – an encouraging observation.

Thomas Winters gave a talk about joke generation for ‘I like my X like I like my Y, Z’ jokes*. The trick is to find nouns that have a particular (adjective) property in common, such as: “I like my coffee like I like my war, cold”. He and his co-authors also created JokeJudger, an open-source platform in JavaScript for generating and evaluating this type of joke. It might be useful for evaluating other types of text too.
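The core trick – finding two nouns that share an adjective – can be sketched with a tiny word-association table. The table below is hypothetical; Winters and his co-authors mine such associations from real data:

```python
# Hypothetical noun -> adjective associations (real systems mine corpora).
associations = {
    "coffee": {"cold", "strong", "black"},
    "war": {"cold", "long"},
    "men": {"strong", "tall"},
}

def i_like_my_jokes(associations):
    """Yield 'I like my X like I like my Y, Z' jokes from shared adjectives."""
    nouns = list(associations)
    for i, x in enumerate(nouns):
        for y in nouns[i + 1:]:
            # Z must be an adjective associated with both X and Y.
            for z in sorted(associations[x] & associations[y]):
                yield f"I like my {x} like I like my {y}, {z}"

jokes = list(i_like_my_jokes(associations))
print(jokes)
# e.g. "I like my coffee like I like my war, cold"
```

Of course, generating candidates is the easy half; judging which candidates are actually funny is where a human-rating platform like JokeJudger comes in.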

Conclusion

In short, I felt right at home at ICCC. The research community was very welcoming to newcomers, and the talks and social events were enjoyable, interesting and productive. Next year’s conference is in Coimbra (Portugal) from June 29 to July 3, 2020, and I’m definitely going to try to visit ICCC again. In 2021 the conference will be in Mexico City, Mexico.

If you want to talk about any of the stuff above, send me a message on Twitter!

(Thanks to Ruud de Jong for proofreading this humongous blog post!)