Discussion Ontology: Knowledge Discovery from Human Activities in Meetings

Hironori TOMOBE
21st Century COE Program on Intelligent Media Integration, Nagoya University
Katashi NAGAO
Center for Information Media Studies, Nagoya University

1 Introduction

Discussion Mining is a preliminary study on gathering knowledge from the content of offline (face-to-face) discussion meetings. We developed a system that acquires information from various sources (textual data, audio-visual data, metadata, and so on) and semi-automatically generates structured discussion content data. The goal of this study is to activate knowledge in a discussion group by analyzing meetings, for example, by extracting the argument flow and discovering arguments related to ongoing discussions. However, to analyze structured discussion content data, we need an efficient methodology for meeting discussions.

Discussion ontology forms the basis of our discussion methodology. It requires the semantic relations among the elements of a meeting to be clarified. In this paper, to build a discussion ontology, we generate discussion content and analyze meeting metadata.

2 Discussion Mining

In discussion mining, we record human activity in the real world using four cameras and a microphone. We focus on meetings that include a presenter, a secretary, and participants, where the presenter presents her/his agenda using Microsoft PowerPoint. The presenter uses a dedicated tool to transmit the slides and their timing, so that this part of the meeting minutes is recorded automatically. The secretary, meanwhile, uses another dedicated tool to record the content of each speaker’s utterances. Participants in the meeting transmit their IDs and comments using tag-type devices called discussion tags, which preserve the structure of the discussion. Furthermore, using a button device, participants can freely input their stance on what is being presented as well as on discussions with other participants.


In discussion mining, we target face-to-face meetings. The detailed scenery of a meeting is recorded with four cameras and a microphone installed in the meeting room. One camera records the main screen, another records the presenter's face, and the two remaining cameras record the participants’ actions. Audio information is recorded using a microphone installed in the center of the meeting room. Figure 1 shows an image of the discussion room. The secretary records the speaker’s argument using a dedicated tool. Participants support the secretary in structuring the minutes (or metadata) using an electronic tag called a discussion tag. While speaking, a participant chooses one of two discussion tags and holds it up toward the ceiling. The system detects the speaker's position from the position data of an infrared sensor and automatically turns one of the cameras toward the speaker. Two or more participants can use discussion tags simultaneously, and the tags can also be used to segment overlapping utterances.

There are two types of discussion tags: "start-up" and "follow-up." Participants use the start-up tag to make remarks that trigger a discussion, and the follow-up tag to make remarks that relate to an ongoing discussion. Using these tags, discussion segmentation is performed automatically; this supports the analysis of the discussion and clarifies points when a video of the meeting is viewed. During a presentation or a discussion, each participant can input information corresponding to her or his stance by operating a button device. Participants can input three stance types: "agree," "disagree," and "neutral." This information is collected and recorded automatically by the discussion mining system, which can also evaluate utterances and determine which stance type forms the majority. This information is added to the minutes in real time and is edited by the secretary. Currently, the secretary inputs the text manually, although in the future this will be done automatically using speech recognition technology. A record of this information is saved in XML and MPEG-4 formats and is stored as multimedia minutes in an XML database.
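The tag-based segmentation and the majority-stance evaluation described above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the `Statement` class, its field names, and the tie-handling rule (returning `None` when no single stance leads) are assumptions introduced only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Statement:
    """One utterance in the minutes (hypothetical structure)."""
    speaker: str
    tag: str          # "start-up" or "follow-up"
    text: str
    stances: list = field(default_factory=list)  # "agree" / "disagree" / "neutral"


def segment_discussions(statements):
    """Group statements into discussions: each start-up tag opens a new segment,
    and each follow-up tag is appended to the segment currently open."""
    segments = []
    for statement in statements:
        if statement.tag == "start-up" or not segments:
            segments.append([statement])
        else:
            segments[-1].append(statement)
    return segments


def majority_stance(stances):
    """Return the most frequent stance, or None if there is no input or a tie
    (the tie rule is an assumption, not taken from the paper)."""
    ranked = Counter(stances).most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Illustrative data only; speakers and texts are made up.
minutes = [
    Statement("A", "start-up", "Is manual transcription scalable?", ["agree", "agree", "neutral"]),
    Statement("B", "follow-up", "Speech recognition could replace it."),
    Statement("C", "start-up", "How are overlapping utterances handled?"),
]
segments = segment_discussions(minutes)
```

Here `segments` contains two discussions, the first holding the start-up statement by A together with B's follow-up, and `majority_stance(minutes[0].stances)` evaluates to `"agree"`.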

This system enables us to visualize the structure of these multimedia minutes by providing a graphical display and an edit mode for statements using Scalable Vector Graphics (SVG). The graph is semi-automatically structured using pertinent information and keywords from statements and slides, as shown in Figure 2. This function also allows users to edit information.
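One way to picture the keyword-based part of this semi-automatic structuring is the following Python sketch, which proposes candidate links between statements and slides that share keywords. The paper does not specify the matching method, so the keyword extraction, the stopword list, and the function names `keywords` and `suggest_links` are all assumptions; in the described system a user would still confirm or edit such links.

```python
import re

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "and", "how", "does", "are", "for", "with", "this", "that"}


def keywords(text):
    """Extract a crude keyword set: lowercase alphanumeric tokens longer than
    two characters that are not stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if len(t) > 2 and t not in STOPWORDS}


def suggest_links(statements, slides, min_shared=1):
    """Return candidate (statement_index, slide_index, shared_keywords) links
    for every statement/slide pair sharing at least min_shared keywords."""
    slide_keywords = [keywords(slide) for slide in slides]
    links = []
    for i, statement in enumerate(statements):
        statement_keywords = keywords(statement)
        for j, slide_kw in enumerate(slide_keywords):
            shared = statement_keywords & slide_kw
            if len(shared) >= min_shared:
                links.append((i, j, sorted(shared)))
    return links


# Illustrative texts only.
statements = ["How does the infrared sensor locate the speaker?"]
slides = ["Camera control", "Infrared sensor and discussion tags"]
links = suggest_links(statements, slides)
```

With these inputs, the only candidate link connects the statement to the second slide through the shared keywords "infrared" and "sensor". Raising `min_shared` would make the suggestions more conservative at the cost of missing weaker relations.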

3 Discussion Ontology -- An Approach

3.1 Statement Classification using intention tags