The Hot Air Generator is an AI-driven, interactive media installation
that exposes the tendency of generative AI models to hallucinate
believable, yet non-factual data when prompted with information they
have not been explicitly trained on. After recording the sound of a
steaming water kettle, sequentially chained up AI models (speech audio
enhancement model → speech-to-text model → large language model) process
the outputs of their predecessors and create a newspaper article from
thin air.
Javascript, MQTT, Processing, Arduino, OpenAI GPT-3.5, The New York
Times API, various machine learning services (accessed via web APIs),
Exhibited at
TRANSFORM 2023
Conference on AI, sustainability, art and design in Trier, Germany.
The chain of machine learning models that receive the output of their
predecessors as inputs.
The speech audio enhancement model listens to the emitted sounds of the
water kettle.
As the audio recording enhancement models has ever only been trained on
human speech, it cannot generate anything else than oral sounds. This
leads to the model to interpret the dynamic noises of the water kettle as
speech and "enhances" them into sounds that cleary resemble a human voice,
yet are entirely incomprehensible and lack any semantic meaning.
(For public exhibitions, the water kettle is not recorded
live. Instead, the model processes pre-recorded water kettle sounds, so
voices from surrounding visitors are not accidentally included in the
audio data.)
The mumbling ouput sound is then fed into a
speech-to-text model where the same phenomenon occurs. Based on a
statistical analysis of what matches most the speech data it has been
originally trained on, the model wrongly identifies certain parts of the
audio output of the previous model as clear speech and transcribes it into
a non-sensical english sentence.
In a last step, this generated sentence is given to a large language
model. The model is prompted to combine the transcript with a headlinefrom
the title page of the new york times at the given moment in time which is
received via an API and generate a newspaper article from it. The model
then creates a text that is grammatically correct and semantically
plausible, yet entirely fictional and non-factual.