Math + Code + Art Initiative · UCLA Mathematics · 2026
Inner Landscapes
Sound, Image, and Emotion Through Neural Networks
Shanmei Wanyan, Jiayin Lu, Hanyin Zhang, Ying Jiang, Yue Sun, Wanxi Yang, Yumeng He, Chenfanfu Jiang
Exhibition Statement
Inner Landscapes explores how contemporary AI systems translate between sound, image, and emotional meaning. Combining language-model-assisted image selection, neural image representations, and audio-conditioned generation, the work asks how emotional meaning can emerge from associations across sensory modalities.
Each piece begins with a reference image selected to represent an emotional state such as calm, anxiety, nostalgia, grief, or joy. A large language model recommends candidate images based on visual composition, atmosphere, and semantic associations, and the artist then selects the final reference image.
A coordinate-based neural network learns the visual structure of the reference image from pixel coordinates while receiving the music spectrogram of the current time frame as input. The network is trained simultaneously on multiple zoom levels of the image. Different spectrogram thresholds emphasize different visual scales, allowing the representation to shift continuously between larger and smaller structures. As the music evolves, the imagery appears to expand, contract, and breathe in response to changes in rhythm, intensity, and frequency content.
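For readers curious about the mechanics, the conditioning described above can be sketched in a few lines of NumPy. This is a minimal, untrained illustration rather than the artists' implementation: the layer sizes, the 16-bin spectrogram frame, the three zoom factors, and every function name (`make_inputs`, `init_mlp`, `mlp_forward`, `zoom_coords`) are assumptions made for the example.

```python
import numpy as np

def make_inputs(coords, spec_frame):
    """Attach the same spectrogram frame to every (x, y) coordinate."""
    tiled = np.broadcast_to(spec_frame, (coords.shape[0], spec_frame.shape[0]))
    return np.concatenate([coords, tiled], axis=1)

def init_mlp(rng, sizes):
    """Random weights for a small fully connected network (hypothetical sizes)."""
    return [(rng.normal(0.0, np.sqrt(2.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """ReLU hidden layers followed by a sigmoid, so outputs are RGB in [0, 1]."""
    for w, b in params[:-1]:
        x = np.maximum(x @ w + b, 0.0)
    w, b = params[-1]
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def zoom_coords(rng, n, zoom):
    """Sample n coordinates from a centered window covering 1/zoom of the image."""
    half = 0.5 / zoom
    return rng.uniform(0.5 - half, 0.5 + half, size=(n, 2))

rng = np.random.default_rng(0)
spec_frame = rng.random(16)                    # assumed 16-bin spectrogram frame
params = init_mlp(rng, [2 + 16, 64, 64, 3])    # assumed architecture

# One batch mixes several zoom levels, echoing the multi-scale training idea.
batch = np.concatenate([zoom_coords(rng, 128, z) for z in (1.0, 2.0, 4.0)])
inputs = make_inputs(batch, spec_frame)
rgb = mlp_forward(params, inputs)              # one color per sampled coordinate
```

In a full system the weights would be fitted so that `rgb` matches the reference image at each sampled coordinate, and a new `spec_frame` would be supplied for every frame of the music.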
To preserve visually significant features, an edge-based density map assigns greater training importance to regions containing strong structural detail. Fourier coordinate encoding enables the network to capture high-frequency image information, producing a balance between sharp, focused details and larger, softer painterly forms. This creates a visual effect similar to selective focus.
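The two ideas in this paragraph can be illustrated with a short NumPy sketch. It assumes a gradient-magnitude edge detector and a power-of-two frequency ladder, neither of which is specified in the statement; the names `edge_density`, `sample_pixels`, and `fourier_encode`, the `floor` parameter, and the test image are hypothetical.

```python
import numpy as np

def edge_density(gray, floor=0.05):
    """Sampling probabilities that favor strong edges (gradient magnitude).

    The floor keeps flat regions from being ignored entirely.
    """
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    density = mag + floor * (mag.max() + 1e-12)
    return density / density.sum()

def sample_pixels(rng, density, n):
    """Draw n pixel positions with probability proportional to the density map."""
    flat = rng.choice(density.size, size=n, p=density.ravel())
    return np.unravel_index(flat, density.shape)

def fourier_encode(coords, num_bands=6):
    """Map coordinates in [0, 1] to sines and cosines at doubling frequencies."""
    freqs = (2.0 ** np.arange(num_bands)) * np.pi
    angles = coords[:, :, None] * freqs                  # (n, 2, num_bands)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(coords.shape[0], -1)

rng = np.random.default_rng(0)
gray = np.zeros((32, 32))
gray[:, 16:] = 1.0                                       # toy image: one vertical edge

density = edge_density(gray)
rows, cols = sample_pixels(rng, density, 2000)
edge_fraction = np.mean((cols == 15) | (cols == 16))     # samples landing on the edge

coords = np.stack([rows, cols], axis=1) / (np.array(gray.shape) - 1)
encoded = fourier_encode(coords)                         # features fed to the network
```

On this toy image most samples land on the two columns around the edge, which is the sense in which detailed regions receive greater training importance. The encoded coordinates oscillate quickly at the higher bands, which is what lets a small network reproduce fine detail.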
The trained representation is then rendered through a stochastic random-walk painting process. Strokes accumulate on a digital canvas, while music intensity and emotion-specific styles influence their length, width, transparency, and directional randomness. The result is an evolving painting in which sound drives motion, color, and structural change.
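A random-walk painting pass of this kind might look roughly like the sketch below. It is an illustration under stated assumptions, not the process used for the works: the style dictionary, its keys, the `0.5 + intensity` scaling, and the placeholder `color_at` function (standing in for a query to the trained representation) are all hypothetical.

```python
import numpy as np

def paint_stroke(canvas, rng, start, color_at, intensity, style):
    """Accumulate one random-walk stroke on the canvas with alpha blending.

    Music intensity in [0, 1] scales stroke length, width, and directional jitter.
    """
    h, w, _ = canvas.shape
    scale = 0.5 + intensity
    steps = int(style["base_length"] * scale)
    radius = max(1, int(round(style["base_width"] * scale)))
    alpha = style["alpha"]
    pos = np.array(start, dtype=float)
    angle = rng.uniform(0.0, 2.0 * np.pi)
    for _ in range(steps):
        angle += rng.normal(0.0, style["jitter"] * scale)   # directional randomness
        pos += np.array([np.sin(angle), np.cos(angle)])
        pos = np.clip(pos, 0, [h - 1, w - 1])
        r, c = int(pos[0]), int(pos[1])
        r0, r1 = max(0, r - radius), min(h, r + radius + 1)
        c0, c1 = max(0, c - radius), min(w, c + radius + 1)
        patch = canvas[r0:r1, c0:c1]
        canvas[r0:r1, c0:c1] = (1.0 - alpha) * patch + alpha * color_at(pos)
    return canvas

def color_at(pos):
    """Placeholder palette; a real system would query the trained network here."""
    return np.array([0.2, 0.4 + 0.4 * pos[0] / 63.0, 0.8])

rng = np.random.default_rng(0)
canvas = np.ones((64, 64, 3))                               # blank white canvas
calm = {"base_length": 40, "base_width": 1.5, "alpha": 0.15, "jitter": 0.2}

for _ in range(50):
    start = rng.uniform(0, 63, size=2)
    canvas = paint_stroke(canvas, rng, start, color_at,
                          intensity=rng.random(), style=calm)
```

Swapping in a style with larger `jitter` and shorter strokes would give a more agitated result, which is one plausible way emotion-specific styles could be expressed.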
Each dynamic painting video expresses a distinct emotional landscape. Calm appears through gentle movement and cool tones, anxiety through unstable branching and flickering intensity, nostalgia through fading and re-emerging traces, grief through dimming and dissolution, and joy through expanding forms and radiant color.
Mathematics, machine learning, computer vision, and stochastic processes function here as artistic media. By translating between sound, image, motion, and meaning, the series shows how neural networks can generate abstract yet recognizable emotional landscapes.
Works
Calm · 2026
Anxiety · 2026
Nostalgia · 2026
Grief · 2026
Joy · 2026
Medium
Dynamic video painting. Neural image representations, audio-conditioned generation, stochastic painting process.