Song Form-aware Full-Song Text-to-Lyrics Generation with Multi-Level Granularity Syllable Count Control

Yunkee Chae, Eunsik Shin, Hwang Suntae, Seungryeol Paik, Kyogu Lee · November 20, 2024

Summary

The paper proposes a song form-aware framework for generating full-song lyrics from input text, with syllable count control at multiple levels of granularity. Lyrics generation poses challenges that ordinary text generation does not: the output must follow the input text, the specified song form, and the target syllable counts at the same time. The framework uses a structured token system to control syllable counts at the word, phrase, line, and paragraph levels, and it generates complete lyrics conditioned on the input text and the song form. Experiments compare models trained from scratch with models initialized from large-scale pre-trained GPT-2 weights, on both lyrics generation and lyrics infilling. The models are evaluated on perplexity, syllable count distance, syllable count error rate, and song form consistency. The study suggests that this approach makes lyrics generation more flexible, and the authors plan to add genre tags or rhyme control in future work.
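The two syllable-count metrics named in the summary can be illustrated with a minimal sketch. The summary does not give their formulas, so the definitions below are assumptions: the error rate is taken to be the fraction of units (for example, lines) whose generated syllable count differs from the target, and the distance is taken to be a symmetric relative difference averaged over units. The function names are hypothetical, and syllable counts are assumed to be computed elsewhere.

```python
from typing import Sequence


def _check_inputs(targets: Sequence[int], generated: Sequence[int]) -> None:
    """Validate that both sequences are non-empty, aligned, and positive."""
    if len(targets) != len(generated):
        raise ValueError("targets and generated must have the same length")
    if not targets:
        raise ValueError("at least one unit is required")
    if any(count <= 0 for count in list(targets) + list(generated)):
        raise ValueError("syllable counts must be positive")


def syllable_count_error_rate(targets: Sequence[int], generated: Sequence[int]) -> float:
    """Assumed definition: share of units whose syllable count misses the target."""
    _check_inputs(targets, generated)
    mismatches = sum(1 for t, g in zip(targets, generated) if t != g)
    return mismatches / len(targets)


def syllable_count_distance(targets: Sequence[int], generated: Sequence[int]) -> float:
    """Assumed definition: mean of |t - g| / t and |t - g| / g over all units."""
    _check_inputs(targets, generated)
    total = 0.0
    for t, g in zip(targets, generated):
        diff = abs(t - g)
        total += diff / t + diff / g
    return total / (2 * len(targets))


if __name__ == "__main__":
    # Hypothetical per-line syllable counts for a four-line verse.
    target_counts = [8, 6, 8, 6]
    generated_counts = [8, 7, 8, 6]
    print(syllable_count_error_rate(target_counts, generated_counts))
    print(syllable_count_distance(target_counts, generated_counts))
```

Under these assumed definitions, both metrics are zero when every unit matches its target. The distance also reflects how far off a miss is, whereas the error rate only counts misses.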

Introduction
  Background
    Overview of text-to-lyrics generation challenges
    Importance of song form in lyrics creation
  Objective
    Aim of the research: developing a framework for full-song text-to-lyrics generation with multi-level granularity control
Method
  Data Collection
    Types of input texts used for training
    Song forms considered in the framework
  Data Preprocessing
    Tokenization techniques for structured representation
    Methods for handling multi-level granularity
  Model Training
    Training from scratch vs. initialization with pre-trained GPT-2 weights
  Evaluation Metrics
    Perplexity for assessing model performance
    Syllable count distance and error rate for accuracy
    Song form consistency for structural adherence
Results
  Comparison of models trained with different approaches
  Analysis of generated and infilled lyrics quality
Discussion
  Flexibility in Lyrics Generation
    Insights into enhanced control over lyrics creation
  Challenges and Limitations
    Examination of remaining hurdles in text-to-lyrics conversion
  Future Work
    Plans for incorporating genre tags or rhyme control
Conclusion
  Summary of contributions and implications for the field
  Outlook on potential advancements in text-to-lyrics generation
Basic info
papers
computation and language
artificial intelligence
Insights
How does the system control syllable counts at different levels of granularity?
What are the key evaluation metrics used to assess the models' performance in generating and infilling lyrics?
What future enhancements are suggested for the system, particularly in relation to genre tags or rhyme control?
What is the main focus of the song form-aware framework described in the text?