Computer Science Thesis Proposal

Location:
In Person and Virtual - ET - Reddy Conference Room, Gates Hillman 4405 and Zoom

Speaker:
SHUQI DAI, Ph.D. Student, Computer Science Department, Carnegie Mellon University
https://www.shuqid.net/

Towards Artificial Musicians: Modeling Style for Music Composition, Performance, and Synthesis via Machine Learning

The field of Artificial Intelligence Generative Content (AIGC) is increasingly delving into music content creation. However, three fundamental and intricate challenges persist in understanding and creating music: (1) multi-modal music representations, (2) highly complex and logical music structure, and (3) personalized and stylistic music preferences. This thesis tackles these three challenges by focusing on a practical application: creating virtual musicians or "re-creating" existing musicians. 

The thesis creates artificial musicians across different music creation levels and representation modalities. (1) For symbolic music composition, I combine music domain knowledge with machine learning models to compose melodies, harmonies, and bass lines while preserving specific styles. (2) For expressive performance control, which is crucial to musical creativity but often ignored, I use diffusion models to generate pitch envelopes, dynamics, and playing techniques that capture the unique performance styles of singers and instrumentalists. (3) For acoustic audio synthesis, I address both synthesis from scratch and timbre transfer for vocals and instruments, including zero-shot vocal and instrumental synthesis of unseen targets. These layers converge to model composition and musicianship across multi-modal music representations.

The thesis emphasizes music domain knowledge in stylistic and personalized music modeling, and delves into music structure analysis to improve generation quality. I further discuss applications of these technologies in areas such as music therapy, music education, theory development for non-Western music, and interactive human-computer live performance. The ethical and legal implications of AI music are also explored, with a view to its integration into the future music industry. The proposal outlines technical foundations and design frameworks for the three music creation levels, justifies technology choices, presents achievements, and offers solutions for pending tasks. The contributions, success criteria, future prospects, and research schedule are also discussed.

Thesis Committee:

Roger B. Dannenberg (Chair)
Chris Donahue 
Junyan Zhu 
Julius O. Smith (Stanford University)
Gus Guangyu Xia (Mohamed bin Zayed University of Artificial Intelligence)

Additional Information

In Person and Zoom Participation.  See announcement.

