publications
2025
- Towards Music Industry 5.0: Perspectives on Artificial Intelligence. Alexander Williams and Mathieu Barthet. In Artificial Intelligence for Music Workshop at the 39th Annual AAAI Conference on Artificial Intelligence, Mar 2025.
Artificial Intelligence (AI) is a disruptive technology that is transforming many industries, including the music industry. Recently, the concept of Industry 5.0 has been proposed, emphasising principles of sustainability, resilience, and human-centricity to address current shortcomings in Industry 4.0 and its associated technologies, including AI. In line with these principles, this paper puts forward a position for ethical AI practices in the music industry. We outline the current state of AI in the music industry and its wider ethical and legal issues through an analysis and discussion of contemporary case studies. We list current commercial applications of AI in music, collect a range of perspectives on AI in the industry from diverse stakeholders, and comment on existing and forthcoming regulatory frameworks and industry initiatives. As a result, we provide several timely research directions, practical recommendations, and commercial opportunities to aid the transition to a human-centric, resilient, and sustainable Music Industry 5.0. This work focuses particularly on Western music industry case studies in the European Union (EU), United States of America (US), and United Kingdom (UK), but many of the issues raised are universal. While this work is not exhaustive, we nevertheless hope it guides researchers, businesses, and policymakers in developing responsible frameworks for deploying and regulating AI in the music industry.
2024
- Deep Learning-based Audio Representations for the Analysis and Visualisation of Electronic Dance Music DJ Mixes. Alexander Williams, Haokun Tian, Stefan Lattner, Mathieu Barthet, and Charalampos Saitis. In AES International Symposium on AI and the Musician, Jun 2024.
Electronic dance music (EDM), produced using computers and electronic instruments, is a collection of musical subgenres that emphasise timbre and rhythm over melody and harmony. It is usually presented through the medium of DJing, where tracks are curated and mixed sequentially to offer unique listening and dancing experiences. However, unlike key and tempo, for which annotations exist, DJs still rely on audition rather than metadata to examine and select tracks with complementary audio content. In this work, we investigate the use of deep learning-based representations (Complex Autoencoder (CAE) and OpenL3) for analysing and visualising audio content on a corpus of DJ mixes with approximate transition timestamps, and compare them with signal processing-based representations (joint time-frequency scattering transform and mel-frequency cepstral coefficients). Representations are computed once per second and visualised with UMAP dimensionality reduction. We propose heuristics, based on patterns observed in the visualisations and time-sensitive Euclidean distances in the representation space, to compute DJ transition lengths, transition smoothness, and intra-song, song-to-song, and full-mix audio content consistency from the audio representations and rough DJ transition timestamps. Our method enables the visualisation of variations within music tracks, facilitating the analysis of DJ mixes and individual EDM tracks and supporting musicians in making informed creative decisions based on such visualisations. We share our code, dataset annotations, computed audio representations, and trained CAE model, and encourage researchers and music enthusiasts alike to analyse their own music using our tools: https://github.com/alexjameswilliams/EDMAudioRepresentations.
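Not the authors' released code (linked above), but a minimal sketch of the per-second representation and visualisation step, assuming the openl3, umap-learn, and soundfile Python packages; the input file name is hypothetical:

```python
# Minimal sketch (not the authors' released code): per-second OpenL3 embeddings
# of a DJ mix, a 2-D UMAP projection for visualisation, and a simple
# frame-to-frame Euclidean distance as a rough content-consistency signal.
# "dj_mix.wav" is a hypothetical input file.
import numpy as np
import soundfile as sf
import openl3
import umap

audio, sr = sf.read("dj_mix.wav")

# One embedding per second (hop_size=1.0), matching the per-second analysis above.
emb, timestamps = openl3.get_audio_embedding(
    audio, sr, content_type="music", embedding_size=512, hop_size=1.0
)

# Project the mix's trajectory through the representation space down to 2-D.
coords = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(emb)

# Distance between consecutive seconds: low values suggest consistent audio
# content, spikes suggest abrupt changes such as rough transitions.
frame_dist = np.linalg.norm(np.diff(emb, axis=0), axis=1)
print(coords.shape, float(frame_dist.mean()))
```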
- Invited Talk: Using AI to Augment Creativity in Electronic Dance Music. Alexander Williams. Oct 2024.
AI is a disruptive technology that is being applied to numerous areas of the music industry. However, the reception from creators is mixed. Some fear that AI may be used to automate opportunities away from human creators, while others see AI as a tool that can be used in beneficial ways. This talk will showcase recent research applications of AI to augment human creativity in two electronic dance music case studies: 1) co-creative music artwork generation; and 2) analysing DJ mixes to inform human creative practice. It will also touch on some of the ethical and legal concerns raised by these AI technologies in music.
- Invited Talk: Applications and Perspectives of AI in the Music Industry. Alexander Williams. Dec 2024.
This seminar will introduce a co-creative process for generating images to accompany a piece of music using pre-trained neural network models. In this process, visuals are influenced not only by the audio of a piece of music, but also by a corpus of illustrations and prompts recommended by the user, in order to anchor the generated content in a creative approach. The seminar will also discuss the important topics of ethics and responsible innovation regarding AI and music, addressing the issues, and the industry and governance initiatives, guiding researchers, companies, and policymakers in developing fair and responsible frameworks for deploying and regulating AI in the music industry.
- Tutorial: Model Pipelines for AI Music Artwork Generation. Alexander Williams. Dec 2024.
In this workshop, we will study how to combine state-of-the-art deep neural network models for music-to-text conversion (MusiCNN and LP-MusicCaps), image-to-text (CLIP Interrogator), keyword detection via natural language processing (KeyBERT), and text-to-image generation (Stable Diffusion) to recommend, through generation, visuals for a piece of music. The Colab notebook used during the workshop will include a sequencer to produce musical sequences from randomly selected audio samples. Images illustrating these musical sequences will be generated using pre-trained models. The workshop will also cover Python libraries for recording energy use during computation (CodeCarbon).
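As a rough illustration of how such a pipeline might be wired together (not the workshop notebook itself), the sketch below combines keyword extraction (KeyBERT), text-to-image generation (Stable Diffusion via the diffusers library), and energy tracking (CodeCarbon); the music caption is a hypothetical stand-in for MusiCNN / LP-MusicCaps output, and the checkpoint choice and GPU assumption are mine:

```python
# Minimal sketch of the kind of pipeline the workshop combines (not the workshop
# notebook itself): extract keywords from a music caption with KeyBERT, generate
# an image with Stable Diffusion, and track energy use with CodeCarbon.
# The caption string is a hypothetical stand-in for MusiCNN / LP-MusicCaps output.
import torch
from keybert import KeyBERT
from diffusers import StableDiffusionPipeline
from codecarbon import EmissionsTracker

caption = "a driving four-on-the-floor techno track with bright synth stabs"  # hypothetical

tracker = EmissionsTracker()  # records energy use of the computation
tracker.start()

# Keyword detection: keep the top terms from the caption to build a visual prompt.
keywords = [kw for kw, _ in KeyBERT().extract_keywords(caption, top_n=5)]
prompt = "album artwork, " + ", ".join(keywords)

# Text-to-image generation with a pre-trained Stable Diffusion checkpoint (GPU assumed).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
image = pipe(prompt).images[0]
image.save("artwork.png")

emissions = tracker.stop()
print(f"Prompt: {prompt}\nEstimated CO2 (kg): {emissions}")
```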
2023
- A Reinforcement Learning Approach to Powertrain Optimisation. Hocine Matallah, Asad Javied, Alexander Williams, Ashraf Fahmy Abdo, and Fawzi Belblidia. In Sustainable Design and Manufacturing, Dec 2023.
Motivated by constraints in the motor manufacturing industry, we provide a strategy to reduce computation time and improve minimisation performance in the optimisation of battery electric vehicle powertrains. This paper proposes a holistic design exploration approach to investigate and identify the optimal powertrain concept for cars based on component costs and energy consumption costs. Optimal powertrain design and component sizes are determined by analysing various powertrain configuration topologies, as well as single- and multi-speed gearbox combinations. The impact of powertrain combinations on vehicle attributes and total costs is investigated further. Multi-objective optimisation in this domain considers a total of 29 component parameters of differing modalities. We apply a novel reinforcement learning-based framework to the simultaneous optimisation of these 29 parameters and demonstrate the feasibility of this optimisation method for the domain. Our results show that, in comparison to single rear-motor setups, multi-motor systems offer better vehicle attributes and lower total costs. We also show that the load points of front and rear axle motors may be shifted to a higher-efficiency zone to achieve decreased energy consumption and expenses.
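A hedged sketch, not the authors' framework: the gymnasium-style skeleton below shows one way a 29-parameter powertrain optimisation could be framed as a reinforcement learning environment; the cost function, action scaling, and parameter bounds are hypothetical placeholders, and only the 29-parameter count comes from the abstract above:

```python
# Illustrative skeleton only: an agent adjusts normalised powertrain parameters
# and is rewarded for reducing a cost estimate. The cost function and step logic
# are hypothetical placeholders, not the paper's framework.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

N_PARAMS = 29  # component parameter count taken from the paper

class PowertrainEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # Agent proposes adjustments to all parameters at once (normalised to [-1, 1]).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(N_PARAMS,), dtype=np.float32)
        # Observation: the current normalised parameter vector.
        self.observation_space = spaces.Box(0.0, 1.0, shape=(N_PARAMS,), dtype=np.float32)
        self.params = None

    def _cost(self, params):
        # Placeholder for the component + energy-consumption cost a vehicle model would return.
        return float(np.sum((params - 0.5) ** 2))

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.params = self.np_random.random(N_PARAMS).astype(np.float32)
        return self.params, {}

    def step(self, action):
        self.params = np.clip(self.params + 0.05 * action, 0.0, 1.0).astype(np.float32)
        reward = -self._cost(self.params)  # lower cost -> higher reward
        return self.params, reward, False, False, {}
```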
- Sound-and-Image-Informed Music Artwork Generation Using Text-to-Image Models. Alexander J. Williams, Stefan Lattner, and Mathieu Barthet. In Music Recommender Systems Workshop at the 17th ACM Conference on Recommender Systems, Sep 2023.
Music and its accompanying artwork have a symbiotic relationship. While some artists are involved in both domains, the creation of music and artwork requires different skill sets. The development of deep generative models for music and image generation has the potential to democratise these mediums and make multi-modal creation more accessible for casual creators and other stakeholders. In this work, we propose a co-creative pipeline for the generation of images to accompany a musical piece. The pipeline utilises state-of-the-art models for music-to-text, image-to-text, and subsequently text-to-image generation to recommend, via generation, visuals for a piece of music that are informed not only by the audio of the musical piece, but also by a user-recommended corpus of artworks and prompts, giving the generated material a meaningful grounding. We demonstrate the potential of our pipeline using a corpus of material from artists with strongly connected visual and musical identities, and make it available as a Python notebook so that users can easily generate their own musical and visual compositions from their chosen corpus: https://github.com/alexjameswilliams/Music-Text-To-Image-Generation
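A minimal sketch of the image-to-text grounding step, using a BLIP captioner from Hugging Face transformers as a stand-in for the CLIP Interrogator model used in the pipeline above; the artwork file names and music description are hypothetical, and the resulting prompt would feed a text-to-image model such as Stable Diffusion:

```python
# Minimal sketch: caption a user-provided artwork corpus with BLIP (a stand-in
# here for CLIP Interrogator) and blend the captions with a music description
# to build a grounded text-to-image prompt. File names and the description are
# hypothetical placeholders.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
captioner = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

def caption(path):
    # Image-to-text: describe one artwork from the user's corpus.
    inputs = processor(images=Image.open(path).convert("RGB"), return_tensors="pt")
    out = captioner.generate(**inputs, max_new_tokens=30)
    return processor.decode(out[0], skip_special_tokens=True)

corpus_captions = [caption(p) for p in ["artwork1.jpg", "artwork2.jpg"]]  # hypothetical corpus
music_description = "melancholic ambient piece with slow piano"           # hypothetical music-to-text output

prompt = music_description + ", album artwork in the style of " + "; ".join(corpus_captions)
print(prompt)  # this prompt would then feed a text-to-image model such as Stable Diffusion
```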
2021
- Survey of Energy Harvesting Technologies for Wireless Sensor Networks. Alexander J. Williams, Matheus F. Torquato, Ian M. Cameron, Ashraf A. Fahmy, and Johann Sienz. IEEE Access, Sep 2021.
Energy harvesting (EH) technologies could lead to self-sustaining wireless sensor networks (WSNs), which are set to be a key technology in Industry 4.0. There are numerous methods for small-scale EH, but these methods differ greatly in their environmental applicability, energy conversion characteristics, and physical form, which makes choosing a suitable EH method for a particular WSN application challenging given this strong application dependency. Furthermore, the choice of EH technology is intrinsically linked to non-trivial decisions on energy storage technologies and combinatorial architectures for a given WSN application. In this paper, we survey the current state of EH technology for small-scale WSNs in terms of EH methods, energy storage technologies, and EH system architectures for combining methods and storage, including multi-source and multi-storage architectures, as well as highlighting a number of other optimisation considerations. This work is intended to provide an introduction to EH technologies in terms of their general working principle, application potential, and other implementation considerations, with the aim of accelerating the development of sustainable WSN applications in industry.
- Cascade Optimisation of Battery Electric Vehicle Powertrains. Matheus F. Torquato, Kayalvizhi Lakshmanan, Natalia Narożańska, Ryan Potter, Alexander Williams, Fawzi Belblidia, Ashraf A. Fahmy, and Johann Sienz. In Procedia Computer Science, Jan 2021.
Motivated by challenges in the motor manufacturing industry, we present a solution to reduce computation time and improve minimisation performance in the optimisation of battery electric vehicle powertrains. We propose a cascade optimisation method that takes advantage of two different vehicle models: the proprietary YASA MATLAB® vehicle model and a Python machine learning-based vehicle model derived from it. Gearbox type, powertrain configuration, and motor parameters are included as input variables to the objective function explored in this work, while constraints related to acceleration time and top speed must be met. Combining these two models in a constrained-optimisation genetic algorithm both reduced the computation time required and achieved better target values for minimising total vehicle cost than either the proprietary or the machine learning model alone. The coarse-to-fine approach used in the cascade optimisation was mainly responsible for the improved result: by using the final population of the machine learning vehicle model optimisation as the initial population of the subsequent simulation-based minimisation, the initial time-consuming search for a population satisfying all domain constraints was practically eliminated. The obtained results showed that the cascade optimisation reduced the computation time by 53% while still achieving a minimisation value 14% lower than the YASA vehicle model optimisation.
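A schematic sketch of the coarse-to-fine seeding idea only, not the actual cascade optimisation: a toy genetic loop in plain NumPy, where hypothetical cheap and expensive cost functions stand in for the machine learning surrogate and the proprietary YASA vehicle model, and the coarse stage's final population seeds the fine stage:

```python
# Schematic sketch of coarse-to-fine population seeding. The two cost functions
# below are hypothetical placeholders for the ML surrogate and the expensive
# simulation; sizes and mutation settings are toy values, not the paper's.
import numpy as np

rng = np.random.default_rng(0)
DIM, POP, GENS = 8, 40, 50  # toy problem sizes

def surrogate_cost(x):   # cheap ML-like surrogate (hypothetical)
    return np.sum((x - 0.4) ** 2, axis=-1)

def simulation_cost(x):  # expensive "simulation" (hypothetical)
    return np.sum((x - 0.5) ** 2, axis=-1) + 0.1 * np.sin(10 * x).sum(axis=-1)

def ga(cost, pop, gens):
    for _ in range(gens):
        fitness = cost(pop)
        parents = pop[np.argsort(fitness)[: POP // 2]]           # keep the best half
        children = parents + rng.normal(0, 0.05, parents.shape)  # mutate
        pop = np.clip(np.vstack([parents, children]), 0, 1)
    return pop

# Coarse stage: cheap surrogate-based GA from a random population.
coarse_pop = ga(surrogate_cost, rng.random((POP, DIM)), GENS)

# Fine stage: the surrogate's final population seeds the expensive simulation-based GA,
# skipping the slow search for an initial feasible population.
fine_pop = ga(simulation_cost, coarse_pop, GENS)
print("best simulated cost:", simulation_cost(fine_pop).min())
```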
- Real-Time Hybrid Visual Servoing of a Redundant Manipulator via Deep Reinforcement Learning. Alexander Williams. Swansea University, Feb 2021.
Fixtureless assembly may be necessary in some manufacturing tasks and environments due to various constraints, but it poses challenges for automation because of non-deterministic characteristics not favoured by traditional approaches to industrial automation. Visual servoing methods of robotic control could be effective for sensitive manipulation tasks where the desired end-effector pose can be ascertained via visual cues. Visual data is complex and computationally expensive to process, but deep reinforcement learning has shown promise for robotic control in vision-based manipulation tasks. However, these methods are rarely used in industry due to the resources and expertise required to develop application-specific systems and prohibitive training costs. Training reinforcement learning models in simulated environments offers a number of benefits for the development of robust robotic control algorithms by reducing training time and costs, and by providing repeatable benchmarks on which algorithms can be tested, developed, and eventually deployed in real robotic control environments. In this work, we present a new simulated reinforcement learning environment for developing accurate robotic manipulation control systems in fixtureless environments. Our environment incorporates a contemporary collaborative industrial robot, the KUKA LBR iiwa, with the goal of positioning its end effector in a generic fixtureless environment based on a visual cue. Observational inputs comprise the robotic joint positions and velocities, as well as two cameras whose positioning reflects hybrid visual servoing: one camera is attached to the robotic end effector and another observes the workspace. We propose a state-of-the-art deep reinforcement learning approach to solving the task environment and make preliminary assessments of the efficacy of this approach to hybrid visual servoing for the defined problem environment. We also conduct a series of experiments exploring the hyperparameter space of the proposed reinforcement learning method. Although our initial results could not demonstrate the efficacy of a deep reinforcement learning approach to solving the task environment, we remain confident that such an approach could be feasible for this industrial manufacturing challenge, and that the novel software contributed in this work provides a good basis for exploring reinforcement learning approaches to hybrid visual servoing in accurate manufacturing contexts.
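As a rough illustration only (not the thesis's released environment), the snippet below sketches a gymnasium-style observation and action space matching the description above: joint positions and velocities for the 7-DoF KUKA LBR iiwa plus two RGB cameras, one on the end effector and one observing the workspace; image resolution, action bounds, and joint limits are assumptions:

```python
# Illustrative space definitions only, not the thesis's environment. Image size,
# joint limits, and the Cartesian action bounds are hypothetical values.
import numpy as np
from gymnasium import spaces

N_JOINTS = 7  # the KUKA LBR iiwa has 7 revolute joints

observation_space = spaces.Dict({
    "joint_positions":  spaces.Box(-np.pi, np.pi, shape=(N_JOINTS,), dtype=np.float32),
    "joint_velocities": spaces.Box(-2.0, 2.0, shape=(N_JOINTS,), dtype=np.float32),
    # Hybrid visual servoing: one camera on the end effector, one observing the workspace.
    "eye_in_hand_rgb":  spaces.Box(0, 255, shape=(128, 128, 3), dtype=np.uint8),
    "workspace_rgb":    spaces.Box(0, 255, shape=(128, 128, 3), dtype=np.uint8),
})

# Example action: a small Cartesian displacement of the end effector.
action_space = spaces.Box(-0.05, 0.05, shape=(3,), dtype=np.float32)
```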