Fasciani, Stefano; Simionato, Riccardo & Tidemann, Aleksander
(2024).
A Universal Tool for Generating Datasets from Audio Effects.
Proceedings of the Sound and Music Computing Conference 2024.
SMC Network.
ISSN 2518-3672.
Simionato, Riccardo & Fasciani, Stefano
(2024).
Conditioning Methods for Neural Audio Effects.
Proceedings of the Sound and Music Computing Conference 2024.
SMC Network.
ISSN 2518-3672.
Simionato, Riccardo & Fasciani, Stefano
(2024).
Hybrid Neural Audio Effects.
Proceedings of the Sound and Music Computing Conference 2024.
SMC Network.
ISSN 2518-3672.
Simionato, Riccardo & Fasciani, Stefano
(2023).
Fully Conditioned and Low-latency Black-box Modeling of Analog Compression.
In Serafin, Stefania; Fontana, Federico & Willemsen, Silvin (Eds.),
Proceedings of the 26th International Conference on Digital Audio Effects.
Aalborg University Copenhagen.
ISSN 2413-6700. pp. 287–295.
Simionato, Riccardo & Fasciani, Stefano
(2023).
A Comparative Computational Approach to Piano Modeling Analysis.
Proceedings of the Sound and Music Computing Conference 2023.
SMC Network.
ISBN 978-91-527-7372-7.
Simionato, Riccardo & Fasciani, Stefano
(2022).
Deep Learning Conditioned Modeling of Optical Compression.
Proceedings of the International Conference on Digital Audio Effects.
ISSN 2413-6700.
Deep learning models applied to raw audio are rapidly gaining relevance in modeling analog audio devices. This paper investigates the use of different deep architectures for modeling optical audio compression. The models take raw audio samples as input and produce raw audio samples as output at audio rate, and they work with no or small input buffers, allowing a theoretically real-time, low-latency implementation. In this study, two compressor parameters, ratio and threshold, have been included in the modeling process with the aim of conditioning the inference of the trained network. Deep learning architectures, including feed-forward, recurrent, and encoder-decoder models, are compared in modeling an all-tube optical mono compressor. The results of this study show that feed-forward and long short-term memory architectures present limitations in modeling the triggering phase of the compressor, performing well only on the sustained phase. Encoder-decoder models, on the other hand, outperform the other architectures in replicating the overall compression process, but they overpredict the energy of high-frequency components.
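The paper's actual network architectures are not reproduced here; as a rough illustration of the conditioning idea the abstract describes, i.e. feeding the compressor's control settings (ratio and threshold) to the network alongside the raw audio samples, here is a minimal NumPy sketch. The function name, frame length, and parameter values are this example's own assumptions, not taken from the paper:

```python
import numpy as np

def make_conditioned_frames(audio, ratio, threshold, frame_len=32):
    """Split raw audio into fixed-length frames and append the
    compressor settings (ratio, threshold) to every frame, so a
    network sees the control values alongside the samples."""
    n_frames = len(audio) // frame_len
    frames = audio[: n_frames * frame_len].reshape(n_frames, frame_len)
    cond = np.tile([ratio, threshold], (n_frames, 1))
    # Each row: frame_len audio samples followed by the two control values.
    return np.concatenate([frames, cond], axis=1)

x = np.random.randn(1024).astype(np.float32)
X = make_conditioned_frames(x, ratio=4.0, threshold=-18.0)
print(X.shape)  # (32, 34)
```

In practice, conditioning can also be injected deeper in the network (e.g. via FiLM-style modulation of hidden layers) rather than concatenated at the input; this sketch only shows the simplest variant.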
Bentsen, Lars Ødegaard; Simionato, Riccardo; Wallace, Benedikte & Krzyzaniak, Michael Joseph
(2022).
Transformer and LSTM Models for Automatic Counterpoint Generation using Raw Audio.
Proceedings of the SMC Conferences.
ISSN 2518-3672.
doi: 10.5281/zenodo.6572847.
A study investigating Transformer and LSTM models applied to raw audio for the automatic generation of counterpoint was conducted. In particular, the models learned to generate missing voices from an input melody, using a collection of raw audio waveforms of various pieces of Bach's work, played on different instruments. The research demonstrated the efficacy and behaviour of the two deep learning (DL) architectures when applied to raw audio data, which is typically characterised by much longer sequences than symbolic music representations such as MIDI. To date, the LSTM model has been the quintessential DL model for sequence-based tasks, such as generative audio models, but the research conducted in this study shows that the Transformer model can achieve competitive results on a fairly complex raw audio task. The research therefore aims to spark further investigation into how Transformer models can be used for applications typically dominated by recurrent neural networks (RNNs). In general, both models yielded excellent results and generated sequences with temporal patterns similar to the input targets, both for songs that were not present in the training data and for a sample taken from a completely different dataset.
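As a rough illustration of the mechanism that distinguishes the Transformer from the LSTM in this study, here is a minimal NumPy sketch of causal single-head self-attention over a sequence of feature vectors. It omits the learned projections and multi-head structure of a real Transformer, and all names are this example's own, not taken from the paper:

```python
import numpy as np

def causal_self_attention(x):
    """Single-head scaled dot-product attention over a sequence
    of feature vectors x with shape (seq_len, d). A causal mask
    restricts each step to attend only to itself and earlier
    steps, as required for autoregressive audio generation."""
    seq_len, d = x.shape
    q, k, v = x, x, x                          # no learned projections in this sketch
    scores = q @ k.T / np.sqrt(d)              # (seq_len, seq_len) similarity scores
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores[mask] = -np.inf                     # block attention to future steps
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)         # softmax over allowed positions
    return w @ v                               # (seq_len, d) attended output

seq = np.random.randn(16, 8)
out = causal_self_attention(seq)
```

Unlike an LSTM, which carries information through a recurrent state step by step, this operation relates every position to all earlier positions directly, which is one reason Transformers can handle the very long sequences that raw audio produces.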