Computer Science PhD

Audio-Visual & Multi-Modal Deep Learning

Dennis Fedorishin

I'm Dennis, a Senior Machine Learning Engineer at ACV Auctions. I recently completed my PhD in Computer Science at the University at Buffalo, where I focused on multi-modal audio deep learning, specifically on how audio interacts with other modalities to build stronger deep representations. My expertise includes audio-specific areas like sound event detection and classification, as well as other multi-modal deep learning areas like audio-visual and vision-language learning.

My goal? Do good research.

PUBLICATIONS

Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos
2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
D. Fedorishin, L. Lu, S. Setlur, V. Govindaraju.
Paper arXiv Project Page Bibtex

Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals
2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
D. Fedorishin, L. Forte III, P. Schneider, S. Setlur, V. Govindaraju.
Paper arXiv Bibtex

Best Paper Award

CoNAN: Conditional Neural Aggregation Network For Unconstrained Face Feature Fusion
2023 IEEE International Joint Conference on Biometrics (IJCB)
B. Jawade, D. Mohan, D. Fedorishin, S. Setlur, V. Govindaraju.
Paper arXiv Bibtex

Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization
2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
D. Fedorishin, D. Mohan, B. Jawade, S. Setlur, V. Govindaraju.
Paper arXiv Code Bibtex

Oral Presentation

Large-Scale Acoustic Automobile Abnormality Detection: Diagnosing Engines Through Sound
2022 ACM Conference on Knowledge Discovery and Data Mining (KDD)
D. Fedorishin, D. Mohan, J. Birgiolas, L. Forte III, P. Schneider, S. Setlur, V. Govindaraju.
Paper Bibtex

Waveforms and Spectrograms: Enhancing Acoustic Scene Classification Using Multimodal Feature Fusion
2021 Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE)
D. Fedorishin, N. Sankaran, D. Mohan, J. Birgiolas, P. Schneider, S. Setlur, V. Govindaraju.
Paper Code Bibtex

Bayesian Personalized-Wardrobe Model (BP-WM) for Long-Term Person Re-Identification
2021 IEEE Conference on Advanced Video and Signal-based Surveillance (AVSS)
K. Lee, N. Sankaran, D. Mohan, K. Davila, D. Fedorishin, S. Setlur, V. Govindaraju.
Paper Bibtex

Moving in the Right Direction: A Regularization for Deep Metric Learning
2020 Conference on Computer Vision and Pattern Recognition (CVPR)
D. Mohan, N. Sankaran, D. Fedorishin, S. Setlur, V. Govindaraju.
Paper Bibtex

Representation Learning Through Cross-Modality Supervision
2019 IEEE Conference on Automatic Face and Gesture Recognition (FG)
N. Sankaran, D. Mohan, S. Setlur, V. Govindaraju, D. Fedorishin.
Paper Bibtex

In 2019, I was selected as a Goldwater Scholar.

1 out of 496 in the country.

Couldn't have done it without Phil Schneider, Venu Govindaraju, Kwang Oh, and Nicholas Eadie.

EXPERIENCE

Senior Machine Learning Engineer @ ACV Auctions

November 2024 - Present

  • Multi-Modal deep learning for automatic vehicle inspections.

Computer Science PhD @ CUBS

May 2020 - November 2024

  • Dissertation: "Listen and Learn: From Supervised to Contrastive Multi-Modal Audio Deep Learning"
  • Audio-Visual deep learning for acoustic, visual, and spatial understanding.
  • Collaboration with ACV Auctions focusing on acoustic automobile engine abnormality detection.
  • Multimodal feature fusion for sound event detection and classification.

Research Scientist Intern @ Dolby Labs

May 2023 - August 2023

  • Research intern with Dolby's Advanced Technology Group focusing on audio-visual deep learning.
  • Developed self-supervised audio retrieval and transition framework for automatically creating seamless audio transitions between scenes (audio match cuts) in movies and videos.

Research Scientist Collaborator @ ACV Auctions

August 2019 - May 2023

  • Led machine learning research projects and coordinated with UB's Center for Unified Biometrics and Sensors on joint research.
  • Deployed a large-scale deep learning pipeline, trained on millions of vehicles, to automatically diagnose vehicle engines through sound, saving ACV Auctions over $1,000,000 each year in averted repair costs.
  • Deployed multiple sound event detection models to detect subtle audible engine faults and provide audio recording quality control, made available to all 1,000+ employed vehicle inspectors to aid in inspection.

Software Engineering Intern @ ACV Auctions

May 2019 - August 2019

  • Android/iOS front-end development including feature implementation and major structure refactoring.
  • Managed domain-specific legacy API migrations from Perl into Django REST API.
  • Deep learning-based document processing automation using text localization and OCR.
  • Migration of main monolith service to a domain-specific microservice architecture following CQRS patterns.

Undergraduate Research Assistant @ CUBS

September 2017 - May 2020

  • Direction regularization for deep metric learning loss functions.
  • Multimodal feature fusion for deep learning-based action unit recognition.
  • Construction and analysis of generative adversarial networks for visible-to-thermal image construction.
  • Convolutional neural network enhancements via channel reweighting and global attention.

Software Engineering Intern @ PostProcess

June 2018 - August 2018

  • Full stack development using ASP.NET MVC, JavaScript, SQL, and Universal Windows Platform.
  • Expanded and redesigned company SQL databases for stability under increased use and traffic.
  • Statistical analysis of historical machine data for discovering optimal operating parameters.

Undergraduate Research Assistant @ SMALL

September 2017 - June 2019

  • Researching post-processing effects on models created from additive manufacturing.
  • Facial spoofing of the iPhone X FaceID using 3D printed masks generated from deep learning models.
  • Microfluidic device imaging and depth analysis using modulation transfer function algorithm.

PROJECTS

Visual Diminished Reality

CUBS @ UB (Ongoing Project)

Anyone who tries out a VR headset falls in love. What if you could see everything you normally would, but not know what's real anymore? Whether it's the coffee mug on your table that disappears or the person next to you, diminished reality is the name of the game.

  • Image inpainting network to synthesize portions of a user's view.
  • Real-time package and pipeline to allow for use within popular VR/AR equipment.

Assetto Corsa Self Driving Car

DandyHacks 2018

While many are researching how to safely drive a car autonomously on a road, we thought the opposite: how do you send a self-driving car as fast as possible around a track? Say hello to a fan-favorite racing game mixed with deep learning.

  • Wheel, gas, and brake position prediction using CNN on raw video feed, implemented in Keras.
  • Supervised learning from user-recorded racing sessions.

Bitcoin Price Prediction Using Sentiment Analysis

BrickHack 4

Bitcoin, a trader's favorite volatile money dump. Everyone always thinks they can predict its value, so we put that to the test. Sentiment calculated from posts on the most reliable sites: Reddit and Twitter.

  • Deep learning + lexicon-based sentiment analysis fused with immediate historical price data to predict short- and long-term pricing changes.
  • Proven correlation between sentiment and price via backtest of 2016 and 2017 prices.