
Reproducibility and Extendibility Best Practices for Machine Learning

Boston Data Science February 2020 Meetup

Abstract: Deep learning has recently faced a reproducibility "crisis". Large numbers of hyperparameters, differing preprocessing methods, long dependency lists, poor code structure, and changing data sources make it difficult for even a skilled data scientist to reproduce results, let alone extend them. This in turn hampers the ability to deploy models to production and achieve real value. This talk will cover best practices for managing experiments, unit testing for machine learning, maintaining model code quality, managing dependencies, versioning data, and designing extendable models. Although focused on deep learning, the talk is applicable to anyone building and training machine learning models. It will include several demos of refactoring code and writing model unit tests. Attendees will leave with a better all-around knowledge of how to track experiments successfully and an idea of which open source tools exist to facilitate that process.

Speaker Bio: Isaac Godfried is a data engineer at Monster focusing on the data and machine learning platform. Prior to his current position, Isaac worked on machine learning problems in both retail and healthcare. He has also participated in many Kaggle competitions. Isaac's main focus is removing barriers to the use of deep learning in industry. Specifically, this involves researching techniques like transfer learning and meta-learning for data-constrained scenarios, designing tools to track and manage experiments effectively, and creating frameworks to deploy models at scale. In his spare time, Isaac also conducts research in AI for good causes like medicine and climate.

Slides: https://docs.google.com/presentation/d/1ZlQM80OsfGVRoY9GGQBfwqMRq5tj2aJfQe1Rm5pflrU/edit
Meetup page: https://www.meetup.com/Boston-Data-Science-Meetup/events/267908208

Channel: Кодерские идеи
