Data versioning in machine learning projects - Dmitry Petrov





Published on Aug 1, 2018

PyData Berlin 2018

In machine learning projects it is easy to lose track of the many versions of your data files. Data Version Control (DVC) is an open source tool for data science projects that was created to address mismatches between versions of code and versions of data files. It works on top of Git: when you switch between Git branches, it helps you check out not only the source code but also the matching version of the data files.
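
To make the idea of keeping code and data versions in step more concrete, here is a minimal Python sketch of one way such a tool could work. It is an illustration under stated assumptions, not DVC's actual implementation or API: the function names `track` and `restore`, the `.ptr` pointer file format, and the cache layout are all hypothetical. The assumption is that large data files live in a content-addressed cache outside Git, while a small pointer file holding the data hash is what gets committed on each branch.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def track(data_file: Path, cache_dir: Path) -> Path:
    """Store data_file in a content-addressed cache and write a pointer file.

    Hypothetical helper: the small pointer file is what would be committed
    to Git in place of the data itself.
    """
    digest = hashlib.md5(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr")
    pointer.write_text(json.dumps({"md5": digest, "path": data_file.name}))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file that the pointer file refers to."""
    meta = json.loads(pointer.read_text())
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / meta["md5"], target)
    return target


def demo() -> Tuple[str, str]:
    """Simulate two branches whose pointer files refer to different data."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "work"
        cache = Path(tmp) / "cache"
        work.mkdir()
        data = work / "train.csv"

        # Version 1 of the data, as it would be tracked on one branch.
        data.write_text("id,label\n1,cat\n")
        pointer = track(data, cache)
        pointer_v1 = pointer.read_text()

        # Version 2 of the data, as it would be tracked on another branch.
        data.write_text("id,label\n1,cat\n2,dog\n")
        pointer = track(data, cache)
        pointer_v2 = pointer.read_text()

        # Switching branches changes the pointer file; restoring from the
        # cache then brings back the matching data version.
        pointer.write_text(pointer_v1)
        restored_v1 = restore(pointer, cache).read_text()
        pointer.write_text(pointer_v2)
        restored_v2 = restore(pointer, cache).read_text()
        return restored_v1, restored_v2


if __name__ == "__main__":
    for version in demo():
        print(repr(version))
```

In this sketch the version control system only ever sees the tiny pointer file, so switching branches stays cheap, and the data is restored from the cache by hash afterwards.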

Slides: https://www.slideshare.net/DmitryPetr...

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

PyData conferences aim to be accessible and community-driven, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
