<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>deeplearning | Luca Moschella</title><link>https://luca.moschella.dev/tag/deeplearning/</link><atom:link href="https://luca.moschella.dev/tag/deeplearning/index.xml" rel="self" type="application/rss+xml"/><description>deeplearning</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Wed, 31 Mar 2021 00:00:00 +0000</lastBuildDate><image><url>https://luca.moschella.dev/media/icon_hu1d8b72d1596cbf95d7a803f4cce27471_29590_512x512_fill_lanczos_center_3.png</url><title>deeplearning</title><link>https://luca.moschella.dev/tag/deeplearning/</link></image><item><title>NN Template</title><link>https://luca.moschella.dev/project/nn-template/</link><pubDate>Wed, 31 Mar 2021 00:00:00 +0000</pubDate><guid>https://luca.moschella.dev/project/nn-template/</guid><description>&lt;h2 id="git-is-not-enough">Git is Not Enough&lt;/h2>
&lt;p>For classic codebases, git largely solves version control and multi-user collaboration. Unfortunately, git alone is not enough to handle the lifecycle of a modern ML (research) project, which raises several additional problems:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Data versioning: can you recover the pre-processed data a model has been trained with? What if the data is a work in progress?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Hyperparameter comparison: can you reliably say which hyperparameters are the best?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Model comparison: can you identify which approach/model is the best?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Sweeps: can you easily search for the best hyperparameters and models?&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Code organization and reproducibility: how steep is the codebase learning curve?&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>You have to tackle all of these problems at once, and they turn into a fresh jumble with every new project.&lt;/p>
&lt;h2 id="ml-tooling">ML Tooling&lt;/h2>
&lt;p>Luckily, many great tools have been developed to solve or alleviate these obstacles. Examples are &lt;em>PyTorch Lightning&lt;/em> to organize your code, &lt;em>DVC&lt;/em> for data versioning, &lt;em>Weights &amp;amp; Biases&lt;/em> to compare and analyze your experiments, &lt;em>Hydra&lt;/em> for configurations and sweeps, and &lt;em>Streamlit&lt;/em> to interact with and showcase your system.&lt;/p>
&lt;h3 id="tooling-scaffolding">Tooling Scaffolding&lt;/h3>
&lt;p>These tools must work together in every project. Integrating them is scaffolding that is not project-specific, so it can and should be abstracted. &lt;code>nn-template&lt;/code> is exactly this: a generic template to bootstrap your project while enforcing code best practices.&lt;/p>
&lt;p>It provides boilerplate code for:&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://github.com/PyTorchLightning/pytorch-lightning" target="_blank" rel="noopener">PyTorch Lightning&lt;/a>, lightweight PyTorch wrapper for high-performance AI research.&lt;/li>
&lt;li>&lt;a href="https://github.com/facebookresearch/hydra" target="_blank" rel="noopener">Hydra&lt;/a>, a framework for elegantly configuring complex applications.&lt;/li>
&lt;li>&lt;a href="https://dvc.org/doc/start/data-versioning" target="_blank" rel="noopener">DVC&lt;/a>, track large files, directories, or ML models. Think &amp;ldquo;Git for data&amp;rdquo;.&lt;/li>
&lt;li>&lt;a href="https://wandb.ai/home" target="_blank" rel="noopener">Weights and Biases&lt;/a>, organize and analyze machine learning experiments. &lt;em>(educational account available)&lt;/em>&lt;/li>
&lt;li>&lt;a href="https://streamlit.io/" target="_blank" rel="noopener">Streamlit&lt;/a>, turns data scripts into shareable web apps in minutes.&lt;/li>
&lt;/ul>
&lt;p>You can click &lt;a href="https://github.com/lucmos/nn-template/generate" target="_blank" rel="noopener">here&lt;/a> to start a project with this template.&lt;/p></description></item></channel></rss>