Debugging Techniques for Finishing Python Projects More Efficiently


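Improving efficiency matters a great deal: our time is limited to begin with, and we should not spend too much of it on debugging. For a large project, a debugger alone is not enough; to speed up the debugging of a product, other methods are needed as well. This article uses some features of Python, together with a few general conventions, to illustrate them. The same methods are not limited to Python and carry over to other programming languages.

Note that what we should take away is not the specific methods but the thinking behind them.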

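Use assert to check that code is running correctly

Assertions have existed since C and are used frequently in C++; Python has the feature as well. In short, an assert assumes that an expression holds and raises an AssertionError if it does not.

Assertions are generally used while debugging. My personal suggestion is to put them in places where correctness is not obvious at a glance and needs careful checking, for example: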

import torch

# A method of a class that holds self.mask and self.weight (class definition not shown)
def style_hook(self, module, grad_input, grad_output):
    self.mask = self.mask[:, 0:1, :, :]
    # This assert checks that the shapes of grad_input[0] and self.mask match. If either
    # variable is already wrong when it reaches this function, the code below cannot run
    # correctly, so during debugging both variables were fixed elsewhere in the project
    # until this code ran correctly. After the fix the assert can be commented out, but
    # keeping it guards against bugs caused by other data.
    assert grad_input[0].shape == self.mask.shape, \
        'grad_input:{} is not matchable with mask:{}'.format(grad_input[0].shape, self.mask.shape)

    grad_input_1 = grad_input[0].div(torch.norm(grad_input[0], 1) + 1e-8)
    grad_input_1 = grad_input_1 * self.weight
    grad_input_1 = grad_input_1 * self.mask
    grad_input = tuple([grad_input_1, grad_input[1], grad_input[2]])

    return grad_input

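More assertions are not always better, though: too many clutter a program and can hurt efficiency. In short, for important data, in places where we would otherwise leave a comment as a reminder to check something, we can use an assertion instead, which is an effective way to debug code.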
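
As a smaller illustration of the same idea, here is a minimal sketch using only the standard library. The function `normalize` and its message text are hypothetical and not from the project above. It shows an assert with a descriptive message and what the caller sees when the assumption fails. Keep in mind that Python skips assert statements when run with the `-O` flag, so they are best kept for debugging checks rather than for validating user input.

```python
def normalize(values):
    """Scale a list of numbers so that they sum to 1."""
    total = sum(values)
    # Debug-time check: the message records the offending value.
    assert total > 0, "expected a positive sum, got {}".format(total)
    return [v / total for v in values]


result = normalize([1, 1, 2])

# A failing assumption raises AssertionError carrying the message.
try:
    normalize([0, 0])
except AssertionError as exc:
    message = str(exc)
```

Without the assert, the second call would fail later with a less informative ZeroDivisionError.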

Use tqdm instead of print to display progress

tqdm is a progress-display tool that reports progress better than Python's built-in print function.

[Animated demo from the official tqdm website]

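It looks a bit nicer than plain print output, but the main value of the tool lies in its time prediction and speed estimate. During training you can set yourself a baseline speed and then draw conclusions from how the speed changes, or use the estimated remaining time to judge how long training will take.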
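
A minimal sketch of wrapping a training loop with tqdm might look like the following. The functions `train` and `fake_step` are hypothetical stand-ins, and the fallback definition is only there so the sketch still runs (without a progress bar) when tqdm is not installed.

```python
import time

try:
    from tqdm import tqdm
except ImportError:
    # Fallback so the sketch still runs without tqdm: plain iteration, no bar.
    def tqdm(iterable, **kwargs):
        return iterable


def train(num_steps, step_fn):
    """Run step_fn num_steps times and return the list of losses."""
    losses = []
    # tqdm wraps any iterable; desc and unit only label the bar.
    for step in tqdm(range(num_steps), desc="train", unit="it"):
        losses.append(step_fn(step))
    return losses


def fake_step(step):
    # Hypothetical stand-in for one training iteration.
    time.sleep(0.001)
    return 1.0 / (step + 1)


losses = train(20, fake_step)
```

With tqdm installed, the bar shows elapsed time, estimated time remaining, and iterations per second, which are the figures to watch for unexpected slowdowns.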

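Save logs during training

Saving logs is important. We should not only watch the output in real time during training but also record statistics on intermediate data as training proceeds. A common approach is to put the time, parameters, and results of each training run together in one directory, named after the training parameters and results, so that later lookup and analysis are easy: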

# The code below returns the name of the directory to create and saves the parameter info
import datetime
import os
import os.path as osp

import pytz
import yaml

# Assumption: `here` is the directory containing this script (its definition is not shown
# in the original). git_hash() is defined in the git section further down.
here = osp.dirname(osp.abspath(__file__))


def get_log_dir(model_name, config_id, cfg):
    # load config
    name = 'MODEL-%s_CFG-%03d' % (model_name, config_id)
    for k, v in cfg.items():
        v = str(v)
        if '/' in v:
            # skip values that look like paths; they cannot go into a directory name
            continue
        name += '_%s-%s' % (k.upper(), v)
    now = datetime.datetime.now(pytz.timezone('Asia/Shanghai'))
    name += '_VCS-%s' % git_hash()
    name += '_TIME-%s' % now.strftime('%Y%m%d-%H%M%S')
    # create out
    log_dir = osp.join(here, 'logs', name)
    if not osp.exists(log_dir):
        os.makedirs(log_dir)
    with open(osp.join(log_dir, 'config.yaml'), 'w') as f:
        yaml.safe_dump(cfg, f, default_flow_style=False)
    return log_dir
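
Once a run directory exists, intermediate results can be appended to a file inside it as training proceeds. The following is a minimal sketch using only the standard library; the helper `append_metrics` and the file name `log.csv` are assumptions made for illustration, not part of the original project.

```python
import csv
import os
import tempfile


def append_metrics(log_dir, row, filename="log.csv"):
    """Append one row of metrics to log_dir/filename, writing a header if the file is new."""
    path = os.path.join(log_dir, filename)
    is_new = not os.path.exists(path)
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(row.keys()))
        if is_new:
            writer.writeheader()
        writer.writerow(row)
    return path


# Demo in a throwaway directory; in practice log_dir would be the run directory.
demo_dir = tempfile.mkdtemp()
for it in range(3):
    csv_path = append_metrics(demo_dir, {"iteration": it, "loss": 1.0 / (it + 1)})

# Read the file back to confirm what was recorded.
with open(csv_path, newline="") as f:
    rows = list(csv.DictReader(f))
```

Appending one row per validation step keeps the file readable even if training is interrupted partway through.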

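Keep parameters in a dict or list, or pass them on the command line

If there are many training parameters, it is a good idea to write them as a dictionary: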

configurations = {
    1: dict(
        max_iteration=100000,
        lr=1.0e-10,
        momentum=0.99,
        weight_decay=0.0005,
        interval_validate=4000,
    )
}

You can also use command-line arguments to adjust specific parameters individually:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-content_weight", type=int, default=8)
parser.add_argument("-style_weight", type=int, default=2000)
parser.add_argument("-tv_weight", type=float, default=1e-3)
parser.add_argument("-num_iterations", type=int, default=8000)
parser.add_argument("-normalize_gradients", action='store_true')
parser.add_argument("-init", default="random", choices=["random", "image"])
parser.add_argument("-init_image", help="initial image")
parser.add_argument("-optimizer", help="optimiser", default="lbfgs", choices=["lbfgs", "adam"])
parser.add_argument("-learning_rate", type=float, default=1e0)
args = parser.parse_args()

# To use the values, read args.content_weight, args.init, and so on
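
The two approaches can also be combined: keep the defaults in a dictionary and let the command line override individual values. The sketch below is one possible way to do that; the function `build_config` and the `--config`, `--lr`, and `--max_iteration` options are hypothetical names chosen for the example.

```python
import argparse

# Hypothetical defaults, written in the dictionary style shown earlier.
configurations = {
    1: dict(max_iteration=100000, lr=1.0e-10, momentum=0.99),
}


def build_config(argv):
    """Pick a base config by id, then apply any overrides given on the command line."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", type=int, default=1, choices=sorted(configurations))
    parser.add_argument("--lr", type=float, default=None)
    parser.add_argument("--max_iteration", type=int, default=None)
    args = parser.parse_args(argv)

    cfg = dict(configurations[args.config])  # copy so the defaults stay untouched
    for key in ("lr", "max_iteration"):
        value = getattr(args, key)
        if value is not None:
            cfg[key] = value
    return cfg


# Equivalent to running the script with: --lr 1e-4
cfg = build_config(["--lr", "1e-4"])
```

Passing `argv` explicitly instead of reading `sys.argv` makes the function easy to call from tests or notebooks.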

On how to use command-line arguments: https://oldpan.me/archives/argparse-python-order-command

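Use git for version control

Git needs little introduction: it is an excellent version control tool, and whether for official project code at a company or for your own practice projects, the benefits of using it are self-evident.

We can also use Python's subprocess module to run git commands automatically and fetch information about the current commit, which gives us a direct view of the version and modification history of our code.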

import shlex
import subprocess

# Returns the abbreviated hash of the latest commit and its relative author date
def git_hash():
    cmd = 'git log -n 1 --pretty="%h -%ar"'
    # check_output returns bytes in Python 3; decode so the result formats cleanly with '%s'
    commit_info = subprocess.check_output(shlex.split(cmd)).decode('utf-8').strip()
    return commit_info
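
The output of the function above contains spaces (for example a relative date such as "2 days ago") and the call raises an exception outside a git repository. If the value is only needed as a tag in a directory name, a variant along the following lines may be more convenient. This is a sketch, not the author's method; the name `short_git_hash` and the fallback value "unknown" are assumptions.

```python
import subprocess


def short_git_hash(default="unknown"):
    """Return the abbreviated hash of HEAD, or `default` if it cannot be determined."""
    try:
        out = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"],
            stderr=subprocess.DEVNULL,
        )
    except (subprocess.CalledProcessError, OSError):
        # Not inside a repository, or git is not installed.
        return default
    return out.decode("utf-8").strip() or default


vcs_tag = short_git_hash()
```

Because the function never raises for a missing repository, a training script that uses it keeps working when the code is copied to a machine without the git history.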

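Note

Good habits alone do not directly make us more efficient; we pick up the knack gradually as we get used to these techniques. And although these methods take time to learn, once they become second nature the payoff for debugging is large.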


Copyright notice: published by wuyou on 2023-02-06, 3101 characters in total.