Review: Hard Negative Mixing for Contrastive Learning

<p>Today's paper: Hard Negative Mixing for Contrastive Learning: <a href="https://arxiv.org/abs/2010.01028" target="_blank" class="ke-link">https://arxiv.org/abs/2010.01028</a></p><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/9957e86788b4456dab7d6b8ef3d22c69330efa23" class="txc-image" width="425" height="184" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/9957e86788b4456dab7d6b8ef3d22c69330efa23" data-origin-width="708" data-origin-height="307"></div><p>Contrastive learning trains a model by pulling the query and its positive sample (key) close together in the embedding space while pushing negative samples away.</p><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/ac5a8811ef4cd731d371855e9d48121ab410457f" class="txc-image" width="392" height="57" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/ac5a8811ef4cd731d371855e9d48121ab410457f" data-origin-width="981" data-origin-height="142"><div class="figcaption">Contrastive learning loss. 
The loss is low when the similarity between q and k is high and the similarity between q and n is low.</div></div><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/88effe0f91b1350012d4e1db41eb23cce9e50819" class="txc-image" width="355" height="84" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/88effe0f91b1350012d4e1db41eb23cce9e50819" data-origin-width="864" data-origin-height="204"></div><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/8d015fa3e292dbd019ad5764ac9212f950fe8e0f" class="txc-image" width="306" height="99" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/8d015fa3e292dbd019ad5764ac9212f950fe8e0f" data-origin-width="510" data-origin-height="165"><div class="figcaption">Matching probability. It is high when q is similar to z_i and dissimilar to the remaining z_j.</div></div><p>In general, training becomes more effective as the number of negative samples grows. However, with the existing approaches that draw negatives from the batch or from a memory bank, increasing the number of negatives is costly, and simply adding more samples yields limited gains.</p><p> </p><p>The paper argues that more negative samples help because of the hard negative samples among them (negatives that resemble the query and are therefore hard to tell apart from it). It then proposes MoCHi, a method for synthesizing such hard negative samples.</p><p> </p><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/4732062879becab5e99d776fd557a08968c7d49f" class="txc-image" width="362" height="311" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/4732062879becab5e99d776fd557a08968c7d49f" data-origin-width="1451" data-origin-height="1246"><div class="figcaption">Negative samples drawn from the memory bank lie far from the query, whereas the synthetic negative samples produced by MoCHi lie close to it.</div></div><p>The way MoCHi(N, s, s') builds hard negative samples is simple.</p><p>1. 
Sort the existing negative samples in the embedding space in descending order of matching probability.</p><p>2. Keep the top N negative samples.</p><p>3. Randomly pick two of the N negative samples and mix them (feature-level MixUp). Repeat this s times to obtain s mixed negatives.</p><div class="figure-img" data-ke-type="image" data-ke-style="alignCenter" data-ke-mobilestyle="widthOrigin"><img src="https://t1.daumcdn.net/cafeattach/1Z6Ow/04629a21069f567d62a4ddb709e723ece1b0461c" class="txc-image" width="500" height="87" data-img-src="https://t1.daumcdn.net/cafeattach/1Z6Ow/04629a21069f567d62a4ddb709e723ece1b0461c" data-origin-width="644" data-origin-height="112"><div class="figcaption">n_i and n_j are chosen at random from the top N negative samples; alpha_k is a random mixing coefficient.</div></div><p>4. To obtain even harder negatives, randomly pick one of the N negative samples from step 2 and mix it with the query (feature-level MixUp). Repeat this s' times to obtain s' mixed negatives.</p><p>5. Train the model with the resulting hard negative samples.</p><p> </p><p>According to the paper, training with these synthesized hard negatives gives clearly higher performance than the MoCo baseline.</p><p> </p><p>Keep up the good work studying AI today, too~</p>
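<p>As a recap, steps 1 to 4 of MoCHi(N, s, s') can be sketched in a few lines of NumPy. This is only a minimal illustration, not the authors' implementation: the function name <code>mochi_negatives</code> and its arguments are hypothetical, the embeddings are assumed to be L2-normalized, and the sampling ranges (alpha in (0, 1), and a query weight beta in (0, 0.5) so that the negative always dominates the mix) together with the final re-normalization are assumptions based on my reading of the paper rather than anything stated in this post.</p>

```python
import numpy as np


def mochi_negatives(q, negatives, N=16, s=8, s_prime=4, rng=None):
    """Hypothetical sketch of MoCHi(N, s, s') hard negative synthesis.

    q:         (d,) L2-normalized query embedding.
    negatives: (K, d) L2-normalized negative embeddings (e.g. a memory bank).
    Returns an (s + s_prime, d) array of synthetic hard negatives.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_top = min(N, len(negatives))
    if n_top < 2:
        raise ValueError("need at least two negatives to mix")

    # Steps 1-2: rank negatives by similarity to the query and keep the top N.
    # The matching probability is a softmax over q.n / tau, so sorting by the
    # dot product gives the same order.
    sims = negatives @ q
    top = negatives[np.argsort(-sims)[:n_top]]

    # Step 3: mix s random pairs of hard negatives (feature-level MixUp).
    pairs = np.array(
        [rng.choice(n_top, size=2, replace=False) for _ in range(s)], dtype=int
    ).reshape(-1, 2)
    alpha = rng.uniform(0.0, 1.0, size=(s, 1))  # assumed range for alpha_k
    h = alpha * top[pairs[:, 0]] + (1.0 - alpha) * top[pairs[:, 1]]

    # Step 4: mix s' hard negatives with the query to get even harder ones.
    idx = rng.integers(0, n_top, size=s_prime)
    beta = rng.uniform(0.0, 0.5, size=(s_prime, 1))  # assumed: query weight below 0.5
    h_prime = beta * q + (1.0 - beta) * top[idx]

    # Assumed: re-normalize so the synthetic points lie on the unit sphere.
    synthetic = np.concatenate([h, h_prime], axis=0)
    return synthetic / np.linalg.norm(synthetic, axis=1, keepdims=True)
```

<p>For step 5, the returned rows would simply be appended to the existing negatives before computing the contrastive loss for that query. Since they are built from already-computed embeddings, the extra cost is small compared with enlarging the batch or the memory bank.</p>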