[20210307] Weekly Arxiv Casual Talk #1

jungwoo-ha · 2021-03-13T04:36:20Z

Major AI news
- CVPR 2021 notification (3.1)
  - NAVER AI LAB & CLOVA
    - Relabeling: https://arxiv.org/abs/2101.05022
    - PCME: https://arxiv.org/abs/2101.05068
    - ReXNet: https://arxiv.org/abs/2007.00992
    - Looking into Your Speech: TBD
    - StyleMapGAN: TBD
    - Rainbow memory: TBD
  - Kakao Brain, Hyperconnect, and others
- ICRA 2021 notification (3.1)
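- ICCV 2021 deadlines: abstract on the 10th, full paper on the 17th
- Interspeech 2021: March 26
- LINE-Yahoo Japan merger completed and A Holdings launched; 5 trillion won to be invested in AI over 5 years, with 5,000 hires.
- NIPA GPU support program: https://n.news.naver.com/mnews/article/138/0002099753?sid=105
arXiv paper list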
- WIT: Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine Learning (https://arxiv.org/abs/2103.01913 ) → from Google Research
  - Releases Wikipedia article-image data (11.5M images, 37.5M texts)
- MixSpeech: Data Augmentation for Low-resource Automatic Speech Recognition (https://arxiv.org/abs/2102.12664) → accepted @ ICASSP 2021
  - from Chinese Academy of Sciences, Tsinghua U, MSRA
  - After SpecAugment, finally a Mixup-style augmentation for speech?
  - Mixes two spectrograms and mixes the losses against the two ground-truth annotations; uses LAS
  - TIMIT, WSJ, HKUST
- Perceiver: General Perception with Iterative Attention (https://arxiv.org/abs/2103.03206 ) → from Deepmind
  - Improves the Transformer architecture so that it works well on images, video, audio, and 3D point clouds alike.
- When Face Recognition Meets Occlusion: A New Benchmark (https://arxiv.org/abs/2103.02805v1 ) → ICASSP 2021, from Wuhan Univ.
  - Many of the samples have synthesized masks. Interesting that some also have glasses
- Learning Accurate and Interpretable Decision Rule Sets from Neural Networks (https://arxiv.org/abs/2103.02826v1 ) → AAAI 2021, from UCSD
  - XAI (Rule layer / OR layer), 2 FC layers.
- Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image (https://infinite-nature.github.io/ ) from Google Research
  - A wow demo that turns a single photo into a scene video clip
  - Colab: https://colab.research.google.com/github/google-research/google-research/blob/master/infinite_nature/infinite_nature_demo.ipynb#scrollTo=pmFK0wEyMQ-X
- Data Augmentation for Object Detection via Differentiable Neural Rendering (https://arxiv.org/abs/2103.02852v1 )
  - From U of Pittsburgh
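  - For object detector training: given a single image, a neural renderer generates images with varied camera views, which are then used as augmented training data.
  - The image views are not as diverse as one might expect, but it looks like quite a novel idea.
  - It seems it could be combined with other augmentations..
  - But is online affine augmentation better, or is this better… will the ROI work out??
  - Might become more useful as single-image renderers improve …
  - But why was it applied only to object detectors.. (validated on COCO)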
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings (https://arxiv.org/abs/2103.02886v1 )
  - From BAIR (Pieter Abbeel)
  - Freezes part of the CNN encoder
  - Stores vectors instead of images in experience replay.
  - Performance is maintained despite the efficiency gains
  - Tested on DeepMind Control Suite
- Coordinate Attention for Efficient Mobile Network Design (https://arxiv.org/abs/2103.02907 )
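  - From NUS (National University of Singapore)
  - Can replace SE or CBAM in small CNN backbones
  - Tried on top of MobileNetV2, MobileNeXt, and EfficientNet
  - The block contains channel split, BN, and group conv (smells a bit like ResNeSt, the split-attention ResNet..)
  - MAdds barely increase, but the impact on latency and memory needs to be checked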
- The Transformer Network for the Traveling Salesman Problem (https://arxiv.org/abs/2103.03012v1 )
  - From NTU, Singapore
  - Solves TSP, a staple of AI classes, with an RL-based Transformer encoder-decoder.
  - On TSP50 and TSP100, reports better results than recent learned heuristics (optimality gaps of 0.004% and 0.39% per the abstract).
  - There was routing research from Max Welling's lab in 2018.
  - How about trying this for your school homework?
  - Related video (The Transformer Network for the Traveling Salesman Problem (ucla.edu))
- CoTr: Efficiently Bridging CNN and Transformer for 3D Medical Image Segmentation (https://arxiv.org/abs/2103.03024v1 )
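  - from Northwestern Polytechnical Univ (China), U of Adelaide
  - Ultra-high-resolution 3D medical image segmentation
  - CNN features + deformable Transformer → addresses speed and long-range dependency at the same time
  - The encoder and decoder are basically CNNs; the deformable Transformer handles the intermediate abstraction between them
  - Cropped size: 48x192x192; validated on the BCV dataset.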
- Advances in Multi-turn Dialogue Comprehension: A Survey (https://arxiv.org/abs/2103.03125v1 )
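  - from Shanghai Jiao Tong University
  - A survey of multi-turn dialogue models
  - Should be helpful for anyone researching multi-turn dialogue or building services on it
  - Thankfully, it cites the Dialog-BERT paper.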
- DONeRF: Towards Real-Time Rendering of Neural Radiance Fields using Depth Oracle Networks (https://arxiv.org/abs/2103.03231v1 )
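  - Since NeRF, neural renderers have shown dramatic results
  - The recent ShapeNeRF uses almost no images.. (mostly Google Research)
  - But… the compute cost is too high..
  - Facebook Reality Labs and Graz Univ (Austria) built a NeRF that renders 800x800 at a whopping 15 fps (roughly 48x faster).
  - Could this make a major contribution to bringing about the VR/AR world?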
  - https://depthoraclenerf.github.io/
- Anycost GANs for Interactive Image Synthesis and Editing (https://arxiv.org/abs/2103.03243v1 )
  - from MIT, Adobe Research, CMU
  - https://hanlab.mit.edu/projects/anycost-gan/
  - Generation that takes the cost budget into account.
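
The MixSpeech note above (mix two spectrograms, mix the losses against both ground-truth annotations) can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' code: the function names, the zero-padding to a common length, and the `Beta(0.5, 0.5)` draw for the mixing weight are assumptions made for the example, and `loss_fn` stands in for whatever recognition loss the ASR model uses.

```python
import numpy as np

def mix_spectrograms(spec_a, spec_b, lam):
    """Blend two (frames, mel_bins) spectrograms with weight lam.

    The shorter input is zero-padded along the time axis (an assumption
    for this sketch) so that both arrays have the same shape.
    """
    n_frames = max(spec_a.shape[0], spec_b.shape[0])

    def pad(spec):
        return np.pad(spec, ((0, n_frames - spec.shape[0]), (0, 0)))

    return lam * pad(spec_a) + (1.0 - lam) * pad(spec_b)

def mixed_loss(loss_fn, mixed_input, target_a, target_b, lam):
    """Weight the loss against each original annotation by the same lam."""
    loss_a = loss_fn(mixed_input, target_a)
    loss_b = loss_fn(mixed_input, target_b)
    return lam * loss_a + (1.0 - lam) * loss_b

rng = np.random.default_rng(0)
spec_a = rng.normal(size=(120, 80))  # hypothetical utterance A
spec_b = rng.normal(size=(100, 80))  # hypothetical utterance B
lam = rng.beta(0.5, 0.5)             # mixup-style weight; alpha is an assumption
mixed = mix_spectrograms(spec_a, spec_b, lam)
```

In training, `mixed` would be fed to the model once, and `mixed_loss` would combine the losses computed against the transcripts of utterances A and B.
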

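The stored-embeddings note above (freeze part of the CNN encoder, keep vectors rather than images in experience replay) can be illustrated with a small sketch. Everything here is hypothetical: `LatentReplayBuffer`, `frozen_encode`, the 84x84 image size, and the 50-dimensional latent are choices made for the example, and a fixed random projection stands in for the frozen encoder layers.

```python
import numpy as np

class LatentReplayBuffer:
    """Ring buffer that stores encoder outputs instead of raw frames."""

    def __init__(self, capacity, latent_dim):
        self.latents = np.zeros((capacity, latent_dim), dtype=np.float32)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.capacity = capacity
        self.size = 0
        self.pos = 0

    def add(self, latent, reward):
        self.latents[self.pos] = latent
        self.rewards[self.pos] = reward
        self.pos = (self.pos + 1) % self.capacity  # overwrite oldest when full
        self.size = min(self.size + 1, self.capacity)

    def sample(self, batch_size, rng):
        idx = rng.integers(0, self.size, size=batch_size)
        return self.latents[idx], self.rewards[idx]

rng = np.random.default_rng(0)
# Stand-in for frozen encoder layers: weights are fixed and never updated.
frozen_weights = rng.normal(size=(3 * 84 * 84, 50)).astype(np.float32)

def frozen_encode(image):
    """Map a (3, 84, 84) uint8 frame to a 50-dim float32 vector."""
    return image.reshape(-1).astype(np.float32) @ frozen_weights

buffer = LatentReplayBuffer(capacity=1000, latent_dim=50)
for _ in range(5):
    image = rng.integers(0, 256, size=(3, 84, 84), dtype=np.uint8)
    buffer.add(frozen_encode(image), reward=1.0)

latents, rewards = buffer.sample(4, rng)
bytes_per_latent = buffer.latents[0].nbytes  # 50 float32 values
bytes_per_image = 3 * 84 * 84                # one uint8 frame
```

Because the encoder output is fixed once the layers are frozen, sampled latents can be passed straight to the trainable layers without re-running the encoder, which is where the compute and memory savings would come from in this sketch.
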
jungwoo-ha closed this as completed Apr 11, 2021
