[20211128] Weekly AI ArXiv Chat (만담) #31

jungwoo-ha · 2021-11-27T02:23:06Z

News
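- 2021 Science and Technology Future Talent Forum (과학기술미래인재포럼) video
- 2021 AI Graduate School Symposium --> register on the website
  - Held at COEX on December 3.
- 2021 ICT Trend Conference by the Korea IT Business Women's Association (IT여성기업인협회)
  - Also at COEX, also on December 3
  - Registration: https://forms.gle/zCFgShoi1uxemmWj9
- DEVIEW 2021 replays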
Arxiv
- DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
  - An improvement by the same DeBERTa members from MS 365 AI and MSR
  - DeBERTa: disentangled attention (word content and position): https://arxiv.org/pdf/2006.03654.pdf (ICLR 2021), released in June last year --> outperformed RoBERTa
  - Replaces MLM with ELECTRA's pretraining task, replaced token detection (RTD), i.e. catching the tokens that were generated as replacements
  - ELECTRA-style, but proposes gradient-disentangled embedding sharing to resolve the tug-of-war between ELECTRA's D (discriminator) and G (generator)
  - https://github.com/microsoft/DeBERTa
- GauGAN2
  - GauGAN2 combines segmentation mapping, inpainting and text-to-image generation in a single model, making it a powerful tool to create photorealistic art with a mix of words and drawings.
- NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion
  - Wondered where the name came from: https://en.wikipedia.org/wiki/N%C3%BCwa (Nüwa, a goddess from ancient Chinese mythology)
  - A multimodal foundation model that takes text, image, and video as input and performs visual decoding (from MSRA and Peking University)
  - Uses a modified transformer based on 3D Nearby Attention for this
  - Pretraining tasks: T2I, video prediction, T2V; an approach built on several difficult pretraining tasks
  - The image tokenizer is VQ-GAN (used as-is for video too)
  - Pretraining data: 2.9M image-text pairs, Moments in Time (727k videos), VATEX (241k text-video pairs)
- Florence: A New Foundation Model for Computer Vision
  - A foundation model for a wide range of CV tasks (if you are curious about foundation models: ) from MSR and MS Cloud
  - Training data is web-scale image-text data
  - 12-layer transformer (LM encoder) + Conv+SwinTransformer (Vision encoder)
  - Image-text contrastive learning uses UniCL (same group), which they say they are preparing to put on arXiv in 2022... -_-;;
  - Also introduces techniques for scalability (so they ran 512 A100s for 10 days)
  - Transfer type: finetuning, linear probing, few-shot, zero-shot for 44 benchmark tasks
  - ImageNet-1k zero-shot 83.7%, COCO 62.4 mAP, VQA 80.36, 87.8 for K-600
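
For the DeBERTaV3 item above, the replaced token detection (RTD) objective can be illustrated with a minimal sketch. It only shows how per-token labels for the discriminator are derived; the helper name `rtd_labels` and the toy sentences are assumptions made for illustration, not code from the DeBERTa repository.

```python
def rtd_labels(original, corrupted):
    """Return 1 where the token differs from the original (replaced), else 0.

    A token that the generator happens to reproduce exactly counts as
    original, so a plain comparison is enough for this sketch.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Toy example (hypothetical data): the generator swapped "cooked" for "ate".
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]
labels = rtd_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0, 0]
```

The discriminator is then trained as a per-token binary classifier on these labels, which gives a training signal at every position rather than only at masked positions as in MLM.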

ghlee3401 · 2021-11-28T05:46:59Z

ArXiv
- One-shot Voice Conversion for Style Transfer based on Speaker Adaptation
  - Samples: https://kerwinchao.github.io/Oneshotvc.github.io/
  - Proposes one-shot VC: the task of producing the target speaker's voice from only a single target utterance
  - Recognition-synthesis system made up of four main modules (Fig. 1)
    1. content module: takes bottleneck features extracted from ASR as input and applies speaker normalization (reference [16] in the paper), so the output is speaker-independent content
    2. speaker module: uses a reference encoder and a classifier to extract a speaker representation from the target speaker's mel-spectrogram
    3. prosody module: extracts a speaker-independent prosody representation through explicit and implicit modeling, using external features such as log f0, vuv, and energy together with the bottleneck features
    4. conversion module: takes each module's output as input and predicts the mel-spectrogram (weight regularization is used to prevent over-fitting)
  - Three training steps
    1. Train the content module with an any-to-one VC method
    2. Train the modules marked "2" on a large dataset (here the target speaker audio is randomly sampled from the same speaker as the source audio)
    3. Fine-tune using only one utterance from the target speaker
  - Hard to judge with certainty because the samples are in English, but the output speaks in the target speaker's voice and the pronunciation is decent
- Improved Prosodic Clustering for Multispeaker and Speaker-independent Phoneme-level Prosody Control
  - Samples: https://innoetics.github.io/publications/multispeaker-prosody-control/index.html
  - Proposes prosodic clustering for phoneme-level prosody control of F0 and duration in TTS
  - If the training set does not contain a wide range of f0 and duration values, problems arise when adapting to unseen data
  - Applies various data augmentations to f0 and duration at the phoneme level, then clusters
  - For f0, normalization is applied before clustering to reduce speaker variation
  - For duration, K-Means is not used; instead values are grouped so that the groups contain the same number of samples
  - As a result, training goes better during few-shot adaptation
  - The sample page shows a notably wide variety of data augmentation methods
  - It is questionable whether excessively shifted values should be used at all
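
The two clustering preparations described for the prosodic clustering paper above can be sketched roughly as follows. This is a minimal sketch under assumptions: the notes only say "normalization" for f0, so per-speaker z-scoring is assumed here, and the duration grouping is read as equal-count binning. The function names and toy values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def normalize_f0_per_speaker(f0_by_speaker):
    """Z-score each speaker's f0 values (assumed form of normalization)."""
    normalized = {}
    for speaker, values in f0_by_speaker.items():
        mu = mean(values)
        sigma = pstdev(values) or 1.0  # guard against a constant-f0 speaker
        normalized[speaker] = [(v - mu) / sigma for v in values]
    return normalized


def equal_count_bins(durations, n_bins):
    """Label each duration so every bin holds (nearly) the same number of samples."""
    order = sorted(range(len(durations)), key=lambda i: durations[i])
    labels = [0] * len(durations)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(durations)
    return labels


# Hypothetical toy data: f0 in Hz per speaker, durations in frames.
f0 = normalize_f0_per_speaker({"spk_a": [100.0, 200.0], "spk_b": [120.0, 120.0]})
bins = equal_count_bins([5, 1, 9, 3, 7, 11], n_bins=3)
print(f0)
print(bins)
```

Equal-count binning keeps every duration label populated even when durations are heavily skewed. Note that in this sketch tied durations can land in different bins.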

kimyoungdo0122 · 2021-11-28T10:54:45Z

News
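- DEVIEW 2021 and HyperCLOVA (for replays, see Director Jung-Woo Ha's post!)
  - What stood out?
  - HyperCLOVA is not just a good PLM but the key catalyst for a lifecycle that spans research, development, services, business, and every other element
  - An acquaintance who follows DEVIEW regularly was a bit disappointed that this DEVIEW centered on HyperCLOVA...
  - It was probably a good event for anyone deciding whether to build large-scale AI in-house
  - From an organizational standpoint, what are the advantages of a large-scale model?
    - Could AI research and development processes be unified or improved around HyperCLOVA?
    - AI development probably depends quite a bit on the know-how of individual researchers or developers; could that heuristic aspect improve as well?
    - Could fine-tuning HyperCLOVA gradually lower the marginal cost of AI development (though I don't know when the break-even point would be)?
    - Continued attempts to use HyperCLOVA in services will keep surfacing things that do not work well or can be improved, which should be a big help for practical research that derives new research directions and topics and proposes new tasks, benchmarks, metrics, datasets, and so on!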
- OpenAI’s API Now Available with No Waitlist
  - API Guidelines
  - API #
  - pre-model description
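  - A tool that helps decide which engine to use for the desired input and output
  - Last week it was announced that it would go up on Azure... so why open the API # right away?
- MODUCON 2021 takes place on December 4! Please show lots of interest~~!!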
  - MODUCON 2021

qqueing · 2021-11-28T11:39:27Z

Scaling Law for Recommendation Models: Towards General-purpose User Representations
- NAVER's research on hyperscale user representations for recommendation
- Experiments on downstream tasks and a variety of experimental results (e.g. verifying the scaling law)
- (Reference) DEVIEW 2021 https://tv.naver.com/v/23649376
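
As a rough illustration of what "verifying the scaling law" usually involves, the sketch below fits a power law `loss = a * size ** (-b)` by least squares in log-log space. The power-law form, the function name `fit_power_law`, and the data points are assumptions for illustration; the numbers are synthetic and are not results from the paper.

```python
import math


def fit_power_law(sizes, losses):
    """Fit loss = a * size**(-b) via linear least squares on log-log values."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(v) for v in losses]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


# Synthetic points generated from a = 5.0, b = 0.1 (not measured values).
sizes = [1e6, 1e7, 1e8, 1e9]
losses = [5.0 * s ** -0.1 for s in sizes]
a, b = fit_power_law(sizes, losses)
print(round(a, 3), round(b, 3))
```

If measured losses fall close to a straight line in log-log space, the fitted exponent `b` summarizes how quickly loss improves as the scaled quantity grows.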

hollobit · 2021-11-28T13:01:07Z

UNESCO adopts its Recommendation on the Ethics of Artificial Intelligence

https://unesdoc.unesco.org/ark:/48223/pf0000379920.page=14
https://kaiea.org/research/?idx=8994590&bmode=view
https://zdnet.co.kr/view/?no=20211124105459

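UNESCO's 193 member states adopted the first global agreement on the ethics of artificial intelligence

Values and principles set out in 141 provisions

Opinions divided over the effectiveness of the government's 'AI Ethics Principles Self-Checklist'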

http://it.chosun.com/site/data/html_dir/2021/11/26/2021112601857.html
http://www.aitimes.com/news/articleView.html?idxno=141704

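The draft AI ethics self-checklist presents 47 checklist items corresponding to the 10 core requirements in the AI ethics standards that the Ministry of Science and ICT prepared last December.
However, industry reaction to the self-checklist's effectiveness is divided. Some say concrete guidelines that can be used in practice are needed, while others worry that it will take on a regulatory character rather than remain a voluntary guideline. There is also concern that companies will not actively use it precisely because it is voluntary.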

jungwoo-ha closed this as completed Jan 9, 2022
