[20230305] Weekly AI ArXiv 만담 시즌2 - 8회차 #74
News
Most-cited AI papers by year (citation counts as shared):

2022
- 2️⃣ ColabFold: making protein folding accessible to all -> (From multiple institutions, 1162 citations) An open-source and efficient protein folding model.
- 3️⃣ Hierarchical Text-Conditional Image Generation with CLIP Latents -> (From OpenAI, 718 citations) DALL·E 2, complex prompted image generation that left most in awe.
- 4️⃣ A ConvNet for the 2020s -> (From Meta and UC Berkeley, 690 citations) A successful modernization of CNNs at a time of boom for Transformers in Computer Vision.
- 5️⃣ PaLM: Scaling Language Modeling with Pathways -> (From Google, 452 citations) Google's mammoth 540B Large Language Model, a new MLOps infrastructure, and how it performs.

2021
- 2️⃣ Swin Transformer: Hierarchical Vision Transformer using Shifted Windows -> (From Microsoft, 4810 citations) A robust variant of Transformers for Vision.
- 3️⃣ Learning Transferable Visual Models From Natural Language Supervision -> (From OpenAI, 3204 citations) CLIP, image-text pairs at scale to learn joint image-text representations in a self-supervised fashion (a minimal sketch of the contrastive objective follows after this list).
- 4️⃣ On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? -> (From U. Washington, Black in AI, The Aether, 1266 citations) Famous position paper, very critical of the trend of ever-growing language models, highlighting their limitations and dangers.
- 5️⃣ Emerging Properties in Self-Supervised Vision Transformers -> (From Meta, 1219 citations) DINO, showing how self-supervision on images leads to the emergence of a kind of proto-object segmentation in Transformers.

2020
- 2️⃣ Language Models are Few-Shot Learners -> (From OpenAI, 8070 citations) GPT-3; this paper needs no further explanation at this stage.
- 3️⃣ YOLOv4: Optimal Speed and Accuracy of Object Detection -> (From Academia Sinica, Taiwan, 8014 citations) Robust and fast object detection sells like hotcakes.
- 4️⃣ Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer -> (From Google, 5906 citations) A rigorous study of transfer learning with Transformers, resulting in the famous T5.
- 5️⃣ Bootstrap your own latent: A new approach to self-supervised Learning -> (From DeepMind and Imperial College, 2873 citations) Showing that negatives are not even necessary for representation learning.
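For reference on the CLIP entry above: the paper trains image and text encoders with a symmetric contrastive objective over image-text pairs. Below is a minimal sketch of that objective only; the embedding sizes, temperature value, and function name are illustrative assumptions, not OpenAI's implementation.

```python
# Minimal CLIP-style contrastive (InfoNCE) loss over a batch of paired
# image/text embeddings. Shapes and the temperature are toy assumptions.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    """Symmetric cross-entropy over the NxN image-text similarity matrix."""
    # L2-normalize so the dot product becomes cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # logits[i, j] = similarity between image i and caption j.
    logits = image_emb @ text_emb.t() / temperature

    # Matching pairs sit on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    loss_i2t = F.cross_entropy(logits, targets)      # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)  # text -> image direction
    return (loss_i2t + loss_t2i) / 2

if __name__ == "__main__":
    # Random tensors stand in for encoder outputs in this toy usage.
    imgs = torch.randn(8, 512)
    txts = torch.randn(8, 512)
    print(clip_contrastive_loss(imgs, txts).item())
```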
LLaMA talk: the latest updates on LLaMA, which Facebook recently shared.
Image-related models
- High-resolution image reconstruction with latent diffusion models from human brain activity
  Sharing a paper that has been a big topic on Twitter since yesterday. The authors fit an L2-regularized linear model (???) from brain fMRI signals to Stable Diffusion's image and text latent encodings, and show that this is enough to reconstruct images similar to what the subject was shown. Thousands of images are needed per subject, and each model presumably works for only one subject (and probably only one scanner), but demonstrating that reconstruction is possible from brain signals using only a pretrained model plus linear-model fitting, with no deep-learning training, should have a big impact. That said, it will only be trustworthy once the results are reproduced. (A minimal sketch of the linear-mapping idea follows after this list.)
- Dropout Reduces Underfitting
- Consistency Models
- Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages
- Full Stack Optimization of Transformer Inference: a Survey
  A well-organized survey of hardware- and software-level optimizations and issues for Transformer inference.
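The core idea of the fMRI paper, as described above, can be sketched roughly as ridge regression from voxel responses to a frozen diffusion model's latent vectors. Everything below (array shapes, the alpha value, variable names) is an illustrative assumption, not the paper's code or data.

```python
# Hedged sketch: fit an L2-regularized (ridge) linear map from fMRI voxel
# responses to pretrained diffusion-model latents, with no deep-network
# training on the brain data itself. All sizes here are toy values.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Stand-ins for real data: fMRI patterns per trial and the (flattened)
# latent codes of the images shown on those trials.
n_trials, n_voxels, latent_dim = 1000, 1000, 256
fmri = rng.standard_normal((n_trials, n_voxels))
latents = rng.standard_normal((n_trials, latent_dim))

# One ridge regression per subject; alpha controls the L2 penalty.
decoder = Ridge(alpha=1e3)
decoder.fit(fmri[:800], latents[:800])

# Predicted latents for held-out trials would then be passed to the frozen
# diffusion model's decoder / conditioning pathway to render images.
pred_latents = decoder.predict(fmri[800:])
print(pred_latents.shape)  # (200, 256)
```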
Arxiv