
[译] [105] 使用 Copilot 的风险和挑战 #7


Open
cssmagic opened this issue Feb 28, 2024 · 0 comments

cssmagic commented Feb 28, 2024

1.5 Risks and challenges when using Copilot

1.5 使用 Copilot 的风险和挑战

Now that we’re all pumped up about getting Copilot to write code for us, we need to talk about the dangers inherent in using AI assistants. See references [2] and [3] for elaboration on some of these points.

现在我们都对 Copilot 帮我们写代码感到兴奋,但我们还需要讨论一下使用 AI 助手时所面临的潜在风险。有关这些问题的更多阐述,请参阅参考文献 [2] 和 [3]。

  • Copyright—Copilot learned how to program using human-written code. (You’ll hear people use the word “train” when talking about AI tools like Copilot. In this context, training is another word for learning.) More specifically, it was trained using millions of GitHub repositories containing open-source code. One worry is that Copilot will “steal” that code and give it to us. In our experience, Copilot doesn’t often suggest a large chunk of someone else’s code, but that possibility is there. Even if the code that Copilot gives us is a melding and transformation of various bits of other people’s code, there may still be licensing problems. For example, who owns the code produced by Copilot? There is currently no consensus on the answer. The Copilot team is adding features to help; for example, Copilot can tell you whether the code that it produced is similar to already-existing code and what the license is on that code [4]. Learning and experimenting on your own is great, and we encourage that—but take the necessary care if you do intend to use this code for purposes beyond your home. We’re a bit vague here, and that’s intentional: it may take some time for laws to catch up to this new technology. It’s best to play it safe while these debates are had within society.

  • 版权问题。Copilot 是通过学习人类编写的代码掌握编程技能的。(在讨论 Copilot 这类 AI 工具时,人们常会用到 “训练” 一词,在这个语境下,训练就是学习的另一种说法。)具体来说,它是用 GitHub 上数百万个包含开源代码的仓库训练出来的。人们的一个担忧是,Copilot 可能会 “盗用” 这些代码并提供给我们。根据我们的经验,Copilot 并不经常直接给出他人代码的大段内容,但这种可能性确实存在。即便 Copilot 提供的代码是对多段他人代码的融合与转化,也仍然可能存在许可方面的问题。例如,Copilot 生成的代码归谁所有?这个问题目前尚无共识。Copilot 团队正在添加相关功能来提供帮助,比如 Copilot 可以告诉你它生成的代码是否与已有代码相似,以及那些代码采用了何种许可证 [4]。个人的学习和实验是很棒的,我们也鼓励这样做;但如果你打算将这些代码用于个人用途之外的场合,请务必谨慎对待。这里的表述有意保持了一定的模糊性,因为法律跟上这项新技术可能还需要一段时间。在社会就这些议题展开讨论期间,谨慎行事是明智之举。

  • Education—As instructors of introductory programming courses ourselves, we have seen first-hand how well Copilot does on the types of assignments we have historically given our students. In one study [5], Copilot was asked to solve 166 common introductory programming tasks. And how well did it do? On its first attempt, it solved almost 50% of these problems. Give Copilot a little more information, and that number goes up to 80%. You have already seen for yourself how Copilot solves a standard introductory programming problem. Education needs to change in light of tools like Copilot, and instructors are currently discussing how these changes may look. Will students be allowed to use Copilot and, if so, in what ways? How can Copilot help students learn? And what will programming assignments look like now?

  • 教育。作为编程入门课程的讲师,我们亲眼见证了 Copilot 在我们过去布置给学生的那类作业上的出色表现。在一项研究 [5] 中,Copilot 被要求解决 166 项常见的初级编程任务,它的表现如何?在第一次尝试中,它就解决了将近 50% 的问题;一旦提供更多信息,这一比例甚至可提升至 80%。你已经亲眼见过 Copilot 是如何解决一道标准的入门级编程题的。面对 Copilot 这类工具,教育需要做出变革,目前教师们也在积极探讨这种变革的具体形式。学生们能否使用 Copilot?如果可以,他们将以何种方式使用?Copilot 又将如何辅助学生学习?未来的编程作业又将呈现出何种新的面貌?

  • Code quality—We need to be careful not to trust Copilot, especially with sensitive code or code that needs to be secure. Code written for medical devices, for example, or code that handles sensitive user data must always be thoroughly understood. It’s tempting to ask Copilot for code, marvel at the code that it produces, and accept that code without scrutiny. But that code might be plain wrong. In this book, we will work on code that will not be deployed at large, so while we will focus on getting correct code, we will not worry about the implications of using this code for broader purposes. We will start building the foundations you will need to independently determine whether code is correct.

  • 代码质量。我们必须保持警惕,不能盲目信任 Copilot,尤其是在处理敏感代码或需要保障安全的代码时。例如,为医疗设备编写的代码,或者处理用户敏感数据的代码,我们必须彻底理解。人们在面对 Copilot 的神奇表现时很容易麻痹大意,从而在未经仔细审核的情况下接受它生成的代码,但那些代码可能完全是错误的。在这本书中,我们处理的代码并不会大规模部署,因此,虽然我们会专注于获取正确的代码,但不会过多考虑这些代码在更大范围使用时会有何种影响。我们将致力于帮助你建立必要的基础,以便独立判断代码的正确性。
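As a small, hypothetical illustration of why suggested code deserves scrutiny (this example is ours, not from any actual Copilot suggestion): the first function below looks plausible at a glance but contains an off-by-one bug, and a single spot check against a hand-computed answer is enough to expose it.

```python
def average_buggy(numbers):
    """A plausible-looking suggestion that is subtly wrong."""
    total = 0
    for i in range(1, len(numbers)):  # bug: skips the first element
        total += numbers[i]
    return total / len(numbers)

def average(numbers):
    """The correct version."""
    return sum(numbers) / len(numbers)

# A quick spot check with a hand-computed answer exposes the bug:
print(average([2, 4, 6]))        # 4.0, as expected
print(average_buggy([2, 4, 6]))  # about 3.33 -- wrong
```

Checking generated code against a few inputs whose answers you have worked out by hand is exactly the kind of habit the rest of the book builds toward.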

  • Code security—As with code quality, code security is absolutely not assured when we get code from Copilot. For example, if we are working with user data, getting code from Copilot is not enough. We would need to perform security audits and have expertise to determine that the code is secure. Again, though, we will not be using code from Copilot in real-world scenarios. Therefore, we will not focus on security concerns.

  • 代码安全。与代码质量一样,从 Copilot 获得的代码无法保证安全性。例如,在处理用户数据时,把 Copilot 提供的代码拿来就用是远远不够的。我们需要执行安全审计,并且通过专业知识来判断代码的安全性。当然,我们不会在现实场景中使用 Copilot 提供的代码,因此,我们不会将重点放在安全问题上。
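As one sketch of the kind of flaw a security audit looks for (again, our illustrative example, not a quoted Copilot suggestion): building a SQL query by pasting user input into the query string, a pattern an assistant may well produce, is open to SQL injection, while the parameterized form keeps the input as data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def find_user_unsafe(name):
    # Vulnerable pattern: user input is pasted into the SQL text.
    return conn.execute(
        f"SELECT name FROM users WHERE name = '{name}'"
    ).fetchall()

def find_user_safe(name):
    # Parameterized query: the driver treats the input purely as data.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

attack = "' OR '1'='1"
print(find_user_unsafe(attack))  # [('alice',)] -- every row leaks
print(find_user_safe(attack))    # [] -- no user has that literal name
```

Both functions behave identically on honest input; only an adversarial input reveals the difference, which is why accepting code that merely "works" is not enough when user data is involved.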

  • Not an expert—One of the markers of being an expert is awareness of what one knows and, equally important, what one doesn’t. Experts are also often able to state how confident they are in their response; and, if they are not confident enough, they will learn further until they know that they know. Copilot and, more generally, LLMs do not do this. You ask them a question, and they answer, plain as that. They will confabulate if necessary: they will mix bits of truth with bits of garbage into a plausible-sounding but overall nonsensical response. For example, we have seen LLMs fabricate obituaries for people who are alive, which doesn’t make any sense, yet the “obituaries” do contain elements of truth about people’s lives. When asked why an abacus can perform math faster than a computer, we have seen LLMs come up with responses—including something about abacuses being mechanical and therefore necessarily the fastest. There is ongoing work in this area for LLMs to be able to say, “sorry, no, I don’t know this”, but we are not there yet. They don’t know what they don’t know, and that means they need supervision.

  • 不是专家。专家的一个显著特征是清楚自己知道什么,同样重要的是,也清楚自己不知道什么。他们还能准确表达对自己答案的信心程度;如果信心不足,他们会持续学习,直到确信自己掌握了知识。Copilot 以及更广泛的 LLM 并不具备这种能力。你向它们提问时,它们就是直接给出回答而已;在必要时它们还会编造答案:将真实片段与垃圾信息混合,形成看似合理但总体上没有意义的回答。例如,我们观察到 LLM 有时会为尚在人世的人虚构讣告,尽管这不合逻辑,但这些 “讣告” 中却包含了关于这些人生活的真实信息。当被问及算盘为何能在运算速度上超越计算机时,LLM 有时会给出一些站不住脚的解释,比如算盘因为是机械的,所以必然是最快的。目前,这个领域正在开展相关工作,以便让 LLM 能够在不知道答案时明确表示 “对不起,我不知道”,但这一目标尚未实现。它们不知道自己不知道什么,这意味着它们需要监督。

  • Bias—LLMs will reproduce the same biases present in the data on which they were trained. If you ask Copilot to generate a list of names, it will generate primarily English names. If you ask for a graph, it may produce a graph that doesn’t consider perceptual differences among humans. And if you ask for code, it may produce code in a style reminiscent of how dominant groups write code. (After all, the dominant groups wrote most of the code in the world, and Copilot is trained on that code.) Computer science and software engineering have long suffered from a lack of diversity. We cannot afford to stifle diversity further, and indeed we need to reverse the trend. We need to let more people in and allow them to express themselves in their own ways. How this will be handled with tools like Copilot is currently being worked out and is of crucial importance for the future of programming. However, we believe Copilot has the potential to improve diversity by lowering barriers for entry into the field.

  • 偏见。LLM 会重现其训练数据中存在的偏见。例如,当你请求 Copilot 生成一份姓名清单时,它通常会生成一些英文名;如果你要求它绘制图表,得到的图表可能没有充分考虑到人类之间的视觉感知差异;而要求它编写代码时,它输出的代码风格很可能反映了主流群体的编码习惯。(毕竟主流群体编写了世界上的大部分代码,而 Copilot 正是基于这些代码进行训练的。)长期以来,计算机科学和软件工程领域一直面临多样性不足的问题。我们不能允许多样性进一步受损,更应努力扭转这一趋势。我们需要让更多的人参与进来,让他们能以自己的方式自由表达。如何面对 Copilot 这类工具所带来的挑战,目前正在积极探索中,这对编程的未来极为关键。尽管如此,我们相信 Copilot 有希望通过降低行业门槛来促进多样性的提升。
