Sep 23, 2022 2 min read

Links / 2022-09-18

My first email to some friends on this topic; the positive response to this is part of why I decided to start the newsletter.

[Ed.: This was my first email to some friends on this topic, now in Ghost for posterity.]

GitHub Copilot developer productivity research

GH did a study on “how do developers perform with copilot?” I am unsure how representative it is, but even if only 1/10th as effective as portrayed here, many developers will adopt it.

Open Data licensing resource

“a project for AI model training with trusted dataset compliance”, subproject of LF’s lfaidata.foundation, found via SPDX’s spdx-ai subproject.

Stanford Call for norms on model release

Stanford AI project calls for figuring out how to release models. It feels to me akin to early OSI days, when OSI (briefly) talked about more than licensing; or even slightly pre-OSI days (as this, essentially, is a call for an OSI-like entity for open models).

Pertinent quote, but the paper is worth a read in its entirety:

Releasing foundation models is important since no single organization has the needed range of diverse perspectives to foresee all the long-term issues, but at the same time, release must be appropriately gated to minimize risks. The community currently lacks norms on release. Our emphasis here is that the question is not what a good release policy is, but how the community should decide. We have sketched out the foundation models review board…

CMU research on license-as-norms in deepfake space

We conducted an interview study of an AI-enabled open source Deepfake project to understand how members of that community reason about the ethics of their work. We found that notions of the “Freedom 0” to use code without any restriction … were central to how community members framed their responsibilities, and the actions they believed were and were not available to them.

This paper is very good; perhaps a little too theoretical to be useful for a practitioner but I think very helpful for anyone thinking about norms, licenses, and the interplay within them in a community that actively thinks about norms but also feels bound by open community norms that long pre-date the project.

If you’re interested, I also had a good thread with the author clarifying some points about how the paper writes about licenses v. norms:

I finished this paper today, and I'm struggling a bit with it. Specifically: in Sec. 4 you talk a lot about Freedom 0 as the source for norms, while in Sec. 3.1.1 you instead attribute the norms to the license—not broader free/open cultural expectations.
— Luis Villa (@luis_in_brief) September 7, 2022

Injection attacks and AI

We’ve spent a lot of time as a community talking about AI ethics but not a lot (yet) talking about AI security; that moment is coming. Key quote:

The more I think about these prompt injection attacks against GPT-3, the more my amusement turns to genuine concern. I know how to beat XSS, and SQL injection, and so many other exploits. I have no idea how to reliably beat prompt injection!

The AI Unbundling

There’s no explicitly open angle to this article, but it’s a good overview of how AI might really lead to massive change, in the context of things like midjourney, copilot, etc. It will ring very familiar to anyone who looked at the changes wrought by open from 1998-2018.