A university got itself banned from the Linux kernel (2021)
theverge.com | 42 points by italophil 3 hours ago
Whoa, the thing that leapt out at me, as a professor, is that they somehow got an exemption from the UMN institutional review board. Uh, how?? It's clearly human subjects research under the conventional federal definition[1], and it obviously posed a meaningful risk of harm, in addition to being conducted deceptively. Someone at that IRB must have been massively asleep at the wheel.
[1] https://grants.nih.gov/policy-and-compliance/policy-topics/h...
I think they should have gotten permission from the IRB ahead of time, but it doesn't sound like they were researching human subjects? They were studying the community behind the Linux kernel, specifically the process for gatekeeping bad changes before they reach the kernel; they weren't experimenting on specific community members. Would you consider it human experimentation if I was running an experiment to see if I could get crappy products listed on Amazon, for example?
> Would you consider it human experimentation if I was running an experiment to see if I could get crappy products listed on Amazon, for example?
Yes, if in the course of that experimentation, you also shipped potentially harmful products to buyers of those products "to see if Amazon actually let me".
>I think they should have gotten permission from IRB ahead of time, but this doesn't sound like they were researching human subjects?
I assure you that it falls under the IRB's purview -- I came into the thread intending to make the grandparent's comment. When using deception in a human subjects experiment, there is an additional level of rigor -- you usually need to debrief the participants about said deception, not wait for them to read about it in the press.
(And if a human is reviewing these patches, then yes, it is human subjects research.)
The whole story is a good example of why there are IRBs in the first place -- in almost any story other than this Linux kernel fiasco, people cast them as the bad guys.
The ultimate problem is that it's easy to fake stuff, so you have to use heuristics to decide whom you can trust. You sort of sum up a threat score and then decide how much attention to apply. Without doing something like that, the transaction costs dominate and certain valuable things can't be done. It's true that Western universities are generally a positive component of that score, and students working under a professor there are another positive component.
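Roughly the kind of summing-up I mean -- a toy Python sketch, where the signal names, weights, and thresholds are entirely made up for illustration:

    # Toy model of the trust heuristic described above.
    # All signal names and weights are invented for illustration.
    TRUST_SIGNALS = {
        "western_university_affiliation": 2.0,
        "sponsoring_professor_known": 2.0,
        "history_of_accepted_patches": 3.0,
        "throwaway_email_address": -3.0,
    }

    def review_effort(signals: set) -> str:
        """Sum the contributor's trust signals and pick a scrutiny level."""
        score = sum(TRUST_SIGNALS.get(s, 0.0) for s in signals)
        if score >= 4:
            return "light review"
        if score >= 0:
            return "normal review"
        return "deep review"

    # A student under a known professor at a Western university
    # scores high enough to get only a light review:
    print(review_effort({"western_university_affiliation",
                         "sponsoring_professor_known"}))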
It's like if my wife said "I'm taking the car to get it washed" and then she actually takes the car to the junkyard and sells it. "Ha, you got fooled!" I mean, yes, obviously. She's on the inside of my trust boundary, and I don't want to live a life where I'm actually operating in a way that's immune to this 'exploit'.
I get that others object to the human experimentation part of things and so on, but for me that could be justified with a sufficiently high bar of utility. The problem is that this research is useless.
No, random anonymous contributors with cheng3920845823@gmail.com as their email address are not as trustworthy as your wife, and blindly merging PRs from them into some of the most security-critical and widely used code in the entire world without so much as running a static analyzer is not reasonable.
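And the kernel tree even ships a first-pass checker of its own. A minimal sketch of wiring it in as an automatic gate -- assuming you're inside a Linux kernel checkout with a patch file on disk, and noting that checkpatch flags style and common mistakes, not deliberately hidden vulns:

    # Sketch: run the kernel's scripts/checkpatch.pl on an incoming
    # patch before any human spends review time on it.
    # Assumes a Linux kernel checkout; checkpatch catches style and
    # common errors, not deliberately concealed vulnerabilities.
    import subprocess
    import sys

    def checkpatch(patch_path: str) -> bool:
        """Return True if checkpatch.pl reports the patch as clean."""
        result = subprocess.run(
            ["perl", "scripts/checkpatch.pl", patch_path],
            capture_output=True, text=True,
        )
        print(result.stdout)
        return result.returncode == 0

    if __name__ == "__main__":
        sys.exit(0 if checkpatch(sys.argv[1]) else 1)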
Oh, I misunderstood the sections in the article about the umn.edu email stuff. My mistake. The actual course of events:
1. Prof and students make fake identities
2. They submit patches with hidden vulns to Greg KH and friends
3. Some of these patches are accepted
4. They intervene at this point and reveal that the patches are malicious
5. The patches are then not merged
6. This news comes out and Greg KH applies big negative trust score to umn.edu
7. Some other student submits a buggy patch to Greg KH
8. Greg KH assumes that it is more research like this
9. Student calls it slander
10. Greg KH institutes policy for his tree that all umn.edu patches should be auto-rejected and begins reverts for all patches submitted in the past by such emails
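For a sense of what step 10 involves, here's a hedged sketch of enumerating the commits that would have to be reviewed and reverted -- an illustration, not the maintainers' actual tooling; it assumes git is installed and this runs inside a kernel checkout:

    # Sketch of step 10's revert sweep: list every commit whose
    # author email is under umn.edu, as candidates for review/revert.
    # Illustration only, not the maintainers' actual process.
    import subprocess

    def commits_by_domain(domain: str) -> list:
        """Return commit hashes whose author matches the given domain."""
        out = subprocess.run(
            ["git", "log", "--all", "--format=%H", "--author=@" + domain],
            capture_output=True, text=True, check=True,
        )
        return out.stdout.split()

    for sha in commits_by_domain("umn.edu"):
        print(sha)  # each one needs a manual look before reverting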
To be honest, I can't imagine any other outcome. No one likes being cheated out of work they did, especially when a lot of it is volunteer work. But I was wrong to say the research was useless. It does demonstrate that identities without provenance can get malicious code into the kernel.
Perhaps what we really need is a Social Credit Score for OSS ;)
Actually, I think #7 is one of the same students working for the professor. So GKH is correct in assuming it's more of the same.
Research can be non-useless but also unethical at the same time...
> 3. Some of these patches are accepted
> 4. They intervene at this point and reveal that the patches are malicious
> 5. The patches are then not merged
It's not clear to me that they revealed anything, just that they did fix the problems:
> In their paper, Lu and Wu claimed that none of their bugs had actually made it to the Linux kernel — in all of their test cases, they’d eventually pulled their bad patches and provided real ones. Kroah-Hartman, of the Linux Foundation, contests this — he told The Verge that one patch from the study did make it into repositories, though he notes it didn’t end up causing any harm.
(I'm only working from this article, though, so feel free to correct me)
You know, there's a lot of he-said-she-said here. The truth is that I was repeating what they claimed in the paper, which is that they intervened prior to the merge to mainline.
I don't believe they revealed that these were hypocrite commits at the time of their acceptance; that was only revealed when the paper was put on a preprint server. But they did point out the problems to maintainers before the changes were mainlined.
>No one likes being cheated out of work that they did, especially when a lot of it is volunteer work.
You know what would really be wasteful of volunteer hours? Instituting a policy whereby the community has to trawl through 20 years of commits from umn.edu addresses and manually review them for vulnerabilities, even though you have no reasonable expectation that such commits are likely to contain malicious code and you're actually just butthurt. (They found nothing after weeks of doing this, btw.)
>Then, there’s the dicier issue of whether an experiment like this amounts to human experimentation. It doesn’t, according to the University of Minnesota’s Institutional Review Board. Lu and Wu applied for approval in response to the outcry, and they were granted a formal letter of exemption.
I had to apply for exemptions often in grad school. You must do so before performing the research -- it is not ethical to wait for an outcry and then apply after the fact. Any well-run CS department trains its incoming students on IRB procedures during orientation, and Minnesota risks all of its federal funding if it continues to allow researchers to operate in this manner.
(Also "exempt" usually refers to exempt from the more rigorous level of review used for medical experiments -- you still need to articulate why your experiment is exempt to avoid people just doing whatever they want then asking for forgiveness after the fact)
(2021) Discussion at the time (3025 points, 1954 comments) https://news.ycombinator.com/item?id=26887670
Fun fact: one of the researchers removed any reference to this from their publications page: https://www-users.cse.umn.edu/~kjlu/
The authors were 100% in the right, and GKH was 100% in the wrong. It's very amusing to go back and read all of the commenters calling for the paper authors to face criminal prosecution. The fact is that they provided a valuable service and exposed a genuine issue with kernel development policies. Their work reflected poorly on kernel maintainers, and so those maintainers threw a hissy fit and brigaded the community against them.
Also, banning umn.edu email addresses didn't even make sense since the hypocrite commits were all from gmail addresses.
> Also, banning umn.edu email addresses didn't even make sense since the hypocrite commits were all from gmail addresses.
The blanket ban was kicked off by another incident after the hypocrite commit incident.
I mean... there is a whole discussion about the questionable ethics of the research methods in the Verge article. And human subjects and issues-of-consent questions aside, they were also messing with a mission-critical system (the Linux kernel), and apparently left crappy code in there for all the maintainers to go back and weed out.
1) once hypocrite commits were accepted, the authors would immediately retract them
2) I don't think it's unethical to send someone an email that has bad code in it. You shouldn't need an IRB to send emails.
1) How did they hit stable then? [0]
2) Yes, emails absolutely need IRB sign-off too. If you email a bunch of people asking for their health info or doing a survey, the IRB would smack you for unapproved human subjects research without consent. Consent was obviously not given here.
[0] https://lore.kernel.org/linux-nfs/CADVatmNgU7t-Co84tSS6VW=3N...
> I don't think it's unethical to send someone an email that has bad code in it.
It's unethical because of the bits you left out: sending code you know is bad, and doing so under false pretenses.
Whether or not you think this rises to the level of requiring IRB approval, surely you must be able to understand that wasting people's time like this is going to be viewed negatively by almost anyone. Some people might be willing to accept that doing this harm is worth it for the greater cause of the research, but that doesn't erase the harm done.
The stupid thing about the experiment was that it's never been a secret that the kernel is vulnerable to malicious patches. The kernel community understood this long before these academics wasted kernel maintainer time with a silly experiment.
While I did see some problems with their approach (i.e., doing the IRB review retroactively instead of ahead of time, and not properly disclosing the experiments afterwards), I think this research is valuable, and I don't think the authors were too unethical. The event this most reminds me of is the Sokal Squared scandal, where researchers sent bogus papers to journals in order to test those journals' peer-review standards.
Imo the experiment was worthwhile; it exposed a risk, and hopefully the kernel is better armed against similar attacks now.
Did they ever get un-banned? IIRC, that university has (or had) a great computer science department.
But there are always the BSDs.