One implication of this is that, all else equal, Facebook’s measure will tend to increase or decrease
simply as a result of changes in the resources allocated to this process. Moreover, given the
complexity of borderline cases – those where it is a judgement call whether content breaches the
Community Standards or not – there is also a sense in which increased activity on Facebook’s part
creates transgressions, for example when content is removed that no user would ever have actually
complained about, or even necessarily found troubling. This is likely to be most pronounced for
provisions of the Community Standards that are more nuanced and subtle; nudity, for example, is
clearly defined by policy and easy to identify,[14] although the general public may not agree with
the policy. Bullying or harassment, by definition, depends on how a user responds to content. One
might expect that a higher rate of proactive removal of bullying content would be associated with
a higher rate of non-violating content being removed.[15]
The Group also notes that changes in this measure of effectiveness reflect a combination of
Facebook proactivity and two different types of user behavior. An increase in the fraction of
violating content initially identified by Facebook, as opposed to users, could reflect changes in
the Facebook review process, changes in the propensity of users to post violating content, and/
or changes in the propensity of viewers to report content. For example, changes in the Facebook
interface that encourage users to report violations, or increased awareness of the Facebook Community
Standards, would tend to make this measure smaller in magnitude (at least insofar as these changes
promote reporting behavior among users). Increased user participation in enforcement of the Community
Standards should be a policy goal, but it would lead to a reduction in the proactivity metric.
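To make this mechanism concrete, the proactivity metric can be sketched in stylized form (the formula and the figures below are illustrative assumptions for exposition, not Facebook’s published methodology):

\[
\text{Proactivity} \;=\; \frac{F}{F + U},
\]

where \(F\) is the number of actioned pieces of violating content that Facebook identified first and \(U\) is the number that users reported first. If, in a given period, \(F = 900\) and \(U = 100\), the metric stands at 90 percent. If an interface change prompts users to report more violations before Facebook finds them, so that \(U\) rises to 300 while \(F\) remains 900, the metric falls to 75 percent even though Facebook’s own detection effort is unchanged.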
All the points raised above illustrate that there are unavoidable tradeoffs in choosing which
metrics to focus on. Different metrics prioritize different goals. A particular metric, like proactivity,
will encourage employees to optimize one value. However, in so doing, it might divert effort and
attention from other important values. Focusing on a measure of proactivity conveys to employees
that it is desirable to increase the percentage of violating content that is proactively removed. If
the number of proactive actions taken continues to increase, this could be considered ‘good’
because Facebook is doing more to catch violating content; but it could also be considered
‘bad’ because, even if the specific metric is ‘improving,’ either (a) more violating content is being
posted, (b) users are not bothering to report violations, or (c) user-reported violations are not being
actioned, calling into question the effectiveness of the overall regulatory system;[16] and it might also
be considered ‘counter-productive’ because content is being flagged and actioned mistakenly,
or when it did not need to be. As more robust time trends become available, answers to some of
these questions will be forthcoming, but the underlying issues will remain. This is not necessarily
a call for this metric to be calculated differently, but rather an indication that more care may be
needed in its presentation and interpretation.
[14] Although there are times when nudity is ambiguous. For example, a shot of a nipple is usually prohibited, but not when the image is a
woman breastfeeding or a work of art.
[15] Further, to the extent that proactive removal may suggest to the users that their benign interactions are negative, this may have
unintended behavioral effects on users themselves.
[16] See Ariana Tobin et al., Facebook’s Uneven Enforcement of Hate Speech Rules Allows Vile Speech to Stay Up, ProPublica (Dec. 28,
2017), https://www.propublica.org/article/facebook-enforcement-hate-speech-rules-mistakes (discussing examples where Facebook
erroneously left up pieces of content reported by users, which Facebook later acknowledged were violations of the Standards).