2018/Berlin/ethics

Data Ethics on the IndieWeb was a session at IndieWebCamp Berlin 2018.

Notes archived from: https://etherpad.indieweb.org/ethics

IndieWebCamp Berlin 2018
Session: Data Ethics on the IndieWeb
When: 2018-11-03 15:05

Participants

Sven Knebel
Sebastian Greger
Jeremy Keith
Alexandr
Jan
Tiara Miller
Martijn van der Ven
Peter Molnar
Dylan Harris
charlie owen
Ton Zijlstra (remote, reading the notes)
Greg McVerry
Add yourself here… (see this for more details)

Notes

goal of the session is not to discuss legal questions, nor GDPR; we simply want to start a discussion where indieweb sites use data about other people that potentially comes with a responsibility to deal with it ethically.
- comes with big questions: who defines what is ethical? how can a shared understanding be established in a community-driven open source movement like this, how can it be rooted in practice?
- no simple answer but worthwhile to inject the debate into indieweb efforts (session goal is discussion, not solution)

pulling likes and retweets to display on website there was an issue if a user didnt hear of indieweb they arent aware that their name would be used on the website it was being pulled to
was using pixel images to link back to original post
do indie website fall under gdpr? - we are not lawyers, don't get into that (personal websites don't fall under gdpr, so you don't need a privacy statement e.g., but doesn't mean you can post personal data) (depends on what your lawyer thinks "household activity" is)
there is not really a difference between processing and storing the content or data (again not a lawyer disclaimer)
there is no distiniction between storing and processing in GDPR
seeking permission to share the data or is it that because it is on the web it is therefore public
- where do you define the line for this?
twitter users grant twitter the right to use their data and dont have ownership
from use case point of view you make a post to another persons site, realize it wasnt a good idea and you want to delete it can you?
- yes using gone on the server so it is removed
private copy is ethically a different thing
ethics on the indieweb not the legallity of it
hard to speak about legallity of it because right now it is only basic information and not very fine tuned. Need test cases for this
what is the data we process, what indieweb-protocols are potentially affected?
how do we deal with it? does it need new mechanisms?
- who decides this on an opensource indieweb?
is the online realm a public or private space?
- if someone is on twitter visable to all that their content can be harvested onto third parties or do they think it will only be shared amongst friends?
can people be trusted to make the decision about sharing or processing personal data?
spirit of the law can be more murky
calculated vs not calculated
What data is theirs to control?
petermolnar- you should license your content all of the time
licensing in social media is a mess, you give rights to facebook to republish, does that mean it gives the right to others to do so?
having a public instagram left images open to be used else where, being harvested. Needed to make private to get them removed
knowing the worst case scenarios
making informed choices
can things be used if on public domains?
- don't use the information unless it is explicitly said that it can be used
looking further into if there is a license instead of infering that it doesn't
infrences that they pinged me and that they have the refrence
terms and conditions
- how broad can these be? is there a better way to do it?
actively sharing
superfeeder feed
anyone can send a web mention to anyone
licensing on its own in microformats need to know what it means
url- can you share the url without an ethical issue
spam-hundered of webmentions-type of attack vector
webmentions spam hypothetical, but possible, haven't found any historical evidence yet
twitter grants the right to remove their likes, retweets etc
flags- share okay, republish okay
robots text files
no index, no archive meta tags
linux foundation- short identifiers for licenses (https://spdx.org/licenses/)
meta tag for micro information on licenses/redistribution (there apparently had been an existing proposal on the indieweb, to use an attribute on a-tags to express that a webmention should not be sent/processed; idea was not further developed - reference?)
what is the standard, using something that already exists
embeds
discussed at another indieweb camp - see https://indieweb.org/2017/Nuremberg/law
block post- write a link with something for no webmention as link tag attiribute

you can always hide the webmention endpoint on a post per post basis, your CMS may not let you?

being able to talk about there people's sites without them receiving webmentions about it
robots.txt on domain
fair use
parse the url, infer hcard as showing information
oembed as permission
- meta tags in place to allow oembed to define how the content should show up on another website
oembed needs to be enable
need explicit plugin to publish
wordpress.com has it
web mention in a way claims control over the data- fuzzy/blurred topic
hierarchy of signals- a scorecard
- IP reputaion- new IP will be marked as spam, starts as a - score
not a new question
- public intrest, the information if out there makes it vulnerable for abuse
different countries = different laws
if by displaying all responses i recieve does it make it an open dialouge, taking out certain responses to match personal views
what do the silos do?
algorithom for how a webmention comes up
vouch trust topic
we should document which of the silos are defining comments belong to the poster of comment or the person who owns the source being commented on
example of a webmention missing context of a network: http://quickthoughts.jgregorymcverry.com/2018/10/28/hypatiadotca-if-i-come-off-as-a I was responding to a webmention not to Twitter and this thread: https://twitter.com/hypatiadotca/status/1056624889849737216

thus I missed much of the historical and cutural impacts of the first tweet, and was called out , rightfully so for jumping in where my voice wasn't needed, and for violating a norm of the network when I was just replying from my site... add notes

summary:

some of the aspects to consider

license (is licensing a solution? how could it be done? content license vs. "license" to process data?)
terms and conditions of silos (silos grant their users control over their data; API users are subject to that, how does this need to be considered?)
inferred permission from using a certain protocol (e.g. Google infers from use of AMP that site can be made available via AMP)
various spam aspects (both incoming, but also displaying webmentions that were not intended by the original author; everybody can send a webmention on behalf of somebody else)
oembed features (does provision of an oembed endpoint imply that the site owner is ok with webmentions to be displayed?)

potential starting points for exploration:

could a "vouch" mechanism help identify users whose webmentions are intentional? (or some other way this mechanism can be useful?)
if licenses etc. are to be used, what would be the machine-readable format; vs. how is the user made aware that their site is sending these?
use of metatags (comparable to "noindex" or "noarchive), either on page or link level - e.g. to express denial/permission to display a webmention?
some robot.txt (or similar) level default policy for an entire domain?
provision of oembed as permission for webmention display

discussion:

it's a wicked problem and there is no easy answer!
potentially a combination of factors (a "hierarchy of signals", like a score) could help siteowners to determine what to display
risk of creating an algorithm (that's what the silos do; but is an "indie-algorithm" really desirable?it would have to be very open and transparent for sure)

Participants

Notes

See Also