Data Ethics on the IndieWeb was a session at IndieWebCamp Berlin 2018.
Notes archived from: https://etherpad.indieweb.org/ethics
IndieWebCamp Berlin 2018
Session: Data Ethics on the IndieWeb
When: 2018-11-03 15:05
- Sven Knebel
- Sebastian Greger
- Jeremy Keith
- Tiara Miller
- Martijn van der Ven
- Peter Molnar
- Dylan Harris
- charlie owen
- Ton Zijlstra (remote, reading the notes)
- Greg McVerry
- goal of the session is not to discuss legal questions, nor GDPR; we simply want to start a discussion about cases where IndieWeb sites use data about other people, which potentially comes with a responsibility to deal with it ethically.
- comes with big questions: who defines what is ethical? how can a shared understanding be established in a community-driven open source movement like this, how can it be rooted in practice?
- no simple answer but worthwhile to inject the debate into indieweb efforts (session goal is discussion, not solution)
- when pulling likes and retweets to display on a website, there was an issue: a user who hasn't heard of the IndieWeb isn't aware that their name would be used on the website it was being pulled to
- was using pixel images to link back to original post
- do indie websites fall under GDPR? We are not lawyers, so don't get into that (personal websites don't fall under GDPR, so you don't need a privacy statement, for example, but that doesn't mean you can post personal data) (depends on what your lawyer thinks "household activity" is)
- there is not really a difference between processing and storing the content or data (again, not-a-lawyer disclaimer)
- there is no distinction between storing and processing in GDPR
- should permission be sought to share the data, or is it public simply because it is on the web?
- where do you define the line for this?
- Twitter users grant Twitter the right to use their data and don't have ownership
- from a use case point of view: you make a post to another person's site, realize it wasn't a good idea and want to delete it. Can you?
- yes, by returning "Gone" (HTTP 410) on the server, so it is removed
- private copy is ethically a different thing
- ethics on the IndieWeb, not the legality of it
- hard to speak about the legality of it because right now there is only basic information and it is not very fine-tuned. Need test cases for this
- what is the data we process, what indieweb-protocols are potentially affected?
- how do we deal with it? does it need new mechanisms?
- who decides this on an open source IndieWeb?
- is the online realm a public or private space?
- if someone is on Twitter, visible to all, do they expect that their content can be harvested by third parties, or do they think it will only be shared amongst friends?
- can people be trusted to make the decision about sharing or processing personal data?
- spirit of the law can be more murky
- calculated vs not calculated
- What data is theirs to control?
- petermolnar: you should license your content all of the time
- licensing in social media is a mess: you give rights to Facebook to republish, but does that give others the right to do so?
- having a public Instagram left images open to be used elsewhere and harvested. The account needed to be made private to get them removed
- knowing the worst case scenarios
- making informed choices
- can things be used if on public domains?
- don't use the information unless it is explicitly said that it can be used
- looking further into whether there is a license instead of inferring that there isn't
- inferences: that they pinged me and that they have the reference
- terms and conditions
- how broad can these be? is there a better way to do it?
- actively sharing
- Superfeedr feed
- anyone can send a webmention to anyone
- licensing on its own in microformats: need to know what it means
- URL: can you share the URL without an ethical issue?
- spam: hundreds of webmentions as a type of attack vector
- webmentions spam hypothetical, but possible, haven't found any historical evidence yet
- Twitter grants users the right to remove their likes, retweets etc.
- flags: share okay, republish okay
- robots.txt files
- noindex, noarchive meta tags
- Linux Foundation: short identifiers for licenses (https://spdx.org/licenses/)
- meta tag for micro information on licenses/redistribution (there apparently had been an existing proposal on the indieweb, to use an attribute on a-tags to express that a webmention should not be sent/processed; idea was not further developed - reference?)
- what is the standard, using something that already exists
- discussed at another indieweb camp - see https://indieweb.org/2017/Nuremberg/law
- block post: write a link with something for "no webmention" as a link tag attribute
- you can always hide the webmention endpoint on a post-by-post basis, though your CMS may not let you
- being able to talk about other people's sites without them receiving webmentions about it
- robots.txt on domain
- fair use
- parse the URL, infer h-card as showing information
- oembed as permission
- meta tags in place to allow oembed to define how the content should show up on another website
- oembed needs to be enabled
- need explicit plugin to publish
- wordpress.com has it
- a webmention in a way claims control over the data: fuzzy/blurred topic
- hierarchy of signals: a scorecard
- IP reputation: a new IP will be marked as spam, starts with a negative score
- not a new question
- public interest: if the information is out there, it is vulnerable to abuse
- different countries = different laws
- does displaying all responses I receive make it an open dialogue, as opposed to taking out certain responses to match personal views?
- what do the silos do?
- algorithm for how a webmention comes up
- vouch trust topic
- we should document which of the silos define comments as belonging to the poster of the comment, and which as belonging to the person who owns the source being commented on
- example of a webmention missing the context of a network: http://quickthoughts.jgregorymcverry.com/2018/10/28/hypatiadotca-if-i-come-off-as-a I was responding to a webmention, not to Twitter and this thread: https://twitter.com/hypatiadotca/status/1056624889849737216
thus I missed much of the historical and cultural impacts of the first tweet, and was called out, rightfully so, for jumping in where my voice wasn't needed and for violating a norm of the network when I was just replying from my site
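The robots.txt and noindex/noarchive ideas from the discussion could be sketched roughly as follows. This is a minimal Python sketch using only the standard library. Treating these signals as an opt-out from webmention display is an assumption of the sketch, not an established IndieWeb convention, and the function name `may_display` and the user agent string are hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            for token in (attr.get("content") or "").split(","):
                self.directives.add(token.strip().lower())


def may_display(source_url, source_html, robots_txt,
                agent="example-webmention-receiver"):
    """Return True if neither robots.txt nor robots meta tags object.

    Assumption: a receiver chooses to read these existing signals as
    "please do not display or archive this page elsewhere".
    """
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(agent, source_url):
        return False
    meta = RobotsMetaParser()
    meta.feed(source_html)
    # "none" is the usual shorthand for "noindex, nofollow".
    return not ({"noindex", "noarchive", "none"} & meta.directives)
```

A receiver would fetch the source page and the domain's robots.txt first, then call `may_display` before showing the mention. Reusing existing signals avoids inventing a new format, but it also stretches their original meaning, which is exactly the kind of inferred permission the session questioned.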
some of the aspects to consider
- license (is licensing a solution? how could it be done? content license vs. "license" to process data?)
- terms and conditions of silos (silos grant their users control over their data; API users are subject to that, how does this need to be considered?)
- inferred permission from using a certain protocol (e.g. Google infers from use of AMP that site can be made available via AMP)
- various spam aspects (both incoming, but also displaying webmentions that were not intended by the original author; everybody can send a webmention on behalf of somebody else)
- oembed features (does provision of an oembed endpoint imply that the site owner is ok with webmentions to be displayed?)
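On the licensing aspect, one possible machine-readable check is to look for `rel="license"` links on the source page and map them to SPDX short identifiers. The Python sketch below assumes a tiny, hand-picked URL table (a real receiver would need a much fuller list), and the function name `declared_license` is hypothetical. Following the "don't use unless explicitly allowed" position from the discussion, an unrecognised or missing license yields `None`.

```python
from html.parser import HTMLParser

# Illustrative subset only; the values are SPDX short identifiers.
SPDX_BY_URL = {
    "https://creativecommons.org/licenses/by/4.0/": "CC-BY-4.0",
    "https://creativecommons.org/licenses/by-sa/4.0/": "CC-BY-SA-4.0",
    "https://creativecommons.org/publicdomain/zero/1.0/": "CC0-1.0",
}


class LicenseLinkParser(HTMLParser):
    """Collect href values of <a> and <link> tags carrying rel="license"."""

    def __init__(self):
        super().__init__()
        self.license_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if "license" in rels and attr.get("href"):
            self.license_urls.append(attr["href"])


def declared_license(html):
    """Return the SPDX identifier of the first recognised license link.

    Returns None when no license is declared or none is recognised, so
    that absence of a license is never read as permission.
    """
    parser = LicenseLinkParser()
    parser.feed(html)
    for url in parser.license_urls:
        normalised = url.rstrip("/") + "/"  # tolerate a missing trailing slash
        if normalised in SPDX_BY_URL:
            return SPDX_BY_URL[normalised]
    return None
```

This only answers what license a page declares. Whether a content license also covers processing someone's name and avatar in a displayed webmention is the open question raised above.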
potential starting points for exploration:
- could a "vouch" mechanism help identify users whose webmentions are intentional? (or some other way this mechanism can be useful?)
- if licenses etc. are to be used, what would be the machine-readable format; vs. how is the user made aware that their site is sending these?
- use of meta tags (comparable to "noindex" or "noarchive"), either on page or link level - e.g. to express denial/permission to display a webmention?
- some robots.txt (or similar) level default policy for an entire domain?
- provision of oembed as permission for webmention display
- it's a wicked problem and there is no easy answer!
- potentially a combination of factors (a "hierarchy of signals", like a score) could help siteowners to determine what to display
- risk of creating an algorithm (that's what the silos do; but is an "indie-algorithm" really desirable? it would have to be very open and transparent for sure)
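To make the "hierarchy of signals" idea concrete, here is a minimal Python sketch of a scorecard. All signal names, weights and thresholds are invented for illustration and were not proposed in the session. Every mention starts slightly negative, echoing the IP reputation note, and the weights sit in one plain table so that the "algorithm" stays open and inspectable.

```python
from dataclasses import dataclass

# Hypothetical values: every mention starts below zero and has to
# earn its way up; an explicit opt-out outweighs everything else.
BASE_SCORE = -1
WEIGHTS = {
    "vouched": 2,                   # someone the receiver trusts vouches
    "license_allows_republish": 2,  # source declares a permissive license
    "offers_oembed": 1,             # source provides an oembed endpoint
    "known_sender": 1,              # sender has been seen before
    "robots_opt_out": -5,           # noindex/noarchive or robots.txt block
}


@dataclass
class Signals:
    vouched: bool = False
    license_allows_republish: bool = False
    offers_oembed: bool = False
    known_sender: bool = False
    robots_opt_out: bool = False


def score(signals):
    """Sum the weights of all signals that are set, plus the base score."""
    return BASE_SCORE + sum(
        weight for name, weight in WEIGHTS.items() if getattr(signals, name)
    )


def decide(signals, display_at=2):
    """Map a score to 'display', 'moderate' or 'hide'."""
    total = score(signals)
    if total >= display_at:
        return "display"
    if total < 0:
        return "hide"
    return "moderate"
```

For example, `decide(Signals(vouched=True, known_sender=True))` scores 2 and returns "display", while a mention with no signals at all scores -1 and is hidden. The middle "moderate" band leaves the final call to the site owner rather than to the score.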