2018/Berlin/datalake

Digital Archiving was a session at IndieWebCamp Berlin 2018.

IndieWebCamp Berlin 2018
Session: Digital Archiving
When: 2018-11-03 14:10

Participants

media vs format
licensing is interesting: public, permissively licensed content has a chance of getting additional copies made elsewhere
media:
- reliability of media vs reliability of interfaces
- Tape is awesome in reliability/storage duration, but readers are expensive and need to be maintained too
- optical: CDs are unpredictable, DVDs (except maybe M-DISC?) are less good. Readers are available long-term. Blu-ray: archival grade is available, cheap ones use organic storage layers. The media is expensive too.
- Flash (SD, SSD, CF, ...) loses data (charge) if left unpowered for an extended period (~5 years); for SSDs I've seen numbers in the range of months if not stored in cold places
- HDDs are relatively cheap, sturdy even if stored powered off
digital:
- text is great
- common, simple formats are good (jpg, png)
- PDF is complex; if you want to store something for printing, use LaTeX (https://www.latex-project.org/) or PostScript
- potentially: archive the software too if possible (trickier for modern, cloud-connected software); if archiving source code, archive the compiler and all dependencies as well
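
The format notes above could be turned into a quick audit of what is already in an archive. This is only a rough sketch: the extension groupings and the names `PREFERRED`, `COMPLEX`, `classify` and `audit` are assumptions made for illustration, not something agreed on in the session.

```python
from pathlib import PurePosixPath

# Assumption: illustrative groupings loosely based on the notes above,
# not an authoritative list of archival-safe formats.
PREFERRED = {".txt", ".jpg", ".jpeg", ".png", ".tex", ".ps"}
COMPLEX = {".pdf"}


def classify(path):
    """Return 'preferred', 'complex' or 'unknown' based on the file extension."""
    suffix = PurePosixPath(path).suffix.lower()
    if suffix in PREFERRED:
        return "preferred"
    if suffix in COMPLEX:
        return "complex"
    return "unknown"


def audit(paths):
    """Group the given paths by classification."""
    report = {"preferred": [], "complex": [], "unknown": []}
    for path in paths:
        report[classify(path)].append(path)
    return report
```

Files that end up under "unknown" are the ones worth a second look, for example to check whether the software needed to open them should be archived alongside them.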

again audience: if there are external users, a central, somewhat standardized hierarchy is useful.
full-text search
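
One way full-text search over archived plain text might look is a small inverted index. This is a minimal sketch, assuming the documents are already loaded as a mapping from name to text; `tokenize`, `build_index` and `search` are hypothetical helper names, and a real archive would more likely rely on an existing search tool.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it.

    documents: dict of document name -> document text.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Because the index only needs plain text, this kind of search fits best with the simple formats mentioned above; complex formats would first need a text-extraction step.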
