2018/Baltimore/interactivetranscripts

Building Blocks (draw it/diagram it) was a session at IndieWebCamp Baltimore 2018.

Notes archived from: https://etherpad.indieweb.org/interactivetranscripts

IndieWebCamp Baltimore 2018

Session: Interactive Transcripts

When: 2018-01-20 15:05

Participants

Matt Griffin (facilitator)
Mattww
derekfields.is
Aaron Parecki
adambachman.org
Kim L
amyhurst.com
dariusmccoy.com
Lydia
Eddie Hinkle
Chris Aldrich
Add yourself here… (see this for more details)

Notes

A few points of reference for how time markers are used in other areas:

VIDEO EDITORS - video editors use markers that have kind (comment, chapter, shot, subtittle, etc.), in-time, out-time, duration (in place of out-time), and similar, and can be exported and imported using various standard EDL and XML formats.
- there are interchange file format standards, but a lot of the marker creation and use takes place in specific methods associated with a particular nonlinear editor (NLE)
SUBTITLE FILES - online video and media publishing have other formats (several) for creating and publishing subtitles (subtitles that can be pulled as you go, or converted from text to bitmap to baked into video, etc. (.SRT, .SUB, .VTT formats)
- in past, subtitle approaches were a different beast than video editor markers in many cases (standards evolved separately), but as youtube/vimeo/etc evolve time reference tools for clips and subtitle sync, there may be convergence based on "healthiest" strategies from video markers and subs
MUSIC EDITORS - music have audio cues and MIDI cues (among other tools) that can include annotations and things that are not sound or music into the editing track
- HYBRID Text Marker - Media Marker strategies - O'Reilly Oriole (example: https://www.safaribooksonline.com/oriole/regex-golf-with-peter-norvig ) mixes page navigation with cuing markers for trigger media as well as navigating to bookmarks within that media.
as you scroll, videos will start, elements will move in or out, become interactive, ...

A few areas that might be interesting to indieweb uses

Text-based time markers - for capturing markers and also for using these markers to visualze that there are markers
Plaintext note formats for annotations/notes that could be cued by a media player
Working with "raw" recordings and standards for referencing clips from it, to reinforce importance of sharing full documentation without losing the power of referencing specific highlights

And a few special cases to "test" our discussion on:

raw interview or panel recording (for example, at a conference)
recorded 1st person personal vlog journaling (along with edit down to what is worth publishing, without erasing the document of the piece as a whole)
concerts and other live performance recordings - where you want to draw attention to a specific aspect that might not be recognized as special without curatorial help

http://www.videonot.es/

shows videos from several services (YouTube, vimeo, coursera, Udacity, ...) and lets you type notes along with time sequence
stores the notes in service of your choice (e.g. Dropbox, Google Drive?)
https://github.com/UniShared/videonotes

Matt is interested in tools that work with plaintext without relying on a service. It's easy to archive video and text.

There are standard, text-only formats for subtitles video on the web (WebVTT / .vtt) and embedded into video files (.SRT subtitle files)

Marty uses various tools and approaches to synching text and audio

This Week in the IndieWeb Audio Edition podcast is based on a written script, so has a full text transcript
uses Gentle forced audio aligner to get time-synced transcript data
audiograms - videos made from short ~1 minute audio clips plus text transcripts to bake in subtitles
- example: https://martymcgui.re/2017/12/26/105811/
- good exerpts for sharing on social networks where audio support is not first-class
native browser video subtitle support with WebVTT format files
- example: https://martymcgui.re/2017/09/29/221157/
- "trick" a video element by giving it an audio source and a poster image
- browser natively renders subtitles
- difficulty: his toolset for editing long transcripts is difficult
other folks are doing this kind of thing. Gretta creates interactive transcripts from long podcasts. chunks of text are annotated with who is speaking. each has a "share as video" link.
- example: https://www.gretta.com/1262100467/10010024/#t=PT33.4S
eventually he will write better tools to serve his needs, but is okay for now with what he has hacked together

Consideration - allowing audio to be on equal footing as text "First class citizen"

Adam mention "videoqueue" or similar -- hopefully he

Marty mentioned gretta.com that offers a service for transcribing the audio, and allowing you to navigate the edited transcript, and use that to export the media associated with that text

Amy talked about video coding in the social sciences, especially the open source BORIS

allows coding of body language and other physical phenomena and activity
creates a time-coded transcript. even includes measuring tools!

Matt mentioned trint.com (in the near future)

Aaron Parecki mentioned media fragments

https://www.w3.org/TR/media-frags/
standard in the browser
supported by youtube, we use it on wiki pages for IWC sessions in long videos like the demos
Aaron Parecki wrote a javascript shim that copies any media fragment from the page URL to the video/audio element's URL. Marty McGuire uses this on his site and is aware of a couple of folks having linked to excerpts of his podcast episodes.

Amherst College used to have a platform called Vidbolt that allowed users to annotate YouTube video and share comments either publicly or with a specific community. (Seems to be down now, but the creators may have opensourced it or could be talked into it.)

Bottom line: it's hard to make text transcripts from audio. Aaron Parecki has been at events w/ a live (but remote on skype) transcriptionist. YouTube automatic transcripts are bad, but they do alignment from existing text well.

$60 - $100 / hr for a transcriptionist

Matt G says trint works this way as well. Also lets you do things like edit the text transcript to get an edited audio file. E.g. remove the "um"s from the transcript and they're gone from the audio.

Another interesting video annotation tool & review: Using Video Annotation Tools to Teach Film Analysis

Question(?) about live commentary / annotation. Some conferences do this.

Some live video streaming platforms support timestamped chat annotations that will play alongside the video during playback.

similarly, YouTube live (and FB live) will play back likes and emoji responses that were made during the recording

An example of videos as a feedback tool. Students don't like to watch videos of themselves to learn from them.

Maybe they'd read a text transcript?
Or maybe an instructor send them a link to a snippet.
- "You said 'Um' a lot here." "Here's a clip where you sounded confident."
Use BORIS to mark nervous behaviors

Open Source Python LIbrary that will help align source text with a video file: https://github.com/lowerquality/gentle

Combining with IndieWeb tech? Anyone linking to audio fragments as references?

the recently found tantek mf1 talk on archive.org is audio only. not being cited with media frags as far as we know, though.
Marty McGuire's podcast episodes use Aaron Parecki's music and he sends webmentions to the posts where the original audio came from.
- these could have media fragments to jump right to the portion of the audio that contains the music

amyhurst.com mentions Amazon's XRay when you're watching TV through their apps.

Who is that speaking on screen?
where was this filmed?
what is this song?

Overcast podcast app allows share-with-timestamp. Eddie Hinkle wants to use that to share snippets with commentary on his site that allow you to listen to it right in the site.

Marty McGuire also wants this!

Q&A between Matt G and amyhurst.com: How difficult is BORIS to use?

must be able to watch a video, use a mouse, type.
can get more complicated depending on what you're looking for.
some use foot pedals to stop/start video so they can focus on typing.
can export markers as plain text? yes, with timecodes.
how to present is the challenge

Some chat about associating multiple data streams together. Annotating video of water level markers, animal behavior, ...

Participants

Notes

See Also