Revision as of 11:43, 21 November 2023

Fragmented Media & workflows

Rosa and Riviera looked, individually and collectively, into the Media Fragments URI specification. This wiki page lists small steps, projects, tools and things that can be applied to the Worm Archive from any perspective (radio maker, archivist, listener, etc.), as part of Special Issue 22.

Ideally, each step/thing/tool/project is readable, self-documented, and applicable in a non-linear way. The aim is not to add too much to the archive in terms of size and complexity. For now, we have defined the following steps/things/tools/projects:

Transcribing audio via an etherpad using a greasemonkey script
Adding media fragments to transcriptions
Translating transcriptions into WebVTT / OGGKate formats
Dynamically adding VTT cues to a video element through the TextTrack API
Being able to listen to the worm archive
Being able to navigate the worm archive (audio files) using various chapters
Being able to view real-time transcriptions/events of an element in the worm archive
Being able to, as a participant in the chat, live transcribe radio
Being able to view valuable metadata of an audio file?
Translate the transcriptions into a zine
Being able to transcribe a mixcloud?
- yt-dlp on the one hand, and uploading to YouTube for an automatic transcript on the other
Translating SRT to VTT
tool that creates an artwork/thumbnail based on ffmpeg info
Gather additional metadata from mixcloud

This script (https://hub.xpub.nl/chopchop/~vitrinekast/python-mixcloud) scrapes Radio Worm's archive on Mixcloud and stores all entries as a JSON file.
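
The SRT-to-VTT step in the list above could be sketched roughly as follows. This is a minimal sketch, not the tool used on this page: the function name srt_to_vtt is hypothetical, and it assumes well-formed SRT input with hh:mm:ss,mmm timestamps. The two formats are close, so the conversion mostly swaps the comma before the milliseconds for a full stop and adds the WEBVTT header.

```python
import re

# SRT writes milliseconds after a comma (00:00:01,000);
# WebVTT writes them after a full stop (00:00:01.000).
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert the text of an SRT file to WebVTT text (hypothetical helper)."""
    # Drop a byte-order mark if present and normalise line endings.
    body = srt_text.lstrip("\ufeff").replace("\r\n", "\n").strip()
    # Rewrite only the timing lines; cue text is left untouched.
    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
    # The SRT cue numbers are kept; WebVTT accepts them as cue identifiers.
    return "WEBVTT\n\n" + body + "\n"


sample = "1\n00:00:01,000 --> 00:00:04,000\nNever drink liquid nitrogen.\n"
print(srt_to_vtt(sample))
```

SRT styling tags or positioning hints, if an automatic transcript contains any, are passed through unchanged and may need separate clean-up.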

Each project/tool/thing/step could end up as a manual (man)

Transcribing

Is transcribing the same as annotating? What methods of describing a broadcast are valuable pieces of metadata?

The end

A sample WebVTT file could look something like this:

1 WEBVTT
2 
3 00:01.000 --> 00:04.000
4 - Never drink liquid nitrogen.
5 
6 00:05.000 --> 00:09.000
7 - It will perforate your stomach.
8 - You could die.

Notice the structure of lines three and six. The text ‘Never drink liquid nitrogen’ appears for three seconds, as determined by the begin and end timestamps. The end timestamp is required; without it, the timed transcript will not render at all. By comparison, the Media Fragments standard also uses begin and end timestamps, but writes them as a comma-separated pair in the URL's #t= fragment. For example, the following media fragment URL will play the audio file for just over two minutes, from 12:33.170 to 14:33.375.

https://hub.xpub.nl/chopchop/worm/xpub/radioworm-20231107.ogg#t=12:33.170,14:33.375
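
To check the timing of a fragment like the one above, the #t= part can be read out of the URL and converted to seconds. The following is a rough sketch using only the Python standard library; the function names are hypothetical, and it only handles the plain ss, mm:ss and hh:mm:ss time notations (with an optional npt: prefix), not the other time formats the standard allows.

```python
from urllib.parse import parse_qs, urlsplit


def npt_to_seconds(value: str) -> float:
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' (optional fraction) to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_temporal_fragment(url: str):
    """Return (start, end) in seconds, or None if the URL has no t= fragment.

    A missing start counts as 0.0; a missing end is returned as None.
    """
    values = parse_qs(urlsplit(url).fragment).get("t")
    if not values:
        return None
    spec = values[-1]
    if spec.startswith("npt:"):
        spec = spec[len("npt:"):]
    start_text, _, end_text = spec.partition(",")
    start = npt_to_seconds(start_text) if start_text else 0.0
    end = npt_to_seconds(end_text) if end_text else None
    return start, end


url = "https://hub.xpub.nl/chopchop/worm/xpub/radioworm-20231107.ogg#t=12:33.170,14:33.375"
start, end = parse_temporal_fragment(url)
print(f"plays from {start:.3f}s to {end:.3f}s, {end - start:.3f}s in total")
```

For this URL the start is 753.170 seconds and the end is 873.375 seconds, a span of about 120.2 seconds, which matches the "just over two minutes" described above.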

Tasks

☐ (Edit the Greasemonkey JavaScript) [1]
☐ Making a VTT Previewer
☐ Bash Script for translating mediafragment-based transcripts to WebVTT formats
☐ Check if there are other text files on the archive somewhere?
☐ OGGKate, still relevant?
☐ Audacity labels, still relevant?

Thinking

Could this change not only a listening experience, but also a making experience?

Links

otter.ai Webvtt


Prototyping nov 14