User:Simon/Trim4/Extracting text from the web

From XPUB & Lens-Based wiki

Latest revision as of 16:34, 20 June 2020

11.11.19 Extracting text using curl

curl is a command-line tool that fetches the contents of a URL from the terminal. Its output can be piped into software such as pandoc to convert the text to other formats, which comes in quite handy for a workflow I'm starting to develop.

I'm writing text on the pad and then converting it to markdown. This extra step isn't necessary (in fact it adds to the work), but I'm interested in using pads as multi-flow publishing tools in the future, so I'm testing this out. Also, using a pad allows me to style the text simply, using markdown rather than HTML.

For example, this is a file I made from some notes on a Flusser interview about linear writing:

$ curl https://pad.xpub.nl/p/flusser_interview_notes/export/txt | pandoc -t markdown > flusser.md
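
The same pipeline could be wrapped in a small shell function so that other pads can be pulled the same way. This is only a sketch: the names PAD_BASE, pad_export_url and pad_to_markdown are made up for illustration, and it assumes every pad exposes the same /export/txt address as the one above.

```shell
#!/bin/sh
# Sketch only: PAD_BASE and both function names are illustrative assumptions.
PAD_BASE="https://pad.xpub.nl/p"

# Build the plain-text export URL for a pad name.
pad_export_url() {
    printf '%s/%s/export/txt\n' "$PAD_BASE" "$1"
}

# Fetch a pad and write it out as markdown.
# curl flags: -f fail on HTTP errors, -s hide the progress meter, -S still print errors.
pad_to_markdown() {
    pad="$1"
    out="${2:-$pad.md}"   # default output name: <pad>.md
    curl -fsS "$(pad_export_url "$pad")" | pandoc -f markdown -t markdown -o "$out"
}
```

With these defined, `pad_to_markdown flusser_interview_notes flusser.md` would run the same steps as the command above. The silent curl flags keep the progress meter out of the terminal while the output is being piped.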

I'm then storing the files in my git (https://git.xpub.nl/simoon/thesis), which is public. Having texts in git lets me use its versioning capabilities: I can go back over earlier versions of a file in the file tree and copy-paste snippets that I may want to bring back in the future.
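
Going back over old versions can be sketched with a few git commands. The function names below are hypothetical helpers, not part of git, and they assume they are run inside an existing repository.

```shell
#!/bin/sh
# Sketch only: the function names are illustrative; run them inside a git repository.

# Stage one file and commit it with a message.
save_version() {
    git add -- "$1" && git commit -q -m "$2"
}

# List the commits that touched a file, newest first.
list_versions() {
    git log --format='%h %s' -- "$1"
}

# Print a file as it was at a given commit
# (e.g. HEAD~1, or a hash taken from list_versions).
show_version() {
    git show "$1:./$2"
}
```

For instance, `show_version HEAD~1 flusser.md` would print the file as it stood one commit ago, so a snippet can be copied from it without changing the current file.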