ACHDI what has been done so far

From XPUB & Lens-Based wiki
Revision as of 13:39, 17 April 2025 by Wordfa
Eyes on the Prize
sticker for event

Links

Github/Code -> https://github.com/FWORDIE/cssWhoDoneIt

Example spec sheet -> https://www.w3.org/TR/CSS2/

Description(?)

The web is not neutral, it is not anarchic, it is standardised. It is a tapestry built on the specifications of the World Wide Web Consortium, its member organisations, and their team of rotating editors. Although open in principle, the W3C is opaque in practice. Troves of information lie hidden in a labyrinth of publicly available repositories, emails, lists, and web pages. Through mass and unwieldy data scraping of CSS specifications, this project tries to illuminate the key players, their contributions, and hence their influence in shaping our collective web.

Initial Idea

The idea came from our discussions on A Dao of Web Design (no thanks to the author).

Arising from the idiom "Give a man a fish and he is fed for a day; teach him how to fish and he is fed for life," we decided to ask the question: "But who makes the fishhook?"

Through this special issue we delved into the materiality of the web and focused on different CSS properties. As a result, we got curious about who decides on the properties, how, when, and which companies they are affiliated with. (And trust me, after spending enough time browsing through the specs you keep seeing the same names over and over again.)

So we decided to scrape all the information we could get on who/when/from where etc. proposed and _____ the different CSS properties. Initially the aim was to turn all the scraped information into a plug-in (still a viable option, needs further discussion) that shows who came up with the properties used on the website you are currently visiting.

In any case, the code is shared on GitHub and our hope is that at the very least it can be used by anyone who wants to pull information from the W3C website and its significantly difficult-to-navigate archives in an organised manner.

We would appreciate any ideas/suggestions/questions on how we can present it, if we decide not to go with the plug-in idea.

What has been done so far

Scrapper.ts

disclaimer: this was the first trial and we changed approaches afterwards; the explanation of this part is kept here mostly for archival purposes

The idea was to write a TypeScript script to scrape all the CSS property data from their spec sheets, following each property individually. To reach a property's specs we used the shorthand property list found on MDN (developer.mozilla.org), first checked whether the property in question exists in the doc (via the index), then scraped the author (separating editor from co-editor took some time), the company the authors are from, the date, and the previous versions, and then wrote a loop to keep scraping the same data from the previous versions.
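As a sketch of the editor/co-editor separation step, a hypothetical helper might split a scraped byline into a name and an organisation (the real Scrapper.ts logic differs per spec layout; this is only an illustration):

```typescript
// Hypothetical helper: split a scraped byline like
// "Elika J. Etemad (Apple)" into a name and an organisation.
// The real Scrapper.ts handles many more layouts; this only
// sketches the name/company separation step.
function parseEditorLine(line: string): { name: string; org: string | null } {
  // Match a trailing "(Organisation)" if one is present
  const match = line.trim().match(/^(.*?)\s*\(([^)]+)\)$/);
  if (match) {
    return { name: match[1], org: match[2] };
  }
  // No organisation found: keep the whole line as the name
  return { name: line.trim(), org: null };
}
```
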

As more properties and more spec sheets were scraped, we started running into an issue. As already mentioned, we were also collecting the doc type, to understand whether a spec sheet was a Working Draft, Editor's Draft, etc. Although most of the initial specs of the same doc type shared the same HTML structure, problems started arising as the previous versions got older. This made us realise how many different functions (or if/else-if statements?) would be needed to scrape all the necessary information for a single property (especially considering that some properties exist in multiple docs that are all written in various different ways).

This resulted in Fred going "Wait a second, I bet we can just scrape all the spec sheets for the properties instead" (or something like that, I wasn't really there). So he started coding right away, which leads our journey to....

getSpecs.ts

So this starts by getting all the spec sheets from the W3C working/editor's drafts and docs at https://www.w3.org/TR/?filter-tr-name=Css (and there is apparently also https://drafts.csswg.org/, which holds all the working drafts). It scrapes every spec sheet we can get our hands on, plus their previous versions if they are linked, and then scrapes the information above.

[
  "https://www.w3.org/TR/css-grid-3/",
  "https://www.w3.org/TR/2025/WD-css-grid-3-20250207/",
  "https://www.w3.org/TR/2024/WD-css-grid-3-20241003/",
  "https://www.w3.org/TR/2024/WD-css-grid-3-20240919/",
  "https://www.w3.org/TR/2020/CRD-css-grid-1-20201218/",
  "https://www.w3.org/TR/2020/CRD-css-grid-1-20201021/",
... 1138
]
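The dated URLs in that list follow a recognisable shape (https://www.w3.org/TR/<year>/<STATUS>-<shortname>-<yyyymmdd>/). A hypothetical helper could break them apart; treat the pattern as an assumption inferred from the examples above, not a guaranteed W3C invariant:

```typescript
// Hypothetical: parse a dated W3C /TR/ URL into its parts.
// The URL shape is inferred from the list above; it is an
// assumption, not a documented W3C guarantee.
interface TrUrl {
  status: string;    // e.g. "WD", "CRD"
  shortname: string; // e.g. "css-grid-3"
  date: string;      // "yyyy-mm-dd"
}

function parseDatedTrUrl(url: string): TrUrl | null {
  const m = url.match(
    /^https:\/\/www\.w3\.org\/TR\/\d{4}\/([A-Z]+)-(.+)-(\d{4})(\d{2})(\d{2})\/$/,
  );
  if (!m) return null;
  return { status: m[1], shortname: m[2], date: `${m[3]}-${m[4]}-${m[5]}` };
}
```

Undated URLs like https://www.w3.org/TR/css-grid-3/ simply return null, which is a cheap way to tell "latest version" links apart from dated snapshots.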

getSpecInfo.ts

So this function can be run on any list of spec sheets to scrape the data out of them.

We are currently running the functions iteratively and fixing problems as they fail, since all the spec sheets are formatted differently.

So far we have: date, this spec URL, this doc name, and abstract (most of the time).

To run, we can say:

deno run -A getSpecInfo.ts test 10

This will run our functions for 10 random Spec Sheets from our list of Spec Sheets.

 let thisSpecsInfo: SpecSheet = {
      authors: await getAuthors($specSheet, specSheet),
      editors: await getEditors($specSheet, specSheet),
      date: await getDate($specSheet, specSheet),
      thisSpecUrl: specSheet,
      thisDocName: await getDocName($specSheet, specSheet),
      type: await getType($specSheet, specSheet),
      properties: await getProps($specSheet, specSheet),
      abstract: await getAbstract($specSheet, specSheet),
    };
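The wiki never shows the SpecSheet type itself; a plausible shape, inferred from the snippet above and the JSON example further down, might look like this (hypothetical, the real type in the repo may differ):

```typescript
// Hypothetical SpecSheet shape, inferred from the snippet above
// and the JSON output shown later on this page.
interface Person {
  name: string;
  org: string | null; // organisation, when one could be scraped
  link?: string;      // homepage or mailto link, when present
}

interface SpecSheet {
  authors: Person[];
  editors: Person[];
  date: string;          // ISO 8601, e.g. "2025-03-26T00:00:00+01:00"
  thisSpecUrl: string;
  thisDocName: string;
  type: string;          // e.g. "Candidate Recommendation Draft"
  properties: string[];  // CSS property names defined in the spec
  abstract: string | null;
}
```
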

Fred wrote a few tools so that we can run the function on a random subset of the specs (very useful considering we have reached over a thousand so far and it takes time), plus a tool to only run the function on the specs that were last scraped with an error. This saves a lot of time and makes it easier to go back to the functions and edit them, so that we can keep fixing the issues without needing to take 20-minute-long ciggie breaks all the time.

Fun features:

  • Script Collects and Categorises errors, so we can find issues quickly
  • Our CLI takes arguments e.g.
    • test + optional number > runs our script on randomly selected spec sheets
    • spec + url > runs script on one named spec sheet
    • broken > runs just the spec sheets that had errors last time
    • focus > only scrapes for certain params e.g. date, abstract etc
  • we have a progress bar:
Screenshot 2025-03-10 at 10.30.26.png
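The CLI arguments listed above could be parsed with something like the following sketch. This is hypothetical (in Deno the raw arguments come from Deno.args; the actual parsing in the repo, and the default count of 10, are assumptions):

```typescript
// Hypothetical parser for the CLI commands listed above
// (test / spec / broken / focus). In Deno the raw arguments
// come from Deno.args; the real parsing in the repo may differ.
type Mode =
  | { kind: "test"; count: number }
  | { kind: "spec"; url: string }
  | { kind: "broken" }
  | { kind: "focus"; params: string[] };

function parseCliArgs(args: string[]): Mode {
  switch (args[0]) {
    case "test":
      // Optional number of random spec sheets; default of 10 is an assumption
      return { kind: "test", count: args[1] ? Number(args[1]) : 10 };
    case "spec":
      return { kind: "spec", url: args[1] };
    case "broken":
      return { kind: "broken" };
    case "focus":
      return { kind: "focus", params: args.slice(1) };
    default:
      throw new Error(`Unknown command: ${args[0]}`);
  }
}
```
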

What needs to be done

  1. Finish the code
    • Authors
    • Editors (contributors?)
    • type
    • company the authors and editors are tied to
  2. Figure out how to present it
    1. Could it be a timeline of companies/editors over time?
  3. Write code to organise our data for printing
  4. Test Dot Matrix Printer

Where we are at

A 95%-correct JSON of all 1150 spec sheets, which looks like:

{
    "authors": [
      {
        "name": "Tab Atkins Jr.",
        "org": "Google",
        "link": "Http://www.xanthir.com/contact/"
      },
      {
        "name": "Elika J. Etemad / fantasai",
        "org": "Apple",
        "link": "Http://fantasai.inkedblade.net/contact"
      },
      {
        "name": "Rossen Atanassov",
        "org": "Microsoft",
        "link": "Mailto:ratan@microsoft.com"
      },
      {
        "name": "Oriol Brufau",
        "org": "Igalia",
        "link": "Mailto:obrufau@igalia.com"
      }
    ],
    "date": "2025-03-26T00:00:00+01:00",
    "thisSpecUrl": "https://www.w3.org/TR/css-grid-2/",
    "thisDocName": "CSS Grid Layout Module Level 2",
    "type": "Candidate Recommendation Draft",
    "properties": [
      "grid",
      "grid-area",
      "grid-auto-columns",
      "grid-auto-flow",
      "grid-auto-rows",
      "grid-column",
      "grid-column-end",
      "grid-column-start",
      "grid-row",
      "grid-row-end",
      "grid-row-start",
      "grid-template",
      "grid-template-areas",
      "grid-template-columns",
      "grid-template-rows"
    ],
    "abstract": "This CSS module defines a two-dimensional grid-based layout system, optimized for user interface design. In the grid layout model, the children of a grid container can be positioned into arbitrary slots in a predefined flexible or fixed-size layout grid. Level 2 expands Grid by adding “subgrid” capabilities for nested grids to participate in the sizing of their parent grids."
  }

Presentation

dot matrix for printing

what will the text output look like?

Julia Luteijn’s artwork ‘Kilo-girls’

I want it to have a serif font

op 1

date: 2002-08-02

type: working draft

properties: appearance, box-sizing, content, cursor, display, font, icon, key-equivalent, list-style-type, nav-index, nav-up, nav-right, nav-down, nav-left, outline, outline-color, outline-offset, outline-style, outline-width, overflow, overflow-x, overflow-y, resizer

Name: Tantek Çelik[#] Organization: Microsoft[#]

op 2

date: 2002-08-02 type: working draft

properties: appearance, box-sizing, content, cursor, display, font, icon, key-equivalent, list-style-type, nav-index, nav-up, nav-right, nav-down, nav-left, outline, outline-color, outline-offset, outline-style, outline-width, overflow, overflow-x, overflow-y, resizer

Name: Tantek Çelik[#] Organization: Microsoft[#]

op 3 (for multiple editors)

date: 2025-03-18 type: editor's draft

properties:

aspect-ratio, contain-intrinsic-block-size, contain-intrinsic-height, contain-intrinsic-inline-size, contain-intrinsic-size, contain-intrinsic-width, min-intrinsic-sizing

Editor: Tab Atkins [#] Organisation: Apple[#]

Editor: Elika J. Etemad, fantasai[#] Organisation: Apple[#]

Editor: Jen Simmons[#] Organisation: Apple[#]

op 4 (for multiple editors)

- - - - - - -

2025-03-18

Editor's Draft Properties Defined [#]

Tab Atkins (Contribution number [#]) Apple (Contribution number [#])

Elika J. Etemad, fantasai (Contribution number [#]) Apple (Contribution number [#])

Jen Simmons (Contribution number [#]) Apple (Contribution number [#])

alternatively

the properties can be formatted as a list, and then there will be more paper......

Option Decided Upon

this is what the paper looks like :)


also questions;

  1. the dates are recorded year first, shall we change it?
  2. Should we just write name or author?
  3. ohh so for the author and company when there are multiple, let's see

Written Text for project

Signal-2025-04-07-091216.png

Notes for new possible versions

  • For the experts to stand alone
  • you can ask the code to insert the number of authors, date, etc.
  • maybe try with different metaphors for each expert
  • you have loads of run-on sentences.

Imre is Obsessed with fishhooks

What if we imagine the web as a river? Access to it is provided by different browsers, and within it countless websites swim around like fish. Most of us who browse the web only visit the websites, take what we want from them, and leave. But we all know the old saying: “Give a man a fish and you feed him for a day. Teach him how to fish and you feed him for a lifetime.” We can all browse the river and get lost in millions of websites, eat them all, provided to us by the fishermen. Or we can start fishing for ourselves. But then the question becomes: who made the fishhook we are using, and who came up with the materials of the net? The W3C is where our fishers get their tools, and it is almost as old as fishing itself.


Fishing is a big industry; after all, aren’t our lives surrounded by the river that is the Web? The tools of fishing, of making and styling websites, are a game many desire to influence. So there are loads of organisations contributing to the W3C in the form of members. Only members’ employees can be part of the decision-making processes on what new tools will be available to developers. The process consists of rigorous sessions, and each session concludes with a draft. The journey starts with an Editor’s Draft and keeps going through different meetings until it is perfected for fishers’ use, as a Recommendation. These drafts are all open for the public to see and follow, so that fishers can chime in and those who only browse can understand the process if they ever wish to.


Creating websites, like fishing, can be pursued in many different ways: some do it as a hobby, some only try because a friend insisted it’s fun and cool, and some earn a living from it. Some fishers will only learn of new tools as they appear on the market, when, say, their friends start using them. Others will have ideas for certain tools and will try to reach the W3C. And of course there are those developers who are very enthusiastic and experienced. The right way to refer to them, according to the W3C, is “experts”; they are invited to contribute to the manufacturing of the hooks and the nets. These invited experts are called to this duty without financial compensation. They put in hours and effort purely out of passion and desire, and of course because they know what equipment the fishers need, as they need it themselves. Sometimes invited experts are funded by members whose interests align. However, our data has no way of showing when such cases arise, as the experts are under no obligation to declare it.


The browsers control access to the web: sometimes, even when new tools arise, they are not allowed to be used at certain points of the river, and sometimes, even before they are Recommended, they become very popular in areas accessed through certain browsers. But when it all started, the guy who found the river and built the fishhook-and-net factory just wanted the tools to be simple, to work at any part of the web, to be easy to control by any fisher, and for everyone to have a taste. It was designed for the free travel of all information. Look at the papers you hold in your hands now and get lost in the data. Make sense of who decides on the tools required to build websites.

Web as a Fashion Show

What if we imagine the web as our endless fashion show? Access to the runway is provided by different browsers, and within it countless websites display themselves. Most of us who browse the web only watch the websites walk by, take what we want from them, and leave. We can all watch the runway show, get lost in millions of websites, and even acquire some of the outfits provided to us by the designers. Or we can start creating them for ourselves; anyone who knows how to stitch two pieces of cloth together can join the runway with their own outfits. Of course it is possible to delve deeper, to create one outfit after another, to find new ways of stitching them together, new materials, innovative designs… The only caveat of the Fashion Show is that there is only a single place to acquire the materials used for the outfits: only one group, in close communication and collaboration with the browsers. They make the needles, threads, and fabrics we are using; they come up with the materials of the net. That place is called the W3C.


The Runway Show is a big industry; after all, aren’t our lives surrounded by the clothes that are the Web? The tools of stitching, of making and styling websites, are a game many desire to influence. So there are loads of organisations contributing to and funding the W3C in the form of members. Every organisation can be a member, as long as it is capable of paying the fees. Only members’ employees can be part of the decision-making processes on what new tools will be available to developers. The process consists of rigorous sessions, and each session concludes with a draft. The journey starts with an Editor’s Draft and keeps going through different meetings until it is perfected for designers’ use and presented as a Recommendation. These drafts are all open for the public to see and follow, so that dressmakers can chime in and those who only browse can understand the process if they ever wish to.


Creating websites, like fashion, can be pursued in many different ways: some do it as a hobby, some only try because a friend insisted it’s fun and cool, and some earn a living from it. Some designers will only learn of new tools as they appear on the market, when, say, their friends start using them. Others will have ideas for certain tools and will try to reach the W3C. And of course there are those developers who are very enthusiastic and experienced. The right way to refer to them, according to the W3C, is “experts”. Then (and only then) are they invited to contribute to the manufacturing of the threads and the sewing machines. These invited experts are called to this duty without financial compensation. They put in hours and effort purely out of passion and desire. Their input is valuable since they know what equipment the dressmakers need, as they need it themselves. Sometimes invited experts are funded by members whose interests align. However, our data has no way of showing when such cases arise, as the experts are under no obligation to declare it.


The browsers control access to the web: sometimes, even when new tools arise, they are not allowed to be displayed on certain parts of the Runway, and sometimes, even before they are Recommended, they become very popular in areas accessed through certain browsers.

But when it all started, the guy who founded the Fashion Show and built the stitching-materials factory just wanted the tools to be simple, to work at any part of the web, to be easy to control by any designer, and for everyone to have a look. It was designed for the free travel of all information. Look at the papers you hold in your hands now and get lost in the data. Make sense of who decides on the tools required to build websites.

Last Version

What if we imagine the web as our endless fashion show? Access to the runway is provided by different browsers, and within it countless websites display themselves. Most of us who browse the web only watch the websites walk by, take what we want from them, and leave. We can all watch the runway show and get lost in a sea of websites. Or we can start creating them for ourselves. Anyone who knows how to stitch two pieces of cloth together can join the runway with their own outfits. Of course it is possible to delve deeper, to create one outfit after another, to find new ways of stitching them together, new materials, innovative designs… The only caveat of the internet Fashion Show is that there is only one warehouse to acquire the materials used for the outfits. They make the needles, threads, and fabrics we have to use, and in doing so they design the materials of the net. That place is called the World Wide Web Consortium (W3C).

The Runway Show is a big industry; after all, aren’t our lives draped in the clothes of the Web? The tools we use to stitch and style websites are a game many desire to influence. So there are loads of organisations contributing to and funding the W3C in the form of members. Any organisation can be a member, as long as it can overcome the costs of joining. There are only two ways one can take part in the decision-making processes on which new tools will be available to developers: you can join either as a member's employee, e.g. a Google employee, or as an 'Invited Expert'. Invited Experts are individuals passionate about CSS and the web, and they either work for free or, when interests align, are funded by external companies and organisations. However, our data has no way of showing when such cases arise, as they are under no obligation to declare it.

This process of designing new features for CSS consists of rigorous sessions, each concluding with a draft. The journey starts with an 'Editor’s Draft' and keeps going through different meetings, as various 'Working Drafts', until it is perfected for designers' use and presented as a 'Recommendation'. Theoretically these specifications are open for the public to see and follow, so that dressmakers can chime in and those who only browse can understand the process if they ever wish to. That is, if they can find them. The specifications date back to 1994 and are scattered around the W3C's website, sometimes only reachable through links buried in other documents. There is no API and very limited search options. This project is an attempt to collect some of this data in one place, so that we can find some clues regarding the authorship of CSS, in other words 'A CSS WHODUNIT'.

Now you know all the people and companies who have influenced CSS through the years. How close are we to solving the puzzle? Even though it seems that the W3C holds the power when it comes to authoring the tools of web building, at the end of the day all it presents are Recommendations. It is the browsers (Chrome, Safari, Firefox, etc.) that choose whether or not to support arising features, and even whether to implement their own unaligned features. But when it all started, the people who founded the Fashion Show and built the original materials factory just wanted the tools to be simple, accessible, and transparent. It was designed for the free travel of all information, not for oligopoly, commerce, and control. Now we find ourselves trying to solve a murder mystery, and whichever arm we trace back seems to be attached to the same few companies.

Problems That Have Been Solved

How to do Italics

Taken from page 114 onwards of the manual for the OKI Microline 320, reached from the Dot matrix printing page.

// Special commands for IBM Proprinter emulation
const lineBreakReturn = String.fromCharCode(10);
const returnChr = String.fromCharCode(13);
const half = String.fromCharCode(27, 60);
const fast = String.fromCharCode(27, 62);
const italic = String.fromCharCode(27, 37, 71);
const underline = String.fromCharCode(27, 45, 49);
const normal =
	String.fromCharCode(27, 45, 48) +
	String.fromCharCode(27, 37, 72) +
	String.fromCharCode(27, 84); // No underline + no italics + no super/sub script
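Building on those constants, a small helper could wrap a string in italics and reset the printer state afterwards. This is a sketch assuming the same OKI Microline 320 / IBM Proprinter escape codes listed above, not the repo's actual printing code:

```typescript
// Sketch: wrap text in the italic escape sequence and reset
// styling afterwards. Assumes the OKI Microline 320 in IBM
// Proprinter emulation, as in the constants above.
const ESC = String.fromCharCode(27);
const italicOn = ESC + String.fromCharCode(37, 71); // ESC % G: italics on
const stylesOff =
  ESC + String.fromCharCode(45, 48) + // underline off
  ESC + String.fromCharCode(37, 72) + // italics off
  ESC + String.fromCharCode(84);      // super/subscript off

function italicise(text: string): string {
  return italicOn + text + stylesOff;
}
```
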

How to Print In Chunks
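This section is still a stub, but one plausible sketch: split the output into pages of roughly 60 lines (the lines-per-page figure in the notes below) and send each page to the printer separately. The paginate helper here is hypothetical, not the repo's actual printing code:

```typescript
// Hypothetical sketch of chunked printing: split the full output
// into pages of ~60 lines so each page can be sent to the printer
// separately. The real printing code may work differently.
function paginate(lines: string[], linesPerPage = 60): string[][] {
  const pages: string[][] = [];
  for (let i = 0; i < lines.length; i += linesPerPage) {
    pages.push(lines.slice(i, i + linesPerPage));
  }
  return pages;
}
```
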

Why are the dates coming funny?
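One likely culprit (an assumption, since this section is a stub): the scraped dates carry a timezone offset, and round-tripping them through a Date object shifts them into UTC, sometimes onto the previous calendar day:

```typescript
// The scraped date "2025-03-26T00:00:00+01:00" (from the JSON
// above) is midnight in a +01:00 zone; converting it to UTC
// moves it back an hour, i.e. onto the previous calendar day.
const scraped = new Date("2025-03-26T00:00:00+01:00");
console.log(scraped.toISOString()); // "2025-03-25T23:00:00.000Z"

// Safer: keep the original string and slice out the date part
// instead of round-tripping through a Date object.
const dateOnly = "2025-03-26T00:00:00+01:00".slice(0, 10); // "2025-03-26"
```
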

IMPORTANT NOTES

Lines per page: 60ish

Print Pitch: 12

Emulator mode: Proprinter

Print time per page: ??

Num of Lines: ~18000

Num of pages: ~ 300

Quality: UTL

Time per page: 60secs



Glossary specifically for İmre because what is going on???

Puppeteer: Puppeteer is a JavaScript library which provides a high-level API to control Chrome or Firefox over the DevTools Protocol or WebDriver BiDi. Puppeteer runs in headless mode (no visible UI) by default.

TypeScript: TypeScript adds additional syntax to JavaScript to support a tighter integration with your editor. Catch errors early in your editor.

DOM: The Document Object Model (DOM) is a programming interface for web documents. It represents the page so that programs can change the document structure, style, and content. The DOM represents the document as nodes and objects; that way, programming languages can interact with the page.

A web page is a document that can be either displayed in the browser window or as the HTML source. In both cases, it is the same document but the Document Object Model (DOM) representation allows it to be manipulated. As an object-oriented representation of the web page, it can be modified with a scripting language such as JavaScript.

Cheerio: Cheerio parses markup and provides an API for traversing/manipulating the resulting data structure. It does not interpret the result as a web browser does. Specifically, it does not produce a visual rendering, apply CSS, load external resources, or execute JavaScript which is common for a SPA (single page application). This makes Cheerio much, much faster than other solutions. If your use case requires any of this functionality, you should consider browser automation software like Puppeteer and Playwright or DOM emulation projects like JSDom.

async: The async function declaration creates a binding of a new async function to a given name. The await keyword is permitted within the function body, enabling asynchronous, promise-based behavior to be written in a cleaner style and avoiding the need to explicitly configure promise chains.

Found These:

https://github.com/w3c

https://dev.w3.org/cvsweb/