Image created with Adobe Photoshop
Common Publishing Formats#
If you, an ordinary person like me, wanted to write a book, what tools would you use?
Before you answer, think carefully about the problems you might run into, and about how you would write and preview the book quickly...
Well, I guess you would probably say LaTeX + PDF, or Word.
No one would be crazy enough to write in Markdown, convert it to HTML, and then convert that to XML. After all, you don't have many options if you want the printed output to look right. Now think about how many technologies you know and have used to build a webpage.
Are you surprised? Why don't we have any other options for publishing a book nowadays?
Why is PDF the final choice?#
In 1991, John Warnock, one of the co-founders of Adobe, proposed a system to make documents easy to distribute. At that time, LaTeX, DVI, and Unix machines were still dominant. We all know about LaTeX... It's a long way from having a .tex file to successfully printing it. So Adobe wanted to create a format that would reduce the mental burden on end users.
Some readers may not understand why LaTeX lost... Let's take a lovely college student named Wei as an example. He wanted to print his senior brother Jie's thesis so he could read it.
Then he got his brother's .tex file and needed to find the packages his brother had used.
(Then he found out that one of the packages was written by his brother himself, but he didn't know where he had stored the floppy disk with the source code...)
Then he spent a lot of effort searching through various forums to find all the packages, only to find that the .sty file was corrupted.
(He asked his brother, and his brother said he hadn't touched the .sty file...)
He found out that his brother had used a LaTeX distribution developed exclusively by his university laboratory, and once the .sty file was brought into it, things seemed to work...
Then he compiled the LaTeX source...
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
Error line 114514 !Text line contains an invalid character.
It turned out that the package had been updated in the meantime...
He shouted, "Jie, no!"
PDF greatly alleviated this kind of problem.
The biggest contribution of PDF is that it bundled everything needed to reproduce a page into one file: a fixed layout, defined image formats, a set of metadata, embedded and unified fonts, and a drawing model inherited from the powerful PostScript, which can create many fantastic effects.
So where did this good thing go wrong?
Patent Issues#
In 2008, PDF 1.7 was published as the ISO 32000-1 standard, and Adobe issued a public patent license covering the patents needed to make, use, and distribute PDF-compliant applications.
Wait, this thing has a patent!?
Yes!
Not only does it have a patent, but it is also a buyout patent!
In other words, poor labs like Jie's can no longer release their own distribution built on PDF technology; otherwise they will face lawsuits.
And there is another major problem: the technical reference implementation they provide... actually relies on proprietary technology! My goodness!
Font Patents#
Bitmap fonts looked jagged on the sharper, more advanced laser printers. To solve this, Adobe introduced the Type 1 and Type 3 font formats. Adobe stored the outline data of these fonts using encryption algorithms and keys. Want to print smooth fonts? Please pay a high license fee to Adobe.
Although Adobe's technology was cracked as soon as it was released, if they find out that you are using it... get ready to hire a lawyer. I believe your lawyer is not as good as Adobe's.
Image Patents#
Adobe has countless image patents, whether it's compression, formats, filters, or decoders. If you accidentally use Adobe's technology to process your images... you'll have to pay a licensing fee.
But these two kinds of patents are not the worst. The worst are the patents around PostScript.
What is PostScript? The love-hate relationship between Microsoft, Apple, and Adobe#
PostScript is a stack-based interpreted language that looks a lot like Forth. It grew out of ideas that Adobe's founders had been working on since the late 1970s, was released by Adobe in 1984, and was stuffed into nearly every laser printer as Adobe's influence expanded.
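To make "stack-based" concrete, here is a minimal sketch of a postfix evaluator in Python. It is a toy model, not a PostScript interpreter: the function name `run` is made up for illustration, and only a handful of operators are handled, chosen because operators with these names (`add`, `sub`, `mul`, `dup`, `exch`, `pop`) also exist in PostScript.

```python
def run(program: str) -> list[float]:
    """Evaluate a whitespace-separated postfix program and return the final stack."""
    stack: list[float] = []
    binary_ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    for token in program.split():
        if token in binary_ops:
            b = stack.pop()            # top of the stack is the right operand
            a = stack.pop()
            stack.append(binary_ops[token](a, b))
        elif token == "dup":
            stack.append(stack[-1])    # duplicate the top element
        elif token == "exch":
            stack[-1], stack[-2] = stack[-2], stack[-1]   # swap the top two
        elif token == "pop":
            stack.pop()                # discard the top element
        else:
            stack.append(float(token))  # anything else is treated as a number
    return stack


print(run("3 4 add 2 mul"))   # [14.0]
```

Operands are pushed first and operators consume them afterwards, which is why `3 4 add` reads "backwards" compared with `3 + 4`.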
Some people may wonder: why is a programming language needed to print an image?
Because as Adobe started working on their vector fonts, it became necessary to convert vector data to bitmaps and print them (this step is commonly known as rasterization in computer graphics). In the 1990s, most of Adobe's profits came from the PostScript printer firmware, which shows its influence.
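As a rough illustration of what rasterization means, the sketch below fills a polygon outline on a small character grid by testing each pixel center against the outline (even-odd rule). It is a simplified assumption of how such a step can work, not how any PostScript printer firmware actually does it; the name `rasterize` and the `#`/`.` output are invented for this example.

```python
def rasterize(polygon, width, height):
    """Fill a polygon on a width x height grid by sampling pixel centers."""
    rows = []
    n = len(polygon)
    for y in range(height):
        py = y + 0.5                       # vertical center of this pixel row
        row = []
        for x in range(width):
            px = x + 0.5                   # horizontal center of this pixel
            inside = False
            for i in range(n):
                x0, y0 = polygon[i]
                x1, y1 = polygon[(i + 1) % n]
                # Only edges that cross the horizontal line y = py matter.
                if (y0 > py) != (y1 > py):
                    x_cross = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
                    if x_cross > px:       # crossing lies to the right of the sample
                        inside = not inside
            row.append("#" if inside else ".")
        rows.append("".join(row))
    return rows


# A rectangle described by its corner points (the "vector data").
for line in rasterize([(1, 1), (4, 1), (4, 3), (1, 3)], 5, 4):
    print(line)
```

The input is a list of coordinates that can be scaled to any size; the output is a fixed grid of dots, which is what a printer ultimately puts on paper.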
Then Apple and Microsoft couldn't sit still because Adobe was being a bit too shady!
Cooperation between Apple and Microsoft#
In the late 1980s, Apple began developing the TrueType font format (still widely used today as .ttf) in protest against Adobe's rogue behavior. Microsoft struck a deal with Apple for the right to use TrueType, and it quickly became the primary font format on Windows.
GNU and FreeDesktop couldn't sit still either#
Are you all playing with closed-source and patented technologies here? The open-source community can't stand for it!
So they wrote FreeType, and promptly ran into patent claims from the incumbents.
(When capitalists team up, they are all the same)
It wasn't until 2010, when the TrueType hinting patents expired, that FreeType could ship full hinting by default. Until then, the font hinting features patented by Apple had to be left out. This is also one of the reasons why Linux GUI fonts looked ugly for so long. Windows, don't try to argue, your fonts are just poorly designed.
Why PostScript is so bad#
If you have some understanding of parallelism and concurrency, you will know that it's best not to have global state...
PostScript is a language with hidden global state, as can be seen from the fact that opening a large PDF file and quickly flipping through the pages can cause a crash...
PostScript does not support transparency for images (how about transparency in PDF? That's Adobe's patent).
The implicit global state of PostScript means that any rendering error caused by incompatibility will be propagated to all subsequent pages...
This means that even a slight incompatibility can make your entire document unreadable.
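The following toy model sketches the kind of leak described above. It is not PostScript: `render`, `definitions`, and `margin` are hypothetical names, and the model only assumes that pages share one dictionary of definitions. The `isolate` flag imitates wrapping each page in a snapshot and rollback, similar in spirit to PostScript's `save` and `restore` operators.

```python
def render(pages, isolate=False):
    """Return, for each page, the value of 'margin' that page ends up using."""
    definitions = {"margin": 72}     # one shared dictionary, visible to every page
    seen = []
    for page_ops in pages:
        snapshot = dict(definitions) if isolate else None   # per-page snapshot
        for name, value in page_ops:                         # each op redefines a name
            definitions[name] = value
        seen.append(definitions["margin"])
        if isolate:
            definitions = snapshot                           # roll back after the page
    return seen


pages = [
    [],                      # page 1 relies on the default margin
    [("margin", -500)],      # page 2 contains one broken redefinition
    [],                      # page 3 also relies on the default margin
]

print(render(pages))                 # [72, -500, -500]
print(render(pages, isolate=True))   # [72, -500, 72]
```

Without isolation, the single bad definition on page 2 also ruins page 3 and everything after it; with isolation, the damage stays on the page that caused it.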
Oh, I forgot to mention that except for Windows, which uses GDI, all other printer drivers are PostScript... In essence, the de facto standard for publishing is monopolized.
How do others render in 2022?#
Some people have realized that, hey, our frontend is so fancy now, and it doesn't seem to have these problems you mentioned.
(Have you ever actually printed a webpage on a printer, rather than just taking a screenshot?)
With the advancement of frontend technologies such as HTML5 and CSS and the rapid development of backend technologies, things have changed a lot: first came Google's ANGLE and Skia, and now the newer Servo can make full use of the GPU for rendering. But what about PDF? It's still slowly interpreting PostScript.
It's 2022, and switching to a webpage you have never opened before takes at most 1-3 seconds; with caching, it's basically milliseconds. For PDF, this is a crushing comparison.
Progress Comes from Openness, and Closedness Leads to Backwardness#
Monopolies are only temporary. When all of Adobe's patents expire, what awaits them is a giant called the Web.