LemonHX

CEO of Limit-LAB. Likes tinkering with low-level code and intends to change the world.

On the Usability of Programming Languages, and Strong vs. Weak Typing

If you are confused by the online debates about strongly typed and weakly typed languages, or you are learning type theory, you are welcome to read along~

I just want to say that most of the arguments on the internet about this topic are wrong, and far too extreme!

First of all, thanks to guest author dannypsnl. Go check out his blog!#

What is the Essence of Type?#

Weak Type and Strong Type#

In fact, you cannot simply label a language as strongly typed or weakly typed, because everyone draws the line between strong and weak in a different place, and most everyday developers draw it incorrectly.

Even professional theorists draw it incorrectly. After a night of discussion, we roughly narrowed the spectrum down to the following five tiers. Neither the academic community nor the amateur community has stable terminology for these tiers, so we can only describe them and give examples.

First Tier: Runtime types completely match compile-time types and are all statically immutable and theoretically complete#

Languages in this tier basically do not exist, because the tier would have to guarantee that every program runs in a fully static, predictable state. Setting aside the halting problem, such a model also severely restricts how the language can interact with the outside world.

Let's briefly talk about history. In the distant 20th century there was a language called Miranda (the father of Haskell), and at the time almost every university had its own fork or Miranda-like dialect. Miranda was basically unable to express everyday programs, so everyone layered a DSL on top of it and poked holes to get meaningful programs to run.

"Meaningful programs" here excludes programs written only to publish PL papers, as well as the various type-safe calculators.

Basically, Miranda could not express programs that interact with the machine. That is the historical background.

Then this group of universities held a tea party and, in the end, created Haskell, which is less rigorous.

(Some parts of the history above have been omitted, but it is roughly like this)

Why did they lower themselves to the second tier? Because the practical value of the first tier is not universal enough. What is the point of creating a programming language that cannot write meaningful programs?

So let's summarize: For languages in the first tier, users can completely trust the code but cannot write any meaningful programs.

Second Tier: Runtime types completely match compile-time types and are all statically immutable, with the addition of casting#

  • Haskell
  • Rust
  • C

As long as we allow some casting and hole-poking, we can actually communicate with the existing world and its very complex side effects! Languages in this tier usually have a fairly complete theory behind them, which lets programmers who have a perfect understanding of that theory write code without thinking.

(Note the condition: a perfect understanding of the theory.)
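To make "casting and poking holes" a bit more concrete, here is a minimal Rust sketch. The function names (`checked_narrow`, `cast_narrow`, `float_bits`) are made up for this illustration and do not come from any library. The idea is only that a checked conversion keeps the types honest, while an `as` cast is a hole the programmer opts into explicitly.

```rust
use std::convert::TryFrom;

/// Checked conversion: returns None when the value does not fit in a u8.
fn checked_narrow(x: i32) -> Option<u8> {
    u8::try_from(x).ok()
}

/// Explicit `as` cast: keeps only the low 8 bits of the value.
fn cast_narrow(x: i32) -> u8 {
    x as u8
}

/// Reinterprets the bit pattern of an f32 as a u32.
fn float_bits(x: f32) -> u32 {
    x.to_bits()
}

fn main() {
    // 300 does not fit in a u8, so the checked conversion refuses.
    println!("{:?}", checked_narrow(300));
    // The cast silently wraps: 300 - 256 = 44.
    println!("{}", cast_narrow(300));
    // The IEEE 754 bit pattern of 1.0 as a 32-bit float.
    println!("{:#x}", float_bits(1.0));
}
```

In both cases the hole is visible in the source text, so a reader who knows the rules can still reason about the result before running anything.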

However, the underlying theory, such as Haskell's System $F_\omega$, is often very complex for an ordinary undergraduate-level programmer with no PL background.

That does not make it a bad thing for professional programmers. Still, there are always cases where you have to go around in circles to satisfy the type checker, commonly known as doing type gymnastics.

Let's summarize: Languages in this tier require some professional knowledge before you can write safe code without thinking. Opinions of languages in this tier are often very polarized.

The recently popular Rust language is also in this position.

Actually, C is also in this position, but we will talk about C later.

Third Tier: Has compile-time types, runtime dynamics, immutable types, and checks#

In this tier, we can see some popular scripting languages.

  • Elixir
  • Ruby3
  • Typed Racket
  • TypeScript* (we put it here; otherwise we would need yet another tier for unreliable compile-time types that are independent of runtime behavior)

Apart from the widely used TypeScript, most of us have little exposure to the languages in this tier. Essentially, most code in this tier is guaranteed to conform to the type checks done at compile time (in the tiers above, you only need to write good code), while escape hatches like "any" are kept to preserve the dynamism of a scripting language.
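The "any" escape hatch can be imitated even in a statically typed language. The sketch below uses Rust's `std::any::Any` purely as an analogy (Rust itself sits in the second tier of this classification), and `describe` is a hypothetical helper written for this post. Once a value is passed as `&dyn Any`, the compiler no longer tracks its concrete type, and the check moves to runtime.

```rust
use std::any::Any;

/// Hypothetical helper: inspects a value whose static type has been erased.
fn describe(value: &dyn Any) -> String {
    // Each downcast is a runtime type check that may fail.
    if let Some(n) = value.downcast_ref::<i32>() {
        format!("i32: {}", n)
    } else if let Some(s) = value.downcast_ref::<String>() {
        format!("String: {}", s)
    } else {
        String::from("unknown type")
    }
}

fn main() {
    println!("{}", describe(&42i32));
    println!("{}", describe(&String::from("hi")));
    // f64 is not handled above, so this falls through to the last branch.
    println!("{}", describe(&1.5f64));
}
```

Everything outside `describe` is still checked at compile time; only the code that touches the erased value has to defend itself at runtime.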

The writing experience in this tier is excellent, and apart from professionals, opinions are almost unanimously positive.

To summarize, the main characteristic of this tier is that users will not encounter runtime errors that the code's author did not anticipate, and users can trust most of the code before running it.

Fourth Tier: No compile-time types, runtime dynamics, immutable types, and checks#

In this tier, people start to confuse weakly typed with dynamically typed (that is, strongly typed at runtime). Several representative languages:

  • JS* (it is not in the fifth tier because, fundamentally, JS is still self-consistent)
  • Erlang
  • Ruby2

Users of languages in this tier will rarely encounter runtime errors that the code's author did not anticipate, and can trust the code in most cases (provided that they know what the correct behavior is).

Fifth Tier: Runtime dynamics and mutable types (commonly known as weakly typed)#

  • Python
  • Perl
  • PHP

These languages are products of their authors' limited skills, were never intended to become popular when they were written, or were constrained by the technology of their time. Their main characteristic is that users will encounter runtime errors they did not expect; users can only trust the documentation (provided that it is well written) and cannot trust the code at all.

A language's classification changes depending on how you view it#

At this point, some people may be surprised that C is in the second tier.

How can C, which has caused countless bugs, stand with Haskell?

In fact, when C was created, its designers did not have these memory-safety problems in mind, so they were simply outside the scope of what C was meant to solve. Moreover, if you treat C as a nominally typed language (one where types are identified by name), it falls into the fifth tier; but if you really understand the authors' intended use and view its types as sizes (in other words, the C ABI), it achieves the rigor of the second tier. This is why some C programmers say that C is a very rigorous language, while most people do not think so.

Most people are simply not sensitive to numbers; perhaps that is in our genes.
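Here is a rough way to see the "types as sizes" view, written in Rust with `#[repr(C)]` so that the structs follow C layout rules. `Meters`, `Seconds`, `Header`, and `same_layout` are hypothetical names for this sketch. By name the first two are unrelated types; by size and alignment they are indistinguishable.

```rust
use std::mem::{align_of, size_of};

// Two nominally different types with the same C layout.
#[allow(dead_code)]
#[repr(C)]
struct Meters {
    value: f64,
}

#[allow(dead_code)]
#[repr(C)]
struct Seconds {
    value: f64,
}

// C layout rules place padding after `tag` so that `len` is aligned.
#[allow(dead_code)]
#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

/// True when two types agree on size and alignment, ignoring their names.
fn same_layout<A, B>() -> bool {
    size_of::<A>() == size_of::<B>() && align_of::<A>() == align_of::<B>()
}

fn main() {
    println!("Meters vs Seconds: {}", same_layout::<Meters, Seconds>());
    println!("Meters vs u8: {}", same_layout::<Meters, u8>());
    println!("size_of::<Header>() = {}", size_of::<Header>());
}
```

On mainstream platforms `Header` occupies 8 bytes rather than 5 because of the padding, which is exactly the kind of number a size-minded C programmer keeps in their head.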

Another possible approach: Structural Typing#

Arguably, no language has a mature implementation of it yet, but this approach can indeed describe fourth-tier behavior on top of a second-tier foundation.

What is the relationship between the usability of a language and its type?#

No relationship at all.

There is no statistical consensus. You can easily take a very complex and complete type theory and build a very bad language on it.

In the past, simple syntax and simple types really were an advantage, but we no longer write code in plain text editors.

Overly flexible languages make good IDE plugins hard to build. For example, Racket's macros (reader macros in particular) can change the syntax arbitrarily.

Ruby still does not have good autocompletion because it generates a large amount of code at runtime.

Another important issue is the toolchain.

It determines whether a language is easy to use.

For example, there is a language called Rust, which is anti-human in terms of syntax and very difficult to write, but it has a world-class toolchain called Cargo, which allows it to dominate. If its toolchain were Haskell's Stack, this language would have died long ago.

trait XXX<T: Send + Sync + Clone> {
    type YYY<'a>: Future<Output = zzz::<AAA>::BBB<T>> where Self : 'a;
}

To summarize: Types can only tell you whether the code sitting in a file behaves as you expect. The degree of type soundness can guarantee correctness at best, not usability.

Why do you think that one language is better than another?#

Everyone encounters different problems and needs to solve different things, and their requirements for correctness differ as well. Even as a PL person, I still use PHP to write my blog. You should focus on how to do things well rather than on whether a language has types.

Many times, whether you think a language is good or not is based on your experience, but the person asking this question is very likely not trying to do the same thing as you, and you cannot guarantee that your path is the optimal solution. So don't recommend languages randomly~

What are PL researchers doing?#

  • Creating terminology 🤫
  • On the way to the mental hospital 🚑
  • Preparing to argue with me 🤯

Conclusion#

I hope everyone can have a rational view of programming languages. They are really just ordinary tools.
