Musings inspired by Eevee's post about Sylph. Read that first.
In Python's duck-typing system, it's enough if things quack. They don't actually need to quack like a duck.
Whenever you program, you need to keep track of the types of things you're dealing with. Some languages – the statically typed ones – require you to specify the type in front, and then they check it rigorously. In Python, you need to keep track of this in your head. It's possible that you're writing code where a variable could be either a string or a number:
def add(a, b):
return a + b
... or you might want a variable to changes type in the middle of a function, like in the example from Hitchhiker's guide to Python:
items = 'a b c d' # This is a string...
items = items.split(' ') # ...becoming a list
items = set(items) # ...and then a set
Those are smells: they should be possible but not easy.
Variables (the names, not just values) always have types, even if,
as in the “dynamic” languages, they're only implied – they only exist in the
programmer's head, as assumptions on the value's behavior.
If I use a startswith
I really want “a method with the semantics of String:startswith
”,
rather than “the method named startswith”.
Types of variables
I have some syntax in mind. If I rewrite one of the first examples in Eevee's post in it, I can get something like this:
def foo(a: String, b: StringPattern, c: List):
if a.startswith(b):
c.append(a)
By which I want to say b
can be string, or a tuple of strings,
or even a regex for all I care – as long as it can be matched against a string.
Those are the semantics I wrote the function for.
If you call it with different types then you're on your own – it might work,
and the language should allow it, but I offer no guarantees about the behavior of my function.
Here the “startswith” is, quite explicitly, String:startswith
.
The call is not a command to “quack”, but to “quack like a duck”.
The difference from statically-typed languages is that I'm not requiring a
to be
a string, or a string subtype.
I don't even care if a
implements the string interface.
I'm just saying that I want to call String:startswith
.
The String
declaration does not restrict the value, it provides a namespace for my operations on it.
And with the namespaced operations come some well-defined semantics: if you misimplement them,
it's not my fault that my foo
function won't work.
Similarly, I don't care if c
is actually a list. I'm just saying I want the
MutableSequence:append
method (for which List:append
is an alias).
After the append
call, c[-1]
should be whatever I just put there.
But that's your (the caller's) problem. If c
is not List
enough,
your code might break with the next version of my library.
Of course, usually a
will be an “actual string”, and the compiler can
optimize for that case.
I'll put a random snippet with syntax ideas here:
from somewhere import Snake
def print_length(x: List):
print(x.length)
if x has Snake:length:
print("Scary! I got a {}m long snake as argument!"(x.Snake:length))
Type declarations
For variables, the type can be specified declaration.
Python has no variable declarations, which I don't think is healthy, because the
resulting scoping rules are not intuitive. They happen to work OK for 99%
of the cases, but when you get hit by the 1%, you don't even know where to look
(unless you remember the warning from your tutorial).
Javascript's explicit var
keyword is a good thing, for much the same reasons as Python's
explicit self
.
In my syntax, variable declarations can specify the type:
str: String = "abcd"
list: List = [1, 2, 3]
for i:int in list:
print(i * i)
But, since most (or ideally all) expressions have a well-defined types, a shorthand comes to mind:
str := "abcd"
list := ["1", "2", "3"]
Operator namespaces
This would work:
def concat(a: List, b: List):
return a + b
def elementwise-add(a: Array, b: Array):
return a + b
a := [1, 2, 3]
b := [4, 5, 6]
assert concat(a, b) == [1, 2, 3, 4, 5, 6]
assert elementwise-add(a, b) == [5, 7, 9]
because while Sequence:\+
and Array:\+
are two different things,
lists can support both.
(Here List:\+
is an alias for Sequence:\+
– a List is primarily
a Sequence.)
(Extending this to custom operators with custom priority is left as an exercise...)
Definitions and Implementations
Let's make a Tree interface. I'll use PEP 484-ish syntax for generics:
T := TypePlaceholder()
interface Tree:
walk(self: T) -> Iterator[T]
This says that the type of x.Tree:walk().next()
will be the same as the
type of x
.
(For any x
. Remember it's defining the type of the expression, not the value;
it's defining the namespace that expression will use by default.)
Then you can define BinaryTree like this.
BinaryTree:walk
becomes an implementation of Tree:walk
.
class BinaryTree(Tree):
left: Tree
right: Tree
impl walk(self):
if self.left:
yield from self.left.walk()
yield self
if self.right:
yield from self.left.walk()
Note the kewyord is impl
, not def
: this ensures that if the Tree
interface
grows a new method, existing methods of the same name won't magically become
its implementations.
(Though you will get lots of warnings about the new method not being implemented
by classes that claim to implement Tree
.)
Next let's have a class for corporate employees.
An amployee is not a tree, and there's no Employee:walk
in our model,
but you can still use an employee as a tree node:
class Employee:
boss: Employee
underlings: List[Employee]
impl Tree:walk(self):
yield self
for underling in self.underlings:
yield from underling.walk()
And you might want to attach a Tree:walk
to something from some library you
can't easily change, like the stdlib.
impl Path.Tree:walk(self):
yield self
if self.isdir:
for child:Tree in self.listdir:
yield from child
The language should probably limit who can do this and where, though.
To step away from Tree for one paragraph:
Since every class derives from Object
, which has a to_string
method,
you can write impl to_string
in any class.
Or you can write def to_string
, which won't conflict with Object:to_string
(though it will raise eyebrows and trip up your linter).
And you can later add impl Object:to_string
, which won't be aliased to
YourClass:to_string
.
Interface conflicts
Consider this class:
class FamilyMember:
parents: List[FamilyMember]
children: List[FamilyMember]
impl Tree:walk(self):
# this is not be correct but w/e
yield self
for child:Tree in self.children:
yield from child
Now, if you inherit from both Employee
and FamilyMember
, you're in trouble.
The resulting Tree:walk
should fail loudly, unless you explicitly override it.
Bags of attributes
“But Petr“, I hear you saying, “I want my getattr(self, 'visit_' + node.type)
”!
Eh, really? I reply. That looks quite ugly.
I jut explained that I think attribute names should not be just names, didn't I?
If you want a collection of things keyed by strings, use a mapping.
You don't even want the keys to be strings here – you want Node subclasses.
That said, you could write getattr(self, Attribute(Visit, node.type))
,
and implement Visit:leaf
, where Visit
is an empty(!) interface.
That's it for now
I could rant on, about, say, registration conflicts or efficient vtable references, but let's end it here, throw it at the wall and see if it sticks.
(I welcome comments, by the way – if you have any, send them by e-mail or pull request)