A critique of this critique of Duck Typing by Henry Story.
Saturday May 26, 2007
Duck Typing done right
Dynamic Languages such as Python, Ruby and Groovy, make a big deal
of their flexibility. You can add new methods to classes, extend
them, etc... at run time, and do all kinds of funky stuff. You can
even treat an object as of a certain type by looking at its methods.
This is called Duck Typing: "If it quacks like a duck and swims like a duck then
it's a duck", goes the well-known saying. The main criticism of
Duck Typing has been that what is gained in flexibility is lost in
precision: it may be good for small projects, but it does not scale.
I want to show here both that the criticism is correct, and how to
overcome it.
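For readers who have not met the idiom, here is a minimal Python sketch of what is being argued about. The class and function names (Duck, Robot, exercise) are made up for illustration and do not come from Henry's article.

```python
class Duck:
    def quack(self):
        return "Quack!"

    def swim(self):
        return "paddles across the pond"


class Robot:
    """Not related to Duck by inheritance; it just has the same method names."""

    def quack(self):
        return "BEEP-quack"

    def swim(self):
        return "propeller whirs"


def exercise(bird):
    # No isinstance() check: anything with quack() and swim() is accepted.
    return [bird.quack(), bird.swim()]


results = [exercise(Duck()), exercise(Robot())]
```

An object lacking either method only fails when the missing method is actually called, with an AttributeError.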
Let us look at Duck Typing a little more closely. If something is
a bird that quacks like a duck and swims like a duck, then why not
indeed treat it like a duck? Well one reason that occurs immediately,
is that in nature there are always weird exceptions. It may be
difficult to see the survival advantage of looking like a duck, as
opposed to say looking like a lion, but one should never be surprised
at the surprising nature of nature. (He stretches the analogy past its
usefulness. It's not nature, it's a convenient name to wrap a computing idiom.)
Anyway, that's not the type of problem people working with duck typing ever have.
How come? Well it's simple: they usually limit the interactions of
their objects to a certain context, where the objects being dealt
with are such that if any one of them quacks like a duck, then it is
a duck. And so here we in essence have the reason for the criticism:
In order for duck typing to work, one has to limit the context, one
has to limit the objects manipulated by the program, in such a way
that the duck typing falls out right. (Yes, you do, and this limitation is
very rarely seen as a problem.)
Enlarge the context (presumably to where the objects being dealt with quack
like a duck but are not ducks), and at some point you will find objects that don't fit
the presuppositions of your code. So: for simple semantic reasons,
those programs won't scale. The more the code is mixed and meshed
with other code, the more likely it is that an exception will turn
up. The context in which the duck typing works is a hidden
assumption, usually held in the head of the small group of developers
working on the code.
In short he is saying:
1. In large programs, objects with the right signature but the wrong actions will be called successfully under Duck typing but give the wrong result.
2. Duck typing only works because the developers know what are compatible actions.
3. Only a small group of developers can know what are compatible actions.
A slightly different way of coming to the same conclusion, is to
realize that these programming languages don't really do an analysis
of the sound of quacking ducks. Nor do they look at objects and try
to classify the way these are swimming. What they do is look at the
name of the methods attached to an object, and then do a simple
string comparison. If an object has the swim method,
they will assume that swim stands for the same type of
thing that ducks do. Now of course it is well established that
natural language is ambiguous and hence very context dependent. The
method names gain their meaning from their association to English
words, which are ambiguous. There may for example be a method named swim,
where those letters stand for the acronym "See
What I Mean". That method may return a link to some page on
the web that describes the subject of the method in more detail, and
have no relation to water activities. Calling that method in
expectation of a sound will lead to some unexpected results.
But once more, this is not a problem duck typing programs usually have.
Programmers developing in those languages will be careful to limit
the execution of the program to only deal with objects where swim
stands for the things ducks do. But it does not take much for that
presupposition to fail. Extend the context somewhat by loading some
foreign code, and at some point these presuppositions will break down
and nasty, difficult-to-locate bugs will surface. Once again, the
criticism of duck typing not being scalable is perfectly valid.
So his criticism here is that:
4. Load 'foreign' code with the right method signature and it is likely to fail.
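The name clash he describes can be sketched in Python as follows. The HelpTopic class, its url attribute and the example.org address are hypothetical, invented only to play the part of the 'foreign' code whose swim means "See What I Mean".

```python
class Duck:
    def swim(self):
        return "paddling"


class HelpTopic:
    """Hypothetical foreign class where swim stands for 'See What I Mean'."""

    def __init__(self, url):
        self.url = url

    def swim(self):
        # Returns a link, nothing to do with water.
        return self.url


def describe_swimmers(things):
    # Dispatch is by attribute name only; nothing checks what swim() means.
    return [f"{type(t).__name__} is {t.swim()}" for t in things]


report = describe_swimmers([Duck(), HelpTopic("http://example.org/help")])
```

No exception is raised here. The second entry of report is simply nonsense, which is the kind of failure he has in mind.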
Let's take his points one by one
and show why Duck Typing can and does work for many people:
His point 1:
The difference between a large and a small program would be that the code
is so large that objects are passed to functions based solely on
their method signature compatibility rather than on what those
methods do, and what a function expects to do to the object.
As a programmer you need to know the functionality of what you put
together. Small or large systems, you still need to know; you
should not pass it off, and only testing will show how right you are.
His point 2:
Turn it on its head: “If the developers don't know what are
compatible actions, Duck typing doesn't work”.
Yes, for Duck typing to work, developers need to know about what they are
linking, but other processes, essential to the production of quality
software, will make this tractable.
His point 3:
Following on from the answer to his point 2: if you are not building a large
system out of a collection of smaller ones with local, identifiable
interdependencies then it could be a problem, but that isn't Duck
typing at fault. You have a mess that no one can understand
but are trying to force development on. It's not the size of the
codebase that is defeating Duck typing, it is its haphazard
organization.
His point 4:
It is wise to be wary of code you don't know the history behind. In
larger projects you are more likely to use foreign code but exploring
it for suitability for Duck typing is part and parcel of good
practice. You should know about what you are using, Duck typing or
not. If you have programmers using code that is still 'foreign' to
them, then you cannot expect quality.
You could go on and read his
solution to a poorly stated problem in his article.
Paddy.
On careful reading of his follow-up article, in the section
“Scalability and efficiency”,
he turns things on their head by admitting that the original article
was about Duck Typing not being a replacement for Web
URIs. And they are not. We can agree on that. :-)
Is it also your opinion that the basestring class is unnecessary in python?
It's my opinion that the basestring class need not be discussed in this article on Duck Typing.
Try comp.lang.python or your own blog and people should answer (especially on c.l.p)
- Paddy.
Based on your last comment there, I don't think you understood Henry's analogy at all. Maybe instead of URIs you should think in terms of the differences between Haskell's type-classes and dynamic duck typing.
The difference is that you are matching on more than just the name of the method, you are matching on that method as a member of a particular interface. Method name collisions happen by accident, but you don't declare that your type implements a particular interface without knowing it.
Hi Greg,
It is not acceptable to let "method names clash by accident". You have to know what you're using. You say that someone has made that decision when they say it implements an interface. It had better!
I'm saying that with Duck typing, by passing the object to a method the programmer is stating that it works with it. Interfaces can be nice, but are limiting in a Dynamic language where what is compatible can shift depending on program state.
In practice, Duck Typing works well for Dynamic languages where interfaces would add static limitations.
As a python neophyte, my problem with Duck Typing has been that it increases the difficulty of learning a complex codebase.
For the past two years I've been working with a large python-based open source framework for modeling land use. This code is not well documented (i.e. seldom do methods state what type(s) their parameters ought to be or what type they return). This is surely a failing of the developers.
However, under languages with stronger typing, e.g. Java, such poorly documented code is arguably less opaque--at the method signature level at least. Thus, it's easier for someone who doesn't have expert knowledge of the system to learn the system.
Hi Brian,
Seems like you just have poorly documented code.
Nothing in Duck-typing stops them from documenting their functions/methods. They should at least document the initial intended behaviour. Parts of the Python documentation actually recognise that duck typing will be used and make this easier in their documentation - see the entry for readfp here: http://docs.python.org/lib/RawConfigParser-objects.html, or the descriptions of csvfile here: http://docs.python.org/lib/csv-contents.html .
Duck typing is about making code re-usable in more situations. Doing more with what you have.
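As a small illustration of the csvfile point, here is a sketch using the standard csv and io modules. The sample data is made up. csv.reader only needs an object that yields a string per iteration, so a file-like object and a plain list of strings are both accepted.

```python
import csv
import io

text = "name,legs\nduck,2\n"

# A file-like object: iterating over it yields one line at a time.
rows_from_file_like = list(csv.reader(io.StringIO(text)))

# A plain list of strings works as well; no file is involved at all.
rows_from_list = list(csv.reader(["name,legs", "duck,2"]))
```

Both calls produce the same rows, and neither argument had to declare that it implements any 'csvfile' interface.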
- Paddy.
P.S. Brian,
Java does have manifest types, i.e. variables are typed and the type of every variable has to be given, but Python is still strongly typed, meaning Python does not allow automatic conversion between dissimilar types:
"123" + 4.5
is an error in Python, for example.
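Spelled out as a runnable sketch (the variable names are just for illustration), the expression raises TypeError, and the programmer has to say which conversion is wanted:

```python
try:
    result = "123" + 4.5          # str + float: no automatic conversion
except TypeError:
    result = None                 # Python refuses to guess

as_number = float("123") + 4.5    # explicit conversion to a number
as_text = "123" + str(4.5)        # explicit conversion to a string
```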
You can start from the Java equivalent of a Python program to print Hello World, and as the size of an idiomatic Python program grows, the size of the equivalent Java program grows by even more. The Python built-in sorted function, for example, sorts the items of any iterable object. Iterables to be sorted don't have to be changed to implement some interface or be derived from some class, and the members of the iterable can be of different types, provided they can be compared with one another. This allows the well-defined function sorted to be used without having to add the 'adapter code' that tends to accumulate in Java, bloating Java code and obscuring the essence of the code.
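A short sketch of the sorted point, with made-up sample values. Each argument is a different kind of iterable, and none of them was written with sorted in mind:

```python
from_tuple = sorted((3, 1, 2))
from_set = sorted({"pear", "apple"})
from_generator = sorted(n * n for n in (-2, 1, 3))
from_string = sorted("duck")

# Members of different types are fine as long as they compare with each other.
mixed_numbers = sorted([3, 1.5, 2])
```

In Python 3, members that cannot be compared with each other (an int and a str, say) make sorted raise TypeError.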
- But this is just from reading about Java and many Java/Python comparisons. I don't write Java, but I do remember language comparisons.
- Paddy.
Paddy,
I guess my point is that poorly/inadequately documented code happens, and when it does, in my experience it is much more difficult to make sense of such code that is implemented in a Duck Typed system, than in a manifest-typed system.
Thanks for clarifying my understanding of strong typing and related terminology.
Best,
Brian