Archive for the ‘Mathematicians’ Category

How to write math papers clearly

July 12, 2017 4 comments

Writing a mathematical paper is both an act of recording mathematical content and a means of communication of one’s work.  In contrast with other types of writing, the style of math papers is incredibly rigid and resistant to even modest innovation.  As a result, both goals suffer, sometimes immeasurably.  The clarity suffers the most, which affects everyone in the field.

Over the years, I have been giving advice to my students and postdocs on how to write clearly.  I collected it all in these notes.  Please consider reading them and passing them to your students and colleagues.

Below I include one subsection dealing with different reference styles and what each version really means.  This is somewhat subjective, of course. Enjoy!

4.2. How to cite a single paper. The citation rules are almost as complicated as Chinese honorifics, with an added disadvantage of never being discussed anywhere. Below we go through the (incomplete) list of possible ways in the decreasing level of citation importance and/or proof reliability.

(1) “Roth proved Murakami’s conjecture in [Roth].” Clear.

(2) “Roth proved Murakami’s conjecture [Roth].” Roth proved the conjecture, possibly in a different paper, but this is likely a definitive version of the proof.

(3) “Roth proved Murakami’s conjecture, see [Roth].” Roth proved the conjecture, but [Roth] can be anything from the original paper to the followup, to some kind of survey Roth wrote. Very occasionally you have “see [Melville]”, but that usually means that Roth’s proof is unpublished or otherwise unavailable (say, it was given at a lecture, and Roth can’t be bothered to write it up), and Melville was the first to publish Roth’s proof, possibly without permission, but with attribution and perhaps filling some minor gaps.

(4) “Roth proved Murakami’s conjecture [Roth], see also [Woolf].” Apparently Woolf also made an important contribution, perhaps extending it to greater generality, or fixing some major gaps or errors in [Roth].

(5) “Roth proved Murakami’s conjecture in [Roth] (see also [Woolf]).” Looks like [Woolf] has a complete proof of Roth’s result, possibly fixing some minor errors in [Roth].

(6) “Roth proved Murakami’s conjecture (see [Woolf]).” Here [Woolf] is a definitive version of the proof, e.g. the standard monograph on the subject.

(7) “Roth proved Murakami’s conjecture, see e.g. [Faulkner, Fitzgerald, Frost].” The result is important enough to be cited and its validity confirmed in several books/surveys. If there ever was a controversy whether Roth’s argument is an actual proof, it was resolved in Roth’s favor. Still, the original proof may have been too long, incomplete or simply presented in an old-fashioned way, or published in some inaccessible conference proceedings, so here are sources with a better or more recent exposition. Or, more likely, the author was too lazy to look for the right reference, so overcompensated with three random textbooks on the subject.

(8) “Roth proved Murakami’s conjecture (see e.g. [Faulkner, Fitzgerald, Frost]).” The result is probably classical or at least very well known. Here are books/surveys which all probably have statements and/or proofs. Neither the author nor the reader will ever bother to check.

(9) “Roth proved Murakami’s conjecture.7 Footnote 7: See [Mailer].” Most likely, the author never actually read [Mailer], nor has access to that paper. Or, perhaps, [Mailer] states that Roth proved the conjecture, but includes neither a proof nor a reference. The author cannot verify the claim independently and is visibly annoyed by the ambiguity, but felt obliged to credit Roth for the benefit of the reader, or to avoid the wrath of Roth.

(10) “Roth proved Murakami’s conjecture.7 Footnote 7: Love letter from H. Fielding to J. Austen, dated December 16, 1975.” This means that the letter likely exists and contains the whole proof or at least an outline of the proof. The author may or may not have seen it. Googling will probably either turn up the letter or a public discussion about what’s in it, and why it is not available.

(11) “Roth proved Murakami’s conjecture.7 Footnote 7: Personal communication.” This means Roth has sent the author an email (or said over beer), claiming to have a proof. Or perhaps Roth’s student accidentally mentioned this while answering a question after the talk. The proof may or may not be correct and the paper may or may not be forthcoming.

(12) “Roth claims to have proved Murakami’s conjecture in [Roth].” Paper [Roth] has a well known gap which was never fixed even though Roth insists that it is fixable; the author would rather avoid going on record about this, but anything is possible after some wine at a banquet. Another possibility is that [Roth] is completely erroneous as explained elsewhere, but Roth’s work is too famous not to be mentioned; in that case there is often a followup sentence clarifying the matter, sometimes in parentheses as in “(see, however, [Atwood])”. Or, perhaps, [Roth] is a 3 page note published in Doklady Acad. Sci. USSR back in the 1970s, containing a very brief outline of the proof, and despite considerable effort nobody has yet given a complete proof of its Lemma 2; there wouldn’t be any followup to this sentence then, but the author would be happy to clarify things by email.

ICM 2018 Speakers

May 9, 2017 3 comments

UCLA recently outed me (with permission) as a speaker at the next ICM in Rio. I am incredibly honored to be chosen, alongside my fantastic colleagues Matthias Aschenbrenner, Andrea Bertozzi, Ciprian Manolescu and Sucharit Sarkar.

P.S.  I have more to say on the subject of ICM, but that can wait perhaps.


A complete list of ICM speakers is available here.

You say goodbye and I say hello

September 2, 2016 Leave a comment

I’ve been meaning to write a few posts for a while now, but never could find the time. It really takes special effort to clean your thoughts and then put them in order.  However, the following story just fell into my mailbox.  It tells you how to save time by skipping the greetings/salutations.  I am removing all math matters and leaving it unedited otherwise.  To protect the anonymity of my correspondent, I will call him “Kiran” throughout the email exchange.  Enjoy!  — IP

(1) [Math] Thanks! — Kiran

(2) Dear Kiran,
[Math] Best, — Igor

P.S. In the future, please address me as “Igor”, which is my first
name. It’s best to begin your email with customary “Dear Igor”.
Thank you.

(3) I’ve been writing emails for 25 years, so I’m not about to start taking advice on how to start them; but if you start a thread to me with “Dear Kiran”, I can be safely counted on to respond in kind for the *first* email in the thread. If it happens enough times, I might even remember to initiate same way. For instance, this is what happens when I exchange emails with Serre; but neither of us uses the salutation on replies after the first within a thread.

[Math] Best, — Kiran

(4) Dear Kiran,
[Math] Best, — Igor

P.S. With all due respect, I am going to continue using salutations
and expecting the same in every email irrespectively on the person or
the count in the thread. Neither the “25 years” nor argumentum ad
verecundiam seem convincing — I have been using email for just as
long and in similar circumstances. The 8 letters of “Dear Igor” is
really not too much to ask.

(5) Thanks, I think this last reference does exactly what I was looking for!

Best, — Kiran

P.S. I have something more to say on the subject of salutations, but since that is a low-priority discussion for me, I will have to put it off until I am more current on my email.

(6) “Dear” Igor,

I promised one more piece of information regarding salutations, so here goes. (Don’t bother replying to this email; I promise to delete the response without reading it!)

I recently had some email exchanges with Shinichi Mochizuki, and was a bit surprised by the fact that despite the fact that I met him more than 20 years ago, he began his email with “Dear Professor [redacted]” (and persisted with this in subsequent replies within the thread). However, when I asked about this, he made it clear that on one hand, he has a policy of using the same format of salutation no matter the recipient (to avoid having to worry about the level of formality, figuring it is safe to err on the side of being too formal sometimes), he has absolutely no expectations about how anyone will address his in response.

My point is that you misuse a certain term here and it’s not the gratituous Latinate rhetorical terminology; it’s the word “respect”. It is a fact that reasonable people can draw different conclusions about such matters as how it is appropriate to start an email. You are free to choose how you address me, but how I choose to structure my correspondence is my decision alone. What you think is “not too much to ask” is for me to keep you in mind as a special case when I don’t even have very much correspondence with you anyway; that’s a waste of mental real estate that I can little afford.

— Kiran

You should watch combinatorics videos!

May 2, 2015 4 comments

Here is my collection of links to Combinatorics videos, which I assembled over the years, and recently decided to publish.  In the past few years the number of videos just exploded.  We clearly live in a new era.  This post is about how to handle the transition.

What is this new collection?

I selected over 400 videos of lectures and seminars in Combinatorics, which I thought might be of interest to a general audience.  I tried to cover a large number of areas both within Combinatorics and related fields.  I have seen many (but not all!) of the talks, and think highly of them.  Sometimes I haven’t seen the video, but have heard this talk “live” at the same or a different venue, or read the paper, etc.  I tried to be impartial in my selection, but I am sure there is some bias towards some of my favorite speakers.

The collection includes multiple lectures by Noga Alon, Persi Diaconis, Gil Kalai, Don Knuth, László Lovász, János Pach, Vic Reiner, Paul Seymour, Richard Stanley, Terry Tao, Xavier Viennot, Avi Wigderson, Doron Zeilberger, and many many others. Occasionally the speakers were filmed giving similar talks at different institutions, so I included quick links to those as well so the viewer can choose.

Typically, these videos are from workshops or public lecture series.  Most are hosted on the institution websites, but a few are on YouTube or Vimeo (some of these are broken into several parts).  The earliest video is from 1992 and the most recent video was made a few days ago.  Almost all videos are from the US or Canada, with a few recent additions from Europe.  I also added links to a few introductory lectures and graduate courses at the bottom of the page.

Why now?

Until a couple of years ago, the videos were made only at a few conference centers such as Banff, MSRI and IAS.  The choice was sparse and the videos were easy to find.  The opposite is true now, on both counts.  The number of recorded lectures in all areas is in the tens of thousands, they are spread across the globe, and navigating them is nearly impossible unless you know exactly what you are looking for.  In fact, there are so many videos I really struggled with the choice of which to include (and also with which of them qualify as Combinatorics).  I am not sure I can maintain the collection in the future – it’s already getting too big.  Hopefully, some new technology will come along (see below), but for now this will do.

Why Combinatorics?

That’s what I do.  I try to think of the area as broadly as possible, and apologize in advance if I omitted a few things.  For the subarea division, I used as a basis my own Wikipedia entry for Combinatorics (weirdly, you can listen to it now in a robotic voice).  The content and the historical approach within sub-areas are motivated by my views here on what exactly is Combinatorics.

Why should you start watching videos now?

First, because you can.  One of the best things about being in academia is the ability (in fact, necessity) to learn.  You can’t possibly follow everything that happens in all fields of mathematics and even all areas of combinatorics.  Many conferences are specialized and the same people tend to meet year after year, with few opportunities for outsiders to learn what’s new over there.  Well, now you can.  Just scroll down the list and (hopefully) be amazed at the number of classical works (i.e. over 5 y.o.) you never heard of, the variety of recent developments and connections to other fields.  So don’t just watch people in your area from workshops you missed for some reason.  Explore other areas!  You might be surprised to see some new ideas even on your favorite combinatorial objects.  And if you like what you see, you can follow the links to see other videos from the same workshops, or search for more videos by the same speaker…

Second, you should start watching because it’s a very different experience.  You already know this, of course.  One can pause videos, go back and forward, save the video to watch it again, or stop watching it right in the beginning.  This ability is so popular that Adam Sandler even made an awful movie about it…  On the other hand, the traditional model of lecture attendance is where you either listen intently trying to understand in real time and take notes, or are bored out of your mind but can’t really leave.  It still has its advantages, but clearly is not always superior.  Let me elaborate on this below.

How to watch videos?

This might seem like a silly question, but give me a chance to suggest a few ideas…

0) Prepare for the lecture.  Make sure to have enough uninterrupted time.  Lock the door.  Turn off the cell phone.  Download and save the video (see below).  Download and save the slides.  Search for them if they are not on the lecture website (some people put them on their home pages).  Never delete anything – store the video on an external hard drive if you are running out of space.  Trust me, you never know when you might need it again, and the space is cheap anyway…

Some years ago I made a mistake by not saving Gil Kalai’s video of a talk titled “Results and Problems around Borsuk’s Conjecture”.  I found it very inspiring; it’s the only talk I referenced in my book.  Well, apparently, in its infinite wisdom, PIMS lost the video and now only the audio is available, which is nearly useless for a blackboard talk.  What a shame!

1) Use 2 devices.  Have the video on a big screen, say, a large laptop or a TV hooked to your laptop.  If the TV is too far, use a wireless mouse to operate a laptop from across the room or something like a Google stick to project from afar.  Then, have the slides of the talk opened on your tablet if you like taking computer notes or just like scrolling by hand gestures, or on your other laptop if you don’t.  The slides are almost universally in .pdf and most software, including Adobe Reader, allows you to take notes straight in the file.

Another reason to have slides opened is the inability of some camera people to understand what needs to be filmed.  This is especially severe if they just love to show the unusual academic personalities, or are used to filming humanities lectures where people read at the podium.  As a result, occasionally, you see them pointing a camera at a slide full of formulas for 2 seconds (and out of focus), and then going back for 2 minutes filming a speaker who is animatedly pointing to the screen (now invisible), explaining the math.  Ugh…

2) If the subject is familiar and you feel bored with the lengthy introduction, scroll the slides until you see something new.  This will give you a hint to where you should go forward in the video.  And if you did miss some definition you can pause the video and scroll the slides to read it.

3) If there are no slides, or you want to know some details which the speaker is purposefully omitting, pause the video and download the paper.  I do this routinely while listening to talks, but many people are too shy to do this out of misplaced fear that others might think they are not paying attention.  Well, there is no one to judge you now.

4) If you are the kind of person who likes to ask questions to clarify things, you still can.  Pause the video and search the web for the answer.  If you don’t find it, ask a colleague by skype, sms, chat, email, whatever.  If all else fails – write to the speaker.  She or he might just tell you, but don’t be surprised if they also ignore your email…

5) If you know others who might be interested in the video lecture, just make it happen.  For example, you can organize a weekly seminar where you and your graduate students watch the lectures you choose (when you have no other speakers).  If students have questions, pause the video and try to answer them.  In principle, if you have a good audience the speaker may agree to answer the questions for 5-10 min over skype, after you are done watching.  Obviously, I’ve never seen this happen (too much coordination?).  But why not try this – I bet if you ask nicely many speakers would agree to this.

6) If you already know a lot about the subject, haven’t been following it recently but want to get an update, consider binge watching.  Pick the most recent lecture series and just let it run while you do household chores or ride the subway.  When things get interesting, you will know to drop everything and start paying attention.

Why should you agree to be videotaped?

Because the audience is ready to see your talks now.  Think of this as another way of reaching out with your math to a suddenly much broader mathematical community (remember the “broad impact” section on your NSF grant proposal?).  Let me just say that there is nothing to fear – nobody is expecting you to have acting skills, or cares that you have a terrible haircut.  But if you make a little effort towards giving a good talk, your math will get across and you might make new friends.

Personally, I am extremely uncomfortable being videotaped – the mere knowledge of the camera filming makes me very nervous.  However I gradually (and grudgingly) concluded that this is now a part of the job, and I have to learn how to do this well.  Unfortunately, I am not there yet…

Yes, I realize that many traditionalists will object that “something will be missing” when you start aiming at giving good video talks at the expense of the local audience.  But the world is changing, if it hasn’t changed already, and you can’t stop the tide.  This happened before, many times.  For example, at some point all the big Hollywood studios discovered that they can make movies simpler and make a great deal more money overseas to compensate for the loss in the US market.  They are completely hooked now, and no matter what critics say this global strategy is likely irreversible.  Of course, this leaves room for a niche market (say, low budget art-house movies), but let’s not continue with this analogy.

How to give video lectures?

Most people do nothing special.  Just business as usual, hook up the mike and hope it doesn’t distort your voice too badly.  That’s a mistake.  Let me give a number of suggestions based mostly on watching many bad talks.  Of course, the advice for giving regular talks applies here as well.

0) Find out ahead of time if you get filmed and where the camera is.  During the lecture, don’t run around; try to stand still in full view of the camera and point to the screen with your hands.  Be animated, but without sudden moves.  Don’t use a laser pointer.  Don’t suddenly raise your voice.  Don’t appeal to the previous talks at the same workshop.  Don’t appeal to people in the audience – the camera can rarely capture what they say or do.  If you are asked a question, quickly summarize it so the viewer knows what question you are answering.  Don’t make silly off-the-cuff jokes (this is a hard one).

1) Think carefully whether you want to give a blackboard or a computer talk.  This is crucial.  If it’s a blackboard talk, make sure your handwriting is clear and most importantly BIG.  The cameras are usually in the very back and your handwriting won’t be legible otherwise.  Unless you are speaking at the Fields Institute, whose technology allows one to zoom into the high resolution video, nobody might be able to see what you write.  Same goes for handwritten slides unless they are very neat, done on a laptop, and the program allows you to increase their size.  Also, the blackboard management becomes a difficult issue.  You should think through what results/definitions should stay on the blackboard visible to the camera at all times and what can be safely deleted or lifted up if the blackboard allows that.

2) If it’s a computer talk, stick to your decision and make a lot of effort to have the slides look good.  Remember, people will be downloading them…  Also, make every effort NOT to answer questions on a blackboard next to the screen.  The lighting never works – the rooms are usually dimmed for a computer talk and no one ever thinks of turning the lights on just for 30 seconds when you explain something.  So make sure to include all your definitions, examples, etc, in the slides.  If you don’t want to show some of them – in PowerPoint there is a way to hide them and pull them up only if someone asks to clarify something.  I often prepare the answers to some standard questions in the invisible part of my slides (such as “What happens for other root systems?” or “Do your results generalize to higher dimensions?”), sometimes to unintended comedic effect.  Anyhow, think this through.

3) Don’t give the same videotaped talk twice.  If you do give two or more talks on the same paper, make some substantial changes.  Take Rota’s advice: “Relate to your audience”…  If it’s a colloquium talk, make a broad accessible survey and include your results at the end.  Or, if it’s a workshop talk, try to make an effort to explain most proof ideas, etc.  Make sure to have long self-explanatory talk titles to indicate which talk is which.  Follow the book industry lead for creating subtitles.  For example use “My most recent solution of the Riemann hypothesis, an introduction for graduate students” or “The Pythagorean theorem: How to apply it to the Langlands Program and Quantum Field Theory”.

4) Download and host your own videos on your website alongside your slides and your relevant paper(s), or at least make clear links to them from your website.  You can’t trust anyone to keep your files.  Some would argue that re-posting them on YouTube will then suffice.  There are two issues here.  First, this is rarely legal (see below).  Second, as I mentioned above, many viewers would want to have a copy of the file.  Hopefully, in the future there will be a copyright-free arXiv-style video hosting site for academics (see my predictions below).

5) In the future, we would probably need to consider having a general rule about adding a file with errata and clarifications to your talk, especially if something you said is not exactly correct, or even just to indicate, post-factum, whether all these conjectures you mentioned have been resolved and which way.  The viewers would want to know.

For example, my student pointed out to me that in my recent Banff talk, one of my lemmas is imprecise.  Since the paper is already available, this is not a problem, but if it weren’t, this could lead to serious confusion.

6) Watch other people’s videos.  Pay attention to what they do best.  Then watch your own videos.  I know, it’s painful.  Turn off the sound perhaps.  Still, this might help you to correct the worst errors.

7) For advanced lecturers – try to play with the format.  Of course, the videos allow you to do things you couldn’t do before (like embedding links to papers and other talks, inserting some Java demonstration clips, etc.), but I am talking about something different.  You can turn the lecture into an artistic performance, like this amazing lecture by Xavier Viennot.  Not everyone has the ability or can afford to do this, but having it recorded can make it worthwhile, perhaps.

Know your rights

There are some delicate legal issues when dealing with videos, with laws varying in different states in the US (and in other countries, of course).  I am not an expert on any of this and will write only as I understand them in the US.  Please add a comment on this post if you think I got any of this wrong.

1) Some YouTube videos of math lectures look like they have been shot on a phone.  I usually don’t link to those.  As I understand the law on this, anyone can film a public event for his/her own consumption.  However, you and the institution own the copyright, so the YouTube posting is illegal without explicit permission (written and signed) from both of you.  You can fight this by sending a “cease and desist” letter to the person who posted the video, but contacting YouTube directly might be more efficient – they have a large legal department to sort out these issues.

2) You are typically asked to sign away your rights before your talk.  If an institution forgot to do this, you can ask to take your talk down for whatever reason.  However, even if you did sign the paper you can still do this – I doubt the institution will fight you on this just to avoid bad publicity.  A single email to the IT department should suffice.

3) If the file with your talk is posted, it is (obviously) legal for you to download it, but not to post it on your website or repost elsewhere such as YouTube or WordPress.  As far as I am concerned, you should go ahead and post it on your university website anyway (but not on YT or WP!).  Many authors typically post all their papers on their website even if they don’t own a copyright on them (which is the case for virtually all papers before 2000).  I am one of them.  The publishers just concluded that this is the cost of doing business – if they start going after people like us, the authors can revolt.  The same with math videos.  The institutions probably won’t have a problem with your university website posting as long as you acknowledge the source.  But involving a third party creates a host of legal problems since these internet companies are making money out of the videos they don’t own a copyright for.  Stay away from this.

4)  You can edit the video using any of numerous software tools, some of which are free to download.  You can remove the outside noise, make the slides sharper, everything brighter, etc.  I wouldn’t post a heavily edited video when someone else owns a copyright, but minor editing as above is ok I think.

5) If the institution’s website does not allow you to download the video but has a streaming option (typically, Adobe Flash or HTML5), you can still legally save it on your computer, but this depends on what software you choose.  There are plenty of programs which capture the video being played on your computer and save it in a file.  These are 100% legal.  Other websites play the videos on their computers and allow you to download afterwards.  This is probably legal at the institutions, but a gray area at YouTube or Vimeo which have terms of service these companies may be violating.  Just remember – such videos can only be legal for personal consumption.  Also, the quality of such recording is typically poor – having the original file is much better.

What will happen in the future?

Yes, I will be making some predictions.  Not anything interesting like Gian-Carlo Rota’s effort I recently analyzed, but still…

1) Watching and giving video lectures will become a norm for everyone.  Ethical standards will develop so that everyone gets to have the files of videos they made.  Soon enough there will be some large, well organized, searchable (and not-for-profit!) arXiv-style math video depositories where you can submit your video and link to it from your website, and from which others can download.  Right now companies like DropBox allow you to do this, but it’s for-profit (you have to pay extra for space), and it obviously needs a front like the arXiv.  This would quickly make my collection a thing of the past.

2) Good math videos will become a “work product”, just like papers and books.  It is just another venue to communicate your results and ideas.  People will start working harder on them.  They will become a standard item on CVs, grant applications, job promotions, etc.  More and more people will start referencing them just like I’ve done with Kalai’s talk.  Hopefully part 1) will happen soon enough so all talks get standard and stable links.

3) The video services will become ubiquitous.  First, all conference centers will acquire advanced equipment in the style of the Banff Center which is voice directed and requires no professional involvement except perhaps at the editing stage.  Yes, I am thinking of you, MFO.  A new library is great, but the talks you could have recorded there are priceless – it’s time to embrace the 21st century….

Second, more and more university rooms will be equipped with cameras, etc.  UCLA already has a few large rooms like that (which is how we make the lamely named BruinCasts), but in time many departments will have several such rooms to hold seminars.  The storage space is not an issue, but the labor cost, equipment and the broadband are.  Still, I give it a decade or two…

4) Watching and showing math videos will become a standard part of the research and graduate education.  Ignore the doomsayers who proclaim that this will supplant the traditional teaching (hopefully, not in our lifetime), but it’s clear already there are unexplored educational benefits from this.  This should be of great benefit especially to people in remote locations who don’t have access to such lectures otherwise.  Just like the Wikipedia has done before, this will even the playing field and help the talent to emerge from unlikely places.  If all goes well, maybe the mathematics will survive after all…

Happy watching everyone! 

Grading Gian-Carlo Rota’s predictions

November 27, 2014 3 comments

In this post I will try to evaluate Gian-Carlo Rota‘s predictions on the future of Combinatorics that he made in this 1969 article.  He did surprisingly well, but I am a tough grader and possibly biased about some of the predictions.  Judge for yourself…

It’s tough to make predictions, especially about the future

It is a truth universally acknowledged that humans are very interested in predicting the future. They do this incessantly, compiling the lists of the best and the worst, and in general can’t get enough of them. People tend to forget wrong predictions (unless they are outrageously wrong).  This allows a person to make the same improbable predictions over and over and over and over again, making news every time.  There are even professional prognosticators who make a living writing about the future of life and technology.  Sometimes these predictions are rather interesting (see here and there), but even the best ones are more often wrong than right…

Although rarely done, analyzing past predictions is a useful exercise, for example as a clue to the truthiness of the modern day oracles.  Of course, one can argue that many of the political or technology predictions linked above are either random or self-serving, and thus not worth careful investigation. On the other hand, as we will see below, Rota’s predictions are remarkably earnest and sometimes even brave.  And the fact that they were made so long ago makes them uniquely attractive, practically begging to be studied.

Now, it has been 45 years since Rota’s article, basically an eternity in the life span of Combinatorics. There, Rota describes Combinatorics as one of “the least developed branches of mathematics“. A quick review of the last few quotes on this list I assembled shows how much things have changed. Basically, the area moved from an ad hoc collection of problems to a 360-degree panorama of rapidly growing subareas, each with its own deep theoretical results, classical benchmarks, advanced tools and exciting open problems. This makes grading rather difficult, as it suggests that even random old predictions are likely to be true – just about anything people worked on back in the 1960s has been advanced by now. Thus, before turning to Rota, let’s agree on the grading scale.

Grading on a curve

To give you the feel for my curve, I will use the celebrated example of the 1899-1901 postcards in the En L’An 2000 French series, which range from insightful to utter nonsense (click on the titles to view the postcards, all available from Wikimedia).

Electric train.  Absolutely.  These were introduced only in the 1940s and have been further developed in France among other countries.  Note the aerodynamic shape of the engine.  Grade:  A.

Correspondance cinema.  Both the (silent) cinema and phonograph were invented by 1900; the sound came to movie theaters only in 1927.  So the invention here is of a home theater for movies with sound.  Great prediction although not overly ambitious. Grade:  A-.

  Military cyclists.  While bicycle infantry was already introduced in France by 1900, military use of motorcycles came much later.  The idea is natural, but some designs of bikes from WW2 are remarkably similar.  Some points are lost due to the lack of widespread popularity in 2000.  Grade: B+.

Electric scrubbing.  This is an electric appliance for floor cleaning.  Well, they do exist, sort of, obviously based on different principles.  In part due to the modern day popularity, this is a solid prediction anyway.  Grade:  B.

Auto-rollers.  Roller skates were invented in the 18th century and by 1900 had become popular.  So no credit for the design, but extra credit for believing in the future of a means of transportation now dominated by rollerblades. Thus the author’s invention is in the category of “motorized personal footwear”. In that case the corresponding modern invention is the electric skateboard, which became available only recently, post-2000, and has yet to become very popular. Grade: B-.

Barber.  The author imagines a barber operating machinery which shaves customers and cuts their hair.   The design is so ridiculous (and awfully dangerous), it’s a good thing this never came about.  There are however electric shavers and hair cutters which are designed very differently.  Grade:  C.

Air cup.  The Wright brothers’ planes had similar designs, so no credit again.  The author assumes that personal air travel will become commonplace, and at low speeds and heights.  This is almost completely false.  However, unfortunately, and hopefully only very occasionally, some pilots do enjoy one for the road.  Grade:  D.

Race in Pacific.  The author imagines that the public spectacle of horse racing will move underwater and become some kind of fish racing.  Ridiculous.  Also a complete failure to envision the modern popularity of auto racing which already began in Paris in 1887.  Grade:  F.

Rota’s predictions on combinatorial problems:

In his paper, Rota writes:

Fortunately, most combinatorial problems can be stated in everyday language. To give an idea of the present state of the field, we have selected a few of the many problems that are now being actively worked upon.

We take each of these “problems” as a kind of predictions of where the field is going.  Here are my (biased and possibly uninformed) grades for each problem he mentions.

1)  The Ising Problem.  I think it is fair to say that since 1969 combinatorics made no contribution in this direction.  While physicists and probabilists continue studying this problem, there is no exact solution in dimension 3 and higher.  Grade: F.

2)  Percolation Theory.  The study of percolation completely exploded since 1969 and is now a subject of numerous articles in both probability, statistical physics and combinatorics, as well as several research monographs.  One connection is given by an observation that p-percolation on a complete graph Kn gives the Erdős–Rényi random graph model. Even I accidentally wrote a few papers on the subject some years ago (see one, two and three).  Grade: A.
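The observation about the complete graph can be illustrated with a minimal sketch: keeping each edge of Kn independently with probability p is exactly how one samples the Erdős–Rényi random graph G(n, p). The function name `percolate_complete_graph` and its parameters are hypothetical, chosen here only for illustration.

```python
import random
from itertools import combinations


def percolate_complete_graph(n, p, rng=None):
    """Bond percolation on the complete graph K_n with parameter p.

    Each of the n*(n-1)/2 edges is kept independently with probability p.
    The surviving edge set is a sample of the Erdős–Rényi graph G(n, p).
    (Hypothetical helper for illustration; not taken from any paper.)
    """
    rng = rng or random.Random()
    # combinations(range(n), 2) enumerates every edge of K_n exactly once.
    return [edge for edge in combinations(range(n), 2) if rng.random() < p]


if __name__ == "__main__":
    edges = percolate_complete_graph(100, 0.05, random.Random(2014))
    # The expected number of open edges is p * n * (n - 1) / 2.
    print(len(edges), "open edges; expected about", 0.05 * 100 * 99 / 2)
```

With p = 0 no edge survives and with p = 1 the whole of Kn is returned; the interesting behavior (the emergence of a giant component) happens for p near 1/n.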

3)  The Number of Necklaces, and Polya’s Problem.  Taken literally, the necklaces do come up in combinatorics of words and free Lie algebra, but this context was mentioned by Rota already. As far as I can tell, there are various natural (and interesting) generalizations of necklaces, but none surprising.  Of course, if the cyclic/dihedral group action here is replaced by other actions, e.g. the symmetric group, then modern developments are abundant.  But I think it’s a reach too far, since Rota knew the works of Young, MacMahon, Schur and others but does not mention any of it.  Similarly, Polya’s theorem used to be included in all major combinatorics textbooks (and is included now, occasionally), but is rarely taught these days.  Simply put, the g.f. implications haven’t proved useful.  Grade: D.

4)  Self-avoiding Walk. Despite strong interest, until recently there were very few results in the two-dimensional case (some remarkable results were obtained in higher dimensions). While the recent breakthrough results (see here and there) do use some interesting combinatorics, the authors’ motivation comes from probability. Combinatorialists did of course contribute to the study of somewhat related questions on enumeration of various classes of polyomino (which can be viewed as self-avoiding cycles in the grid, see e.g. here).  Grade: C.

5)  The Traveling Salesman Problem. This is a fundamental problem in optimization theory, connected to the study of Hamiltonian cycles in Graph Theory and numerous other areas. It is also one of the earliest NP-hard problems still playing a benchmark role in Theoretical Computer Science. No quick summary of the progress in the past 45 years would do it justice. Note that Rota’s paper was written before the notion of NP-completeness. In this light, his emphasis on algorithmic complexity and allusions to Computability Theory (e.g. unsolvable problems) are most prescient.  So are his briefly mentioned connections to topology which are currently a popular topic.  Well done!  Grade: A+.

6)  The Coloring Problem.  This was a popular topic way before Rota’s article (inspired by the Four Color Theorem, the chromatic polynomial, etc.), and continues to be even more so, with truly remarkable advances in multiple directions.  Note Rota’s mention of matroids, which may seem extraneous here at first, but is in fact absolutely relevant (in part due to Rota’s then-ongoing effort).  Very good but unsurprising prediction.  Grade: A-.

7)  The Pigeonhole Principle and Ramsey’s Theorem. Extremal Graph Theory was about to explode in many directions, with the Erdős-Stone-Simonovits theorem proved just a few years earlier and the Szemerédi regularity lemma a few years later.  Still, Rota never mentions Paul Erdős and his collaborators, nor any of these results, nor potential directions.  What a missed opportunity!  Grade: B+.

Rota’s predictions on combinatorial areas:

In the concluding section “The Coming Explosion”, Rota sets this up as follows:

Before concluding this brief survey, we shall list the main subjects in which current work in combinatorial theory is being done.

Here is a list and more of my comments.

1)  Enumerative Analysis.  Sure.  But this was an easy prediction to make given the ongoing effort by Carlitz, Polya, Riordan, Rota himself and many other people.  Grade: A-.

2)  Finite Geometries and Block Designs.  The subject was already popular and it did continue to develop but perhaps at a different pace and in different directions than Rota anticipated (Hadamard matrices, tools from Number Theory).  In fact, a lot of later work was in connection with Group Theory (including some applications of CFSG which was an ongoing project) and in Coding Theory (as Rota predicted).  Grade: B-.

3)  Applications to Logic.  Rota gives a one-sentence description:

The development of decision theory has forced logicians to make wide use of combinatorial methods.

There are various important connections between Logic and Combinatorics, for example in Descriptive Set Theory (see e.g. here or more recent work by my future UCLA colleague there).  Note however, that Infinitary Combinatorics was already under development, after the Erdős-Rado theorem (1956).  Another very interesting and more recent connection is to Model Theory (see e.g. here).  But the best interpretation I can think of here is the numerous applications to Game Theory, which already existed (Nash’s equilibrium theorem is from 1950) and was under rapid development.  Either way, Rota was too vague in this case to be given much credit.  Grade: C.

4)  Statistical Mechanics.   He mentions the Ising model again and insists on “close connections with number theory”.  One can argue this all to be too vague or misdirected, but the area does indeed explode in part in the directions of problems Rota mentions earlier. So I am inclined to give him benefit of the doubt on this one. Grade: A-.

The final grade

In total, Rota clearly got more things right than wrong.  He displayed an occasional clairvoyance, had some very clever insights into the future, but also a few flops.  Note also the near complete lack of self-serving predictions, such as the Umbral Calculus that Rota was very fond of.  Since predictions are hard, successes have a greater weight than failures.  I would give a final grade somewhere between A- and B+ depending on how far into the future we think he was making the predictions.  Overall, good job, Gian-Carlo!

P.S.  Full disclosure:  I took a few advanced classes with Gian-Carlo Rota as a graduate student cross registering from Harvard to MIT, and he may have graded my homeworks (this was in 1994-1996 academic years).  I don’t recall the final grades, but I think they were good.  Eventually Rota wrote me a letter of recommendation for a postdoc position.

How NOT to reference papers

September 12, 2014

In this post, I am going to tell a story of one paper and its authors which misrepresented my paper and refused to acknowledge the fact. It’s also a story about the section editor of Journal of Algebra which published that paper and then ignored my complaints. In my usual wordy manner, I do not get to the point right away, and cover some basics first. If you want to read only the juicy parts, just scroll down…

What’s the deal with the references?

First, let’s talk about something obvious. Why do we do what we do? I mean, why do we study for many years how to do research in mathematics, read dozens or hundreds of papers, think long thoughts until we eventually figure out a good question. We then work hard, trial-and-error, to eventually figure out a solution. Sometimes we do this in a matter of hours and sometimes it takes years, but we persevere. Then write up a solution, submit to a journal, sometimes get rejected (who knew this was solved 20 years ago?), and sometimes sent for revision with various lemmas to fix. We then revise the paper, and if all goes well it gets accepted. And published. Eventually.

So, why do we do all of that? For the opportunity to teach at a good university and derive a reasonable salary? Yes, sure, to some degree. But mostly because we like doing this. And we like having our work appreciated. We like going to conferences to present it. We like it when people read our paper and enjoy it or simply find it useful. We like it when our little papers form building stones towards bigger work, perhaps eventually helping to resolve an old open problem. All this gives us purpose, a sense of accomplishment, a “social capital” if you like fancy terms.

But all this hinges on a tiny little thing we call citations. They tend to come at the end, sometimes in footnote size, and are the primary vehicle for our goal. If we are uncited, ignored, all hope is lost. But even if we are cited, it matters how our work is cited. The context in which it is referenced is critically important. Sometimes our results are substantially used in the proof; those are GOOD references.

Yet often our papers are mentioned in a sentence “See [..] for the related results.” Sometimes this happens out of politeness or collegiality between authors, sometimes for the benefit of the reader (it can be hard navigating a field), and sometimes the authors are being self-serving (as in “look, all these cool people wrote good papers on this subject, so my work must also be good/important/publishable”). These are NEUTRAL references – they might help others, but not the authors.

Finally, there are BAD references. Those which refer derogatively to your work, or simply as a low benchmark which the new paper easily improved. Those which say “our bound is terribly weak, but it’s certainly better than Pak’s.” But the WORST references are those which misstate what you did, which diminish and undermine your work.

So for anyone out there who thinks the references are in the back because they are not so important – think again. They are of utmost importance – they are what makes the system work.

The story of our paper

This was in June 1997. My High School friend Sergey Bratus and I had an idea of recognizing the symmetric group Sn using the Goldbach conjecture. The idea was nice and the algorithm was short and worked really fast in practice. We quickly typed it up and submitted to the Journal of Symbolic Computation in September 1997. The journal gave us a lot of grief. First, they refused to seriously consider it since the Goldbach conjecture in the referee’s words is “not like the Riemann hypothesis”, so we could not use it. Never mind that it was checked for n < 10^14, covering all possible values where such an algorithm could possibly be useful. So we rewrote the paper by adding a variation based on the ternary Goldbach conjecture which was known for large enough values (and now proved in full).

The paper had no errors, resolved an open problem, but the referees were unhappy. One of them requested we change the algorithm to also work for the alternating group. We did. In the next round the same or another requested we cover the case of unknown n. We did. In the next round one referee requested we make a new implementation of the algorithm, now in GAP and report the results. We did. Clearly, the referees did not want our paper to get published, but did not know how to say it. Yet we persevered. After 4 back and forth revisions the paper more than doubled in size (completely unnecessarily). This took two years, almost to the day, but the paper did get accepted and published. Within a year or two, it became a standard routine in both GAP and MAGMA libraries.

[0] Sergey Bratus and Igor Pak, Fast constructive recognition of a black box group isomorphic to Sn or An using Goldbach’s Conjecture, J. Symbolic Comput. 29 (2000), 33–57.

Until a few days ago I never knew what problem the referees had with our paper. Why did a short, correct and elegant paper need to become long to include cumbersome extensions of the original material for the journal to accept it? I was simply too inexperienced to know that this is not the difference in culture (CS vs. math). Read on to find out what I now realized.

Our competition

After we wrote our paper, submitted it and publicized it on our websites and at various conferences, I started noticing strange things. In paper after paper in Computational Group Theory, roughly a half would not reference our paper, but would cite another paper by 5 people in the field which apparently was doing the same or similar things. I recall I wrote to the authors of this competing paper, but they wrote back that the paper is not written yet. To say I was annoyed is to understate the feeling.

In one notable instance, I confronted Bill Kantor (by email) who helped us with good advice earlier. He gave an ICM talk on the subject and cited a competition paper but not ours, even though I personally showed him the submitted preprint of [0] back in 1997, and explained our algorithm. He replied that he did not recall whether we sent him the paper. I found and forwarded him my email to him with that paper. He replied that he probably never read the email. I forwarded him back his reply on my original email. Out of excuses, Kantor simply did not reply. You see, the calf can never beat the oak tree.

Eventually, the competition paper was published 3 years after our paper:

[1] Robert Beals, Charles Leedham-Green, Alice Niemeyer, Cheryl Praeger, Ákos Seress, A black-box group algorithm for recognizing finite symmetric and alternating groups. I, Trans. AMS 355 (2003), 2097–2113.

The paper claims that the sequel II by the same authors is forthcoming, but it has yet to appear. It was supposed to cover the case of unknown n, which [0] was required to cover, but I guess the same rules do not apply to [1]. Or maybe JSC is more selective than TAMS, one never knows… The never-coming sequel II will later play a crucial part in our story.

Anyhow, it turns out, the final result in [1] is roughly the same as in [0]. Although the details are quite different, it wasn’t really worth the long wait. I quote from [1]:

The running time of constructive recognition in [0] is about the same.

The authors then show an incredible dexterity in an effort to claim that their result is better somehow, by finding minor points of differences between the algorithms and claiming their importance. For example, take a look at this passage:

The paper [0] describes the case G = Sn, and sketches the necessary modifications for the case G = An. In this paper, we present a complete argument which works for both cases. The case G = An is more complicated, and it is the more important one in applications.

Let me untangle this. First, what’s more “important” in applications is never justified and no sources were cited. Second, this says that BLNPS either haven’t read [0] or are intentionally misleading, as the case of An there is essentially the same as Sn, and the timing is off by a constant. On the other hand, this suggests that [1] treats An in a substantively more complicated way than Sn. Shouldn’t that be an argument in favor of [0] over [1], not the other way around? I could go on with other similarly dubious claims.

The aftermath

From this point on, multiple papers either ignored [0] in favor of [1] or cited [0] pro forma, emphasizing [1] as the best result somehow. For example, the following paper with 3 out of 5 coauthors of [1] goes to great lengths touting [1] and never even mentions [0].

[2] Alice Niemeyer, Cheryl Praeger, Ákos Seress, Estimation Problems and Randomised Group Algorithms, Lecture Notes in Math. 2070 (2013), 35–82.

When I asked Niemeyer as to how this could have happened, she apologized and explained: “The chapter was written under great time pressure.”

For an example of a more egregious kind, consider this paper:

[3] Robert Beals, Charles Leedham-Green, Alice Niemeyer, Cheryl Praeger, Ákos Seress, Constructive recognition of finite alternating and symmetric groups acting as matrix groups on their natural permutation modules, J. Algebra 292 (2005), 4–46.

They unambiguously claim:

The asymptotically most efficient black-box recognition algorithm known for An and Sn is in [1].

Our paper [0] is not mentioned anywhere near, and cited pro forma for other reasons. But just two years earlier, the exact same 5 authors state in [1] that the timing is “about the same”. So, what has happened to our algorithm in the intervening two years? It slowed down? Or perhaps the one in [1] got faster? Or, more plausibly, BLNPS simply realized that they can get away with more misleading referencing at JOA, than TAMS would ever allow?

Again, I could go on with a dozen other examples of this phenomenon. But you get the idea…

My boiling point: the 2013 JOA paper

For years, I held my tongue, thinking that in the age of Google Scholar these self-serving passages are not fooling anybody, that anyone interested in the facts is just a couple of clicks away from our paper. But I was naive. This strategy of ignoring and undermining [0] eventually paid off in this paper:

[4] Sebastian Jambor, Martin Leuner, Alice Niemeyer, Wilhelm Plesken, Fast recognition of alternating groups of unknown degree, J. Algebra 392 (2013), 315–335.

The abstract says it all:

We present a constructive recognition algorithm to decide whether a given black-box group is isomorphic to an alternating or a symmetric group without prior knowledge of the degree. This eliminates the major gap in known algorithms, as they require the degree as additional input.

And just to drive the point home, here is the passage from the first paragraph in the introduction.

For the important infinite family of alternating groups, the present black-box algorithms [0], [1] can only test whether a given black-box group is isomorphic to an alternating or a symmetric group of a particular degree, provided as additional input to the algorithm.

Ugh… But wait, our paper [0] they are citing already HAS such a test! And it’s not like it is hidden in the paper somehow – Section 9 is titled “What to do if n is not known?” Are the authors JLNP blind, intentionally misleading or simply never read [0]? Or is it the “great time pressure” argument again? What could possibly justify such an outrageous error?

Well, I wrote to JLNP but none of them answered. Nor acknowledged our priority. Nor updated the arXiv posting to reflect the error. I don’t blame them – people without academic integrity simply don’t see the need for that.

My disastrous battle with JOA

Once I realized that JLNP are not interested in acknowledging our priority, I wrote to the Journal of Algebra asking “what can be done?” Here is a copy of my email. I did not request a correction, and was unbelievably surprised to hear the following from Gerhard Hiss, the Editor of the Section on Computational Algebra of the Journal of Algebra:

[..] the authors were indeed careless in this attribution.

In my opinion, the inaccuracies in the paper “Fast recognition of alternating groups of unknown degree” are not sufficiently serious to make it appropriate for the journal to publish a correction.

Although there is some reason for you to be mildly aggrieved, the correction you ask for appears to be inappropriate. This is also the judgment of the other editors of the Computational Algebra Section, who have been involved in this discussion.

I have talked to the authors of the paper Niemeyer et al. and they confirmed that the [sic.] did not intend to disregard your contributions to the matter.

Thus I very much regret this unpleasent [sic.] situation and I ask you, in particular with regard to the two young authors of the paper, to leave it at that.

This email left me floored. So, I was graciously permitted by the JOA to be “mildly aggrieved“, but not more? Basically, Hiss is saying that the answer to my question “What can be done?” is NOTHING. Really?? And I should stop asking for just treatment by the JOA out of “regard to the two young authors”? Are you serious??? It’s hard to know where to begin…

As often happens in such cases, an unpleasant email exchange ensued. When I complained to Michel Broué, he responded that Gerhard Hiss is a “respectable man” and that I should search for justice elsewhere.

In all fairness to JOA, one editor did behave honorably. Derek Holt wrote to me directly. He admitted that he was the handling editor for [4]. He writes:

Although I did not referee the paper myself, I did read through it, and I really should have spotted the completely false statement in the paper that you had not described any algorithm for determining the degree n of An or Sn in your paper with Bratus. So I would like to apologise now to you and Bratus for not spotting that. I almost wrote to you back in January when this discussion first started, but I was dissuaded from doing so by the other editors involved in the discussion.

Let me parse this, just in case. Holt is the person who implemented the Bratus-Pak algorithm in Magma. Clearly, he read the paper. He admits the error and our priority, and says he wanted to admit it publicly but other unnamed editors stopped him. Now, what about this alleged unanimity of the editorial board? What am I missing? Ugh…

What really happened? My speculation, part I. The community.

As I understand it, Computational Group Theory is a small close-knit community which as a result has a pervasive groupthink. Here is a passage from Niemeyer’s email to me:

We would also like to take this opportunity to mention how we came about our algorithm. Charles Leedham-Green was visiting UWA in 1996 and he worked with us on a first version of the algorithm. I talked about that in Oberwolfach in mid 1997 (abstract on OW Web site).

The last part is true indeed. The workshop abstracts are here. Niemeyer’s abstract did not mention Leedham-Green nor anyone else she meant by “us” (from the context – Niemeyer and Praeger), but let’s not quibble. The 1996 date is somewhat more dubious. It is contradicted by Niemeyer and Praeger, who themselves clarified the timeline in the talk they gave in Oberwolfach in mid 2001 (see the abstract here):

This work was initiated by intense discussions of the speakers and their colleagues at the Computational Groups Week at Oberwolfach in 1997.

Anyhow, we accept that both algorithms were obtained independently, in mid-1997. It’s just that we finished our paper [0] in 3 months, while it took BLNPS about 4 years until it was submitted in 2001.

Next quote from Niemeyer’s email:

So our work was independent of yours. We are more than happy to acknowledge that you and Sergey [Bratus] were the first to come up with a polynomial time algorithm to solve the problem [..].

The second statement is just not true in many ways, nor is this our grievance as we only claim that [0] has a practically superior and theoretically comparable algorithm to that in [1], so there is no reason at all to single out [1] over [0] as is commonly done in the field. In fact, here is a quote from [1] fully contradicting Niemeyer’s claim:

The first polynomial-time constructive recognition algorithm for symmetric and alternating groups was described by Beals and Babai.

Now, note that Hiss, Holt, Kantor and all 5 authors BLNPS were at both the 1997 and the 2001 Oberwolfach workshops (neither Bratus nor I were invited). We believe that the whole community operates by “they made a stake on this problem” and “what hasn’t happened at Oberwolfach, hasn’t happened.” Such principles make it easier for members of the community to treat BLNPS as pioneers of this problem, and only reference them even though our paper was published before [1] was submitted. Of course, such attitudes also remove a competitive pressure to quickly write the paper – where else in Math and especially CS do people take 4-5 years(!) to write a technically elementary paper? (this last part was true also for [0], which is why we could write it in under 3 months).

In 2012, Niemeyer decided to finally finish the long announced part II of [1]. She did not bother to check what’s in our paper [0]. Indeed, why should she – everyone in the community already “knows” that she is the original (co-)author of the idea, so [4] can also be written as if [0] never happened. Fortunately for her, she was correct on this point as neither the referees nor the handling editor, nor the section editor contradicted false statements right in the abstract and the introduction.

My speculation, part II. Why the JOA rebuke?

Let’s look at the timing. In the Fall 2012, Niemeyer visited Aachen. She started collaborating with Professor Plesken from RWTH Aachen and his two graduate students: Jambor and Leuner. The paper was submitted to JOA on December 21, 2012, and the published version lists affiliation of all but Jambor to be in Aachen (Jambor moved to Auckland, NZ before the publication).

Now, Gerhard Hiss is a Professor at RWTH Aachen, working in the field. To repeat, he is the Section Editor of JOA on Computational Algebra. Let me note that [4] was submitted to JOA three days before Christmas 2012, on the same day (according to a comment I received from Eamonn O’Brien from JOA editorial board), on which apparently Hiss and Niemeyer attended a department Christmas party.

My questions: is it fair for a section editor to be making a decision contesting results by a colleague (Plesken), two graduate students (Jambor and Leuner), and a friend (Niemeyer), all currently or recently from his department? Wouldn’t the immediate recusal by Editor Hiss and investigation by an independent editor be a more appropriate course of action under the circumstances? In fact, this is a general Elsevier guideline if I understand it correctly.

What now?

Well, I am at the end of the line on this issue. Public shaming is the only thing that can really work against groupthink. To spread the word, please LIKE this post, REPOST it, here on WP, on FB, on G+, forward it by email, or do whatever you think appropriate. Let’s make sure that whenever somebody googles these names, this post comes up on top of the search results.

P.S. Full disclosure: I have one paper in the Journal of Algebra, on an unrelated subject. Also, I am an editor of Discrete Mathematics, which together with JOA is owned by the same parent company Elsevier.

UPDATE (September 17, 2014): I am disallowing all comments on this post as some submitted comments were crude and/or offensive. I am however agreeing with some helpful criticism. Some claimed that I crossed the line with some personal speculations, so I removed a paragraph. Also, Eamonn O’Brien clarified for me the inner workings of the JOA editorial board, so I removed my incorrect speculations on that point. Neither is germane to my two main complaints: that [0] is repeatedly mistreated in the area, most notably in [4], and that Editor Hiss should have recused himself from handling my formal complaint on [4].

UPDATE (October 14, 2014): In the past month, over 11K people viewed this post (according to the WP stat tools). This is a simply astonishing number for an inactive blog. Thank you all for spreading the word, whether supportive or otherwise! Special thanks to those of you in the field, who wrote heartfelt emails, also some apologetic and some critical – this was all very helpful.

Who named Catalan numbers?

February 5, 2014 2 comments

The question.  A year ago, on this blog, I investigated  Who computed Catalan numbers.  Short version: it’s Euler, but many others did a lot of interesting work soon afterwards.  I even made a  Catalan Numbers Page  with many historical and other documents.  But I always assumed that the dubious honor of naming them after Eugène Catalan belongs to Netto.  However, recently I saw this site which suggested that it was E.T. Bell who named the sequence.  This didn’t seem right, as Bell was both a notable combinatorialist and mathematical historian.  So I decided to investigate who did the deed.

First, I looked at Netto’s Lehrbuch der Combinatorik (1901).  Although my German is minuscule and based on my knowledge of English and Yiddish (very little of the latter, to be sure), it was clear that Netto simply preferred counting of Catalan’s brackets to triangulations and other equivalent combinatorial interpretations.  He did single out Catalan’s work, but mentioned Rodrigues’s work as well.  In general, Netto wasn’t particularly careful with the references, but in fairness neither were most of his contemporaries.  In any event, he never specifically mentioned “Catalan Zahlen”.

Second, I checked the above mentioned 1938 paper by Bell in the Annals.  As I suspected, Bell mentioned “Catalan’s numbers” only in passing, and not in a way to suggest that Catalan invented them.   In fact, he used the term “Euler-Segner sequence” and provided careful historical and more recent references.

Next on my list was John Riordan’s Math Review MR0024411, of this 1948 paper by Motzkin.  The review starts with “The Catalan numbers…”, and indeed might have been the first time this name was introduced.  However, it is naive to believe that this MR moved many people to use this expression over the arguably more cumbersome “Euler-Segner sequence”.  In fact, Motzkin himself is very careful to cite Euler, Cayley, Kirkman, Liouville, and others.  My guess is this review was immediately forgotten, but was a harbinger of things to come.

Curiously, Riordan does this again in 1964, in a Math Review of an English translation of a popular mathematics book by A.M. Yaglom and I.M. Yaglom (published in Russian in 1954).  The book mentions the sequence in the context of counting triangulations of an n-gon, without calling it by any name, but Riordan recognizes it and uses the term “Catalan numbers” in the review.

The answer.  To understand what really happened, see this Ngram chart.  It clearly shows that the term “Catalan numbers” took off after 1968.  What happened?  Google Books immediately answers – Riordan’s Combinatorial Identities was published in 1968 and it used “the Catalan numbers”.  The term took off and became standard within a few years.  

What gives?  It seems people really like to read books.  Intentionally or unintentionally, monographs tend to standardize the definitions, notations, and names of mathematical objects.  In his notes on Mathematical writing, Knuth mentions that the term “NP-complete problem” became standard after it was used by Aho, Hopcroft and Ullman in their famous Data Structures and Algorithms textbook.  Similarly, Macdonald’s Symmetric Functions and Hall Polynomials became a standard source of names for everything in the area, just as Stanley predicted in his prescient review.

The same thing happened to Riordan’s book.  Although it may now be viewed as tedious, somewhat disorganized and unnecessarily simplistic (Riordan admitted to disliking differential equations, complex analysis, etc.), back in the day there was nothing better.  It was lauded as “excellent and stimulating” in P.R. Stein’s review, which went on to say “Combinatorial identities is, in fact, a book that must be read, from cover to cover, and several times.”  We are guessing it had a tremendous influence on the field and cemented the terminology and some notation.

In conclusion.  We don’t know why Riordan chose the term “Catalan numbers”.  As Motzkin’s paper shows, he clearly knew of Euler’s pioneering work.  Maybe he wanted to honor Catalan for his early important work on the sequence.  Or maybe he just liked the way it sounds.  But Riordan clearly made a conscious decision to popularize the term back in 1948, and eventually succeeded.

UPDATE (Feb. 8, 2014)  Looks like Henry Gould agrees with me (ht. Peter Luschny).  He is, of course, the author of a definitive bibliography of Catalan numbers.  Also, see this curious argument against naming mathematical terms after people (ht. Reinhard Zumkeller).

UPDATE (Aug 25, 2014):  I did more historical research on the subject, which is now reworked into the article History of Catalan Numbers.

UPDATE (Oct 13, 2016):  I came across a quote from Riordan himself (see below) published in this book review.  In light of our investigation, this can be read as a tacit admission that he misnamed the sequence.  Note that Riordan seemed genially contrite yet unaware of the fact that Catalan learned about the sequence from Liouville, who knew about Euler and Segner’s work.  So the “temporary blindness” he is alleging is perhaps misaddressed…

“Nevertheless, the pursuit of originality and generality has its perils. For one
thing, the current spate of combinatorial mappings has produced the feeling
that multiplicity abounds. Perhaps the simplest example is the continuing
appearances of the Catalan numbers [..] Incidentally, these numbers
are named after E. Catalan because of a citation in Netto’s Kombinatorik, in
relation to perhaps the simplest bracketing problem, proposed in 1838. An
earlier appearance, which I first learned from Henry Gould, is due to the
Euler trio, Euler-Fuss-Segner, dated 1761. There are now at least forty
mappings, hence, forty diverse settings for this sequence; worse still, no end
seems in sight. In this light, the Catalan (or Euler-Fuss-Segner) originality
may be regarded as temporary blindness.”