What we’ve got here is failure to communicate

September 14, 2018 19 comments

Here is a lengthy and somewhat detached followup discussion on the very unfortunate Hill’s affair, which is much commented by Tim Gowers, Terry Tao and many others (see e.g. links and comments on their blog posts).  While many seem to be universally distraught by the story and there are some clear disagreements on what happened, there are even deeper disagreements on what should have happened.  The latter question is the subject of this blog post.

Note:  Below we discuss both the ethical and moral aspects of the issue.  Be patient before commenting your disagreements until you finish the reading — there is a lengthy disclaimer at the end.

Review process:

  1. When the paper is submitted there is a very important email acknowledging receipt of the submission.  Large publishers have systems send such emails automatically.  Until this email is received, the paper is not considered submitted.  For example, it is not unethical for the author to get tired of waiting to hear from the journal and submit elsewhere instead.  If the journal later comes back and says “sorry for the wait, here are the reports”, the author should just inform the journal that the paper is under consideration elsewhere and should be considered withdrawn (this happens sometimes).
  2. Similarly, there is a very important email acknowledging acceptance of the submission.  Until this point the editors ethically can do as they please, even reject the paper with multiple positive reports.  Morality of the latter is in the eye of the beholder (cf. here), but there are absolutely no ethical issues here unless the editor violated the rules set up by the journal.  In principle, editors can and do make decisions based on informal discussions with others, this is totally fine.
  3. If a journal withdraws acceptance after the formal acceptance email is sent, this is potentially a serious violation of ethical standards.  Major exception: this is not unethical if the journal follows a certain procedural steps (see the section below).  This should not be done except for some extreme circumstances, such as last minute discovery of a counterexample to the main result which the author refuses to recognize and thus voluntarily withdraw the paper.   It is not immoral since until the actual publication no actual harm is done to the author.
  4. The next key event is publication of the article, whether online of in print, usually/often coupled with the transfer of copyright.  If the journal officially “withdraws acceptance” after the paper is published without deleting the paper, this is not immoral, but depends on the procedural steps as in the previous item.
  5. If a journal deletes the paper after the publication, online or otherwise, this is a gross violation of both moral and ethical standards.  The journals which do that should be ostracized regardless their reasoning for this act.  Major exception: the journal has legal reasoning, e.g. the author violated copyright laws by lifting from another published article as in the Dănuț Marcu case (see below).

Withdrawal process:

  1.  As we mentioned earlier, the withdrawal of accepted or published article should be extremely rare, only in extreme circumstances such as a major math error for a not-yet-published article or a gross ethical violation by the author or by the handling editor of a published article.
  2. For a published article with a major math error or which was later discovered to be known, the journal should not withdraw the article but instead work with the author to publish an erratum or an acknowledgement of priority.  Here an erratum can be either fixing/modifying the results, or a complete withdrawal of the main claim.  An example of the latter is an erratum by Daniel Biss.  Note that the journal can in principle publish a note authored by someone else (e.g. this note by Mnёv in the case of Biss), but this should be treated as a separate article and not a substitute for an erratum by the author.  A good example of acknowledgement of priority is this one by Lagarias and Moews.
  3. To withdraw the disputed article the journal’s editorial board should either follow the procedure set up by the publisher or set up a procedure for an ad hoc committee which would look into the paper and the submission circumstances.  Again, if the paper is already published, only non-math issues such as ethical violations by the author, referee(s) and/or handling editor can be taken into consideration.
  4. Typically, a decision to form an ad hoc committee or call for a full editorial vote should me made by the editor in chief, at the request of (usually at least two) members of the editorial board.  It is totally fine to have a vote by the whole editorial board, even immediately after the issue was raised, but the threshold for successful withdrawal motion should be set by the publisher or agreed by the editorial board before the particular issue arises.  Otherwise, the decision needs to be made by consensus with both the handling editor and the editor in chief abstaining from the committee discussion and the vote.
  5. Examples of the various ways the journals act on withdrawing/retracting published papers can be found in the case of notorious plagiarist Dănuț Marcu.  For example, Geometria Dedicata decided not to remove Marcu’s paper but simply issued a statement, which I personally find insufficient as it is not a retraction in any formal sense.  Alternatively, SUBBI‘s apology is very radical yet the reasoning is completely unexplained. Finally, Soifer’s statement on behalf of Geombinatorics is very thorough, well narrated and quite decisive, but suffers from authoritarian decision making.
  6. In summary, if the process is set up in advance and is carefully followed, the withdrawal/retraction of accepted or published papers can be both appropriate and even desirable.  But when the process is not followed, such action can be considered unethical and should be avoided whenever possible.

Author’s rights and obligations:

  1. The author can withdraw the paper at any moment until publication.  It is also author’s right not to agree to any discussion or rejoinder.  The journal, of course, is under no obligation to ask the author’s permission to publish a refutation of the article.
  2. If the acceptance is issued, the author has every right not go along with the proposed quiet withdrawal of the article.  In this case the author might want to consider complaining to the editor in chief or the publisher making the case that the editors are acting inappropriately.
  3. Until acceptance is issued, the author should not publicly disclose the journal where the paper is submitted, since doing so constitutes a (very minor) moral violation.  Many would disagree on this point, so let me elaborate.  Informing the public of the journal submission is tempting people in who are competition or who have a negative opinion of the paper to interfere with the peer review process.  While virtually all people virtually all the time will act honorably and not contact the journal, such temptation is undesirable and easily avoidable.
  4. As soon as the acceptance or publication is issued, the author should make this public immediately, by the similar reasoning of avoiding temptation by the third parties (of different kind).

Third party outreach:

  1.  If the paper is accepted but not yet published, reaching out to the editor in chief by a third party requesting to publish a rebuttal of some kind is totally fine.  Asking to withdraw the paper for mathematical reasons is also fine, but should provide a clear formal math reasoning as in “Lemma 3 is false because…”  The editor then has a choice but not an obligation to trigger the withdrawal process.
  2. Asking to withdraw the not-yet-published paper without providing math reasoning, but saying something like “this author is a crank” or “publishing this paper will do bad for your reputation” is akin to bullying and thus a minor ethical violation.  The reason it’s minor is because it is journal’s obligations to ignore such emails.  Journal acting on such an email with rumors or unverified facts is an ethical violation on its own.
  3. If a third party learns about a publicly available paper which may or may not be an accepted submission with which they disagree for math or other reason, it it ethical to contact the author directly.  In fact, in case of math issues this is highly desirable.
  4. If a third party learns about a paper submission to a journal without being contacted to review it, and the paper is not yet accepted, then contacting the journal is a strong ethical violation.  Typically, the journal where the paper is submitted it not known to public, so the third party is acting on the information it should not have.  Every such email can be considered as an act of bullying no matter the content.
  5. In an unlikely case everything is as above but the journal’s name where the paper is submitted is publicly available, the third party can contact the journal.  Whether this is ethical or not depends on the wording of the email.  I can imagine some plausible circumstances when e.g. the third party knows that the author is Dănuț Marcu mentioned earlier.  In these rare cases the third party should make every effort to CC the email to everyone even remotely involved, such as all authors of the paper, the publisher, the editor in chief, and perhaps all members of the editorial board.  If the third party feels constrained by the necessity of this broad outreach then the case is not egregious enough, and such email is still bullying and thus unethical.
  6. Once the paper is published anyone can contact the journal for any reason since there is little can be done by the journal beyond what’s described above.  For example, on two different occasions I wrote to journals pointing out that their recently published results are not new and asking them to inform the authors while keeping my anonymity.  Both editors said they would.  One of the journals later published an acknowledgement of retribution.  The other did not.

Editor’s rights and obligations:

  1. Editors have every right to encourage submissions of papers to the journal, and in fact it’s part of their job.  It is absolutely ethical to encourage submissions from colleagues, close relatives, political friends, etc.  The publisher should set up a clear and unobtrusive conflict of interest directive, so if the editor is too close to the author or the subject he or she should transfer the paper to the editor in chief who will chose a different handling editor.
  2. The journal should have a clear scope worked out by the publisher in cooperation with the editorial board.  If the paper is outside of the scope it should be rejected regardless of its mathematical merit.  When I was an editor of Discrete Mathematics, I would reject some “proofs” of the Goldbach conjecture and similar results based on that reasoning.  If the paper prompts the journal to re-evaluate its scope, it’s fine, but the discussion should involve the whole editorial board and irrespective of the paper in question.  Presumably, some editors would not want to continue being on the board if the journal starts changing direction.
  3. If the accepted but not yet published paper seems to fall outside of the journal’s scope, other editors can request the editor in chief to initiate the withdrawal process discussed above.  The wording of request is crucial here – if the issue is neither the the scope nor the major math errors, but rather the weakness of results, then this is inappropriate.
  4. If the issue is the possibly unethical behavior of the handling editor, then the withdrawal may or may not be appropriate depending on the behavior, I suppose.  But if the author was acting ethically and the unethical behavior is solely by the handling editor, I say proceed to publish the paper and then issue a formal retraction while keeping the paper published, of course.

Complaining to universities:

  1. While perfectly ethical, contacting the university administration to initiate a formal investigation of a faculty member is an extremely serious step which should be avoided if at all possible.  Except for the egregious cases of verifiable formal violations of the university code of conduct (such as academic dishonesty), this action in itself is akin to bullying and thus immoral.
  2. The code of conduct is usually available on the university website – the complainer would do well to consult it before filing a complaint.  In particular, the complaint would typically be addressed to the university senate committee on faculty affairs, the office of academic integrity and/or dean of the faculty.  Whether the university president is in math or even the same area is completely irrelevant as the president plays no role in the working of the committee.  In fact, when this is the case, the president is likely to recuse herself or himself from any part of the investigation and severe any contacts with the complainer to avoid appearance of impropriety.
  3. When a formal complaint is received, the university is usually compelled to initiate an investigation and set up an ad hoc subcommittee of the faculty senate which thoroughly examines the issue.  Faculty’s tenure and life being is on the line.  They can be asked to retain legal representation and can be prohibited from discussing the matters of the case with outsiders without university lawyers and/or PR people signing on every communication.  Once the investigation is complete the findings are kept private except for administrative decisions such as firing, suspension, etc.  In summary, if the author seeks information rather than punishment, this is counterproductive.

Complaining to institutions:

  1. I don’t know what to make of the alleged NSF request, which could be ethical and appropriate, or even common.   Then so would be complaining to the NSF on a publicly available research product supported by the agency.  The issue is the opposite to that of the journals — the NSF is a part of the the Federal Government and is thus subject to a large number of regulations and code of conduct rules.  These can explain its request.  We in mathematics are rather fortunate that our theorems tend to lack any political implications in the real world.  But perhaps researchers in Political Science and Sociology have different experiences with granting agencies, I wouldn’t know.
  2. Contacting the AMS can in fact be rather useful, even though it currently has no way to conduct an appropriate investigation.  Put bluntly, all parties in the conflict can simply ignore AMS’s request for documents.  But maybe this should change in the future.  I am not a member of the AMS so have no standing in telling it what to do, but I do have some thoughts on the subject.  I will try to write them up at some point.

Public discourse:

  1. Many commenters on the case opined that while deleting a published paper is bad (I am paraphrasing), but the paper is also bad for whatever reason (politics, lack of strong math, editor’s behavior, being out of scope, etc.)  This is very unfortunate.  Let me explain.
  2. Of course, discussing math in the paper is perfectly ethical: academics can discuss any paper they like, this can be considered as part of the job.  Same with discussing the scope of the paper and the verifiable journal and other party actions.
  3. Publicly discussing personalities and motivation of the editors publishing or non-publishing, third parties contacting editors in chief, etc. is arguably unethical and can be perceived as borderline bullying.  It is also of questionable morality as no complete set of facts are known.
  4. So while making a judgement on the journal conduct next to the judgement on the math in the paper is ethical, it seems somewhat immoral to me.  When you write “yes, the journals’ actions are disturbing, but the math in the paper is poor” we all understand that while formally these are two separate discussions, the negative judgement in the second part can provide an excuse for misbehavior in the first part.  So here is my new rule:  If you would not be discussing the math in the paper without the pretext of its submission history, you should not be discussing it at all. 

In summary:

I argue that for all issues related to submissions, withdrawal, etc. there is a well understood ethical code of conduct.  Decisions on who behaved unethically hinge on formal details of each case.  Until these formalities are clarified, making judgements is both premature and unhelpful.

Part of the problem is the lack of clarity about procedural rules by the journals, as discussed above.  While large institutions such as major universities and long established journal publishers do have such rules set up, most journals tend not to disclose them, unfortunately.  Even worse, many new, independent and/or electronic journals have no such rules at all.  In such environment we are reduced to saying that this is all a failure to communicate.

Lengthy disclaimer:

  1. I have no special knowledge of what actually happened to Hill’s submission.  I outlined what I think should have happened in different scenarios if all participants acted morally and ethically (there are no legal issues here that I am aware of).  I am not trying to blame anyone and in fact, it is possible that none of these theoretical scenarios are applicable.  Yet I do think such a general discussion is useful as it distills the arguments.
  2. I have not read Hill’s paper as I think its content is irrelevant to the discussion and since I am deeply uninterested in the subject.  I am, however, interested in mathematical publishing and all academia related matters.
  3. What’s ethical and what’s moral are not exactly the same.  As far as this post is concerned, ethical issues cover all math research/university/academic related stuff.  Moral issues are more personal and community related, thus less universal perhaps.  In other words, I am presenting my own POV everywhere here.
  4. To give specific examples of the difference, if you stole your officemate’s lunch you acted immorally.  If you submitted your paper to two journals simultaneously you acted unethically.  And if you published a paper based on your officemate’s ideas she told you in secret, you acted both immorally and unethically.  Note that in the last example I am making a moral judgement since I equate this with stealing, while others might think it’s just unethical but morally ok.
  5. There is very little black & white about immoral/unethical acts, and one always needs to assign a relative measure of the perceived violation.  This is similar to criminal acts, which can be a misdemeanor, a gross misdemeanor, a felony, etc.



ICM Paper

March 14, 2018 2 comments

Well, I finally finished my ICM paper. It’s only 30 pp, but it took many sleepless nights to write and maybe about 10 years to understand what exactly do I want to say. The published version will be a bit shorter – I had to cut section 4 to satisfy their page limitations.

Basically, I give a survey of various recent and not-so-recent results in Enumerative Combinatorics around three major questions:

(1) What is a formula?
(2) What is a good bijection?
(3) What is a combinatorial interpretation?

Not that I answer these questions, but rather explain how one could answer them from computational complexity point of view. I tried to cover as much ground as I could without overwhelming the reader. Clearly, I had to make a lot of choices, and a great deal of beautiful mathematics had to be omitted, sometimes in favor of the Computational Combinatorics approach. Also, much of the survey surely reflects my own POV on the subject. I sincerely apologize to everyone I slighted and who disagrees with my opinion! Hope you still enjoy the reading.

Let me mention that I will wait for a bit before posting the paper on the arXiv. I very much welcome all comments and suggestions! Post them here or email privately.

P.S. In thinking of how approach this paper, I read a large number of papers in previous ICM proceedings, e.g. papers by Noga Alon, Mireille Bousquet-Mélou, Paul Erdős, Philippe Flajolet, Marc Noy, János Pach, Richard Stanley, Benny Sudakov, and many others. They are all terrific and worth reading even if just to see how the field has been changing over the years. I also greatly benefited from a short introductory article by Doron Zeilberger, which I strongly recommend.

How to write math papers clearly

July 12, 2017 7 comments

Writing a mathematical paper is both an act of recording mathematical content and a means of communication of one’s work.  In contrast with other types of writing, the style of math papers is incredibly rigid and resistant to even modest innovation.  As a result, both goals suffer, sometimes immeasurably.  The clarity suffers the most, which affects everyone in the field.

Over the years, I have been giving advice to my students and postdocs on how to write clearly.  I collected them all in these notes.  Please consider reading them and passing them to your students and colleagues.  

Below I include one subsection dealing with different reference styles and what each version really means.  This is somewhat subjective, of course. Enjoy!

4.2. How to cite a single paper. The citation rules are almost as complicated as Chinese honorifics, with an added disadvantage of never being discussed anywhere. Below we go through the (incomplete) list of possible ways in the decreasing level of citation importance and/or proof reliability.

(1) “Roth proved Murakami’s conjecture in [Roth].” Clear.

(2) “Roth proved Murakami’s conjecture [Roth].” Roth proved the conjecture, possibly in a different paper, but this is likely a definitive version of the proof.

(3) “Roth proved Murakami’s conjecture, see [Roth].” Roth proved the conjecture, but [Roth] can be anything from the original paper to the followup, to some kind of survey Roth wrote. Very occasionally you have “see [Melville]”, but that usually means that Roth’s proof is unpublished or otherwise unavailable (say, it was given at a lecture, and Roth can’t be bothered to write it up), and Melville was the first to publish Roth’s proof, possibly without permission, but with attribution and perhaps filling some minor gaps.

(4) “Roth proved Murakami’s conjecture [Roth], see also [Woolf].” Apparently Woolf also made an important contribution, perhaps extending it to greater generality, or fixing some major gaps or errors in [Roth].

(5) “Roth proved Murakami’s conjecture in [Roth] (see also [Woolf]).” Looks like [Woolf] has a complete proof of Roth, possibly fixing some minor errors in [Roth].

(6) “Roth proved Murakami’s conjecture (see [Woolf]).” Here [Woolf] is a definitive version of the proof, e.g. the standard monograph on the subject.

(7) “Roth proved Murakami’s conjecture, see e.g. [Faulkner, Fitzgerald, Frost].” The result is important enough to be cited and its validity confirmed in several books/surveys. If there ever was a controversy whether Roth’s argument is an actual proof, it was resolved in Roth’s favor. Still, the original proof may have been too long, incomplete or simply presented in an old fashioned way, or published in an inaccessible conference proceedings, so here are sources with a better or more recent exposition. Or, more likely, the author was too lazy to look for the right reference, so overcompensated with three random textbooks on the subject.

(8) “Roth proved Murakami’s conjecture (see e.g. [Faulkner, Fitzgerald, Frost]).” The result is probably classical or at least very well known. Here are books/surveys which all probably have statements and/or proofs. Neither the author nor the reader will ever bother to check.

(9) “Roth proved Murakami’s conjecture.7 Footnote 7: See [Mailer].” Most likely, the author never actually read [Mailer], nor has access to that paper. Or, perhaps, [Mailer] states that Roth proved the conjecture, but includes neither a proof nor a reference. The author cannot
verify the claim independently and is visibly annoyed by the ambiguity, but felt obliged to credit Roth for the benefit of the reader, or to avoid the wrath of Roth.

(10) “Roth proved Murakami’s conjecture.7 Footnote 7: Love letter from H. Fielding to J. Austen, dated December 16, 1975.” This means that the letter likely exists and contains the whole proof or at least an outline of the proof. The author may or may not have seen it. Googling will probably either turn up the letter or a public discussion about what’s in it, and why it is not available.

(11) “Roth proved Murakami’s conjecture.7 Footnote 7: Personal communication.” This means Roth has sent the author an email (or said over beer), claiming to have a proof. Or perhaps Roth’s student accidentally mentioned this while answering a question after the talk. The proof
may or may not be correct and the paper may or may not be forthcoming.

(12) “Roth claims to have proved Murakami’s conjecture in [Roth].” Paper [Roth] has a well known gap which was never fixed even though Roth insists on it to be fixable; the author would rather avoid going on record about this, but anything is possible after some wine at a banquet. Another possibility is that [Roth] is completely erroneous as explained elsewhere, but Roth’s
work is too famous not to be mentioned; in that case there is often a followup sentence clarifying the matter, sometimes in parentheses as in “(see, however, [Atwood])”. Or, perhaps, [Roth] is a 3 page note published in Doklady Acad. Sci. USSR back in the 1970s, containing a very brief outline of the proof, and despite considerable effort nobody has yet to give a complete proof of its Lemma 2; there wouldn’t be any followup to this sentence then, but the author would be happy to clarify things by email.

UPDATE 1. (Nov 1, 2017): There is now a video of the MSRI talk I gave based on the article.

UPDATE 2. (Mar 13, 2018): The paper was published in the Journal of Humanistic Mathematics. Apparently it’s now number 5 on “Most Popular Papers” list. Number 1 is “My Sets and Sexuality”, of course.

ICM 2018 Speakers

May 9, 2017 3 comments

UCLA recently outed me (with permission) as a speaker at the next ICM in Rio. I am incredibly honored to be chosen, alongside my fantastic colleagues Matthias Aschenbrenner, Andrea Bertozzi, Ciprian Manolescu and Sucharit Sarkar.

P.S.  I have more to say on the subject of ICM, but that can wait perhaps.


A complete list of ICM speakers is available here.

Fibonacci times Euler

November 5, 2016 2 comments

Recall the Fibonacci numbers F_n given by 1,1,2,3,5,8,13,21… There is no need to define them. You all know. Now take the Euler numbers (OEIS) E_n 1,1,1,2,5,16,61,272… This is the number of alternating permutations in S_n with the exponential generating function \sum_{n=0}^\infty E_n t^n/n! = \tan(t)+\sec(t).  Both sequences are incredibly famous. Less known are connection between them.

(1) Define the Fibonacci polytope \Phi_n to be a convex hull of  0/1 points in \Bbb R^n with no two 1 in a row. Then  \Phi_n has F_{n+1} vertices and vol(\Phi_n)=E_n/n! This is a nice exercise.

(2) F_n \cdot E_n \ge n! (by just a little). For example, F_4 \cdot E_4 = 5 \cdot 5 = 25 > 4!. This follows from the fact that

F_n \sim \frac{1}{\sqrt{5}} \, \phi^{n+1} and E_n\sim \frac{4}{\pi}\left(\frac{2}{\pi}\right)^{n} n!, where \phi=(1+\sqrt{5})/2 is the golden ratio. Thus, the product F_n \cdot E_n \sim c n! \left(\frac{2\phi}{\pi}\right)^n. Since \pi = 3.14 and 2\phi = 3.24, the inequality F_n \cdot E_n \ge n! is easy to see, but still a bit surprising that the numbers are so close.

Together with Greta Panova and Alejandro Morales we wrote a little note “Why is π < 2φ?” which gives a combinatorial proof of (2) via a direct surjection. Thus we obtain an indirect proof of the inequality in the title.  The note is not a research article; rather, it is aimed at a general audience of college students.  We will not be posting it on the arXiv, so I figure this blog is a good place to advertise it.

The note also explains that the inequality (2) also follows from Sidorenko’s theorem on complementary posets. Let me briefly mention a connection between (1) and (2) which is not mentioned in the note.  I will assume you just spent 5 min and read the note at this point.  Following Stanley, the volume of \Phi_n is equal to the volume of the chain polytope (=stable set polytope), see Two Poset Polytopes.  But the latter is exactly the polytope that Bollobás, Brightwell and Sidorenko used in their proof of the upper bound via polar duality.

You say goodbye and I say hello

September 2, 2016 Leave a comment

I’ve been meaning to write a few posts for a while now, but never could find the time. It really takes special effort to clean your thoughts and then put them in order.  However, the following story just fell into my mailbox.  It tells you how to save time by skipping on the greetings/salutations.  I am removing all math matters and leaving it undedited otherwise.  To protect the anonymity of my correspondent, I will call him “Kiran” throughout the email exhange.  Enjoy!  — IP

(1) [Math] Thanks! — Kiran

(2) Dear Kiran,
[Math] Best, — Igor

P.S. In the future, please address me as “Igor”, which is my first
name. It’s best to begin your email with customary “Dear Igor”.
Thank you.

(3) I’ve been writing emails for 25 years, so I’m not about to start taking advice on how to start them; but if you start a thread to me with “Dear Kiran”, I can be safely counted on to respond in kind for the *first* email in the thread. If it happens enough times, I might even remember to initiate same way. For instance, this is what happens when I exchange emails with Serre; but neither of us uses the salutation on replies after the first within a thread.

[Math] Best, — Kiran

(4) Dear Kiran,
[Math] Best, — Igor

P.S. With all due respect, I am going to continue using salutations
and expecting the same in every email irrespectively on the person or
the count in the thread. Neither the “25 years” nor argumentum ad
verecundiam seem convincing — I have been using email for just as
long and in similar circumstances. The 8 letters of “Dear Igor” is
really not too much to ask.

(5) Thanks, I think this last reference does exactly what I was looking for!

Best, — Kiran

P.S. I have something more to say on the subject of salutations, but since that is a low-priority discussion for me, I will have to put it off until I am more current on my email.

(6) “Dear” Igor,

I promised one more piece of information regarding salutations, so here goes. (Don’t bother replying to this email; I promise to delete the response without reading it!)

I recently had some email exchanges with Shinichi Mochizuki, and was a bit surprised by the fact that despite the fact that I met him more than 20 years ago, he began his email with “Dear Professor [redacted]” (and persisted with this in subsequent replies within the thread). However, when I asked about this, he made it clear that on one hand, he has a policy of using the same format of salutation no matter the recipient (to avoid having to worry about the level of formality, figuring it is safe to err on the side of being too formal sometimes), he has absolutely no expectations about how anyone will address his in response.

My point is that you misuse a certain term here and it’s not the gratituous Latinate rhetorical terminology; it’s the word “respect”. It is a fact that reasonable people can draw different conclusions about such matters as how it is appropriate to start an email. You are free to choose how you address me, but how I choose to structure my correspondence is my decision alone. What you think is “not too much to ask” is for me to keep you in mind as a special case when I don’t even have very much correspondence with you anyway; that’s a waste of mental real estate that I can little afford.

— Kiran

The power of negative thinking, part I. Pattern avoidance

May 26, 2015 2 comments

In my latest paper with Scott Garrabrant we disprove the Noonan-Zeilberger Conjecture. Let me informally explain what we did and why people should try to disprove conjectures more often.  This post is the first in a series.  Part II will appear shortly.

What did we do?

Let F ⊂ Sk be a finite set of permutations and let Cn(F) denote the number of permutations σ ∈ Sn avoiding the set of patterns F. The Noonan-Zeilbeger conjecture (1996), states that the sequence {Cn(F)} is always P-recursive. We disprove this conjecture.  Roughly, we show that every Turing machine T can be simulated by a set of patterns F, so that the number aof paths of length n accepted by by T is equal to Cn(F) mod 2.  I am oversimplifying things quite a bit, but that’s the gist.

What is left is to show how to construct a machine T such that {an} is not equal (mod 2) to any P-recursive sequence.  We have done this in our previous paper, where give a negative answer to a question by Kontsevich.  There, we constructed a set of 19 generators of GL(4,Z), such that the probability of return sequence is not P-recursive.

When all things are put together, we obtain a set F of about 30,000 permutations in S80 for which {Cn(F)} is non-P-recursive.  Yes, the construction is huge, but so what?  What’s a few thousand permutations between friends?  In fact, perhaps a single pattern (1324) is already non-P-recursive.  Let me explain the reasoning behind what we did and why our result is much stronger than it might seem.

Why we did what we did

First, a very brief history of the NZ-conjecture (see Kirtaev’s book for a comprehensive history of the subject and vast references).  Traditionally, pattern avoidance dealt with exact and asymptotic counting of pattern avoiding permutations for small sets of patterns.  The subject was initiated by MacMahon (1915) and Knuth (1968) who showed that we get Catalan numbers for patterns of length 3.  The resulting combinatorics is often so beautiful or at least plentiful, it’s hard to imagine how can it not be, thus the NZ-conjecture.  It was clearly very strong, but resisted all challenges until now.  Wilf reports that Richard Stanley disbelieved it (Richard confirmed this to me recently as well), but hundreds of papers seemed to confirm its validity in numerous special cases.

Curiously, the case of the (1324) pattern proved difficult early on.  It remains unresolved whether Cn(1324) is P-recursive or not.  This pattern broke Doron Zeilberger’s belief in the conjecture, and he proclaimed that it’s probably non-P-recursive and thus NZ-conjecture is probably false.  When I visited Doron last September he told me he no longer has strong belief in either direction and encouraged me to work on the problem.  I took a train back to Manhattan looking over New Jersey’s famously scenic Amtrack route.  Somewhere near Pulaski Skyway I called Scott to drop everything, that we should start working on this problem.

You see, when it comes to pattern avoidance, things move from best to good to bad to awful.  When they are bad, they are so bad, it can be really hard to prove that they are bad.  But why bother – we can try to figure out something awful.  The set of patterns that we constructed in our paper is so awful, that proving it is awful ain’t so bad.

Why is our result much stronger than it seems?

That’s because the proof extends to other results.  Essentially, we are saying that everything bad you can do with Turing machines, you can do with pattern avoidance (mod 2).  For example, why is (1324) so hard to analyze?  That’s because it’s even hard to compute both theoretically and experimentally – the existing algorithms are recursive and exponential in n.  Until our work, the existing hope for disproving the NZ-conjecture hinged on finding an appropriately bad set of patterns such that computing {Cn(F)} is easy.  Something like this sequence which has a nice recurrence, but is provably non-P-recursive.  Maybe.  But in our paper, we can do worse, a lot worse…

We can make a finite set of patterns F, such that computing {Cn(F) mod 2} is “provably” non-polynomial (Th 1.4).  Well, we use quotes because of the complexity theory assumptions we must have.  The conclusion is much stronger than non-P-recursiveness, since every P-recursive sequence has a trivial polynomial in n algorithm computing it.  But wait, it gets worse!

We prove that for two sets of patterns F and G, the problem “Cn(F) = Cn(G) mod 2 for all n” is undecidable (Th 1.3).  This is already a disaster, which takes time to sink in.  But then it gets even worse!  Take a look at our Corollary 8.1.  It says that there are two sets of patterns F and G, such that you can never prove nor disprove that Cn(F) = Cn(G) mod 2.  Now that’s what I call truly awful.

What gives?

Well, the original intuition behind the NZ-conjecture was clearly wrong.  Many nice examples is not a good enough evidence.  But the conjecture was so plausible!  Where did the intuition fail?  Well, I went to re-read Polya’s classic “Mathematics and Plausible Reasoning“, and it all seemed reasonable.  That is both Polya’s arguments and the NZ-conjecture (if you don’t feel like reading the whole book, at least read Barry Mazur’s interesting and short followup).

Now think about Polya’s arguments from the point of view of complexity and computability theory.  Again, it sounds very “plausible” that large enough sets of patterns behave badly.  Why wouldn’t they?  Well, it’s complicated.  Consider this example.  If someone asks you if every 3-connected planar cubic graph has a Hamiltonian cycle, this sounds plausible (this is Tait’s conjecture).  All small examples confirm this.  Planar cubic graphs do have very special structure.  But if you think about the fact that even for planar graphs, Hamiltonicity is NP-complete, it doesn’t sound plausible anymore.  The fact that Tutte found a counterexample is no longer surprising.  In fact, the decision problem was recently proved to be NP-complete in this case.  But then again, if you require 4-connectivity, then every planar graph has a Hamiltonian cycle.  Confused enough?

Back to the patterns.  Same story here.  When you look at many small cases, everything is P-recursive (or yet to be determined).  But compare this with Jacob Fox’s theorem that for a random single pattern π, the sequence {Cn(π)} grows much faster than originally expected (cf. Arratia’s Conjecture).  This suggests that small examples are not representative of complexity of the problem.  Time to think about disproving ALL conjectures based on that evidence.

If there is a moral in this story, it’s that what’s “plausible” is really hard to judge.  The more you know, the better you get.  Pay attention to small crumbs of evidence.  And think negative!

What’s wrong with being negative?

Well, conjectures tend to be optimistic – they are wishful thinking by definition.  Who would want to conjecture that for some large enough a,b,c and n, there exist a solution of an + bn = cn?  However, being so positive has a drawback – sometimes you get things badly wrong.  In fact, even polynomial Diophantine equations can be as complicated as one wishes.  Unfortunately, there is a strong bias in Mathematics against counterexamples.  For example, only two of the Clay Millennium Problems automatically pay $1 million for a counterexample.  That’s a pity.  I understand why they do this, just disagree with the reasoning.  If anything, we should encourage thinking in the direction where there is not enough research, not in the direction where people are already super motivated to resolve the problem.

In general, it is always a good idea to keep an open mind.  Forget all this “power of positive thinking“, it’s not for math.  If you think a conjecture might be false, ignore everybody and just go for disproof.  Even if it’s one of those famous unsolved conjectures in mathematics.   If you don’t end up disproving the conjecture, you might have a bit of trouble publishing computational evidence.  There are some journals who do that, but not that many.  Hopefully, this will change soon…

Happy ending

When we were working on our paper, I wrote to Doron Zeilberger if he ever offered a reward for the NZ-conjecture, and for the disproof or proof only?  He replied with an unusual award, for the proof and disproof in equal measure.  When we finished the paper I emailed to Doron.  And he paid.  Nice… 🙂