You judge a theory based on all the evidence you have for it: past Patriots’ transgressions, the pressure gap between the home and visitor game balls in last week’s AFC championship, your personal feelings about Bill Belichick’s moral foundations, and so on. The Patriots’ sudden improvement in preventing fumbles doesn’t close the case against them, but it’s one more piece of evidence.
A bunch of non-statistical pieces of evidence are mentioned but not discussed. It could be a matter of space. It could be an oddly worded suggestion for Bayesian updating: no individual piece of evidence is sufficient, but each should move you incrementally toward believing one possibility over the other. But it seems like all of these should get more attention, not less, precisely because the statistical results were inconclusive. Past Patriots transgressions, for example, would only count if you had a theory which suggested that, having been busted for cheating, the Patriots would turn around and select a new means of cheating which no one talked about for nine years. Put that way, it seems like an uncertain candidate for updating one's prior beliefs: how much weight it deserves depends on one's underlying beliefs about the situation, and people rarely think through the many possible factors that might lead to either continuity or change. Like the data, it's not a simple matter.
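To make the Bayesian-updating reading concrete, here is a minimal sketch in Python of how several weak pieces of evidence can each nudge a belief without any one of them settling the matter. Every number in it is hypothetical and chosen only for illustration; none comes from the article or from any real analysis of the Patriots. It also assumes the pieces of evidence are independent of one another, which is exactly the kind of assumption that is hard to defend here.

```python
def update_odds(prior_odds, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio in turn.

    Assumes the pieces of evidence are independent given each hypothesis.
    A ratio above 1 favors the hypothesis, below 1 counts against it,
    and exactly 1 leaves the belief unchanged.
    """
    odds = prior_odds
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds


def odds_to_probability(odds):
    """Convert odds in favor of a hypothesis into a probability."""
    return odds / (1.0 + odds)


# Hypothetical inputs, for illustration only.
prior_odds = 1.0                     # start undecided: 1-to-1 odds
likelihood_ratios = [1.5, 1.2, 1.1]  # three weak, made-up pieces of evidence

posterior_odds = update_odds(prior_odds, likelihood_ratios)
posterior_probability = odds_to_probability(posterior_odds)

print(f"posterior odds: {posterior_odds:.2f}")
print(f"posterior probability: {posterior_probability:.2f}")
```

The hard part is not the arithmetic but picking the ratios. If you cannot say whether a past transgression makes a new, different kind of cheating more or less likely, its ratio sits near 1 and it barely moves the result.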
I like statistics in sports like football and soccer specifically because these sports contain individual elements that are hard to measure in abstraction from everything else. Statistics here are a way of seeing specific parts of the game that might otherwise go unnoticed. Nobody claims they capture everything that's important: any analysis has to be supplemented with many others. Soccer has many moments like this Harry Kane goal against Chelsea. The play works only because a player who never touches the ball runs himself out of the play before it even begins, and by doing so takes the key defenders out of the play. How does one measure that? (I sometimes think the popularity of Total Football amongst the stats-friendly has to do with the fact that it correlates lots of objective measures with winning: time of possession, passes completed, tackles, takeaways, fouls, etc.)
The answer, of course, here as in Deflater-mess and in every other type of data analysis, is that smart observers know when to use stats and when (and how) to use and judge other types of evidence and data.