I agree with you that it is very important to get feedback from the musicological and music-theoretic communities. Even though I work with a musician on these matters, it is really hard to cover all the technical details of the three aspects of the research: mathematical, cognitive, and musical.
Regarding the composers, we noted what you said and reported it in the Supporting Information. It is true that the Baroque period seems to be underrepresented. However, if Bach copied the Italian work, then some of the statistics should be kept.
On the other hand, melody perception is an issue in itself. It is in fact something we cannot control, since we don't know how the data were preprocessed. Depending on how frequent the cases you mention are, they could well bias the analysis. We believe that this is not the case, because we ran two controls. The first one, which uses an alternative corpus, is published in the Supporting Information; the second is a preliminary result with a corpus that consists only of melodies.
Of course, this is not a guarantee of anything, but it does seem to point in the same direction as our intuition.
Last but not least, I want to thank you for the work you put into your email: not only for the examples, which were really good, but also for the criticism of the corpus. As you say, this feedback is indispensable for the growth of this line of research.
Best wishes,