Analysis of paired comparison data

I recently carried out a listening test on the perception of auditory distance using paired comparison judgments. To construct a psychological scale, I tried to fit a Thurstone-Mosteller Case V model as well as a Bradley-Terry-Luce model; the two give similar results. Unfortunately, for both models the chi-square goodness-of-fit test gives a very low p-value, meaning that both models should be rejected.
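To make the setup concrete, a fit and goodness-of-fit check of this kind could be sketched as follows. This is only an illustration under assumptions, not necessarily the exact procedure used for my data: it fits the Bradley-Terry-Luce model by maximum likelihood with a simple iterative (MM/Zermelo-type) update and uses the likelihood-ratio (deviance) form of the chi-square test. The function names `fit_btl` and `deviance_test` and the count matrix are hypothetical, and the fit assumes every stimulus was chosen at least once.

```python
import numpy as np
from scipy.stats import chi2


def fit_btl(wins, n_iter=1000, tol=1e-10):
    """Maximum-likelihood BTL worth parameters.

    wins[i, j] = number of times stimulus i was chosen over stimulus j.
    Assumes every stimulus was chosen at least once (otherwise its
    worth estimate degenerates to zero).
    """
    wins = np.asarray(wins, dtype=float)
    k = wins.shape[0]
    n_pairs = wins + wins.T              # number of comparisons per pair
    total_wins = wins.sum(axis=1)        # total wins of each stimulus
    worth = np.ones(k) / k               # start from equal worths
    for _ in range(n_iter):
        denom = n_pairs / (worth[:, None] + worth[None, :])
        np.fill_diagonal(denom, 0.0)     # a stimulus is never paired with itself
        new = total_wins / denom.sum(axis=1)
        new /= new.sum()                 # fix the arbitrary scale: worths sum to 1
        converged = np.max(np.abs(new - worth)) < tol
        worth = new
        if converged:
            break
    return worth


def deviance_test(wins, worth):
    """Likelihood-ratio goodness-of-fit test of the fitted BTL model."""
    wins = np.asarray(wins, dtype=float)
    k = wins.shape[0]
    n_pairs = wins + wins.T
    prob = worth[:, None] / (worth[:, None] + worth[None, :])
    expected = n_pairs * prob            # expected win counts under the model
    mask = wins > 0                      # treat 0 * log(0) as 0; skips the diagonal
    g2 = 2.0 * np.sum(wins[mask] * np.log(wins[mask] / expected[mask]))
    # k(k-1)/2 observed proportions minus (k-1) free parameters
    df = (k - 1) * (k - 2) // 2
    return g2, df, chi2.sf(g2, df)


# Made-up counts for 3 stimuli, 30 judgments per pair (illustration only)
example_wins = np.array([[0, 10, 6],
                         [20, 0, 10],
                         [24, 20, 0]])
example_worth = fit_btl(example_wins)
g2, df, p_value = deviance_test(example_wins, example_worth)
print(example_worth, g2, df, p_value)
```

A small p-value from `deviance_test` is the situation described above: the observed choice proportions deviate from what a single one-dimensional scale with this model can reproduce.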
I am now considering the Thurstone-Mosteller Case III model, which does not assume that the discriminal dispersions are equal. However, I don't know how to fit it (in particular, how to estimate the dispersion values), and I have not found any literature on this case. Does anyone have experience with this kind of analysis?
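For reference, this is the model I have in mind, written in my own notation (the symbols are assumptions, not taken from a specific source): $\mu_i$ is the scale value of stimulus $i$, $\sigma_i$ its discriminal dispersion, and $\Phi$ the standard normal distribution function.

```latex
% Case III: uncorrelated discriminal processes, unequal dispersions
P(i \succ j) = \Phi\!\left( \frac{\mu_i - \mu_j}{\sqrt{\sigma_i^2 + \sigma_j^2}} \right)

% Case V is the special case \sigma_i = \sigma for all i
P(i \succ j) = \Phi\!\left( \frac{\mu_i - \mu_j}{\sigma \sqrt{2}} \right)
```

As far as I can tell, the unknowns are therefore the $\mu_i$ and the $\sigma_i$, with the origin and the unit of the scale still to be fixed by convention.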

