It is normal to see a small discrepancy between the audio count and the final count. The difference is usually less than 3 reps, but with a low number of reps it can be off by a little more, because the detection has trouble finding the pattern with less data. The more reps you do, the better it should get at detecting the number.

