Sound recordings are commonly distorted by channel and background noise, and this noise is the main cause of
degraded audio identification performance. For audio fingerprinting, Haitsma and Kalker introduced a
robust and efficient audio hashing scheme applying high-pass filtering (differentiation) to the frequency-temporal
sequence of perceptual filter-bank energies. However, the robustness of this fingerprinting scheme remains an
important issue in real noisy environments. This paper introduces alternative frequency-temporal filters for
effective audio fingerprinting of sound recordings in real environments. For frequency filtering, a type of
band-pass filter is used instead of a high-pass filter to enhance robustness to background noise in real conditions.
For temporal filtering, RASTA filtering is used instead of a high-pass filter to normalize sound recording
conditions. In addition, this paper introduces a two-stage audio fingerprinting scheme that exploits the synergy of
combining the frequency and temporal filters. Experimental results show that the proposed method is effective for sound
recordings in real environments.
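
As a rough illustration of the baseline idea referred to above (differentiating filter-bank energies along both the frequency and time axes and keeping only the sign), the following is a minimal sketch. It is not taken from this paper: the function names, the input layout `energies[n][m]` (frame n, band m), and the zero threshold are assumptions made for the example, and the band-pass and RASTA alternatives proposed here are not shown.

```python
from typing import List


def fingerprint_bits(energies: List[List[float]]) -> List[List[int]]:
    """Derive hash bits from band energies (illustrative sketch only).

    energies[n][m] is assumed to hold the energy of band m in frame n.
    Each bit is the sign of a first difference taken across adjacent
    bands and then across consecutive frames (a simple high-pass filter
    on both axes).
    """
    bits = []
    for n in range(1, len(energies)):
        prev, cur = energies[n - 1], energies[n]
        row = []
        for m in range(len(cur) - 1):
            # Difference across adjacent bands (frequency axis).
            d_cur = cur[m] - cur[m + 1]
            d_prev = prev[m] - prev[m + 1]
            # Difference across consecutive frames (time axis); keep the sign.
            row.append(1 if d_cur - d_prev > 0 else 0)
        bits.append(row)
    return bits


def hamming_distance(a: List[List[int]], b: List[List[int]]) -> int:
    """Count differing bits between two equally shaped bit matrices."""
    return sum(x != y for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    clean = [[1.0, 2.0, 4.0], [3.0, 1.0, 5.0]]
    # A constant offset added to every energy cancels in the differences.
    offset = [[e + 10.0 for e in frame] for frame in clean]
    print(fingerprint_bits(clean))
    print(hamming_distance(fingerprint_bits(clean), fingerprint_bits(offset)))
```

Two fingerprints would then be compared by their Hamming distance, so a filter that leaves fewer bits flipped under noise yields a more robust match; swapping the plain differences for other filters changes only the inner computation of this sketch.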