From the Engadget test, it sounds like both sides need to have HD Voice. The idea is that it does ambient noise cancellation with the second mic, but the audio stream is "encoded" by your phone and only "decoded" by the receiving phone. I can only imagine that's all a bunch of BS, but who knows, maybe it's really just a digital audio stream at a higher bitrate than a "regular" voice call. Sprint will supposedly be using this on a bunch of new handsets coming out soon... but that's all conjecture until we see the EVO Shift 4G LTE+ with more LTEness.
Sprint Samsung GS3 - CyanogenMod 10.1
Microsoft Surface RT