What kinds of streamers are there, and why does digital audio equipment differ in sound quality? This is a recording of my presentation at the NWB Brussels Hi-Fi Show 2022. On November 5th and 6th my colleague Jan Feinster of alphaaudio.net and I were asked to give presentations at the show. I had already been asked whether a video recording of my presentation would become available. I didn't expect it to be recorded, but Jan was also kind enough to record and edit not only his very interesting presentation on networks for audio, but also mine. A link to his presentation is in the top right corner at the end of this video and in the show notes. We had to work in a rather noisy hall, so I am slightly less focused than normal. Enjoy. I will try to tell you a bit about network players: how network players are structured, but also why some network players sound different from others. A lot of people think that bits are bits and that all digital equipment therefore sounds the same. That's not true, but I'll come to that later on. I'm often asked two kinds of questions: what kinds of streamers are there, where do they store their information, how much information can they store, and so on. That's the first thing I will cover, and the second is, as I said, the sound difference between digital players. To start with the first question, let's see what components a streamer contains. This is quite abstract, but you'll get the point when I'm finished. You have, of course, storage: that can be a hard disk, a USB drive, or even a CD. Then you have something that has to control the information flow. You have a database that contains the information about the music, so the album name, track name, artist, composer and all that information, and you have a user interface so you can tell the machine what to play. Finally, you have what I call here the renderer: everything from the point where the signal stops being digital to the loudspeakers.
When you start up a device, it queries the hard disk, reads all the files and all the metadata, and reports back; that information is then stored in the database. The database can feed the information to the control section, which can feed the user interface, where the user can ask for a certain track; that request is sent back to control, which sends the music to the renderer. This is, in fact, what all digital players do. It all started with computers. My old computer went to my son; he discovered he could play music with it over his PC speakers, and he was very happy. There is a difference between a computer and a music player in that the computer uses a file system and not a database. Strictly speaking, a file system is a database, but a very simple one that doesn't contain extra information about the music: artists, album names and so on are not in the file system of a computer, as you can see here. The next step was to move the digital-to-analog conversion outside the computer, because inside the computer the sound quality is really very poor. The signal can be fed out over S/PDIF or USB; nowadays there are even more standards. The step after that was music player software that can index the music (the composer, the artist, the album art and so on) and build its own database on the computer. That can be done in several ways. The first way is server-based: a program runs on the computer that works as a music server and builds the database, and you have an external network player that gets the information from that database, receives the audio over the network, and feeds it to the renderer. That is usually controlled by a smartphone or tablet; in the old days you had dedicated remote controls, but not here. There are several systems that can do this. The most used is UPnP AV, also known as DLNA.
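The scan, index and query flow described above can be sketched in a few lines of Python. This is purely illustrative and not any product's actual code; the file paths, field names and the `index_library`/`query` helpers are my own inventions:

```python
# Illustrative sketch of a streamer's control flow: scan storage,
# extract metadata into a database, then answer queries from the UI.

def index_library(files):
    """Build a tiny 'database' from (path, metadata) pairs found on storage."""
    db = []
    for path, meta in files:
        db.append({"path": path,
                   "artist": meta.get("artist", "Unknown"),
                   "album": meta.get("album", "Unknown"),
                   "track": meta.get("track", path)})
    return db

def query(db, artist):
    """The user interface asks the database which tracks match;
    control then streams the matching files to the renderer."""
    return [entry["track"] for entry in db if entry["artist"] == artist]

# Hypothetical library contents, standing in for a real disk scan:
library = [
    ("/music/01.flac", {"artist": "Beatles", "album": "Abbey Road",
                        "track": "Come Together"}),
    ("/music/02.flac", {"artist": "Beatles", "album": "Abbey Road",
                        "track": "Something"}),
    ("/music/03.flac", {"artist": "Miles Davis", "album": "Kind of Blue",
                        "track": "So What"}),
]
db = index_library(library)
```

The point of the sketch is the separation: storage only holds files, the database holds the music information, and the UI never touches the files directly.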
UPnP AV and DLNA are in essence the same, in the sense that UPnP AV came first, and then big companies like Philips and Sony, which also owned a lot of music copyrights, said they didn't want MP3s to play, so they came up with a version that refused anything that was copy-protected or MP3, and that was called DLNA. So if you hear DLNA, it's the same as UPnP AV, especially since that copy protection has long since been dropped. The second system is Logitech Media Server with the Squeezebox, and then there are brand-specific systems like Sonos, Yamaha and Auralic. I will go through them one by one. DLNA/UPnP AV is very widely accepted by the consumer electronics industry. All the well-known brands, like Denon, Hegel (I think), Cambridge Audio, Arcam and Linn, use DLNA as a protocol and as a server. It supports audio, but also video and photos: if you have photos and videos on your computer and you have a smart TV, you can watch those photos and videos using the same protocol. There needs to be a server program; it's a small program running on a computer or a NAS. It's a very small program, and it can run on a very cheap NAS: among my test equipment I have a 100 euro NAS that performs well. It's not the quickest, but it works well. Many servers have very limited support for metadata: usually you get the artist and the album name, but the composer often isn't supported, and so on. There are good programs like MinimServer, which is free, does only audio, and supports all the metadata. There is also no provision for gapless playback in the standard, so if you play, for instance, Sgt. Pepper, you get a short silence between the tracks. Nowadays that is solved in good hardware, so if you use DLNA and you're going to buy new hardware, make sure it's capable of gapless playback. These are the brands that support the DLNA standard, and there are many more, but this gives you an idea of how big it is. The next one is the Squeezebox.
It was bought by Logitech at the moment it became successful, and Logitech has since abandoned it, but has promised to maintain the server program, so people with a Squeezebox can still use it safely. It's partly open source, so there are plug-ins for all kinds of services like Tidal, Qobuz and Spotify, and internet radio is supported too. It supports music, video and photos, and it's free: the server program costs nothing. Since you can't buy Squeezeboxes anymore, the solution is to buy a Raspberry Pi with a simple sound card; if you're handy with computers, you can make that function as a Squeezebox. Then you get the self-indexing streamers, and those are different from the ones I showed you before in that there is no database on the computer. There's a file system that knows where the files are, but there is no database that tells you which track is by which artist and so on. That indexing is done in the network player itself, which has the advantage of being very fast, and it can be remote-controlled like DLNA can. There are three brands that are rather popular. Sonos is very widespread in the market and well known. Then you have Bluesound, and if you want Bluesound-style players of higher quality, those are made by the sister company NAD. Auralic is another brand that works this way, but again higher up in the market. Let's look at Sonos. Sonos is easy to install; I once wrote that my mother could install it, and that's really true. It offers fast browsing and searching for music. It has a large installed base, so when a streaming service wants to enter the market, they really have to be friends with Sonos; as a result, just about every streaming service in the world is supported. The internal memory defines how many tracks you can index, and that has to do with the amount of metadata: if the memory is full, the database can't grow anymore and you've hit the limit. How many tracks that is depends on how much metadata you use.
If you only have the artist and the album name in the file, the burden is limited; if you have very large cover art images, that puts a load on the memory, of course. One downside is that it's a closed system: if you really want to work with Sonos, you have to buy more Sonos equipment to build a bigger installation. Quality-wise it's targeted at the average consumer; let me define that as the people that don't come here. I think that's safe. There is one other thing: the mesh network is an interesting feature. Or rather, it was an interesting feature ten years ago. When you have more Sonos devices (I was too lazy to draw five of the same, but they can be speakers as well), the moment you connect one device, it starts connecting to all the other devices it sees and so extends the network, and this is exclusive to Sonos. That made it a very robust system in the days when you only had one access point in your house. Nowadays there are repeaters for little money, and most people have very good coverage throughout the house, so this is no longer a real feature, or at least not an important one. Then Bluesound. Bluesound is a brand that came, I believe, eight years after Sonos, and they looked very carefully at what Sonos did. They tried to operate at a higher sound quality level, and I think they succeeded quite well. Again, it's a very easy installation. It's self-indexing like Sonos. You use a smartphone, a tablet or a computer to operate it. The primary controls are on the device, meaning you can adjust the volume, pause, or select the next track on the device itself, and some models even have presets, so when you get out of bed in the morning and press one, it starts your favorite radio program, and in the evening, when you're on the couch with a glass of wine, you press three and it plays your favorite playlist. Like the Sonos system, it works with a share on your system, as I showed you in the diagram.
You have to share the part of your drive where your music is, and it will get the music from there. There's a model with an integrated amplifier, and there's a model with an integrated hard disk and CD drive so you can rip CDs; there is no model that has all three. If you want even higher sound quality, you can get an NAD player that uses the same Bluesound technology but with higher-quality audio. Then there is what I call self-storing and indexing; in fact, we're talking about a phone, for instance. It stores the information on the smartphone or tablet itself. The sound quality depends on the operating system: some operating systems, like Android, convert all audio to 48 kHz, and that conversion always costs audio quality. Sound quality also depends on the hardware: some phones sound better than others, some iPads sound better than others. It's very easy to install, of course, but it's not always bit-perfect. A bit-perfect player (Android has some bit-perfect players) can be a solution, but in most cases you're still limited to 48 kHz. If you want quality, an external renderer is a good thing: an external DAC in this case, or use AirPlay or the Google Chromecast system to get the music to a better playback system. A few words about AirPlay: if you have a Windows computer, iTunes isn't that great on Windows, but AirPlay is a very robust streaming protocol. It's also very well protected, and it's licensed to many brands, so most AV receivers support AirPlay as a streaming format. It's a closed system on the front end, meaning that the music you play has to be your own music on your computer or come from Apple Music. You can't use Tidal, Qobuz, Amazon or whatever you want; it won't do that. The unit looks like this; a lot of units look like this, but it's also integrated into a lot of amplifiers and AV receivers. You can also use a network bridge if you don't want one of the other systems.
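Why a forced conversion to 48 kHz is never lossless for CD material can be seen from the resampling ratio alone. A small sketch, pure arithmetic with no audio library assumed:

```python
from fractions import Fraction

# Converting 44.1 kHz material to 48 kHz means interpolating by this ratio:
ratio = Fraction(48000, 44100)   # reduces to 160/147

# An output sample at index m falls exactly on an input sample only when
# m * 147 / 160 is an integer, i.e. once every 160 output samples. Every
# other output value must be calculated, and that calculation is where
# audio quality can be lost.
exact_hits = sum(1 for m in range(160) if (m * 147) % 160 == 0)
```

So 159 out of every 160 output samples are interpolated values rather than the original measurements, which is why a bit-perfect path that avoids the conversion is preferable.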
This network bridge can be a lot of things: it can be a Roon endpoint, it can be a DLNA endpoint, it speaks the Squeezebox protocol, so you can use all kinds of protocols with it, and it is in effect a USB or S/PDIF output at a distance. So you put the computer in the study or the computer room and such a box in the living room, connect it to your hi-fi, and you don't have any of the noise problems that the computer's forced cooling causes. This box is silent, both acoustically and electrically, so it sounds better. The last variant is cloud storing and indexing; in fact, there are many ways to do that. The database and the music files are in the cloud; as I said, all the streaming services work that way. It's played via a network player, computer, smartphone or tablet, and a separate renderer can be used. In the diagram it looks like this; it's not that different from the computer used as storage, only now the storage is in the cloud. These are devices that support it; the middle one is out of production, but the protocol is supported by a lot of current audio equipment. Now let's talk about why some digital equipment can sound different from other digital equipment. To do that, I'm first going to explain how digital audio works. You probably know this, but it's a good refresher. For this I've drawn an audio signal that looks like this. It's most unlikely that a real audio signal will ever look like this, a straight line, but in the end you will understand why I use it, and the same reasoning holds for real audio waves. If you want to digitize this, what happens is that at equally spaced moments, 44,100 times per second, the amplitude is measured: how strong, how many volts, is the signal at that moment? That looks like this. Those values are stored in a table, and that table can be stored on, or transported over, any medium you can think of.
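The measuring process just described can be put into code. A minimal sketch, assuming a 1 kHz sine as the input signal and 16-bit integer values in the table (both my assumptions, for illustration only):

```python
import math

FS = 44100   # CD sampling rate: measurements per second
F = 1000     # assumed 1 kHz test tone as the "analog" input

# 44,100 times per second, measure the amplitude and store it in a table,
# here scaled to 16-bit integers as on a CD.
table = [round(32767 * math.sin(2 * math.pi * F * n / FS))
         for n in range(FS)]   # one second of audio

# The table is just numbers; it can be stored on or transported over
# any medium without loss.
```

One second of music becomes 44,100 numbers per channel; everything that follows in the talk is about turning this table back into the original line.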
On playback, those values are read in again and plotted again, and in the early days of the digital era you were shown this staircase diagram as proof that the digital signal was converted back to analog. I can assure you that this would sound horrible, and it is not what happens. That's because Mr. Nyquist defined long ago that if you digitize something, you have to filter it at half the sampling rate. If you do that, the pen that draws this line becomes slow because of the filtering: it can't make 44,000 jumps per second, it can only make 20,000. What you then get is a straight line, and that's what happens, in theory at least. In practice, the filtering is one of the problem areas, but I'll get back to that. If those values are not plotted at exactly the right times, you don't get a straight line back; that's why I used a straight line. If I put a white line behind it, you can see that the red line is not straight anymore, and what you see here is called jitter. It's one of the forms of jitter; there are many forms. This one is caused by just a single interfering frequency. How does that happen? Well, it can be an unstable clock. Clocks are crystals, and you have ones that cost a quarter of a dollar, ones that cost 50 dollars, and ones that cost 100 dollars. That, of course, is a quality difference, and it's not in the construction: manufacturers build a lot of them, measure them, and select the best ones, and you pay extra for those; the ones that don't measure well are dumped on the market at a low price. That's what you get in cheap equipment. So to those people who say, "I've seen this Chinese DAC of 100 euros and it has the same ESS ES9028 DAC chip as the Mytek Brooklyn that costs 2,000 euros": that's true, it's the same chip, but my car, my Mercedes, uses the same petrol filter as a simple street car. So that says nothing; the rest has to be good as well. You also need to avoid problems with interfacing.
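The effect of plotting the values at slightly wrong instants can be put into numbers. A sketch under assumed values (a 10 kHz tone, 1 nanosecond of periodic timing error at 1 kHz, standing in for the single interfering frequency in the talk); the error it produces grows with how fast the signal changes:

```python
import math

FS = 44100      # sample rate
F = 10000       # assumed signal frequency in Hz
J_AMP = 1e-9    # assumed jitter: 1 ns peak timing error
J_F = 1000      # assumed interfering (jitter) frequency in Hz

def ideal(n):
    return math.sin(2 * math.pi * F * n / FS)

def jittered(n):
    # Same sample values, but rendered at slightly wrong instants:
    t = n / FS + J_AMP * math.sin(2 * math.pi * J_F * n / FS)
    return math.sin(2 * math.pi * F * t)

peak_error = max(abs(ideal(n) - jittered(n)) for n in range(FS))
# The peak error is roughly 2*pi*F*J_AMP, here on the order of -84 dB
# below full scale, and it is correlated with the signal, not random noise.
```

Note that for a constant signal the error would be zero, which is exactly why a sloped straight line makes jitter so easy to see in the diagram.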
If you use the wrong cable between two devices for a digital connection and the impedance is wrong, you will get losses. A ground loop is a problem if equipment is not connected well: a difference in potential will create a ground loop. And a noisy power supply also causes problems; I'll explain that later on. What you see here is what most people think a digital signal is: if the line is high it's a 1, if the line is low it's a 0. The problem is, this is not a digital signal. It's an analog square wave that is used to encode digital information. It's like on submarines, where they have those lights that they flash to tell stories to each other without the enemy knowing what is said: those lights are not letters, they are flashes, but they are used to represent letters. It's the same here: an analog square wave is used to encode zeros and ones, and analog signals do distort. This is a square wave, but it's not a perfect square wave, for the simple reason that you can't make a perfect square wave. If I put a line here, you see the edge is not fully vertical. If it were, it would be a miracle, because a perfect square wave needs unlimited bandwidth, from 0 hertz to 100 million hertz or more; even at 100 million hertz it's not a perfect square wave, just a better one. You have the same on this side, of course, and the time between these two points is the rise time; on the way down it's the same story, you have a fall time. Now, the idea is that somewhere halfway you decide whether it's a 1 or a 0. Say this is a four volt signal: at two volts it switches from 0 to 1. That's this point, for instance, and on the way down the same applies, it's there. Now, if there is a problem, that point shifts, and if it shifts, we've seen in the reconstruction I showed you, with the red line that was not straight, that you get distortion. How can that occur? Jitter is carried along with the digital data.
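The decision moment on such an edge can be computed directly. A sketch with assumed numbers (4 V logic swing as in the talk, a 5 ns rise time of my own choosing): any shift of the threshold, or of the ground it is measured against, moves the decision instant by the voltage shift divided by the slew rate of the edge.

```python
V_FULL = 4.0                    # logic swing in volts (from the talk)
RISE = 5e-9                     # assumed rise time: 5 ns
slew = V_FULL / RISE            # volts per second on the edge

def crossing_time(threshold_v):
    # Instant at which a linear 0 -> 4 V ramp crosses the threshold
    return threshold_v / slew

nominal = crossing_time(2.0)         # the ideal halfway decision point
shifted = crossing_time(2.0 + 0.1)   # ground plane shifted up by 0.1 V
edge_shift = shifted - nominal       # timing error caused by the shift
```

A tenth of a volt on the ground plane turns into 125 picoseconds of edge displacement under these assumptions, which is jitter just as surely as an unstable clock is.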
It can influence all kinds of processes. It can be a single frequency, like the example I showed you; it can be noise; but it can also be a music-dependent modulation, depending on the design and how badly it was done. What you get is this: this is the square wave you see when there is jitter. You now see that there is a problem in time: some edges come earlier and some come later. This is the threshold point again, and you see that this can also be a threshold point if the jitter shifts the edge to the left, and this one if it shifts to the right; the same applies to the falling edge. So you get this spread between the rising and falling edges, and the transition can be here in time, or it can be there. Then those points shift, and you get a red line that is not straight. That is one problem. What do we get when there is a problem with the ground plane? Everything is measured relative to the ground plane: there is no such thing as an absolute two volts, it's always two volts compared to the ground plane. If there is noise, or hum because of a ground loop, such that in this example the ground level shifts one tenth of a volt up, you again get other points where the threshold is crossed and where it is decided whether a bit is a one or a zero: these points. And now we have three different points where the transition can be, where it is decided whether it is a zero or a one. That is an enormous problem. Let me explain one thing. There are a lot of people who know a lot about networking who tell me this is nonsense, because the Bank of England would have a problem if this were a problem. And they are right, but that only goes for the digital domain. When the Bank of England goes to analogue, they have someone at the counter counting the money; if that were done electronically, I don't know how it would go.
The point is that distortion only arises during the digital-to-analogue conversion; that's the only place where the problem really exists. Before that, the digital signal is safe and will not be corrupted under normal conditions; if it is, you really have faulty equipment. The distortion comes from phase noise, when the clock crystal is not precise enough; it comes from leakage currents, because if devices are not earthed in the right way there will be a shift of the ground plane; and there can be noise on the ground plane due to the power supply. That is one problem. Well, it's not a problem if you solve it, and there is equipment that solves it quite well, but it's the reason why some equipment doesn't sound as good as other equipment. I told you about the filtering: at half the sampling frequency there has to be a filter. The filter is needed so that you don't get that strange staircase but a straight line. There are several ways to do that. In the early days of digital audio you had what I now call the linear-phase measurement filter: it's the filter favored by reviewers who use measurement equipment to decide whether a device sounds good. It starts at 20 kilohertz and is all the way down at 22.05 kilohertz: 96 dB down for 16-bit, and for 24-bit it's even 144 dB. It looks like this. It's an extremely steep filter. If you want to be 96 dB down at 22 kilohertz, that's 96 dB not per octave, as we normally specify filters, but per single note, one note on the piano. Now, if you ask a musician what an octave is, they'll say eight tones; the problem is that they are musicians, so they count from C to C. That's not fair, you shouldn't count the C twice, so it's seven, and even that isn't completely true because there are two half-note steps in an octave. But let's say seven. In that case you have 96 dB times seven, let's say a 700 dB per octave filter. That's extremely, extremely steep, and you can't do that well.
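The per-octave figure can be checked with a two-line calculation; the 700 dB in the talk is a back-of-the-envelope estimate via counting notes, and this is the exact version:

```python
import math

# Flat at 20 kHz, 96 dB down at 22.05 kHz: how steep is that per octave?
f1, f2 = 20000.0, 22050.0
octaves = math.log2(f2 / f1)            # about 0.14 octave, roughly a whole tone
slope_db_per_octave = 96 / octaves      # equivalent slope in dB per octave
# Works out to roughly 680 dB per octave, the same order of magnitude
# as the ~700 dB/octave estimated in the talk.
```

For comparison, ordinary analog crossover filters are specified at 6 to 24 dB per octave, which shows how extreme this brick-wall requirement is.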
If you want to maintain linear phase, you get about 10 cycles of pre- and post-ringing. So before the signal starts, you already have 10 wobbly images of that same signal warning you that the signal is coming. We're not used to that: in nature, an echo can only come after the event, never before it, so it doesn't sound natural. So they made a different filter. They experimented and found ways to limit the problem; that's what I call the linear-phase listening filter. It doesn't start at 20 kHz but somewhat lower, for instance 18 kHz, and it's not as steep: it's only 60 dB down at 22 kHz. You then don't apply Nyquist's rules in the right way, and you get what is called aliasing. That sounds like a robot voice if you have a really serious problem, but done this way it only aliases at very high frequencies, and when it's done well it's almost, or even completely, inaudible. It looks like this: a friendlier curve. What you gain is that there is still no phase shift, but only about one cycle of pre- and post-echo, which looks like this. This is already better, but your ears are still not used to hearing things before they start, so it's still not very nice. You hear it, for instance, on a rim shot on a snare: it doesn't sound right, it doesn't have the bite, it doesn't have the right place in the stereo image. Then there were people who said: what happens if we give up the perfect phase behavior? What if we allow a small phase deviation at high frequencies? They started with a minimum-phase filter. It starts at 20 kilohertz and is 60 dB down at 22 kilohertz. It looks like this. It has no pre-ringing, but it has twice as much post-ringing. Since that's more natural (still not ideal, but more natural), it already sounds a lot better. And you do have some varying phase response at high frequencies, as I said.
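The symmetric pre- and post-ringing of a linear-phase filter is easy to demonstrate: a linear-phase lowpass is a symmetric FIR filter (a windowed sinc), so whatever rings after the main peak rings identically before it. A sketch with assumed parameters (63 taps, cutoff at a quarter of the sample rate; both are my choices for illustration):

```python
import math

N = 63        # assumed filter length (odd number of taps)
FC = 0.25     # assumed cutoff as a fraction of the sample rate

def tap(n):
    m = n - (N - 1) / 2                  # distance from the centre tap
    window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1))  # Hamming
    if m == 0:
        return 2 * FC * window
    return window * math.sin(2 * math.pi * FC * m) / (math.pi * m)

h = [tap(n) for n in range(N)]           # the filter's impulse response
c = (N - 1) // 2
symmetric = all(abs(h[c - k] - h[c + k]) < 1e-12 for k in range(c + 1))
# symmetric is True: every ripple after the peak has an exact mirror
# image before it, which the ear hears as an unnatural pre-echo.
```

A minimum-phase design rearranges the same frequency response so that all the ringing lands after the peak, which is the trade described above.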
And the pre-ringing from the recording is also partly filtered out, because at the recording side roughly the same thing happens: you have to filter at 20 kilohertz to record at 44.1 kilohertz. There's a man called Peter Craven who has published a lot about this, and he called this an apodizing filter. Peter Craven is one of the people behind MQA, and that's one of the reasons why, with affordable equipment, you can get much better results if you use the MQA implementation in the DAC. The signal then looks like this: you have a signal and you have a lot of echoes. You don't hear them as discrete echoes because they're too close together. It's not ideal, but it's better than having a pre-echo, to my taste at least. Then someone said: we can improve on that by filtering with a milder slope, like we did with the linear-phase listening filter. We accept some aliasing and use that freedom for friendlier behavior. It looks like this, and when you then look at the ringing, there's no pre-ringing, just like the previous example, but only a little post-ringing, and you have a varying phase at high frequencies like the previous one, which, when done well, is not really audible. But it sounds a lot more natural, because you have no pre-ringing and only a little post-ringing. Then you can do another trick, and that's upsampling. With upsampling you have a computer or DSP calculate values in between the normal values. If you do that, you end up with a two or four times higher sampling frequency, and then you're allowed to place the filter at a higher frequency. If you upsample four times you get, let's say, 192 kilohertz, and then you could filter at 96 kilohertz. That's not what is done, by the way: what they do is start at 20 kilohertz with a very gentle filter, and because it's such a gradual filter, it sounds a lot better. But good upsampling needs very good code on a reasonably fast processor.
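The "calculate values in between" step can be sketched as follows. Real oversampling DACs use long sinc-based interpolation filters; plain linear interpolation here is my simplification, just to show the mechanics of going to four times the rate:

```python
def upsample4(samples):
    """Insert three calculated values between each pair of samples."""
    out = []
    for a, b in zip(samples, samples[1:]):
        for k in range(4):
            out.append(a + (b - a) * k / 4)   # original value + 3 in-betweens
    out.append(samples[-1])
    return out

x = [0.0, 1.0, 0.0, -1.0, 0.0]   # a coarse wave at the original rate
y = upsample4(x)                 # the same wave at four times the rate
# With 4x the sample rate, the reconstruction filter may sit at a much
# higher frequency, so a gentle slope starting at 20 kHz becomes possible.
```

The quality of the result depends entirely on how the in-between values are calculated, which is why the processing power available matters so much.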
So you see this in DACs that have FPGAs built in. You can also do it in your computer; it's a bit of a fuss, but it can be done. If it's done inside a commonly used DAC chip, it's often quite poor, because the computational power of a DAC chip is very limited. There is an in-between solution where the calculation is done by a small processor outside the DAC chip, and that's a good compromise. Let's wrap it up: digital audio is robust as long as it stays digital. In all digital-to-analog conversion, jitter is critical, and fighting jitter costs money; there is no way around it, it simply costs money. During the conversion, the reconstruction filter is critical too. The quality of the reconstruction filter differs greatly between components, and if you buy a hundred euro Chinese DAC, don't expect it to sound the same as a thousand euro quality DAC, because a good filter design is complex and costly, and you need a designer who is very capable of doing it. I'll be back next Friday at 5 pm Central European Time. If you don't want to miss that, subscribe to my channel or follow me on social media so you'll be informed when new videos are out. Help me reach even more people by giving this video a thumbs up or linking to it on social media; it is much appreciated. Many thanks to those viewers who support this channel financially. It keeps me independent and lets me improve the channel further. If that makes you feel like supporting my work too, the links are in the comments below this video on YouTube. I am Hans Beekhuyzen, thank you for watching, and see you in the next show or on theHBproject.com. And whatever you do, enjoy the music.