Week 6: Tech Write-up: Bandwidth
Posted On: Feb 10, 2014 4:11:49 GMT
Post by iamallama on Feb 10, 2014 4:11:49 GMT
This week I am going to give Ari a week off and do a small, more technical write-up about LlamaQuest. This week is about bandwidth and some things we are doing to not waste what resources we have. I figure if this is liked, once every 3-4 weeks I'll come out of the dungeon and write something like this up.
Tech Write-up: Bandwidth
When it comes to MMOs, one major problem that needs to be addressed is bandwidth usage. While it might not seem like your connection uses a lot of bandwidth, multiply that by the 500 people connected to the server (a high average for Fantasy Online) and things add up really quickly. Even worse, the more people connected, the more bandwidth you use. And I don't mean that four times the users means four times the bandwidth. With four times the users you are likely to see closer to six or seven times the bandwidth.
In a simple example, let's say you have one user connected, user A. Once every five seconds they click the screen. That click needs to be sent to the server, processed, and the result sent back. So every five seconds one communication goes to the server and one comes back, totaling two communications. Now user B enters the game. Every time user A clicks, it gets sent to the server and now needs to be sent back to both user A and user B. So that is three communications. But user B isn't going to just sit there doing nothing, so they start clicking once every five seconds too. So now with only two users you have six communications every five seconds. Add user C to the server and now you are up to twelve. Add users D through Z and you are up to 702 (one communication up plus 26 back, one to each user, multiplied by 26 users).
In one day (17,280 five-second intervals) you would hit 12,130,560 communications with only 26 users. Still doesn't seem like a lot? If each of those communications was 1 KB, that would come out to a whopping 11.6 gigs a day. I alone can download that in a few hours. Not much. Now multiply by 30 days and you get 347 gigs in a month. Not bad, I can download that in a few days at home. Now let's look at the going rate for bandwidth on Amazon's cloud platform: as of writing this, $0.12 per gig. So keeping those 26 people happy clicking away once every five seconds costs me $41 a month.
I can handle that, and could possibly even get those 26 people to give me $2 each to cover the costs. But I want to make more than 26 people happy in my click farm. So let's do the math really quick for 100 users. That's 10,100 communications every five seconds at 1 KB each, which comes to 174,528,000 KB, or 166.4 gigs, each day. That comes out to 4,992 gigs each month, and at the rate of $0.12 per gig that comes to just a hair under $600 a month to keep my 100 happy clickers clicking. So now I need everybody to give me $6 just to keep up, and that buys you one click every five seconds. With 500 users that same math comes out to over 123,843 gigs a month and a hair over $10,000 a month in just bandwidth, not counting the monster server that would be needed to handle that much traffic. Now everyone needs to pay me $20 a month for the privilege of clicking on my farm.
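For anyone who wants to check the numbers above, here is a minimal Python sketch of the same worst-case model. It assumes every click is one message up plus one message back to every connected user, 1 KB per message, and 1 gig = 1024 × 1024 KB. The function names are made up for illustration, and it only covers message counts and volume, not the bill.

```python
SECONDS_PER_DAY = 24 * 60 * 60
CLICK_INTERVAL_S = 5       # one click per user every five seconds
MESSAGE_KB = 1             # assumed size of each communication
KB_PER_GB = 1024 * 1024


def communications_per_interval(users: int) -> int:
    """Each click is 1 message up plus 1 message back to every user."""
    return users * (1 + users)


def gb_per_day(users: int) -> float:
    """Daily volume in gigs if every user clicks once per interval."""
    intervals = SECONDS_PER_DAY // CLICK_INTERVAL_S   # 17,280 per day
    total_kb = communications_per_interval(users) * intervals * MESSAGE_KB
    return total_kb / KB_PER_GB


for users in (1, 2, 3, 26, 100, 500):
    daily = gb_per_day(users)
    print(f"{users:>3} users: {communications_per_interval(users):>7,} msgs/5s, "
          f"{daily:,.1f} GB/day, {daily * 30:,.0f} GB/month")
```

The thing to notice is that the message count grows with users × (users + 1), so the curve bends upward fast even though each individual user is doing exactly the same thing.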
When was the last time you waited five seconds between each action? Also, in reality, those clicks aren't the only thing being sent to and from the server. There are heartbeats sent from the client every few seconds to tell the server you are still connected. Every time you change clothes or a slot, everyone around you needs to know. Every time you log on or log off, everyone around you needs to know. I think you get the point. The server gets hit, hard. So, how can you fix it? There are two main methods: 1) lower the amount of data you send back and forth and 2) lower the number of times you talk with the server.
1) Be Concise
The number one rule here is be concise. Only tell the server what you need to say, not every little thing. How do you do that? Pre-loading and caching. By pre-loading I mean you send the client every single bit of information they need before they need it, and cache the data there as long as possible. In my opinion, this is one thing that Fantasy Online didn't do very well.
Let's run through another scenario. In FO, every sprite you saw was cached for as long as possible, but the layering of sprites was done server side. What this means is, let's say I am wearing nothing and I put on a tutu. All other players would be told, "User iamallama just changed his pants to a tutu". Then every player would check to see if they have that combination of clothing. They haven't yet, so they connect to the server and download my new sprite: the base skin with nothing but a tutu. That image gets downloaded and saved in the cache on everyone's computer. Then I put on my Nordic Helm. So now everyone checks to see if they have seen the base skin with a tutu and Nordic Helm. Nope, never seen before, download it again. Now I take off my tutu. Everyone checks to see if they have seen the base skin with the Nordic Helm. But wait, you have all seen the base skin and the Nordic Helm. However, you haven't seen them in this combination, so it has to be downloaded again. I take off my Nordic Helm and I am naked. You've all seen me naked, right? Of course, it was the state I was initially in. So you don't need to download that again.
Get the picture? If there were only 5 slots (skin, legs, body, face and head) and each of those slots had only 10 items (what a boring game, right? I mean there are only 50 things to wear!), that would still mean there are 100,000 possible combinations that need to be cached. How is it being done in LlamaQuest? Well, all items are pre-loaded, meaning that when you download the game, you download all 10 items for each of the 5 slots. Then everything is layered client side.
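To put numbers on the tutu story, here is a rough Python sketch. It assumes the composite cache is keyed by the whole outfit, which is how the FO behavior is described above; the names `composite_downloads`, `wardrobe_changes` and `nordic_helm` are just illustrative.

```python
SLOTS = ("skin", "legs", "body", "face", "head")
ITEMS_PER_SLOT = 10

# Server-side layering: one pre-rendered image per complete outfit.
composite_images = ITEMS_PER_SLOT ** len(SLOTS)
# Client-side layering: one spritesheet per item, stacked at draw time.
layer_images = ITEMS_PER_SLOT * len(SLOTS)


def composite_downloads(outfits, already_seen):
    """Count downloads when the cache is keyed by the whole outfit."""
    seen = set(already_seen)
    downloads = 0
    for outfit in outfits:
        if outfit not in seen:   # new combination, even if each item is known
            seen.add(outfit)
            downloads += 1
    return downloads


NAKED = frozenset()
wardrobe_changes = [
    frozenset({"tutu"}),
    frozenset({"tutu", "nordic_helm"}),
    frozenset({"nordic_helm"}),
    NAKED,                       # back to the state everyone started with
]

print(composite_images, layer_images)                               # 100000 50
print(composite_downloads(wardrobe_changes, already_seen={NAKED}))  # 3
```

Three downloads per onlooker for two items of clothing, versus zero when the 50 layers are already sitting on the client.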
Each layer is separate. When I put on a tutu, everyone already has the tutu spritesheet. They just layer it over the naked body and it's done. This also allows the server to communicate with the client using even shorter IDs. I put on a tutu and the server just sends out "iamallama+43", which means the same thing. Much shorter. And the client already has everything it needs cached, so the server doesn't need to tell anyone that the tutu gives 1 million points in awesomeness. They already know.
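A small sketch of what the short ID buys you, in Python. The catalogue contents, the "legs" slot for item 43, and the `encode_equip` / `decode_equip` helpers are assumptions for illustration only; the real LlamaQuest message format is not shown here beyond the "iamallama+43" example.

```python
# Hypothetical item catalogue that ships with the client.
ITEM_CATALOGUE = {
    43: {"name": "tutu", "slot": "legs", "awesomeness": 1_000_000},
}


def encode_equip(username: str, item_id: int) -> str:
    """Server side: say who changed and which item, nothing else."""
    return f"{username}+{item_id}"


def decode_equip(message: str):
    """Client side: look the details up in the local cache."""
    username, _, raw_id = message.rpartition("+")
    return username, ITEM_CATALOGUE[int(raw_id)]


short = encode_equip("iamallama", 43)
verbose = "User iamallama just changed his pants to a tutu"
user, item = decode_equip(short)

print(short, len(short), "bytes vs", len(verbose), "bytes")
print(user, "is now wearing a", item["name"], "worth", item["awesomeness"])
```

The stats never travel over the wire; the client pulls them from the catalogue it already has.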
2) Don't yell so loud
The second method is to not send the server as many updates. This is probably the first way most people will think of to cut bandwidth. It is obvious that when you don't talk to the server, you don't send as much data. One problem with this is that in order to not send as much data, you need to rely more on the client to do processing. One thing that I have learned from security-minded programming is that the user is an evil dirty monster that wants to destroy your code and consume your soul. In other words, you need to rely on the client to calculate some stuff, but never trust them.
How does that work? When something happens, you check it on the client. Since they have everything cached, you can use that data to check if what they did was valid (both methods go hand in hand). If something looks fishy, you just warn the user (or not) and stop right there. If it looks good, you send it to the server to be calculated again before it is broadcast out to every user. Ideally, no bad data would get sent to the server. But this isn't a perfect world. It is actually quite easy to send the server data that looks good but really isn't. I used to use this trick in FO to walk across buildings and reach unreachable areas, before mods had the power to just teleport anywhere. I would record the network traffic of me walking to the right across the screen, then stand next to a building and send that traffic to the server. The server would just tell everyone "iamallama moved 500 pixels to the right" and I would appear to be walking across the top of the building. The collision handling was done entirely client side: if the move didn't line up, you were told you couldn't move there, and if it did line up, you told the server what you did and the server just accepted it as the truth, which it wasn't. If the server had checked that the move was valid, no one would have seen me fly.
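Here is roughly what "calculate it again on the server" could look like, as a Python sketch. It assumes a tile grid and a set of blocked tiles standing in for the building; `BLOCKED`, `path_is_walkable` and `handle_move` are hypothetical names, not anything from FO or LlamaQuest.

```python
# Tiles covered by a building (an assumed, simplified collision map).
BLOCKED = {(3, 0), (4, 0)}


def path_is_walkable(start, dx, blocked):
    """Check every tile a horizontal move would pass through."""
    x, y = start
    step = 1 if dx >= 0 else -1
    for _ in range(abs(dx)):
        x += step
        if (x, y) in blocked:
            return False
    return True


def handle_move(positions, user, dx, blocked):
    """Server re-validates the move; returns the broadcast text or None."""
    start = positions[user]
    if not path_is_walkable(start, dx, blocked):
        return None                      # reject: nothing is broadcast
    positions[user] = (start[0] + dx, start[1])
    return f"{user} moved to {positions[user]}"


positions = {"iamallama": (0, 0)}
print(handle_move(positions, "iamallama", 5, BLOCKED))  # None, building in the way
print(handle_move(positions, "iamallama", 2, BLOCKED))  # iamallama moved to (2, 0)
```

The client can run the same check first to save a round trip on honest mistakes, but the server's answer is the only one that gets broadcast.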
The next part, and the part that deals with the "Don't yell so loud" bit, is an awesome new feature of SmartFox Server (SFS, the server software I'm using). It's awesome in that I don't need to make it myself like I was planning. The most recent versions of SFS come with what they call "Area of Interest" (AOI). Here is what that means. Say you had a map that was the equivalent of 100 by 100 miles in size. If someone was standing in the north-west corner and you were standing in the opposite south-east corner, do you really need to know every move of the other guy? Probably not (unless they were raiding your house or something). You really only need to know what is going on within about 1-2 miles of where you are standing. That is where this comes in. Given your position, the server will draw a box around you. If anything happens within that box, you are notified. Anything outside that box, you don't care about. You are in your own little bubble, and let anyone outside of it be damned. Take the initial bandwidth example of 26 users, A through Z, and place them all around the map: you will very likely be communicating with 2-3 others and that is it. Obviously there will be hubs that people hang out in, and those are the exception. But every person outside the hub will not have to tell those inside what is going on in their life, and vice versa.
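The box idea is simple enough to sketch in a few lines of Python. This is not the SFS API, just the general technique, with made-up positions in miles and a hypothetical `recipients` helper; SFS handles all of this for you.

```python
def recipients(sender, positions, half_width):
    """Everyone inside the box centred on the sender (sender included)."""
    sx, sy = positions[sender]
    return sorted(
        name
        for name, (x, y) in positions.items()
        if abs(x - sx) <= half_width and abs(y - sy) <= half_width
    )


# Assumed positions on a 100 by 100 mile map.
positions = {
    "A": (1, 1),     # north-west corner
    "B": (2, 2),     # right next to A
    "C": (99, 99),   # opposite south-east corner
}

print(recipients("A", positions, half_width=2))  # ['A', 'B']
print(recipients("C", positions, half_width=2))  # ['C']
```

With the box in place, the "back to every user" part of the earlier math shrinks to "back to every user nearby", which is where most of the savings come from.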