Working in the gaming industry as a game developer, CTO, CEO, game analyst or any other role is challenging. I worked more than 10 years as a Backend Developer and also as a Cloud Team Lead for a gaming studio for the past 3 years and often struggled with my “backend” role.
Conversations about the cloud with c-level people were especially challenging. Everyone says they want the cloud but on the other hand they are all afraid of rewriting their games to be cloud-ready. I'm writing this article for those of you who are not afraid. For the forward-thinking ones who are ready to take a step to the cloud but don't know which provider would suit them best. I'm hoping to help you understand all you need to know about the differences between the cloud providers and make the best decision for your business.
Besides basic comparison and pricing information, I will also share useful tips and tricks on different approaches you can take when migrating to the cloud, which can significantly change the monthly or yearly costs of developing and maintaining your game.
To help you understand the whole picture I need to start with the basic data.
Based on the Canalys reports AWS is the biggest cloud provider with 31% of the market share, the second is MS Azure with 20% and Google comes third with 6%.
Why am I showing you this “boring” data?
1. Even though AWS has the biggest market share (AWS has also been on the market the longest), both MS Azure and GCP are doing their best to catch up. This is definitely great for us as customers.
2. AWS has enormous experience. They offer many robust solutions with minimal downtime and also have a lot of experienced people (developers, DevOps and infrastructure engineers) on their team. On the other hand, MS Azure comes with super easy integration of MS systems. For example, .NET and .NET Core make CI/CD pipelines almost seamless thanks to integration with Visual Studio, and team integration works via MS Office 365 and Teams.
GCP has its own mix of tools, APIs and Cloud technologies that support as much open source as possible. It is also known for the world's best Kubernetes Engine and for its integration with Google Workspace (formerly G Suite).
This continual competition between AWS, Azure and GCP results in better prices for us as customers, improved technologies, tools and better support.
One of the most important challenges when building an online game is the distance between you and your players. The more server locations you can place your game in, the fewer headaches you'll get from LAG. :)
Let's have a quick look at the availability zones of the three biggest cloud providers.
| Provider | Availability Zones | Geographic regions | More |
| --- | --- | --- | --- |
| AWS | 77 | 24 | |
| GCP | 73 | 24 | |
| Azure | 60+ | 16 | https://azure.microsoft.com/en-us/global-infrastructure/geographies/ |
AWS now spans 77 Availability Zones within 24 geographic regions around the world in more than 190 countries, and has announced plans for 9 more Availability Zones and 3 more AWS Regions in Indonesia, Japan, and Spain (more info here).
Microsoft Azure currently consists of 60+ regions worldwide, available in 140 countries (more info here).
Almost identical to AWS, Google Cloud is available in 24 regions with 73 Availability zones across 16 countries (more info here).
The advantage of GCP is its own internal network between some nodes, which gives you fewer routing hops and can reduce response time greatly.
As you can see, almost all Cloud providers have the same number of regions and availability zones. But! There is one big difference with GCP. GCP doesn't have a region inside China. The Hong Kong region is not behind “the Great Firewall of China”, so Google can’t guarantee that requests from inside China will be served. Google claims it is not banned. But I know from my personal experience that most of the GCP services are not available even with the use of a VPN. The Chinese gaming market is big and attractive. If you want to target this region you should consider solutions other than GCP. On the other hand, the private Google network in other regions is great and can give you an advantage in managing Game Server loads.
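To make the region choice a bit more concrete, here is a minimal Python sketch that picks a region from measured round-trip times. The region names, the sample values and the `pick_region` helper are all hypothetical; in practice the samples would come from your own client-side pings, and regions your players cannot reach would go into `blocked_regions`.

```python
from statistics import median


def pick_region(rtt_samples_ms, blocked_regions=frozenset()):
    """Return the region with the lowest median round-trip time.

    rtt_samples_ms maps a region name to a list of measured RTTs in ms.
    Regions listed in blocked_regions are skipped entirely.
    """
    candidates = {
        region: median(samples)
        for region, samples in rtt_samples_ms.items()
        if samples and region not in blocked_regions
    }
    if not candidates:
        raise ValueError("no usable region measurements")
    return min(candidates, key=candidates.get)


# Hypothetical measurements from one player's client, in milliseconds.
samples = {
    "europe-west": [38, 41, 35, 120],
    "us-east": [95, 99, 97, 96],
    "asia-east": [210, 190, 205, 198],
}
best = pick_region(samples)
```

The median is used instead of the mean so that a single slow ping (the 120 ms outlier above) does not disqualify an otherwise close region.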
Your first monthly bill will definitely put your development, infrastructure and debugging skills to the test!
All three companies make an enormous effort to make price comparison as difficult as possible. They use different metrics, different time periods, different ways of counting network traffic inside or outside a region, and so on. The subscription model recently introduced by all providers (commitments from 1 up to 3 years) makes it even trickier. Just the basic pricing comparison would fill a whole book. If you'd like to have a look at some rough numbers you can read this article (article here).
In my opinion, pricing can be compared accurately only for these services: Virtual Machines (AWS EC2, Azure Virtual Machines and GCP Compute Engine) and Storage services (AWS S3, Azure Storage, Google Cloud Storage). Once we get into Managed Services, Managed Kubernetes or Serverless, prices and what you get for them differ a lot. For example, if you compare AWS Lambda and Google Cloud Functions in the same use cases, Google Cloud comes out 4 times cheaper than AWS. On the other hand, AWS lets you use more programming languages to run your functions, and its time, memory and CPU usage limits are more attractive.
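If you want to run such a serverless comparison for your own workload, a rough sketch like the one below can help. It assumes the common "requests plus GB-seconds" billing shape; the function name and every price in the example are placeholders I made up for illustration, not real list prices, and real bills also depend on rounding rules, CPU tiers and network traffic.

```python
def monthly_function_cost(invocations, avg_duration_s, memory_gb,
                          price_per_million_requests, price_per_gb_second,
                          free_requests=0, free_gb_seconds=0.0):
    """Rough monthly cost of one serverless function (illustrative model)."""
    billable_requests = max(invocations - free_requests, 0)
    gb_seconds = invocations * avg_duration_s * memory_gb
    billable_gb_seconds = max(gb_seconds - free_gb_seconds, 0.0)
    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    compute_cost = billable_gb_seconds * price_per_gb_second
    return round(request_cost + compute_cost, 2)


# Same workload, two made-up price lists (placeholders, not real prices).
workload = dict(invocations=10_000_000, avg_duration_s=0.2, memory_gb=0.5)
provider_a = monthly_function_cost(
    **workload, price_per_million_requests=0.20, price_per_gb_second=0.00002)
provider_b = monthly_function_cost(
    **workload, price_per_million_requests=0.40, price_per_gb_second=0.00001)
```

Plugging in the current prices from each provider's pricing page, plus your own invocation counts, gives a far better picture than any generic comparison table.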
Let's get back to the real cost of cloud now. From my point of view, the real power of today's cloud lies in managed containers, managed Kubernetes, managed DBs, Serverless Engines, load balancers and other managed services. They allow you to spend less time on ops and focus on the development of your game. But where's the catch?
All of these services come with seemingly higher costs. But is the cost truly higher?
Think about how much you spend on infrastructure or DevOps teams. With managed services this cost goes away. You won't even need to spend endless time analysing peak performance of each service or creating scaling and load balancing policies.
Imagine your game gets promoted on Google Play or the Apple App Store without you even knowing about it.
There is no bigger frustration than the one you feel when you see your servers crash 10 seconds after game launch because of heavy traffic. With managed services you don't have to worry about manual scaling that would take you at least 3 hours. Everything is done automatically. But of course, it is not magic. Not every service or application is ideal for a serverless approach, especially when it has very heavy CPU or RAM demands.
And watch out... Your first monthly bill will definitely put your development, infrastructure and debugging skills to the test! There is no DDoS attack on serverless, only infinite upscaling of servers. An infinite loop in a stateless service can cost you a LOT. And I mean it. But after some time you will learn how to avoid mistakes such as wrongly setting the load balancer to 100 percent in one second and, as a result, launching 300 more instances of a microservice than you need within one minute.
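One way to picture the protection against runaway scaling is a hard cap on both the total number of instances and the size of each scale-up step. The sketch below is a plain Python illustration of that idea, not any provider's autoscaler API; `next_instance_count`, `max_step` and `max_instances` are hypothetical names, and the managed services expose similar limits under their own settings.

```python
def next_instance_count(current, desired, max_step=5, max_instances=50):
    """Move towards the desired count, but never by more than max_step
    per evaluation and never above max_instances."""
    if current < 0 or desired < 0:
        raise ValueError("instance counts must be non-negative")
    capped_target = min(desired, max_instances)
    if capped_target > current:
        return min(current + max_step, capped_target)
    return capped_target  # scale-down (or no change) is applied directly


# A misconfigured rule suddenly asks for 300 instances.
count = 10
history = []
for _ in range(3):
    count = next_instance_count(count, desired=300)
    history.append(count)
# history is now [15, 20, 25] instead of jumping straight to 300
```

With limits like these a wrong setting still costs money, but it costs you a handful of extra instances while you notice the alert, not hundreds.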
You can save money on subscriptions and some services here and there, but real savings and cost effectiveness come with a good architecture (both the infrastructure and the service itself) and an understanding of how tools and services work in the cloud. If you know how to use them right you will get more value for your money. I highly recommend having a look at how billing quotas and alerting work on each of the cloud platforms. (You can read more here - GCP, AWS and Azure). Personally, I like the way it is done in AWS. GCP is too detailed for me (lots of clicking and scrolling) and Azure is totally confusing.
You might be asking why I am showing the quota interfaces here. The reason is that in the first months (and maybe even the first 1 or 2 years), you'll spend some time setting quotas and alerts in case something goes wrong (and it definitely will), so get used to it, get comfortable with it, and get the most out of it!
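The logic behind a typical budget alert is simple enough to sketch in a few lines. The example below is only an illustration of the idea, assuming percentage thresholds and a naive linear forecast; the function name and the threshold values are my own choices, and each platform's billing tools implement their own variants of this.

```python
def budget_alerts(spend_so_far, monthly_budget, day_of_month, days_in_month,
                  thresholds=(0.5, 0.9, 1.0)):
    """Return (crossed thresholds, naive linear forecast of month-end spend)."""
    if monthly_budget <= 0 or not 1 <= day_of_month <= days_in_month:
        raise ValueError("invalid budget or date")
    used = spend_so_far / monthly_budget
    crossed = [t for t in thresholds if used >= t]
    forecast = spend_so_far / day_of_month * days_in_month
    return crossed, round(forecast, 2)


# 600 spent out of a 1000 budget on day 10 of a 30-day month.
crossed, forecast = budget_alerts(600, 1000, day_of_month=10, days_in_month=30)
```

In this example only the 50% threshold has been crossed, yet the forecast already points well past the budget, which is exactly the kind of early warning you want.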
Azure Virtual Machines, Google Compute Engine or AWS EC2 are just different names for Virtual Servers that you can rent on one of these platforms. They range from micro servers, where you can run the internal VPN of your Game Studio or your intranet almost for free, to a super powerful CPU/GPU 90 core muscle beast. Again, it is hard to provide a general comparison of prices. There are different subscription plans (1 to 3 years).
To get a rough idea you can view the pricing list for Azure VM, AWS EC2, GCP Compute Engine VM. However, there are lots of ways to get even more for your money. You can, for example, use AWS reservations, which allow you to reserve compute power in specific regions in advance (for up to 3 years) at a discounted price. Another option is AWS Spot instances, Azure Low priority VMs or Google's “Preemptible” VMs. These are VMs on capacity you share with others, and in some cases they can be stopped to allow the compute power to be used by another customer. You can save some money, but you have to know how to correctly handle these forced stops.
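Handling a forced stop mostly means noticing the stop request early and leaving your work in a resumable state. Here is a minimal Python sketch of that pattern. It assumes the stop notice reaches your process as a SIGTERM signal, which depends on the platform and on how you run the process; the `PreemptibleWorker` class and its job list are hypothetical stand-ins for your own workload.

```python
import signal
import threading


class PreemptibleWorker:
    """Processes jobs one at a time and stops cleanly when asked to."""

    def __init__(self, jobs, process):
        self.pending = list(jobs)
        self.process = process
        self.done = []
        self.stop_requested = threading.Event()

    def request_stop(self, *_signal_args):
        """Can be called directly or used as a signal handler."""
        self.stop_requested.set()

    def run(self):
        """Work until finished or stopped; return unfinished jobs as a checkpoint."""
        while self.pending and not self.stop_requested.is_set():
            job = self.pending.pop(0)
            self.done.append(self.process(job))
        return list(self.pending)


def install_sigterm_handler(worker):
    # Assumption: the platform's stop notice arrives as SIGTERM.
    signal.signal(signal.SIGTERM, worker.request_stop)
```

The list returned by `run()` is what you would persist (to storage or a queue) so that a replacement VM can pick up where the stopped one left off.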
Most of the compute power used nowadays is still utilized by VMs but I think it should only be the first step towards utilizing the full potential of the Cloud in services such as Serverless and Container management.
If you are looking for a cloud provider only to operate VMs you will probably get a better deal from your local provider or from some specialized Gaming Cloud Providers.
VMs are great if you need to run your legacy service or need an intermediate step when migrating your services from on-premises to the cloud. But the cost of maintaining a VM fleet in man-days can bring you headaches in the future if you build it from scratch. (If you already have a working CI/CD pipeline, all you need is some fine-tuning.)
If you have to use VMs, try to Dockerize your solutions inside the VM to get the maximum CPU/memory utilization. Google and AWS can monitor these values and reward you with cost reductions if you utilize your VM at more than 70% during its run time. Both providers know these resources can be used better in managed Kubernetes or Serverless services.
So who is the winner here? AWS leads in availability and minimal downtime, but you will pay for it with higher prices. MS Azure is somewhere in the middle, and GCP gets points for giving you price discounts if you utilize your VM well and for recommending more efficient services from its offering for your use case. As a Game Developer my preference is GCP because I do think the age of VMs in Gaming is already history.
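To see why packing several containers onto one VM raises utilization, consider this small sketch. It uses the classic first-fit decreasing heuristic and looks at CPU requests only, ignoring memory and any headroom you would want in production; the function names and the numbers are illustrative.

```python
def pack_containers(cpu_requests, vm_cpus):
    """First-fit decreasing: put each container on the first VM with room."""
    vms = []  # each VM is the list of CPU requests placed on it
    for request in sorted(cpu_requests, reverse=True):
        if request > vm_cpus:
            raise ValueError(f"container needs {request} CPUs, VM has {vm_cpus}")
        for vm in vms:
            if sum(vm) + request <= vm_cpus:
                vm.append(request)
                break
        else:
            vms.append([request])  # no existing VM had room
    return vms


def utilization(vms, vm_cpus):
    """Fraction of each VM's CPUs that is requested by its containers."""
    return [sum(vm) / vm_cpus for vm in vms]


# Five services with different CPU needs on 4-CPU VMs.
placement = pack_containers([2, 1, 1, 3, 1], vm_cpus=4)
```

Here five services fit on two fully used VMs, whereas one service per VM would need five mostly idle machines.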
Finally, we are getting to the chapter where we can discuss topics more specific for the gaming industry.
I assume you have basic knowledge of Kubernetes - an orchestration tool for (not only) Docker containers. All three major cloud providers support Kubernetes. On top of that, they have their own managed version, known as Kubernetes as a Service, that provides a simpler way to manage, scale and monitor your Container network. They are: Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS) and Google Kubernetes Engine (GKE).
These services don't differ all that much. If you crave more detailed info there is a nice article about it here.
The most significant difference is not in the performance or pricing but in the CLIs for these services.
The more powerful the CLI is, the easier it is for you to create and maintain your solution.
For example Terraform and Ansible with Managed Kubernetes and its CLI will help you create infrastructure as code and keep your CI/CD pipelines easy to understand for developers. Of course, you can achieve this with VMs but I intentionally skipped this part in the VM section because I think Kubernetes was already born with Infrastructure as Code, CI/CD and DevOps simplification in mind.
But let's get back to gaming. Pure Kubernetes is nice for running the backend of your game with DBs, Analytics and Stateless Microservices, BUT where are the dedicated game servers?
Here's the catch. Game Servers are not stateless. They are stateful and there is a danger in running them on “pure Kubernetes”. And here comes the Clash of the Titans of the Cloud: different approaches to making Managed Kubernetes communicate with your game servers.
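The danger is easy to show with a toy example: a generic autoscaler treats all instances as interchangeable, but a game server with players on it must not be killed just because the fleet is too big. The sketch below illustrates that rule in plain Python; the `GameServer` type and the selection function are hypothetical and do not correspond to any provider's SDK.

```python
from dataclasses import dataclass


@dataclass
class GameServer:
    name: str
    players: int


def servers_safe_to_remove(servers, wanted_total):
    """Pick servers to shut down without interrupting a running match."""
    surplus = max(len(servers) - wanted_total, 0)
    empty = [s.name for s in servers if s.players == 0]
    return empty[:surplus]


fleet = [
    GameServer("gs-1", players=8),
    GameServer("gs-2", players=0),
    GameServer("gs-3", players=0),
    GameServer("gs-4", players=3),
]
# The scaler wants to shrink to 1 server, but only the empty ones may go.
removable = servers_safe_to_remove(fleet, wanted_total=1)
```

The scaler asked for three servers to be removed, yet only two can go right away; the rest have to wait until their matches end. This awareness of session state is what the dedicated game server solutions add on top of Kubernetes.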
AWS comes with the Dedicated Game Server Hosting system GameLift, which provides an SDK (for Unity or Unreal Engine) that can communicate with Amazon EKS and let the Engine know when to scale up or down, along with other important info about the overall health of your nodes or clusters.
And that's not all - you also get free matchmaking with FlexMatch, integrated monitoring and other cool features such as an on-demand mode for these servers.
It is a nice package overall. By using the AWS GameLift SDK you will get into a vendor lock-in situation, but on the other hand all you need to provide is your game code. AWS takes care of the rest.
Google has a different approach. They understand that you are on Kubernetes because you are trying to avoid vendor lock-in, so they came up with their own open source solution - Agones. It's a platform for deploying, hosting, scaling, and orchestrating dedicated game servers for large scale multiplayer games. (source: https://agones.dev/site/docs/overview/) And just like Amazon, Google comes with a Matchmaking solution - Open Match.
Contrary to AWS's solution, where you have some JSON or YAML files for configuration, Open Match allows you to write your own code. Thus you have a free hand to specify any type of rules and how they will be processed. (source: https://openmatch.dev/site/) From my own experience this is a big advantage. Based on the CCU and popularity of your game you can change the algorithm for player ratings, or use 3 or more algorithms depending on the time of day, the day of the week or the season. Properly working matchmaking is one of the core structures of multiplayer games. (https://agones.dev/site/docs/overview/)
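To give a feel for what "writing your own matchmaking code" can look like, here is a small, self-contained Python sketch of a rule that changes with the time of day. It does not use the Open Match API; the function names, the rating gaps and the peak hours are all assumptions chosen for illustration.

```python
def max_rating_gap(hour):
    """Hypothetical rule: strict matching at peak hours, looser when quiet."""
    return 100 if 17 <= hour < 23 else 300


def make_matches(players, hour):
    """Pair players with neighbouring ratings.

    players maps a player name to a rating. Returns (matches, waiting).
    """
    gap = max_rating_gap(hour)
    queue = sorted(players.items(), key=lambda item: item[1])
    matches, waiting = [], []
    i = 0
    while i < len(queue):
        if i + 1 < len(queue) and queue[i + 1][1] - queue[i][1] <= gap:
            matches.append((queue[i][0], queue[i + 1][0]))
            i += 2
        else:
            waiting.append(queue[i][0])
            i += 1
    return matches, waiting


lobby = {"ann": 1500, "bob": 1560, "cy": 1800, "dee": 2050}
evening = make_matches(lobby, hour=20)  # peak time, strict gap
night = make_matches(lobby, hour=4)     # quiet time, wider gap
```

In the evening only ann and bob are matched while cy and dee keep waiting for closer opponents; at night the wider gap lets cy and dee play each other. Swapping `max_rating_gap` for a different rule is a code change, not a fight with a configuration format.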
Microsoft created its own platform, PlayFab, especially for gaming. It provides you with tools and SDKs not only for dedicated Game servers but also for cross-platform user account sharing (Nintendo Switch, Xbox, PS4…). There is built-in user authentication, Matchmaking with its own leaderboard system and match statistics, integrated chat and analytics, plus DDoS prevention and automated monitoring to identify this type of threat.
Of course all of these tools and features can be built on AWS or GCP, but do not underestimate the time and costs needed to develop them on your own, and also think about Microsoft's experience gained from thousands of games running on Xbox platforms. This is definitely not merely Microsoft's attempt to get into the Gaming industry but a finely tuned tool set built to work on Xbox and other platforms.
Apart from containerisation, where you can run most of your Game Backend services on Kubernetes, you can now also host your Dedicated Servers on the same network or the same Kubernetes Clusters as your other services. This gives you more options in pricing and allows for more flexibility in your end solution. Long gone are the times when you had to host your Dedicated Game server with another provider (like HostGator or BlueHost) or use a third party SDK and monitoring for game services such as Photon, GameSparks or the open source Nakama. You can opt to build your Game Backend in one place with a fully managed Game SDK from Azure or AWS, or implement a fully open-source solution from GCP.
None of the providers clearly wins in this category. In my opinion, MS Azure offers the most advanced package. I'd rank GCP in last place, however, it does have a huge advantage over others in the lack of vendor lock-in.
In part 2 I cover many more topics, such as Cloud Storage, DBs, Cache and Queues, NoSQL in the SQL world, and Queues, Subscriptions and Tasks. You will also see my final recommendations.