Footprint of Google Cloud Platform – Basics of Google Cloud Platform

Regions are independent geographical areas, and each region is made up of zones. Zones and regions are logical abstractions of the underlying physical resources, which are hosted in one or more datacenters located around the world. Within a region, Google Cloud resources are deployed to specific locations referred to as zones. It is important to treat a zone as a single failure domain within a region. Figure 1.8 shows the footprint of GCP:

Figure 1.8: Footprint of GCP

At the time this book was written, there were 34 regions, 103 zones, and 147 network edge locations serving 200+ countries and territories. GCP is constantly increasing its presence across the globe, so please check the link mentioned below to get the latest numbers.

Image source: https://cloud.google.com/about/locations

The services and resources offered by Google Cloud may be managed at a zonal or regional level, or they can be managed by Google across multiple regions:

  • Zonal resources: Resources in a zone operate only in that zone. When a zone goes down, some or all of the resources in that zone can be affected.
  • Regional resources: These are deployed redundantly across multiple zones within a region to ensure they remain available even if a single zone fails.
  • Multiregional resources: Google manages a number of Google Cloud services to be redundant and distributed both within and between regions. These services improve resource efficiency, performance, and availability.
  • Global resources: Global resources can be accessed from any zone by any resource within the same project. There is no requirement to specify a scope when creating a global resource.

Network edge locations are useful for hosting static content that is popular with the user base of the hosting service. The content is temporarily cached on these edge nodes, which enables users to retrieve it from a location much closer to where they are. Users have a more positive experience as a result.

There are several benefits associated with GCP’s regions and zones. The notion of regions and zones helps in ensuring high availability, high redundancy, and high dependability. It also helps organizations obey the laws and regulations established by governments, since data rules can vary greatly from one nation to the next.

Introduction to Google Cloud Platform – Basics of Google Cloud Platform

Google Cloud Platform is one of the hyperscale infrastructure providers in the industry. It is a collection of cloud computing services offered by Google. These services operate on the same infrastructure that Google employs for its end-user products, including YouTube, Gmail, and a number of other offerings. The Google Cloud Platform provides a wide range of services, such as computing, storage, and networking, among other things.

Google Cloud Platform was first launched in 2008, and it is now the third most widely used cloud platform. The need for cloud-hosted platforms continues to grow.

Google Cloud gives us a service-centric perspective of all our environments, in addition to providing a standard platform and data analysis for deployments regardless of where they are physically located. Using the sophisticated analytics and machine learning capabilities offered by Google Cloud, we can extract the most useful insights from our data. Google’s serverless data analytics and machine learning platform lets users automate procedures, generate predictions, and simplify administration and operations. The services provided by Google Cloud encrypt data while it is stored, while it is being sent, and while it is being used, and advanced security mechanisms protect the privacy of data.

Account creation on Google Cloud Platform

Users can create a free GCP account from the link https://cloud.google.com/free.

The free account provides $300 of credit for a period of 90 days.

Steps for creating a free account are as follows:

  1. Open https://cloud.google.com/free.
  2. Click on Get started for free.

The opening screen looks like Figure 1.2:

Figure 1.2: GCP account creation

  3. Log in with your Gmail credentials; create an account if you do not have one. This is illustrated in Figure 1.3:

Figure 1.3: GCP account creation enter valid mail address

  4. Select your COUNTRY and describe your needs, as shown in Figure 1.4:

Figure 1.4: GCP account creation country selection

  5. Select the Country and project. Check the Terms of Service and click on CONTINUE.
  6. Provide a phone number for identity verification, as shown in Figure 1.5:

Figure 1.5: GCP account creation enter phone number

  7. Free accounts require a credit card; verification costs Rs 2. An address must also be provided. Click on START MY FREE TRIAL on this page:

Figure 1.6: GCP account creation enter valid credit card details

  8. Users will land on this page once the free trial has started. The welcome page can be seen in Figure 1.7:

Figure 1.7: Landing page of GCP

Importance of Cloud for data scientist – Basics of Google Cloud Platform

Since the beginning of the previous decade, data has grown at an exponential rate, and this trend is expected to continue. The safe and secure storage of data should be one of the top priorities of every company. The cloud is usually the top option for storing and processing these enormous quantities of data, since it offers all of the advantages discussed above. As a consequence, a data scientist in today’s world needs experience with cloud computing in addition to expertise in statistics, machine learning algorithms, and other areas.

However, local machines often lack the processing capacity to carry out these tasks in a timely way, if they can perform them at all. In addition, the memory of a local machine is often incapable of holding massive datasets. Hardware capacity determines how quickly an assignment is performed and how well it is accomplished. Thanks to the cloud, data scientists can now investigate more extensive collections of data without being constrained by the capabilities of their local workstations. Utilizing the cloud can also decrease infrastructure costs, since it eliminates the requirement for physical servers, and relying on the cloud for data storage reduces those costs further. In addition to data storage, many cloud platforms, including Google Cloud Platform, also offer services catering to data ingestion, data processing, analytics, AI, and data visualization.

Types of Cloud

There are three types of cloud, based on how they are deployed and who controls them:

  • Public Cloud
  • Private Cloud
  • Hybrid Cloud

Public Cloud: The public cloud is a massive collection of readily available computing resources, including networking, memory, processing elements, and storage. Users can rent these resources, which are housed in the public cloud vendor’s globally dispersed and fully managed datacenters, to create their IT architecture. In this form of cloud, users access their resources using a web browser. Google Cloud Platform is an example of a public cloud.

A major advantage of the public cloud is that the underlying hardware and logic are hosted, owned, and maintained by each vendor. Customers are not responsible for purchasing or maintaining the physical components that comprise their public cloud IT solutions. In addition, Service Level Agreements (SLAs) bind each provider to a monthly uptime percentage and security guarantee in accordance with regulations.

Private Cloud: Unlike public clouds, private clouds are owned and operated by a single organization. They are usually housed in the company’s datacenter and run on the organization’s own equipment. An organization may, however, use a third-party supplier to host its private cloud. Even though the resources are housed in a remotely managed datacenter, the private cloud shares certain characteristics with the public cloud in this case. Such suppliers may be able to provide certain administrative services, but they cannot offer the full range of public cloud services.

If the private cloud is housed in its own datacenter, the organization has complete control over the whole system. A self-hosted private cloud may also help to comply with some of the stricter security and compliance regulations.

Hybrid Cloud: As the name indicates, this kind of cloud computing is a blend and integration of public and private clouds. It provides the advantages associated with both cloud types, enables a larger degree of flexibility in the transmission of data, and expands the alternatives available to a company for its adoption. This guarantees a high level of control as well as an easy transition, while providing everything at more economical rates.

Advantages of Cloud – Basics of Google Cloud Platform

There are various advantages of cloud as shown in Figure 1.1, and mentioned as follows:

Figure 1.1: Advantages of Cloud platform

  • Cost efficiency: In terms of IT infrastructure management, cloud computing is undoubtedly the most cost-effective option. A variety of pay-as-you-go and other scalable choices make it affordable for organizations of any size to transition from on-premises hardware to the cloud. Instead of purchasing costly server equipment and PCs that demand long hours of setup and maintenance, organizations can use cloud resources. The cloud also helps reduce spending on compute, storage, network, operations, and upgrades.
  • Scalability and elasticity: Overall, cloud hosting is more flexible than hosting on a local machine. You do not have to undertake a costly (and time-consuming) upgrade to your IT infrastructure if you need more bandwidth. This increased degree of latitude and adaptability may have a major impact on productivity.

Elasticity is employed only for a short amount of time to deal with rapid shifts in workload; it is a short-term strategy used to meet spikes in demand, whether unanticipated or seasonal. Scalability, by contrast, meets a static increase in workload; it is a long-term approach used to cope with an anticipated rise in demand.

  • Security: Cloud platform provides a multitude of cutting-edge security measures, which ensure the safe storage and management of any data. Granular permissions and access control using federated roles are two examples of features that may help limit access to sensitive data to just those workers who have a legitimate need for it. This helps reduce the attack surface that is available to hostile actors. Authentication, access control, and encryption are some of the fundamental safeguards that providers of cloud storage put in place to secure their platforms and the data that is processed on those platforms. After that, users can implement additional security measures of their own, in addition to these precautions, to further strengthen cloud data protection and restrict access to sensitive information stored in the cloud.
  • Availability: The vast majority of cloud service providers are quite dependable in terms of the provision of their services; in fact, the vast majority of them maintain an uptime of 99.9 percent. Moving to the cloud should be done with the intention of achieving high availability. The goal is to make your company’s goods, services, and tools accessible to your clients and workers at any time of day and from any location in the world using any device that can connect to the internet.
  • Reduced downtime: Cloud based solutions provide the ability to operate critical systems and data directly from the cloud or to restore them to any location. During a catastrophic event involving information technology, they make it easier for you to get these systems back online, reducing the amount of manual work required by conventional recovery techniques.
  • Increased Collaboration: Developers, QA, operations, security, and product architects are all exposed to the same infrastructure and may work concurrently without tripping on one another’s toes in cloud settings. To minimize disputes and misunderstanding, cloud roles and permissions provide more visibility and monitoring of who performed what and when. Different cloud environments, such as staging, QA, demo, and pre-production, may be created for specialized reasons. The cloud makes transparent collaboration simpler and promotes it.
  • Insight: Integrated cloud analytics offered by cloud platforms provide a bird’s-eye view of your data. When your data is kept in the cloud, it is much simpler to put monitoring systems in place and create individualized reports for information analysis throughout the whole organization. You will be able to improve efficiency and construct action plans based on these insights, allowing your organization to fulfil its objectives.
  • Control over data: The cloud provides total visibility and control over your data. You have complete control over which users are granted access to which levels of specified data. This not only gives you control but also helps simplify work by ensuring that staff members are aware of the tasks they have been allocated. It also makes working together much simpler: because several users can edit the same copy of a document at the same time, there is no need to distribute multiple copies.
  • Automatic software updates: There is nothing more cumbersome than waiting for system upgrades to install, especially for those who already have a lot on their plates. Applications hosted in the cloud refresh and update themselves automatically, eliminating the need for IT personnel to carry out manual updates for the whole organization. This saves critical time and money that would otherwise be spent on outside consulting.
  • Ease of managing: The cloud can streamline and improve IT maintenance and management through SLA-backed agreements, centralized resource administration, and managed infrastructure. Users can take advantage of a simple user interface without having to worry about installing anything, and they are provided with management, maintenance, and delivery of the IT services.

Introduction – Basics of Google Cloud Platform

You will learn about the Google Cloud Platform in this chapter, its benefits, and the role it plays in today’s digital revolution. The chapter covers basic knowledge of cloud computing, including cloud service models, GCP account creation, GCP’s footprint, its range of services, and the GCP hierarchy. It also introduces a few key GCP services, including storage, compute, Google BigQuery, and Identity and Access Management.

Structure

In this chapter, we will cover the following topics:

  • Introduction and basics of Cloud platform
  • Advantages of Cloud
  • Importance of Cloud for data scientists
  • Types of Cloud
  • Introduction to Google Cloud platform
  • Footprint of Google Cloud
  • Cloud service model
  • Services of GCP
  • Hierarchy of GCP
  • Interacting with GCP services
  • Storage in GCP
  • Compute in GCP
  • BigQuery
  • Identity and Access Management

Objectives

Before diving into Vertex AI on the Google Cloud Platform, it is essential to grasp a few significant principles and vital services of the cloud platform. Users will have a solid understanding of the GCP components and services by the time this chapter ends. Detailed instructions for using GCP’s storage, compute, and BigQuery services are included.

Introduction to Cloud

The term Cloud describes the applications and databases that run on servers that can be accessed over the Internet. Datacenters across the globe host the cloud servers. Organizations can avoid managing physical servers or running software on their own computers by utilizing cloud computing. The cloud enables users to access the same files and applications from almost any device, because the computing and storage take place on servers in a datacenter instead of locally on the user’s device.

For businesses, switching to cloud computing removes some IT costs and overhead: for instance, they no longer need to update and maintain their own servers, as the cloud vendor they are using will do that.

Deploying with Terraform – Deploying Skills Mapper

To deploy the environment, you need to run the Terraform commands in the terraform directory.

First, initialize Terraform to download the needed plugins with:

terraform init

Then check that you have set the required variables in your terraform.tfvars with:

terraform validate

All being well, you should see Success! The configuration is valid.

Although Terraform can enable Google services, and these scripts do, this can be unreliable, as services take time to enable. Use the enable_services.sh script to enable services with gcloud:

./enable_services.sh

Next, preview the changes with:

terraform plan

Terraform will then show how many items would be added, changed, or destroyed. If you have not run Terraform on the projects before, you should see a lot of items to be added.

When you are ready, run the apply command:

terraform apply

Again, Terraform will devise a plan for reaching the desired state. This time, it will prompt you to approve applying the plan. Enter yes and watch while Terraform creates everything from this book for you. This may take around 30 minutes, the majority of which is the creation of the Cloud SQL database used by the fact service.

When completed, you will see several outputs from Terraform that look like this:

application-project = "skillsmapper-application"
git-commit = "3ecff393be00e331bb4412f4dc24a3caab2e0ab8"
management-project = "skillsmapper-management"
public-domain = "skillsmapper.org"
public-ip = "34.36.189.201"
tfstate_bucket_name = "d87cf08d1d01901c-bucket-tfstate"

The public-ip is the external IP of the global load balancer. Use this to create an A record in your DNS provider for the domain you provided.

Reapplying Terraform

If you make a change to the Terraform configuration, there are a few things you need to do before deploying Terraform again.

First, make sure you are using the application project:

gcloud config set project $APPLICATION_PROJECT_ID

Terraform is unable to change the API Gateway configuration, so you will need to delete it and allow Terraform to recreate it.

Also, if Cloud Run has deployed new versions of the services, you will need to remove them and allow Terraform to recreate them, too, as Terraform will have the wrong version.

When you apply the configuration again, you will notice only a few added, changed, or destroyed resources, as Terraform only applies the differences to what is already there.

Deleting Everything

When you have finished with Skills Mapper, you can also use Terraform to clean up completely using:

terraform destroy

This will remove all the infrastructure that Terraform has created.

At this point, you may also like to unlink the billing accounts from the projects so they can no longer be billed:

gcloud beta billing projects unlink $APPLICATION_PROJECT_ID
gcloud beta billing projects unlink $MANAGEMENT_PROJECT_ID

Terraform Backend – Deploying Skills Mapper

Terraform records the state of all the infrastructure it has created so that when the configuration is applied, it only makes the changes needed to get the infrastructure to the desired state. There could be no changes, the configuration could have been changed, or the infrastructure could have been changed outside Terraform, for example by someone issuing gcloud commands. Terraform will work out what needs to be done to get to the desired state.

By default, Terraform keeps this state on the machine that was used to apply the configuration, which means it cannot be shared. Alternatively, Terraform can store the state of the infrastructure in a backend. In the case of Google Cloud, you can use a Cloud Storage bucket for this purpose.

Create a Cloud Storage bucket to store the Terraform state using gcloud in the management project. As bucket names need to be unique, using the project number as a suffix is a good way to ensure this.
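Terraform is then pointed at that bucket with a backend block along these lines (a sketch: the bucket name is illustrative, and the prefix is an assumption, so check the actual Skills Mapper configuration for the real values):

```hcl
terraform {
  backend "gcs" {
    # Illustrative name; use the bucket you created in the management project
    bucket = "example-bucket-tfstate"
    # Assumed prefix for the state objects within the bucket
    prefix = "terraform/state"
  }
}
```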

Configure Identity Platform

In Chapter 7, you enabled Identity Platform. If you have created a new application project, you will need to enable it again in the project and make a note of the API key, as you will need to pass it to Terraform as a variable.

Setting Terraform Variables

Terraform uses variables to customize the configuration. These are defined in a terraform.tfvars file in the terraform directory. Many of these have defaults you can override, but you will need to set the following variables before deployment.

Create a terraform.tfvars file in the terraform directory with the following content:

Key                      Example value                                       Description
domain                   skillsmapper.org                                    The domain name to use for the environment
region                   europe-west2                                        The region to deploy the environment to
billing_account          014…                                                The ID of the billing account associated with your projects
management_project_id    skillsmapper-management                             The ID of the management project
application_project_id   skillsmapper-application                            The ID of the application project
api_key                  AIzaSyC…                                            The API key for Identity Platform
app_installation_id      skillsmapper                                        The ID of the app installation for GitHub used when setting up the factory
github_repo              https://github.com/SkillsMapper/skillsmapper.git    The name of the GitHub repository to use for the factory
github_token             ghp_…                                               The GitHub token to use for the factory
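Assembled into a terraform.tfvars file, the example values above would look something like this (substitute your own values; the truncated IDs and tokens are placeholders):

```hcl
domain                 = "skillsmapper.org"
region                 = "europe-west2"
billing_account        = "014..."
management_project_id  = "skillsmapper-management"
application_project_id = "skillsmapper-application"
api_key                = "AIzaSyC..."
app_installation_id    = "skillsmapper"
github_repo            = "https://github.com/SkillsMapper/skillsmapper.git"
github_token           = "ghp_..."
```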

If you have set all the environment variables for other chapters in this book, you can generate the terraform.tfvars from the file terraform.tfvars.template in the example code:

envsubst < terraform.tfvars.template > terraform.tfvars

With this file created, you are ready to deploy using Terraform.
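For reference, the template presumably substitutes those environment variables into placeholders; a fragment might look like this (an assumed shape, using only variable names mentioned in this appendix, so check the actual terraform.tfvars.template in the example code):

```hcl
management_project_id  = "${MANAGEMENT_PROJECT_ID}"
application_project_id = "${APPLICATION_PROJECT_ID}"
```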

Installing Terraform – Deploying Skills Mapper

Terraform is a command-line tool that you can install on your local machine. It’s compatible with Windows, Mac, and Linux, and you can download it directly from the Terraform website. After downloading, you’ll need to add it to your system’s path to enable command-line execution. You can verify the installation by running terraform --version, which should return the installed version.

Terraform makes use of plugins that allow it to communicate with the APIs of service providers like Google Cloud. Not surprisingly, in this setup, you will mainly be using the Google Cloud provider. Terraform is not perfect, though, and it is common to come across small limitations. The Skills Mapper deployment is no exception, so there are a few workarounds required.

Terraform Workflow

Using the Terraform tool has four main steps:

terraform init

Initialize the Terraform environment and download any plugins needed.

terraform plan

Show what Terraform will do. Terraform will check the current state, compare it to the desired state, and show what it will do to get there.

terraform apply

Apply the changes to the infrastructure. Terraform will make the changes to the infrastructure to get to the desired state.

terraform destroy

Destroy the infrastructure. Terraform will remove all the infrastructure it created.

Terraform Configuration

Terraform uses configuration files to define the desired state. For Skills Mapper, these are in the terraform directory of the GitHub repository. There are many files in this configuration, and they are separated into modules, which is Terraform’s way of grouping functionality for reuse.
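As a sketch of how a module is consumed (the module name, path, and inputs here are illustrative, not the actual Skills Mapper modules):

```hcl
module "application" {
  # Hypothetical module path and inputs, for illustration only
  source     = "./modules/application"
  project_id = var.application_project_id
  region     = var.region
}
```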

Preparing for Terraform

Several prerequisites need to be in place before you can deploy using Terraform.

Creating Projects

First, you need to create two projects, an application project and a management project, as you did earlier in the book. Both projects must have a billing account enabled. The instructions for this are in Chapter 4.

Ensure you have the names of these projects available as environment variables (e.g., skillsmapper-application and skillsmapper-management, respectively):

export APPLICATION_PROJECT_ID=skillsmapper-application
export MANAGEMENT_PROJECT_ID=skillsmapper-management

Reintroducing Terraform – Deploying Skills Mapper

In most of this book, you have been using gcloud commands to deploy everything. If you wanted to ship the product, you could do what I have done in the book and produce a step-by-step guide to the commands. However, it is easy to make a mistake when following instructions. What would be much better is to automate all those commands in a way that could consistently deploy everything for you with a single command.

One option would be to put all the commands in shell scripts. However, when using gcloud commands you are effectively calling the Google Cloud API in the background. What is better is to use a tool that makes the same API calls but is designed for this type of automation. This is the principle of infrastructure as code (IaC).

In this appendix, you have the opportunity to set up everything discussed in this book in one go with automation.

Note

The code for this chapter is in the terraform folder of the GitHub repository.

Reintroducing Terraform

The tool designated for automating the creation of infrastructure in this context is Terraform, an open source offering from HashiCorp. Terraform exemplifies an IaC tool, a concept briefly explored in Chapter 5 when it was utilized to deploy the tag updater.

While Google Cloud offers a similar tool called Deployment Manager, it is limited to supporting only Google Cloud. On the other hand, Terraform’s applicability extends to all public clouds and various other types of infrastructure. This broader compatibility has made Terraform more widely accepted, even within the Google Cloud ecosystem.

To understand the distinction between using Terraform and manual methods like gcloud commands or shell scripts, consider the difference between imperative and declarative approaches:

Imperative approach

Using gcloud commands or shell scripts is an imperative method. Here, you act as a micromanaging manager, explicitly directing the Google Cloud API on what actions to perform and how to execute them.

Declarative approach

Terraform operates on a declarative principle. Instead of micromanaging each step, you define a specific goal, and Terraform takes the necessary actions to achieve it. This approach is similar to how Kubernetes functions; you declare the desired state, and the tool works to realize that state.

The declarative nature of Terraform allows for a more streamlined and efficient process, aligning the tool with the objectives without requiring detailed command over each step.
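To make the contrast concrete, here is a hedged sketch (the bucket name is illustrative, and the resource shown is a standard Google provider resource, not taken from the Skills Mapper configuration). Imperatively, you issue a command; declaratively, you describe the end state and let Terraform converge on it:

```hcl
# Imperative: tell the API exactly what to do, step by step:
#   gcloud storage buckets create gs://my-example-bucket --location=europe-west2

# Declarative: state the goal; Terraform creates, updates, or leaves
# the bucket alone depending on what already exists.
resource "google_storage_bucket" "example" {
  name     = "my-example-bucket"   # illustrative name
  location = "europe-west2"
}
```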

What Terraform is effectively doing is taking the destination, defined as configuration in HashiCorp Configuration Language (HCL), and working out the route to get there, provisioning the entire secure environment. This is reproducible and repeatable, so if you wanted to have multiple environments with the same configuration (e.g., dev, QA, and prod), you could build them with the same recipe, ensuring a consistent product.

Terraform also allows you to specify variables and compute values to customize the deployment. It also understands the dependencies between resources and creates them in the right order. Most importantly, it keeps track of everything that is created; if you want to remove everything, it can clean up after itself.

The code used to define the desired state also acts as a way of documenting all the infrastructure. If anyone wants to understand all the infrastructure used in the system, the Terraform configuration is a central source of truth. As it is code, it can be shared in a source code repository and versioned with an audited history. This means developers can issue pull requests for changes, for example, rather than having to raise tickets with an operations team. It is a great example of how a tool enables DevOps or SRE practices.

This appendix is here to help you use Terraform to deploy your own Skills Mapper environment. It is not intended to go into Terraform in depth. For that, I recommend the Terraform documentation or Terraform: Up and Running (O’Reilly) by Yevgeniy Brikman.

Conferences and Events – Going Further

Google hosts two significant events annually: Google Cloud Next and Google I/O, each serving distinct audiences and covering unique areas of focus.

Google I/O, typically held in the second quarter, is a developer-oriented conference. It’s designed primarily for software engineers and developers utilizing Google’s consumer-oriented platforms, such as Android, Chrome, and Firebase, as well as Google Cloud. The event offers detailed technical sessions on creating applications across web, mobile, and enterprise realms using Google technologies. It’s also renowned for product announcements related to Google’s consumer platforms.

Conversely, Google Cloud Next is aimed at enterprise IT professionals and Google Cloud developers, taking place usually in the third quarter. Its focus revolves around Google Cloud Platform (GCP) and Google Workspace. The event provides insights into the latest developments and innovations in cloud technology. It also presents networking opportunities, a wealth of learning resources, and expert-led sessions dedicated to helping businesses leverage the power of the cloud for transformative operational changes. Its feel is notably more corporate than Google I/O.

Both conferences record the hundreds of talks presented and make them accessible on YouTube. This wealth of knowledge is a fantastic resource for keeping abreast of the latest developments in Google Cloud and gaining an in-depth understanding of technical areas.

In addition to these main events, numerous local events tied to Google Cloud Next and Google I/O are organized by local Google teams or community groups. These include Google I/O Extended and Google Cloud Next Developer Days, which offer a summary of the content from the larger events. The Google Events website is a reliable source to stay updated on upcoming happenings.

Summary

As you turn the last page of this book, my hope is that it has kindled a fire in you—a deep, consuming desire to explore the vast and fascinating world of Google Cloud, but more importantly, to build with it and innovate. If it has, then this book has served its purpose.

Remember, you are not alone on this journey. There’s an immense community of like-minded cloud enthusiasts and Google Cloud experts, eager to support and guide you on this path. They’re rooting for your success—so embrace their help!

Writing this book has been an enriching experience, filled with growth and discovery. I trust that you’ve found reading it just as enjoyable. I would be thrilled to hear about your unique experiences and journeys with Google Cloud. Your feedback on this book is not only welcome but greatly appreciated.

To share your thoughts and experiences, or simply reach out, please visit my website at https://danielvaughan.com.

As you venture further into the world of cloud computing, remember: every day brings new opportunities for growth and innovation. Embrace them with open arms.

Happy cloud computing, and here’s to the incredible journey that lies ahead!