Right now, since we are running Kubernetes locally, we have only one worker node, so all these pods will run on it. In a production environment, where the cluster will likely have several worker nodes, Kubernetes will try to spread these pods across different nodes, so our application stays up even if one of the nodes fails.
Scaling it down
Now, you can probably imagine what we need to do to scale the number of pods down. Just change the deployment manifest to the number of replicas you want and reapply it, and Kubernetes will take care of terminating the extra pods for us.
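As a minimal sketch, assuming a Deployment named `myapp` (a hypothetical name; substitute your own) that previously ran three replicas, scaling down is just lowering `spec.replicas` in the manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp            # hypothetical name; use your deployment's name
spec:
  replicas: 1            # lowered from 3; Kubernetes terminates the extra pods
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myapp:latest   # hypothetical image
```

After reapplying it with `kubectl apply -f deployment.yaml`, running `kubectl get pods` should briefly show the surplus pods in a `Terminating` state before they disappear.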