How to install Stackdriver monitoring agent in Google Container VM images?

I followed these instructions: https://cloud.google.com/monitoring/agent/install-agent#linux-install

$ curl -O "https://repo.stackdriver.com/stack-install.sh"
$ sudo bash stack-install.sh --write-gcm
Unidentifiable or unsupported platform.

The contents of /etc/os-release:

$ cat /etc/os-release
BUILD_ID=8820.0.0
NAME="Container-VM Image"
GOOGLE_CRASH_ID=Lakitu
VERSION_ID=55
BUG_REPORT_URL=https://crbug.com/new
PRETTY_NAME="Google Container-VM Image"
VERSION=55
GOOGLE_METRICS_PRODUCT_ID=26
HOME_URL="https://cloud.google.com/compute/docs/containers/vm-image/"
ID=gci
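
The error suggests the installer could not match this image against the distributions it supports. How stack-install.sh actually detects the platform is not shown in this thread, so the following is only a sketch of how a script might read the ID field from an os-release file and recognize a container image. The function names are hypothetical, and treating ID=cos (Container-Optimized OS) the same way is an assumption.

```shell
# Print the ID field of an os-release style file (default: /etc/os-release).
os_release_id() {
  local file="${1:-/etc/os-release}"
  # Source the file in a subshell so its variables do not leak into the caller.
  ( . "$file" && printf '%s\n' "$ID" )
}

# Succeed if the image identifies itself as a Google container image.
# "gci" is the ID shown above; "cos" is an assumed ID for newer images.
is_container_image() {
  case "$(os_release_id "$1")" in
    gci|cos) return 0 ;;
    *) return 1 ;;
  esac
}
```

Running `is_container_image && echo "package-based agent install will not work here"` on a node would then be a quick check before attempting the installer.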

The FAQ (https://cloud.google.com/compute/docs/containers/vm-image/faq#what_is_the_software_package_manager_for_container-vm_image) says:

"In order to update a particular package, the entire OS image needs to be updated"

So it seems we must either wait for an image release that ships with the Stackdriver agent preinstalled, or give up.

Also, this VM image was not my choice: newly created GKE nodes use Container-VM images by default. So for now I'll try to create nodes with a different image via gcloud container node-pools create --image-type
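
A minimal sketch of that node-pool command, wrapped in a function so the pieces are explicit. The function name, its three arguments, and the GCLOUD override variable are all hypothetical; which image types are accepted depends on the GKE version, and the zone or region is assumed to come from your gcloud configuration.

```shell
# Create a GKE node pool with an explicit image type.
# Arguments (all placeholders you supply): cluster name, pool name, image type.
# GCLOUD is a hypothetical override for the gcloud binary, useful for dry runs.
create_node_pool() {
  local cluster="$1" pool="$2" image_type="$3"
  "${GCLOUD:-gcloud}" container node-pools create "$pool" \
    --cluster "$cluster" \
    --image-type "$image_type"
}
```

For example, `create_node_pool YOUR_CLUSTER_NAME YOUR_NODE_POOL container_vm` would request the older Debian-based image, provided that image type is still offered for your cluster version.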

Waistcloth answered 5/10, 2016 at 1:46 Comment(6)
On GCE default images, the nodes already have Stackdriver installed. Or at least they have the fluentd logger, which forwards things to Google/Stackdriver. - Amylopsin
Really? Since Stackdriver is still in beta, they don't pre-install the agent; I once asked cloud support about this. Without the agent installed, we cannot monitor memory usage. - Waistcloth
I see, maybe not for memory. But I see logs from Kubernetes apps in Stackdriver logs without needing to do anything. Are you using GKE? - Amylopsin
Yes, I use GKE. As you said, I can see logs. - Waistcloth
For GKE, monitoring works out of the box and doesn't require the Stackdriver agent. If you go to your Stackdriver UI, you can view your GKE clusters as first-class entities. Since the new Container-VM image is optimized for security and containers, we are working on containerizing the Stackdriver agent. Sorry for the inconvenience! - Refectory
@VishnuKannan: As hiroshi pointed out above, monitoring only includes external metrics, which excludes memory. We run into OOM issues quite often in GKE, so it's important to track memory usage on our nodes (as Node events are only persisted for an hour). The GKE UI option "Add cloud monitoring" implies that the agent will be installed, but it isn't (and can't be on GCI). Really looking forward to a fix for this. - Aardvark

You can enable the Stackdriver Monitoring agent on Container-Optimized OS VM instances. Run this command, then restart the instance:

gcloud compute instances add-metadata instance-name --metadata=google-monitoring-enabled=true
Egor answered 23/1, 2020 at 13:43 Comment(3)
Thanks for the answer, but I cannot find any documentation about the google-monitoring-enabled metadata. How did you learn about it? - Waistcloth
I had a discussion with Google support; they shared this, and it works. - Egor
The metadata google-monitoring-enabled=true installs the "node problem detector", which is not the Stackdriver Monitoring agent. It cannot collect the metrics that the Stackdriver agent can, such as "memory utilization" and "disk usage". github.com/kubernetes/node-problem-detector cloud.google.com/container-optimized-os/docs/how-to/… - Medor

You can run:

sudo systemctl start stackdriver-logging
sudo systemctl start stackdriver-monitoring

This spins up some containers with the agent running. Data will show up in your Stackdriver dashboard a few minutes later.

I didn't find this documented anywhere, so I can't tell exactly which images include it, but I tested it on Container-Optimized OS 77-12371.114.0 stable.
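
Since the availability of these units is undocumented, it may help to check that they actually came up after starting them. The sketch below only reports the state of the two unit names used above; the function name and the SYSTEMCTL override variable are hypothetical.

```shell
# Report whether the two units started above are active.
# SYSTEMCTL is a hypothetical override for the systemctl binary, used for testing.
report_stackdriver_units() {
  local unit
  for unit in stackdriver-logging stackdriver-monitoring; do
    if "${SYSTEMCTL:-systemctl}" is-active --quiet "$unit"; then
      echo "$unit: active"
    else
      echo "$unit: not active"
    fi
  done
}
```

Calling `report_stackdriver_units` on the node a few seconds after the two `systemctl start` commands should show whether the image ships these units at all.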

Kwan answered 6/11, 2019 at 13:31 Comment(2)
Thanks for your answer. Next time I create a node with the Google Container-VM image, I'll try it. - Waistcloth
Great answer; this should really be added to the official docs on Container-Optimized OS. I have sent GCP feedback asking them to include this. - Makeyevka

As far as I know (and as Google has confirmed to me), the new Chromium OS image currently does not support the Stackdriver agent. As a workaround, I moved the node pool back to ‘container-vm’ (which uses the Debian image) with the following command:

$ gcloud container clusters upgrade YOUR_CLUSTER_NAME --image-type=container_vm --node-pool=YOUR_NODE_POOL

Replace the cluster name, and set the node pool name to the one that was upgraded to gci earlier (in my case, 'default-pool'). The node versions will be upgraded to the newest ones; you can, however, add an option to deploy another version.

You should now be able to install the Stackdriver agent just as you are used to and set up your desired custom metrics.

Shandy answered 29/11, 2016 at 14:42 Comment(0)

I got around the agent's incompatibility with the new Chromium image by deploying the agent as a container running in privileged mode (conveniently already built: https://github.com/wikiwi/stackdriver-agent) within a Kubernetes DaemonSet, so it runs on each host. Here's the YAML I ended up using (spaces matter):

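# Note: extensions/v1beta1 no longer serves DaemonSet as of Kubernetes 1.16.
# On newer clusters, use apiVersion: apps/v1 and add spec.selector.matchLabels
# with the same label as the pod template (app: stackdriver-agent).
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: stackdriver-agent
spec:
  template:
    metadata:
      labels:
        app: stackdriver-agent
    spec:
      containers:
      - name: stackdriver-agent
        image: wikiwi/stackdriver-agent
        # Privileged mode lets the agent read host-level metrics.
        securityContext:
          privileged: true
        volumeMounts:
        # The host's /proc is exposed to the container at /mnt/proc.
        - mountPath: /mnt/proc
          name: procmnt
        env:
        - name: MONITOR_HOST
          value: "true"
      volumes:
      - name: procmnt
        hostPath:
          path: /proc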
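
To apply a manifest like this and confirm that one agent pod lands on each node, something along these lines should work. This is a sketch: the function name, the manifest file name you pass in, and the KUBECTL override variable are hypothetical, while the label selector comes from the YAML above.

```shell
# Apply the DaemonSet manifest, then list the agent pods with their nodes.
# KUBECTL is a hypothetical override for the kubectl binary, useful for dry runs.
deploy_stackdriver_daemonset() {
  local manifest="$1"
  "${KUBECTL:-kubectl}" apply -f "$manifest" &&
    "${KUBECTL:-kubectl}" get pods -l app=stackdriver-agent -o wide
}
```

For example, `deploy_stackdriver_daemonset stackdriver-agent.yaml`, assuming the YAML was saved under that name. The `-o wide` output includes a NODE column, which makes it easy to see whether every host got a pod.
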
Perilune answered 6/12, 2017 at 20:0 Comment(0)
