- Source: https://prometheus.io/docs/guides/node-exporter/
Goal
- Start Node Exporter on localhost
- Configure and start a Prometheus instance that scrapes metrics from Node Exporter
Overview
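Using the architecture below, we scrape metrics from the exporter and confirm that the Prometheus instance collected them correctly.
Installing Node Exporter
Prometheus Node Exporter is distributed as a single binary. Download it from the download page (https://prometheus.io/download/#node_exporter), extract the archive, and run the binary.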
tar xvfz node_exporter-*.*-amd64.tar.gz
cd node_exporter-*.*-amd64
./node_exporter
...
node_exporter-1.5.0.darwin-amd64 % ./node_exporter
ts=2023-04-22T08:11:04.742Z caller=node_exporter.go:180 level=info msg="Starting node_exporter" version="(version=1.5.0, branch=HEAD, revision=1b48970ffcf5630534fb00bb0687d73c66d1c959)"
ts=2023-04-22T08:11:04.742Z caller=node_exporter.go:181 level=info msg="Build context" build_context="(go=go1.19.3, user=root@3f01c57ed3b7, date=20221129-19:02:03)"
ts=2023-04-22T08:11:04.743Z caller=filesystem_common.go:111 level=info collector=filesystem msg="Parsed flag --collector.filesystem.mount-points-exclude" flag=^/(dev)($|/)
ts=2023-04-22T08:11:04.744Z caller=filesystem_common.go:113 level=info collector=filesystem msg="Parsed flag --collector.filesystem.fs-types-exclude" flag=^devfs$
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:110 level=info msg="Enabled collectors"
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=boottime
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=cpu
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=diskstats
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=filesystem
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=loadavg
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=meminfo
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=netdev
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=os
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=powersupplyclass
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=textfile
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=thermal
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=time
ts=2023-04-22T08:11:04.744Z caller=node_exporter.go:117 level=info collector=uname
ts=2023-04-22T08:11:04.746Z caller=tls_config.go:232 level=info msg="Listening on" address=[::]:9100
ts=2023-04-22T08:11:04.747Z caller=tls_config.go:235 level=info msg="TLS is disabled." http2=false address=[::]:9100
If it started correctly, it listens on port 9100 as shown above. Open http://localhost:9100/metrics in a browser. If metrics are displayed as below, the exporter is working.
# HELP node_boot_time_seconds Unix time of last boot, including microseconds.
# TYPE node_boot_time_seconds gauge
node_boot_time_seconds 1.680654071643156e+09
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 233845.1
node_cpu_seconds_total{cpu="0",mode="nice"} 0
node_cpu_seconds_total{cpu="0",mode="system"} 130121.92
node_cpu_seconds_total{cpu="0",mode="user"} 251268.11
node_cpu_seconds_total{cpu="1",mode="idle"} 236195.84
node_cpu_seconds_total{cpu="1",mode="nice"} 0
node_cpu_seconds_total{cpu="1",mode="system"} 126460.57
node_cpu_seconds_total{cpu="1",mode="user"} 252597.6
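The same check can be done without a browser. The following is a minimal sketch, not part of the original guide, that parses the text format shown above into (name, labels, value) tuples. The names `parse_metrics`, `fetch_metrics`, and `SAMPLE_TEXT` are hypothetical, and the parser is deliberately simplified: it ignores `# HELP` / `# TYPE` lines and optional timestamps, and leaves escape sequences in label values untouched.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# One label pair inside the braces: key="value".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text-format samples into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or HELP/TYPE comment
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything this simplified parser does not understand
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:9100/metrics"):
    """Download the metrics page (assumes Node Exporter is running locally)."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# A few lines copied from the output above, so the parser can be tried offline.
SAMPLE_TEXT = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 233845.1
node_cpu_seconds_total{cpu="0",mode="nice"} 0
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```

With the exporter running, `parse_metrics(fetch_metrics())` should yield every sample the exporter currently exposes.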
Configuring the Prometheus instance
Node Exporter only exposes hardware and kernel metrics. In other words, from Prometheus's point of view, Node Exporter is a target to scrape metrics from.
Edit prometheus.yml as follows.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "node"
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ["localhost:9100"]
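The comments in the configuration above mention two defaults, `scheme` and `metrics_path`. As a rough illustration of how they combine with a target into the address that gets scraped, here is a small sketch. The helper `scrape_url` is hypothetical and only mirrors the documented defaults; it is not how Prometheus itself is implemented.

```python
from urllib.parse import urlunsplit


def scrape_url(target, scheme="http", metrics_path="/metrics"):
    """Build the URL scraped for one static target.

    The default arguments mirror the defaults named in the config comments.
    """
    return urlunsplit((scheme, target, metrics_path, "", ""))


print(scrape_url("localhost:9100"))  # prints http://localhost:9100/metrics
```

This is the same address opened in the browser earlier, which is why only the target needs to be listed in the job.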
Enter "localhost:9100", the address where Node Exporter is listening, under targets, then start Prometheus.
prometheus-2.43.0.darwin-amd64 % ./prometheus --config.file=./prometheus.yml
...
ts=2023-04-22T08:16:18.385Z caller=main.go:520 level=info msg="No time or size retention was set so using the default time retention" duration=15d
ts=2023-04-22T08:16:18.387Z caller=main.go:564 level=info msg="Starting Prometheus Server" mode=server version="(version=2.43.0, branch=HEAD, revision=edfc3bcd025dd6fe296c167a14a216cab1e552ee)"
ts=2023-04-22T08:16:18.387Z caller=main.go:569 level=info build_context="(go=go1.19.7, platform=darwin/amd64, user=root@1fd07b70056a, date=20230321-12:56:36, tags=netgo,builtinassets)"
ts=2023-04-22T08:16:18.387Z caller=main.go:570 level=info host_details=(darwin)
ts=2023-04-22T08:16:18.387Z caller=main.go:571 level=info fd_limits="(soft=122880, hard=unlimited)"
ts=2023-04-22T08:16:18.387Z caller=main.go:572 level=info vm_limits="(soft=unlimited, hard=unlimited)"
ts=2023-04-22T08:16:18.395Z caller=web.go:561 level=info component=web msg="Start listening for connections" address=0.0.0.0:9090
ts=2023-04-22T08:16:18.398Z caller=main.go:1005 level=info msg="Starting TSDB ..."
ts=2023-04-22T08:16:18.400Z caller=tls_config.go:232 level=info component=web msg="Listening on" address=[::]:9090
ts=2023-04-22T08:16:18.401Z caller=tls_config.go:235 level=info component=web msg="TLS is disabled." http2=false address=[::]:9090
ts=2023-04-22T08:16:18.403Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1682051912908 maxt=1682056800000 ulid=01GYHC09ZKW40C8MAZPNB63JGA
ts=2023-04-22T08:16:18.403Z caller=repair.go:56 level=info component=tsdb msg="Found healthy block" mint=1682056800000 maxt=1682064000000 ulid=01GYHGNHAAF9J8D8A3FJHGA947
ts=2023-04-22T08:16:18.418Z caller=head.go:587 level=info component=tsdb msg="Replaying on-disk memory mappable chunks if any"
ts=2023-04-22T08:16:18.422Z caller=head.go:658 level=info component=tsdb msg="On-disk memory mappable chunks replay completed" duration=3.497375ms
ts=2023-04-22T08:16:18.422Z caller=head.go:664 level=info component=tsdb msg="Replaying WAL, this may take a while"
ts=2023-04-22T08:16:18.428Z caller=head.go:700 level=info component=tsdb msg="WAL checkpoint loaded"
ts=2023-04-22T08:16:18.430Z caller=head.go:735 level=info component=tsdb msg="WAL segment loaded" segment=3 maxSegment=6
ts=2023-04-22T08:16:18.434Z caller=head.go:735 level=info component=tsdb msg="WAL segment loaded" segment=4 maxSegment=6
ts=2023-04-22T08:16:18.442Z caller=head.go:735 level=info component=tsdb msg="WAL segment loaded" segment=5 maxSegment=6
ts=2023-04-22T08:16:18.443Z caller=head.go:735 level=info component=tsdb msg="WAL segment loaded" segment=6 maxSegment=6
ts=2023-04-22T08:16:18.443Z caller=head.go:772 level=info component=tsdb msg="WAL replay completed" checkpoint_replay_duration=6.710958ms wal_replay_duration=14.499ms wbl_replay_duration=208ns total_replay_duration=24.755625ms
ts=2023-04-22T08:16:18.445Z caller=main.go:1026 level=info fs_type=1a
ts=2023-04-22T08:16:18.445Z caller=main.go:1029 level=info msg="TSDB started"
ts=2023-04-22T08:16:18.445Z caller=main.go:1209 level=info msg="Loading configuration file" filename=./prometheus.yml
ts=2023-04-22T08:16:19.308Z caller=main.go:1246 level=info msg="Completed loading of configuration file" filename=./prometheus.yml totalDuration=862.318334ms db_storage=2.5µs remote_storage=4.334µs web_handler=959ns query_engine=2.417µs scrape=856.822917ms scrape_sd=203.583µs notify=1.109333ms notify_sd=66.75µs rules=23.584µs tracing=566.125µs
ts=2023-04-22T08:16:19.308Z caller=main.go:990 level=info msg="Server is ready to receive web requests."
ts=2023-04-22T08:16:19.308Z caller=manager.go:974 level=info component="rule manager" msg="Starting rule manager..."
Verifying Node Exporter metrics in Prometheus
Open "localhost:9090" and look at the Graph page to confirm that metrics are being collected, as shown below.