v1.4.1

GitLab: updates to v12.8.1 and adds some custom alerting, plus tweaks
and fixes

Thanos: updates to v0.10.1 or newer and uses Thanos Store's experimental
features to reduce memory impact and OOM errors. Importantly, also
connects the SFO cluster to Thanos Querier through VPN + kubectl proxying.

Node termination handling: adds the GKE node termination handler on
top of estafette-preemptible-killer, since GKE-induced node
terminations were still being observed.

NGINX ingress: updates to v0.29.0 and adds some global configuration
of proxy_next_upstream and proxy_buffering to try to deal with
upstream pods that are frequently being cycled.
parent 2aa2572f
......@@ -7,41 +7,45 @@
- [Added](#added)
- [Changed](#changed)
- [Removed](#removed)
- [1.4.0 - 2020-01-11](#140-2020-01-11)
- [1.4.1 - 2020-03-03](#141-2020-03-03)
- [Added](#added-1)
- [Changed](#changed-1)
- [Removed](#removed-1)
- [1.3.0 - 2020-01-04](#130-2020-01-04)
- [1.4.0 - 2020-01-11](#140-2020-01-11)
- [Added](#added-2)
- [Changes](#changes)
- [Changed](#changed-2)
- [Removed](#removed-2)
- [1.2.0 - 2019-11-13](#120-2019-11-13)
- [1.3.0 - 2020-01-04](#130-2020-01-04)
- [Added](#added-3)
- [Changes](#changes-1)
- [Changes](#changes)
- [Removed](#removed-3)
- [1.1.1 - 2019-09-24](#111-2019-09-24)
- [1.2.0 - 2019-11-13](#120-2019-11-13)
- [Added](#added-4)
- [Changes](#changes-2)
- [Changes](#changes-1)
- [Removed](#removed-4)
- [1.1.0 - 2019-08-23](#110-2019-08-23)
- [1.1.1 - 2019-09-24](#111-2019-09-24)
- [Added](#added-5)
- [Changes](#changes-3)
- [Changes](#changes-2)
- [Removed](#removed-5)
- [1.0.0 - 2019-07-01](#100-2019-07-01)
- [1.1.0 - 2019-08-23](#110-2019-08-23)
- [Added](#added-6)
- [Changes](#changes-4)
- [Changes](#changes-3)
- [Removed](#removed-6)
- [0.2.0 - 2018-03-05](#020-2018-03-05)
- [1.0.0 - 2019-07-01](#100-2019-07-01)
- [Added](#added-7)
- [Changes](#changes-5)
- [Changes](#changes-4)
- [Removed](#removed-7)
- [0.1.1 - 2017-12-05](#011-2017-12-05)
- [0.2.0 - 2018-03-05](#020-2018-03-05)
- [Added](#added-8)
- [Changes](#changes-5)
- [Removed](#removed-8)
- [0.1.1 - 2017-12-05](#011-2017-12-05)
- [Added](#added-9)
- [Changes](#changes-6)
- [0.1.0 - 2017-11-25](#010-2017-11-25)
- [Added](#added-9)
- [Added](#added-10)
- [Changes](#changes-7)
- [Removed](#removed-8)
- [Removed](#removed-9)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
......@@ -53,10 +57,27 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [Unreleased]
### Added
### Changed
### Removed
## [1.4.1] - 2020-03-03
### Added
- [cluster-network-tunneler](https://gitlab.palpant.us/justin/cluster-network-tunneler), using OpenVPN, LDAP, kubectl, and a restricted service account to make a Kubernetes service running in my home network accessible from my GKE cluster, securely.
- Export GCS storage metrics in [stackdriver-exporter](https://gitlab.palpant.us/justin/stackdriver-exporter)
- Add the GKE node termination handler from GCP, deployed with [gke-preemptible-killer](https://gitlab.palpant.us/justin/gke-preemptible-killer)
- Add rsync-to-dropbox for CloudSQL backup dumps in [backup-to-dropbox](https://gitlab.palpant.us/justin/rclone-to-dropbox)
- Add experimental-index feature to [Thanos Store](https://gitlab.palpant.us/justin/palpantlab-gitlab/-/blob/master/deploy/kubectl-apply/gke/thanos-store.yaml#L59) and use v0.11.0-rc.0 for Store only
- Add [permanent custom alerts](https://gitlab.palpant.us/justin/palpantlab-gitlab/-/blob/master/deploy/helm-upgrade/gke/prometheus-alerts.yaml) and stop trying to tweak the ConfigMap directly (it gets overridden on deploy)
### Changed
- Upgrade GitLab to v12.8.1 and remove the embedded GitLab Helm chart; go back to using upstream directly now that Prometheus is at a reasonable version
- Upgrade Thanos to v0.10.1 and tweak memory and CPU allowance for all components.
- Upgrade Kubernetes dashboard to v2.0.0-rc1
- Add forward-thanos-sidecar to the list of Querier stores.
- Decrease resource usage for CloudSQL backup CronJobs
- [Upgrade NGINX ingress](https://gitlab.palpant.us/justin/palpantlab-ingress) to nginx-ingress-1.31.0, NGINX v0.29.0
- Add `proxy_buffering` and play around with `proxy_next_upstream` configuration for ingress to try to load balance between frequently-killed pods more effectively. Outages for node deaths _do_ seem somewhat shorter, but it's hard to tell.
- Upgrade [estafette-gke-preemptible-killer](https://gitlab.palpant.us/justin/gke-preemptible-killer) to v1.2.5
- Fix redis being allocated to non-stable nodes due to upstream chart changes.
### Removed
## [1.4.0] - 2020-01-11
......@@ -245,7 +266,8 @@ Lastly, I have split up the single mono-repo into individual repos to support si
- HAProxy, all instances
- Most of the 9s in my previous uptime. But they will be back, and better than ever!
[Unreleased]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.4.0...HEAD
[Unreleased]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.4.1...HEAD
[1.4.1]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.4.0...v1.4.1
[1.4.0]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.3.0...v1.4.0
[1.3.0]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.2.0...v1.3.0
[1.2.0]: https://gitlab.palpant.us/justin/palpantlab-infra/compare/v1.1.1...v1.2.0
......
......@@ -11,7 +11,7 @@
- [CloudSQL databases](#cloudsql-databases)
- [Cluster-hosted services](#cluster-hosted-services)
- [Ingress](#ingress)
- [Source control with locally hosted Gitlab](#source-control-with-locally-hosted-gitlab)
- [Source control with private hosted Gitlab](#source-control-with-private-hosted-gitlab)
- [GitLab CI Runner](#gitlab-ci-runner)
- [Personal (static) Websites](#personal-static-websites)
- [Boxomon](#boxomon)
......@@ -19,6 +19,7 @@
- [Kubernetes Dashboards](#kubernetes-dashboards)
- [GitLab-based monitoring (Prometheus/Alertmanager/Grafana/Thanos)](#gitlab-based-monitoring-prometheusalertmanagergrafanathanos)
- [Backups](#backups)
- [cluster-network-tunneler](#cluster-network-tunneler)
- [palpantlab-sfo](#palpantlab-sfo)
- [Kubernetes Nodes](#kubernetes-nodes-1)
- [193.168.0.31/ubuntu-node-01.sfo.palpant.us](#193168031ubuntu-node-01sfopalpantus)
......
Subproject commit 7e328aef45e6388a6576b702a2c246f97077bc83
Subproject commit 4b7dbf2800b20b67b2a21e472b2722ba050ef850
Subproject commit a71310987a840c8e76bdffe695cc615da9564c6c
Subproject commit ba140916b4cd421d5e3f30e6a60d5c26f0572b22
Subproject commit bef9bc95777120ac45a9c483f479d1f0075a1595
Subproject commit 6573d447ee6d119efa756989465110787268a198
Subproject commit 4db354a7da00cd6e74217e3cb575d3c8c38a30f3
Subproject commit e869059160b47997a0188e1dbc5690221a4f400c
Subproject commit 699a5941b8bbd3dadd0836ddb3105a3975ffa9e5
Subproject commit c6b59b914309c85fffe8cbb74d392113848bea15
Subproject commit 2e0908de61350b6f8079e9d8c9afa5ca724fe4a9
Subproject commit 2ba9cf643c7876a4828fa9eef69edf2c3d5a720e
Subproject commit 1eab8441885315b03289436ef9d140fdb0bee58a
Subproject commit 2cefad763e719261ff7cdc89d7fb7962874880fe