Docker volume on an NFS share

Hi everyone!

 

I have several Docker hosts for which I'd like to set up shared permanent storage over NFS. (Not Swarm, just several standalone servers.)

My problem is that some containers refuse to start on an NFS volume, e.g. GitLab and Seafile (the DB part of it).

The storage is a Dell EMC DataDomain; the export looks like this (there aren't many options to tweak here anyway):

NFSv3,rw,no_root_squash,no_all_squash,secure,sec=sys

Client side (inside Docker; I created the NFS mount with Portainer, and this is roughly the default config it offers):

addr=10.51.36.11,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14

I suspect the problem with GitLab is the DB as well (judging by the logs). Does anyone have an idea, or has anyone managed to get this working in this setup?

Comments

Would it be a solution to sidestep the problem by mounting the NFS share under the host OS and then giving the container a "local" volume?
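
A minimal sketch of that workaround, assuming a host mount point of /mnt/docker_nfs (the mount point and the container arguments are hypothetical; the server address and export path are taken from the thread):

# mount the export on the host itself (manually here, or via /etc/fstab)
mount -t nfs -o rw,noatime,tcp 10.51.36.11:/data/col1/docker_storage /mnt/docker_nfs

# then hand the container a plain bind mount instead of an NFS-backed named volume
docker run -d --name gitlab \
  -v /mnt/docker_nfs/gitlab_system/gitlab_system_config:/etc/gitlab \
  gitlab/gitlab-ce:latest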

What parameters do you use to create the Docker volume itself?

[
    {
        "CreatedAt": "2022-11-15T14:33:36+01:00",
        "Driver": "local",
        "Labels": null,
        "Mountpoint": "/var/lib/docker/volumes/gitlab_system_config/_data",
        "Name": "gitlab_system_config",
        "Options": {
            "device": ":/data/col1/docker_storage/gitlab_system/gitlab_system_config",
            "o": "addr=10.51.36.11,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14",
            "type": "nfs"
        },
        "Scope": "local"
    }
]
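
For reference, this inspect output corresponds roughly to a volume created like this (a reconstruction from the JSON above, not necessarily the exact command Portainer ran):

docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=10.51.36.11,rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14 \
  --opt device=:/data/col1/docker_storage/gitlab_system/gitlab_system_config \
  gitlab_system_config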

This is as far as GitLab gets every time... Then it restarts itself...

 

[2022-11-16T09:55:29+00:00] INFO: Retrying execution of execute[/opt/gitlab/bin/gitlab-ctl start postgresql], 19 attempts left
    [execute] fail: postgresql: runsv not running
[2022-11-16T09:55:31+00:00] INFO: Retrying execution of execute[/opt/gitlab/bin/gitlab-ctl start postgresql], 18 attempts left
    [execute] ok: run: postgresql: (pid 348) 1s
[2022-11-16T09:55:34+00:00] INFO: execute[/opt/gitlab/bin/gitlab-ctl start postgresql] ran successfully
    - execute /opt/gitlab/bin/gitlab-ctl start postgresql
  * database_objects[postgresql] action create
    * postgresql_user[gitlab] action create
      * execute[create gitlab postgresql user] action run (skipped due to not_if)
       (up to date)
    * postgresql_user[gitlab_replicator] action create
      * execute[create gitlab_replicator postgresql user] action run (skipped due to not_if)
      * execute[set options for gitlab_replicator postgresql user] action run (skipped due to not_if)
       (up to date)
    * postgresql_database[gitlabhq_production] action create
      * execute[create database gitlabhq_production] action run (skipped due to not_if)
       (up to date)
    * postgresql_extension[pg_trgm] action enable
      * postgresql_query[enable pg_trgm extension] action run (skipped due to only_if)
       (up to date)
    * postgresql_extension[btree_gist] action enable
      * postgresql_query[enable btree_gist extension] action run (skipped due to only_if)
       (up to date)
     (up to date)
  * version_file[Create version file for PostgreSQL] action create
    * file[/var/opt/gitlab/postgresql/VERSION] action create (up to date)
     (up to date)
  * ruby_block[warn pending postgresql restart] action run (skipped due to only_if)
  * execute[reload postgresql] action nothing (skipped due to action :nothing)
  * execute[start postgresql] action nothing (skipped due to action :nothing)
Recipe: praefect::disable
  * service[praefect] action nothing (skipped due to action :nothing)
  * runit_service[praefect] action disable
    * ruby_block[disable praefect] action run (skipped due to only_if)
     (up to date)
  * consul_service[praefect] action delete
    * file[/var/opt/gitlab/consul/config.d/praefect-service.json] action delete (up to date)
     (up to date)
Recipe: gitlab-kas::enable
  * directory[/var/opt/gitlab/gitlab-kas] action create (up to date)
  * directory[/var/log/gitlab/gitlab-kas] action create (up to date)
  * directory[/opt/gitlab/etc/gitlab-kas] action create (up to date)
  * ruby_block[websocket TLS termination] action run (skipped due to only_if)
  * version_file[Create version file for Gitlab KAS] action create
    * file[/var/opt/gitlab/gitlab-kas/VERSION] action create (up to date)
     (up to date)
  * file[/var/opt/gitlab/gitlab-kas/authentication_secret_file] action create (up to date)
  * file[/var/opt/gitlab/gitlab-kas/private_api_authentication_secret_file] action create (up to date)
  * file[/var/opt/gitlab/gitlab-kas/redis_password_file] action create (skipped due to only_if)
  * template[/var/opt/gitlab/gitlab-kas/gitlab-kas-config.yml] action create (up to date)
  * env_dir[/opt/gitlab/etc/gitlab-kas/env] action create
    * directory[/opt/gitlab/etc/gitlab-kas/env] action create (up to date)
    * file[/opt/gitlab/etc/gitlab-kas/env/SSL_CERT_DIR] action create (up to date)
    * file[/opt/gitlab/etc/gitlab-kas/env/OWN_PRIVATE_API_URL] action create (up to date)
     (up to date)
  * service[gitlab-kas] action nothing (skipped due to action :nothing)
  * runit_service[gitlab-kas] action enable
    * ruby_block[restart_service] action nothing (skipped due to action :nothing)
    * ruby_block[restart_log_service] action nothing (skipped due to action :nothing)
    * ruby_block[reload_log_service] action nothing (skipped due to action :nothing)
    * directory[/opt/gitlab/sv/gitlab-kas] action create (up to date)
    * template[/opt/gitlab/sv/gitlab-kas/run] action create (up to date)
    * directory[/opt/gitlab/sv/gitlab-kas/log] action create (up to date)
    * directory[/opt/gitlab/sv/gitlab-kas/log/main] action create (up to date)
    * template[/opt/gitlab/sv/gitlab-kas/log/config] action create (up to date)
    * ruby_block[verify_chown_persisted_on_gitlab-kas] action nothing (skipped due to action :nothing)
    * link[/var/log/gitlab/gitlab-kas/config] action create (up to date)
    * template[/opt/gitlab/sv/gitlab-kas/log/run] action create (up to date)
    * directory[/opt/gitlab/sv/gitlab-kas/env] action create (up to date)
    * ruby_block[Delete unmanaged env files for gitlab-kas service] action run (skipped due to only_if)
    * template[/opt/gitlab/sv/gitlab-kas/check] action create (skipped due to only_if)
    * template[/opt/gitlab/sv/gitlab-kas/finish] action create (skipped due to only_if)
    * directory[/opt/gitlab/sv/gitlab-kas/control] action create (up to date)
    * link[/opt/gitlab/init/gitlab-kas] action create (up to date)
    * file[/opt/gitlab/sv/gitlab-kas/down] action nothing (skipped due to action :nothing)
    * directory[/opt/gitlab/service] action create (up to date)
    * link[/opt/gitlab/service/gitlab-kas] action create[2022-11-16T09:55:38+00:00] INFO: link[/opt/gitlab/service/gitlab-kas] created
      - create symlink at /opt/gitlab/service/gitlab-kas to /opt/gitlab/sv/gitlab-kas
    * ruby_block[wait for gitlab-kas service socket] action run (skipped due to not_if)
  
Recipe: gitlab::database_migrations
  * ruby_block[check remote PG version] action nothing (skipped due to action :nothing)
  * rails_migration[gitlab-rails] action run
version: '3.6'
services:
  web:
    image: 'hun25-21v:5000/gitlab/gitlab-ce:latest'
    restart: always
    hostname: 'hun400-33v'
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://hun400-33v'
    ports:
      - '80:80'
      - '443:443'
      - '2223:22'
    volumes:
      - 'gitlab_system_config:/etc/gitlab'
      - 'gitlab_system_logs:/var/log/gitlab'
      - 'gitlab_system_data:/var/opt/gitlab'
    shm_size: '256m'
volumes:
  gitlab_system_config:
    external: true
    name: gitlab_system_config
  gitlab_system_logs:
    external: true
    name: gitlab_system_logs
  gitlab_system_data:
    external: true
    name: gitlab_system_data

I'll take a look! In the meantime I found this in the logs:

 

[2022-11-16T11:43:33+00:00] INFO: Running queued delayed notifications before re-raising exception
Running handlers:
[2022-11-16T11:43:33+00:00] ERROR: Running exception handlers
There was an error running gitlab-ctl reconfigure:
runit_service[gitaly] (gitaly::enable line 110) had an error: Mixlib::ShellOut::ShellCommandFailed: ruby_block[restart_log_service] (gitaly::enable line 66) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log ----
STDOUT: timeout: run: /opt/gitlab/service/gitaly/log: (pid 264) 30s, got TERM
STDERR: 
---- End output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log ----
Ran /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log returned 1
Running handlers complete
[2022-11-16T11:43:33+00:00] ERROR: Exception handlers complete
Infra Phase failed. 29 resources updated in 39 seconds
[2022-11-16T11:43:33+00:00] FATAL: Stacktrace dumped to /opt/gitlab/embedded/cookbooks/cache/cinc-stacktrace.out
[2022-11-16T11:43:33+00:00] FATAL: ---------------------------------------------------------------------------------------
[2022-11-16T11:43:33+00:00] FATAL: PLEASE PROVIDE THE CONTENTS OF THE stacktrace.out FILE (above) IF YOU FILE A BUG REPORT
[2022-11-16T11:43:33+00:00] FATAL: ---------------------------------------------------------------------------------------
[2022-11-16T11:43:33+00:00] FATAL: Mixlib::ShellOut::ShellCommandFailed: runit_service[gitaly] (gitaly::enable line 110) had an error: Mixlib::ShellOut::ShellCommandFailed: ruby_block[restart_log_service] (gitaly::enable line 66) had an error: Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1'
---- Begin output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log ----
STDOUT: timeout: run: /opt/gitlab/service/gitaly/log: (pid 264) 30s, got TERM
STDERR: 
---- End output of /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log ----
Ran /opt/gitlab/embedded/bin/sv restart /opt/gitlab/service/gitaly/log returned 1
Thank you for using GitLab Docker Image!
Current version: gitlab-ce=15.5.3-ce.0
Configure GitLab for your system by editing /etc/gitlab/gitlab.rb file
And restart this container to reload settings.
To do it use docker exec:
  docker exec -it gitlab editor /etc/gitlab/gitlab.rb
  docker restart gitlab
For a comprehensive list of configuration options please see the Omnibus GitLab readme
https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/README.md
If this container fails to start due to permission problems try to fix it by executing:
  docker exec -it gitlab update-permissions
  docker restart gitlab
Cleaning stale PIDs & sockets

I assume you've already tried attaching this volume to any container that refuses to write, stepping into the container, and writing to the volume from there. I'm only mentioning it because you didn't say how that went. Can you set all sorts of group memberships and permissions on the file you write? Does the ID of the user writing the file stay unchanged?

Have you tried it with a "real" NFS server that you can actually configure? Does it work there?
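
One quick way to run that write/ownership test could be a throwaway container attached to the same volume (the alpine image, the test file name, and the uid 998 seen in the directory listings are only for illustration):

docker run --rm -it -v gitlab_system_data:/mnt alpine sh
# inside the container:
touch /mnt/permtest
chown 998:998 /mnt/permtest
chmod 600 /mnt/permtest
ls -ln /mnt/permtest   # verify that uid/gid and mode survive on the NFS export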

The GitLab container itself can write too; that's not the problem. The installation (first startup) process doesn't finish; it gets stuck at the database migration part.

 

hun400-33v:/var/lib/docker/volumes/gitlab_system_data/_data # ls -all
total 12
drwxr-xr-x 19 root root 1215 Nov 16 11:50 .
drwx-----x  1 root root   28 Nov 15 14:33 ..
drwxr-xr-x  2  998  998  101 Nov 15 14:38 .bundle
-rw-r--r--  1  998  998  359 Nov 15 14:38 .gitconfig
drwxrwxrwx  4 root root  295 Nov 16 10:38 .snapshot
drwx------  2  998  998  165 Nov 15 14:38 .ssh
drwx------  2  998 root  101 Nov 16 10:39 backups
-rw-------  1 root root   38 Nov 16 11:49 bootstrapped
drwxr-xr-x  2 root root  101 Nov 16 11:42 crond
drwx------  3  998  998  162 Nov 16 10:39 git-data
drwx------  3  998 root  451 Nov 16 11:49 gitaly
drwxr-xr-x  3  998 root  156 Nov 16 10:39 gitlab-ci
drwx------  2  998 root  389 Nov 16 11:49 gitlab-kas
drwxr-xr-x  9  998 root  665 Nov 16 11:49 gitlab-rails
drwx------  2  998 root  160 Nov 16 11:50 gitlab-shell
drwxr-x---  3  998  999  273 Nov 16 11:49 gitlab-workhorse
drwx------  3 root root  224 Nov 16 11:49 logrotate
drwxr-x---  3 root  999  207 Nov 16 11:48 nginx
drwxr-xr-x  3  996 root  396 Nov 16 11:49 postgresql
drwxr-x---  2  997  998  278 Nov 16 11:49 redis
drwx------  2  993 root  228 Nov 16 11:49 registry
-rw-r--r--  1 root root   40 Nov 16 10:39 trusted-certs-directory-hash

 

Obviously (though it goes without saying) it runs fine on internal local storage.

 

I haven't tried an NFS server running on some popular distro yet (I assume that's what you mean). But the end goal is still to use this one.

 

The same thing from inside the container:

 

root@hun400-33v:/# cd /var/opt/gitlab/
root@hun400-33v:/var/opt/gitlab# ls -all
total 14
drwxr-xr-x 22 root              root       1399 Nov 16 10:56 .
drwxr-xr-x  1 root              root         12 Nov  7 22:08 ..
drwxr-xr-x  2 git               git         101 Nov 15 13:38 .bundle
-rw-r--r--  1 git               git         359 Nov 15 13:38 .gitconfig
drwxrwxrwx  4 root              root        295 Nov 16 09:38 .snapshot
drwx------  2 git               git         165 Nov 15 13:38 .ssh
drwxr-x---  3 gitlab-prometheus root        219 Nov 16 10:54 alertmanager
drwx------  2 git               root        101 Nov 16 09:39 backups
-rw-------  1 root              root         38 Nov 16 10:49 bootstrapped
drwxr-xr-x  2 root              root        101 Nov 16 10:42 crond
drwx------  3 git               git         162 Nov 16 09:39 git-data
drwx------  3 git               root        451 Nov 16 10:55 gitaly
drwxr-xr-x  3 git               root        156 Nov 16 09:39 gitlab-ci
drwxr-xr-x  2 git               root        230 Nov 16 10:54 gitlab-exporter
drwx------  2 git               root        389 Nov 16 10:54 gitlab-kas
drwxr-xr-x  9 git               root        665 Nov 16 10:56 gitlab-rails
drwx------  2 git               root        160 Nov 16 10:56 gitlab-shell
drwxr-x---  3 git               gitlab-www  273 Nov 16 10:54 gitlab-workhorse
drwx------  3 root              root        224 Nov 16 10:56 logrotate
drwxr-x---  9 root              gitlab-www  627 Nov 16 10:51 nginx
drwxr-xr-x  3 gitlab-psql       root        396 Nov 16 10:54 postgresql
drwxr-x---  4 gitlab-prometheus root        271 Nov 16 10:54 prometheus
drwxr-x---  2 gitlab-redis      git         278 Nov 16 10:54 redis
drwx------  2 registry          root        284 Nov 16 10:50 registry
-rw-r--r--  1 root              root         40 Nov 16 09:39 trusted-certs-directory-hash

Is there an error message or any symptom?

GitLab's main components log quite verbosely at startup, as far as I remember.

Now, if it's gitlab-rails that falls over, why aren't we looking at its log? Or at something that actually says what is failing and why?

:)
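
A hedged example of pulling those logs straight from the container (standard Omnibus log locations; the container name "gitlab" is an assumption):

# follow everything the runit-supervised services write
docker exec -it gitlab gitlab-ctl tail

# or just the parts that seem to be failing
docker exec -it gitlab gitlab-ctl tail gitlab-rails
docker exec -it gitlab tail -f /var/log/gitlab/postgresql/current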

2022-11-16T12:05:11.488Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"batched_background_migration_job_transition_logs", :connection_name=>"main"}
2022-11-16T12:05:11.519Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.525Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"incident_management_pending_alert_escalations", :connection_name=>"main"}
2022-11-16T12:05:11.554Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.559Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"incident_management_pending_issue_escalations", :connection_name=>"main"}
2022-11-16T12:05:11.586Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.591Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"verification_codes", :connection_name=>"main"}
2022-11-16T12:05:11.620Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.621Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.622Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.623Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.624Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:11.625Z: {:message=>"Finished sync of dynamic postgres partitions"}
2022-11-16T12:05:12.173Z: {:message=>"Syncing dynamic postgres partitions"}
2022-11-16T12:05:12.175Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.234Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"audit_events", :connection_name=>"main"}
2022-11-16T12:05:12.291Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.297Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"web_hook_logs", :connection_name=>"main"}
2022-11-16T12:05:12.330Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.337Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"loose_foreign_keys_deleted_records", :connection_name=>"main"}
2022-11-16T12:05:12.395Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.401Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"batched_background_migration_job_transition_logs", :connection_name=>"main"}
2022-11-16T12:05:12.433Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.438Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"incident_management_pending_alert_escalations", :connection_name=>"main"}
2022-11-16T12:05:12.466Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.472Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"incident_management_pending_issue_escalations", :connection_name=>"main"}
2022-11-16T12:05:12.502Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.508Z: {:message=>"Checking state of dynamic postgres partitions", :table_name=>"verification_codes", :connection_name=>"main"}
2022-11-16T12:05:12.537Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.538Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.539Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.541Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.541Z: {:message=>"Switched database connection", :connection_name=>"main"}
2022-11-16T12:05:12.542Z: {:message=>"Finished sync of dynamic postgres partitions"}
2022-11-16T12:05:30.114Z: Ci::StuckBuilds::DropPendingService: Cleaning pending timed-out builds
2022-11-16T12:05:32.639Z: ActiveRecord connections disconnected
2022-11-16T12:05:33.794Z: ActiveRecord connection established
2022-11-16T12:05:33.815Z: ActiveRecord connection established
Edited: 2022-11-16 (Wed), 13:19

Now this is odd!!! This time I was doing something else, so I just left the container running; it kept restarting over and over, and then all of a sudden (after about 20 minutes) it came up!!!

 

Killing/restarting the container is fast now! Only the initial creation from the stack took a good 20 minutes...

The DataDomain is a deduplicating backup appliance, not meant for tier-1 storage duties. You should find a proper storage system, because this will only cause you trouble later even if you do get it working. Especially if I understand correctly that you want to run a DB on it: it was absolutely not designed for transactional workloads. Yes, it speaks NFS, but only so that you can back up to it.

The NFS client has a bizarre mechanism that goes by the name NFS "silly rename":

Unix applications often open a scratch file and then unlink it. They do this so that the file is not visible in the file system name space to any other applications, and so that the system will automatically clean up (delete) the file when the application exits. This is known as "delete on last close", and is a tradition among Unix applications.

Because of the design of the NFS protocol, there is no way for a file to be deleted from the name space but still remain in use by an application. Thus NFS clients have to emulate this using what already exists in the protocol. If an open file is unlinked, an NFS client renames it to a special name that looks like ".nfsXXXXX". This "hides" the file while it remains in use. This is known as a "silly rename." Note that NFS servers have nothing to do with this behavior.

After all applications on a client have closed the silly-renamed file, the client automatically finishes the unlink by deleting the file on the server. Generally this is effective, but if the client crashes before the file is removed, it will leave the .nfsXXXXX file. If you are sure that the applications using these files are no longer running, it is safe to delete these files manually.

source: https://nfs.sourceforge.net/

A while back I ran into a problem where an app worked like this:

- it created and opened a temp file in a directory

- then it unlinked the file while it still had an open file descriptor to it

- it kept working with the file, then deleted the parent directory before closing the file

When it wasn't running on NFS, this went through without a problem.

When it ran over NFS, deleting the parent directory returned a directory-not-empty (ENOTEMPTY) error, because the .nfsXXXXX file was still sitting in it.
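
A small sketch that reproduces this on an NFS mount (the path is hypothetical; on a local filesystem the first rmdir would succeed):

cd /mnt/nfs_test && mkdir scratch
(
  exec 3> scratch/tmpfile   # create the temp file and keep fd 3 open on it
  rm scratch/tmpfile        # unlink while still open -> the client silly-renames it to .nfsXXXX
  ls -a scratch             # the .nfsXXXX placeholder is visible here
  rmdir scratch             # fails with "Directory not empty" (ENOTEMPTY)
  exec 3>&-                 # closing the last fd lets the client delete the .nfs file
)
rmdir scratch               # now it succeeds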

 

It took quite a few hours of strace/sysdig debugging to figure this out, and since we didn't want to touch the code, we ended up giving up on NFS.

If something similar is going on here, and after 20 minutes and x restarts it suddenly succeeds, it may be that it tries to delete a parent directory *at the same time* as a close runs on the file descriptor, and in the lucky case the silly-renamed/deleted files get cleaned up before the parent directory.

In the end I created a new volume on the SC5000, which is already "wired" to VMware by default since it provides the storage for the VMs, and I added that volume as an RDM disk (the VM sees it as iSCSI) to a XigmaNAS VM. From there it was plain sailing: a ZFS pool plus a dedicated dataset, which I then exported over NFSv3. It runs fine; the container came up without any problem in about 3 minutes... (the first startup, which takes longer because it creates the files)
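
For completeness, the ZFS side of that amounts to roughly these few commands (a sketch only; XigmaNAS normally does all of this from its web GUI, and the pool, dataset, and device names here are made up):

zpool create tank da1                      # pool on the RDM disk
zfs create tank/docker_storage             # dataset for the Docker volumes
zfs set sharenfs=on tank/docker_storage    # export the dataset over NFS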