# puppet-enterprise
  • a

    Adrian Parreiras Horta

    12/04/2025, 9:25 PM
    It will also wait until all borrowed, i.e. currently used, JRubies are returned to the pool before updating the code, which is a great way to find out if you have a stuck or super long-running request because the update will never finish.
  • j

    jms1

    12/04/2025, 10:20 PM
    i'll be honest, i'm not sure exactly what i'm asking because i have no idea how the components talk to each other ... what i know is, on the "old" PE2016 servers using a custom `r10k`-alike script, some agents will occasionally fail with an error message like "unable to find module stdlib", because they happen to request a catalog while their environment is in the middle of being rebuilt. for PE2023 the "real" PE servers (being used to configure normal machines) will be using code manager so i don't expect this to ever happen, but on the "dev" PE servers (only used to prototype and test new puppet code before it gets committed and pushed to a repo) the environments are still being built by hand, because having to commit, push, and then wait for code manager to rebuild the environment, takes long enough that i lose track of what i was doing ... instead i use a script to `rsync` my changes directly to the environment, and i can go from "save changes" to "run `puppet agent -t` on a scratch machine" in about five seconds.
  • j

    jms1

    12/04/2025, 10:23 PM
    i vaguely remember years ago, somebody told me that, whatever name they were using for the `r10k` wrapper now known as "code manager", had some kind of secret API that it used to (1) wait until the compiler wasn't using a given environment, (2) "lock" that environment so the compiler wouldn't use it, (3) rebuild the files in the environment, and (4) "unlock" it so the compiler could build catalogs again ... but they also said "i think that's an internal thing that puppetlabs (at the time) doesn't want to share the details of"
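The four steps described above (wait, lock, rebuild, unlock) follow the general shape of a read/write lock. The following is a minimal sketch of that pattern only, not Puppet Server's actual implementation; the class and method names are hypothetical. One difference from the numbered list: the sketch raises the "deploying" flag before waiting for in-flight compiles to drain, so that new compiles cannot keep the deploy waiting forever.

```python
import threading


class EnvironmentLock:
    """Toy read/write lock: many compiles may share an environment,
    but a deploy needs it exclusively. Hypothetical illustration only."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active_compiles = 0
        self._deploying = False

    def begin_compile(self):
        with self._cond:
            # New compiles wait while a deploy holds (or is waiting for) the environment.
            while self._deploying:
                self._cond.wait()
            self._active_compiles += 1

    def end_compile(self):
        with self._cond:
            self._active_compiles -= 1
            self._cond.notify_all()

    def deploy(self, rebuild):
        with self._cond:
            while self._deploying:
                self._cond.wait()
            self._deploying = True             # "lock": new compiles now block
            while self._active_compiles > 0:   # wait for in-flight compiles to finish
                self._cond.wait()
        try:
            rebuild()                          # rebuild the files in the environment
        finally:
            with self._cond:
                self._deploying = False        # "unlock"
                self._cond.notify_all()
```

A compile that never calls `end_compile()` leaves `deploy()` waiting indefinitely, which matches the earlier observation that a stuck request keeps the update from ever finishing.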
  • j

    jms1

    12/04/2025, 10:25 PM
    i'm hoping that things have changed and this stuff isn't really a big secret, it's just something nobody has ever asked about before ... although i'll be honest, i fully expect to be "perforced", which in this case means being told that it's something perforce doesn't want to tell anybody about.
  • j

    jms1

    12/04/2025, 10:26 PM
    (also it's quittin' time in florida, i'll check back tomorrow to see if anybody answered ... have a good evening all)
  • c

    CVQuesty

    12/05/2025, 1:26 PM
    I’m typically fascinated by the types of questions you ask, like you’re the Mystical Merlin or something. I’ve literally been in hundreds of environments, done hundreds of deployments, worked in small-ish environments with servers numbered in dozens as well as massive environments with servers numbered in hundreds of thousands, and not once have I run into the types of scenarios you bring up here. Makes me think I missed something in my education, or I’m doing something wrong.
  • c

    csharpsteen

    12/05/2025, 7:04 PM
    Code Manager is a client <-> server service built into `pe-puppetserver`.

    The server side runs on the PE Primary and exposes an HTTP API that receives deployment requests, which it handles by:
    • Authorizing the request using a PE RBAC token.
    • Running `r10k deploy environment` for each control repo branch listed in the request. This step includes logic to safely run `r10k deploy` concurrently against multiple branches, along with logic to prune the r10k caches to prevent them from growing too large.
    • When `r10k` finishes running, post-run scripts are executed. By default, this includes running `puppet generate types` on the environment, if needed.
    • The result of `r10k deploy environment` + post-run scripts is committed to an internal Git repository known as "File Sync Storage".
    • Optionally, the API call may wait before returning an HTTP response until all clients ACK that the deployment is live or a timeout has elapsed.

    The client side runs in `pe-puppetserver` and `pe-orchestration-services` and:
    • Polls the Primary for new deployments every 5 seconds. This polling request also serves as the deployment ACK by notifying the Primary of the latest environment versions the client has deployed.
    • If there are new deployments, runs a `git fetch` operation to pull commits from File Sync Storage to local client copies of the repository.
    • Deploys updated environments brought in by `git fetch`. These updates are made atomic either by a very heavy JRuby read+write lock (legacy deployment) or by creating a new versioned copy of the environment and updating a symlink (lockless deployment, modern default). Lockless deployment gets a significant speed boost from modern GNU coreutils that default to `cp --reflink=auto` and a filesystem that supports reflinks (XFS, BTRFS, ZFS, notably NOT EXT4). Basically, for best performance run PE infrastructure on RHEL 9 or newer or Ubuntu 24.04 or newer (but don't use Ubuntu's filesystem default of `ext4`).
    • Performs cleanup of superseded environment content and git history.

    Code Manager also serves a dual purpose when DR is enabled in that it syncs deployed code, CA state, and PE configuration from the Primary to the Replica.
    🙌 1
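The "versioned copy plus symlink" approach mentioned for lockless deployment can be sketched roughly as follows. This is a generic illustration of the technique, not the File Sync client's code: the function name, directory layout, and the `.tmp` suffix are all assumptions, and the copy here is a plain `shutil.copytree` that makes no attempt to use reflinks.

```python
import os
import shutil


def deploy_versioned(source_dir, versions_dir, live_link, version):
    """Copy source_dir to versions_dir/<version>, then atomically repoint live_link.

    Hypothetical helper: names and layout are assumptions for illustration.
    """
    target = os.path.join(versions_dir, version)
    shutil.copytree(source_dir, target)  # plain copy; no reflink handling here

    # Create the new symlink under a temporary name, then rename it over the
    # live one. rename(2) is atomic on POSIX, so a reader following live_link
    # sees either the old tree or the new tree, never a half-written mix.
    tmp_link = live_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    os.replace(tmp_link, live_link)
    return target
```

Compared with copying files directly into the live directory (for example with `rsync`), a reader never observes a partially updated environment. The superseded version directories still need to be cleaned up separately once nothing is using them.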
  • c

    csharpsteen

    12/05/2025, 7:13 PM
    Turning Code Manager on also activates "Static Catalogs". These contain pre-computed metadata for `file` resources that use content deployed through Code Manager. This provides two benefits:
    • An entire source of JRuby contention is eliminated, as agents no longer have to make `file_metadata` requests to determine expected checksums; it's all just there in the catalog. `file_content` requests are also cheaper, as they stay in the Java layer and hit the JGit service instead of going down to JRuby.
    • The agent gets file content from the same deployment that its catalog was compiled from, not a different version that may have come down in a subsequent deployment.

    IIRC, the above combined into something like a 20% cut to the JRuby load in Puppet Labs' internal infrastructure when it was benchmarked years ago. Mileage will vary; custom file mounts serving large blobs are still something to shift over to a dedicated file or artifact server.
  • j

    jms1

    12/05/2025, 8:29 PM
    i just saw these responses ... i read through it all once but i'm not grokking it in fullness, probably because it's late in the day on a friday and my brain already has one foot (or neuron?) out the door ... i saved it into obsidian so i can refer back to it later, even if "later" means after slack blocks access to it in 90 days ... @csharpsteen thank you for this, and don't be surprised if i come back with more questions, either next week or in january (i'll be out from 12-13 to 01-04, `$DAYJOB` has a "use it or lose it" policy for PTO)
  • k

    kenyon

    12/09/2025, 6:08 PM
    With PE, what is the correct way to manage the puppet agent service (that is, `service { 'puppet': ...}`)? I see in the puppet_agent module that PE is supposed to manage it somehow, but I don't see anything in the `puppet_enterprise` module nor in the PE docs about how to manage it. https://github.com/puppetlabs/puppetlabs-puppet_agent/blob/b2ec88bf5a8fa331a485d5770dff6d51a0d07dd4/manifests/init.pp#L245-L253
  • k

    kenyon

    12/09/2025, 6:09 PM
    I have a need to restart the puppet service when the machine is joined to the domain (otherwise the puppet service can't resolve Active Directory users and groups until a restart), so need to first have the puppet service managed in order to notify it
  • k

    kenyon

    12/09/2025, 6:44 PM
    at least this comment observes the same thing: https://github.com/puppetlabs/puppetlabs-puppet_agent/pull/461#issuecomment-1155731041
  • k

    kenyon

    12/09/2025, 6:47 PM
    and even further back https://github.com/puppetlabs/puppetlabs-puppet_agent/issues/145
  • c

    csharpsteen

    12/09/2025, 7:52 PM
    Having `puppet` manage `puppet` tends to go very weird (making a process kill itself is just signing up for odd occurrences). Using the `service` task to stop `puppet` would be my recommendation. Do not stop the service for an extended period of time on Infrastructure nodes; executing `puppet agent --disable` is a better approach there so that they stay active in PuppetDB.
  • k

    kenyon

    12/09/2025, 8:36 PM
    you would use the `service` task from a puppet run? I thought about it being weird managing itself, but I just tested this and the systemd unit has `KillMode=process`, so restarting the service doesn't kill the puppet process that is applying the catalog, so it seems to achieve the desired effect of instantiating a new environment for the agent
  • b

    bastelfreak

    12/09/2025, 8:41 PM
    I try to manage as much of my Puppet infra as possible with Puppet, and that's working very well. The puppet service on the compilers/primary is also managed
  • c

    csharpsteen

    12/09/2025, 9:16 PM
    No, I'd use the `service` task outside of a Puppet run, from Orchestrator. Puppet is great at managing services, but having it manage its own `puppet` service is a fundamentally different animal. I can say from a decade+ of experience that that is a briar patch full of sharp edges. Been there, got the scars, find a different way to do it if you can.
  • c

    csharpsteen

    12/09/2025, 9:23 PM
    Another approach is to run the puppet agent via cron or a systemd timer. That both solves thundering herd and ensures you have a fresh process tree for each run, which would address the glibc+nsswitch quirks around joining a domain. https://forge.puppet.com/modules/reidmv/puppet_run_scheduler/readme is one implementation of that strategy.
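One common way a scheduled run avoids a thundering herd is to give each node a stable, pseudo-random start offset derived from its name. The sketch below illustrates that general idea only; it is not taken from the linked module, and the function names, the SHA-256 choice, and the 30-minute default are all assumptions.

```python
import hashlib


def run_offset_minutes(certname, interval_minutes=30):
    """Map a node name to a stable minute offset in [0, interval_minutes)."""
    digest = hashlib.sha256(certname.encode("utf-8")).hexdigest()
    return int(digest, 16) % interval_minutes


def cron_minutes(certname, interval_minutes=30):
    """Minutes past the hour at which this node would start a run.

    Assumes interval_minutes divides 60 evenly, so the schedule repeats hourly.
    """
    offset = run_offset_minutes(certname, interval_minutes)
    return list(range(offset, 60, interval_minutes))
```

Because the offset depends only on the node name, each node keeps the same schedule across runs while the fleet as a whole is spread across the interval.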
  • c

    csharpsteen

    12/09/2025, 9:28 PM
    Or, take a page from the Windows book and reboot the node after joining a domain. Likely there are other services and processes outside of puppet that need to restart in order to pick up the nsswitch changes. The `reboot` resource gets that all in one operation.
  • k

    kenyon

    12/09/2025, 9:35 PM
    hm, `reboot` resource is an idea. We do that for physical machines because they have to be unplugged and moved after provisioning, but of course that doesn't happen for VMs. Have to be very sure that my domain join `exec` doesn't cause unnecessary reboots though
  • k

    kenyon

    12/09/2025, 9:43 PM
    systemd timer idea seems safer and better
  • h

    hashim vayalar

    12/10/2025, 6:34 AM
    I couldn't find any puppet agent package in /opt/puppetlabs/server/data/packages/public/2023.8.6/el-9-x86_64-8.15.0 after I enabled "Class: pe_repo::platform::el_8_x86_64" from the PE console. Is there anything else we need to do? I tried running puppet agent -t multiple times on our primary server, to no avail. What is preventing the puppet packages for el-9 from being created in that folder?
  • v

    vchepkov

    12/10/2025, 6:53 AM
    a typo. you can't expect el9 by enabling el8
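To make the mismatch concrete: the package directory in the question is named for the `el-9-x86_64` platform, while the class that was enabled is the one for `el_8_x86_64`. Below is a small sketch of that comparison. The helper names are hypothetical, and the dash-to-underscore mapping is only assumed to hold for the `el` tags seen in this thread; other platforms may name their classes differently.

```python
def platform_tag_from_package_dir(dirname, agent_version):
    """Strip the trailing '-<agent version>' from a package directory name."""
    suffix = "-" + agent_version
    if not dirname.endswith(suffix):
        raise ValueError(f"{dirname!r} does not end with {suffix!r}")
    return dirname[: -len(suffix)]


def pe_repo_class(platform_tag):
    """Turn a tag such as 'el-9-x86_64' into 'pe_repo::platform::el_9_x86_64'."""
    return "pe_repo::platform::" + platform_tag.replace("-", "_")


# Values taken from the question above.
wanted = pe_repo_class(platform_tag_from_package_dir("el-9-x86_64-8.15.0", "8.15.0"))
enabled = "pe_repo::platform::el_8_x86_64"
print(wanted, "matches enabled class:", wanted == enabled)
```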
  • v

    vchepkov

    12/10/2025, 6:54 AM
    Also you have to have a license file installed to add packages that do not match your server's OS
  • h

    hashim vayalar

    12/10/2025, 10:12 AM
    I have a license file, and I manually added packages to that directory to solve my problem for now. So "can't expect el9 by enabling el8" makes sense; that might be the reason
  • j

    jms1

    12/10/2025, 2:32 PM
    i didn't realize you needed a license in order to enable (for example) RHEL 8 packages from a RHEL 9 PE server ... is that something that perforce "forgot" to document, or is it documented somewhere and i just missed it? (either one is entirely possible)
  • b

    bastelfreak

    12/10/2025, 2:33 PM
    that's a new "feature"
  • j

    jms1

    12/10/2025, 2:37 PM
    sure ... but is it a documented feature, or is it a land mine they've planted to fsck with people who don't have license files on their PE servers?
  • b

    bastelfreak

    12/10/2025, 2:41 PM
    I think it was documented after they changed it
  • c

    csharpsteen

    12/10/2025, 3:00 PM
    as a PE trial user you have access to the agent on the operating system you’ve installed the Puppet server on.
    https://help.puppet.com/pe/2023.8/topics/purchasing_and_installing_a_license_key.htm