`epmt` Developer and Advanced Usage Guide¶

This document contains detailed information about epmt configuration, data collection, database submission, analysis, metrics, debugging, and CI/CD infrastructure.

For installation and quick-start instructions, see README.md.

Table of Contents¶

Configuration
Environment Variables
Getting Current Configuration Information
The Three Modes of epmt
The First Mode, Data Collection
- Collecting Performance Data
- Data Collection with SLURM epilog and prolog
The Second Mode, Data Submission
The Third Mode, Data Analysis and Visualization
Debugging
Performance Metrics Data Dictionary
Addition of new metrics
CI/CD Workflows and Caching
Workflows
Caches and Invalidation
Cache Invalidation Gap vs. make
Forcing a Rebuild
Weekly Pre-warming
Troubleshooting
Virtual Environments:
Common Error Messages
- version GLIBC_x.xx not found

Configuration¶

epmt houses default settings (epmt_default_settings.py), user settings (settings.py), and a submodule (epmt_settings.py) that appends the user settings to the end of the default settings, overriding them.

Custom user settings should only be included by developers who understand epmt's requirements. The default settings should first be evaluated, they look a little bit like the following:

$ cat src/epmt/epmt_default_settings.py
  ...
  papiex_options = "COLLATED_TSV"
  epmt_output_prefix = "/tmp/epmt/"
  input_pattern = "*-papiex*.[ct]sv"
  ...
  db_params = {
      'url': 'sqlite:///:memory:',
      'echo': False,
  }
  ...

Environment Variables¶

The following variables replace, at run-time, the values in the db_params dictionary found in settings.py:

EPMT_DB_PROVIDER
EPMT_DB_USER
EPMT_DB_PASSWORD
EPMT_DB_HOST
EPMT_DB_DBNAME
EPMT_DB_FILENAME

Getting Current Configuration Information¶

You can examine all current settings by passing the --help option:

$ ./epmt --help
usage: epmt [-n] [-d] [-h] [-a] [--drop]
            [epmt_cmd] [epmt_cmd_args [epmt_cmd_args ...]]

positional arguments:
  epmt_cmd       start, run, stop, submit, dump
  epmt_cmd_args  Additional arguments, command dependent

optional arguments:
  -n, --dry-run  Don't touch the database
  -v, --verbose  Increase level of verbosity/debug
  -h, --help     Show this help message and exit
  -a, --auto     Do start/stop when running
  --drop         Drop all tables/data and recreate before importing

settings.py (overridden by below env. vars):
db_params               {'host': 'localhost', 'password': 'example', 'user': 'postgres', 'dbname': 'EPMT', 'provider': 'postgres'}
debug                   False                                                   
input_pattern           *-papiex-[0-9]*-[0-9]*.csv                              
install_prefix          ../papiex-oss/papiex-oss-install/                       
papiex_options          PERF_COUNT_SW_CPU_CLOCK                                 
epmt_output_prefix      /tmp/epmt/                                              

environment variables (overrides settings.py):

The Three Modes of `epmt`¶

There are three modes to epmt usage; data collection, data submission, and data analysis. Each has different dependencies:

Collection requires Python 2.6 or higher, and compiled papiex libraries for process tagging
Submission requires Python packages for SQL and database interactions (sqlalchemy, sqlite, postgres)
Analysis requires jupyter, iPython, and additionally epmt-dash and plotly for dashboard-style analytics

The First Mode, Data Collection¶

Collecting Performance Data¶

Assuming you have epmt installed and in your path, let's modify a job file:

$ cat my_job.sh
#!/bin/bash
# Example job script for Torque or SLURM
./compute_the_world --debug

This becomes:

$ cat my_job_epmt.sh
#!/bin/bash
# Example job script for Torque or SLURM
epmt start
epmt run ./compute_the_world --debug 
epmt stop

Or more succinctly by automating the start/stop cycle with the --auto or -a flag:

$ cat my_job_epmt2.sh
#!/bin/bash
# Example job script for Torque or SLURM
epmt -a run ./compute_the_world --debug

But usually we want to run more than one executable. We could have any number of run statements:

$ cat my_job_epmt3.sh
#!/bin/bash
# Example job script for Torque or SLURM
epmt start
epmt run ./initialize_the_world --random 
epmt run ./compute_the_world 
epmt run ./postprocess_the_world 
epmt stop

Let's skip all the markup and do it with only environment variables. epmt provides the configuration to export to the environment through the source command. This is intended for use in a job file and should be evaluated by the running shell, be it bash or csh. epmt source prints the required environment variables in Bash format unless either the SHELL or _ environment variable ends in csh.

Note the unset of LD_PRELOAD before stop! This prevents the data collection routine from running on epmt stop itself.

$ cat my_job_bash.sh
#!/bin/bash
# Example job script for Torque or SLURM

### Preamble, collect job metadata and monitor all processes/threads  
epmt start
eval `epmt source`

./initialize_the_world --random 
./compute_the_world 
./postprocess_the_world 

# Postamble, disable monitoring and collect job metadata
unset LD_PRELOAD
epmt stop

Here's an example for csh, when run interactively,

$ /bin/csh
> epmt -j1 source
setenv PAPIEX_OPTIONS PERF_COUNT_SW_CPU_CLOCK;
setenv PAPIEX_OUTPUT /tmp/epmt/1/;
setenv LD_PRELOAD /Users/phil/Work/GFDL/epmt.git/../papiex-oss/papiex-oss-install/lib/libpapiex.so:/Users/phil/Work/GFDL/epmt.git/../papiex-oss/papiex-oss-install/lib/libpapi.so:/Users/phil/Work/GFDL/epmt.git/../papiex-oss/papiex-oss-install/lib/libpfm.so:/Users/phil/Work/GFDL/epmt.git/../papiex-oss/papiex-oss-install/lib/libmonitor.so

Data Collection with SLURM epilog and prolog¶

Using configured prolog and epilogs with SLURM tasks allows one to skip job instrumentation entirely, except for job tags (EPMT_JOB_TAGS) and process tags (PAPIEX_TAGS). These are configured in slurm.conf for jobs submitted with sbatch but can be tested on the command line when using srun.

The above Csh job is equivalent to the below sequence using a prolog and epilog, with the exception of the trailing submit statement.

$ SLURM_TASK_SCRIPT_DIR=${EPMT_PREFIX}/epmt-install/slurm
$ srun -n1 \\
    --task-prolog=${SLURM_TASK_SCRIPT_DIR}/slurm_task_prolog_epmt.sh \\
    --task-epilog=${SLURM_TASK_SCRIPT_DIR}/slurm_task_epilog_epmt.sh
$ sleep 1

For this job to work using sbatch, make the following modifications in slurm.conf, substituting the appropriate path for $EPMT_PREFIX:

SLURM_TASK_SCRIPT_DIR=${EPMT_PREFIX}/epmt-install/slurm
TaskProlog=${SLURM_TASK_SCRIPT_DIR}/slurm_task_prolog_epmt.sh
TaskEpilog=${SLURM_TASK_SCRIPT_DIR}/slurm_task_epilog_epmt.sh

If this fails, the papiex installation is likely either missing or misconfigured in settings.py. The -a flag tells epmt to treat this run as an entire job.

The Second Mode, Data Submission¶

After collecting the data, jobs (groups of processes) are imported into the database with the submit command. This command takes arguments in the form of directories or tar files that must contain a job_metadata file.

Normal operation is to submit one or more directories:

epmt submit <dir1/> [...]

One can also submit a list of compressed tar files:

epmt submit <compressed_dir_file.*z> [...]

There is also a mode where the current environment is used to determine where to find the data.

epmt submit

Manual Submission Example¶

We can submit our previous job to the database defined in settings.py by running the epmt submit command with the directory returned by stage (found in the location set by settings.py epmt_output_prefix):

$ epmt -v submit /tmp/epmt/1/
INFO:epmt_cmds:submit_to_db(/tmp/epmt/1/,*-papiex-[0-9]*-[0-9]*.csv,False)
INFO:epmt_cmds:Unpickling from /tmp/epmt/1/job_metadata
INFO:epmt_cmds:1 files to submit
INFO:epmt_cmds:1 hosts found: ['linuxkit-025000000001-']
INFO:epmt_cmds:host linuxkit-025000000001-: 1 files to import
INFO:epmt_job:Binding to DB: {'filename': ':memory:', 'provider': 'sqlite'}
INFO:epmt_job:Generating mapping from schema...
INFO:epmt_job:Processing job id 1
INFO:epmt_job:Creating user root
INFO:epmt_job:Creating job 1
INFO:epmt_job:Creating host linuxkit-025000000001-
INFO:epmt_job:Creating metricname usertime
INFO:epmt_job:Creating metricname systemtime
INFO:epmt_job:Creating metricname rssmax
INFO:epmt_job:Creating metricname minflt
INFO:epmt_job:Creating metricname majflt
INFO:epmt_job:Creating metricname inblock
INFO:epmt_job:Creating metricname outblock
INFO:epmt_job:Creating metricname vol_ctxsw
INFO:epmt_job:Creating metricname invol_ctxsw
INFO:epmt_job:Creating metricname num_threads
INFO:epmt_job:Creating metricname starttime
INFO:epmt_job:Creating metricname processor
INFO:epmt_job:Creating metricname delayacct_blkio_time
INFO:epmt_job:Creating metricname guest_time
INFO:epmt_job:Creating metricname rchar
INFO:epmt_job:Creating metricname wchar
INFO:epmt_job:Creating metricname syscr
INFO:epmt_job:Creating metricname syscw
INFO:epmt_job:Creating metricname read_bytes
INFO:epmt_job:Creating metricname write_bytes
INFO:epmt_job:Creating metricname cancelled_write_bytes
INFO:epmt_job:Creating metricname time_oncpu
INFO:epmt_job:Creating metricname time_waiting
INFO:epmt_job:Creating metricname timeslices
INFO:epmt_job:Creating metricname rdtsc_duration
INFO:epmt_job:Creating metricname PERF_COUNT_SW_CPU_CLOCK
INFO:epmt_job:Adding 1 processes to job
INFO:epmt_job:Earliest process start: 2019-03-06 15:36:56.948350
INFO:epmt_job:Latest process end: 2019-03-06 15:37:06.996065
INFO:epmt_job:Computed duration of job: 10047715.000000 us, 0.17 m
INFO:epmt_job:Staged import of 1 processes, 1 threads
INFO:epmt_job:Staged import took 0:00:00.189151, 5.286781 processes per second
INFO:epmt_cmds:Committed job 1 to database: Job[u'1']

Compressed Directory Submission Example¶

This might happen at the end of the day via a cron job:

epmt submit <dir>/*tgz

Internal-batch Job Submission Example¶

These commands could be part of every users job, or in the batch systems configurable preambles/postambles.

$ cat my_job.sh
#!/bin/bash
# Example job script for Torque or SLURM
echo "$PBS_JOBID or $SLURM_JOBID"
epmt start
epmt run ./compute_the_world --debug 
epmt stop
epmt submit

The start/stop cycle can be removed with the --auto or -a flag, which performs start and stop for you:

epmt -a run ./debug_the_world --outliers
epmt submit

Data From Current Session Submission Example¶

If not inside of a batch environment, epmt will attempt to fake-and-bake a job id. This is useful when performing interactive runs. You may not be able to submit these jobs to a shared database due to constraints on job ID uniqueness, since the session ID is not guaranteed to be unique across reboots, much less other systems. However, this use case is perfectly acceptable when using a private database:

$ epmt start
WARNING:epmt_cmds:JOB_ID unset: Using session id 6948 as JOB_ID
WARNING:epmt_cmds:JOB_NAME unset: Using job id 6948 as JOB_NAME
WARNING:epmt_cmds:JOB_SCRIPTNAME unset: Using process name 6948 as JOB_SCRIPTNAME
WARNING:epmt_cmds:JOB_USER unset: Using username phil as JOB_USE$ epmt run ./debug_the_world --outliers
$ epmt stop
$ epmt submit

The Third Mode, Data Analysis and Visualization¶

epmt uses an IPython notebook data analytics environment. Starting the Jupyter notebook is easy from the epmt notebook command:

$ epmt notebook
[I 15:39:24.236 NotebookApp] Serving notebooks from local directory: /home/chris/Documents/playground/MM/build/epmt
[I 15:39:24.236 NotebookApp] The Jupyter Notebook is running at:
[I 15:39:24.236 NotebookApp] http://localhost:8888/?token=9c7529e19e12cb8121d66ff471e96fdd3056f6acc4480274
[I 15:39:24.236 NotebookApp]  or http://127.0.0.1:8888/?token=9c7529e19e12cb8121d66ff471e96fdd3056f6acc4480274
[I 15:39:24.236 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:39:24.263 NotebookApp] 

    To access the notebook, open this file in a browser:
        file:///home/chris/.local/share/jupyter/runtime/nbserver-18690-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=9c7529e19e12cb8121d66ff471e96fdd3056f6acc4480274
     or http://127.0.0.1:8888/?token=9c7529e19e12cb8121d66ff471e96fdd3056f6acc4480274

The notebook command supports passing parameters to jupyter, such as host IP for sharing access with machines on the local network, notebook token, and notebook password:

epmt notebook -- --ip 0.0.0.0 --NotebookApp.token='thisisatoken' --NotebookApp.password='hereisasupersecurepassword'

Debugging¶

epmt can be passed both -n (dry-run) and -v (verbosity) to help with debugging. Add more -v flags to increase verbosity level (-vvv).

epmt -v start

Or to attempt a submit without touching the database:

epmt -vv submit -n /dir/to/jobdata

Also, one can decode and dump the job_metadata file in a dir or compressed dir.

$ epmt dump ~/Downloads/yrs05-25.20190221/CM4_piControl_C_atmos_00050101.papiex.gfdl.19712961.tgz 
exp_component           atmos                                                   
exp_jobname             CM4_piControl_C_atmos_00050101                          
exp_name                CM4_piControl_C                                         
exp_oname               00050101                                                
job_el_env_changes      {}                                                      
job_el_env_changes_len  0                                                       
job_el_from_batch       []                                                      
job_el_status           0                                                       
job_el_stop             2019-02-20 22:13:23.131187                              
job_pl_env              {'LANG': 'en_US', 'PBS_QUEUE': 'batch', 'SHELL': '/bin/csh', 'PBS_ENVIRONMENT': 'PBS_BATCH', 'PAPIEX_TAGS': 'atmos', 'SHLVL': '3', 'PBS_WALLTIME': '216000', 'MOAB_NODELIST': 'pp057.princeton.rdhpcs.noaa.gov', 'PBS_VERSION': 'TORQUE-6.0.2', 'PAPIEX_OUTPUT': '/vftmp/Foo.Bar/pbs20345339/papiex',  'LOADEDMODULES': '', 'LC_TIME': 'C', 'MACHTYPE': 'x86_64', 'PAPIEX_OPTIONS': 'PERF_COUNT_SW_CPU_CLOCK', 'MOAB_GROUP': 'f'}
job_pl_env_len          81                                                      
job_pl_from_batch       []                                                      
job_pl_groupnames       ['f', 'f']                                              
job_pl_hostname         pp057                                                   
job_pl_id               20345339.moab01.princeton.rdhpcs.noaa.gov               
job_pl_jobname          CM4_piControl_C_atmos_00050101                          
job_pl_scriptname       CM4_piControl_C_atmos_00050101                          
job_pl_start            2019-02-20 19:58:41.274267                              
job_pl_submit           2019-02-20 19:58:41.274463                              
job_pl_username         Foo.Bar

Performance Metrics Data Dictionary¶

epmt collects data both from the job runtime and the applications run in that environment. See the src/epmt/models/ directory for fixed data stored related to each object. Metric data is stored differently; the data collector's data dictionary can be found in papiex-oss/README.md. At the time of this writing, it looked like this:

Key	Scope	Description
1. tags	Process	User specified tags for this executable
2. hostname	Process	hostname
3. exename	Process	Name of the application, usually argv[0]
4. path	Process	Path to the application
5. args	Process	All arguments to exe excluding argv[0]
6. exitcode	Process	Exit code
7. exitsignal	Process	Exited due to a signal
8. pid	Process	Process id
9. generation	Process	Incremented after every exec() or PID wrap
10. ppid	Process	Parent process id
11. pgid	Process	Process group id
12. sid	Process	Process session id
13. numtids	Process	Number of threads caught by instrumentation
14. numranks	Process	Number of MPI ranks detected
15. tid	Process	Thread id
16. mpirank	Thread	MPI rank
17. start	Process	Microsecond timestamp at start
18. end	Process	Microsecond timestamp at end
19. usertime	Thread	Microsecond user time
20. systemtime	Thread	Microsecond system time
21. rssmax	Thread	Kb max resident set size
22. minflt	Thread	Minor faults (TLB misses/new page frames)
23. majflt	Thread	Major page faults (requiring I/O)
24. inblock	Thread	512B blocks read from I/O
25. outblock	Thread	512B blocks written to I/O
26. vol_ctxsw	Thread	Voluntary context switches (yields)
27. invol_ctxsw	Thread	Involuntary context switches (preemptions)
28. cminflt	Process	minflt (20) for all wait()ed children
29. cmajflt	Thread	majflt (21) for all wait()ed children
30. cutime	Process	utime (17) for all wait()ed children
31. cstime	Thread	stime (18) for all wait()ed children
32. num_threads	Process	Threads in process at finish
33. starttime	Thread	Timestamp in jiffies after boot thread was started
34. processor	Thread	CPU this thread last ran on
35. delayacct_blkio_time	Thread	Jiffies process blocked in D state on I/O device
36. guest_time	Thread	Jiffies running a virtual CPU for a guest OS
37. rchar	Thread	Bytes read via syscall (maybe from cache not dev I/O)
38. wchar	Thread	Bytes written via syscall (maybe to cache not dev I/O)
39. syscr	Thread	Read syscalls
40. syscw	Thread	Write syscalls
41. read_bytes	Thread	Bytes read from I/O device
42. write_bytes	Thread	Bytes written to I/O device
43. cancelled_write_bytes	Thread	Bytes discarded by truncation
44. time_oncpu	Thread	Nanoseconds spent running
45. time_waiting	Thread	Nanoseconds runnable but waiting
46. timeslices	Thread	Number of run periods on CPU
47. rdtsc_duration	Thread	If PAPI, real time cycle duration of thread
*	Thread	PAPI metrics

Addition of new metrics¶

Additional metrics can be configured either in two ways:

The papiex_options string in settings.py if using epmt run or epmt source
The value of the PAPIEX_OPTIONS environment variable if using LD_PRELOAD directly.

The value of these should be a comma separated string:

export PAPIEX_OPTIONS="PERF_COUNT_SW_CPU_CLOCK,PAPI_CYCLES"

To list available and functioning metrics, use one of the included command line tools:

papi_avail and papi_native_avail (via papi)
check_events and showevtinfo (via libpfm)
perf list (linux)

The PERF_COUNT_SW_* events should work on any system that has the proper /proc/sys/kernel/perf_event_paranoid setting.

One should verify the functionality of the metric using the papi_command_line tool:

papi_command_line PERF_COUNT_SW_CPU_CLOCK
papi_command_line CYCLES

CI/CD Workflows and Caching¶

epmt's GitHub Actions CI is split into focused workflows that use actions/cache to avoid rebuilding expensive artifacts on every pull request run.

Workflows¶

Workflow	Trigger	Purpose
`docker_build_test.yml`	`push` to `main`, `pull_request`	Full build + test pipeline; restores cached artifacts before building
`slurm_image_build.yml`	Weekly (Mon 06:00 UTC), `workflow_dispatch`	Builds the `slurm-cluster` Docker image from source and saves it to cache
`weekly_tarball_build.yml`	Weekly (Mon 06:00 UTC), `workflow_dispatch`	Compiles `papiex` and downloads `epmt-dash`, then saves both to cache
`build_and_test_epmt.yml`	`push` to `main`, `pull_request`	Source-tree unit tests (no Docker)

Caches and Invalidation¶

Each cache step is keyed on its build prerequisites — analogous to how make uses file modification times to decide whether a target must be rebuilt. Changing a prerequisite produces a new cache key, causing a cache miss and forcing a fresh build.

Cache	Cache key components	Invalidation trigger	Notes
`epmt-build` Docker image	`OS_TARGET` + `PYTHON_VERSION` + `SQLITE_VERSION` + `hashFiles(Dockerfile, requirements.txt.py3)`	Edit the Dockerfile or requirements file, or bump any version variable	Fully content-hash based — closest analogy to `make`
`papiex` compiled tarball	`PAPIEX_VERSION` + `OS_TARGET`	Bump `PAPIEX_VERSION` in `docker_build_test.yml`, `weekly_tarball_build.yml`, and `Makefile`	Version-gated; relies on papiex using immutable release tags
`test-release` Docker image	`OS_TARGET` + `PYTHON_VERSION` + `SQLITE_VERSION` + `hashFiles(Dockerfile, requirements.txt.py3)` + `github.sha`	Always rebuilds (by design) — `restore-keys` prefix reuses unchanged early layers via `--cache-from`	Image content changes every commit; layer reuse keeps it fast
`slurm-cluster` Docker image	`IMAGE_TAG` + `SLURM_TAG` + `SLURM_CLUSTER_TAG`	Bump any of the three version variables in `docker_build_test.yml` and `slurm_image_build.yml`	Version-string based; upstream tag mutations without a version bump won't invalidate
`epmt-dash` UI directory	`EPMT_DASH_SRC_BRANCH`	Change `EPMT_DASH_SRC_BRANCH` in `docker_build_test.yml`, `weekly_tarball_build.yml`, and `Makefile`	Branch-name based; new commits to same branch don't invalidate — weekly workflow bounds staleness to ≤1 week

Cache Invalidation Gap vs. `make`¶

make detects prerequisite changes via file modification times regardless of version numbers. The GitHub Actions caches above use version strings or content hashes instead, so:

epmt-build is fully make-like — changing the Dockerfile or requirements file immediately produces a different hash and forces a rebuild.
papiex and slurm-cluster require a deliberate version bump in the workflow env block to trigger a rebuild. Remote source changes without a version bump go undetected until the next weekly build.
epmt-dash requires either a branch rename or waiting for the weekly build. The weekly weekly_tarball_build.yml workflow always fetches fresh content, bounding maximum staleness to one week.

Forcing a Rebuild¶

To force any individual cache to rebuild, bump the relevant version variable in the env: section of both the weekly workflow and docker_build_test.yml, and update the Makefile to match:

# docker_build_test.yml  (and weekly_tarball_build.yml)
env:
  PAPIEX_VERSION: "2.3.16"          # bump to force papiex rebuild
  EPMT_DASH_SRC_BRANCH: "new-branch" # change to force epmt-dash rebuild
  IMAGE_TAG: "25.05.4"              # bump to force slurm-cluster rebuild

The epmt-build cache is invalidated automatically whenever Dockerfiles/Dockerfile.rocky-8-epmt-build or requirements.txt.py3 is modified — no manual version bump is needed.

Weekly Pre-warming¶

slurm_image_build.yml and weekly_tarball_build.yml run every Monday morning to pre-warm their respective caches before the working week begins. docker_build_test.yml then restores those caches on pull request and push runs, skipping the expensive Docker compile steps. If the cache is missing (first run, eviction, or new key), docker_build_test.yml falls back to building the artifact inline so the pipeline never silently skips a required build step.

Troubleshooting¶

Virtual Environments¶

Note that often in virtual environments, hardware counters are not often available in the VM.

Common Error Messages¶

`version GLIBC_x.xx not found`¶

The collector library may not have been built for the current environment or the release OS version does not match the current environment.

epmt Developer and Advanced Usage Guide¶