
PoPy Data Format¶

The PoPy data file records observation and dosing regimens for each individual in a study.

The columns or fields in the data file fall into the four main types listed in Table 5:-

Table 5 PoPy data fields¶
Field	Comment
Required Fields	TYPE/ID/TIME
Dosing Fields	dosing regime data
Observation Fields	observed measurements
Extra Fields	extra co-variate information

The data file values for each field can be accessed using the c[X] notation in the PoPy script file.

Required Fields¶

A PoPy data set requires the following fields:-

TYPE - type of row
ID - identity
TIME - time field

Note the names ‘TYPE’, ‘ID’ and ‘TIME’ are the default names of these three required fields. You can use other field names if you choose to redefine them in the script file DATA_FIELDS section.

TYPE¶

The ‘TYPE’ field specifies the event that is happening in each row of the data file. The different types of row are as follows:-

obs - Measurements that contribute to the log likelihood as defined in the PREDICTIONS section.
dose - Creates a dose according to the dosing functions in the DERIVATIVES section.
pred - Extra prediction data points. PoPy will output extra p[X] data at these time points, but they do not contribute to the likelihood.
reset - Set the s[X] compartment states back to the initial values (usually zero)
reset+dose - A ‘reset’ combined with a ‘dose’ event.

The row types above have direct equivalents in Nonmem in terms of the EVID integer values.
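As a rough guide, the correspondence can be sketched in Python as below. The EVID values 0, 1, 3 and 4 are Nonmem's standard codes for observation, dose, reset and reset-and-dose records. Pairing ‘pred’ with EVID 2 (Nonmem's "other" event) is an assumption made for illustration, and the names `TYPE_TO_EVID` and `evid_to_type` are hypothetical helpers, not part of PoPy.

```python
# Hypothetical lookup between PoPy TYPE strings and Nonmem EVID integers.
# The 'pred' <-> 2 pairing is an assumption, not confirmed by this page.
TYPE_TO_EVID = {
    "obs": 0,
    "dose": 1,
    "pred": 2,
    "reset": 3,
    "reset+dose": 4,
}
EVID_TO_TYPE = {evid: name for name, evid in TYPE_TO_EVID.items()}


def evid_to_type(evid):
    """Return the PoPy TYPE string for a Nonmem EVID value."""
    try:
        return EVID_TO_TYPE[int(evid)]
    except KeyError:
        raise ValueError(f"unsupported EVID value: {evid!r}") from None
```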

Typically a drug trial data set consists mainly of ‘obs’ and ‘dose’ rows, with a few ‘reset’ rows per subject.

ID¶

The ‘ID’ field value defines the individual for a given row. As PoPy is a PopPK/PD system, the ‘ID’ field is required because the data is split over multiple individuals to form a population.

Note that non-population analysis can be performed in PoPy by assigning all rows the same ‘ID’ value.

TIME¶

The ‘TIME’ field defines the time stamp for each row.

The time field is required to be monotonically non-decreasing, i.e. each row's time must be greater than or equal to that of the previous row, unless a TYPE = ‘reset’ or ‘reset+dose’ row is reached. Note that when the ID changes between rows, an implicit ‘reset’ occurs.

For an example of a valid combination of TYPE/ID/TIME data see Table 6.

Table 6 PoPy time reset example¶
TYPE	ID	TIME	comment
obs	Bob	0.0	observation at time zero
dose	Bob	4.0	dose for Bob at time 4.0
obs	Bob	4.0	observation for Bob at time 4.0
obs	Bob	8.0	later observation
obs	Ruth	0.0	time goes back, allowed because of new ID
dose	Ruth	10.0	dose for Ruth at time 10.0
obs	Ruth	20.0	later observation
reset	Ruth	30.0	`s[X]` reset at time 30.0
obs	Ruth	1.0	observation following reset

In Table 6 the time always increases or stays the same in consecutive rows, but time is allowed to go backwards after a new ID or a reset.
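The rule can be checked mechanically before loading a data set. The Python sketch below is illustrative only and is not PoPy's own validation code. It assumes that the clock restarts on the row after a ‘reset’ or ‘reset+dose’ row, as in Table 6, and that the reset row itself must still follow on from the previous time. The names `RESET_TYPES` and `find_time_errors` are hypothetical.

```python
RESET_TYPES = {"reset", "reset+dose"}


def find_time_errors(rows):
    """Return error messages for rows that break the TIME ordering rule.

    rows: iterable of (type, id, time) tuples in data file order.
    """
    errors = []
    prev_id = None
    prev_time = None  # None means "no earlier row to compare against"
    for line_no, (row_type, row_id, time) in enumerate(rows, start=1):
        if row_id != prev_id:
            prev_time = None  # implicit reset when the ID changes
        if prev_time is not None and time < prev_time:
            errors.append(
                f"row {line_no}: TIME {time} is earlier than {prev_time}"
            )
        # Assumption: the row after a reset may restart the clock.
        prev_time = None if row_type in RESET_TYPES else time
        prev_id = row_id
    return errors


# The TYPE/ID/TIME columns of Table 6.
TABLE_6 = [
    ("obs", "Bob", 0.0),
    ("dose", "Bob", 4.0),
    ("obs", "Bob", 4.0),
    ("obs", "Bob", 8.0),
    ("obs", "Ruth", 0.0),
    ("dose", "Ruth", 10.0),
    ("obs", "Ruth", 20.0),
    ("reset", "Ruth", 30.0),
    ("obs", "Ruth", 1.0),
]
```

Under these assumptions `find_time_errors(TABLE_6)` returns an empty list, whereas two consecutive rows for the same ID with times 5.0 then 2.0 produce one error.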

Dosing Fields¶

Dosing events are created in the data file using ‘dose’ values in the TYPE field.

There are two methods of associating data dose rows with the DERIVATIVES section in the PoPy script file, as follows:-

Single Dose Type
Multiple Dose Types

The first involves using just the ‘dose’ value, the second involves defining dose type names.

The amount of each dose is usually specified in an AMT field, see below.

AMT¶

Note in PoPy AMT is not a keyword. It is just the conventional name for the dose amount field used in this documentation. See AMT for the Nonmem keyword.

Single Dose Type¶

The simplest way to create doses at a set of fixed times is shown in Table 7.

Table 7 PoPy single dose type example¶
TYPE	TIME	AMT	comment
dose	1.0	100	dose of 100 at time 1.0
dose	2.0	200	dose of 200 at time 2.0
dose	3.0	100	dose of 100 at time 3.0

Note that this creates 3 doses at times [1.0, 2.0, 3.0]. The script file loading this data set should have a DERIVATIVES section something like:-

DERIVATIVES: |
    d[DEPOT] = @bolus{amt: c[AMT]} - m[KE] * s[DEPOT]

Note that the @bolus dose has no name associated with it.
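To see what this equation describes for the doses in Table 7, the following Python sketch evaluates its closed-form solution. It is not how PoPy solves the system. It assumes each bolus is added to DEPOT instantaneously, that DEPOT starts at zero, and that elimination is first order with rate `ke`. The function name `depot_amount` and any `ke` value passed to it are illustrative.

```python
import math


def depot_amount(t, doses, ke):
    """Amount in DEPOT at time t, assuming instantaneous bolus doses.

    doses: list of (time, amt) pairs; ke: first-order elimination rate.
    Each dose given at or before t decays as amt * exp(-ke * (t - t_dose)).
    """
    return sum(
        amt * math.exp(-ke * (t - t_dose))
        for t_dose, amt in doses
        if t_dose <= t
    )


# The TIME and AMT columns of Table 7.
TABLE_7_DOSES = [(1.0, 100.0), (2.0, 200.0), (3.0, 100.0)]
```

For example, `depot_amount(2.0, TABLE_7_DOSES, ke)` is the decayed remainder of the first dose plus the full second dose of 200.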

Multiple Dose Types¶

If you have multiple types of dose in your analysis, e.g. two different drugs being prescribed, then you need to give each dose type a name, as shown in Table 8.

Table 8 PoPy multi dose type example¶
TYPE	TIME	AMT_DRUG1	AMT_DRUG2	comment
dose:drug1	1.0	100	0	100 units of drug1
dose:drug2	2.0	0	200	200 units of drug2
dose:drug1	3.0	50	0	50 units of drug1

The data file above creates 2 doses of drug1 and 1 dose of drug2. The script file loading this data set should have a DERIVATIVES section something like:-

DERIVATIVES: |
    dose[drug1] = @bolus{amt: c[AMT_DRUG1]}
    dose[drug2] = @bolus{amt: c[AMT_DRUG2]}
    d[DEPOT1] = dose[drug1] - m[KE1] * s[DEPOT1]
    d[DEPOT2] = dose[drug2] - m[KE2] * s[DEPOT2]

The important aspect here is that the @bolus doses are defined with names ‘drug1’ and ‘drug2’. These names also appear in the TYPE field in the data set as ‘dose:drug1’ and ‘dose:drug2’.

An alternative naming syntax is as follows:-

DERIVATIVES: |
    d[DEPOT1] = @bolus{amt: c[AMT_DRUG1], name: 'drug1'} - m[KE1] * s[DEPOT1]
    d[DEPOT2] = @bolus{amt: c[AMT_DRUG2], name: 'drug2'} - m[KE2] * s[DEPOT2]

Note that when creating a PoPy data set, you only need to specify a name for each type of dose. You can leave the modelling decision of where each dose appears in the compartment model to a later time.
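When preparing or checking such a data set outside PoPy, the dose name can be separated from the TYPE value with a small helper. This Python sketch is illustrative only. The function `split_dose_type` is a hypothetical name, and it assumes the name is everything after the first colon.

```python
from collections import Counter


def split_dose_type(type_value):
    """Split a TYPE value such as 'dose:drug1' into (event, dose_name).

    dose_name is None when the TYPE value carries no name, e.g. 'dose'.
    """
    event, sep, name = type_value.partition(":")
    return event, (name if sep else None)


# The TYPE column of Table 8.
TABLE_8_TYPES = ["dose:drug1", "dose:drug2", "dose:drug1"]
dose_counts = Counter(split_dose_type(t)[1] for t in TABLE_8_TYPES)
```

Here `dose_counts` records two ‘drug1’ doses and one ‘drug2’ dose, matching the description of Table 8.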

Observation Fields¶

Another important set of fields in the data file are the columns that define observed measurements. Observation rows are defined by setting TYPE = ‘obs’.

This section shows examples of the following:-

Single Observed Field
Observed Field with missing data
Multiple Observed Fields

Note in each case the PREDICTIONS section of the PoPy script file is associated with observation fields in the data file in order to compute the likelihood correctly.

Single Observed Field¶

An example of a single observed field is shown in Table 9.

Table 9 PoPy single observed field example¶
TYPE	DRUG_CONC
obs	10.5
obs	15.5
obs	2.0

In this simple case the PREDICTIONS section may look something like:-

PREDICTIONS: |
    p[DRUG_CONC] = s[CEN]/m[V]
    c[DRUG_CONC] ~ norm(p[DRUG_CONC], m[ANOISE_var])

Note that the c[DRUG_CONC] references the ‘DRUG_CONC’ field of the data set. Here the likelihood is computed by comparing the model prediction p[DRUG_CONC] and the data file observation c[DRUG_CONC] for all rows of the data set, where TYPE = ‘obs’.

Therefore all values of the data column ‘DRUG_CONC’ have to be valid observations. If you have missing values then you need to use the data structure in Observed Field with missing data.
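The comparison between predictions and observations can be sketched in Python as follows. This is a minimal illustration, not PoPy's implementation. It assumes the second argument of `norm` is a variance (suggested by the name `ANOISE_var`) and that the contributions of the ‘obs’ rows are summed. The function name `normal_log_likelihood` is hypothetical.

```python
import math


def normal_log_likelihood(observed, predicted, variance):
    """Sum of normal log-densities of observations around predictions."""
    total = 0.0
    for obs, pred in zip(observed, predicted):
        total += -0.5 * (
            math.log(2.0 * math.pi * variance)
            + (obs - pred) ** 2 / variance
        )
    return total


# The DRUG_CONC column of Table 9; predictions would come from the model.
OBSERVED = [10.5, 15.5, 2.0]
```

Predictions closer to the observed values give a higher (less negative) log likelihood for a fixed variance.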

Observed Field with missing data¶

An example of a single observed field with some missing data is shown in Table 10.

Table 10 PoPy single observed field missing data example¶
TYPE	DRUG_CONC	DRUG_CONC_FLAG	comment
obs	10.5	1	DRUG_CONC valid
obs	0.0	0	DRUG_CONC invalid
obs	-5.0	0	DRUG_CONC invalid
obs	2.0	1	DRUG_CONC valid

In this case the PREDICTIONS section may still look something like:-

PREDICTIONS: |
    p[DRUG_CONC] = s[CEN]/m[V]
    c[DRUG_CONC] ~ norm(p[DRUG_CONC], m[ANOISE_var])

However, not all the TYPE = ‘obs’ rows contribute to the likelihood in this case; only the rows that have TYPE = ‘obs’ and DRUG_CONC_FLAG = 1 do.

It is similar to having the following ‘if’ statement in your PREDICTIONS section:-

PREDICTIONS: |
    p[DRUG_CONC] = s[CEN]/m[V]
    if c[DRUG_CONC_FLAG] > 0.5:
        c[DRUG_CONC] ~ norm(p[DRUG_CONC], m[ANOISE_var])

You can include the ‘if’ statement in your PREDICTIONS section if you like, but it is not required (or encouraged).

Note also that omitting the ‘DRUG_CONC_FLAG’ field from your data set has a similar effect to creating a ‘DRUG_CONC_FLAG’ field and setting all the values to 1, i.e. flags default to 1 in PoPy.
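The row selection described in this section can be mimicked in Python when inspecting a data set by hand. The sketch below is an illustration, not PoPy code. The function name `contributing_rows` is hypothetical, and it assumes the flag column is always named after the observed field with a `_FLAG` suffix, as in the examples here.

```python
def contributing_rows(rows, field):
    """Return the rows whose `field` value counts towards the likelihood.

    rows: list of dicts keyed by column name.
    A row counts if TYPE is 'obs' and its flag is set; a missing flag
    column is treated as 1, mirroring the default described above.
    """
    flag_name = field + "_FLAG"
    selected = []
    for row in rows:
        flag = float(row.get(flag_name, 1))
        if row["TYPE"] == "obs" and flag > 0.5:
            selected.append(row)
    return selected


# The rows of Table 10.
TABLE_10 = [
    {"TYPE": "obs", "DRUG_CONC": 10.5, "DRUG_CONC_FLAG": 1},
    {"TYPE": "obs", "DRUG_CONC": 0.0, "DRUG_CONC_FLAG": 0},
    {"TYPE": "obs", "DRUG_CONC": -5.0, "DRUG_CONC_FLAG": 0},
    {"TYPE": "obs", "DRUG_CONC": 2.0, "DRUG_CONC_FLAG": 1},
]
```

For Table 10 only the first and last rows are selected. The same helper applies per field when a data set has several observed fields, e.g. `contributing_rows(rows, "DRUG1")`.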

If you have multiple observation types in your data set then flag fields become more important, see the example data structure in Multiple Observed Fields.

Multiple Observed Fields¶

An example of multiple observed fields is shown in Table 11.

Table 11 PoPy multiple observed fields¶
TYPE	DRUG1	DRUG1_FLAG	DRUG2	DRUG2_FLAG	comment
obs	10.5	1	0.2	1	Both drugs valid
obs	10.5	1	0.0	0	only drug1 valid
obs	-4.1	0	0.0	0	both drugs invalid
obs	-4.1	0	0.5	1	only drug2 valid

In this case the PREDICTIONS section may look something like:-

PREDICTIONS: |
    p[DRUG1] = s[CEN1]/m[V1]
    c[DRUG1] ~ norm(p[DRUG1], m[ANOISE_var1])
    p[DRUG2] = s[CEN2]/m[V2]
    c[DRUG2] ~ norm(p[DRUG2], m[ANOISE_var2])

Here PoPy uses the ‘DRUG1_FLAG’ and ‘DRUG2_FLAG’ fields from the data set to only compute the likelihood from valid observations. You don’t have to use ‘if’ statements in the PREDICTIONS section to achieve this.

Extra Fields¶

The other columns of the PoPy data file are available to use, via the c[X] notation, in the verbatim sections of the script file.

See below for a simple example of covariate modelling using the MODEL_PARAMS section:-

MODEL_PARAMS: |
    m[X] = f[X] + f[X_Y_EFFECT]*c[Y]

Here the m[X] parameter is modelled as having a linear relationship with the c[Y] covariate from the data file.
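The same linear relationship can be written as a small Python function to check values by hand. The numbers used below are made up for illustration, and `model_param` is a hypothetical name.

```python
def model_param(f_x, f_x_y_effect, c_y):
    """Linear covariate model: m[X] = f[X] + f[X_Y_EFFECT] * c[Y]."""
    return f_x + f_x_y_effect * c_y


# Illustrative values only: baseline 2.0, covariate effect 0.5.
covariate_values = [0.0, 1.0, 4.0]
m_x_values = [model_param(2.0, 0.5, y) for y in covariate_values]
```

Each row's covariate value c[Y] shifts m[X] away from the baseline f[X] in proportion to f[X_Y_EFFECT].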

It is also possible to use c[X] variables in the other sections. One usage case is when you already have PK parameters estimated (from a previous study) and wish to use these c[X] variables in the DERIVATIVES section, instead of estimating m[X] parameters for each individual.

Next Steps¶

You can use the information above to construct your own PoPy data sets from real data. If you have a previously constructed Nonmem data set then see Nonmem Data to PoPy Data File for guidance on how to convert such a data set to PoPy format.

See Generate data and Fit using Simple PopPK Model for an example of creating a synthetic PoPy data file from a single script. It is also possible to create multiple data sets, see Generate multiple data sets and Fit using Simple PopPK Model.
