Nonmem to PoPy Data conversions using P2NDAT and N2PDAT Scripts¶
See Nonmem Data to PoPy Data File for an overview of how the Nonmem data format maps to PoPy format. You can use this format information to write your own data conversion script in a general-purpose programming language, e.g. R or Python.
However, we provide a convenient N2PDat Script to convert from Nonmem to PoPy automatically, without any programming. We actually provide two scripts that are mirror images of each other, as follows:-
- P2NDat Script - converts from PoPy to Nonmem data
- N2PDat Script - converts from Nonmem to PoPy data
The two types of conversion scripts are illustrated in this section using the following files from the PoPy examples folder:-
c:\PoPy\examples\p2ndat_script.pyml
n2pdat_script.pyml
fit_example1_data.csv
Here ‘fit_example1_data.csv’ is in PoPy Data Format and is the simple PK data file discussed in Fitting a Simple PopPK Model using PoPy. ‘p2ndat_script.pyml’ is a PoPy script that will convert the original PoPy ‘fit_example1_data.csv’ to Nonmem format, see P2NDAT Example. The ‘n2pdat_script.pyml’ will convert the newly created Nonmem data file back to PoPy format, see N2PDAT Example.
The data files in this section form a loop:-
fit_example1_data.csv -p2ndat-> fit_example1_nm_data.csv -n2pdat-> fit_example1_data_v2.csv
Where ‘fit_example1_data.csv’ and ‘fit_example1_data_v2.csv’ are both compatible PoPy data files and ‘fit_example1_nm_data.csv’ is in Nonmem format.
P2NDAT Example¶
The first few rows of the original ‘fit_example1_data.csv’ file are shown in Table 53.
TYPE | ID | TIME | AMT | DV_CENTRAL | DV_CENTRAL_FLAG |
---|---|---|---|---|---|
reset | 1 | 0 | 100 | 0 | 0 |
dose | 1 | 1 | 100 | 0 | 0 |
obs | 1 | 7.22152181887 | 100 | 55.3986503177 | 1 |
obs | 1 | 13.7633242874 | 100 | 43.5423043551 | 1 |
obs | 1 | 19.4607360933 | 100 | 24.3960137842 | 1 |
obs | 1 | 44.9645896939 | 100 | 3.06161955063 | 1 |
obs | 1 | 48.3691740856 | 100 | 2.84311907493 | 1 |
reset | 2 | 0 | 100 | 0 | 0 |
dose | 2 | 1 | 100 | 0 | 0 |
obs | 2 | 7.03200507014 | 100 | 48.1712193857 | 1 |
To view the example P2NDat Script, open a PoPy Command Prompt in this folder:-
c:\PoPy\examples\
And type:-
$ popy_edit p2ndat_script.pyml
Then run using:-
$ popy_run p2ndat_script.pyml
This will create a new Nonmem data file ‘fit_example1_nm_data.csv’. The first ten rows are shown in Table 54.
TIME | ID | AMT | DV | MDV | EVID | CMT |
---|---|---|---|---|---|---|
0 | 1 | 0 | 0 | 1 | 3 | 1 |
1 | 1 | 100 | 0 | 1 | 1 | 1 |
7.22152181887 | 1 | 0 | 55.3986503177 | 0 | 0 | 1 |
13.7633242874 | 1 | 0 | 43.5423043551 | 0 | 0 | 1 |
19.4607360933 | 1 | 0 | 24.3960137842 | 0 | 0 | 1 |
44.9645896939 | 1 | 0 | 3.06161955063 | 0 | 0 | 1 |
48.3691740856 | 1 | 0 | 2.84311907493 | 0 | 0 | 1 |
0 | 2 | 0 | 0 | 1 | 3 | 1 |
1 | 2 | 100 | 0 | 1 | 1 | 1 |
7.03200507014 | 2 | 0 | 48.1712193857 | 0 | 0 | 1 |
The differences between the input PoPy data in Table 53 and the output Nonmem data in Table 54 are summarised in Table 55.
Input PoPy column | Output Nonmem column | Comments |
---|---|---|
TYPE | EVID | reset->3, dose->1, obs->0 |
ID | ID | no change |
TIME | TIME | no change |
AMT | AMT | dose rows no change, obs/reset rows -> 0 |
DV_CENTRAL | DV | no change |
DV_CENTRAL_FLAG | MDV | 1-DV_CENTRAL_FLAG |
Note the corresponding columns are not in the same order between ‘fit_example1_data.csv’ and ‘fit_example1_nm_data.csv’. The P2NDat Script has removed the ‘TYPE’, ‘DV_CENTRAL’ and ‘DV_CENTRAL_FLAG’ PoPy fields, to leave ‘TIME’, ‘ID’ and ‘AMT’, then added the newly created Nonmem specific ‘DV’, ‘MDV’, ‘EVID’ and ‘CMT’ columns.
The ‘fit_example1_nm_data.csv’ contains a ‘CMT’ field to specify that the Nonmem dosing occurs in compartment 1. PoPy specifies the dosing compartment entirely in the script file, see Dosing Fields, so the output ‘CMT’ column has no corresponding column in the PoPy data file. You have to specify the ‘CMT’ value in your P2NDat Script manually, see OUTPUT_NONMEM_FIELDS.
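As a rough illustration of the mapping summarised in Table 55, the sketch below applies it to rows held as Python dictionaries. It is not the P2NDat implementation: the function name `popy_row_to_nonmem` and its `cmt` argument are assumptions made for this example, and it only covers the single-observation, bolus-only case shown above.

```python
# Hypothetical helper, not part of PoPy: applies the Table 55 mapping to one row.
EVID_FROM_TYPE = {"reset": 3, "dose": 1, "obs": 0}


def popy_row_to_nonmem(row, cmt=1):
    """Convert one PoPy row (dict of strings) into a Nonmem-style row."""
    row_type = row["TYPE"]
    return {
        "TIME": row["TIME"],
        "ID": row["ID"],
        # AMT is kept on dose rows and zeroed on obs/reset rows.
        "AMT": row["AMT"] if row_type == "dose" else "0",
        "DV": row["DV_CENTRAL"],
        # MDV is the complement of the PoPy observation flag.
        "MDV": 1 - int(row["DV_CENTRAL_FLAG"]),
        "EVID": EVID_FROM_TYPE[row_type],
        # PoPy data has no CMT column, so the value is supplied by the caller.
        "CMT": cmt,
    }


# First three rows of Table 53.
popy_rows = [
    {"TYPE": "reset", "ID": "1", "TIME": "0", "AMT": "100",
     "DV_CENTRAL": "0", "DV_CENTRAL_FLAG": "0"},
    {"TYPE": "dose", "ID": "1", "TIME": "1", "AMT": "100",
     "DV_CENTRAL": "0", "DV_CENTRAL_FLAG": "0"},
    {"TYPE": "obs", "ID": "1", "TIME": "7.22152181887", "AMT": "100",
     "DV_CENTRAL": "55.3986503177", "DV_CENTRAL_FLAG": "1"},
]
nonmem_rows = [popy_row_to_nonmem(row) for row in popy_rows]
```

The resulting dictionaries line up with the first three rows of Table 54. A hand-written script along these lines would still need to read and write the ‘.csv’ files, for example with Python's `csv` module.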
P2NDAT Script Syntax¶
You can view the example P2NDat Script here:-
c:\PoPy\examples\p2ndat_script.pyml
Each section is discussed below.
METHOD_OPTIONS¶
Just specifies the script type:-
METHOD_OPTIONS: {py_module: p2ndat}
See METHOD_OPTIONS for more information.
FILE_PATHS¶
Just specifies the input PoPy data file and output Nonmem data file:-
FILE_PATHS:
    input_popy_file: fit_example1_data.csv
    output_nonmem_file: fit_example1_nm_data.csv
INPUT_POPY_FIELDS¶
Describes the columns of the input PoPy file:-
INPUT_POPY_FIELDS:
    time_field: TIME
    id_field: ID
    type_field: TYPE
    dv_fields: ['DV_CENTRAL']
    amt_fields: ['AMT']
    rate_fields: []
    dur_fields: []
    dose_labels: ['']
Here ‘time_field’, ‘id_field’ and ‘type_field’ are the PoPy data file Required Fields. They default to the above values.
The ‘dv_fields’ is a list of PoPy Observation Fields that will be moved into the Nonmem DV field. Note that you can specify multiple observed columns; each observed field will result in extra rows in the Nonmem data output, as Nonmem only ever has one DV observation column.
The ‘amt_fields’ is a list of PoPy Dosing Fields, i.e. columns that contain dose amounts. Similar to the ‘dv_fields’, if you specify multiple dosing amount columns, then the Nonmem data output will contain extra rows, as Nonmem only has one AMT field.
The ‘rate_fields’ and ‘dur_fields’ are blank because we only have bolus dosing here. If you have infusion dosing then add the @inf_rate and @inf_dur rate and duration parameters here.
The ‘dose_labels’ field contains the dosing names used in the PoPy data file. In this case dose_labels = [‘’] means that PoPy dose names are not used, i.e. the TYPE column just uses ‘dose’ values. If you use ‘dose:my_dose_name’, ‘dose:my_other_dose_name’ in your PoPy data file, to describe Multiple Dose Types, then you need to list the names here, e.g. [‘my_dose_name’, ‘my_other_dose_name’].
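If you are unsure which names to list, a few lines of Python can collect them from the TYPE column. This is a sketch only: `collect_dose_labels` is a hypothetical helper, not a PoPy function, and it assumes the ‘dose:label’ convention described above, treating a plain ‘dose’ value as the empty label.

```python
def collect_dose_labels(type_values):
    """Return the distinct dose labels found in a PoPy TYPE column, in first-seen order."""
    labels = []
    for value in type_values:
        # 'dose:my_dose_name' -> ('dose', ':', 'my_dose_name'); 'dose' -> ('dose', '', '')
        kind, _, label = value.partition(":")
        if kind == "dose" and label not in labels:
            labels.append(label)
    return labels


# Illustrative TYPE values only; not taken from the example data file.
example_types = ["reset", "dose:my_dose_name", "obs", "dose:my_other_dose_name", "obs"]
found_labels = collect_dose_labels(example_types)
```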
OUTPUT_NONMEM_FIELDS¶
Describes the columns of the output Nonmem file:-
OUTPUT_NONMEM_FIELDS:
    comment_prefix: '#'
    column_names: auto
    time_field: TIME
    id_field: ID
    evid_field: EVID
    dv_field: DV
    mdv_field: MDV
    amt_field: AMT
    rate_field: none
    dur_field: none
    cmt_field: CMT
    obs_cmt_numbers: [1]
    dose_cmt_numbers: [1]
Here ‘comment_prefix’ allows loading of Nonmem data files with comment lines. Lines starting with the ‘comment_prefix’ symbol are ignored.
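The effect of ‘comment_prefix’ can be pictured with a short Python sketch that drops such lines before parsing. It only illustrates the behaviour described above: `read_without_comments` is a hypothetical name, the sample text is made up, and the sketch tests the raw start of each line, which is an assumption about how the prefix is matched.

```python
import csv
import io


def read_without_comments(text, comment_prefix="#"):
    """Parse CSV text into dict rows, ignoring lines that start with comment_prefix."""
    kept = [line for line in text.splitlines() if not line.startswith(comment_prefix)]
    return list(csv.DictReader(io.StringIO("\n".join(kept))))


# Made-up sample with two comment lines.
sample = "# exported for Nonmem\nTIME,ID,AMT\n0,1,0\n# mid-file note\n1,1,100\n"
rows = read_without_comments(sample)
```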
‘column_names: auto’ uses the column names in the ‘.csv’ data file. You could rename them by supplying a list here, a bit like the Nonmem $INPUT section.
The ‘time_field’, ‘id_field’, ‘evid_field’, ‘dv_field’, ‘mdv_field’, ‘amt_field’, ‘rate_field’, ‘dur_field’ and ‘cmt_field’ entries allow you to specify the column names of the Nonmem key fields ‘TIME’, ‘ID’, ‘EVID’, ‘DV’, ‘MDV’, ‘AMT’, ‘RATE’, ‘DUR’ and ‘CMT’. These fields default to the Nonmem key names.
Note that if you do not require some of the Nonmem fields, you can assign null values using ‘none’. In this case ‘rate_field’ and ‘dur_field’ are set to ‘none’, because they only relate to infusion dosing and there is only bolus dosing in this example.
The ‘obs_cmt_numbers’ is a list of compartment indices to appear in the CMT column to be created by the P2NDat Script. The ‘OUTPUT_NONMEM_FIELDS->obs_cmt_numbers’ list must be the same length as the ‘INPUT_POPY_FIELDS->dv_fields’ list, and the elements of both lists must correspond to the same type of observation. For example, in this case all PoPy ‘DV_CENTRAL’ observations occur in Nonmem compartment one. The P2NDat Script will copy the PoPy ‘DV_CENTRAL’ value into the Nonmem DV column for all rows with TYPE = ‘obs’ and set MDV = 0 for these rows.
The ‘dose_cmt_numbers’ is a list of compartment indices to appear in the CMT column to be created by the P2NDat Script. The ‘OUTPUT_NONMEM_FIELDS->dose_cmt_numbers’ list must be the same length as the ‘INPUT_POPY_FIELDS->amt_fields’ list, and the elements of both lists must correspond to the same type of dose. For example, in this case all PoPy ‘AMT’ dose amounts occur in Nonmem compartment one. The P2NDat Script will copy the PoPy ‘AMT’ value into the Nonmem AMT column for all rows with TYPE = ‘dose’ and set AMT = 0.0 for other rows.
If you have multiple doses and multiple observation fields in your input PoPy data, then you have to specify the dv_fields/obs_cmt_numbers and amt_fields/dose_cmt_numbers list pairs carefully.
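One way to sanity-check these list pairs before running the script is to line them up explicitly. The sketch below is illustrative only: `pair_fields_with_cmt` is a hypothetical helper, and the two-observation case (a second ‘DV_PERIPH’ column in compartment 2) is invented for the illustration and does not appear in the example files.

```python
def pair_fields_with_cmt(fields, cmt_numbers):
    """Pair PoPy field names with Nonmem compartment numbers, insisting on equal lengths."""
    if len(fields) != len(cmt_numbers):
        raise ValueError(
            f"{len(fields)} field(s) but {len(cmt_numbers)} compartment number(s)"
        )
    # Position i in one list must describe the same observation/dose as position i in the other.
    return dict(zip(fields, cmt_numbers))


# The pairing used in this example script.
obs_map = pair_fields_with_cmt(["DV_CENTRAL"], [1])
dose_map = pair_fields_with_cmt(["AMT"], [1])

# Invented two-observation case, for illustration only.
two_obs_map = pair_fields_with_cmt(["DV_CENTRAL", "DV_PERIPH"], [1, 2])
```

Printing the resulting dictionaries makes it easy to see at a glance which PoPy column ends up in which Nonmem compartment.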
OUTPUT_OPTIONS¶
Describes the output options. Currently, the only option is to remove fields from the final data file:-
OUTPUT_OPTIONS:
    drop_fields: ['TYPE', 'DV_CENTRAL', 'DV_CENTRAL_FLAG']
Here we are removing the old PoPy fields from the Nonmem data output. This is useful in this case, as we wish to demonstrate regenerating the original PoPy fields when we use an N2PDat Script in the next section.
N2PDAT Example¶
The P2NDat Script converts data from PoPy to Nonmem format. Here we discuss the N2PDat Script that computes the inverse conversion from Nonmem to PoPy format.
Assuming you have run the P2NDAT Example, so that ‘fit_example1_nm_data.csv’ exists, view the example N2PDat Script in your text editor by typing:-
$ popy_edit n2pdat_script.pyml
Then run the N2PDat Script using:-
$ popy_run n2pdat_script.pyml
This will create a new PoPy data file ‘fit_example1_data_v2.csv’. The first ten rows are shown in Table 56.
TIME | ID | AMT | DV_CENTRAL | DV_CENTRAL_FLAG | TYPE |
---|---|---|---|---|---|
0 | 1 | 0 | 0 | 0 | reset |
1 | 1 | 100 | 0 | 0 | dose:_bolus |
7.22152181887 | 1 | 0 | 55.3986503177 | 1 | obs |
13.7633242874 | 1 | 0 | 43.5423043551 | 1 | obs |
19.4607360933 | 1 | 0 | 24.3960137842 | 1 | obs |
44.9645896939 | 1 | 0 | 3.06161955063 | 1 | obs |
48.3691740856 | 1 | 0 | 2.84311907493 | 1 | obs |
0 | 2 | 0 | 0 | 0 | reset |
1 | 2 | 100 | 0 | 0 | dose:_bolus |
7.03200507014 | 2 | 0 | 48.1712193857 | 1 | obs |
The differences between the input Nonmem data in Table 54 and the output PoPy data in Table 56 are summarised in Table 57.
Input Nonmem column | Output PoPy column | Comments |
---|---|---|
TIME | TIME | no change |
ID | ID | no change |
AMT | AMT | no change |
DV | DV_CENTRAL | no change |
MDV | DV_CENTRAL_FLAG | 1-MDV |
EVID | TYPE | 3->reset, 1->dose:_bolus, 0->obs |
CMT | N/A | PoPy has no ‘CMT’ equivalent |
The N2PDat Script has removed ‘DV’, ‘MDV’, ‘EVID’ and ‘CMT’ Nonmem fields from ‘fit_example1_nm_data.csv’ and replaced them with ‘TYPE’, ‘DV_CENTRAL’ and ‘DV_CENTRAL_FLAG’ PoPy fields in ‘fit_example1_data_v2.csv’.
The ‘fit_example1_data_v2.csv’ contains no ‘CMT’ field because PoPy specifies the dosing compartment entirely in the script file, see Dosing Fields.
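The mapping in Table 57 can be sketched in Python in the same spirit. Again this is not the N2PDat implementation: `nonmem_row_to_popy` is a hypothetical helper that handles only the single-compartment, bolus-only case above, and it simply discards the ‘CMT’ value.

```python
# Hypothetical helper, not part of PoPy: applies the Table 57 mapping to one row.
TYPE_FROM_EVID = {3: "reset", 1: "dose:_bolus", 0: "obs"}


def nonmem_row_to_popy(row):
    """Convert one Nonmem row (dict of strings) into a PoPy-style row."""
    return {
        "TIME": row["TIME"],
        "ID": row["ID"],
        "AMT": row["AMT"],
        "DV_CENTRAL": row["DV"],
        # The PoPy flag is the complement of MDV.
        "DV_CENTRAL_FLAG": 1 - int(row["MDV"]),
        "TYPE": TYPE_FROM_EVID[int(row["EVID"])],
        # CMT is dropped: PoPy has no equivalent data column.
    }


# First three rows of Table 54.
nonmem_rows = [
    {"TIME": "0", "ID": "1", "AMT": "0", "DV": "0", "MDV": "1", "EVID": "3", "CMT": "1"},
    {"TIME": "1", "ID": "1", "AMT": "100", "DV": "0", "MDV": "1", "EVID": "1", "CMT": "1"},
    {"TIME": "7.22152181887", "ID": "1", "AMT": "0", "DV": "55.3986503177",
     "MDV": "0", "EVID": "0", "CMT": "1"},
]
popy_rows = [nonmem_row_to_popy(row) for row in nonmem_rows]
```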
N2PDAT Script Syntax¶
You can view the example N2PDat Script here:-
c:\PoPy\examples\n2pdat_script.pyml
Each section is discussed below.
METHOD_OPTIONS¶
Specifies the script type:-
METHOD_OPTIONS: {py_module: n2pdat}
See METHOD_OPTIONS for more information.
FILE_PATHS¶
Specifies the input Nonmem data file and output PoPy data file:-
FILE_PATHS:
    input_nonmem_file: fit_example1_nm_data.csv
    output_popy_file: fit_example1_data_v2.csv
INPUT_NONMEM_FIELDS¶
Describes the columns of the input Nonmem file:-
INPUT_NONMEM_FIELDS:
    comment_prefix: '#'
    column_names: auto
    date_field: none
    date_format: none
    time_field: TIME
    id_field: ID
    evid_field: EVID
    dv_field: DV
    mdv_field: MDV
    amt_field: AMT
    rate_field: none
    dur_field: none
    cmt_field: CMT
    obs_cmt_numbers: [1]
    dose_cmt_numbers: [1]
This mirrors the OUTPUT_NONMEM_FIELDS section, with the addition of the ‘date_field’ and ‘date_format’ entries, which are set to ‘none’ in this example. The main difference is that this section now describes an input Nonmem data file instead of an output Nonmem data file.
The ‘obs_cmt_numbers’ and ‘dose_cmt_numbers’ lists have to correspond to the ‘dv_fields’ and ‘amt_fields’ in the OUTPUT_POPY_FIELDS section to get sensible PoPy data output. See below for more explanation.
OUTPUT_POPY_FIELDS¶
Describes the columns of the output PoPy file:-
OUTPUT_POPY_FIELDS:
    time_field: TIME
    id_field: ID
    type_field: TYPE
    dv_fields: ['DV_CENTRAL']
    amt_fields: ['AMT']
    rate_fields: []
    dur_fields: []
    dose_labels: ['']
This is the same as the INPUT_POPY_FIELDS section. The only difference is that this section is now describing an output PoPy data file instead of an input PoPy data file.
Here the ‘dv_fields’ is a list of PoPy observation columns to be created by the N2PDat Script, based on the input Nonmem DV field. The ‘OUTPUT_POPY_FIELDS->dv_fields’ list must be the same length as the ‘INPUT_NONMEM_FIELDS->obs_cmt_numbers’ list, and the elements of both lists must correspond to the same type of observation. For example, in this case all Nonmem observations occur in compartment one, so for Nonmem data rows with EVID = 0 and CMT = 1 the Nonmem DV value is copied into the PoPy DV_CENTRAL column with DV_CENTRAL_FLAG = 1.
The ‘amt_fields’ is a list of PoPy dose amount columns to be created by the N2PDat Script, based on the input Nonmem AMT field. The ‘OUTPUT_POPY_FIELDS->amt_fields’ list must be the same length as the ‘INPUT_NONMEM_FIELDS->dose_cmt_numbers’ list, and the elements of both lists must correspond to the same type of dose. For example, in this case all Nonmem doses occur in compartment one, so for Nonmem data rows with EVID = 1 and CMT = 1 the Nonmem AMT value is copied into the PoPy AMT column, with all other rows set to zero.
If you have multiple doses and multiple observation fields in your input Nonmem data, then you have to specify the obs_cmt_numbers/dv_fields and dose_cmt_numbers/amt_fields list pairs carefully.
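To make the pairing concrete, the sketch below routes one Nonmem observation row to the PoPy column paired with its CMT value. Everything specific in it is an assumption made for illustration: the helper `route_observation`, the second ‘DV_PERIPH’ column in compartment 2, the made-up DV value, the `<field>_FLAG` naming (taken from the ‘DV_CENTRAL_FLAG’ pattern in this example) and the choice to fill non-matching columns with 0 and a flag of 0. It does not describe how the N2PDat Script is implemented.

```python
def route_observation(nonmem_row, obs_cmt_numbers, dv_fields):
    """Spread one Nonmem observation row (EVID = 0) across the PoPy DV columns."""
    if len(obs_cmt_numbers) != len(dv_fields):
        raise ValueError("obs_cmt_numbers and dv_fields must be the same length")
    popy_values = {}
    for cmt, field in zip(obs_cmt_numbers, dv_fields):
        is_match = int(nonmem_row["CMT"]) == cmt
        # Assumption: columns paired with other compartments get 0 and a flag of 0.
        popy_values[field] = nonmem_row["DV"] if is_match else "0"
        popy_values[field + "_FLAG"] = 1 if is_match else 0
    return popy_values


# Made-up observation in an invented second compartment.
row = {"EVID": "0", "CMT": "2", "DV": "12.5"}
routed = route_observation(row, [1, 2], ["DV_CENTRAL", "DV_PERIPH"])
```

Swapping the order of one list but not the other would silently send each observation to the wrong column, which is why the pairs need to be specified carefully.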
OUTPUT_OPTIONS¶
Describes the output options. Currently, the only option is which fields to remove:-
OUTPUT_OPTIONS:
    drop_fields: ['DV', 'MDV', 'EVID', 'CMT']
Here we are removing the Nonmem-specific fields. In a real-life conversion it may be sensible to keep the Nonmem fields, so that you can perform a side-by-side sanity check from within the PoPy output file. Note that the fields above will be of little use to a PoPy Fit Script, compared to the ‘DV_CENTRAL’, ‘DV_CENTRAL_FLAG’ and ‘TYPE’ fields created by the N2PDat Script.
Compare original PoPy data with P2NDAT/N2PDAT version¶
In this walkthrough we took the PoPy data file ‘fit_example1_data.csv’ and ran the P2NDat Script to create a Nonmem version, ‘fit_example1_nm_data.csv’. We then ran the N2PDat Script on that Nonmem data to re-create a PoPy data file, ‘fit_example1_data_v2.csv’.
You can compare the first 10 rows of both the input PoPy data set in Table 53 and the output PoPy data in Table 56.
Both files contain the same column headers, i.e. ‘TYPE’, ‘ID’, ‘TIME’, ‘AMT’, ‘DV_CENTRAL’ and ‘DV_CENTRAL_FLAG’, although in a different order. The values in each column are the same, except that the ‘AMT’ column now has zero values in non-dose rows and the ‘dose’ value in the TYPE field is now ‘dose:_bolus’. Both the input and output .csv files are valid PoPy data formats for the PK/PD problem described in Fitting a Simple PopPK Model using PoPy.
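If you want to check the round trip programmatically rather than by eye, a comparison along the following lines may help. It is a sketch under stated assumptions: `normalise` and `same_after_round_trip` are hypothetical helpers, and the only differences they tolerate are the ones noted above (column order, ‘AMT’ on non-dose rows, and the ‘dose’ versus ‘dose:_bolus’ label). The inline rows are the first three rows of Table 53 and Table 56; for the real files you could load the rows with Python's `csv.DictReader` instead.

```python
def normalise(row):
    """Reduce a PoPy row to the values that should survive the round trip."""
    row_type = row["TYPE"].split(":")[0]  # 'dose:_bolus' -> 'dose'
    # AMT only matters on dose rows, so treat it as zero elsewhere.
    amt = float(row["AMT"]) if row_type == "dose" else 0.0
    return (row_type, row["ID"], float(row["TIME"]), amt,
            float(row["DV_CENTRAL"]), int(row["DV_CENTRAL_FLAG"]))


def same_after_round_trip(original_rows, round_trip_rows):
    """True if both row lists agree once the known differences are ignored."""
    return ([normalise(r) for r in original_rows]
            == [normalise(r) for r in round_trip_rows])


original_rows = [
    {"TYPE": "reset", "ID": "1", "TIME": "0", "AMT": "100",
     "DV_CENTRAL": "0", "DV_CENTRAL_FLAG": "0"},
    {"TYPE": "dose", "ID": "1", "TIME": "1", "AMT": "100",
     "DV_CENTRAL": "0", "DV_CENTRAL_FLAG": "0"},
    {"TYPE": "obs", "ID": "1", "TIME": "7.22152181887", "AMT": "100",
     "DV_CENTRAL": "55.3986503177", "DV_CENTRAL_FLAG": "1"},
]
round_trip_rows = [
    {"TIME": "0", "ID": "1", "AMT": "0", "DV_CENTRAL": "0",
     "DV_CENTRAL_FLAG": "0", "TYPE": "reset"},
    {"TIME": "1", "ID": "1", "AMT": "100", "DV_CENTRAL": "0",
     "DV_CENTRAL_FLAG": "0", "TYPE": "dose:_bolus"},
    {"TIME": "7.22152181887", "ID": "1", "AMT": "0", "DV_CENTRAL": "55.3986503177",
     "DV_CENTRAL_FLAG": "1", "TYPE": "obs"},
]
rows_match = same_after_round_trip(original_rows, round_trip_rows)
```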